Anwar Ghuloum from Intel talks on Intel's blog about the problems with GPGPU:
Hundreds of GigaFLOPs are available in your PC today… in fact, you might even have a TeraFLOP in there. As someone who cut his teeth on a Cray C90 (15 GFLOPS max), this is an intriguing opportunity to dabble; for the latter-day high performance computing programmer (whether you’re trying to predict protein structure, price options, or trying to figure out how to thread your game), it is almost too tempting to ignore. However, like a shimmering, unreachable oasis, today’s GPUs offer the promise of all the performance you require, but achieving that goal for all but a few applications (notably, those they were designed for: rasterization) is elusive.
I do a lot of work in data parallel computation and deterministic programming models… this means a lot of my peers and external collaborators are from the GPGPU community, so I expect some hate mail on this :-). But I think that by approaching parallelism purely from the GPU (or more generally, streaming) side of things, we’re losing track of the many valuable lessons learned from the slightly broader-scoped High Performance Computing community in the last forty years. (However, even looking at where graphics is going, we might find shortcomings in existing GPU designs.)
There are three major problems with taking GPUs outside the niche of rasterization today: the programming model, system bottlenecks, and the architecture. (No, I’m not going to talk about form factor and cooling, which are bigger near-term show stoppers for the IT architect.)