I spent a while this evening reading through the documentation for the beta release of NVIDIA’s CUDA GPGPU system. My motivation for this was that nvcc, the CUDA compiler, is based on a code drop of the EkoPath compiler, which I’ve worked on intermittently over the past few years.
The programming model that these GPUs enforce is incredibly complex. It’s more than a little reminiscent of the Connection Machine (for a blast from the past, see a collection of scanned CM-5 documentation.
The idea is that the GPU executes a “kernel” of compute-intensive, highly parallelisable, code on behalf of the CPU. Data is transferred to the GPU when a kernel starts to execute, and back to the host when it completes. The GPU may execute multiple kernels simultaneously, if it is capable of it.