The Tech Report spoke with NVIDIA and learned that the company plans to deliver beta support for the new OpenCL standard next quarter, with full support following in the second quarter of 2009. NVIDIA CUDA General Manager (and former Ageia CEO) Manju Hegde said the company can't move any faster because the OpenCL working group has not yet completed its conformance tests, and NVIDIA expects it will take a couple of months until these are available.
NVIDIA doesn't see OpenCL or other technologies such as DirectX 11 Compute as real competition for its CUDA technology. The GPU maker's main goal is to pave the way for more GPGPU applications that take advantage of the GPU's parallel processing power, as this will boost GPU sales and steer more consumers toward higher-margin GPUs.
We went on to ask about some of the differences between C for CUDA and OpenCL. According to Hegde, OpenCL is designed to be "OpenGL-like" in that it gives developers complete hardware access and expects them to handle "all the tedious hardware housekeeping" like initializing devices, allocating buffers, and managing memory. By contrast, C for CUDA offers two styles of programming: a high-level style where "the abstraction level is at the same level as C," and a driver-level API that's on "the same level as OpenCL."
Hegde told us the vast majority of developers using C for CUDA favor the higher-level style. That applies particularly to developers writing scientific applications, since those folks may be experts in their fields and have a good grasp of C, but they might not necessarily care to learn the intricacies of the computing hardware.
We were also curious about the potential performance differences between OpenCL and C for CUDA apps. Hegde noted that performance largely depends on how programmers break up their algorithms into multiple threads and match the host hardware's architecture. The high-level flavor of C for CUDA might induce "some performance loss" compared to a lower-level approach, but Hegde said that's a small consideration. Because NVIDIA also offers a driver-level API, OpenCL and C for CUDA should be in almost a "dead heat" in terms of compute performance.
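To make the point about breaking an algorithm into threads more concrete, here is a minimal sketch of the block-and-thread index arithmetic that GPU programming models of this kind typically use. It is a CPU-only emulation in plain Python, not CUDA or OpenCL code and not anything Hegde or NVIDIA supplied. The function names `launch_config` and `vector_add` and the default block size of 256 are assumptions chosen purely for illustration.

```python
# CPU-only emulation of a CUDA-style 1D launch.
# No GPU or CUDA/OpenCL library is used; all names here are hypothetical.

def launch_config(n, block_size):
    """Return (num_blocks, block_size) covering n elements, rounding up."""
    if n < 0 or block_size <= 0:
        raise ValueError("n must be >= 0 and block_size must be > 0")
    num_blocks = (n + block_size - 1) // block_size  # ceiling division
    return num_blocks, block_size


def vector_add(a, b, block_size=256):
    """Add two equal-length lists, one emulated 'thread' per element."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    n = len(a)
    out = [0] * n
    num_blocks, threads_per_block = launch_config(n, block_size)
    # On a GPU these two loops would run in parallel; here they run in order.
    for block_idx in range(num_blocks):
        for thread_idx in range(threads_per_block):
            i = block_idx * threads_per_block + thread_idx  # global index
            if i < n:  # guard: the last block may be only partly used
                out[i] = a[i] + b[i]
    return out


if __name__ == "__main__":
    print(launch_config(1000, 256))
    print(vector_add([1, 2, 3], [10, 20, 30], block_size=2))
```

The block size is the tuning knob here: the same index arithmetic works for any value, but on real hardware the choice that best fits the device's architecture is the kind of decision Hegde says performance depends on.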