VR-Zone reports that while AMD is touting its heterogeneous unified memory architecture (hUMA) as a big competitive advantage, not all developers are convinced of the technology's virtues. One of them is Epic Games' Tim Sweeney, who told the site that the non-uniformity of the programming model remains an enormous barrier to the productive movement of code among processing units. Sweeney acknowledges that hUMA is a welcome improvement, but he does not seem convinced that coding for such a platform makes economic sense, given the marginal performance gain relative to the added coding complexity.
One developer was not necessarily convinced. In an email to VR-Zone, Epic Games’ Tim Sweeney says that the non-uniformity of programming languages between the GPU and CPU will still be a barrier even with hUMA.
Uniform memory access to a cache coherent shared address space is a very welcome improvement to current CPU/GPU model. On the PC platform, the challenge is how to expose it. DirectX? OpenGL? Their pace of adopting new technology is mighty slow for this significant a change. And the bigger source of non-uniformity, the programming model uniformity (C++ on CPU vs OpenCL/CUDA on GPU) remains and is an enormous barrier to the productive movement of code among processing units. Ultimately, I would love to see vector computing hardware addressed in mainstream programming languages such as C++ through loop vectorization and other compiler-automated parallelism transformations, rather than by writing code in separate languages designed for GPU.
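To illustrate the kind of code Sweeney has in mind, here is a minimal sketch of a loop written in ordinary C++ that an optimizing compiler may vectorize on its own. The function name `saxpy` and its parameters are hypothetical and are not taken from Sweeney's email; whether the loop actually becomes SIMD code depends on the compiler and its optimization flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the article): computes y[i] = a * x[i] + y[i].
// Every iteration is independent of the others, which is the property a
// compiler's auto-vectorizer looks for. GCC and Clang typically attempt this
// at -O2 or -O3, but the decision is up to the compiler.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());  // both vectors must have the same length

    const float* xs = x.data();
    float* ys = y.data();
    const std::size_t n = y.size();

    // Plain C++ loop: no intrinsics, no separate GPU language.
    for (std::size_t i = 0; i < n; ++i) {
        ys[i] = a * xs[i] + ys[i];
    }
}
```

The source stays in one language, and the mapping to vector hardware is left to the compiler. Writing the same computation as an OpenCL or CUDA kernel would require a second language and a separate toolchain, which is the non-uniformity Sweeney describes.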
VR-Zone reached out to AMD for a response to Sweeney’s statement, and AMD corporate fellow Phil Rogers, who is seen as the company's go-to guy for all things heterogeneous computing, had this to say:
Like AMD, it seems Tim Sweeney clearly sees the future that HSA is aiming at: single source, high level language programs that can run both on the CPU and GPU. This is ultimately what will allow Tim, and hundreds of thousands of other software developers, to easily write software that accelerates on HSA platforms. This is exactly why we are developing HSA – unifying the addressing, providing full memory coherency and extending the capability of the GPU parallel processor to fully support C++. We are re-inventing the heterogeneous platform to eliminate all of the barriers that currently prevent a C++ or Java program offloading its parallel content to the GPU.
Tim is correct that in addition to the arrival of the HSA platform, the programming model must evolve to give easy access to GPU acceleration. OpenCL 2.0 and C++ AMP are very good steps in this direction. OpenCL 2.0 brings unified addressing and memory coherency options. C++ AMP is a single source approach that adds just two new keywords, restrict and array_view, to allow particular methods to be compiled to both CPU and GPU and offloaded opportunistically. The HSA platform will allow both of these programming models to evolve still further towards the pure C++ model that is natural to the programmer.