NVIDIA said to be working on driver solution for poor DX12 Async Compute support

Posted on Saturday, September 05 2015 @ 23:41 CEST by Thomas De Maesschalck
If you've been following the Ashes of the Singularity controversy for some time, you'll know that the developer recently mentioned that one of the key differences between AMD and NVIDIA on DirectX 12 is that the latter doesn't properly support Async Compute, which seems to be leading to a big performance disadvantage for NVIDIA.

Now we hear from Oxide developer Kollock that NVIDIA is still working on its Async Compute implementation, and that the developer is working closely with NVIDIA as they fully implement the feature. Support will be added via a future driver release.

One key difference seems to be that AMD has a fully hardware-based solution, while NVIDIA implements it using a combination of hardware and software. Mahigan of Overclock.net offered this thorough explanation of Async Compute:

“The Asynchronous Warp Schedulers are in the hardware. Each SMM (which is a shader engine in GCN terms) holds four AWSs. Unlike GCN, the scheduling aspect is handled in software for Maxwell 2. In the driver there’s a Grid Management Queue which holds pending tasks and assigns the pending tasks to another piece of software which is the work distributor. The work distributor then assigns the tasks to available Asynchronous Warp Schedulers. It’s quite a few different “parts” working together. A software and a hardware component if you will.

With GCN the developer sends work to a particular queue (Graphic/Compute/Copy) and the driver just sends it to the Asynchronous Compute Engine (for Async compute) or Graphic Command Processor (Graphic tasks but can also handle compute), DMA Engines (Copy). The queues, for pending Async work, are held within the ACEs (8 deep each)… and ACEs handle assigning Async tasks to available compute units.

Simplified…

Maxwell 2: Queues in Software, work distributor in software (context switching), Asynchronous Warps in hardware, DMA Engines in hardware, CUDA cores in hardware.
GCN: Queues/Work distributor/Asynchronous Compute engines (ACEs/Graphic Command Processor) in hardware, Copy (DMA Engines) in hardware, CUs in hardware.”
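
To make the contrast in the quote a bit more concrete, here is a minimal Python sketch of the two paths as Mahigan describes them. It is a toy model only, not how either driver or GPU actually works: the names `gcn_route`, `ToyACE` and `maxwell2_distribute` are hypothetical, and the round-robin assignment is an assumption chosen purely for illustration. The queue depth of 8 and the four warp schedulers per SMM are taken from the quote.

```python
from collections import deque

ACE_QUEUE_DEPTH = 8  # "8 deep each", per the quote

# GCN, as quoted: each queue type maps straight to a hardware unit.
GCN_ROUTES = {
    "graphics": "Graphics Command Processor",
    "compute": "Asynchronous Compute Engine",
    "copy": "DMA Engine",
}


def gcn_route(queue_type):
    """Return the hardware unit that receives work from the given queue type."""
    return GCN_ROUTES[queue_type]


class ToyACE:
    """Hypothetical stand-in for one ACE holding a bounded queue of pending tasks."""

    def __init__(self, depth=ACE_QUEUE_DEPTH):
        self.depth = depth
        self.pending = deque()

    def submit(self, task):
        """Accept a task unless the queue is full; report whether it was accepted."""
        if len(self.pending) >= self.depth:
            return False
        self.pending.append(task)
        return True


def maxwell2_distribute(tasks, schedulers=4):
    """Toy driver-side path: a software queue feeds a software work distributor.

    The distributor hands tasks to the warp schedulers of one SMM.
    Round-robin order is an assumption made for this sketch.
    """
    grid_management_queue = deque(tasks)  # held in software, per the quote
    assigned = {i: [] for i in range(schedulers)}
    slot = 0
    while grid_management_queue:
        assigned[slot].append(grid_management_queue.popleft())
        slot = (slot + 1) % schedulers
    return assigned
```

In this sketch the GCN side is little more than a lookup plus a bounded queue that lives "in hardware", while the Maxwell 2 side needs an extra software step before any work reaches a scheduler. That extra step is the part the quote attributes to the driver.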
Via: eTeknix


About the Author

Thomas De Maesschalck

Thomas has been messing with computers since early childhood and firmly believes the Internet is the best thing since sliced bread. He enjoys playing with new tech, is fascinated by science, and is passionate about financial markets. When not behind a computer, he can be found with running shoes on or lifting heavy weights in the weight room.


