AMD explained that Microsoft's upcoming DirectX 12 API will finally enable the use of Asynchronous Shaders to process multiple command streams in parallel. Each queue can submit commands without waiting for other tasks to complete, which promises performance gains through more efficient use of computing power. Full details on how it works can be read at Tom's Hardware.
In DirectX 12, a new merging method called Asynchronous Shaders is available, which is basically asynchronous multi-threaded graphics with pre-emption and prioritization. The ACEs (Asynchronous Compute Engines) on AMD's GCN-based GPUs interleave the tasks, filling the gaps in one queue with tasks from another, kind of like merging onto a highway where nobody moves to the side for you. Even so, the hardware can still move the main command queue aside to let priority tasks pass when necessary. Because slots that would otherwise sit idle now do useful work, the same workload finishes sooner.
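The gap-filling idea can be sketched with a toy model. This is only an illustration, not how the ACEs or DirectX 12 are actually implemented: the queue contents, the use of None to mark an idle slot, and the function names run_serial and run_interleaved are all assumptions made up for the example, and pre-emption is not modelled.

```python
from collections import deque

def run_serial(graphics, compute):
    """Synchronous model: compute work starts only after the graphics queue ends."""
    return list(graphics) + list(compute)

def run_interleaved(graphics, compute):
    """Asynchronous model: idle slots (None) in the graphics queue are
    filled with pending compute tasks; leftovers run at the end."""
    pending = deque(compute)
    timeline = []
    for slot in graphics:
        if slot is None and pending:
            timeline.append(pending.popleft())  # fill the gap
        else:
            timeline.append(slot)
    timeline.extend(pending)  # compute tasks that found no gap
    return timeline

# Hypothetical workload: three graphics tasks with idle slots between them.
graphics = ["G1", None, "G2", None, None, "G3"]
compute = ["C1", "C2", "C3", "C4"]

serial = run_serial(graphics, compute)
mixed = run_interleaved(graphics, compute)
print(len(serial), serial)
print(len(mixed), mixed)
```

In this made-up workload the serial schedule occupies ten time slots, three of them idle, while the interleaved schedule does the same work in seven.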
On AMD's GCN GPUs, each ACE can handle up to eight queues, and each ACE can address its own fair share of shaders. The most basic GPUs have just two ACEs, for up to 16 queues, while more elaborate GPUs carry eight, for up to 64.
AMD provided one performance example: the firm ran the LiquidVR SDK sample and achieved 245 fps with both Asynchronous Shaders and post-processing off. With post-processing enabled, the frame rate fell to 158 fps, but enabling Asynchronous Shaders brought it back up to 230 fps. In other words, post-processing cost 15 fps instead of 87 fps.
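Frame rates can be easier to compare as frame times. The short calculation below uses only the three figures AMD quoted; the variable names and the frame_time_ms helper are just for illustration.

```python
# Figures quoted by AMD for the LiquidVR SDK sample.
baseline_fps = 245.0  # Asynchronous Shaders off, post-processing off
post_fps = 158.0      # Asynchronous Shaders off, post-processing on
async_fps = 230.0     # Asynchronous Shaders on, post-processing on

def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps

# Extra time per frame spent on post-processing, relative to the baseline.
cost_without_async = frame_time_ms(post_fps) - frame_time_ms(baseline_fps)
cost_with_async = frame_time_ms(async_fps) - frame_time_ms(baseline_fps)

# Frame-rate gain from enabling Asynchronous Shaders with post-processing on.
speedup = async_fps / post_fps - 1.0

# Share of the baseline frame rate kept with both features enabled.
retained = async_fps / baseline_fps

print(f"post-processing cost, async off: {cost_without_async:.2f} ms")
print(f"post-processing cost, async on:  {cost_with_async:.2f} ms")
print(f"gain from async shaders: {speedup:.1%}")
print(f"baseline frame rate retained: {retained:.1%}")
```

By this reckoning, post-processing adds about 2.25 ms per frame without Asynchronous Shaders and about 0.27 ms with them, a gain of roughly 46 percent that keeps about 94 percent of the baseline frame rate.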