The execution units (EUs) in Intel's Gen9 graphics are grouped into clusters called subslices. In most Gen9-based products each subslice has 8 EUs, and each EU can run 7 threads, so a single subslice can run 56 threads simultaneously.
The Intel Core i7 processor 6700K with Intel HD Graphics 530 features a single slice with three subslices, giving it a total of 24 EUs capable of running 168 hardware threads. This GT2 integrated graphics part makes up about 40 percent of the die area of the 6700K. eDRAM is optional, with support for 64MB to 128MB. A more advanced version is the Skylake GT3 graphics, which consists of two slices with a total of 48 EUs, and there is also GT4, with three slices and 72 EUs. For a full summary of the changes in Skylake's iGPU, head over to Legit Reviews.
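The EU and thread counts above follow from simple multiplication, which the short sketch below makes explicit. The per-subslice and per-EU figures come from the text; the function name `gen9_totals` is made up for illustration. The thread totals for GT3 and GT4 are not stated in the article and are derived here on the assumption that every EU runs 7 threads.

```python
# Figures quoted in the article for most Gen9-based products.
EUS_PER_SUBSLICE = 8
THREADS_PER_EU = 7
SUBSLICES_PER_SLICE = 3


def gen9_totals(slices):
    """Return (EU count, hardware thread count) for a given number of slices."""
    eus = slices * SUBSLICES_PER_SLICE * EUS_PER_SUBSLICE
    return eus, eus * THREADS_PER_EU


# Slice counts per configuration as given in the article.
for name, slices in (("GT2", 1), ("GT3", 2), ("GT4", 3)):
    eus, threads = gen9_totals(slices)
    print(f"{name}: {slices} slice(s), {eus} EUs, {threads} hardware threads")
```

For GT2 this reproduces the 24 EUs and 168 hardware threads quoted for the HD Graphics 530.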
And here's an overview of the HD Graphics 530 specifications and peak performance:
NEW CHANGES FOR INTEL PROCESSOR GRAPHICS GEN9
Intel processor graphics gen9 includes many refinements over Intel processor graphics gen8, throughout the microarchitecture and supporting software. Generally, these changes span the domains of memory hierarchy, compute capability, and product configuration. They are briefly summarized here, with more detail integrated throughout the paper.
Gen9 Memory Hierarchy Refinements:
Coherent SVM write performance is significantly improved via new LLC cache management policies. The available L3 cache capacity has been increased to 768 Kbytes per slice (512 Kbytes for application data). The sizes of both the L3 and LLC request queues have been increased. This improves latency hiding, bringing effective bandwidth closer to the architecture's theoretical peak. In Gen9, EDRAM now acts as a memory-side cache between LLC and DRAM. Also, the EDRAM memory controller has moved into the system agent, adjacent to the display controller, to support power-efficient and low-latency display refresh. Texture samplers now natively support an NV12 YUV format for improved surface sharing between compute APIs and media fixed-function units.
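For readers unfamiliar with NV12, the sketch below shows how an 8-bit NV12 frame is commonly laid out in memory: a full-resolution Y (luma) plane followed by a half-height plane of interleaved U and V samples. It assumes a tightly packed buffer with no row padding, which real surfaces often do not guarantee. The helper names are illustrative and are not part of any Intel API.

```python
def nv12_layout(width, height):
    """Return (y_size, uv_offset, total_size) in bytes for a tightly packed
    8-bit NV12 frame. Assumes no row padding (stride == width)."""
    if width % 2 or height % 2:
        raise ValueError("this sketch expects even width and height")
    y_size = width * height          # one luma byte per pixel
    uv_size = width * (height // 2)  # one interleaved U,V pair per 2x2 block
    return y_size, y_size, y_size + uv_size


def nv12_sample_offsets(width, height, x, y):
    """Return byte offsets of the Y, U and V samples covering pixel (x, y)."""
    _, uv_offset, _ = nv12_layout(width, height)
    y_off = y * width + x
    u_off = uv_offset + (y // 2) * width + (x // 2) * 2
    return y_off, u_off, u_off + 1


y_size, uv_offset, total = nv12_layout(1920, 1080)
print(f"1080p NV12: Y plane {y_size} bytes, UV plane at offset {uv_offset}, "
      f"{total} bytes total")
```

Because chroma is subsampled 2x2, the whole frame takes 1.5 bytes per pixel, and both planes live in one surface.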
Gen9 Compute Capability Refinements:
Preemption of compute applications is now supported at a thread level, meaning that compute threads can be preempted (and later resumed) midway through their execution. Threads within an execution unit are now scheduled round robin. Gen9 adds new native support for the 32-bit float atomic operations of min, max, and compare/exchange. Also, the performance of all 32-bit atomics is improved for kernel scenarios that issue multiple atomics back to back. 16-bit floating point capability is improved with native support for denormals and gradual underflow.
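To illustrate what denormals and gradual underflow mean for 16-bit floats, here is a small software model using Python's `struct` module, whose `"e"` format is IEEE 754 half precision. It says nothing about how Gen9 implements this in hardware. The `flush_to_zero` function is a hypothetical contrast case, modeling a design that treats every value below the smallest normal number as zero.

```python
import struct

SMALLEST_NORMAL = 2.0 ** -14     # smallest positive normal half-precision value
SMALLEST_SUBNORMAL = 2.0 ** -24  # smallest positive denormal (subnormal) value


def to_fp16(value):
    """Round a Python float to IEEE 754 half precision and convert it back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def flush_to_zero(value):
    """Hypothetical model without denormal support: tiny values become 0."""
    half = to_fp16(value)
    return half if abs(half) >= SMALLEST_NORMAL else 0.0


# Keep halving the smallest normal value and compare the two behaviours.
x = SMALLEST_NORMAL
for _ in range(4):
    print(f"{x:.3e}  gradual underflow: {to_fp16(x):.3e}  "
          f"flush to zero: {flush_to_zero(x):.3e}")
    x /= 2
```

With gradual underflow, values between `SMALLEST_SUBNORMAL` and `SMALLEST_NORMAL` lose precision step by step instead of collapsing to zero all at once.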
Gen9 Product Configuration Flexibility:
Gen9 has been designed to enable products with 1, 2, or 3 slices. Gen9 adds new power gating and clock domains for more efficient dynamic power management. This can particularly improve low-power media playback modes.