In the paper "CAF: Core to Core Communication Acceleration Framework", the authors propose a core-to-core communication acceleration framework (CAF) that replaces traditional software-based queue implementations with a hardware-based queue management device (QMD) on the processor.
Not only can the QMD enhance core-to-core communication performance by a factor of 2 to 12 versus current software-based implementations, it can also expedite some basic computational functions by as much as 15 percent thanks to its ability to aggregate data from multiple cores.
More details will be presented at the 25th Annual Conference on Parallel Architectures and Compilation Techniques, Sept. 11-15, Haifa, Israel.
As the number of cores in a multicore system increases, core-to-core (C2C) communication is increasingly limiting the performance scaling of workloads that share data frequently. The traditional way cores communicate is by using shared memory space between them. However, shared memory communication fundamentally involves coherence invalidations and cache misses, which cause large performance overheads and incur a high amount of network traffic.
Many important workloads incur significant C2C communication and are affected significantly by the costs, including pipelined packet processing which is widely used in software-based networking solutions. In these workloads, threads run on different cores and pass packets from one core to another for different stages of processing using software queues.
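To make the software-queue pattern concrete, here is a minimal sketch of a two-stage packet pipeline in Python. It is not code from the paper: the names `SpscRing`, `stage`, `run_pipeline`, `EMPTY` and `DONE` and the two toy processing steps are all made up for illustration, and the "packets" are plain integers. Each stage runs in its own thread and hands results to the next stage through a single-producer/single-consumer ring buffer held in shared memory. The comments mark the accesses that, on real multicore hardware, would force a cache line to move between cores. The sketch also relies on CPython executing these simple attribute updates one at a time; a C or C++ version would need atomics and explicit memory ordering.

```python
import threading
import time

EMPTY = object()  # returned by try_dequeue when the ring holds no item
DONE = object()   # end-of-stream marker passed down the pipeline


class SpscRing:
    """Single-producer/single-consumer ring buffer (illustrative only)."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.capacity = capacity
        self.head = 0  # count of items read; written only by the consumer
        self.tail = 0  # count of items written; written only by the producer

    def try_enqueue(self, item):
        # The producer reads the consumer-owned index to look for free space.
        # On real hardware this read can miss in the producer's cache.
        if self.tail - self.head == self.capacity:
            return False  # ring is full
        self.slots[self.tail % self.capacity] = item
        # Publishing the new tail invalidates the consumer's cached copy.
        self.tail += 1
        return True

    def try_dequeue(self):
        # The consumer reads the producer-owned index to look for data.
        if self.head == self.tail:
            return EMPTY
        item = self.slots[self.head % self.capacity]
        self.head += 1  # frees the slot for the producer
        return item


def stage(inbox, outbox, work):
    """Poll inbox, apply work to each packet, and forward it to outbox."""
    while True:
        item = inbox.try_dequeue()
        if item is EMPTY:
            time.sleep(0)  # yield instead of burning the interpreter lock
            continue
        result = item if item is DONE else work(item)
        while not outbox.try_enqueue(result):
            time.sleep(0)  # next stage is slow; wait for a free slot
        if item is DONE:
            return


def run_pipeline(packets, capacity=8):
    """Push packets through two stages and return the results in order."""
    rx, mid, tx = SpscRing(capacity), SpscRing(capacity), SpscRing(capacity)

    def source():
        for packet in list(packets) + [DONE]:
            while not rx.try_enqueue(packet):
                time.sleep(0)

    workers = [
        threading.Thread(target=source),
        threading.Thread(target=stage, args=(rx, mid, lambda p: p + 1)),
        threading.Thread(target=stage, args=(mid, tx, lambda p: p * 2)),
    ]
    for worker in workers:
        worker.start()

    results = []
    while True:
        item = tx.try_dequeue()
        if item is EMPTY:
            time.sleep(0)
            continue
        if item is DONE:
            break
        results.append(item)

    for worker in workers:
        worker.join()
    return results
```

Calling `run_pipeline(range(5))` pushes five packets through both stages. Every hand-off touches the shared `head`, `tail` and slot data from two different threads, which is the per-packet overhead the abstract attributes to software queues.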
In this paper, we analyze the behavior and overheads of software queue management. Based on this analysis, we propose a novel C2C Communication Acceleration Framework (CAF) to optimize C2C communication. CAF offloads substantial communication burdens from cores and memory to a designated, efficient hardware device we refer to as Queue Management Device (QMD) attached to the Network on Chip. CAF combines hardware and software optimizations to effectively reduce the queue-induced communication overheads and improve the overall system performance by up to 2-12x over traditional software queue implementations.
On an unrelated note, Apple held its annual launch event in San Francisco. The iPhone 7 was announced, along with the Apple Watch Series 2 and some uber-expensive wireless "AirPods" earphones. Because I'm getting pretty bored by these Apple announcements, and everyone and his dog is already covering this stuff, I'm not going to go into further detail.