To reach the 10 PFLOPS target, Stampede's computational power is split into several parts. The first 2 PFLOPS come from 6,400 nodes, each carrying two Intel Xeon E5 series processors (Sandy Bridge-EP) and 32GB of DDR3 memory. The second part comes from the "countless" MIC cards (now known as Xeon Phi), which were supposed to deliver 8 PFLOPS. As it turns out, the MIC coprocessors fell a bit short of that target, with TACC expecting more than 7 PFLOPS but less than 8 PFLOPS.

The third part of the Stampede system consists of 16 memory nodes with 1TB of DDR3 memory and two NVIDIA Tesla K20 boards each. Tesla K20 boards are also fitted in 128 of the 6,400 compute nodes for computational purposes, bringing the total number of K20 boards to 144. That number pales in comparison to the roughly 6,500 Xeon Phi boards. ScaleMP's virtual SMP solution is used to create a shared memory environment spanning all 16TB of memory; this part of the system will mostly target "big data".

The final retail price of Intel's Xeon Phi remains a mystery, but given that TACC took most of the pre-production Xeon Phi boards and ordered so many of them, it will likely be a lot higher than what TACC paid.
While the Intel Xeon E5 systems and the Tesla boards were delivered at special but still realistic pricing, we were quite surprised to learn that the computing center paid only around $400 per Xeon Phi board. Given that the competing Tesla K20 retails for $3,199 (available in December), this could even be viewed as price dumping. Bear in mind that TACC had only $2.4 million for Xeon Phi boards, and reaching the 8 PFLOPS target (or the 7+ PFLOPS actually expected) requires around 6,000-7,000 boards. At $400 apiece, it is quite a steal.
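The back-of-envelope arithmetic behind those figures can be sketched as follows; this is purely illustrative, using only the budget, price, and board counts reported above:

```python
# Back-of-envelope check of the article's figures (illustrative only).
phi_budget_usd = 2_400_000       # TACC's reported Xeon Phi budget
price_per_board_usd = 400        # reported per-board price
boards_affordable = phi_budget_usd // price_per_board_usd
print(boards_affordable)         # 6000, consistent with the 6,000-7,000 estimate

target_pflops = 8.0              # original MIC contribution target
boards_installed = 6500          # approximate Xeon Phi board count cited above
tflops_per_board = target_pflops * 1000 / boards_installed
print(round(tflops_per_board, 2))  # ~1.23 TFLOPS of the target per board
```

In other words, the $2.4 million budget covers exactly 6,000 boards at $400 each, which is why the 6,000-7,000 board estimate only works at that price point.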
Intel Xeon Phi preferred price to start at $400?
Posted on Tuesday, October 16 2012 @ 19:33 CEST by Thomas De Maesschalck