Intel presented its first teraflop-on-a-chip prototype at the Intel Fall Developer Forum.
Each of the 80 processors on the wafer contains a die with 80 cores, for 6,400 cores in total. Each CPU has more than one terabyte per second of throughput between its cores and the on-die SRAM. Intel CEO Paul Otellini claims that this technology will be available within five years, putting it in line with the previously outlined Gesher family expected to ship in 2010.
To put that into perspective, the fastest public supercomputer in 1996 was ASCI Red, which featured over 4,500 compute nodes using 200MHz Pentium Pro processors and was the first computer to break the 1 teraflops barrier.
Each of the individual CPUs runs at 3.1GHz in a very simple configuration. These are far from production-ready processors and are mainly for demonstration purposes. Each processor is also unusual in that its packaging is three-dimensional: the cache substrate is "stacked" directly underneath the FPUs, saving space and reducing latency.
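For a rough sense of scale, the figures quoted above can be run through some back-of-envelope arithmetic. The sketch below is only illustrative: it assumes the chip delivers exactly one teraflop and that all 80 cores contribute equally, neither of which Intel has stated, and the function names are made up for this example.

```python
# Back-of-envelope arithmetic using the figures quoted in the article.
# Assumptions (not from Intel): exactly 1 TFLOPS per chip, split evenly across cores.
CHIPS_PER_WAFER = 80
CORES_PER_CHIP = 80
CHIP_FLOPS = 1e12   # "teraflop-on-a-chip", taken as exactly 1 TFLOPS
CLOCK_HZ = 3.1e9    # 3.1GHz clock quoted in the article


def wafer_cores(chips=CHIPS_PER_WAFER, cores=CORES_PER_CHIP):
    """Total cores across every chip on the wafer."""
    return chips * cores


def per_core_gflops(chip_flops=CHIP_FLOPS, cores=CORES_PER_CHIP):
    """Implied throughput of a single core, in gigaflops."""
    return chip_flops / cores / 1e9


def flops_per_cycle(chip_flops=CHIP_FLOPS, cores=CORES_PER_CHIP, clock_hz=CLOCK_HZ):
    """Implied floating-point operations per core per clock cycle."""
    return chip_flops / cores / clock_hz


if __name__ == "__main__":
    print("cores on the wafer:", wafer_cores())
    print("GFLOPS per core:", per_core_gflops())
    print("FLOPs per core per cycle:", round(flops_per_cycle(), 2))
```

Under those assumptions, each core would need to sustain roughly 12.5 gigaflops, or about four floating-point operations per clock cycle.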