Today Intel's CTO Justin Rattner told the audience that Intel's next-generation architecture will be Core. The first Core-based products will be launched later this year and they will be 65nm chips.
Rattner shared a couple of details on the Core microarchitecture:
The desktop version of Core is Conroe. This processor will have a 65W thermal design power. It will be 40 percent faster and use 40 percent less power than Intel's current Pentium D 950! More details over at The Tech Report.
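Taken together, those two Conroe figures imply a sizable jump in performance per watt. Here is a rough back-of-the-envelope calculation, assuming both 40 percent numbers are measured against the same Pentium D 950 baseline and hold at the same time (Intel did not spell that out, so treat the result as an estimate; the function name is just for illustration):

```python
def perf_per_watt_gain(speedup_pct, power_saving_pct):
    """Relative performance-per-watt versus a baseline of 1.0."""
    relative_perf = 1 + speedup_pct / 100        # 40% faster  -> 1.4x performance
    relative_power = 1 - power_saving_pct / 100  # 40% less    -> 0.6x power
    return relative_perf / relative_power

# Assumption: both claims apply simultaneously to the same workload.
gain = perf_per_watt_gain(40, 40)
print(f"{gain:.2f}x performance per watt")
```

Under those assumptions the ratio works out to 1.4 / 0.6, or roughly 2.3 times the performance per watt of the Pentium D 950.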
A four-issue-wide, 14-stage main pipeline: This will obviously be a shorter pipeline than the 31 stages in Netburst processors, much closer to the current Pentium M and Core Duo CPUs, as expected.
Micro-fusion: Known as micro-ops fusion in the Pentium M, this allows the processor to fuse together certain types of internal “micro-ops” instructions (behind the CPU’s instruction decoder, in the RISC core) and execute them as one for more performance per clock.
Macro-fusion: This is a new one, but wasn’t explained in great detail. Presumably, the CPU will be able to fuse together certain x86 ISA instructions a la micro-ops fusion. The example given was the fusion of the compare and jump instructions.
Single-cycle execution of 128-bit SSE: Core processors will execute the entire family of 128-bit SSE instructions in a single cycle, for what Intel is calling a boost in digital media performance. Obviously, higher performance per clock in SSE instructions will accelerate a range of applications.
Shared on-chip L2 cache: The dual-core Core (ack!) processors will feature a single, unified L2 cache that should allow for efficient sharing of data between the processor cores, with no need for cache coherency protocol traffic between the cores to cross the external bus. Rattner said that there would be no partitioning of the L2 cache between cores, and in the event that one core should shut itself down to save power during a period of inactivity, the other core could make use of the full L2 cache if needed.
Smarter memory access: This one seems to come around every time Intel revises its CPU, but Core will indeed include new cache prefetch algorithms, which are probably necessary for best results given the move to a shared L2 cache. Also, as we learned at the last IDF, Core will have a feature called memory disambiguation that attempts to opportunistically reorder memory loads and stores when possible in order to lower effective access latencies.
Advanced power gating: Clock gating shuts down logic on the chip when it’s not needed. In his keynote speech, Intel’s Pat Gelsinger described the Core architecture’s clock gating as “super-fine grained.”
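Since macro-fusion wasn't explained in detail, here is a toy sketch of the general idea, based only on the compare-and-jump example given. It models instructions as simple tuples and merges each compare that is immediately followed by a conditional jump into one fused operation. The function name, the tuple format, and the list of jump mnemonics are all assumptions made for illustration; this is not a description of how Intel's decoder actually works.

```python
# Toy model of macro-fusion. Hypothetical names throughout; real hardware
# does this in the decode stage, not in software.
COMPARES = {"cmp"}
COND_JUMPS = {"je", "jne", "jl", "jg"}  # illustrative subset, an assumption

def macro_fuse(instructions):
    """Merge each compare that is immediately followed by a conditional jump."""
    fused = []
    i = 0
    while i < len(instructions):
        op = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if op[0] in COMPARES and nxt is not None and nxt[0] in COND_JUMPS:
            # The pair travels down the pipeline as a single operation.
            fused.append(("fused", op, nxt))
            i += 2
        else:
            fused.append(op)
            i += 1
    return fused

program = [
    ("mov", "eax", "[esi]"),
    ("cmp", "eax", "ebx"),
    ("jne", "loop_top"),
    ("add", "ecx", "1"),
]
fused_program = macro_fuse(program)
print(len(program), "instructions ->", len(fused_program), "operations")
```

In this sketch the four-instruction sequence becomes three operations, which is the kind of saving that would let a four-issue-wide pipeline get more x86 instructions through per clock.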