Armv9 architecture promises 30 percent higher performance

Posted on Wednesday, March 31 2021 @ 12:42 CEST by Thomas De Maesschalck

Big news from Arm as the company presented technical details about the Armv9 architecture. It's the biggest update since the introduction of Armv8 in 2011. Armv9 will be fully backward compatible and will focus on AI, security, digital signal processing, and specialized computing applications. The chip designer expects the next two chip generations based on Armv9 to deliver performance increases of over 30 percent.

Arm-based chips are used far more widely than x86 processors, and the current Armv8 architecture has found its way into 100 billion chips! Arm expects Armv9 will end up in at least 300 billion devices before it moves on to Armv10. AnandTech discusses the Armv9 architecture in depth over here.

ARMv9

Today, Arm introduced the Arm®v9 architecture in response to the global demand for ubiquitous specialized processing with increasingly capable security and artificial intelligence (AI). Armv9 is the first new Arm architecture in a decade, building on the success of Armv8 which today drives the best performance-per-watt everywhere computing happens.

“As we look toward a future that will be defined by AI, we must lay a foundation of leading-edge compute that will be ready to address the unique challenges to come,” said Simon Segars, chief executive officer, Arm. “Armv9 is the answer. It will be at the forefront of the next 300 billion Arm-based chips driven by the demand for pervasive specialized, secure and powerful processing built on the economics, design freedom and accessibility of general-purpose compute.”

The number of Arm-based chips shipped continues to accelerate, with more than 100 billion devices shipped over the last five years. At the current rate, 100 percent of the world’s shared data will soon be processed on Arm, whether at the endpoint, in the data networks or the cloud. Such pervasiveness confers a responsibility on Arm to deliver more security and performance, along with other new features in Armv9. The new capabilities in Armv9 will accelerate the move from general-purpose to more specialized compute across every application as AI, the Internet of Things (IoT) and 5G gain momentum globally.

To address the greatest technology challenge today – securing the world’s data – the Armv9 roadmap introduces the Arm Confidential Compute Architecture (CCA). Confidential computing shields portions of code and data from access or modification while in-use, even from privileged software, by performing computation in a hardware-based secure environment.

The Arm CCA will introduce the concept of dynamically created Realms, useable by all applications, in a region that is separate from both the secure and non-secure worlds. For example, in business applications, Realms can protect commercially sensitive data and code from the rest of the system while it is in-use, at rest, and in transit. In a recent Pulse survey of enterprise executives, more than 90% of the respondents believe that if Confidential Computing were available, the cost of security could come down, enabling them to dramatically increase their investment in engineering innovation.

"The increasing complexity of use cases from edge to cloud cannot be addressed with a one-size-fits-all solution," said Henry Sanders, corporate vice president and chief technology officer, Azure Edge and Platforms at Microsoft. "As a result, heterogeneous compute is becoming more ubiquitous, requiring greater synergy among hardware and software developers. A good example of this synergy between hardware and software are the Armv9 confidential compute features which were developed in close collaboration with Microsoft. Arm is in a unique position to accelerate heterogeneous computing at the heart of an ecosystem, fostering open innovation on an architecture powering billions of devices."

AI everywhere demands specialized, scalable solutions
The ubiquity and range of AI workloads demand more diverse and specialized solutions. For example, it is estimated there will be more than eight billion AI-enabled voice-assisted devices in use by the mid-2020s, and 90 percent or more of on-device applications will contain AI elements along with AI-based interfaces like vision or voice.

To address this need, Arm partnered with Fujitsu to create the Scalable Vector Extension (SVE) technology, which is at the heart of Fugaku, the world’s fastest supercomputer. Building on that work, Arm has developed SVE2 for Armv9 to enable enhanced machine learning (ML) and digital signal processing (DSP) capabilities across a wider range of applications.

SVE2 enhances the processing ability of 5G systems, virtual and augmented reality, and ML workloads running locally on CPUs, such as image processing and smart home applications. Over the next few years, Arm will further extend the AI capabilities of its technology with substantial enhancements in matrix multiplication within the CPU, in addition to ongoing AI innovations in its Mali™ GPUs and Ethos™ NPUs.

Maximizing performance through system design
Over the past five years, Arm designs have increased CPU performance annually at a rate that outpaces the industry. Arm will continue this momentum into the Armv9 generation with expected CPU performance increases of more than 30% over the next two generations of mobile and infrastructure CPUs.

However, as the industry moves from general-purpose computing towards ubiquitous specialized processing, annual double-digit CPU performance gains are not enough. Along with enhancing specialized processing, Arm’s Total Compute design methodology will accelerate overall compute performance through focused system-level hardware and software optimizations and increases in use-case performance.

By applying Total Compute design principles across its entire IP portfolio of automotive, client, infrastructure and IoT solutions, Armv9 system-level technologies will span the entire IP solution, as well as improving individual IP. Additionally, Arm is developing several technologies to increase frequency, bandwidth, and cache size, and reduce memory latency to maximize the performance of Armv9-based CPUs.

A unique vision for the next decade of computing
“Addressing the demand for more complex AI-based workloads is driving the need for more secure and specialized processing, which will be the key to unlocking new markets and opportunities,” said Richard Grisenthwaite, SVP, chief architect and fellow, Arm. “Armv9 will enable developers to build and program the trusted compute platforms of tomorrow by bridging critical gaps between hardware and software, while enabling the standardization to help our partners balance faster time-to-market and cost control alongside the ability to create their own unique solutions.”
