ARM Cortex-A78

The ARM Cortex-A78 is a high-performance, power-efficient central processing unit (CPU) core developed by Arm as part of its DynamIQ shared-memory multi-processing technology. It implements the 64-bit Armv8.2-A architecture with extensions for enhanced scalar, vector, and floating-point processing. Designed primarily for premium mobile devices, laptops, and emerging form factors like foldables, it supports up to four cores per DynamIQ cluster and features configurable L1 instruction and data caches of 32 KB to 64 KB each, along with a private L2 cache of 256 KB to 512 KB per core. Announced in May 2020, the Cortex-A78 emphasizes sustained performance for demanding workloads such as gaming, extended reality (XR), and machine learning (ML), delivering up to 20% better sustained performance than its predecessor, the Cortex-A77, within the same mobile thermal power envelope.

Key architectural enhancements in the Cortex-A78 include improved branch prediction, instruction fusion, and load/store optimizations, resulting in 7% higher single-threaded performance and 4% lower power consumption per performance point compared to the Cortex-A77. It also uses 8% less power for ML-based tasks, contributing to 10% overall efficiency gains in AI-driven applications and features. Supporting 40-bit physical addressing and AMBA interfaces for coherent memory access, the core pairs with little cores such as the Cortex-A55 in heterogeneous big.LITTLE configurations, enabling multi-day battery life in 5G-enabled smartphones and tablets.

The Cortex-A78 powers flagship systems-on-chip (SoCs) from vendors like Qualcomm, MediaTek, and Samsung, bridging the performance gap between mobile and laptop computing, with clock speeds of up to 3 GHz in optimized designs. Variants include the Cortex-A78AE for safety-critical automotive and industrial applications, offering lock-step dual-core redundancy and 48-bit addressing, and the Cortex-A78C for client devices such as always-on laptops, supporting up to eight cores with 8 MB of shared L3 cache and frequencies up to 3.3 GHz. These adaptations highlight its versatility in delivering immersive digital experiences while prioritizing energy efficiency.

Introduction

Overview

The ARM Cortex-A78 is a 64-bit CPU core compatible with the Armv8.2-A architecture, designed to deliver high performance with power efficiency optimized for demanding applications. As the fourth-generation premium core in Arm's DynamIQ lineup, it builds on the Austin family microarchitecture to improve sustained performance within the thermal constraints typical of mobile devices. The core targets flagship smartphones, tablets, mainstream laptops, and 5G-enabled devices that support immersive experiences such as extended reality (XR), ML tasks, and multi-screen interactions. It plays a central role in the ecosystem by bridging performance gaps between mobile and laptop computing, facilitating innovations in foldable devices and energy-efficient architectures.

The Cortex-A78 is configurable for deployment in clusters of 1 to 4 cores within Arm's DynamIQ technology, and is compatible with big.LITTLE heterogeneous architectures that pair high-performance "big" cores with efficient "LITTLE" cores like the Cortex-A55. This flexibility allows system designers to balance performance, power, and area across diverse device form factors.

Development and Announcement

The ARM Cortex-A78 was officially announced on May 26, 2020, during Arm Tech Day 2020, as part of the company's 2020 mobile IP portfolio reveal. The event highlighted the core's role in advancing mobile capabilities amid the rise of 5G technologies. The design originated from Arm's facility in Austin, Texas, where the team focused on creating a processor that could support evolving device form factors.

Development of the Cortex-A78 was driven by the need to close the performance divide between mobile devices and laptops, while emphasizing power efficiency to accommodate power-hungry 5G applications and enable multi-day battery life in premium smartphones and emerging devices like foldables. Positioned as the direct successor to the Cortex-A77, it incorporated refinements aimed at mitigating the thermal throttling observed in sustained workloads on its predecessor, such as prolonged video playback or multitasking. These goals targeted a 20% uplift in sustained performance within the same thermal power envelope as the A77.

The Cortex-A78 became available to licensees in late 2020, allowing partners to integrate the core into their system-on-chip designs. The first commercial implementations appeared in SoCs for 2021 devices, powering flagship mobile products from several manufacturers.

Architecture

ISA and Extensions

The ARM Cortex-A78 implements the Armv8.2-A instruction set architecture (ISA), operating primarily in the 64-bit AArch64 execution state for high-performance applications. It maintains backward compatibility with the 32-bit AArch32 execution state, supporting the Thumb-2 instruction set at Exception Level 0 (EL0) only, so that legacy 32-bit software can still run in user mode. This base provides separate instruction and data paths, ensuring efficient handling of modern 64-bit workloads while preserving compatibility for mixed-mode environments.

Key extensions in the Cortex-A78 enhance its capabilities for specific workloads. These include the dot product instructions from Armv8.4-A, which accelerate ML operations by enabling efficient 8-bit integer (INT8) dot product computations on vectors. The core also supports the half-precision floating-point (FP16) extension from Armv8.2-A, allowing half-precision operations in Advanced SIMD (NEON) for improved performance in graphics and AI tasks without the overhead of wider floating-point formats. Additionally, the core includes Armv8.1-A and Armv8.2-A extensions, including atomic operations, along with partial Armv8.3-A support limited to the Load-Acquire RCpc (LDAPR) instructions. It does not implement the Scalable Vector Extension (SVE) or SVE2, which arrive in subsequent cores like the Cortex-A710. Cryptographic extensions for AES and SHA processing are optional but commonly integrated to bolster data security in system-on-chip designs.

Security features are integral to the Cortex-A78, with full support for Arm TrustZone technology to enable secure isolation between normal and secure worlds for trusted execution environments. The core does not support Armv8.3-A Pointer Authentication (PAC) or Armv8.5-A Branch Target Identification (BTI), nor features adopted by Armv9-A cores such as the Memory Tagging Extension (MTE) or SVE2, positioning it firmly within the Armv8-A ecosystem.
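
To make the INT8 dot product extension more concrete, the following is a minimal Python sketch of what a signed dot-product-accumulate does on one 128-bit vector: sixteen 8-bit pairs are multiplied, and each group of four products is added into one of four 32-bit accumulator lanes. The function names (`sdot_lane`, `sdot_128`) are hypothetical and chosen for illustration; this models the arithmetic only and is not how the hardware or any Arm library is implemented.

```python
def _to_int32(value: int) -> int:
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def sdot_lane(acc: int, a: list[int], b: list[int]) -> int:
    """Model one 32-bit lane: acc + sum of four signed 8-bit products."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("each lane consumes exactly four 8-bit pairs")
    for x in a + b:
        if not -128 <= x <= 127:
            raise ValueError("inputs must fit in a signed 8-bit integer")
    # The accumulator lane wraps on overflow rather than saturating.
    return _to_int32(acc + sum(x * y for x, y in zip(a, b)))


def sdot_128(acc: list[int], a: list[int], b: list[int]) -> list[int]:
    """Model a 128-bit vector: 16 INT8 pairs feed four 32-bit lanes."""
    if len(acc) != 4 or len(a) != 16 or len(b) != 16:
        raise ValueError("expected 4 accumulators and 16 elements per input")
    return [
        sdot_lane(acc[i], a[4 * i:4 * i + 4], b[4 * i:4 * i + 4])
        for i in range(4)
    ]


if __name__ == "__main__":
    print(sdot_128([0, 0, 0, 0], list(range(16)), [1] * 16))
```

A single instruction of this kind replaces a sequence of widening multiplies and adds, which is why it matters for quantized ML inference kernels.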

Microarchitecture Details

The ARM Cortex-A78 employs an out-of-order execution microarchitecture, enabling efficient handling of instruction dependencies while maintaining a balance between performance and power consumption. Instructions are fetched and decoded into macro-operations (MOPs), which may be fused for optimization, before being split into micro-operations (μOPs) for dispatch to the execution backend. This design supports up to 6 MOPs dispatched per cycle, with a maximum of 12 μOPs per cycle under certain constraints. The integer pipeline comprises 13 stages, facilitating high throughput while minimizing latency for common operations. Load-to-use latency is 4 cycles for L1 hits, with dual-issue support for loads and stores to enhance memory access efficiency. The core does not implement simultaneous multithreading (SMT); each core runs a single hardware thread, prioritizing power efficiency.

Key execution units include three integer arithmetic logic units (ALUs) for single-cycle operations, two dedicated branch units, and two load/store units capable of performing two 16-byte loads or one 32-byte store per cycle. The floating-point and Advanced SIMD unit provides 128-bit vector processing support, with two pipelines (V0 and V1) for scalar and SIMD instructions, enabling efficient execution of multimedia and ML workloads. Branch prediction uses a mechanism combining TAGE and gshare predictors, offering improved accuracy and bandwidth (up to two taken branches per cycle) over previous generations to reduce misprediction penalties and boost overall throughput. The core is configurable for area optimization, particularly targeting 5 nm nodes, allowing licensees to balance die size, power, and performance through options like cache sizing and extension inclusion while adhering to Armv8.2-A compatibility.
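
The per-cycle figures above translate into theoretical peak rates once a clock frequency is chosen. The sketch below works through that arithmetic using the numbers stated in this section (two 16-byte loads or one 32-byte store per cycle, 6 MOPs dispatched per cycle). The 3.0 GHz clock is an assumption taken from the upper end of shipping designs, and the function name is hypothetical. These are upper bounds, not measured throughput.

```python
# Per-cycle figures as stated in the text above.
LOAD_BYTES_PER_CYCLE = 2 * 16   # two 16-byte loads
STORE_BYTES_PER_CYCLE = 32      # one 32-byte store
DISPATCH_MOPS_PER_CYCLE = 6     # macro-operations dispatched


def peak_rates(clock_ghz: float) -> dict[str, float]:
    """Return theoretical peak rates for a given core clock in GHz."""
    if clock_ghz <= 0:
        raise ValueError("clock must be positive")
    clock_hz = clock_ghz * 1e9
    return {
        # Bytes per second, reported in GB/s (10**9 bytes).
        "l1_load_gb_per_s": LOAD_BYTES_PER_CYCLE * clock_hz / 1e9,
        "l1_store_gb_per_s": STORE_BYTES_PER_CYCLE * clock_hz / 1e9,
        # Macro-operations per second, reported in billions.
        "dispatch_gmops_per_s": DISPATCH_MOPS_PER_CYCLE * clock_hz / 1e9,
    }


if __name__ == "__main__":
    for name, value in peak_rates(3.0).items():
        print(f"{name}: {value:.1f}")
```

Real workloads fall well short of these ceilings because of cache misses, dependencies, and branch mispredictions, which is why the 4-cycle load-to-use latency and predictor accuracy matter as much as raw width.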

Performance and Power Efficiency

Improvements over Predecessors

The ARM Cortex-A78 delivers a 20% increase in sustained performance over its predecessor, the Cortex-A77, within the same thermal envelope of 1 W per core, as evidenced by SPECint2006 benchmark scores where the A78 achieves higher throughput without exceeding thermal limits. This uplift addresses limitations in prior cores by prioritizing longevity under load, enabling more consistent operation in demanding scenarios. Additionally, the core provides a 7% gain in peak single-threaded performance through refined microarchitectural tweaks, allowing brief bursts of higher speed before thermal constraints apply.

Efficiency advancements are particularly notable, with up to 50% power reduction compared to 2019 devices delivering equivalent Cortex-A77 performance levels, achieved through optimizations that minimize power draw across workloads. In ML tasks specifically, the A78 consumes 8% less power than the A77 for the same output, contributing to an overall 10% efficiency improvement that reduces throttling and extends battery life in multi-day usage patterns.

These gains stem from broader architectural refinements, including wider execution via an expanded dispatch width, enhanced branch prediction with greater accuracy and bandwidth to cut stalls, and an optimized dispatch queue that sustains high instruction throughput without the rapid drops seen in the A77 under prolonged stress. Building on the microarchitectural lineage of the Cortex-A76 and A77 family (rooted in Armv8.2-A), the A78 refines these foundations for 5G-era demands, such as high-resolution video streaming and similarly demanding applications, by balancing peak capabilities with sustained efficiency to support immersive, always-on experiences. This evolution maintains compatibility with DynamIQ clustering while introducing targeted enhancements that better handle the irregular workloads common in modern mobile ecosystems.
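
The percentage claims in this section describe different operating points, so it can help to express each as a performance-per-watt ratio relative to the baseline. The sketch below does that arithmetic; the function name and the dictionary labels are hypothetical, and the inputs are simply the vendor figures quoted above rather than independent measurements. Note that the 50% figure compares against 2019 devices, so it also reflects process-node differences.

```python
def perf_per_watt_ratio(perf_ratio: float, power_ratio: float) -> float:
    """Relative efficiency: (new perf / old perf) / (new power / old power)."""
    if perf_ratio <= 0 or power_ratio <= 0:
        raise ValueError("ratios must be positive")
    return perf_ratio / power_ratio


# (performance ratio, power ratio) relative to the baseline in each claim.
CLAIMS = {
    "sustained: +20% performance at the same power": (1.20, 1.00),
    "iso-performance: same performance at 50% less power": (1.00, 0.50),
    "ML: same output at 8% less power": (1.00, 0.92),
}

if __name__ == "__main__":
    for label, (perf, power) in CLAIMS.items():
        print(f"{label}: {perf_per_watt_ratio(perf, power):.2f}x perf/W")
```

Seen this way, the iso-performance point shows the largest efficiency gain, which is consistent with the core's emphasis on sustained rather than peak operation.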

Clock Speeds and Benchmarks

The ARM Cortex-A78 typically operates at clock speeds ranging from 2.4 GHz to 3.0 GHz in shipping implementations, such as the Qualcomm Snapdragon 888, where the three performance cores run at about 2.4 GHz. In tablet and laptop configurations, it can reach up to 3.0 GHz, as seen in MediaTek's Kompanio 1380 SoC, which uses four Cortex-A78 cores clocked at this frequency.

Benchmark results highlight the core's balance of performance and efficiency. In tests on early 2021 implementations like the Snapdragon 888, the SoC achieves a Geekbench 5 single-core score of approximately 1135 and a multi-core score of around 3794 in a 1+3+4 core configuration (one Cortex-X1 prime core, three Cortex-A78 cores, and four Cortex-A55 cores); these are SoC-level figures rather than isolated Cortex-A78 measurements. For integer workloads, the core delivers a SPECint2006 base score of about 30 at 1 W per core, supporting multi-day battery life in mobile devices under typical thermal constraints. Compared to the predecessor Cortex-A77, the A78 exhibits 8% lower power draw for ML inference tasks, contributing to a 10% overall efficiency improvement in heterogeneous big.LITTLE configurations. These results vary by SoC integration, process technology (e.g., 5 nm), and cooling solution, with data drawn primarily from 2021 devices like those powered by the Snapdragon 888.

Variants

Cortex-A78C

The Cortex-A78C is a variant of the Cortex-A78 CPU core, announced by Arm on November 2, 2020, and designed specifically for high-core-count configurations in compute-intensive devices. It builds on the base Cortex-A78 by extending support for larger DynamIQ clusters, enabling up to eight high-performance cores per cluster compared to the standard four-core limit of the Cortex-A78. This variant incorporates a shared 8 MB L3 cache, double the maximum of 4 MB available in the Cortex-A78, to enhance multi-threaded performance in scenarios with heavy core utilization. Cache configurations remain flexible, with L1 instruction and data caches each configurable at 32 KB or 64 KB per core, private L2 caches at 256 KB or 512 KB per core, and the optional L3 scaling up to 8 MB for the cluster. These features, combined with an improved interconnect fabric optimized for larger clusters, deliver better scalability for multi-threaded workloads while preserving the power efficiency of the underlying Cortex-A78 design.

Targeted at always-on laptops and automotive systems, the Cortex-A78C supports demanding applications where sustained multi-core performance is critical. It maintains compatibility with the Mali-G78 GPU, allowing integration into system-on-chip (SoC) designs for these markets.
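
The configuration limits described above can be captured in a small validation sketch. The class name `A78CClusterConfig` and its fields are hypothetical, used only to illustrate the ranges stated in this section (up to eight cores, 32 KB or 64 KB L1 caches, 256 KB or 512 KB L2, optional L3 up to 8 MB). As a simplification, it accepts any whole-megabyte L3 size up to the limit, whereas real designs choose from a fixed set of sizes.

```python
from dataclasses import dataclass

L1_SIZES_KB = (32, 64)
L2_SIZES_KB = (256, 512)
MAX_CORES = 8
MAX_L3_MB = 8


@dataclass(frozen=True)
class A78CClusterConfig:
    """Illustrative cluster description; not an Arm configuration format."""
    cores: int
    l1i_kb: int
    l1d_kb: int
    l2_kb: int
    l3_mb: int = 0  # 0 means the optional L3 is omitted

    def __post_init__(self) -> None:
        if not 1 <= self.cores <= MAX_CORES:
            raise ValueError(f"cores must be 1..{MAX_CORES}")
        if self.l1i_kb not in L1_SIZES_KB or self.l1d_kb not in L1_SIZES_KB:
            raise ValueError(f"L1 caches must be one of {L1_SIZES_KB} KB")
        if self.l2_kb not in L2_SIZES_KB:
            raise ValueError(f"L2 must be one of {L2_SIZES_KB} KB")
        if not 0 <= self.l3_mb <= MAX_L3_MB:
            raise ValueError(f"L3 must be 0..{MAX_L3_MB} MB")

    def total_cache_kb(self) -> int:
        """Sum private caches across all cores plus the shared L3."""
        per_core = self.l1i_kb + self.l1d_kb + self.l2_kb
        return self.cores * per_core + self.l3_mb * 1024


if __name__ == "__main__":
    largest = A78CClusterConfig(cores=8, l1i_kb=64, l1d_kb=64, l2_kb=512, l3_mb=8)
    print(largest.total_cache_kb(), "KB of cache in the largest configuration")
```

Under these assumptions, the largest cluster carries 13 MB of on-chip cache in total, of which 8 MB is the shared L3.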

Cortex-A78AE

The Cortex-A78AE is a variant of the ARM Cortex-A78 processor core, specifically engineered for safety-critical applications in automotive environments. Announced on September 29, 2020, as part of Arm's expanded automotive portfolio, it builds on the Cortex-A78 while incorporating dedicated hardware for functional safety compliance. Key differences from the standard Cortex-A78 include support for ASIL-D certification under the ISO 26262 standard, as well as up to SIL 3 under IEC 61508, achieved through features like dual-core lock-step execution modes and integrated safety mechanisms in the execution units. In lock mode, pairs of cores run identical workloads in lock-step, comparing outputs to detect faults and ensure deterministic behavior, while hybrid modes allow flexible allocation of safety levels without sacrificing performance. These enhancements enable extended diagnostics, such as parity checks on instruction caches and ECC on data caches, alongside safety wrappers that monitor and isolate potential errors. The core maintains the performance profile of the Cortex-A78, offering approximately 30% higher single-threaded performance compared to its predecessor, the Cortex-A76AE, at equivalent power efficiency.

Targeted at advanced driver-assistance systems (ADAS), in-vehicle infotainment, and autonomous driving platforms, the Cortex-A78AE supports configurations of up to four cores per cluster, organized as two dual-core pairs. It features cache structures similar to the base Cortex-A78 (64 KB L1 instruction cache with parity, 64 KB L1 data cache with ECC, and 512 KB private L2 cache per core), augmented with safety-specific monitoring to meet automotive reliability requirements. This design facilitates deployment in domain controllers and software-defined vehicles, where high reliability is paramount, without compromising the performance inherited from the Cortex-A78 family.
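
The lock-step idea can be illustrated in software, with the caveat that the real mechanism is implemented in hardware and compares core outputs cycle by cycle. In the toy sketch below, the same workload is evaluated twice and the results are compared; a mismatch is treated as a detected fault. All names (`run_in_lockstep`, `LockstepMismatch`, `fault_injector`, `checksum`) are hypothetical and exist only for this illustration.

```python
from typing import Callable, Optional


class LockstepMismatch(Exception):
    """Raised when the two redundant executions disagree."""


def run_in_lockstep(
    work: Callable[[int], int],
    value: int,
    fault_injector: Optional[Callable[[int], int]] = None,
) -> int:
    """Run `work` twice and return the result only if both runs agree."""
    primary = work(value)
    shadow = work(value)
    if fault_injector is not None:
        # Simulate a transient fault corrupting the redundant result.
        shadow = fault_injector(shadow)
    if primary != shadow:
        raise LockstepMismatch(f"primary={primary} shadow={shadow}")
    return primary


def checksum(value: int) -> int:
    """Example deterministic workload used for the demonstration."""
    return (value * 31 + 7) % 251


if __name__ == "__main__":
    print("fault-free result:", run_in_lockstep(checksum, 42))
    try:
        run_in_lockstep(checksum, 42, fault_injector=lambda r: r ^ 1)
    except LockstepMismatch as error:
        print("fault detected:", error)
```

The trade-off mirrors the hardware design: redundancy halves the usable core count in exchange for fault detection, which is why modes that relax lock-step for less critical work are useful.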

Licensing and Implementations

Licensing

The ARM Cortex-A78 is offered as a synthesizable intellectual property (IP) core, enabling partners to integrate it into custom system-on-chips (SoCs) under Arm's Core License agreements, which grant rights to use the pre-designed processor without the broader design freedoms of an Architectural License. This model supports scalability across high-performance mobile and embedded applications while maintaining compatibility with the Armv8.2-A architecture. Designed for integration within Arm's DynamIQ technology, the Cortex-A78 pairs with complementary IP blocks such as Mali GPUs for graphics processing, Ethos NPUs for ML acceleration, and display controllers, with CPU cores connected via the DynamIQ Shared Unit (DSU) to form heterogeneous clusters. The DSU provides the shared L3 cache and interconnect, allowing customization during RTL synthesis to optimize for specific area, power, and performance targets.

Licensing involves substantial upfront fees, coupled with per-unit royalties on manufactured and shipped chips, negotiated based on volume and integration scope. There are no open-source releases or free licensing tiers available for the Cortex-A78, restricting access to qualified commercial partners. The agreements are non-exclusive, permitting multiple licensees to implement the core independently, though access to certain proprietary implementation details may require non-disclosure agreements (NDAs) beyond the publicly available Technical Reference Manual. The core became available to partners in late 2020, following its announcement earlier that year.

SoC Integrations and Devices

The ARM Cortex-A78 core has been integrated into numerous high-performance system-on-chip (SoC) designs, primarily for premium mobile devices, where it is paired in big.LITTLE configurations with efficiency cores like the Cortex-A55. Key implementations include Qualcomm's Snapdragon 888, announced in December 2020, which features one Cortex-X1 prime core at 2.84 GHz, three Cortex-A78 performance cores at 2.42 GHz, and four Cortex-A55 efficiency cores at 1.8 GHz, fabricated on a 5 nm process. Similarly, MediaTek's Dimensity 1200, launched in January 2021, employs one Cortex-A78 core at 3.0 GHz, three at 2.6 GHz, and four Cortex-A55 cores at 2.0 GHz on a 6 nm node, targeting mid-to-high-end smartphones. Samsung's Exynos 1080, introduced in November 2020, uses one Cortex-A78 core at 2.8 GHz, three at 2.6 GHz, and four Cortex-A55 cores at 2.0 GHz, also on 5 nm, marking an early adoption for balanced performance. The Exynos 2100, which debuted in January 2021 for flagship devices, mirrors the Snapdragon 888's structure with one Cortex-X1 at 2.91 GHz, three Cortex-A78 at 2.81 GHz, and four Cortex-A55 at 2.2 GHz on 5 nm. Other notable integrations include UNISOC's T820 series (launched in 2022), featuring up to four Cortex-A78 cores at 2.7 GHz for mid-range devices.

These SoCs have powered a range of consumer devices, particularly smartphones in the premium segment. Notable examples include the Samsung Galaxy S21 series, which used either the Snapdragon 888 or the Exynos 2100 depending on region, delivering enhanced connectivity and multitasking capabilities. Devices with the MediaTek Dimensity 1200, such as the Realme GT Neo, OnePlus Nord 2 5G, and Vivo V23 Pro, emphasized fast charging and camera performance in mid-range flagships. Beyond phones, the Cortex-A78 appears in tablets and laptops via MediaTek's Kompanio series, such as the Kompanio 1380 (four Cortex-A78 cores at up to 3.0 GHz), integrated into Chromebooks such as the Acer Chromebook Spin 513 for efficient web-based computing.

Typical cluster configurations pair 1-4 Cortex-A78 cores for high-performance tasks with 4-6 Cortex-A55 cores for background efficiency, enabling sustained operation in power-constrained environments. These were among the first widespread 5 nm implementations in 2021 flagships, contributing to better thermal management and battery life. By 2023, the core began phasing out in favor of successors like the Cortex-A710 in new designs, though it persists in some embedded and automotive applications for its proven reliability.
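
The core arrangements listed in this section can be written down as a small data structure, which makes the shorthand "1+3+4" notation and per-SoC Cortex-A78 counts easy to derive. The sketch below is illustrative only: the `CoreGroup` type and helper names are hypothetical, and the figures are the ones quoted in this section for three of the SoCs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreGroup:
    core: str       # core design name
    count: int      # number of cores in this group
    max_ghz: float  # maximum clock quoted for the group


# Configurations as quoted in the text above.
SOCS: dict[str, tuple[CoreGroup, ...]] = {
    "Snapdragon 888": (
        CoreGroup("Cortex-X1", 1, 2.84),
        CoreGroup("Cortex-A78", 3, 2.42),
        CoreGroup("Cortex-A55", 4, 1.8),
    ),
    "Dimensity 1200": (
        CoreGroup("Cortex-A78", 1, 3.0),
        CoreGroup("Cortex-A78", 3, 2.6),
        CoreGroup("Cortex-A55", 4, 2.0),
    ),
    "Exynos 2100": (
        CoreGroup("Cortex-X1", 1, 2.91),
        CoreGroup("Cortex-A78", 3, 2.81),
        CoreGroup("Cortex-A55", 4, 2.2),
    ),
}


def layout(groups: tuple[CoreGroup, ...]) -> str:
    """Return the shorthand cluster layout, e.g. '1+3+4'."""
    return "+".join(str(group.count) for group in groups)


def a78_count(groups: tuple[CoreGroup, ...]) -> int:
    """Count Cortex-A78 cores across all groups of an SoC."""
    return sum(g.count for g in groups if g.core == "Cortex-A78")


if __name__ == "__main__":
    for name, groups in SOCS.items():
        print(f"{name}: {layout(groups)}, {a78_count(groups)}x Cortex-A78")
```

The comparison highlights a design split: some vendors use the Cortex-A78 as the middle tier beneath a Cortex-X1 prime core, while others clock one Cortex-A78 higher and use it as the prime core itself.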
