Fact-checked by Grok 2 weeks ago

ARM big.LITTLE

ARM big.LITTLE is a heterogeneous computing architecture developed by Arm Holdings that integrates high-performance "big" processor cores with energy-efficient "LITTLE" cores within the same system-on-chip (SoC), enabling dynamic task allocation to balance peak performance demands with power consumption for mobile, embedded, and computing devices.^[1] Introduced in October 2011, the technology was designed to address the growing need for increased computational power in battery-constrained environments like smartphones and tablets, where traditional symmetric multiprocessing (SMP) architectures struggled to deliver both efficiency and speed.^[2] The first implementation paired the high-performance Arm Cortex-A15 big cores with the efficient Cortex-A7 LITTLE cores, both supporting the Armv7-A instruction set architecture (ISA) for seamless software compatibility and cache coherence via interconnects like the CCI-400.^[3] At its core, big.LITTLE operates through software-managed switching or scheduling: in the initial CPU Migration model, tasks migrate between big and LITTLE cores based on workload intensity, while later Global Task Scheduling (big.LITTLE MP) allows simultaneous execution across mixed cores for multithreaded efficiency.^[3] This approach yields significant benefits, including up to 76% CPU power savings for low-intensity tasks like audio playback and web scrolling, and up to 50% performance uplift in multi-threaded benchmarks compared to homogeneous big-core systems.^[3] The first commercial product featuring big.LITTLE was the Samsung Galaxy S4 smartphone, released in April 2013 with the Exynos 5 Octa SoC, marking a milestone in mobile computing by popularizing heterogeneous processing.^[4] Over time, big.LITTLE evolved with subsequent Arm Cortex-A generations, such as the 64-bit Cortex-A57 big and Cortex-A53 LITTLE pair announced in 2012, extending the architecture to support Armv8-A for enhanced security and efficiency.^[3] In 2017, Arm introduced DynamIQ, an advanced iteration that redefines big.LITTLE by allowing flexible mixing of up to three core types (big, LITTLE, and mid-range) in a single, dynamically scalable cluster, improving power management, thermal control, and integration with accelerators like GPUs and AI engines.^[5] Today, big.LITTLE and its DynamIQ successor underpin billions of devices worldwide, from smartphones to servers, driving innovations in 5G, AI, and edge computing while maintaining backward compatibility with the Arm ecosystem.^[1]

Overview

Definition and Core Concept

ARM big.LITTLE is a heterogeneous multi-core processor architecture developed by ARM Holdings that combines high-performance "big" cores, such as the Cortex-A15 and Cortex-A57, with energy-efficient "LITTLE" cores, exemplified by the Cortex-A7 and Cortex-A53, integrated on the same silicon die to dynamically manage varying computational workloads.^[3] This design addresses the power-performance trade-off inherent in mobile and embedded systems by allocating tasks to the appropriate core type based on demand.^[1] Initially revealed by ARM in October 2011 alongside the Cortex-A7 processor, big.LITTLE marked a pivotal advancement in processor heterogeneity for consumer devices.^[6] At its core, big.LITTLE enables seamless migration of threads or tasks between big and LITTLE cores, optimizing for either sustained high performance during intensive operations or maximal power efficiency during idle or light usage, while accommodating configurations with multiple clusters of each core type for scalability.^[7] This approach allows systems to deliver responsive user experiences without excessive energy consumption, supporting the evolving demands of smartphones, tablets, and other battery-constrained platforms.^[1] The technical foundation of big.LITTLE rests on the ARMv7-A and ARMv8-A instruction set architectures, which provide the basis for both core families and ensure full binary compatibility, meaning software binaries execute identically across big and LITTLE processors without modification or recompilation.^[7] This compatibility simplifies development and deployment, enabling unmodified applications to leverage the heterogeneous resources transparently through operating system scheduling.^[3]

History and Development

ARM big.LITTLE was conceived in the late 2000s by ARM Holdings as a response to the growing tension between escalating performance requirements and stringent battery life constraints in emerging smartphones. Following the 2007 launch of the iPhone, which accelerated demand for power-efficient yet capable mobile processors, ARM's architecture team recognized the limitations of traditional homogeneous multicore designs in balancing peak computational needs with everyday efficiency. This led to the development of a heterogeneous approach pairing high-performance "big" cores with energy-efficient "LITTLE" cores, drawing on ARM's longstanding RISC heritage originating from Acorn Computers in the 1980s.^[6]^[8] The first prototype of big.LITTLE was demonstrated in October 2011, showcasing seamless switching between Cortex-A15 big cores and Cortex-A7 LITTLE cores running Android. This prototype highlighted the technology's potential for dynamic workload management without software overhead. The same month, at ARM TechCon 2011, big.LITTLE was publicly unveiled alongside the Cortex-A7 processor, marking a pivotal announcement that positioned it as a foundational innovation for future mobile computing. Led by ARM's CPU architecture team, the project built on prior multicore advancements like the Cortex-A9 to address the evolving mobile ecosystem.^[9]^[10]^[6] Key commercialization began in 2013 with Samsung's Exynos 5 Octa (Exynos 5410), the industry's first production system-on-chip implementing big.LITTLE with four Cortex-A15 big cores and four Cortex-A7 LITTLE cores, debuting in devices like the Galaxy S4.^[11] Expansion to 64-bit processing followed in 2014, when TSMC and ARM announced the first FinFET-based silicon using 64-bit ARMv8 big.LITTLE configurations, incorporating Cortex-A57 and Cortex-A53 cores for enhanced addressability and efficiency in high-end applications.^[12] Adoption surged through the mid-2010s, with big.LITTLE becoming integral to over a dozen partner designs by 2013 and rapidly establishing itself as the dominant heterogeneous architecture in premium mobile SoCs. By 2015, it powered a majority of high-end Android smartphones, enabling better power-performance trade-offs amid rising app complexity. Usage peaked around 2017-2018, as evidenced by its prevalence in flagship devices before gradual evolution toward successors like DynamIQ, which built upon its clustered model for greater flexibility.^[13]^[1]^[14]

Motivations

Power-Performance Trade-off in Mobile Computing

In mobile computing, high-performance processor cores provide superior computational speed for demanding tasks such as gaming and video processing, but they consume significantly more power and generate excess heat, which accelerates battery drain and necessitates thermal throttling to prevent device overheating.^[15] In contrast, efficiency-focused cores prioritize low power consumption to extend battery life during lighter operations, though they offer limited throughput for peak workloads, potentially resulting in slower response times for intensive applications.^[16] This inherent trade-off underscores the challenge of balancing responsiveness with energy sustainability in battery-powered devices.^[13] Smartphones and tablets exhibit highly variable workloads, ranging from idle states and background activities like email checking to bursts of high-intensity use such as web rendering or augmented reality applications, demanding adaptive resource scaling to maintain usability without rapid battery depletion.^[17] Prior to heterogeneous designs like big.LITTLE, mobile devices relying on uniform high-performance cores often underperformed during light tasks due to inefficiency or exhausted batteries quickly under sustained loads, while uniform low-power configurations failed to handle compute-intensive scenarios adequately.^[15] ARM's early benchmarks from 2011-2013 illustrate this dynamic: the high-performance Cortex-A15 core delivered 2 to 3 times the single-threaded performance of the efficiency-oriented Cortex-A7 but required approximately 4 to 5 times the power and die area under comparable light-load conditions, highlighting the inefficiency of deploying big cores for routine operations.^[15] LITTLE cores, by contrast, achieved up to 3 times greater energy efficiency than big cores for modest tasks, enabling substantial overall power reductions when workloads aligned appropriately.^[13] The proliferation of multicore system-on-chips (SoCs) in the early 2010s, driven by surging smartphone shipments from under 200 million units in 2009 to over 1 billion by 2013, amplified these pressures as devices incorporated more processing capability without proportional gains in battery technology.^[18] Lithium-ion batteries, dominant in mobile devices during this period, faced inherent limitations in energy density scaling—improving only modestly at around 5-8% annually—failing to keep pace with escalating CPU power demands from multicore architectures and richer applications.^[19] This mismatch necessitated innovative architectures to optimize power delivery for diverse usage patterns while adhering to fixed battery capacities.^[20]

Limitations of Homogeneous Architectures

Homogeneous multi-core architectures, where all processor cores are identical in design and capabilities, inherently struggle to balance power efficiency and performance in mobile computing environments. In such systems, deploying only high-performance cores results in excessive energy consumption during light workloads, as these cores draw significant power even for simple tasks like background processes or user interface updates. Conversely, using solely low-power cores leads to performance bottlenecks when handling compute-intensive applications, such as video rendering or gaming, failing to meet user expectations for responsiveness. This lack of fine-grained adaptation prevents optimal resource utilization across diverse workloads, exacerbating battery drain and thermal constraints in battery-limited devices.^[21]^[8] A notable example of these issues is seen in early ARM-based homogeneous processors like the Cortex-A9, widely used in pre-big.LITTLE smartphones. To accommodate occasional performance bursts, these systems often relied on dynamic voltage and frequency scaling (DVFS), which increased power draw and accelerated battery depletion during sustained operation, while still leaving cores inefficient for mixed-use scenarios. Symmetric multiprocessing (SMP) in these designs further compounded the problem by treating all cores uniformly, without accounting for workload-specific efficiency needs, leading to suboptimal energy use in mobile symmetric setups.^[22]^[21] Performance gaps in homogeneous architectures are particularly evident during mixed workloads typical of mobile devices, where cores remain underutilized for extended periods—studies indicate that quad-core ARM processors in smartphones exhibit low average utilization rates across multiple cores, with an average thread-level parallelism of about 1.5 during active workloads, implying substantial idle periods that waste power. Additionally, the high power density of uniform high-performance cores often triggers thermal throttling, reducing sustained performance by up to 34% in commercial mobile platforms to prevent overheating. These inefficiencies highlighted the need for architectural evolution toward asymmetry, paving the way for heterogeneous designs like ARM's big.LITTLE, which emerged as the first practical implementation in commercial systems in 2013.^[23]^[24]^[25]

Architecture and Operation

Big and LITTLE Core Designs

The big cores in the ARM big.LITTLE architecture are high-performance processors from the Cortex-A series, designed to handle compute-intensive tasks such as multimedia processing and user interface rendering. For instance, the Cortex-A15, introduced as the initial big core, features an out-of-order execution pipeline with a superscalar design, enabling it to process multiple instructions simultaneously for enhanced throughput, and supports clock speeds up to 2.5 GHz.^[25] Later 64-bit big cores like the Cortex-A57 build on this with advanced branch prediction mechanisms, wider execution units, and support for higher instruction-level parallelism, also implementing out-of-order execution to deliver sustained performance in power-constrained environments. These cores prioritize peak computational capability while adhering to mobile thermal and power envelopes.^[1] In contrast, the LITTLE cores emphasize energy efficiency for background and light workloads, such as system maintenance and idle processing, using simpler microarchitectures to minimize power draw. The Cortex-A7, the original LITTLE core, employs an in-order execution pipeline with a basic dual-issue design, operating at lower clock speeds around 1 GHz, and delivers performance comparable to the earlier Cortex-A9 while achieving better energy efficiency than big cores for typical mobile tasks.^[8] The 64-bit successor, Cortex-A53, maintains an in-order pipeline with a straightforward dual-issue decode stage, supporting clock speeds up to approximately 2 GHz in efficient configurations, and provides broad compatibility for low-intensity operations with significantly reduced power consumption relative to performance-oriented cores.^[26]^[27] This design allows LITTLE cores to handle the majority of everyday computing demands with minimal battery impact.^[25] A key design principle of big.LITTLE is the shared instruction set architecture (ISA) between big and LITTLE cores, ensuring seamless binary compatibility and enabling transparent task handling across core types. Early implementations use the ARMv7-A ISA with AArch32 execution state for both Cortex-A15 and Cortex-A7, while 64-bit variants adopt ARMv8-A with AArch64 support for Cortex-A57 and Cortex-A53.^[8] Big cores incorporate larger L1 and L2 caches—typically 32 KB instruction and 32 KB data L1 per core, with shared 1-2 MB L2—and wider execution units (e.g., 3-wide issue in A15/A57) to support complex workloads, whereas LITTLE cores feature 32 KB L1 caches and narrower pipelines (e.g., 2-wide in A7/A53) optimized for low latency on simple instructions.^[1]^[25] This asymmetry allows big cores to excel in bursty, high-demand scenarios, while LITTLE cores maintain efficiency without sacrificing core functionality. In typical big.LITTLE system-on-chip (SoC) configurations, clusters consist of 2-4 big cores paired with 4-8 LITTLE cores to balance performance and power across diverse workloads.^[8] These clusters are interconnected via a coherent bus, such as the ARM CoreLink CCI-400, which enforces cache coherency using the AMBA ACE protocol, enabling shared memory access and data consistency between heterogeneous cores without software intervention.^[28] This setup supports configurations like 2 Cortex-A15 + 4 Cortex-A7 or 4 Cortex-A57 + 4 Cortex-A53, facilitating scalable integration in mobile processors.^[29]

Run-State Migration Methods

In ARM big.LITTLE architectures, run-state migration enables the operating system to dynamically shift executing threads between high-performance "big" cores and energy-efficient "LITTLE" cores to adapt to varying workload demands, ensuring optimal power and performance balance while maintaining transparency to applications.^[30] This process relies on hardware interconnects, such as the CoreLink CCI-400 cache coherent interconnect, to preserve cache coherency during migrations, preventing data inconsistencies as threads move between core clusters.^[8] The migration process begins with the OS monitoring CPU load through governors that track metrics like thread utilization and historical weighted averages to detect performance bursts or idle periods.^[8] Upon identifying a need—such as high demand triggering a shift from a LITTLE core to a big core—the OS suspends the thread on the source core by capturing its execution context, including registers and program counter.^[30] It then updates the thread's CPU affinity mask to bind it to the target core, resumes execution by restoring the context, and handles any pending interrupts to ensure seamless handover without perceptible disruption.^[8] Run-state migration integrates closely with power management features, particularly dynamic voltage and frequency scaling (DVFS), where core frequencies and voltages are adjusted in tandem with thread relocation to minimize energy use.^[30] When big cores are idle following a migration, they enter low-power states like core power-down or clock gating, further enhancing efficiency.^[8] For compatibility across clusters, migrations depend on ARM's Generic Interrupt Controller (GIC), such as the GIC-400, which distributes shared interrupts dynamically to the appropriate active cores, supporting coherent operation in heterogeneous environments.^[30]

Switching Techniques

Clustered Switching

Clustered switching, also referred to as cluster migration, is the simplest implementation of task migration in ARM big.LITTLE architectures, where an entire cluster of high-performance "big" cores (such as Cortex-A15) is powered off and replaced by an equivalent cluster of energy-efficient "LITTLE" cores (such as Cortex-A7), or vice versa, to adapt to varying workload demands.^[31]^[32] This approach ensures that only one cluster is active at any given time, except for the brief period during the switch, enabling low-overhead adaptation by fully deactivating the inactive cluster through hardware power domains that isolate and power down the unused cores and associated logic.^[31]^[33] The operation relies on the operating system monitoring system load via mechanisms like dynamic voltage and frequency scaling (DVFS), triggering a switch when the active cluster reaches predefined thresholds—such as when the LITTLE cluster cannot sustain performance at its maximum frequency under increasing load, prompting a shift to the big cluster.^[32] During the switch, all running tasks are migrated to the newly activated cluster, which requires the big and LITTLE clusters to have an equal number of cores to maintain symmetric topology and simplify state transfer.^[31] The migration process is typically atomic, though full cluster activation involves overhead from power domain transitions on the order of milliseconds.^[32] This method was first implemented in the Samsung Exynos 5410 system-on-chip in 2013, powering devices like the Galaxy S4, where it provided a straightforward way to balance power and performance in early big.LITTLE deployments.^[34]^[35]^[32] While clustered switching offers fast setup times and significant power savings—up to 70% reduction in low-load scenarios by allowing the entire inactive cluster to enter a deep sleep state—it is inherently coarse-grained and less suitable for workloads requiring frequent toggling or those that are unbalanced across cores, as it may unnecessarily activate the full big cluster for isolated high-demand tasks.^[31]^[32]

In-Kernel Switcher

The In-Kernel Switcher (IKS) is a kernel-level technique in ARM big.LITTLE architectures that enables the migration of individual CPU threads between high-performance "big" cores and energy-efficient "LITTLE" cores without requiring the shutdown of entire clusters. This approach leverages hooks in the Linux Completely Fair Scheduler (CFS) to manage task placement, treating paired big and LITTLE cores—such as a Cortex-A15 and Cortex-A7—as a single virtual CPU. By doing so, it allows tasks to be dynamically reassigned to the appropriate core type based on current demands, while maintaining power to the clusters and avoiding the coarser granularity of full cluster migrations.^[36]^[13] In operation, the IKS monitors per-core load using mechanisms like the interactive CPU frequency governor, which triggers switches when utilization exceeds predefined thresholds, such as 85% on a LITTLE core to migrate to a big core. Task migration is facilitated through CPU hotplug mechanisms to adjust thread affinity, effectively powering down the unused core in the pair while keeping the cluster active. This results in lower switching latency, typically around 30 microseconds for core transitions, compared to the milliseconds required for clustered switching methods, enabling more responsive adaptation to workload variations without perceptible delays to users.^[36]^[13]^[37] The implementation of the IKS was developed collaboratively by ARM and Linaro, with the switcher code released to partners in December 2012 and available as a patch set for the Linux kernel in early 2013, with upstream merge in version 3.11 in September 2013. It coordinates with the CPU frequency (cpufreq) driver for load balancing across heterogeneous clusters, utilizing well-established kernel interfaces to simplify integration and testing in production environments.^[36]^[38]^[13] Despite its advantages, the IKS introduces higher complexity in kernel coordination, particularly in synchronizing with frequency scaling and ensuring coherent cache behavior across core types. It also carries the risk of performance or power imbalances if tuning parameters—such as switch delays or load thresholds—are not optimized for specific workloads, potentially leading to suboptimal efficiency in non-symmetrical core configurations. Additionally, this method restricts simultaneous utilization of all cores in a system, as only one core per pair operates at a time.^[36]^[38]^[13]

Heterogeneous Multi-Processing

Heterogeneous Multi-Processing (HMP) represents an advanced paradigm in ARM big.LITTLE systems, enabling the simultaneous utilization of both big and LITTLE processor cores to optimize performance and power consumption. Unlike earlier migration-based approaches that limited operation to one cluster at a time, HMP treats the heterogeneous cores as a unified pool, allowing multiple tasks to execute concurrently across core types. Demanding threads, which require high computational throughput, are preferentially assigned to big cores, while lighter background or less intensive tasks run on LITTLE cores to conserve energy. This concurrent execution maximizes overall system utilization without the need for full task migrations in every scenario.^[39]^[40] In operation, the global scheduler in HMP views all available CPUs—regardless of cluster—as a single heterogeneous domain, leveraging task attributes like priority and load hints to determine placement. Load tracking is a core mechanism, where the scheduler monitors utilization across clusters to dynamically balance workloads; for instance, "hot" tasks exhibiting sustained high demand trigger up-migration to big cores for acceleration, while idle or cooling tasks undergo down-migration to LITTLE cores to reduce thermal and power overhead. This approach supports scalable configurations, accommodating systems with 8 or more cores by distributing threads optimally without cluster-wide exclusivity. HMP was integrated into the Linux kernel as upstream support in version 3.13, released in January 2014, marking a significant step in enabling native heterogeneous scheduling for big.LITTLE platforms.^[41]^[30]^[8] By 2015, HMP had evolved into the de facto standard for big.LITTLE deployments, supplanting prior sequential models and requiring specialized CPU frequency governors to fine-tune load balancing and power states—early iterations of which laid the groundwork for ARM's later Energy Aware Scheduling (EAS) framework. This shift allowed developers to exploit full core parallelism in mobile and embedded SoCs, enhancing responsiveness under varying workloads while adhering to strict efficiency constraints.^[42]^[43]

Scheduling Mechanisms

Task Allocation Strategies

Task allocation strategies in ARM big.LITTLE architectures enable operating systems to dynamically assign computational workloads to either high-performance "big" cores or energy-efficient "LITTLE" cores, optimizing for both power consumption and responsiveness. These strategies are implemented within the OS scheduler, which profiles workloads to identify characteristics such as computational intensity versus I/O dependency, directing CPU-bound tasks—those requiring sustained processing—to big cores while assigning lighter, latency-tolerant tasks to LITTLE cores.^[1]^[8] Heuristics form the foundation of these strategies, often relying on real-time monitoring of task utilization to trigger core assignments. For instance, the scheduler tracks a task's load as a weighted average emphasizing recent runqueue residency, sampled approximately every 1 ms, applying up-migration thresholds to shift high-utilization tasks (> a configurable load level) to big cores, and down-migration thresholds to relocate low-utilization tasks back to LITTLE cores.^[8]^[44] This uneven distribution prioritizes power savings by keeping most tasks on LITTLE cores unless they meet criteria like high priority (e.g., nice value ≤ 0) or prolonged high load, avoiding uniform load balancing across core types.^[44] Core methods include several allocation techniques integrated with dynamic voltage and frequency scaling (DVFS) for holistic optimization. Fork allocation places newly created threads on big cores for initial bursty demands, while wake allocation uses historical load data to assign waking tasks appropriately. Idle-pull mechanisms scan for and migrate high-load tasks to idle big cores, and offload strategies pack idle or low-priority tasks onto LITTLE cores to consolidate activity and enable big core idling.^[8] Primary OS support resides in Linux and Android kernels via the Heterogeneous Multi-Processing (HMP) framework, which treats all cores as a single scheduling domain for flexible, global task distribution. Windows on ARM incorporates analogous scheduler modifications to recognize core asymmetries and apply similar load-based allocation rules for energy efficiency. These implementations aim to minimize inter-core migrations—costly due to context switching overhead—by favoring stable assignments that balance load without frequent relocations.^[45]^[46] Tuning of these strategies is facilitated through Linux sysfs interfaces, such as those under /sys/devices/system/cpu/ for adjusting migration thresholds, utilization clamping, and DVFS policies, allowing system administrators to fine-tune parameters like load history weights or priority biases to suit specific workloads while reducing unnecessary migrations.^[47]

Global Task Scheduling

In ARM big.LITTLE systems, Global Task Scheduling (GTS) employs a unified scheduling approach where the operating system treats all big and LITTLE cores as a single pool of heterogeneous resources, enabling concurrent utilization across the entire system. The scheduler, exemplified by ARM's Heterogeneous Multi-Processing (HMP) implementation in the Linux kernel, aggregates load metrics from every CPU to assess global utilization and task suitability based on factors like computational demands and core capacities. This contrasts with localized strategies by considering system-wide dynamics, allowing tasks to execute on any core without strict cluster boundaries.^[45] Task placement and migration in GTS rely on dynamic algorithms that monitor runtime behavior to migrate workloads up to big cores for demanding operations or down to LITTLE cores for lighter loads, using a shared runqueue to facilitate seamless transitions. Hotplug mechanisms adjust overall capacity by enabling or disabling individual cores based on demand, optimizing power states without disrupting ongoing execution. Wake-up balancing further refines this by directing newly activated tasks to the most appropriate core—prioritizing big cores for latency-sensitive wakes or LITTLE cores for throughput-oriented ones—thereby preventing localized overloads and promoting even distribution across the architecture's varying performance envelopes.^[45]^[8] By holistically managing heterogeneity, GTS reduces latency in mixed workloads through proactive load spreading, avoiding bottlenecks in any single core type, and enhances system throughput; for instance, evaluations on platforms like the Odroid-XU3 show up to 35% faster execution times in parallel benchmarks such as PARSEC Black-Scholes compared to big-only clustering. These gains stem from leveraging all cores simultaneously, yielding 20-50% better multi-threaded performance over prior migration-based models in ARM's assessments.^[48]^[8] Post-2016, GTS has integrated with Energy Aware Scheduling (EAS) in Linux kernels, incorporating energy cost models into placement decisions while accounting for inter-cluster latencies akin to NUMA configurations, thus extending its scalability to more complex heterogeneous setups without replacing the core global load aggregation.

Implementations

Early Adopters and SoC Examples

The Samsung Exynos 5 Octa 5410, announced in January 2013, marked the first commercial System-on-Chip (SoC) to implement ARM's big.LITTLE architecture. This octa-core design integrated four high-performance ARM Cortex-A15 cores operating at up to 1.6 GHz as the "big" cluster with four energy-efficient Cortex-A7 cores at 1.2 GHz as the "LITTLE" cluster, allowing seamless task migration between clusters to balance performance and power consumption.^[49] The Exynos 5410 powered international variants of the Galaxy S4, demonstrating big.LITTLE's potential in smartphones through ARM's Heterogeneous Multi-Processing (HMP) software stack.^[50] A subsequent version, the Exynos 5 Octa 5420 announced in July 2013, featured higher clocks of up to 1.9 GHz for the A15 cores and powered devices like the Galaxy Note 3.^[51] MediaTek followed as an early pioneer with the MT8135 SoC in late 2013, a quad-core big.LITTLE configuration featuring two Cortex-A15 big cores and two Cortex-A7 LITTLE cores, primarily targeted at tablets for improved multitasking efficiency. In 2014, MediaTek advanced the technology in the MT6595, an octa-core SoC combining four Cortex-A17 big cores (up to 2.2 GHz) with four Cortex-A7 LITTLE cores at 1.7 GHz, supporting 4G LTE and UHD video decoding while leveraging big.LITTLE for dynamic load balancing.^[52] Similarly, HiSilicon's Kirin 920 SoC, announced in June 2014 for Huawei devices like the Honor 6, adopted an octa-core setup with four Cortex-A15 big cores and four Cortex-A7 LITTLE cores, emphasizing power domain isolation via ARM's multi-Power Management Unit (PMU) design to enable independent cluster control.^[53] These early SoCs predominantly featured octa-core configurations with symmetric 4-big + 4-LITTLE pairings, though quad-core variants like the MT8135 highlighted flexibility for lower-power devices; power domains were managed through ARM's multi-PMU framework to allow granular voltage and frequency scaling per cluster.^[42] Initial deployments faced challenges with task migration reliability, including occasional stalls during core switches due to immature Linux kernel support, which were largely resolved by updates in kernel version 3.13 released in January 2014, introducing enhanced HMP scheduling and hotplug stability.

Widespread Device Integration

The adoption of ARM big.LITTLE architecture extended rapidly into consumer devices following its initial SoC implementations, with the Samsung Galaxy S4 in 2013 marking a pivotal entry into mainstream smartphones through the Exynos 5 Octa 5410 processor, which combined four high-performance Cortex-A15 cores with four energy-efficient Cortex-A7 cores.^[4] The Galaxy S5, released in 2014, built on this foundation using a similar heterogeneous configuration in its Exynos 5422 variant, enabling dynamic workload balancing for improved battery life during everyday tasks.^[54] By the mid-2010s, big.LITTLE had permeated premium Android smartphones from manufacturers like OnePlus and Huawei, becoming a standard feature in high-end models to meet demands for both peak performance and power savings. The architecture's integration influenced broader ecosystems beyond mobile phones. Starting with Android 4.4 KitKat, the operating system supported big.LITTLE-enabled devices, allowing efficient execution of workloads on heterogeneous cores, as demonstrated in early implementations like the AllWinner A80 octa-core processor.^[55] Apple's custom silicon in the M-series chips, introduced in 2020, drew inspiration from big.LITTLE principles by incorporating a mix of high-performance and efficiency cores, though tailored to ARMv8 architecture for optimized macOS and iOS performance.^[56] In automotive and IoT applications, big.LITTLE found use in single-board computers and edge devices, such as Raspberry Pi competitors powered by octa-core SoCs, where its heterogeneous design enhances reliability and power management for resource-constrained environments.^[57]^[58] Market penetration of big.LITTLE accelerated post-2015, contributing to the shipment of billions of ARM-based chips annually, with the architecture underpinning a significant portion of mobile SoCs by 2018 as adoption grew among premium Android vendors.^[59] By 2020, cumulative ARM chip shipments exceeded 100 billion units, many incorporating big.LITTLE for heterogeneous computing in consumer and embedded systems.^[60] As of 2025, big.LITTLE persists in mid-range devices for its proven balance of cost and efficiency, though its usage is declining in favor of the more flexible DynamIQ framework, which integrates big and LITTLE cores into scalable clusters.^[1]^[5]

Benefits

Performance Enhancements

The big.LITTLE architecture achieves significant speed gains by deploying high-performance "big" cores, such as the Cortex-A15, for CPU-intensive applications, delivering up to 2x the single-core execution speed compared to LITTLE cores like the Cortex-A7.^[3] This enables faster processing in scenarios like gaming and multimedia rendering, where big cores handle compute-heavy threads while maintaining compatibility with lighter workloads. In multi-threaded benchmarks, big.LITTLE systems show performance uplifts of up to 50% over homogeneous big-core configurations, as demonstrated in tests like Antutu v4, Geekbench, and ANDeBench for workloads exceeding four threads.^[3] Big cores also yield higher SPECint scores due to their advanced out-of-order execution and larger caches, providing superior integer compute performance for demanding tasks.^[25] Responsiveness is enhanced through low-latency task migrations, with switch times around 30 microseconds—far quicker than traditional dynamic voltage and frequency scaling adjustments—minimizing delays in bursty applications.^[13] This migration overhead imposes less than 5% penalty on overall execution, preserving efficiency during core handoffs.^[13] Multitasking performance improves by allowing parallel execution of heavy tasks on big cores and light tasks on LITTLE cores, as facilitated by global task scheduling mechanisms.^[13] In mixed workloads, this results in higher SoC throughput, balancing peak demands without sacrificing concurrent operations.^[3]

Energy Efficiency Improvements

The ARM big.LITTLE architecture achieves significant efficiency boosts by specializing LITTLE cores for light workloads, reducing power consumption by 73%-76% compared to a homogeneous big-core system for tasks like web browsing and homescreen navigation.^[3] These cores, such as the Cortex-A7, consume far less energy for common mobile activities, enabling overall SoC power savings of 21%-40% across use cases ranging from audio playback to 1080p video decoding.^[3] In real-world tests on early implementations, this translates to battery life extensions of approximately 20%-30%, particularly in mixed workloads where low-intensity tasks dominate.^[48] Key mechanisms driving these improvements include offloading idle or low-demand tasks to the power-efficient LITTLE cluster, which powers down unused big cores to minimize leakage and active power draw.^[3] Synergy with Dynamic Voltage and Frequency Scaling (DVFS) further optimizes this by dynamically adjusting operating points based on task requirements, ensuring the system operates at the lowest viable voltage for sustained efficiency.^[3] Additionally, the architecture provides greater thermal headroom by limiting high-power big-core usage to bursts, reducing overall heat generation and preventing thermal throttling that could otherwise degrade performance and efficiency.^[3] Quantitative metrics underscore these gains: Arm simulations from 2013 demonstrate up to 3x improvement in MIPS per watt over homogeneous architectures for balanced loads, reflecting better resource utilization in heterogeneous setups.^[3] In early 28nm implementations, idle power per cluster drops to as low as 0.5W when inactive cores are gated, compared to higher baselines in uniform designs.^[61] In evaluated systems using Global Task Scheduling, big.LITTLE shows 34% better energy efficiency than big-only clusters under full loads, measured in joules per operation.^[48] Over the long term, these enhancements have enabled the design of thinner mobile devices with extended usage patterns, such as over 10 hours of continuous video playback on early big.LITTLE implementations, while maintaining high performance reserves. Benefits have continued to improve in later core generations, such as Cortex-A76 big cores paired with Cortex-A55 LITTLE cores, offering further efficiency gains on advanced process nodes.^[3]^[62]

Evolutions and Successors

DynamIQ Integration

ARM announced DynamIQ in March 2017 as the primary evolution of the big.LITTLE architecture, shifting from rigid, fixed clusters to a highly flexible system-on-chip design. This integration replaces the traditional separation of big and LITTLE clusters with a unified DynamIQ cluster framework, enabled by the DynamIQ Shared Unit (DSU), which allows for mix-and-match configurations of heterogeneous cores within a single cluster. Supporting up to eight cores per cluster in initial Armv8.2-based implementations, DynamIQ enhances design freedom while maintaining cache coherency through integrated L3 caching up to 8 MB.^[5] A key enhancement of DynamIQ is its support for combining high-performance big cores, energy-efficient LITTLE cores, and optional mid-tier cores in arbitrary ratios, exemplified by pairings like the Cortex-A76 (big) with the Cortex-A55 (LITTLE). This multi-tier flexibility addresses diverse workloads, offering improved scalability for demanding applications such as servers and AI inference, where multiple DynamIQ clusters can interconnect via the CoreLink CMN-600 coherent mesh network to form large-scale systems with dozens to hundreds of cores. The CMN-600 provides a 2D mesh topology for high-bandwidth, low-latency communication, surpassing the limitations of earlier interconnects like CCI-550.^[63]^[64] DynamIQ ensures backward compatibility by preserving the established big.LITTLE software scheduling model, enabling seamless migration of existing applications and operating systems without requiring code changes. Its debut in commercial hardware came with Qualcomm's Snapdragon 845 SoC in early 2018, which featured a DynamIQ cluster of four Cortex-A75-based performance cores and four Cortex-A55-based efficiency cores, marking the first widespread adoption of the technology in mobile devices. Core features include dynamic cluster sizing for customizable core counts and types, advanced power management via per-domain voltage scaling, and integration starting from Armv8.2 architecture, with extensions in later versions for even greater core diversity up to 14 per cluster.^[65]

Modern Heterogeneous Extensions

Following the introduction of DynamIQ in 2017, modern heterogeneous extensions to the ARM architecture have built upon big.LITTLE principles by incorporating Armv9 features starting from 2022, enhancing support for AI workloads through Scalable Vector Extensions version 2 (SVE2). SVE2 enables scalable SIMD processing from 128-bit to 2048-bit vectors, allowing efficient handling of machine learning operations like matrix multiplications without requiring hardware-specific code rewrites.^[66]^[67] For instance, the Cortex-X925 core integrates SVE2 to accelerate AI inference and training tasks directly on CPU clusters.^[66] These extensions also facilitate tighter integration with dedicated neural processing units (NPUs), such as the Ethos-U85, which supports transformer models and is paired with Armv9 CPUs like Cortex-A320 for edge AI applications.^[68]^[69] In big.LITTLE configurations, high-performance "big" cores from the Cortex-X series, such as X925, serve as the primary accelerators for compute-intensive AI tasks, while efficiency cores handle lighter loads.^[70] As of 2025, big.LITTLE and DynamIQ architectures remain the foundation for most mobile system-on-chips (SoCs), powering the majority of smartphones through heterogeneous CPU clustering that balances performance and efficiency. Most popular mobile SoCs employ big.LITTLE designs, with adoption driven by their ability to dynamically allocate tasks across core types.^[71] A representative example is Qualcomm's Snapdragon 8 Gen 3, which features a 1+5+2 configuration: one Cortex-X4 prime core at 3.3 GHz for peak performance, five Cortex-A720 performance cores (three at 3.2 GHz and two at 3.0 GHz), and two Cortex-A520 efficiency cores at 2.3 GHz, enabling up to 30% better CPU performance and 20% improved power efficiency over its predecessor.^[72] This setup exemplifies how modern implementations extend DynamIQ's mix-and-match flexibility to support AI-enhanced mobile experiences, such as on-device generative models.^[73] Looking ahead, future trends in ARM heterogeneous computing emphasize chiplet-based designs for modular scalability, particularly in edge AI deployments where diverse compute elements can be assembled to meet varying workload demands. ARM's 2025 reports highlight heterogeneous integration as key to advancing AI systems, enabling flexible silicon that combines CPUs, NPUs, and accelerators for energy-efficient edge processing.^[74]^[75] This shift supports chiplet architectures that optimize yield and cost by mixing process nodes, as seen in emerging AI SoCs.^[76] However, there are indications of partial evolution away from traditional big.LITTLE toward configurations with all-big efficiency cores, where mid-tier cores like A725 provide near-performance levels with lower power draw, blurring distinctions for broader applicability.^[70] In hyperscale data centers, these extensions introduce challenges related to software complexity, as managing task migration across heterogeneous clusters in converged AI environments demands advanced orchestration to handle latency and resource allocation.^[77]^[78]

References

[1]
big.LITTLE: Balancing Power Efficiency and Performance - Arm
Arm big. LITTLE technology is a heterogeneous processing architecture that uses up to three types of processors. ”LITTLE” processors are designed for maximum ...
[2]
New generation of Armv9 CPUs unleash unprecedented compute ...
Jun 28, 2022 · The Power of big.LITTLE. Arm's big.LITTLE technology, which was first launched in 2011, is now the most commonly used heterogeneous ...
[3]
https://www.arm.com/-/media/Files/pdf/white-paper/big-little-technology-the-future-of-mobile.pdf
[4]
Samsung Galaxy S4 processor: how the eight-core CPU works
Mar 19, 2013 · Big.LITTLE simply takes it further, pairing each high-end CPU core with a slower one. Since the Exynos 5 Octa is the first big.LITTLE chip ...
[5]
DynamIQ: Revolutionizing Multicore Computing - Arm
Arm DynamIQ technology redefines multicore computing by combining big and LITTLE CPUs into a single, fully integrated cluster with many new and enhanced ...
[6]
The Official History of Arm
Aug 16, 2023 · Arm responded to this challenge with its Cortex-A9 CPU multi-core processor, before introducing the innovative “big. LITTLE” approach in 2011, ...
[7]
big.LITTLE - ARM Cortex-A Series (Armv7-A) Programmer's Guide
The basic premise is that software can run on big or LITTLE processors depending on performance requirements. When peak performance is required software can run ...
[8]
None
### Summary of ARM big.LITTLE Technology
[9]
ARM big.LITTLE Processing Demo - CNX Software
Oct 21, 2011 · The demo runs the Android 2.3 (Gingerbread) with graphs showing CPU usage and when Cortex A7 / A15 is running. At the beginning Cortex A7 ...
[10]
Samsung confirmed to use ARM's big.LITTLE chip architecture for ...
Oct 27, 2011 · Samsung confirmed to use ARM's big.LITTLE chip architecture for frugal Exynos in 2012. At the ARM TechCon 2011 event we got the much desired ...
[11]
[PDF] 2012 SAMSUNG ELECTRONICS ANNUAL REPORT
Jan 1, 2012 · Exynos 5 Octa. Industry-first eight-core mobile AP that consists of four high-performance. ARM CortexTM-A15 cores and four low-power ARM ...
[12]
TSMC and ARM set new Benchmarks for Performance and Power ...
Sep 30, 2014 · TSMC and ARM set new Benchmarks for Performance and Power Efficiency with First Announced FinFET Silicon with 64-bit ARM big.LITTLE Technology.
[13]
Ten Things to Know About big.LITTLE - Arm Developer
Sep 11, 2013 · Since its launch there are now more than twelve ARM partners actively designing with big. LITTLE technology.
[14]
Arm Total Compute Solutions Redefine Visual Experiences and ...
... big.LITTLE”-based configurations, with the technology now the most commonly used heterogeneous processing architecture for consumer devices worldwide. Last ...
[15]
None
### Quantitative Details on Power and Performance Differences: Cortex-A15 vs. Cortex-A7
[16]
Deep inside ARM's new Intel killer - The Register
At press soirées in London and San Francisco on Wednesday, ARM announced both a design for a tiny new chip, the Cortex-A7 MPCore, and a system- ...Missing: initial | Show results with:initial
[17]
ARM Unveils Its Most Energy Efficient Application Processor Ever
Oct 19, 2011 · The ARM Cortex-A7 processor and big.LITTLE processing have great potential to help increase smartphone performance and energy efficiency.Missing: conception date late 2000s constraints
[18]
[PDF] Enabling the Next Mobile Computing Revolution with Highly ...
This pairing of two CPUs with different power-performance tradeoffs but complete architectural compatibility allows a unique innovation: ARM® big.LITTLE™ ...Missing: trade- | Show results with:trade-
[19]
Determinants of lithium-ion battery technology cost decline
Nov 22, 2021 · We find that between the late 1990s and early 2010s, about 38% of the observed cost decline resulted from efforts to increase cell charge density.
[20]
The multicore SoC – will 2010 be the turning point? - Embedded
Apr 13, 2010 · There has also been a shift in consumer electronics to adopt multicore hardware as demand for more processing power and complex user interfaces ...
[21]
[PDF] Evaluating Asymmetric Multiprocessing for Mobile Applications
Traditional symmetric multiprocessing (SMP) attains the first goal, but fails the second one, as homogeneous cores are inefficient for the diverse performance ...<|separator|>
[22]
[PDF] The Benefits of Multiple CPU Cores in Mobile Devices | NVIDIA
Dec 1, 2010 · Multiple CPU cores in mobile devices allow for faster work, lower power consumption, higher performance per watt, and faster web page load ...Missing: rise early
[23]
[PDF] A Study of Mobile Device Utilization - Trevor Mudge
We perform two preliminary experiments on an Origen board, a current mobile device platform with a quad-core 1.4GHz ARM Cortex-A9 CPU. First, we measure how ...Missing: homogeneous | Show results with:homogeneous
[24]
[PDF] Power and Thermal Analysis of Commercial Mobile Platforms - arXiv
Mar 19, 2019 · Our results show that thermal throttling degrades the performance by as much as 34% while running popular. Android applications. Our empirical ...<|separator|>
[25]
ARM's big.LITTLE Concept - Semiconductor Engineering
Nov 8, 2012 · ARM EVP Simon Segars gave the opening keynote address at last week's ARM TechCon in Santa Clara, California. The big announcement was the new ...
[26]
Cortex-A53 Product Support - Arm Developer
The Cortex-A53 processor is a high efficiency processor that implements the Armv8-A architecture. The Cortex-A53 processor has one to four cores.Missing: A57 clock out-
[27]
Cortex-A53 | Low-power 64-Bit Processor - Arm
Arm big.LITTLE technology is a heterogeneous processing architecture that uses two types of processor. 'LITTLE' processors are designed for maximum power ...Missing: explanation | Show results with:explanation
[28]
The Arm CoreLink CCI-400 Cache Coherent Interconnect
The Arm CoreLink CCI-400 Cache Coherent Interconnect provides full cache coherency between two clusters of multi-core CPUs. It enables big.LITTLE processing ...Missing: configuration 2-4 4-8
[29]
big.LITTLE configurations - Arm Developer
The LITTLE cluster is capable of handling most low intensity tasks such as audio playback, web-page scrolling, operating system events, and other always on, ...<|control11|><|separator|>
[30]
[PDF] Software Techniques for ARM big.LITTLE Systems
This is called big.LITTLE MP (which is essentially a Heterogeneous Multi-Processing paradigm). This is the most sophisticated and flexible mode for a big.Missing: definition | Show results with:definition<|control11|><|separator|>
[31]
Cluster migration - Arm Developer
Only one cluster, either big or LITTLE, is active at any one time, except very briefly during a cluster context switch to the other cluster.Missing: state | Show results with:state
[32]
Tech Explained - ARM big.LITTLE Processing - CPU - HEXUS.net
Oct 24, 2013 · With big.LITTLE's initial software implementation, called Cluster Migration, only one processor cluster is active at one time; the other ...
[33]
big cluster and LITTLE cluster - Arm Developer
There are five power domains for each cluster. Each core is in its own power domain to enable individual powerdown. The rest of the logic within a cluster ...Missing: switching | Show results with:switching
[34]
ARM Benchmarks Flavors of Big-Little Multiprocessing - EE Times
The simplest is the clustered migration approach adopted by Samsung for the Exynos 5 Octa. This has the advantage of presenting a uniprocessor programming ...
[35]
Samsung Exynos 5 Octa 5420 looks to correct past mistakes ...
Jul 23, 2013 · The first Exynos 5 Octa used a specific implementation of big.LITTLE called Cluster Migration. That means either the A15 cores or the A7s ...
[36]
[PDF] In Kernel Switcher: A solution to support ARM's new big.LITTLE ...
In Kernel Switching(IKS) at Linaro to form a "virtual" CPU. All virtual CPUs have the same processing capabilities.
[37]
Linux support for ARM big.LITTLE - LWN.net
Feb 15, 2012 · With a single call, the hypervisor can atomically suspend execution of the whole system, migrate the CPU states from one cluster to the other, ...<|control11|><|separator|>
[38]
Linaro enhances Linux support for ARM Big.Little - LinuxGizmos.com
Jul 18, 2013 · Linaro has developed a new way for Linux and Android developers to implement ARM's Big.Little multi-core load balancing architecture.
[39]
Samsung Primes Exynos 5 Octa for ARM big.LITTLE Technology ...
Sep 10, 2013 · The HMP solution for Samsung's Exynos 5 Octa application processors will be available to customers in 4Q of 2013.Missing: first commercial
[40]
[PDF] MediaTek CorePilot
Heterogeneous multi-processing removes all limitations and allows a task to be allocated to any combination of 'big' and 'LITTLE' CPUs. Heterogeneous multi- ...<|control11|><|separator|>
[41]
Heterogeneous multi-processing - Arm Developer
The central principle of big.LITTLE is that application software can run unmodified on either type of processor. For a detailed overview of big.LITTLE ...Missing: definition | Show results with:definition
[42]
big.LITTLE - ARM - WikiChip
Mar 29, 2025 · ARM Holdings' power consumption improvement big.LITTLE solution announced in October 2011. CPU developed by ARM Holdings as the ...
[43]
[PDF] EAS Overview and Integration Guide - Arm Developer
big.LITTLE designs. Over the years, these endeavours resulted in a lot of ... performance/power ratio and responsiveness in comparison to the use of PELT.
[44]
[PDF] Update on big.LITTLE scheduling experiments
Frequency transition latency is well below the schedule period on ARM TC2. □ Counterproductive scheduling behaviour can be avoided, e.g. the scheduler migrates ...
[45]
A Task Scheduling Algorithm Based on Big.LITTLE Architecture in ...
To address this issue, we designed an initial task allocation strategy based on the characteristics of big.LITTLE architecture, introduced reinforcement ...
[46]
Global Task Scheduling - Arm Developer
The ARM implementation of GTS is called big.LITTLE Multi-processing (MP). In this model the operating system task scheduler is aware of the differences in ...Missing: HMP | Show results with:HMP
[47]
One Windows Kernel | Microsoft Community Hub
Oct 17, 2018 · To support big.LITTLE architecture and provide great battery life on Windows 10 on ARM, the Windows scheduler added support for heterogenous ...
[48]
Utilization Clamping - The Linux Kernel documentation
Utilization clamping (util clamp) is a scheduler feature that allows user space to manage task performance by clamping the signal to a certain point.
[49]
[PDF] ENERGY EFFICIENCY ANALYSIS OF ARM BIG.LITTLE GLOBAL ...
This work focuses purely on evaluating the performance, in terms of execution time, and energy efficiency of an ARM big. LITTLE system with Global Task ...Missing: peak 2017 2018
[50]
Samsung introduces the New Exynos 5 Octa Processor, Exynos 5420
Jul 23, 2013 · The newest Exynos processor is powered by four ARM Cortex®-A15™ processors at 1.8GHz with four additional Cortex-A7™cores at 1.3 GHz in a big.
[51]
https://news.samsung.com/global/samsung-brings-enhanced-mobile-graphics-performance-capabilities-to-new-exynos-5-octa-processor
[52]
MT6595 | First 4G LTE Octa-Core SoC with UHD - MediaTek
Feb 11, 2014 · The MT6595 employs ARM's big.LITTLE™ architecture with MediaTek's CorePilot™ technology to deliver a Heterogeneous Multi-Processing (HMP) ...
[53]
Teardown: Samsung Galaxy S4 - EE Times
Claiming to be an eight-core processor, the Octa is one of the first processors to incorporate the “big-little” design utilizing four 1.6 GHz ARM Cortex-A15 ...
[54]
AllWinner A80 Octa Core big.LITTLE Processor CPU Usage Under ...
Nov 23, 2014 · LITTLE Processor CPU Usage Under Various Loads in Android 4.4 (Video). Allwinner A80 is one of the few octa core processors featuring ARM's big.
[55]
Evaluating the Apple Silicon M-Series SoCs for HPC Performance ...
Feb 7, 2025 · The Apple Silicon M-Series processors use ARM-based architectures. These processors use a big.LITTLE approach, which incorporates a mix of ...
[56]
ARM big.LITTLE powers Raspberry Pi competitor - Electronics Weekly
Aug 21, 2013 · The $169 computer board is a higher performance competitor to the Raspberry Pi computer board which is aimed at educational markets as well ...
[57]
Reliability analysis in ARM big.LITTLE heterogeneous multi-core ...
Sep 15, 2024 · Error Detectin Rate Analysis minimize power consumption and ensure reliability, ARM big.LITTLE architectures serve as apt choices for IoT ...
[58]
Adoption of ARM big.Little Technology Accelerates
Feb 26, 2013 · LITTLE technology saves up to 70 percent of processor energy consumption in common mobile workload tasks, essential as the performance of the ...Missing: commercial 2014 examples Qualcomm HiSilicon Allwinner
[59]
Arm Partners Have Shipped 200 Billion Chips
Oct 18, 2021 · Half the 200 billion Arm-based chips shipped to date were deployed over the last five years and one-in-eight left the production line during ...
[60]
Samsung Galaxy Alpha vs. HTC One mini 2: Metal mania
... Cortex-A15 cores and 1.3GHz for the four Cortex-A7 cores. As an SMP ... The single-core test shows a 2x advantage of Cortex-A15 over the A7. For ...
[61]
[PDF] big.LITTLE Architecture: Heterogeneous Multicore Processing
With big. LITTLE's initial software implementation, called Cluster Migration, only one processor cluster is active at one time; the other cluster - be it Cortex.Missing: practical | Show results with:practical
[62]
(PDF) Fast and accurate power annotated simulation: Application to ...
Nov 18, 2015 · ... ARM Cortex A9. This implementation is tailored for a. maximum 0.5-W power consumption per cluster with the STM. 28nm bulk CMOS technology. A ...<|control11|><|separator|>
[63]
CMN-600 Coherent Mesh: Scalable Network for Smart Systems - Arm
The Arm CoreLink CMN-600 Coherent Mesh Network is designed for intelligent connected systems across a wide range of applications.Missing: DynamIQ | Show results with:DynamIQ
[64]
Exploring DynamIQ and ARM's New CPUs: Cortex-A75, Cortex-A55
May 29, 2017 · LITTLE configuration. It's been hugely successful, with more than 40 licensees and 1.7 billion units shipped in just 3 years. But during this ...
[65]
Where does big.LITTLE fit in the world of DynamIQ? - Arm Developer
Apr 6, 2017 · These domains, consisting of single or multiple Arm CPUs, can scale up and down in performance and power by up to 4x finer granularity than ...
[66]
Arm Cortex-X925's Breakthrough with a 15 Percent IPC Improvement
Sep 23, 2024 · One standout architecture feature of Cortex-X925 is the integration of Arm's v9 SVE2 (Scalable Vector Extensions). SVE2 expands data-level ...Missing: big. LITTLE
[67]
Arm is so, so over this AI accelerator unit craze - The Register
May 30, 2024 · Specifically, we're talking about Armv9's Scalable Matrix Extension (SME2) as well as its Scalable Vector Extension (SVE2) instructions. Arm ...
[68]
Arm Drives Next-Generation Performance for IoT with World's First ...
Feb 26, 2025 · Arm's new platform brings together a brand-new ultra-efficient Armv9 CPU, Cortex-A320, along with the Ethos-U85 NPU with operator support for transformer ...
[69]
Arm Singles Out Next-Gen IoT With New Edge AI Platform - News
Mar 5, 2025 · Arm introduced its Armv9-based Cortex-A320 processor and Ethos-U85 neural processing unit (NPU) to help advance edge AI computing.
[70]
Cortex-A725 | Premium CPU for Gaming and AI with Armv9.2 ...
Arm big.LITTLE technology is a heterogeneous processing architecture that uses two types of processor. 'LITTLE' processors are designed for maximum power ...Missing: definition | Show results with:definition
[71]
Performance Study on COTS Mobile Devices - arXiv
May 4, 2025 · Most popular mobile SoCs utilize the ”big.LITTLE” architecture for their CPUs, which balances performance with power efficiency. This ...
[72]
Snapdragon 8 Gen 3 explained: The first mobile chipset to get an AI ...
May 19, 2025 · CPU Config. Snapdragon 8 Gen 3. 1x 3.3GHz (Cortex-X4) 3x 3.2GHz (Cortex-A720) 2x 3GHz (Cortex-A720) 2x 2.3GHz (Cortex-A520 Refresh). Snapdragon ...
[73]
https://www.microchipusa.com/articles/qualcomm/snapdragon-8-gen-3-mobile-platform-a-new-era-of-mobile-technology
[74]
[PDF] Silicon Reimagined - NET
Feb 25, 2025 · An analysis of how the Arm architecture and heterogeneous computing approach is enabling the next generation of. AI systems through flexible, ...
[75]
Arm Tech Predictions for 2025 and Beyond
Dec 20, 2024 · Arm's 2025 tech predictions cover the growth of AI, future of silicon designs and key trends across different technology markets.
[76]
Chiplet Technology: Unlocking New Potential in the Semiconductor ...
Jan 6, 2025 · Furthermore, chiplets support heterogeneous integration by combining different process nodes and materials, optimizing both performance and cost ...
[77]
Building what's next for hyperscale and AI data centers | CIO Dive
Sep 22, 2025 · The complexity of AI data centers requires manufacturing partners who can address challenges with network speed and latency, as well as design ...Missing: LITTLE | Show results with:LITTLE
[78]
https://www.forbes.com/sites/tiriasresearch/2025/11/07/the-isa-wars-have-ended-and-the-heterogeneous-cpu-era-has-arrived/