
Hyper-threading

Hyper-Threading Technology (HT Technology), developed by Intel Corporation, is a simultaneous multithreading (SMT) implementation that enables a single physical CPU core to execute two threads concurrently by duplicating certain architectural state—such as registers and program counters—while sharing the core's execution resources, including caches and functional units. Introduced commercially in 2002 with Intel's Xeon server processors and later integrated into the Pentium 4 desktop lineup at 3.06 GHz in November of that year, HT Technology improves efficiency by allowing the core to fill idle execution slots from one thread with instructions from another, thereby increasing overall throughput without requiring additional physical cores. The technology exposes two logical processors per physical core to the operating system, effectively doubling the thread-handling capacity and enhancing multitasking performance in workloads with variable thread demands, such as video encoding, scientific simulations, and multitasking with background applications. By exploiting parallelism at the thread level, HT Technology can deliver up to a 30% performance boost in threaded applications, though benefits vary based on software optimization and workload characteristics; it is particularly effective when threads perform diverse operations or experience pipeline stalls. Over time, HT Technology has evolved across Intel's processor generations, from early implementations in the NetBurst architecture to modern integrations in the Core and Xeon families, where it remains enabled by default but can be toggled via BIOS settings for specific tuning needs. In July 2025, Intel announced plans to reintroduce Hyper-Threading in upcoming processors, including the Xeon 7000 series (Diamond Rapids) and future Core Ultra generations, to enhance multi-threaded performance. Despite its advantages, it is absent in Core Ultra Series 2 processors, reflecting shifts toward efficiency-focused designs without SMT.

Fundamentals

Definition and Principles

Hyper-Threading Technology, Intel's proprietary implementation of simultaneous multithreading (SMT), enables a single physical core to execute instructions from two threads concurrently, presenting the core to the operating system as two distinct logical processors. This approach builds on the foundational SMT concept, which allows multiple independent threads to issue instructions to the core's functional units within the same clock cycle, thereby enhancing overall throughput by better utilizing available hardware resources. Unlike true multi-core processing, where each core operates as an independent processor with its own dedicated resources, Hyper-Threading duplicates only the architectural state—such as registers and control structures—while sharing the core's execution engine, caches, and other critical components between the threads. At its core, Hyper-Threading leverages thread-level parallelism by dynamically scheduling instructions from multiple threads to fill idle slots in the processor's pipeline, addressing inefficiencies like unused functional units or stalled cycles that occur in single-threaded execution. Threads share resources such as execution units and caches, allowing the core to switch rapidly between them without the full overhead of traditional context switching, which involves saving and restoring the entire processor state. This resource sharing promotes higher utilization of the core's capabilities, as one thread can continue processing while another encounters dependencies, such as waiting for memory access. In terms of thread scheduling, the processor dispatches instructions from the active threads in a cycle-by-cycle manner, prioritizing those that can execute immediately to maximize parallelism and minimize resource underutilization.
Key terminology includes logical processors, which represent the virtual cores visible to software; thread contexts, the independent sets of architectural state maintained for each logical processor; and context switching in SMT, a lightweight mechanism that alternates between threads seamlessly to sustain concurrent execution. Hyper-Threading was first introduced commercially in 2002 with Intel's Xeon server processors, and later integrated into the Pentium 4 desktop processors.
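The distinction between logical and physical processors is directly visible to software. As an illustration (not from any Intel documentation), the following Python sketch queries the logical processor count and, on Linux, lists the SMT siblings that share a physical core via the kernel's sysfs topology files; on platforms without that interface the sibling lookup simply returns None.

```python
import os

def logical_cpu_count():
    # Logical processors visible to the OS, including SMT siblings.
    return os.cpu_count()

def smt_siblings(cpu=0):
    """Best-effort list of logical CPUs sharing `cpu`'s physical core (Linux sysfs)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    try:
        with open(path) as f:
            text = f.read().strip()   # e.g. "0,4" or "0-1"
    except OSError:
        return None  # topology information unavailable on this platform
    siblings = []
    for part in text.split(","):
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            siblings.extend(range(lo, hi + 1))
        else:
            siblings.append(int(part))
    return siblings
```

On a hyper-threaded core, `smt_siblings(0)` returns two entries; on a core without SMT it returns only `[0]`.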

Core Mechanisms

Hyper-threading, as an implementation of simultaneous multithreading (SMT), enables a single physical processor core to execute instructions from two threads concurrently by duplicating the architectural state while sharing most execution resources. The operating system schedules threads to the logical processors, treating each as an independent entity, and pairs them based on workload characteristics to maximize resource utilization; for instance, the OS may assign compute-intensive and I/O-bound threads to the same core to balance activity levels. In the instruction pipeline, the fetch stage alternates between the two logical processors every clock cycle, pulling instructions from the trace cache in a round-robin fashion to ensure fair access, while the decode stage operates at a coarser granularity, sharing decode logic but switching contexts as needed to handle instructions from either thread. Instructions are then dispatched to the out-of-order execution engine via a micro-operation (uop) queue that is partitioned equally between threads, allowing up to six uops per cycle from both threads combined, with the scheduler selecting ready operations regardless of thread origin to keep execution units occupied. This adaptation maintains the core's out-of-order capabilities, such as a 126-entry reorder buffer split into 63 entries per thread, enabling speculative execution and instruction reordering within each thread without introducing inter-thread dependencies. Resource contention arises when both threads demand shared execution units, such as the integer or floating-point pipelines, but is resolved through partitioned buffering and allocation limits; for example, the load/store buffers are divided (24 loads and 12 stores per thread) to prevent one thread from exhausting the buffers, while retirement logic alternates commits between threads on a first-come, first-served basis to avoid starvation.
Fairness is maintained via hardware-enforced caps on active entries in shared structures, ensuring neither thread monopolizes resources, though no explicit priority levels exist—instead, access is arbitrated round-robin or by readiness, with the operating system influencing outcomes through thread scheduling priorities. The control unit oversees these operations by monitoring the state of each logical processor, arbitrating resource access (e.g., trace cache indexing or branch prediction), and facilitating rapid context switches, which occur transparently in a single cycle when one thread stalls, allowing the other to proceed without OS intervention. It also handles independent thread halts, transitioning the core to single-thread mode if one logical processor idles, thereby optimizing power and performance during unbalanced workloads.
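The cycle-by-cycle fetch arbitration described above can be illustrated with a toy model. This is a deliberately simplified sketch (one fetch slot per cycle, with `None` modeling a stalled slot), not Intel's actual fetch logic: the preferred thread alternates each cycle, and when it stalls the slot is offered to its sibling, which is how idle slots get filled.

```python
def smt_fetch_schedule(threads, cycles):
    """Toy model of round-robin fetch between two logical processors.

    threads: dict of name -> instruction list; a None entry models a stall.
    Returns a list of (cycle, thread_name, instruction) fetch events.
    """
    names = list(threads)
    pointers = {n: 0 for n in names}
    trace = []
    for cycle in range(cycles):
        preferred = names[cycle % 2]           # alternate priority each cycle
        for n in (preferred, names[1 - cycle % 2]):
            i = pointers[n]
            if i < len(threads[n]) and threads[n][i] is not None:
                trace.append((cycle, n, threads[n][i]))
                pointers[n] += 1
                break                           # one fetch slot filled this cycle
            if i < len(threads[n]) and threads[n][i] is None:
                pointers[n] += 1                # stall consumed; offer slot to sibling
        # if both streams are stalled or exhausted, the slot goes idle
    return trace
```

For example, with `{"T0": ["a", None, "b"], "T1": ["x", "y", "z"]}`, T0's stall in cycle 2 lets T1 take that fetch slot instead of leaving it idle.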

Technical Implementation

Architectural Components

In the original implementation based on the NetBurst microarchitecture, Hyper-Threading Technology (HTT) achieves simultaneous multithreading by duplicating a minimal set of architectural state while sharing the majority of resources between two logical processors per physical core. This design allows the core to maintain two independent thread contexts with low hardware overhead, typically less than a 5% increase in die area for the duplicated elements. Key duplicated components include the register files, implemented with two separate Register Alias Tables (RATs) to handle register allocation and renaming for each logical processor independently. The reorder buffer (ROB) is partitioned to support independent tracking of instructions from both threads, with up to 63 entries allocated per logical processor in early designs. Similarly, load/store queues are partitioned, with early implementations allocating a maximum of 24 load and 12 store entries per logical processor to manage memory operations for its context. Additional duplicated elements encompass segment registers, control registers, debug registers, most model-specific registers (MSRs), and an advanced programmable interrupt controller (APIC) per logical processor. Shared components form the bulk of the pipeline, promoting efficient resource utilization across threads. Execution units, including arithmetic logic units (ALUs), floating-point units, and load/store execution pipelines, operate on a shared physical register pool that is agnostic to logical-processor boundaries, with scheduling handled by the unified out-of-order scheduler. Caches at the L1 (4-way set-associative with 64-byte lines) and L2 (8-way with 128-byte lines) levels are shared, with cache lines tagged by logical-processor ID to resolve conflicts and ensure correctness during multi-thread access in early designs. Branch predictors use a shared global history tagged by logical-processor ID for pattern tracking, while the return stack buffer is duplicated to maintain accurate call-return predictions per thread; the overall predictor arbitrates access to minimize latency for both threads. Other shared elements include the system bus interface and firmware hub.
While the principles of duplication and sharing persist, specific resource sizes and mechanisms have evolved in later microarchitectures, such as dynamic allocation in modern designs. Modifications to the front-end in early implementations accommodate instruction fetch from multiple threads by widening the fetch and decode stages. The fetch unit employs two next-instruction pointers, one per logical processor, to track fetch addresses for each thread, alternating access between them every clock cycle to deliver up to four instructions in total. Decode logic maintains separate states for both threads and switches between them at a coarser granularity, such as every eight instructions, to sustain higher throughput without excessive switching overhead. Power management in HTT integrates dynamic thread disabling to optimize efficiency under varying workloads. When a logical processor executes a HALT instruction, the core enters a single-task low-power state (ST0 or ST1), allowing the physical core to reallocate resources—such as ROB entries and queues—to the active thread for unimpeded single-threaded execution; if both threads halt, the core enters a deeper power-saving mode. This mechanism, combined with independent halting per logical processor, enables fine-grained control over power dissipation and thermal output specific to hyper-threaded operation.
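The single-task and multi-task mode transitions above reduce to a small state function. A minimal sketch, assuming the mode names (MT, ST0, ST1, and a deeper sleep state) used in this section; real hardware folds this decision into power-management logic rather than software:

```python
def core_mode(halted0, halted1):
    """Mode of a hyper-threaded core given each logical processor's HALT state.

    halted0 / halted1: whether logical processor 0 / 1 has executed HALT.
    """
    if halted0 and halted1:
        return "deep-sleep"   # both halted: core enters a deeper power-saving mode
    if halted1:
        return "ST0"          # only LP0 active: it receives the full core resources
    if halted0:
        return "ST1"          # only LP1 active
    return "MT"               # both active: resources partitioned or shared
```

In MT mode partitioned structures (ROB entries, load/store queues) are split between the threads; a transition to ST0 or ST1 recombines them for the remaining thread.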

Thread Execution Process

In Hyper-Threading Technology (HTT), the operating system (OS) perceives each physical core as two logical processors, enabling the scheduler to assign software threads independently to these logical processors as if they were distinct physical entities. The OS scheduler, such as Windows NT's or Linux's, treats logical processors symmetrically with physical ones, dispatching threads based on priority, load balancing, and affinity settings to optimize cache locality and reduce migration overhead. Thread affinity mechanisms allow developers or the OS to pin threads to specific logical processors, ensuring a consistent mapping that avoids unnecessary context switches between sibling logical processors on the same physical core, which could increase latency due to shared resources. During runtime, the execution flow begins with the frontend fetching instructions alternately from the two active threads on each cycle, arbitrating access to maintain fairness and prevent starvation. Out-of-order execution units then process micro-operations (uops) from both threads concurrently, with schedulers capable of dispatching up to six uops per cycle by interleaving operations from the two threads to hide latency from stalls like cache misses in one thread. Instruction retirement occurs in program order for each thread independently, using a partitioned reorder buffer (e.g., 63 entries per logical processor in early designs) for checkpointing speculative execution states, allowing precise rollback if branch mispredictions or exceptions arise without affecting the other thread. Interrupts and exceptions are handled per logical processor via dedicated local APICs, ensuring isolation so that an interrupt on one thread does not disrupt the execution context of its sibling on the shared physical core. Synchronization in HTT environments relies on standard OS primitives adapted for shared-core contention, such as mutex locks and semaphores, which must account for reduced latency in intra-core communication but increased risk of resource thrashing.
Barriers, used to coordinate progress in parallel workloads, benefit from HTT's fine-grained interleaving, as threads stalled at a barrier free execution resources for the other thread, improving overall utilization; however, implementations should incorporate pause instructions in spin-wait loops to yield shared resources and prevent excessive power consumption on the shared core. Consider a simple parallel matrix multiplication workload divided into two threads assigned by the OS scheduler to sibling logical processors on one physical core. Thread A loads row data and performs floating-point multiplications, while Thread B handles column data with similar operations; the frontend alternates fetching their instructions, allowing Thread A's cache-miss stall to be masked by dispatching Thread B's uops to underutilized ALUs. As Thread A retires independent results to its architectural registers, Thread B advances similarly, with a shared barrier synchronizing partial sums at iteration ends—here, Thread B's quicker progress fills idle cycles, achieving up to 30% higher throughput than single-threaded execution on the same core by better exploiting execution-unit parallelism.
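The two-thread workload just described can be sketched with standard OS threading primitives. This is an illustrative Python analogue (a split dot product standing in for the row and column work) of barrier-synchronized sibling threads, not Intel code; note that CPython's global interpreter lock means it demonstrates the synchronization pattern rather than an actual SMT speedup.

```python
import threading

def parallel_dot(a, b):
    """Two threads each compute half of a dot product, meeting at a barrier."""
    n = len(a)
    results = [0, 0]
    barrier = threading.Barrier(2)   # both workers must arrive before proceeding

    def worker(tid):
        lo, hi = (0, n // 2) if tid == 0 else (n // 2, n)
        results[tid] = sum(a[i] * b[i] for i in range(lo, hi))
        barrier.wait()               # synchronize partial sums, as sibling threads would

    workers = [threading.Thread(target=worker, args=(t,)) for t in (0, 1)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results[0] + results[1]
```

On a real hyper-threaded core, the faster worker reaching `barrier.wait()` stalls and frees shared execution resources for its sibling, which is exactly the behavior the paragraph above describes.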

History and Evolution

Origins and Development

The concept of simultaneous multithreading (SMT), the foundational technology behind hyper-threading, emerged from academic research in the early 1990s aimed at improving processor efficiency through better resource utilization. Pioneering work by Yale Patt and colleagues on high-performance superscalar microarchitectures in the 1980s laid the groundwork by emphasizing instruction-level parallelism, which later influenced SMT designs. The term "simultaneous multithreading" was first proposed in a 1992 paper by Hideki Hirata et al., who explored issuing instructions from multiple threads in a single cycle to mitigate resource-utilization issues in superscalar processors. This was expanded in 1995 by Dean M. Tullsen, Susan J. Eggers, and colleagues at the University of Washington, whose seminal ISCA paper demonstrated SMT's potential to achieve up to 50% higher throughput on existing hardware by overlapping thread execution, without requiring massive increases in hardware complexity. Intel began adopting SMT concepts in the late 1990s as part of efforts to sustain performance gains amid physical limits on clock speed scaling, driven by escalating power dissipation and thermal challenges in semiconductor technology. By around 2000, clock frequencies had approached 1 GHz, but further increases risked prohibitive heat and energy costs, prompting a shift toward architectural innovations like threading to extract more throughput from single cores. Deborah T. Marr, leading an architecture team in the Desktop Products Group, spearheaded the development of hyper-threading as Intel's proprietary implementation for the NetBurst microarchitecture. Motivated by simulations showing underutilized execution units in superscalar designs—often idle due to branch mispredictions or cache misses—Marr's group focused on duplicating minimal architectural state (like registers and program counters) while sharing core resources to enable two logical processors per physical core. Early prototypes validated the approach, proving feasible integration with a minimal area overhead of about 5%.
Intel secured key patents for hyper-threading mechanisms during this period, including filings on thread scheduling in multi-threaded processors and logical processor emulation. These built on prior art, such as a Sun Microsystems patent granted to Okin in 1994 for similar threading concepts. Initial testing occurred in research labs, influenced by academic studies, confirming viability for commercial deployment without major redesigns. Hyper-threading debuted commercially on November 14, 2002, with the Northwood-core Pentium 4 processors at 3.06 GHz and above, following an earlier rollout in Xeon server processors in February 2002. Intel announced the technology at the Intel Developer Forum in 2001, positioning it as a way to sustain performance scaling by delivering up to a 30% performance uplift in threaded workloads through better core utilization, amid growing demand for multitasking in desktops and servers. This launch marked the first widespread adoption of SMT in consumer x86 CPUs, transitioning the idea from research to production.

Adoption Across CPU Generations

Hyper-Threading Technology was first adopted in consumer processors with the Pentium 4 in November 2002, specifically the 3.06 GHz model based on the Northwood core, enabling one physical core to appear as two logical processors to the operating system. The feature was extended across subsequent Pentium 4 variants, including those on the 90 nm Prescott core, until the architecture's phase-out around 2008, though its primary consumer availability spanned 2002 to 2005. Following the shift to dual-core designs, such as the Pentium D processors introduced in 2005, Hyper-Threading was temporarily discontinued, as these chips relied on multiple physical cores rather than logical threading for multithreading support. Hyper-Threading was revived in 2008 with the Nehalem microarchitecture, powering the first Core i7 processors, where it was reintroduced alongside an integrated memory controller and shared L3 cache to enhance multithreaded workloads. This revival continued into the Sandy Bridge microarchitecture in 2011, which featured an enhanced version known as Hyper-Threading 2.0, offering improved thread scheduling and up to a 15-30% performance uplift in threaded applications compared to single-threaded execution on the same cores. The technology evolved further with hybrid architectures like Alder Lake in 2021, where it is implemented exclusively on performance cores (P-cores) to double their logical threads, while efficiency cores (E-cores) operate without it to prioritize power savings and simpler design. Refinements over generations have maintained the core principle of up to two logical processors per physical core via 2-way SMT, but with optimizations for power efficiency, particularly in 10 nm and later processes; for instance, the 10 nm Tiger Lake architecture simplified thread control logic to reduce gate count and improve energy use without sacrificing throughput.
As of 2025, Hyper-Threading remains standard in Intel's 14th-generation Core processors and server lines, such as Granite Rapids, delivering consistent multithreading benefits in virtualized environments. However, it was omitted from the 15th-generation consumer Core Ultra processors (Arrow Lake) to focus on single-thread performance and die-area reduction, though Intel has confirmed its return in future generations to address multithreaded competitiveness. In comparison, ARM-based processors have seen limited adoption of equivalent SMT designs, with most Neoverse cores eschewing them in favor of higher physical core counts for efficiency in server and mobile applications.

Performance Evaluation

Benefits and Gains

Hyper-Threading Technology enhances CPU throughput by enabling simultaneous multithreading (SMT) on a single physical core, allowing two logical threads to share execution resources and overlap operations to hide latencies from stalls such as cache misses or branch mispredictions. This parallelism improves overall performance in workloads exhibiting thread-level parallelism, delivering gains of up to 30% in common server applications by keeping functional units active when one thread is idle. Key benefits include superior utilization of superscalar execution units, where resources underutilized by one thread—such as during branch mispredictions—can be immediately allocated to the other thread, thereby maximizing pipeline efficiency without additional hardware complexity. Additionally, Hyper-Threading reduces context-switch overhead compared to traditional full-core switching, as the operating system treats logical processors as separate entities while sharing the same core, avoiding the full cost of thread migration between physical cores. The technology proves particularly advantageous for server applications like online transaction processing (OLTP), web serving, and database workloads, where multithreaded environments benefit from increased responsiveness and throughput. In virtualization scenarios, it supports higher virtual machine density by presenting more logical cores to the hypervisor, enabling better resource allocation across partitioned environments without proportional hardware increases. For lightly threaded desktop tasks, such as media processing or background operations, Hyper-Threading sustains performance by handling concurrent streams more effectively than single-threaded execution. Regarding power efficiency, Hyper-Threading achieves lower energy per task in mixed workloads by leveraging existing resources to complete work faster, with implementations adding only about 20% to dynamic power for a 30% instructions-per-cycle (IPC) uplift, outperforming configurations where the feature is disabled.
This modest overhead in die area and power—less than 5% in early designs—allows for substantial throughput gains without excessive resource demands.
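The figures cited above imply a concrete efficiency ratio. Using the 30% IPC gain and 20% dynamic-power cost as stated, throughput per unit of power improves by roughly 8%, and the energy consumed per unit of work drops by roughly the inverse, as this small calculation illustrates:

```python
def perf_per_watt_gain(ipc_gain=0.30, power_gain=0.20):
    """Throughput-per-power ratio of enabling SMT, from the cited 30%/20% figures."""
    return (1 + ipc_gain) / (1 + power_gain)

def energy_per_task_ratio(ipc_gain=0.30, power_gain=0.20):
    """Energy per completed task relative to SMT-off (power * time, time = 1/IPC)."""
    return (1 + power_gain) / (1 + ipc_gain)

ratio = perf_per_watt_gain()       # 1.3 / 1.2, about 1.083: ~8% better perf per watt
energy = energy_per_task_ratio()   # about 0.923: ~8% less energy per task
```

The same arithmetic explains why the benefit disappears for single-threaded work: with no IPC gain, the 20% power cost yields a ratio below 1.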

Benchmarks and Claims

Intel has claimed that early implementations of Hyper-Threading Technology deliver approximately 30% performance gains for multithreaded operating systems and applications compared to non-hyper-threaded processors. In server-centric benchmarks, such as OLTP and web-serving workloads, Intel reported gains ranging from 16% to 28%, with up to 30% improvements in common server applications on Xeon processors. Independent evaluations corroborate these claims with variability across workloads. AnandTech's analysis of SPEC CPU2006 benchmarks on Skylake-era Xeon processors showed an average 20% uplift from Hyper-Threading in multithreaded scenarios. Similarly, Phoronix benchmarks on a Core i7-8700K demonstrated notable gains in multi-threaded applications, such as rendering tasks built for parallel workloads. Single-threaded tasks, however, exhibited minimal or no improvement, often below 5%, highlighting Hyper-Threading's dependency on thread-parallelizable code. Performance variability also stems from workload characteristics and operating system scheduling. Multithreaded applications like video encoding and rendering benefit most, with gains of 20-35% reported in representative tests, while latency-sensitive or single-threaded tasks show little advantage. OS schedulers influence outcomes based on how threads are distributed across logical cores. In modern architectures like Intel's Meteor Lake (Core Ultra Series 1), Hyper-Threading on performance cores synergizes with efficiency cores to enhance multi-threaded performance, particularly in parallel tasks. For Lunar Lake (Core Ultra Series 2), which omits Hyper-Threading in favor of architectural optimizations, multi-threaded performance still improves over Meteor Lake by up to 50% in power-constrained scenarios, demonstrating evolving designs without traditional SMT.

Limitations

Drawbacks and Overhead

Hyper-Threading Technology introduces resource overhead primarily through the duplication of certain architectural structures, such as register files, which can increase power consumption by less than 5% in maximum requirements, and by up to 10% in certain cache-intensive workloads compared to disabled configurations, even in single-threaded scenarios where the additional logical cores remain idle. This elevated draw stems from the sustained activation of shared resources and the baseline complexity of the duplicated state, leading to higher heat generation and potential throttling in densely packed systems. For instance, in SPEC CPU2006 benchmarks on Westmere-EP processors, enabling Hyper-Threading resulted in a consistent power premium of up to 10% for cache-intensive tasks without corresponding performance improvements, exacerbating energy inefficiency. Performance regressions occur due to contention for shared resources, particularly caches, where the two logical processors on a physical core compete for limited space, causing increased cache misses and evictions in cache-sensitive workloads. This contention can lead to slowdowns of 8-50% in HPC applications, as observed on systems such as NERSC's Edison, where doubled thread counts amplify L3 cache pressure and branch mispredictions without sufficient parallelism to offset the interference. Such regressions are pronounced in scenarios with poor data locality or false sharing, where aliased accesses between threads result in unnecessary cache-line invalidations, reducing overall throughput for memory-bound tasks. The implementation of Hyper-Threading adds complexity to operating systems and software development, requiring thread-aware optimizations to manage logical-versus-physical distinctions and minimize synchronization overheads.
Operating systems must handle increased context-switching costs for the additional logical cores, while developers face challenges in tuning applications to avoid resource conflicts, such as padding data structures to prevent false sharing or selecting appropriate blocking APIs over spin-wait loops, which can otherwise consume shared execution resources and negate benefits. Mitigation strategies include disabling Hyper-Threading at the BIOS level to eliminate overhead in single-threaded or contention-heavy workloads, which has been shown to restore performance in parallel applications on Windows and Linux systems by reducing thread spawning and interference. Workload-specific tuning, such as optimizing cache locality through data partitioning or minimizing inter-thread dependencies, further alleviates regressions without full disablement.
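The spin-wait versus blocking trade-off can be shown with Python's standard threading module. A hedged sketch: the x86 PAUSE hint itself cannot be expressed in Python, so the comments mark where it would belong; the point is that a blocking primitive parks the thread and frees the shared core, where a busy loop keeps consuming execution slots the SMT sibling could use.

```python
import threading

flag = threading.Event()

def spin_wait():
    # Anti-pattern on an SMT sibling: this busy loop occupies shared
    # execution resources; on x86, an explicit PAUSE hint belongs in
    # loops like this to throttle the spinning thread.
    while not flag.is_set():
        pass

def block_wait():
    # Preferred: the OS parks the thread until the event is set,
    # releasing the logical processor for its sibling.
    flag.wait()
```

Lock implementations in practice combine the two: a brief pause-throttled spin for short waits, then a fall back to a blocking system call.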

Security Implications

Hyper-threading, by enabling simultaneous multithreading on shared physical cores, amplifies vulnerabilities to transient execution attacks such as Spectre and Meltdown, as logical threads share microarchitectural resources like caches and buffers, facilitating cross-thread data leaks through timing side channels. For instance, attackers can exploit timing differences to infer sensitive data processed by a sibling thread, bypassing isolation boundaries that would otherwise protect against such leaks in non-hyper-threaded configurations. This sharing model heightens risks in environments where threads from different security contexts run concurrently on the same core. A prominent example is the ZombieLoad attack, disclosed in 2019, which targets hyper-threading's shared L1 data cache and load-port buffers to snoop on data from other threads, potentially extracting secrets like passwords or cryptographic keys across privilege levels. This vulnerability, assigned CVE-2018-12130 and affecting Intel CPUs from 2011 onward, leverages speculative execution to access stale data left in buffers by a sibling thread, enabling leaks at rates of up to several kilobytes per second in cross-VM scenarios. Similarly, the RIDL (Rogue In-Flight Data Load) attack, also from 2019 and part of the broader Microarchitectural Data Sampling (MDS) family (CVE-2018-12127), exploits shared line-fill buffers and store buffers under hyper-threading to leak in-flight data across address spaces, including from SGX enclaves or virtual machines. RIDL variants demonstrated practical extraction of kernel data and cryptographic keys, underscoring hyper-threading's role in amplifying these side-channel threats. Mitigations for these vulnerabilities include Intel's microcode updates, which clear affected buffers on context switches, alongside operating system patches that enforce additional isolation, such as Linux's MDS mitigation options.
In vulnerable systems, disabling hyper-threading via BIOS settings or OS configurations fully eliminates the cross-thread leak vectors but incurs performance penalties of up to 30-40% in certain workloads. Hardware-based fixes were introduced in Intel's 12th-generation (Alder Lake) and subsequent CPUs, rendering them unaffected by MDS-class attacks while retaining hyper-threading functionality, supported by ongoing microcode enhancements. These security concerns have contributed to the omission of Hyper-Threading in Intel's Core Ultra Series 2 processors (2024), prioritizing efficiency and isolation without SMT. As of 2025, concerns persist in cloud environments, where multi-tenant setups amplify hyper-threading risks by allowing untrusted workloads on shared cores, potentially enabling data leakage between virtual machines. Providers recommend selective enabling of hyper-threading—such as limiting threads per core to one for high-security workloads—or combining it with advanced isolation techniques to balance performance and protection against evolving side-channel exploits.
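On Linux, administrators can check whether the kernel considers a CPU affected by MDS and which mitigation is active. A best-effort sketch reading the kernel's sysfs vulnerability report (available on kernels from roughly 4.19 onward; the function returns None where the file does not exist, e.g. on non-Linux systems):

```python
def mds_status():
    """Report the kernel's MDS mitigation status (Linux sysfs; None elsewhere)."""
    path = "/sys/devices/system/cpu/vulnerabilities/mds"
    try:
        with open(path) as f:
            # Typical values: "Not affected" on fixed hardware, or
            # "Mitigation: Clear CPU buffers; SMT vulnerable" on patched
            # but susceptible parts with hyper-threading enabled.
            return f.read().strip()
    except OSError:
        return None
```

A status mentioning "SMT vulnerable" indicates the buffer-clearing mitigation is active but sibling threads on a core can still observe each other, which is the scenario that motivates disabling hyper-threading or restricting threads per core.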
