
Efficiently updatable neural network

An efficiently updatable neural network (NNUE) is a specialized neural network architecture optimized for evaluating positions in board games such as shogi and chess. Its input layer encodes board states so that the first-layer result can be updated incrementally, with minimal computational overhead, when pieces move.^[1] Developed initially for shogi engines, NNUE replaces traditional handcrafted evaluation functions with machine-learned models trained on large datasets of game positions, achieving high accuracy while running efficiently on standard CPUs during alpha-beta search.^[2]

The technique originated in 2018 in the Japanese shogi programming community. It was pioneered by Yu Nasu as an extension of the piece-square table methods of earlier engines like Bonanza, using a multi-layer perceptron with half king-piece (HalfKP) features that encode piece placements relative to each player's king.^[3] In 2020, programmer Hisayori "Nodchip" Noda adapted NNUE for the chess engine Stockfish. It was integrated into version 12 and yielded an Elo rating improvement of approximately 90 points over the prior hand-tuned evaluation, marking a pivotal shift toward neural network dominance in competitive chess engines.^[2] Since then, NNUE has been widely adopted in engines such as Komodo Dragon, Igel, and Ethereal. It combines the precision of learned evaluation with the speed required for real-time play, and has influenced training methodologies that leverage self-play and supervised learning from high-quality game databases.^[1]

The architecture typically includes an overparameterized input layer (e.g., 768 features for chess), one or more hidden layers with clipped ReLU activations (often 1024–3072 neurons), and a single output neuron producing a centipawn evaluation score, all quantized for further efficiency.^[1]

Overview

Definition and purpose

An efficiently updatable neural network (NNUE), stylized as ƎUИИ, is a type of feedforward neural network specifically engineered to support rapid incremental updates when inputs undergo minor modifications, making it ideal for real-time evaluation functions in board games such as shogi and chess.^[4] Unlike conventional deep neural networks that require full recomputation for each input change, NNUE leverages sparse, differential updates to maintain computational efficiency on standard CPUs, exploiting the fact that game states typically evolve through small alterations like piece movements.^[5] This architecture was originally developed for computer shogi but has since been adapted for other strategic games.^[6] The primary purpose of NNUE is to generate a dynamic numerical score representing the evaluative strength of a given game position—such as the estimated advantage for one player over the other—thereby approximating the intuitive assessments made by human experts more faithfully than hand-crafted heuristic functions traditionally used in alpha-beta search algorithms.^[4] By integrating neural network capabilities into game engines, NNUE enhances the accuracy of position assessments without compromising the low-latency requirements of search processes, which often demand millions of evaluations per second.^[5] This allows engines to achieve superhuman performance levels while remaining computationally feasible for consumer hardware.^[7] The name NNUE originates from a Japanese wordplay on nue, a mythical chimera-like creature from folklore, and was coined by its inventor, Yu Nasu, in reference to the network's hybrid and adaptive nature.^[3] The core evaluation function of NNUE can be expressed mathematically as

s = f\left(W_4 \cdot \sigma\left(W_3 \cdot \sigma\left(W_2 \cdot \sigma\left(W_1 \cdot x + b_1\right) + b_2\right) + b_3\right) + b_4\right),

where x represents the vectorized board state, W_i denote the weight matrices for each layer, b_i are bias terms, \sigma is the clipped ReLU activation function applied after each hidden layer, and f is a linear scaling function to produce the final score s.^[4] This formulation enables the efficient propagation of changes through the network, particularly in the initial feature transformation layer, to support seamless integration with game tree search.^[5]
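A forward pass of this general shape can be sketched in a few lines of plain Python. This is only an illustration, assuming one bias per affine layer and the clipped activation between layers; the helper names (`clipped_relu`, `affine`, `evaluate`) and the toy weights are hypothetical and do not come from any engine.

```python
def clipped_relu(v):
    """Clamp each element to the [0, 1] interval."""
    return [min(max(0.0, x), 1.0) for x in v]


def affine(W, b, x):
    """Return W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def evaluate(layers, x, scale=1.0):
    """Apply each affine layer in turn, with clipped ReLU between layers only.

    `scale` plays the role of the final linear scaling f.
    """
    h = x
    for i, (W, b) in enumerate(layers):
        h = affine(W, b, h)
        if i < len(layers) - 1:  # no activation after the output layer
            h = clipped_relu(h)
    return scale * h[0]


# Toy network with four affine layers: 3 inputs -> 2 -> 2 -> 2 -> 1 output.
LAYERS = [
    ([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),
    ([[0.5, 0.5], [1.0, -2.0]], [0.0, 0.5]),
    ([[2.0, -1.0]], [0.25]),
]

score = evaluate(LAYERS, [1.0, 0.0, 0.0])
print(score)  # 1.25
```

Real networks differ in layer widths and use integer arithmetic, but the composition of affine maps and clipped activations is the same.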

Key advantages

Efficiently updatable neural networks (NNUE) offer significant update efficiency. Evaluating a single-piece move has constant-time O(1) complexity with respect to the input size, because only the neurons affected in the input layer are updated incrementally, in contrast to the O(n) cost of the full forward pass that a standard neural network requires for each position change.^[1] This reuse of prior computations from accumulators enables rapid adaptation to board states with minimal changes, such as quiet moves affecting just two neurons or captures impacting three.^[1]

NNUE facilitates hybrid integration with classical search algorithms like alpha-beta pruning, allowing neural evaluations to enhance traditional tree searches without incurring proportional computational slowdowns, thereby supporting deeper exploration of game trees on standard hardware.^[2] In practice, this combination has enabled CPU-based engines to achieve performance levels competitive with GPU-accelerated deep learning methods.^[8] In terms of accuracy, NNUE models trained on self-play or human game data outperform hand-crafted evaluation functions by approximately 100 Elo points in engines like Stockfish.^[9]^[8]

NNUE maintains a favorable resource profile, operating efficiently on CPUs with network weights typically under 50 MB, around 10–50 million parameters, and inference times below 1 microsecond per position, ensuring low memory and latency suitable for real-time game play.^[10]^[1]

History

Development in shogi

The efficiently updatable neural network (NNUE) was invented by Yu Nasu in 2018 as an advancement over traditional static neural evaluators in computer shogi programs, such as Bonanza and its derivatives like YaneuraOu.^[4]^[11] Nasu, a member of the Ziosoft Computer Shogi Club and the Tanu-King team, developed NNUE to address limitations in prior evaluation functions that struggled with the computational demands of shogi's search algorithms.^[4]^[12] The primary motivation stemmed from shogi's unique characteristics, including its 9x9 board and piece drop rules, which generate highly dynamic positions requiring evaluation functions resilient to frequent incremental changes during alpha-beta search.^[4]

Early prototypes employed half-KP (King-Piece) features, which encode relative positions between the king and other pieces in a sparse, symmetric manner to capture essential tactical and positional motifs while minimizing input dimensionality.^[4] This design allowed for efficient updates through delta computations, where only affected features are recalculated upon piece movements, enabling rapid evaluation without full network recomputation.^[4]^[13]

Nasu detailed this "efficiently updatable" paradigm in his seminal 2018 paper, "Efficiently Updatable Neural-Network-based Evaluation Functions for Computer Shogi," presented as an appeal document for the 28th World Computer Shogi Championship.^[4] Accompanying the publication, he released the initial codebase on GitHub, implementing NNUE as a USI-compliant evaluation module integrable into existing shogi engines.^[13] The architecture featured a shallow feedforward network with clipped ReLU activations, optimized for CPU execution to maintain search speeds comparable to handcrafted evaluators.^[4]

Early adoption occurred swiftly within the Japanese shogi programming community, with NNUE integrated into the open-source engine YaneuraOu by developer Motohiro Isozaki as early as May 2018.^[12]^[14] This integration enabled the use of wider, more expressive networks without incurring search slowdowns, as the update mechanism preserved low-latency performance during gameplay.^[4] By 2019, YaneuraOu employing NNUE, under the banner "YaneuraOu with Otafuku Lab", won the 29th World Computer Shogi Championship (WCSC29), demonstrating substantial performance gains over prior handcrafted and static neural approaches.^[12]^[15] NNUE continued to power top-performing shogi engines in subsequent championships through the 2020s.

Integration into chess engines

The adaptation of efficiently updatable neural networks (NNUE) from shogi to Western chess began in 2020, driven by Japanese developer Hisayori "Nodchip" Noda, who ported the technology into a development version of the Stockfish engine.^[3]^[8] This port modified the input representation to suit chess, incorporating piece-square table (PST)-style features that encode piece positions relative to the king, enabling efficient updates during search.^[1] Community efforts facilitated the integration, with the NNUE code merged into Stockfish's main repository through collaborative pull requests, marking a pivotal shift toward hybrid neural-classical evaluation in open-source chess engines.^[2] Stockfish version 12, released on September 2, 2020, officially introduced NNUE as the default evaluation function, representing a milestone in chess engine development.^[16] The network was trained on evaluations from millions of positions generated at moderate search depths using prior Stockfish versions, resulting in a substantial strength increase of approximately 90 Elo points compared to Stockfish 11.^[2]^[9] This upgrade preserved Stockfish's alpha-beta search efficiency while enhancing positional understanding, quickly establishing NNUE as a standard for CPU-based engines. 
By 2021, the chess AI community expanded NNUE's reach, with Leela Chess Zero (LC0) incorporating hybrid NNUE variants to combine its deep neural network style with faster CPU inference.^[17] Ongoing refinements in Stockfish included the HalfKAv2 architecture in version 14 (September 2021), which reduced input redundancy by focusing on king-relative features, and larger networks such as the SFNNv6 net in version 16 (June 2023), further boosting performance through refined training and implementation.^[8]^[18] By 2025, NNUE had been integrated into numerous engines beyond Stockfish, including Komodo Dragon, which adopted the technology in its November 2020 release to blend traditional search with neural evaluation for deeper positional insight.^[19] Similarly, Fairy-Stockfish supports variant-specific NNUE networks for fairy chess variants, enabling strong play across non-standard rulesets like those with custom pieces or board geometries.^[20] Recent advancements, such as Stockfish 17 released in September 2024, continued to optimize NNUE for even greater efficiency and strength.^[21] These developments underscore NNUE's versatility beyond standard chess, fostering broader adoption in specialized and experimental engines.

Technical architecture

Layer composition

The efficiently updatable neural network (NNUE) uses a shallow multi-layer feedforward architecture optimized for rapid incremental updates in game tree search. The network typically comprises four layers: a sparse input layer derived from board features, two hidden layers employing clipped ReLU activations, and a single-neuron output layer that yields a scalar evaluation score in centipawns. This topology balances representational capacity with low latency, enabling evaluation speeds exceeding 10 million positions per second on consumer hardware.^[4]^[5] Typical hidden-layer widths are 1024–3072 neurons in the first layer and 32 in the second, supporting hierarchical feature processing from raw board states to refined evaluations. Modern implementations, such as Stockfish as of 2025, favor these larger first hidden layers and may employ variants such as squared clipped ReLU (SCReLU) for improved performance. The clipped ReLU activation, applied after each hidden layer's linear transformation, is defined as

\sigma(x) = \min(\max(0, x), 1),

which ensures numerical stability by clipping outputs to the [0, 1] interval, mitigating gradient issues during training and overflow in integer computations. To further enhance inference speed, weights and biases in these layers are quantized to 8-bit signed integers, reducing memory footprint while preserving accuracy.^[4]^[5] In chess implementations, the network typically encompasses around 10 million parameters, predominantly in the first hidden layer due to the expansive input dimensionality. Shogi variants, adapted to a larger 9x9 board and diverse piece promotions, scale to significantly more parameters, often tens of millions including biases, to capture increased positional complexity.^[5]^[4] The output layer performs a linear projection without activation, producing a value normalized to the [-1, 1] range; this is scaled during evaluation such that +1.0 corresponds to approximately +400 centipawns advantage for the player to move, aligning with traditional engine scoring conventions.^[5]
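The claim that the first hidden layer dominates the parameter count can be checked with simple arithmetic. The sketch below assumes a hypothetical fully connected stack of 40,960 sparse inputs followed by widths 256, 32, 32 and 1. These sizes are illustrative, not taken from a specific engine, and the count ignores the concatenation of the two players' perspectives.

```python
def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected stack of the given widths."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Assumed, illustrative layer widths (input first, output last).
SIZES = [40_960, 256, 32, 32, 1]

total = count_parameters(SIZES)
first_layer = SIZES[0] * SIZES[1] + SIZES[1]

print(total)                # 10495329
print(first_layer / total)  # share held by the first layer, above 0.99
```

Under these assumptions the input-to-hidden weights account for more than 99% of roughly 10.5 million parameters, which is why the incremental update of that layer matters so much.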

Input representation

The input representation in efficiently updatable neural networks (NNUE) employs a sparse binary vector to encode board states, prioritizing king-centric perspectives to reflect strategic priorities in games like chess and shogi, such as piece mobility and safety around the kings.^[1] A core feature type is the Half-KP encoding, which constructs sub-vectors for each possible king position—up to 64 squares in chess—capturing interactions between the king and non-king pieces on the board. In chess implementations, the encoding uses 6 piece types × 64 squares = 384 features per side in the effective representation, extended over 64 king positions in the overparameterized HalfKP structure.^[8]^[22] The overall input vector spans a dimension of roughly 80,000, yet leverages inherent board sparsity, activating only the features corresponding to the current pieces on the board, approximately 32 active features per position, via indexing of piece types and their square positions relative to the king. This design ensures that only occupied squares contribute non-zero values, minimizing computational overhead while preserving positional detail.^[8]^[5] Encoding is inherently side-specific, maintaining distinct representations for white and black kings to incorporate asymmetric viewpoints; features cover all standard piece types (pawn through queen) with square positions defined relative to each king, thereby emphasizing proximity, attacks, and defensive configurations.^[1]^[22] Adaptations for game variants extend this framework: shogi NNUE includes dedicated flags for pieces held in hand, accounting for drop mechanics and promotions without inflating the core vector. For fairy chess variants, additional custom vectors encode bespoke piece movements and interactions, ensuring compatibility with non-standard rulesets.^[1]^[4]
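One way to picture the king-relative indexing is as a flat index built from the king square, the kind of piece, and the piece's square. The layout below is a simplified assumption for illustration (10 piece kinds, no extra padding slots), and the names `halfkp_index` and `active_features` are hypothetical; actual engines use their own index layouts.

```python
NUM_SQUARES = 64
PIECE_KINDS = 10  # 5 non-king piece types (pawn..queen) x 2 colours

# Input features seen from one king's perspective under this layout.
FEATURES_PER_PERSPECTIVE = NUM_SQUARES * PIECE_KINDS * NUM_SQUARES  # 40960


def halfkp_index(king_sq, piece_kind, piece_sq):
    """Flat index of the (king square, piece kind, piece square) feature."""
    if not (0 <= king_sq < NUM_SQUARES and 0 <= piece_sq < NUM_SQUARES):
        raise ValueError("square out of range")
    if not 0 <= piece_kind < PIECE_KINDS:
        raise ValueError("piece kind out of range")
    return (king_sq * PIECE_KINDS + piece_kind) * NUM_SQUARES + piece_sq


def active_features(king_sq, pieces):
    """Sorted indices of the features switched on for one perspective.

    `pieces` is an iterable of (piece_kind, square) pairs for non-king pieces.
    """
    return sorted(halfkp_index(king_sq, kind, sq) for kind, sq in pieces)


# King on square 4 with two other pieces on the board.
print(active_features(4, [(0, 12), (3, 27)]))  # [2572, 2779]
print(2 * FEATURES_PER_PERSPECTIVE)            # 81920, i.e. roughly 80,000
```

Only the listed indices are non-zero, so a position with a few dozen pieces touches a tiny fraction of the input vector.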

Update mechanism

Incremental computation

The core of incremental computation in NNUE lies in the delta update principle, which maintains persistent states for the hidden layer activations across board position changes. Rather than recomputing the entire network input from scratch for each evaluation, the system tracks an accumulator representing the pre-activation values of the first hidden layer. When a minimal change occurs, such as a single move, the accumulator is updated by subtracting the contributions from the removed or altered features (e.g., a piece leaving its square) and adding the contributions from the new features (e.g., the piece arriving at its destination or a captured piece being removed). This is formalized as h' = h - W \cdot \delta_{\text{out}} + W \cdot \delta_{\text{in}}, where h is the current accumulator, W is the weight matrix for the input-to-hidden layer, \delta_{\text{out}} encodes the outgoing feature vector (typically a one-hot vector for the old position), and \delta_{\text{in}} encodes the incoming one.^[4]^[5] This approach optimizes the forward pass by limiting recomputation to only the affected features, which are few in number per move—typically 2 for a quiet move (old and new position of the piece), 3 for a capture (plus the captured piece), or up to 4 for special moves like castling. The full hidden layer update, if needed for evaluation, follows h_{\text{new}} = \max(0, W \cdot x_{\text{new}} + b), where x_{\text{new}} is the updated sparse input feature vector and b is the bias; however, the incremental variant computes the change directly as \Delta h = W \cdot (x_{\text{new}} - x_{\text{old}}), applied additively to the existing accumulator before applying the clipped ReLU activation only during position evaluation. 
This ensures that the bulk of the network's computation remains deferred until necessary, with the accumulator serving as a lightweight, updatable intermediate representation.^[4]^[5] The time complexity of these updates is effectively O(1) per move, as it scales linearly with the small number of changed features (bounded by a constant like 4) multiplied by the hidden layer size, contrasting sharply with the O(d) complexity of a naive full-input recomputation, where d is the total number of input features (often thousands for board representations). In practice, this enables rapid state transitions during search, with updates performed via simple integer additions and subtractions on the accumulator values.^[4]^[5] NNUE's incremental updates involve no approximations, relying on exact arithmetic to preserve precision; quantized integer representations (e.g., 16-bit for accumulators) are used to avoid floating-point accumulation errors over multiple moves, with careful scaling to prevent overflow given the maximum number of active features. This exactness ensures that evaluations remain deterministic and faithful to the trained network parameters, without degradation from repeated delta applications.^[4]^[5]
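The delta update can be illustrated with a small integer accumulator. This is a minimal sketch, assuming the first-layer weights are stored as one column per input feature; `refresh`, `update`, and the toy weights are hypothetical names and values chosen for the example.

```python
def refresh(weights, bias, active):
    """Full recomputation: bias plus the weight column of every active feature."""
    acc = list(bias)
    for f in active:
        for j, w in enumerate(weights[f]):
            acc[j] += w
    return acc


def update(acc, weights, removed, added):
    """Incremental update: subtract removed columns, add new ones."""
    new_acc = list(acc)
    for f in removed:
        for j, w in enumerate(weights[f]):
            new_acc[j] -= w
    for f in added:
        for j, w in enumerate(weights[f]):
            new_acc[j] += w
    return new_acc


# Toy first layer: 6 input features, 3 hidden neurons, integer weights.
WEIGHTS = [
    [1, -2, 3],
    [0, 4, -1],
    [2, 2, 2],
    [-3, 0, 1],
    [5, -1, 0],
    [1, 1, 1],
]
BIAS = [10, 20, 30]

before = refresh(WEIGHTS, BIAS, [0, 2, 5])              # [14, 21, 36]
# A "quiet move": feature 2 switches off, feature 4 switches on.
after = update(before, WEIGHTS, removed=[2], added=[4])  # [17, 18, 34]

# The incremental result matches a full recomputation exactly.
assert after == refresh(WEIGHTS, BIAS, [0, 4, 5])
```

Because the arithmetic is on integers, repeated updates stay bit-identical to a fresh recomputation, which is the exactness property described above.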

Efficiency optimizations

To achieve low-latency evaluations during alpha-beta searches in chess engines, NNUE implementations employ several hardware-aware optimizations that reduce memory bandwidth and computational overhead without compromising positional accuracy. Quantization is a primary technique, converting floating-point weights and activations to low-precision integers such as int8 for weights and int16 for accumulators, with dequantization performed on-the-fly using scaling factors like powers of 2 for efficient bit shifts.^[5] This reduces the model size by approximately 4x (from 32-bit floats to 8-bit integers) while limiting accuracy degradation to negligible levels in shallow networks like those used in Stockfish, where clipped ReLU outputs are bounded to 0–127.^[1] In practice, Stockfish's feature transformer layer applies int8 multiplications followed by int32 accumulation, enabling faster integer arithmetic on modern CPUs and supporting larger hidden layer sizes without proportional increases in memory usage.^[5]

SIMD instructions further accelerate core operations, with AVX2 vectorizing dot products to process 16 int16 values per 256-bit register and AVX-512 extending this to 32 values, providing additional speedup over AVX2 in the innermost evaluation loops.^[1] These extensions leverage instructions like VNNI for fused int8 multiply-accumulate, minimizing data movement in the affine transformations of hidden layers.^[5]

For cache efficiency, implementations precompute sparse feature vectors relative to king positions, using HalfKP encoding where each piece pairs with the opposing king, and store incremental accumulators that update only affected elements per move, avoiding full recomputation except on king relocations.^[8] Additionally, transposing weight matrices aligns memory access patterns with CPU cache lines, improving locality during batched evaluations and reducing load/store latencies by up to 20% in optimized builds.^[23]

Experimental ports extend NNUE to non-x86 hardware, including SIMD-free variants for mobile devices via NNAPI and GPU backends like CUDA, though these remain secondary to CPU optimizations due to the network's small size and the latency costs of data transfer in inference scenarios. Stockfish has supported NNUE on ARM-based systems since version 12 (2020), with ongoing optimizations for portability and minimal performance loss as of Stockfish 17 (September 2024).^[2]^[24] In Stockfish 16 (June 2023), the classical hand-crafted evaluation was fully removed, marking complete reliance on NNUE.^[8]
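Power-of-two quantization with saturation can be sketched as follows. The shift of 6 (a scale of 64) and the helper names are assumptions made for the example, not values taken from Stockfish; the 0 to 127 clamp mirrors the clipped ReLU bound mentioned above.

```python
def quantize(values, shift, lo=-128, hi=127):
    """Scale floats by 2**shift, round, and saturate to a signed 8-bit range."""
    scale = 1 << shift
    return [max(lo, min(hi, round(v * scale))) for v in values]


def dequantize(quantized, shift):
    """Map quantized integers back to floats (lossy wherever saturation hit)."""
    scale = 1 << shift
    return [q / scale for q in quantized]


def clipped_relu_int(acc, hi=127):
    """Integer clipped ReLU: clamp accumulator values to [0, 127]."""
    return [max(0, min(hi, v)) for v in acc]


SHIFT = 6  # assumed scale factor of 2**6 = 64
weights = [0.5, -0.25, 1.5, 3.0]

q = quantize(weights, SHIFT)
print(q)                                # [32, -16, 96, 127]  (3.0 saturates)
print(dequantize(q, SHIFT))             # [0.5, -0.25, 1.5, 1.984375]
print(clipped_relu_int([-5, 64, 300]))  # [0, 64, 127]
```

Training usually has to keep weights inside the representable range so that saturation, as seen for the last weight here, stays rare.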

Training methods

Dataset generation

Training datasets for efficiently updatable neural networks (NNUE) are constructed as pairs of board positions and corresponding evaluation labels, primarily derived from simulations of games played by strong engines. Self-play games generated by engines like Stockfish, searched to depths of 20 or greater, form the core of these datasets, providing a vast array of positions encountered during play. To enhance diversity and capture human-like strategic nuances, positions from human master games are incorporated alongside the self-play data. These datasets have scaled dramatically over time, starting with approximately 10 million positions for initial shogi implementations in 2018 and expanding to 800 million for early chess NNUE in 2020, reaching up to 20 billion positions by 2024; by 2022, some datasets exceeded 4 TB in size.

Labels for these positions are assigned based on game outcomes (win = 1, draw = 0.5, loss = 0), often interpolated linearly to reflect expected scores, or derived from value estimates produced by deeper engine searches or Monte Carlo Tree Search (MCTS) rollouts in compatible frameworks. Supervision is applied sparsely, prioritizing quiet positions where no immediate tactics disrupt the board state, as these allow the network to learn stable positional evaluations without interference from volatile moves. Noisy or tactically unstable positions are filtered out to improve data quality. Recent studies use thresholds such as excluding positions where the absolute difference between the initial evaluation and a quiescence search result exceeds 60 centipawns (roughly 0.6 pawns), or where the negamax difference exceeds 70 centipawns.^[25]

Data augmentation techniques further expand the effective dataset size while promoting symmetry invariance. 
Common methods include mirroring the board across the vertical axis, generating equivalent positions from a single source without altering the underlying evaluation. Positions are also rigorously filtered for reliability, such as discarding those where the discrepancy between the game outcome and the engine evaluation is greater than 1 pawn (100 centipawns), ensuring the training data aligns closely with reliable assessments. This preprocessing yields high-quality inputs tailored for NNUE's supervised learning paradigm.
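The quiet-position filter and the left-right mirror can be sketched as below. The sample records and their field names are hypothetical, squares are assumed to be numbered 0 to 63 with a1 = 0 and h1 = 7, and the mirror ignores details such as castling rights that a real pipeline would have to handle.

```python
def is_quiet(static_eval_cp, qsearch_eval_cp, threshold_cp=60):
    """Keep a position if static and quiescence evaluations roughly agree."""
    return abs(static_eval_cp - qsearch_eval_cp) <= threshold_cp


def mirror_square(sq):
    """Flip a 0..63 square index left-to-right (file f becomes 7 - f)."""
    return sq ^ 7


def mirror_position(pieces):
    """Mirror a {square: piece} mapping across the vertical axis."""
    return {mirror_square(sq): piece for sq, piece in pieces.items()}


# Hypothetical training records with evaluations in centipawns.
samples = [
    {"id": "a", "static": 35, "qsearch": 80},   # differs by 45: kept
    {"id": "b", "static": 10, "qsearch": 120},  # differs by 110: dropped
]
kept = [s["id"] for s in samples if is_quiet(s["static"], s["qsearch"])]
print(kept)  # ['a']

# A pawn on e2 (square 12) maps to d2 (square 11) after mirroring.
print(mirror_position({12: "P"}))  # {11: 'P'}
```

Applying the mirror twice returns the original position, so each source position contributes at most one extra training sample.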

Optimization techniques

The training of efficiently updatable neural networks (NNUE) follows a supervised regression framework, where the primary objective is to minimize the discrepancy between the network's predicted position evaluations and target values derived from game outcomes. The standard loss function is the mean squared error (MSE), formulated as
L = \frac{1}{N} \sum_{i=1}^{N} (s_{\text{pred},i} - s_{\text{target},i})^2,
where N is the number of samples in a batch, s_{\text{pred}} denotes the network's output score, and s_{\text{target}} represents discounted game results, typically scaled as 1 for a win, 0.5 for a draw, and 0 for a loss, with adjustments based on the game phase to emphasize midgame or endgame relevance.^[4]^[1] This setup ensures the network learns a smooth evaluation function aligned with actual play outcomes, prioritizing positions near the end of self-play games for higher reliability.^[5]

Optimization relies on gradient-based methods such as Adam or stochastic gradient descent (SGD) with momentum to update network weights efficiently. A common learning rate schedule begins at 10^{-3} and decays progressively to 10^{-5} over roughly 100 epochs, allowing initial rapid convergence followed by fine adjustments to avoid overshooting minima.^[26] Batch sizes typically range from 10,000 to 100,000 positions, enabling stable gradient estimates while leveraging GPU parallelism for large-scale training; smaller batches may introduce noise beneficial for generalization, but larger ones accelerate throughput on modern hardware.^[5]^[8]

To mitigate overfitting, L2 regularization via weight decay is applied, penalizing large weights to promote generalizable representations suitable for the NNUE's integer-based deployment. Early stopping monitors validation performance, halting training when metrics like Elo rating on a held-out set plateau, typically after observing no improvement over several epochs.^[26]^[6]

Post-training refinement often involves knowledge distillation, where a smaller NNUE student network is trained to mimic outputs from a larger teacher model, compressing knowledge while preserving accuracy. Additionally, fine-tuning on curated datasets of human expert games can infuse stylistic elements, such as positional preferences or aggressive tendencies, enhancing the network's alignment with intuitive play beyond pure win-rate optimization.^[1]^[26]
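The loss and a decaying learning rate can be written out directly. The geometric (exponential) decay shape below is an assumption, since the text only says the rate decays progressively; the function names are hypothetical.

```python
def mse(predictions, targets):
    """Mean squared error over a batch of predicted and target scores."""
    n = len(predictions)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n


def learning_rate(epoch, total_epochs=100, start=1e-3, end=1e-5):
    """Geometric decay from `start` at epoch 0 to `end` at the last epoch."""
    frac = epoch / (total_epochs - 1)
    return start * (end / start) ** frac


# Targets use the 1 / 0.5 / 0 convention for win / draw / loss.
print(mse([1.0, 0.5], [0.5, 0.5]))  # 0.125

for epoch in (0, 50, 99):
    print(epoch, learning_rate(epoch))
```

In a real trainer these would be supplied by the framework's loss and scheduler classes; the point here is only the shape of the objective and the schedule.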

Applications

Use in chess evaluation

In modern chess engines employing alpha-beta search, NNUE functions as the core evaluation mechanism, supplanting traditional handcrafted heuristics particularly at the leaf nodes of the search tree where full position assessments are required.^[1] This integration allows the network to output a static evaluation in centipawns, a numerical score representing the advantage for the side to move, which directly informs the minimax decision process and enhances move ordering for subsequent iterations.^[2] By leveraging incremental updates, NNUE ensures that evaluations remain computationally efficient even during the exploration of millions of positions, enabling deeper searches without prohibitive overhead.^[27]

A prominent example is Stockfish, where NNUE evaluations are incorporated into the principal variation search (PVS) variant of alpha-beta pruning, providing precise centipawn scores that guide beta cutoffs and aspiration windows.^[2] The improved evaluative accuracy of NNUE facilitates more aggressive pruning of suboptimal branches, resulting in effective search depth gains despite an approximately 50% reduction in raw nodes per second compared to prior classical evaluations, and yields Elo improvements of around 100 points in playing strength.^[8] To further optimize integration, Stockfish and similar engines scale NNUE usage by depth, applying the full network for deeper plies (typically beyond moderate depths) where its nuanced understanding outperforms simpler heuristics, while reserving hybrid or classical components for shallow searches to maintain overall efficiency.^[28]

NNUE variants in chess engines often involve phase-specific tuning, with networks trained or fine-tuned on datasets emphasizing opening complexities or endgame simplifications to better capture positional nuances in those regimes.^[3] Additionally, adjustments akin to contempt factors (mechanisms to bias evaluations against premature draws in balanced positions) are implemented through hybrid approaches, combining NNUE outputs with classical terms to encourage decisive play in competitive settings.^[29]

As of 2025, NNUE-derived evaluations dominate top-tier chess engines, powering nearly all alpha-beta-based participants in events like the Top Chess Engine Championship (TCEC), where Stockfish continues to lead with consistent victories in events such as Cup 15. By 2023, Stockfish had fully transitioned to an NNUE-only evaluation, removing hand-crafted components, further solidifying its lead in competitions like the TCEC Season 28 superfinal in September 2025. This prevalence underscores NNUE's role in elevating engine performance, with nearly all top contending engines in recent TCEC cycles adopting such architectures for their superior blend of speed and accuracy.^[30]

Extensions to other games

NNUE has been adapted for shogi beyond its initial development, with full integration into engines such as YaneuraOu, introduced in 2018. This adaptation extends the input representation to accommodate shogi's unique piece drop mechanics by incorporating additional features for hand-held pieces and their potential placement squares, enabling efficient updates during search. The resulting evaluation functions have demonstrated substantial strength improvements in competitive play, often exceeding prior neural network approaches by significant margins in Elo ratings.^[14]^[1] Extensions to fairy chess variants and other board configurations have been facilitated through engines like Fairy-Stockfish, released in 2021 as a derivative of Stockfish. This engine customizes NNUE inputs to support non-standard pieces, such as enhanced bishops or promoted variants, and larger board sizes up to 12x12, including examples like 10x10 grids for games such as Grand Chess. Variant-specific NNUE networks are trained and loaded to enhance evaluation accuracy, outperforming handcrafted functions across supported fairy variants by providing more nuanced positional assessments.^[31]^[20] Experimental adaptations of NNUE to other games, such as Xiangqi, have explored simplified board representations to fit the architecture's piece-centric design. For Xiangqi, Fairy-Stockfish incorporates built-in NNUE networks since 2021, adjusting for the 9x10 board and river mechanics, which yield marked improvements over classical evaluations in engine strength.^[32]^[33]

Performance evaluation

Benchmarks against traditional methods

The integration of NNUE into Stockfish resulted in a substantial strength improvement over the classical handcrafted evaluation, with fishtest regressions showing gains of over 80 Elo points.^[2] Specifically, in time control tests such as 10+0.1 seconds per thread, NNUE achieved an Elo gain of 92.77 ± 2.1 over the classical version, based on 60,000 games.^[2] In broader evaluations, this translated to approximately 100 Elo points of overall improvement relative to Stockfish 11.^[9] In competitive settings like the Top Chess Engine Championship (TCEC) Season 22 in 2022, Stockfish with NNUE demonstrated superiority over Leela Chess Zero (LC0) on equalized hardware, winning superfinal matches and establishing an effective Elo advantage of around 50 points or more when normalized for comparable computational resources.^[34] Stockfish has maintained its dominance in subsequent TCEC seasons, winning the superfinal in Season 28 in 2025 against LC0 with a score of 36-21 favoring Stockfish (100 games).^[35] NNUE's design enables significantly faster position evaluations than dense neural networks like those in LC0, which rely on GPU acceleration for practical use. 
On CPU hardware, Stockfish NNUE achieves search speeds of approximately 100 million nodes per second, compared to roughly 75,000 nodes per second for LC0 under similar conditions.^[36] This represents a substantial advantage in evaluation efficiency over dense networks when run on consumer CPUs without specialized accelerators, allowing deeper searches in alpha-beta frameworks.^[1] Regarding hardware scaling, on typical consumer CPUs (e.g., multi-core Intel i7), NNUE enables Stockfish to process 13-20 million nodes per second.^[37] This efficiency stems from NNUE's incremental updates and integer-based computations, which leverage SIMD instructions for broad accessibility.^[5] Since the initial integration, NNUE has contributed to further Elo gains in Stockfish versions, with Stockfish 17 (released September 2024) showing up to 46 Elo points improvement over Stockfish 16, and cumulative gains exceeding 200 Elo from Stockfish 12.^[21]
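To put the Elo figures quoted in this section in perspective, the standard logistic Elo model converts a rating difference into an expected score per game. The sketch below uses that textbook formula; it is not how fishtest computes its confidence intervals, and the function names are hypothetical.

```python
import math


def expected_score(elo_diff):
    """Expected score per game for the side that is `elo_diff` points stronger."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score):
    """Inverse mapping: rating difference implied by a score strictly in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


for diff in (46, 92.77, 200):
    print(diff, round(expected_score(diff), 3))
```

Under this model a difference of about 93 Elo corresponds to an expected score of roughly 63%, counting a draw as half a point.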

Impact on computational resources

The introduction of NNUE in chess engines like Stockfish marked a significant shift from GPU-intensive neural network approaches, such as those in Leela Chess Zero (LC0), to CPU-optimized inference that relies on incremental updates and low-precision integer arithmetic for rapid position evaluation.^[2]^[38] The design exploits sparse inputs and shallow networks, enabling millions of evaluations per second on standard CPUs without specialized hardware and thereby reducing overall computational demands during gameplay.^[5] Training NNUE networks, while initially resource-heavy, became feasible on a single high-end CPU using tools like the nodchip implementation, allowing developers to generate and refine models on accessible hardware rather than multi-GPU clusters.^[2]

NNUE's open-source integration into Stockfish has democratized access to high-performance chess AI: pre-trained networks are released alongside engine updates, so hobbyists and researchers can deploy advanced evaluation without proprietary barriers.^[2] Quantization, which converts weights to int8 and int16 formats while preserving accuracy through scaling and clipping, further improves portability and lets quantized NNUE variants run efficiently on resource-constrained devices such as mobile phones.^[5] This CPU-centric paradigm lowers entry barriers for game AI development, as evidenced by widespread adoption in open-source communities and commercial tools.^[28]
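
The two ideas described above, integer quantization with scaling and clipping and incremental updates of the first layer, can be sketched in a few lines of pure Python. This is a toy illustration under stated assumptions, not the Stockfish implementation: the hidden width `HIDDEN`, the scale factor `SCALE`, the random weights, and all function names are hypothetical, and Python integers stand in for the fixed-width int8/int16 types a real engine would use.

```python
import random

NUM_FEATURES = 768  # chess input size mentioned earlier in the article
HIDDEN = 8          # toy hidden width (assumption; real networks are far wider)
SCALE = 64          # hypothetical quantization scale factor


def quantize(w: float, scale: int = SCALE, lo: int = -128, hi: int = 127) -> int:
    """Scale a float weight, round it, and clip it to the int8 range."""
    return max(lo, min(hi, round(w * scale)))


random.seed(0)
# Random float weights stand in for a trained first layer (assumption).
float_weights = [[random.uniform(-1.0, 1.0) for _ in range(HIDDEN)]
                 for _ in range(NUM_FEATURES)]
weights = [[quantize(w) for w in row] for row in float_weights]


def full_refresh(active):
    """Recompute the accumulator from scratch: sum the weight rows of all active features."""
    acc = [0] * HIDDEN
    for f in active:
        for j in range(HIDDEN):
            acc[j] += weights[f][j]
    return acc


def incremental_update(acc, removed, added):
    """Update a copy of the accumulator by touching only the features that changed."""
    new_acc = list(acc)
    for f in removed:
        for j in range(HIDDEN):
            new_acc[j] -= weights[f][j]
    for f in added:
        for j in range(HIDDEN):
            new_acc[j] += weights[f][j]
    return new_acc


def clipped_relu(acc, cap: int = 127):
    """Clamp each accumulator value to [0, cap], as in a clipped ReLU."""
    return [max(0, min(cap, v)) for v in acc]


if __name__ == "__main__":
    before = {12, 100, 345, 700}       # hypothetical active feature indices
    after = (before - {100}) | {108}   # a "move": one feature off, one on
    acc = incremental_update(full_refresh(before), removed=[100], added=[108])
    assert acc == full_refresh(after)  # same result, far fewer additions
    print(clipped_relu(acc))
```

Because the first layer is a sum over active features, a move that changes two features costs two row operations instead of a sum over every active feature, and the integer arithmetic keeps the incremental result exactly equal to a full recomputation.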

References

[1]
NNUE - Chessprogramming wiki
NNUE, (ƎUИИ Efficiently Updatable Neural Networks) a Neural Network architecture intended to replace the evaluation of Shogi, chess and other board game ...
[2]
Introducing NNUE Evaluation - Strong open-source chess engine
Aug 7, 2020 · NNUE is a neural network evaluation that assigns a value to a chess position, trained on millions of positions, and is efficiently updatable.
[3]
Origins and Development of NNUE in Chess Engines - beuke.org
Oct 7, 2025 · The Efficiently Updatable Neural Network (NNUE) originated in the Japanese computer shogi community. It was invented in 2018 by Yu Nasu, ...
[4]
[PDF] Efficiently Updatable Neural-Network-based Evaluation Functions ...
The NNUE evaluation function is a neural network based function, which evaluates one game position on a CPU without the need of a graphics ...
[5]
NNUE | Stockfish Docs - GitHub Pages
Sep 25, 2025 · What is NNUE? . NNUE (ƎUИИ Efficiently Updatable Neural Network) is, broadly speaking, a neural network architecture that takes advantage of ...
[6]
[PDF] arXiv:2412.17948v1 [cs.AI] 23 Dec 2024
Dec 23, 2024 · NNUE (Efficiently Updatable Neural Networks) is a neural network evaluation function introduced by Yu Nasu, first used in the Japanese Shogi ...
[7]
[PDF] Unveiling Concepts Learned by a World-Class Chess-Playing Agent
Aug 10, 2023 · Stockfish (since version 12) uses a neural network called NNUE [Nasu, 2018] (EUNN Efficiently Updatable Neural Network) for evaluating game ...
[8]
asdfjkl/nnue - GitHub
In 2018, Yu Nasu proposed a neural network based evaluation function for computer shogi in his paper "Efficiently Updatable Neural-Network-based Evaluation ...
[9]
Stockfish NNUE - Chessprogramming wiki
NNUE, introduced in 2018 by Yu Nasu, were previously successfully applied in Shogi evaluation functions embedded in a Stockfish based search, such as YaneuraOu ...
[10]
Stockfish Absorbs NNUE, Claims 100 Elo Point Improvement
Sep 7, 2020 · In less than a month since the integration, Stockfish+NNUE has shown more than 100 Elo points of improvement relative to Stockfish 11. This ...
[11]
Frequently Asked Questions | Stockfish Docs - GitHub Pages
Oct 18, 2025 · Modern Stockfish builds use NNUE (neural-network-based evaluation). NNUE gives stronger play but is significantly slower than the old hand- ...
[12]
Yu Nasu - Chessprogramming wiki
This approach, dubbed NNUE (ƎUИИ Efficiently Updatable Neural Networks), turned out to become extremely powerful. NNUE was used along with a Stockfish based ...
[13]
YaneuraOu - Chessprogramming wiki
At the WCSC29, it used a NNUE type of evaluation developed by Tanu-King team member Yu Nasu. One of the characteristics of the NNUE type is that the ...
[14]
ynasu87/nnue: Efficiently Updatable Neural-Network-based ... - GitHub
Implemented on top of the open-source shogi program YaneuraOu by replacing its evaluation function; it runs as a shogi engine compatible with the USI protocol. The 28th ...
[15]
YaneuraOu is the World's Strongest Shogi engine(AI player ... - GitHub
Supports a variety of platforms, including Windows, Ubuntu, macOS, and ARM. Supported evaluation functions are KPPT, KPP_KKPT, and NNUE (various types). Features of FukauraOu.
[16]
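[PDF] "YaneuraOu with Otafuku Lab 2019" (WCSC29 winner) appeal document
May 13, 2019 · One characteristic of the NNUE type is that its evaluation function has fewer parameters than the KPPT type. The number of training positions required for reinforcement learning, the evaluation function's ...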
[17]
Stockfish 12 - Stockfish - Strong open-source chess engine
Sep 2, 2020 · The recommended parameters of the NNUE network are embedded in distributed binaries, and Stockfish will use NNUE by default. Both the NNUE ...
[18]
Jumping on the NNUE bandwagon | Leela Chess Zero
We are talking about Efficiently Updatable Neural Networks (referred to as NNUE, giving new meaning to backronyms) allegedly discovered by Japanese monks.
[19]
Stockfish 16 - Stockfish - Strong open-source chess engine
Jun 30, 2023 · This updated version of Stockfish introduces several enhancements, including an upgraded neural net architecture (SFNNv6), improved implementation, and refined ...
[20]
Komodo Releases Powerful New 'Dragon' Chess Engine
Nov 11, 2020 · The Komodo team has released Komodo 14.1 and a new chess engine dubbed "Dragon." Dragon adds powerful NNUE (Efficiently Updatable Neural Networks) technology ...
[21]
Download NNUE | Fairy-Stockfish
Variant-specific NNUE (efficiently updatable neural network) evaluation files can be used to improve playing strength compared to the handcrafted evaluation.
[22]
HalfKP/NNUE: Efficiently Updatable Neural Network - GitHub
The neural network consists of four layers. The input layer is heavily overparametrized, feeding in the board representation for all king placements per side.
[23]
Making NNUE 60% faster - TalkChess.com
The most innermost hot loop of NNUE can be optimized for all intrinsic AVX2, AVX512 types and it was a very fun journey.
[24]
My solution: Cfish, nnue, data (1st) - Kaggle
Mar 31, 2025 · Modern stockfish is optimized for strong nnue (~70mb compressed weights) evaluations at long time controls. Many improvements in recent ...
[25]
Using Neural or Graphical Processor Units in NNUE engines
Jun 16, 2024 · Hi, I wonder why NNUE chess Engines don't use Neural processing units or GPUs of modern Apple and Qualcomm SOCs in Smatphones, instead of using there CPUs.
[26]
Stockfish NNUE (Chess evaluation) trainer in Pytorch - GitHub
Games are played using c-chess-cli and nets are ranked using ordo . This script runs in a loop, and will monitor the directory for new checkpoints. Can be run ...
[27]
Study of the Proper NNUE Dataset - arXiv
NNUE (Efficiently Updatable Neural Networks) is a chess evaluation technique that utilizes the incremental update technique to quickly update the model for ...
[28]
A Theoretical Analysis of the Development and Design Principles of ...
May 10, 2025 · Incremental Updates: NNUE updates only the affected parts of the evaluation after each move, leveraging the locality of chess moves for speed.
[29]
Does Stockfish NNUE have contempt implemented?
Sep 4, 2020 · Stockfish NNUE does not have contempt per se - it's not clear how to implement contempt in NNs. However, Stockfish NNUE does use the handcrafted eval in ...
[30]
https://www.attackingchess.com/top-10-strongest-chess-engines-in-2025/
[31]
https://github.com/fairy-stockfish/Fairy-Stockfish
[32]
fairy-stockfish/Fairy-Stockfish: chess variant engine supporting ...
Fairy-Stockfish is a chess variant engine derived from Stockfish designed for the support of fairy chess variants and easy extensibility with more games.
[33]
Fairy-Stockfish 14.0.1 XQ
Nov 19, 2021 · A new version of Fairy-Stockfish is available, see the release notes. This release is specifically for providing built-in NNUE networks for Xiangqi and Janggi.
[34]
Specialized NNUE releases of Fairy-Stockfish - GitHub
Fairy-Stockfish releases with built-in NNUE (neural network) for Xiangqi, Janggi, and Makruk - fairy-stockfish/Fairy-Stockfish-NNUE.
[35]
[2412.17948] Study of the Proper NNUE Dataset - arXiv
Dec 23, 2024 · In this paper, we propose an algorithm for generating and filtering datasets composed of "quiet" positions that are stable and free from tactical volatility.
[36]
Stockfish wins TCEC Season 22, sets records - Chessdom
Apr 21, 2022 · ... NNUE gaining another 2-3 elo in self-play was trained. Runner-up Komodo Dragon can also look back on Season 22 as a successful campaign ...
[37]
Evolution of a Chess Fish: What is NNUE, anyway?
Dec 28, 2020 · What does NNUE stand for? NNUE stands for Efficiently Updateable Neural Network. It's "NNUE" instead of "EUNN" because the technique was adopted ...
[38]
Strategic Test Suite - Chessprogramming wiki
The Strategic Test Suite (STS) is a series of themed test suites designed to evaluate chess engines' long-term understanding of strategic and positional ...
[39]
The playing style of StockfishNNUE - Chess Forums
Jul 20, 2020 · Stockfish NNUE, with no openings book, currently leads 9.0/9 but has not faced anyone important yet. SFNNUE with an openings book is 4th at 7.5/ ...
[40]
[PDF] Chess AI: Competing Paradigms for Machine Intelligence - arXiv
Sep 23, 2021 · Nasu, Yu (2018). “NNUE: Efficiently Updatable Neural-Network-Based Evaluation Functions for. Computer Shogi”. The 28th World Computer Shogi ...