KataGo
KataGo is a free and open-source computer Go program that uses deep learning, including AlphaZero-style self-play reinforcement learning, to play the strategic board game Go at superhuman levels.[1] Developed by David J. Wu under the GitHub username lightvector, it was initially released in early 2019 following a seven-day training run on high-performance GPUs, marking a significant advance in open-source Go AI.[1][2] KataGo's core innovation is its accelerated self-play training process, which incorporates improved value and policy targets, auxiliary training heads for territory estimation, and efficient Monte Carlo tree search, enabling it to reach professional-strength play from scratch in days on modest hardware, or superhuman performance in months on a single high-end GPU.[1][3]

The program supports board sizes from 7x7 to 19x19, multiple rule sets including Japanese and Chinese, variable komi values, and handicap games, all handled by a single neural network without human-provided game knowledge.[4][2] Its neural networks are publicly available through an ongoing distributed training effort, which as of 2025 has amassed over 4.3 billion training examples from more than 87 million self-play games contributed by over 1,300 users worldwide.[4]

Since its inception, KataGo has received financial backing from Jane Street, allowing extended training runs that have positioned it as one of the most powerful open-source Go engines, surpassing earlier programs such as Leela Zero in both efficiency and strength, while providing tools for game analysis, score prediction, and human-like play simulation.[5][2] It integrates with online platforms for player review and study, and has helped solve complex historical Go problems, such as those in the classic Japanese text Igo Hatsuyōron, by identifying moves and outcomes beyond the reach of human analysis alone.[5] By 2025, recent versions such as v1.16 include specialized models for mimicking human moves across ranks from 20 kyu to 9 dan, enhancing its utility for training and education in the Go community.[4]

Overview
Development history
KataGo was initiated in 2018 by David J. Wu, a computer scientist specializing in game AI, as an open-source project to advance reinforcement learning techniques for the game of Go.[1] Wu, who had previously developed the Arimaa-playing program bot_Sharp, sought to build on AlphaZero-style self-play methods by introducing optimizations tailored to Go, enabling more efficient training without reliance on proprietary systems like AlphaGo.[2] The core motivation was to democratize access to high-performance Go AI through faster learning algorithms that reduced computational demands while matching or exceeding superhuman play strength.[1]

A pivotal milestone came with the preprint "Accelerating Self-Play Learning in Go", published on February 27, 2019, which detailed techniques such as global pooling, auxiliary training targets including board ownership, and score-based loss functions, achieving a roughly 50-fold reduction in computation compared to prior methods.[1] Concurrently, Wu launched the project's GitHub repository under the username lightvector, making the GTP engine and self-play training code publicly available.[2] Released under the permissive MIT license, KataGo encouraged widespread adoption and modification, fostering community-driven enhancements to the engine, neural network implementations, and integrations with Go-playing platforms.

Over time, KataGo evolved through iterative releases, incorporating feedback from users and researchers to refine its architecture and performance. A significant update came in version 1.12.0, released on January 8, 2023, which moved the training backend from TensorFlow to PyTorch for improved efficiency and compatibility in distributed self-play.[6] This shift, alongside a new nested residual bottleneck neural network design, marked a key advance in the project's scalability, allowing faster experimentation and stronger models without altering the core search mechanisms.[6]

Release and adoption
KataGo's initial release, version 1.0, occurred on February 27, 2019, establishing it as a GTP-compatible engine capable of playing and analyzing Go games through self-play learning enhancements. This version laid the foundation for its open-source availability on GitHub, where it quickly gained traction among developers and Go enthusiasts for its efficient neural network architecture and search algorithms.[2]

Subsequent major updates expanded its capabilities and accessibility. Version 1.15.0, released on July 19, 2024, introduced human-like imitation features via a supervised learning model trained to predict moves across player ranks from 20 kyu to 9 dan.[7] The stable release of version 1.16.3 followed on June 28, 2025, incorporating performance optimizations such as improved Metal backend support for macOS and enhanced training-data handling for larger board sizes.[8] Version 1.16.4, released in October 2025, added experimental eval-caching and further bug fixes in distributed training.[9] Precompiled binaries for Windows, Linux, and macOS are distributed through GitHub releases, while neural network weights are freely hosted on katagotraining.org to support user experimentation and integration.[9][10]

KataGo's adoption has grown significantly within Go communities and digital platforms. It serves as the default engine for game analysis on Online-Go.com (OGS), enabling users to review matches with AI feedback directly in the interface.[11] Mobile applications, such as AI KataGo Go, have extended its reach to casual players; the iOS version launched on October 24, 2021, followed by the Android release on November 24, 2023, allowing offline play against the AI on smartphones and tablets.[12][13]

The project's community impact is bolstered by the volunteer-driven distributed training initiative "kata1", which commenced on November 28, 2020, and leverages global contributors to generate high-quality self-play data.[4] This effort has enabled free access to advanced models, including the b28c512 network (28 residual blocks and 512 channels), released in May 2024, which marked a substantial strength improvement over prior iterations through extended training on distributed resources.[10][14]

Technical architecture
Neural network design
KataGo's neural network is a convolutional residual network (ResNet) using pre-activation blocks, similar to those in AlphaGo Zero and AlphaZero but with enhancements tailored to Go. The architecture employs a trunk of residual blocks whose depth and width vary by model: smaller models like b6c128 use 6 blocks with 128 channels, larger ones such as b20c256 use 20 blocks with 256 channels, and recent variants like b18c384nbt use 18 nested bottleneck residual blocks with 384 channels for efficiency.[2]

The network processes inputs tailored to Go's strategic depth, using 18 binary spatial feature planes for a standard 19x19 board.[15] These planes encode the current stone positions of both players (2 planes), stones with exactly 1, 2, or 3 liberties (3 planes), moves illegal due to ko, superko, or suicide (1 plane), the locations of the last 5 moves (5 planes), stones that became ladderable 0, 1, or 2 turns ago (3 planes), moves that would catch the opponent in a ladder (1 plane), pass-alive areas for self and opponent (2 planes), and a mask of on-board locations (1 plane). Additional global features include real-valued scalars for game length, ruleset parameters such as ko bans and komi, and positional summaries such as the number of captured stones. This design provides the network with rich, game-specific context beyond raw board states, including indicators of self-atari risks.

Outputs are generated by dedicated heads attached to the shared trunk. The policy head produces a 19x19 spatial map of move logits plus a separate logit for passing, yielding a probability distribution over all possible actions after softmax. The value head outputs scalars estimating the win probability for the current player and the expected score difference. Complementing these is an ownership head, delivering a 19x19 heatmap in which each value is the predicted probability that the intersection will belong to the current player at the game's end, aiding territory evaluation.

To enhance global awareness, the architecture inserts global pooling layers after convolutional stages, computing channel-wise statistics such as means and maxima across the entire board and feeding them into subsequent layers. Domain-specific features, such as liberty counts and ladder outcomes, further distinguish KataGo from generic AlphaZero setups by accelerating convergence on Go's tactical nuances. Since version 1.12, the training framework has used PyTorch, supporting variable board sizes from 7x7 to 19x19 without architectural changes.[6] These policy and value outputs integrate with Monte Carlo tree search during gameplay.
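The overall shape of this design can be illustrated with a short PyTorch sketch. It is a simplified, hypothetical model for exposition only, not KataGo's actual implementation: the class names, layer sizes, and the exact form of the pooling bias are assumptions, but the structure mirrors the trunk-plus-heads layout and global pooling described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalPoolBias(nn.Module):
    # Pools board-wide mean and max statistics per channel, then mixes them
    # back in as per-channel biases so convolutions see whole-board context.
    def __init__(self, channels: int):
        super().__init__()
        self.mix = nn.Linear(2 * channels, channels)

    def forward(self, x):
        stats = torch.cat([x.mean(dim=(2, 3)), x.amax(dim=(2, 3))], dim=1)
        return x + self.mix(stats)[:, :, None, None]

class PreActBlock(nn.Module):
    # Pre-activation residual block: BN -> ReLU -> conv, applied twice.
    def __init__(self, channels: int):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)

    def forward(self, x):
        y = self.conv1(F.relu(self.bn1(x)))
        y = self.conv2(F.relu(self.bn2(y)))
        return x + y

class TinyKataNet(nn.Module):
    # Shared trunk plus policy, value, and ownership heads (b6c128-like scale).
    def __init__(self, in_planes: int = 18, channels: int = 128, blocks: int = 6):
        super().__init__()
        self.stem = nn.Conv2d(in_planes, channels, 3, padding=1, bias=False)
        self.trunk = nn.Sequential(*[PreActBlock(channels) for _ in range(blocks)])
        self.pool = GlobalPoolBias(channels)
        self.policy_conv = nn.Conv2d(channels, 1, 1)  # per-point move logits
        self.pass_head = nn.Linear(channels, 1)       # separate pass logit
        self.value_head = nn.Linear(channels, 2)      # win prob + score lead
        self.own_conv = nn.Conv2d(channels, 1, 1)     # per-point ownership prob

    def forward(self, planes):
        x = self.pool(self.trunk(self.stem(planes)))
        pooled = x.mean(dim=(2, 3))
        policy = torch.cat(
            [self.policy_conv(x).flatten(1), self.pass_head(pooled)], dim=1)
        win, score = self.value_head(pooled).unbind(dim=1)
        ownership = torch.sigmoid(self.own_conv(x)).squeeze(1)
        return policy, torch.sigmoid(win), score, ownership

net = TinyKataNet()
planes = torch.zeros(1, 18, 19, 19)  # dummy 18-plane encoding of an empty board
policy, win, score, ownership = net(planes)
print(policy.shape, round(win.item(), 3), ownership.shape)  # (1, 362), scalar, (1, 19, 19)
```

In the real network the pooled statistics are injected at several points along the trunk and the heads are considerably more elaborate, but the data flow is the same in spirit.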
Search mechanism

KataGo employs an AlphaZero-style Monte Carlo tree search (MCTS), in which simulations are guided by the neural network's policy and value outputs to explore the game tree efficiently. The policy network provides prior probabilities for move selection, incorporated into the PUCT formula that balances exploration and exploitation:

\text{PUCT}(c) = V(c) + c_{\text{PUCT}} \cdot P(c) \cdot \frac{\sqrt{\sum_{c'} N(c')}}{1 + N(c)}, \qquad c_{\text{PUCT}} = 1.1,

where V(c) is the average value of child c, P(c) its policy prior, and N(c) its visit count.[15] Value estimates from the network are backed up through the tree to evaluate node utilities, enabling rapid assessment of positions without full rollouts.[15]

To enhance exploration, KataGo adds Dirichlet noise to the root node's policy priors, mixing 75% of the raw policy with 25% noise sampled from a Dirichlet distribution parameterized by \alpha = 0.03 \times 19^2 / N, where N is the number of legal moves.[15] Virtual loss is applied during parallel search to temporarily penalize nodes being explored by multiple threads, preventing over-visitation and promoting balanced tree growth.[2] These mechanisms ensure robust sampling in the high-branching-factor positions typical of Go; a schematic sketch of the selection rule and root noise appears at the end of this section.

KataGo extends standard MCTS with Monte Carlo graph search (MCGS), representing the search as a directed acyclic graph that merges transpositions (identical board states reached via different move sequences) to avoid redundant computation.[2] This is particularly efficient for ko fights and repeated subpositions, where a traditional tree would duplicate subtrees, growing exponentially in memory and time. In addition, KataGo detects pass-alive groups during search using Benson's algorithm, pruning irrelevant branches and lowering the effective branching factor by excluding moves in unconditionally settled regions.[15]

The network's policy priors guide initial move proposals, while value estimates inform backups after each simulation; temperature scaling via softmax (e.g., a root temperature of 1.03) introduces controlled variability into early-game playouts, diversifying training data and, when configured, simulating human-like decision uncertainty.[15] For move selection in gameplay, KataGo allocates a fixed number of visits, such as 1600 per move in standard benchmarks, and plays the action with the highest visit count, balancing accuracy against computational constraints.[15] It also supports pondering, continuing to search the current position in the background during the opponent's turn to anticipate responses and speed up subsequent decisions.[2]

KataGo accommodates multiple rule sets, including Chinese, Japanese, and Tromp-Taylor, with support for variable komi values and board sizes from 7x7 to 19x19, ensuring compatibility across diverse competitive formats without altering the core search logic.[16]
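As a sketch of the selection rule and root noise described above, the following Python fragment is illustrative only: the data structures are hypothetical, and KataGo's actual search is implemented in C++ with equivalent per-node statistics.

```python
import math
import numpy as np

def puct_select(children, c_puct=1.1):
    """Pick the child maximizing V(c) + c_puct * P(c) * sqrt(sum N) / (1 + N(c)).

    `children` is a hypothetical list of dicts holding prior P, visit count N,
    and mean value V for each candidate move."""
    total_visits = sum(ch["N"] for ch in children)
    return max(children, key=lambda ch: ch["V"]
               + c_puct * ch["P"] * math.sqrt(total_visits) / (1 + ch["N"]))

def mix_root_noise(priors, weight=0.25):
    """Blend 25% Dirichlet noise into the root priors, splitting the total
    concentration 0.03 * 361 across the legal moves."""
    alpha = 0.03 * 361 / len(priors)
    noise = np.random.dirichlet([alpha] * len(priors))
    return (1 - weight) * np.asarray(priors) + weight * noise

children = [{"P": 0.5, "N": 12, "V": 0.55}, {"P": 0.3, "N": 4, "V": 0.50}]
print(puct_select(children))            # child with the best PUCT score
print(mix_root_noise([0.6, 0.3, 0.1]))  # noised root priors, still summing to 1
```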
Training process

Initial self-play training
KataGo's initial training employs pure self-play reinforcement learning, bootstrapping its neural networks from scratch without any human game data or supervision. The process begins with random play, where the neural network guides Monte Carlo tree search (MCTS) to simulate games on the 19×19 board. These self-play games generate training data consisting of board positions paired with MCTS-improved move probabilities as policy targets and game outcomes as value labels. Iteratively, a new network is trained on this data to predict better policies and values, which then informs subsequent self-play, enabling progressive improvement toward superhuman strength.[1]

The bootstrap alternates between data generation and model updates. First, the current network, augmented with MCTS, plays thousands of games against itself, varying the number of simulations per move to focus computation on challenging positions and reduce variance in training targets. Positions from recent games (typically the most recent 500,000) are then used to train a successor network, emphasizing samples with high policy divergence from the prior model to accelerate learning. This cycle repeats, scaling the network architecture progressively (e.g., from 6 blocks and 96 channels to 20 blocks and 256 channels) to handle increasing complexity, culminating in a model capable of matching or exceeding prior state-of-the-art systems like ELF OpenGo.[1]

The initial full-strength run used 28 NVIDIA V100 GPUs over 19 days, generating 4.2 million self-play games and 241 million position samples to produce the first strong model (b20c256). This setup amounted to approximately 1.4 GPU-years of computation, a fraction of the resources required by earlier systems. Training incorporates domain-specific features, such as Go's ladder and liberty representations, which encode board state more efficiently than raw images and significantly reduce sample complexity compared to general-purpose architectures.[1]

Central to training are specialized loss functions targeting the network's prediction heads; a simplified sketch of these losses is given below. The primary policy loss uses cross-entropy to align the network's move probabilities with those improved by MCTS during self-play. The value loss employs mean squared error (MSE) between predicted win rates and actual game outcomes, scaled by a coefficient of 1.5. An auxiliary ownership loss, weighted at 1.5 divided by the board area, predicts territorial control per intersection to improve endgame evaluation, while further auxiliary heads for score beliefs and opponent policy refine predictions without dominating the main objectives. A small L2 regularization term (coefficient 3×10⁻⁵) discourages overfitting. These components, combined with techniques like dynamic komi adjustment and visit randomization, enable rapid convergence.[1]

These efficiency gains allow KataGo to reach high amateur dan levels in days on modest hardware, in contrast to AlphaZero's multi-week timelines on vastly greater resources for similar milestones. Overall, the approach achieves a 50-fold reduction in computational cost to surpass ELF OpenGo's performance, demonstrating the impact of tailored enhancements in self-play learning for Go.
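The following PyTorch fragment sketches how the loss weights cited above could be combined. It is a hypothetical illustration under assumed tensor names and shapes, not KataGo's training code, which also includes score-belief, opponent-policy, and other auxiliary terms.

```python
import torch
import torch.nn.functional as F

def training_loss(policy_logits, mcts_policy, win_pred, outcome,
                  ownership_pred, ownership_target, params, board_area=361):
    # Policy: cross-entropy against the MCTS-improved move distribution.
    policy_loss = -(mcts_policy * F.log_softmax(policy_logits, dim=1)).sum(1).mean()
    # Value: MSE against the final game result, scaled by 1.5 as cited above.
    value_loss = 1.5 * F.mse_loss(win_pred, outcome)
    # Ownership: auxiliary per-intersection target, weighted 1.5 / board_area.
    own_err = F.mse_loss(ownership_pred, ownership_target, reduction="none")
    ownership_loss = (1.5 / board_area) * own_err.flatten(1).sum(1).mean()
    # Small L2 penalty (coefficient 3e-5) to discourage overfitting.
    l2 = 3e-5 * sum(p.square().sum() for p in params)
    return policy_loss + value_loss + ownership_loss + l2

# Dummy shapes: batch of 8, 362 actions (361 points + pass), 19x19 ownership.
B = 8
loss = training_loss(
    torch.randn(B, 362), torch.softmax(torch.randn(B, 362), dim=1),
    torch.rand(B), torch.randint(0, 2, (B,)).float(),
    torch.rand(B, 19, 19), torch.randint(0, 2, (B, 19, 19)).float(),
    params=[torch.randn(10, requires_grad=True)])
print(loss.item())
```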
Throughout training, the value and policy heads, outputting scalar win probabilities and move distributions respectively, directly support this MCTS integration.[1]

Distributed and ongoing training
Following its initial training phase, KataGo's development has been sustained through the Kata1 project, a community-driven distributed training initiative hosted at katagotraining.org. Launched on November 28, 2020, it harnesses volunteer-contributed GPUs worldwide to generate self-play games and update neural networks, resuming from the g170 checkpoint reached in June 2020. After creating an account on the platform, participants run client software (such as the KaTrain graphical user interface or command-line tools integrated with KataGo version 1.16.2, released June 4, 2025) to contribute self-play simulations from idle hardware.[4]

Volunteers produce large quantities of training data through distributed self-play, which is uploaded and aggregated at a central server. As of November 2025, the project has accumulated over 4.3 billion rows of training data from more than 87 million games contributed by 1,321 unique users, with recent activity including 2.1 million rows and 45,000 games uploaded in a single 24-hour period. This data fuels frequent releases of updated neural networks (908 models to date), often yielding stronger iterations without reliance on proprietary computing resources.[4][10]

Among recent advancements, the b28c512 architecture (28 residual blocks and 512 channels) represents the strongest class of open-source models, with variants such as kata1-b28c512nbt-s11803203328-d5553431682 released as recently as November 13, 2025, and the strongest confidently rated network, kata1-b28c512nbt-adam-s11165M-d5387M, from October 2025. Experimental runs have also incorporated limited supervised learning from human games to predict moves across player ranks and historical periods, enabling human-like play analysis via dedicated "extra networks" trained on large human datasets.[10][17]

The infrastructure's scalability is evident in its ability to harness contributions from thousands of volunteer GPUs globally, producing networks that surpass earlier versions in strength through collective, decentralized compute. This model supports ongoing enhancements, such as the April 2025 introduction of experimental action-value heads in version 1.16.0, aimed at improving training efficiency and adaptability for deployment on resource-constrained edge devices.[4]

Features and applications
Analysis and visualization tools
KataGo provides robust interfaces for analysis and visualization, enabling integration with various graphical user interfaces (GUIs) to facilitate game review and strategic insight. Its core tools include a standard Go Text Protocol (GTP) implementation with extensions such as the kata-analyze command, which outputs winrate estimates, expected scores, ownership distributions, and policy probabilities for board positions.[18] Additionally, a JSON-based analysis engine supports efficient batch processing of multiple games or positions, making it suitable for backend services and automated evaluation.[19] These interfaces allow seamless integration with popular GUIs like Sabaki, Lizzie, KaTrain, and others, where KataGo serves as the backend engine for generating winrate graphs, branch analysis (exploring alternative move sequences), and suggestion modes that highlight recommended plays.[2]
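As an illustration of the JSON-based analysis engine, the following Python snippet launches the engine and submits a single query over stdin/stdout. The config and model file paths are placeholders to be replaced with your own files, and the exact set of response fields may vary by version.

```python
import json
import subprocess

# Launch the analysis engine; "analysis.cfg" and "model.bin.gz" are assumed
# local paths, not files shipped under those names.
proc = subprocess.Popen(
    ["katago", "analysis", "-config", "analysis.cfg", "-model", "model.bin.gz"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)

query = {
    "id": "example1",
    "moves": [["B", "Q16"], ["W", "D4"], ["B", "Q4"]],
    "rules": "japanese",
    "komi": 6.5,
    "boardXSize": 19,
    "boardYSize": 19,
    "analyzeTurns": [3],       # analyze the position after move 3
    "maxVisits": 500,
    "includeOwnership": True,  # also return per-intersection ownership
}
proc.stdin.write(json.dumps(query) + "\n")
proc.stdin.flush()

# Each response arrives as one JSON object per line on stdout.
response = json.loads(proc.stdout.readline())
for info in response["moveInfos"][:3]:
    print(info["move"], round(info["winrate"], 3), info["scoreLead"])
```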
Key visualizations derived from KataGo's neural network include ownership heatmaps, which estimate territorial control by displaying the probability that each intersection will belong to a given player at the game's end, aiding assessment of positional advantage.[18] Policy maps visualize the network's move probabilities across the board, helping users see likely strategic focal points, while score-lead graphs track the estimated final score margin over the course of a game, giving a dynamic view of momentum shifts.[18] These outputs are configurable via settings files such as gtp_example.cfg, where users can adjust parameters like the number of visits per search to balance speed against thoroughness.[20]
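For example, the ownership values returned by the analysis engine (when requested via "includeOwnership", as in the earlier snippet) can be rendered directly as a heatmap. This brief matplotlib sketch assumes the `response` dict from the example above; the sign convention for which side positive values favor should be checked against the engine documentation.

```python
import numpy as np
import matplotlib.pyplot as plt

# One ownership value per intersection, row by row, in [-1, 1].
ownership = np.array(response["ownership"]).reshape(19, 19)

plt.imshow(ownership, cmap="coolwarm", vmin=-1, vmax=1)
plt.colorbar(label="predicted ownership")
plt.title("KataGo ownership heatmap")
plt.show()
```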
In usage modes, KataGo supports post-game analysis of human or AI games by loading Smart Game Format (SGF) files and annotating them with winrate swings, ownership estimates, and best-move suggestions.[2] Real-time hints can be enabled during live play, offering immediate feedback on move quality without disrupting the game's flow.[2] It also accommodates opening books (precomputed sequences for rapid evaluation) and puzzle modes for tactical exercises, where users can explore specific scenarios.[2] A further feature is multi-principal-variation (multi-PV) output, which generates several alternative lines of play with their respective winrates and scores, enabling deeper exploration of branching possibilities; a small example follows below.[18]
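Continuing the analysis-engine example above, each entry in the `moveInfos` list of the hypothetical `response` dict carries its own principal variation, so a multi-PV display reduces to iterating over the top entries:

```python
# Print the top five candidate moves with their lines (multi-PV display).
for rank, info in enumerate(response["moveInfos"][:5], start=1):
    print(f'{rank}. {info["move"]}  winrate={info["winrate"]:.1%}  '
          f'score={info["scoreLead"]:+.1f}  pv: {" ".join(info["pv"])}')
```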
For broader accessibility, KataGo is integrated into platforms such as the Online Go Server (OGS), enabling server-side analysis directly within web-based interfaces.[2] Outputs can be exported as annotated SGF files, preserving variations and comments for archiving, sharing, or further study in compatible software.[2] Together, these tools emphasize practical utility for players and analysts, turning KataGo's search computations into interpretable insights without requiring advanced technical expertise.[2]