Fork
In software engineering, a '''fork''' is the creation of a copy of an existing computer program or codebase, which is then developed independently of the original.[1] This practice is common in open source software to allow experimentation or divergence without affecting the upstream project.[2] In computing more broadly, '''fork''' also refers to the creation of a new process by duplicating an existing one, typically via the fork() system call in Unix-like operating systems.[3] The child process is an exact copy of the parent but runs concurrently and independently.
Definition and Etymology
Definition
In software development, a fork refers to the creation of an independent copy of a software project's source code, which is then maintained and evolved separately from the original project, potentially resulting in divergent versions featuring different functionalities, priorities, or architectural directions.[4][5] This process allows developers to experiment with modifications, address unmet needs, or pursue alternative visions without impacting the upstream codebase.[1] Key characteristics of a fork include its full independence from the original repository, encompassing not only the source code but also ancillary elements such as documentation, configuration files, issue trackers, and build scripts.[4] Over time, this separation can lead to incompatibilities in features, maintenance schedules, or even licensing terms, as the forked project builds its own community and release cadence.[5] The term "fork," evoking a split in the road, underscores this divergence in development paths.[1]

A critical distinction exists between forking and branching: while a branch represents a temporary or parallel line of development contained within the same repository—facilitating collaboration through merges—a fork generates a standalone repository with its own history, permissions, and tools, offering greater isolation but necessitating formal contributions like pull requests to reintegrate changes.[4][6]

Forking applies across both open-source and proprietary software contexts, though implications vary significantly: in open-source projects, it is enabled by licenses that grant redistribution and modification rights, fostering community-driven evolution, whereas in proprietary settings, it typically requires explicit licensing permissions from copyright holders, often limiting its use to avoid intellectual property conflicts.[7][8][9]

Etymology
The term "fork" in computing originates from the Unix fork() system call, introduced in the early 1970s as a mechanism to create a new process by duplicating an existing one, resulting in a parent and child process that diverge independently. The concept traces back to Melvin Conway's 1963 paper "A Multiprocessor System Design," which described "fork" and "join" operations for parallel processing, drawing on the visual metaphor of a road splitting into branches to represent divergence in execution flows. The operation was first implemented in software at Project Genie in 1963 and later adopted in Unix, as documented in the 1971 Unix Programmer's Manual; the name refers to the splitting of control flow rather than to any hardware design.[10]
By the 1980s, the metaphor extended to software development, particularly in version control systems. Eric Allman first applied "fork" to code branching in 1980, describing how creating a branch in the Source Code Control System (SCCS) "forks off" a version of the program for independent development.[11] This usage emerged in discussions around Unix variants, including the Berkeley Software Distribution (BSD) released in 1977, where forking enabled divergent implementations of Unix-like systems, such as FreeBSD derived from 386BSD in the early 1990s.[10]
The term gained prominence in open source communities through Richard Stallman's GNU Project, launched in 1983, which promoted free software licensing that facilitated forking as a means of community-driven evolution. By the 1990s, "fork" had become standard terminology in free software documentation, exemplified by the 1997 EGCS fork of the GNU Compiler Collection (GCC), which addressed development stagnation and was reintegrated as the official GCC by 1999, underscoring its role in sustaining project vitality.[10]
History
Origins in Early Computing
In the pre-1970s era, informal code sharing emerged in academic and mainframe computing environments, where users exchanged software through physical media and collaborative networks. The SHARE organization, founded in 1955 by users of IBM's 701 computer, facilitated this by distributing user-contributed programs, libraries, and documentation via tape libraries and meetings, enabling ad hoc modifications for scientific and engineering applications across institutions.[12] Early ARPANET projects from 1969 onward built on this tradition, with researchers at sites like UCLA and Stanford sharing source code for network protocols through initial file transfer mechanisms, though widespread digital distribution was limited by the nascent infrastructure.[13]

The concept of forking gained technical footing in the 1970s through AT&T's Unix development, where internal modifications created variants of the operating system. Ken Thompson's fork() system call, present from the earliest editions of Unix and documented in the 1971 Unix Programmer's Manual, allowed a process to duplicate itself into parent and child instances for efficient multitasking and resource management.[14] This primitive separated process creation from execution, enabling developers at Bell Labs to experiment with system extensions without disrupting the core codebase, thus laying groundwork for divergent implementations.[15]

By the 1980s, the Berkeley Software Distribution (BSD) represented the first major documented fork from Unix, driven by university-led enhancements for academic use.
Starting in 1977 with Bill Joy's distribution of modified Version 6 Unix code including Pascal and the ex editor, Berkeley released 4BSD in 1980, building on prior enhancements such as virtual memory from 3BSD (1979) and incorporating job control, while early networking features were added later in 4.2BSD (1983) tailored for VAX systems at UC Berkeley.[16] These changes, distributed to over 150 licensees, addressed AT&T's limitations in research environments, marking a shift toward community-modified variants while requiring AT&T source licenses until the 1990s.[17]

A pivotal event occurred in 1983 when Richard Stallman announced the GNU Project, advocating for reusable code in a free Unix-like system.[18] This initiative highlighted emerging tensions in code reuse, emphasizing shared standards to mitigate risks seen in prior Unix variants.

Development in Open Source Era
The rise of free and open-source software (FOSS) in the 1990s marked a pivotal evolution in software forking, transforming it from a niche technical practice into a foundational mechanism for collaborative development and community-driven innovation. The Linux kernel, initiated by Linus Torvalds in 1991 as a free alternative to proprietary Unix systems, quickly became a central hub for forking activities. Early Linux distributions such as Slackware, which emerged in 1993 from the Softlanding Linux System (SLS) project started in 1992, and Debian, founded in 1993, exemplified this trend by forking and customizing the shared kernel codebase to create tailored user environments, fostering widespread adoption and experimentation within the FOSS ecosystem.[19][20]

FOSS licensing frameworks profoundly shaped forking practices by balancing openness with obligations for sharing modifications. The GNU General Public License (GPL), first published by the Free Software Foundation in 1989, enabled forking by permitting users to copy, modify, and redistribute code while mandating that derivative works disclose their source code and adopt compatible licenses, thereby ensuring the persistence of freedoms in communal projects. In contrast, permissive licenses like the MIT License, originating in the late 1980s at the Massachusetts Institute of Technology, allowed easier divergence into proprietary software by imposing minimal restrictions on redistribution, which facilitated broader commercial integration but sometimes reduced the incentive for upstream contributions.[21] These licenses collectively empowered forking as a tool for ideological and practical advancement in FOSS, with the GPL's copyleft mechanism particularly influential in maintaining community control over core projects like Linux.[22]

Key milestones underscored forking's role in responding to corporate shifts and sustaining open development.
In 1998, Netscape Communications open-sourced its Communicator browser suite on March 31, leading to the creation of the Mozilla project as a community-managed fork that addressed the original codebase's stagnation amid competitive pressures from Microsoft.[23] Similarly, in 2010, concerns over Oracle's acquisition of Sun Microsystems prompted a group of OpenOffice.org developers to fork the project into LibreOffice, prioritizing independent governance and accelerated feature development to preserve its viability as a free alternative to proprietary office suites.[24] These events highlighted forking as a strategic response to external threats, ensuring the longevity of critical FOSS tools.

Community dynamics in the FOSS era further elevated forking into a democratic instrument for governance, enabling decentralized decision-making and accountability. Events like the Ohio LinuxFest, launched in 2003, hosted discussions in the 2000s that explored forking's implications for project sustainability and collaboration, contributing to the growth of resources for tracking forks and resolving disputes.[25] Over time, forking emerged as a core governance tool in FOSS, allowing communities to diverge from unresponsive leadership or incompatible directions while upholding principles of openness, as evidenced by its role in guaranteeing project continuity through collective choice.[26] This evolution democratized software stewardship, making forking not just a technical option but a vital check on centralized authority within open-source ecosystems.

Types of Forks
Codebase Forks
A codebase fork entails the complete duplication of an existing project's source code repository, encompassing metadata such as commit history and branches, which enables independent development and often leads to parallel tracks with diverging features.[1][27] This process creates a new repository that retains a link to the original for potential synchronization via pull requests, but allows the fork to evolve separately, incorporating unique modifications without impacting the upstream project.[28] Binaries, if included in the original repository, are also duplicated, though the primary focus remains on source code for ongoing customization.[9]

Common triggers for initiating a codebase fork include disagreements over project direction, such as divergent technical visions or governance issues, which account for a substantial portion of cases (around 42% technical and 38% governance-related).[29] Licensing changes, occurring in about 15% of forks, often prompt splits when communities seek to preserve or alter terms for greater freedom.[29] Project abandonment drives roughly 19% of forks, where developers revive stalled efforts, while community splits from cultural differences or trademark conflicts (8-12%) further catalyze independent branches.[29]

Once established, codebase forks undergo independent maintenance, including separate bug fixes applied via techniques like git cherry-pick (observed in 0.9-9% of fork pairs across ecosystems), distinct release cycles (with 23-67% of forks issuing multiple versions), and autonomous versioning schemes often aligned through merges or rebases (11-33% usage rate).[30] To facilitate interoperation, some forks incorporate compatibility layers, such as porting changes between versions or maintaining backward-compatible APIs, enabling selective integration without full convergence.[1][30]

Process Forks
In operating systems, particularly Unix-like systems, a process fork refers to the creation of a new process by duplicating an existing one, enabling concurrent execution of tasks. The primary mechanism for this is the fork() system call, which generates a child process that is an exact copy of the parent process at the moment of invocation, except for specific attributes such as the process ID (PID).[31] This duplication allows the parent and child to run independently, with the child often proceeding to execute a different program via an exec() family call to replace its image, while initially sharing the same memory space through copy-on-write semantics in modern implementations to optimize resource use.[31][3]
The child process inherits key elements from the parent, including open file descriptors, environment variables, and the current working directory, ensuring continuity in resource access. However, differences arise immediately: fork() returns 0 in the child, while the parent receives the child's PID (a positive integer), allowing each to distinguish its role. The child's PID is unique and does not match any existing process group ID, and its parent PID is set to the calling process's PID. Additionally, the child starts with no pending signals, a reset alarm clock, and cleared resource usage timers (tms_utime, tms_stime, etc.). In multi-threaded parents, only the calling thread is duplicated in the child, which must therefore restrict itself to async-signal-safe operations until an exec() to avoid concurrency issues. The parent can later retrieve the child's exit status using wait() or related calls.[31]
Process forks are fundamental to multiprocessing in Unix-like environments, facilitating parallel execution without complex setup. In shell scripting, for instance, the ampersand (&) operator triggers a fork to run commands as background processes, allowing the shell to regain control while the child handles the task asynchronously, such as monitoring jobs with process group IDs distinct from the foreground. Server daemons commonly employ forking to spawn worker processes; a master process forks children to handle incoming requests, distributing load while the parent oversees supervision, often using a double-fork technique to detach from the controlling terminal and session for true background operation.[32][33]
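The ampersand behavior described above can be sketched directly in a POSIX shell; `$$`, `$!`, and `wait` are standard features, and the PIDs printed vary per run:

```shell
# The shell forks a child process for each command ended with "&".
sleep 1 &                 # fork: the child runs sleep concurrently
bgpid=$!                  # $! expands to the PID of the last background child
echo "shell PID: $$, background child PID: $bgpid"
wait "$bgpid"             # analogous to wait() in C: block until the child exits
echo "child exited with status $?"
```

The shell regains its prompt immediately after the `&`, while the forked child runs on its own; `wait` is how the parent shell later reaps the child's exit status.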
Native support for fork() is absent in traditional Windows environments, where process creation relies instead on the Win32 API's CreateProcess() function, which launches a new process with specified attributes but lacks the exact duplication semantics of fork(), requiring explicit inheritance setup for handles and environment. This limitation persisted until the introduction of the Windows Subsystem for Linux (WSL), which emulates Unix process creation, including fork(), by leveraging the NT kernel's underlying capabilities for compatibility with Linux applications.[34][35]
Forking Process
Technical Implementation
Creating a software codebase fork begins with replicating the original repository to enable independent development while maintaining a connection to the upstream source. On platforms like GitHub, the forking process is initiated by navigating to the target repository and selecting the "Fork" button in the top-right corner, which creates a new repository under the user's account or organization. This action copies the entire codebase, branches, commits, and visibility settings from the upstream repository, establishing an implicit link that facilitates future synchronization. The user can optionally rename the repository during this step to reflect the forked project's identity and add a description for clarity. Unlike a simple clone, this server-side operation ensures the fork is hosted remotely and ready for collaboration without affecting the original project.[36]

To work locally, the next step is cloning the forked repository using Git: execute git clone <fork-url> to download the full history to the local machine. This creates a working directory where modifications can occur. If the fork intends to diverge significantly, rename project files, directories, or configuration elements (e.g., updating package names in build scripts or manifests) to avoid namespace collisions, ensuring consistency across the codebase. Subsequently, review and update dependencies by examining files like package.json (for Node.js), pom.xml (for Maven), or requirements.txt (for Python), adjusting versions to resolve incompatibilities or incorporate project-specific needs while testing for functionality. Finally, initialize new versioning by incrementing semantic version numbers in relevant files (e.g., VERSION or Cargo.toml) and creating an initial tag with git tag v1.0.0 followed by git push origin v1.0.0 to establish a baseline for the fork's release history.[37][38]
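Condensed into commands, the clone-and-baseline steps might look like the following sketch; it substitutes a throwaway local repository for a real fork URL, and all names are illustrative:

```shell
#!/bin/sh
# Sketch: clone a fork and establish a versioning baseline.
# "original" stands in for <fork-url>; names are hypothetical.
set -e
tmp=$(mktemp -d) && cd "$tmp"

# Stand-in for the server-side fork created via the platform's Fork button.
git init -q -b main original
git -C original -c user.name=Dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "history copied from upstream"

git clone -q "$tmp/original" myfork   # i.e. git clone <fork-url>
cd myfork
git tag v1.0.0                        # baseline for the fork's release history
git tag                               # lists: v1.0.0
```

A real fork would follow this with git push origin v1.0.0 to publish the tag to the hosted fork.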
Integration with version control systems like Git enhances the fork's utility by preserving traceability to the upstream repository. After cloning, add the original as an upstream remote using git remote add upstream <upstream-url>, allowing fetches of updates with git fetch upstream. This setup enables pull requests from the fork back to the upstream, where changes are proposed via the platform's interface, promoting collaborative reconvergence. The fork operates as a full Git repository with its own branches, issues, and wikis, but the upstream link supports commands like git merge upstream/main to incorporate upstream advancements without manual duplication.[4][37]
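Using disposable local repositories in place of real URLs, the upstream-remote workflow sketches out as follows (directory names are illustrative):

```shell
#!/bin/sh
# Sketch: keep a fork aligned with its upstream via an "upstream" remote.
set -e
tmp=$(mktemp -d) && cd "$tmp"

git init -q -b main upstream
git -C upstream -c user.name=Up -c user.email=up@example.com \
    commit -q --allow-empty -m "initial upstream commit"

git clone -q "$tmp/upstream" fork        # the fork: a full, independent repository
cd fork
git remote add upstream "$tmp/upstream"  # i.e. git remote add upstream <upstream-url>

# Upstream moves ahead after the fork was made...
git -C ../upstream -c user.name=Up -c user.email=up@example.com \
    commit -q --allow-empty -m "upstream advances"

# ...and the fork pulls those changes in without manual duplication.
git fetch -q upstream
git merge -q upstream/main               # fast-forwards the fork's branch here
git log --oneline                        # now shows both upstream commits
```

The merge fast-forwards in this sketch because the fork made no local commits; with local divergence the same command produces an ordinary merge commit.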
Best practices emphasize maintaining the fork's integrity and usability for potential merging. Preserve attributions by retaining original copyright notices, author credits, and commit histories in the codebase, as Git naturally carries this metadata forward during the fork and clone operations. Include the original project's license file verbatim to comply with open-source terms, and if diverging, document changes in a CHANGELOG or README section. For continuity, migrate the wiki by cloning the original wiki's Git repository (accessible via <repo-url>.wiki.git) and pushing its contents to the fork's wiki endpoint, preserving documentation. Issues cannot be automatically transferred but can be manually recreated or imported using platform tools or scripts, prioritizing high-impact ones. To facilitate reconvergence, avoid immediate large-scale divergence by regularly syncing the fork—via the platform's "Sync fork" button or git pull upstream main—keeping the codebase aligned and reducing integration complexity later.[39][40]
Common pitfalls in technical implementation include dependency conflicts, particularly in polyglot projects spanning multiple languages or ecosystems, where upstream updates to one subsystem (e.g., a JavaScript library) may break compatibility in another (e.g., Python bindings), requiring manual resolution during merges. Syncing the fork can also introduce merge conflicts if local changes overlap with upstream modifications, necessitating line-by-line edits in tools like Git's mergetool or the platform's conflict resolver before committing. For proprietary code, attempting to fork without explicit rights can lead to copyright infringement, as the process replicates protected assets verbatim; always verify open-source status via the repository's license before proceeding. These issues underscore the need for incremental changes and thorough testing post-fork to maintain stability.[41][42]
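The merge-conflict pitfall can be reproduced in miniature; the file name and values below are invented for illustration:

```shell
#!/bin/sh
# Sketch: a fork and its upstream edit the same line, forcing manual resolution.
set -e
tmp=$(mktemp -d) && cd "$tmp"

git init -q -b main upstream
cd upstream
git config user.name Up && git config user.email up@example.com
echo "timeout = 10" > config.ini
git add config.ini && git commit -qm "base configuration"

cd .. && git clone -q "$tmp/upstream" fork
cd fork
git config user.name Fork && git config user.email fork@example.com
echo "timeout = 30" > config.ini
git commit -qam "fork: raise timeout"

# Meanwhile, upstream changes the same line differently.
cd ../upstream
echo "timeout = 5" > config.ini
git commit -qam "upstream: lower timeout"

# Syncing the fork now conflicts and needs a line-by-line decision.
cd ../fork
git fetch -q origin
if ! git merge -q origin/main >/dev/null 2>&1; then
    echo "conflict detected in config.ini"
    echo "timeout = 30" > config.ini     # resolution: keep the fork's value
    git add config.ini
    git commit -qm "merge upstream, keeping fork timeout"
fi
cat config.ini
```

Here the resolution simply keeps the fork's line; in practice the conflict markers Git leaves in the file are edited by hand or in a merge tool before staging and committing.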
Legal and Licensing Aspects
Forking software raises significant legal and licensing considerations, particularly regarding intellectual property rights and the terms under which code can be copied, modified, and redistributed. In open source contexts, licenses dictate the permissibility and obligations of forks. Copyleft licenses, such as the GNU General Public License (GPL) version 3, impose strong requirements on derivative works: any modified version or fork distributed must be licensed under the GPL, ensuring that the source code remains available and freedoms to modify and redistribute are preserved for recipients.[43] This "share alike" principle prevents proprietary enclosures of GPL-licensed code, as the entire combined work must adhere to GPL terms if modifications are incorporated.[43]

In contrast, permissive open source licenses like the Apache License 2.0 allow greater flexibility for forking. Under Apache 2.0, users may modify and distribute derivative works in source or object form, including within proprietary software, without requiring the fork to remain open source, provided they retain original copyright notices, include a copy of the license, and add prominent notices of changes.[44] Attribution is mandatory, but the license explicitly permits additional terms or even relicensing the modifications under different conditions, facilitating commercial forks as long as core conditions are met.[44]

Proprietary software, governed by end-user license agreements (EULAs), typically prohibits forking outright to protect intellectual property. These agreements restrict users to personal use and explicitly ban reverse engineering, modification, or redistribution, with violations potentially leading to termination of access and legal action.[45] Exceptions arise in cases of expired copyrights, acquisitions, or clean-room reimplementations of interfaces.
For instance, ReactOS legally reimplements Windows APIs through independent development using public documentation and reverse engineering, avoiding direct copying of Microsoft's proprietary code since copyright protects expression, not underlying ideas or functional interfaces.[46]

Governance in open source projects further shapes forking dynamics, often through bylaws or contributor agreements that outline decision-making but cannot override license-granted rights to fork. Community-driven forks may invoke project governance documents, such as contributor license agreements (CLAs), to manage contributions, but disputes over violations—like unauthorized proprietary use—are typically resolved via arbitration, negotiation, or courts. A notable example involves BusyBox, where the Software Freedom Conservancy enforced GPL compliance against companies embedding modified versions without source code, resulting in settlements, injunctions to cease distribution, and requirements to release sources, underscoring judicial support for GPL terms in violation cases.[47]

Recent developments, including the European Union's Digital Markets Act (DMA) effective from 2024, influence forking rights in big tech ecosystems by mandating interoperability for designated gatekeepers like Apple and Google. Article 6(7) requires free and effective access to hardware and software features for third-party developers, promoting contestability and potentially easing forks or alternative implementations within closed platforms without full proprietary restrictions.[48] This regulatory push aims to curb anti-competitive practices, indirectly bolstering forking as a tool for innovation in dominant digital markets.[49]

Notable Examples
Open Source Forks
In the realm of free and open-source software (FOSS), forking has enabled the creation of influential Linux distributions by adapting existing projects to new needs. Ubuntu, launched in October 2004 as a fork of Debian, prioritizes user-friendliness, regular release cycles, and enterprise-oriented features such as long-term support (LTS) versions backed by Canonical's commercial services, which facilitate deployment in business environments.[50] Similarly, Google initiated Android in 2008 by forking the Linux kernel to build a mobile operating system optimized for touch interfaces and embedded devices, incorporating custom modifications like the Android Runtime while remaining GPL-compliant.[51][52]

A notable early example of forking in text editors arose from disagreements over development direction and user interface enhancements. In 1991, Lucid Emacs—later renamed XEmacs—was forked from GNU Emacs version 19 to accelerate integration of graphical user interface (GUI) features, such as native X Window System support, amid frustrations with the slower pace of GNU Emacs updates at the time.[53][54] XEmacs continues to be maintained separately, with its latest stable release in 2009 and recent beta releases as of June 2025, preserving compatibility with Emacs Lisp while emphasizing multimedia and toolkit extensions.

Forking has also revitalized office productivity software amid corporate stewardship concerns.
LibreOffice emerged in 2010 as a community-driven fork of OpenOffice.org, prompted by unease over Oracle Corporation's acquisition and perceived reduced commitment to the project's open development model.[24][55] Now the leading FOSS office suite, LibreOffice supports over 200 million users worldwide and has become the default in numerous government and educational institutions due to its robust compatibility with Microsoft Office formats and active feature development.[56]

Forks of the Firefox browser from the Mozilla project, such as LibreWolf and Waterfox, exemplify how branching sustains innovation and competition in FOSS ecosystems by prioritizing privacy enhancements or legacy extension support without relying on upstream changes.[57] These variants contribute to broader project vitality.

Proprietary Forks
Proprietary forks of closed-source software are typically internal developments or licensed modifications constrained by commercial agreements, non-disclosure pacts, and intellectual property laws that prohibit unauthorized redistribution or public divergence. These forks enable companies to customize software for specific hardware, performance needs, or market strategies while preserving the original vendor's control over the core codebase. Unlike open-source forks, proprietary ones rarely result in competing public products, instead serving as evolutionary branches within corporate ecosystems.

In the realm of Unix operating systems, notable proprietary forks emerged from AT&T's System V Release 4 (SVR4) in the late 1980s. Sun Microsystems developed Solaris as a proprietary adaptation of SVR4, integrating BSD-derived features from its earlier SunOS while licensing the core from AT&T to create a robust platform for Sun's SPARC hardware; Solaris 2.0, released in 1992, marked this transition and became a cornerstone for enterprise computing.[58] Similarly, IBM created AIX (Advanced Interactive eXecutive) as a proprietary Unix variant, initially drawing from SVR3 but incorporating SVR4 elements by AIX 4.0 in 1994 to support its RS/6000 and Power systems with enhanced reliability features like journaling file systems.[59]

The browser industry illustrates proprietary forks through Microsoft's evolution of its web technologies. Internet Explorer, a closed-source browser from the 1990s, involved internal forks for version-specific optimizations and platform integrations, such as tailored builds for Windows versions.
A significant pivot occurred in 2019 when Microsoft released a new Edge browser as a proprietary fork of the open-source Chromium project, adding closed-source components like enterprise policy controls and Azure Active Directory integration to differentiate it from Google Chrome while benefiting from Chromium's rendering engine.[60]

Game engines provide another domain for proprietary forks under licensing terms. Epic Games' Unreal Engine, a proprietary technology, allows licensees to create internal forks for custom implementations in commercial products; for example, developers of AAA titles such as Gears 5 and Batman: Arkham Knight modified the engine with proprietary plugins, shaders, and tools optimized for their narratives and hardware targets, all governed by Epic's end-user license agreement that mandates NDAs and restricts engine redistribution.[61]

Proprietary forks face substantial challenges, particularly around reverse engineering and emerging AI applications. In 2022, a class-action lawsuit was filed against Microsoft, GitHub, and OpenAI, alleging copyright infringement by training their Copilot tool on open-source code from public repositories without permission; the case remains ongoing as of 2025, highlighting risks for AI-assisted development in both open and closed ecosystems.[62][63]

Implications
Benefits for Development
Forking in open source software development enables developers to experiment with modifications and introduce niche features without compromising the stability of the original project, thereby accelerating innovation. By creating an independent copy of the codebase, contributors can test bold ideas, such as specialized adaptations for particular use cases or emerging technologies, in a low-risk environment. This practice fosters a culture of creativity, as evidenced by empirical studies showing that forking serves as a key mechanism for exploring alternative directions while preserving the upstream project's integrity.[10][64]

One significant benefit is risk mitigation through community-driven revivals of abandoned or stalled projects, providing a resilient backup mechanism for software continuity. When maintainers cease activity, forks allow motivated communities to sustain development, preventing total loss of valuable codebases. A 2024 analysis of free and open source software (FOSS) sustainability reports that 41% of projects survive critical developer detachments—such as the departure of key contributors—by attracting new talent or reactivating old ones, often via forked variants. Additionally, research on GitHub hard forks indicates that 47.6% outlive their upstream projects, particularly when the original becomes inactive, highlighting forking's role in ensuring long-term viability.[65][66]

Forking enhances competition and user choice by diversifying available software options, which in turn drives quality improvements across ecosystems. Multiple variants emerging from forks cater to varied needs, encouraging projects to innovate to retain users and avoid obsolescence. This dynamic contributes substantially to the FOSS economy, with a 2024 Harvard Business School study estimating the demand-side value of open source software at $8.8 trillion annually, representing the replacement cost firms would incur without freely accessible, fork-enabled codebases.
By increasing options and spurring rivalry, forking amplifies overall ecosystem productivity and adoption.

Finally, forking bolsters collaboration by facilitating the integration of external contributions back into the main project through pull requests, enriching the upstream codebase with diverse improvements. In fork-based workflows prevalent on platforms like GitHub, developers propose changes from their copies, allowing maintainers to selectively merge valuable enhancements. Studies of social coding practices reveal that a majority of projects accept pull requests from forks, with many active forks submitting contributions that address inefficiencies or add features, thereby strengthening community health and project evolution.[64]

Challenges and Risks
Software forking can lead to fragmentation of the developer community and user base, as resources become divided among multiple incompatible versions of the project. This dilution often results in duplicated efforts and reduced overall momentum, with developers splitting their contributions across forks rather than consolidating on a single codebase. For instance, the fork of XFree86 into X.Org in 2004 stemmed from governance disputes and led to a fragmented graphics driver ecosystem, where users and contributors had to choose between competing implementations, ultimately stalling innovation in one branch.[67] Such scenarios, sometimes escalating into "fork wars" characterized by heated public debates and personal clashes, exacerbate the division, as seen in the contentious split of the Emacs editor into GNU Emacs and XEmacs in the 1990s over technical and philosophical differences.[68]

Maintaining a fork imposes a significant burden on resources, requiring ongoing synchronization with upstream changes and independent bug fixes, which many projects cannot sustain long-term. A study of over 15,000 hard forks on GitHub found that 43.6% of forks were discontinued, often due to the challenges of keeping pace with evolving upstream developments without dedicated teams.[69] Similarly, an analysis of 220 notable open-source forks revealed that 13.8% failed outright, with an additional 8.7% seeing both the fork and original project discontinued, highlighting the high attrition rate driven by limited contributor availability.[70] These maintenance demands frequently lead to forks becoming dormant within a few years, diverting effort from core advancements.

Divergent codebases in forks heighten security risks, as they may miss critical upstream patches for known vulnerabilities, leaving users exposed to exploits that have already been addressed in the original project.
For example, forks of the Chromium browser, such as those in certain development tools, have been found running outdated versions vulnerable to over 80 common vulnerabilities and exposures (CVEs), including actively exploited flaws, because manual merging of security updates is labor-intensive and often overlooked.[71] Closed-source forks amplify this issue, as security fixes from open-source origins do not propagate automatically, creating blind spots in vulnerability management.[72]

Forking can also trigger social dynamics that fracture communities, fostering toxicity through prolonged disputes and ideological rifts. The 2016 DAO hack on Ethereum, where $50 million in ether was stolen due to a smart contract vulnerability, exemplified this when the community hard-forked the blockchain to recover funds, resulting in a permanent split into Ethereum and Ethereum Classic; opponents of the fork viewed it as a betrayal of immutability principles, leading to bitter debates and ongoing animosity that divided developers and users.[73] Legal risks, such as disputes over intellectual property in forked code, further complicate these social tensions, though they are addressed in detail under licensing aspects.[69]
Tools and Platforms
Version Control Systems
Git, the predominant distributed version control system (VCS), enables codebase forking through its core distributed model, where every clone serves as a complete, independent repository. The git clone command creates a local copy of a remote repository, automatically setting the source as the "origin" remote, allowing developers to work offline and diverge the codebase without immediate server dependency. This approach draws conceptual inspiration from the Unix fork() system call but applies it at the repository level, facilitating parallel development streams. To maintain synchronization with the original project, users add the upstream repository as a remote via git remote add upstream <url>, enabling fetches (git fetch upstream) and merges or pulls to incorporate upstream changes into the fork.[74]
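The clone-and-track workflow described above can be sketched end to end with local directories standing in for remote URLs (a minimal sketch; the repository names, commit messages, and identity settings are illustrative, not part of any real project):

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

# A stand-in for the original ("upstream") project, with one commit
git init -q -b main upstream
git -C upstream commit -q --allow-empty -m "initial upstream commit"

# Forking: the clone is a complete, independent repository;
# the source is recorded automatically as the "origin" remote
git clone -q upstream fork
cd fork
git remote add upstream ../upstream   # track the original project directly

# Upstream moves on...
git -C ../upstream commit -q --allow-empty -m "upstream bug fix"

# ...and the fork synchronizes by fetching and merging its changes
git fetch -q upstream
git merge -q upstream/main
git log --oneline
```

After the merge, the fork's history contains both its own line of development and the new upstream commit, while remaining a fully separate repository.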
Mercurial provides a comparable distributed forking mechanism as an alternative to Git, using the hg clone command to replicate an entire repository—including its full history—into a new directory on the local filesystem or a remote location. This creates a self-contained copy that supports decentralized workflows, with the source URL recorded in the clone's .hg/hgrc file for subsequent pulls. Mercurial's cloning efficiency, including options like hardlinking for local copies and streaming for large repositories, made it suitable for substantial projects; for instance, Mozilla relied on it for Firefox development from 2007 until announcing the switch to Git in 2023 and completing the migration in 2025, underscoring its prominence in collaborative open-source development before Git's widespread adoption.[75][76][77]
Centralized VCS such as Subversion (SVN) impose limitations on forking due to their reliance on a single server for all operations, making independent copies more administratively intensive than in distributed systems. While SVN supports branching within the same repository via svn copy, creating a true fork as a separate repository requires server-side tools: administrators use svnadmin dump to export the repository's contents into a portable dump file, which can then be loaded into a new repository instance with svnadmin load. This process, while effective for migration or duplication, lacks the seamless decentralization of Git or Mercurial clones and often necessitates privileged access, hindering casual forking in collaborative settings.[78]
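The dump-and-load procedure can be sketched with two local repositories, assuming the svnadmin, svn, and svnlook tools are installed (repository names and the commit message are illustrative):

```shell
set -e
cd "$(mktemp -d)"

# A stand-in for the server-side original repository, with one revision
svnadmin create original
svn mkdir -q -m "create trunk" "file://$PWD/original/trunk"

# Export the complete revision history into a portable dump stream...
svnadmin dump -q original > project.dump

# ...and load it into a fresh repository: an independent fork
svnadmin create forked
svnadmin load -q forked < project.dump

svnlook youngest forked   # the fork carries the full revision history
```

Note that both svnadmin commands operate on the repository's on-disk storage, which is why this procedure normally requires administrative access to the server rather than an ordinary checkout.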
Distributed VCS offer advanced features to manage fork evolution and integration. In Git, synchronization techniques include git rebase, which replays the fork's commits atop the latest upstream changes to produce a cleaner, linear history without extraneous merge commits, and git cherry-pick, which selectively applies individual upstream commits to the fork for targeted updates like bug fixes. These commands help resolve divergences efficiently, preserving commit integrity while adapting to upstream progress. Additionally, Git and similar systems integrate natively with CI/CD pipelines—through hooks or webhooks—to automate divergence testing, where changes in a fork trigger builds and tests against upstream baselines, identifying incompatibilities early in the development cycle.[79][80][81]
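Both synchronization techniques can be sketched in one script with local stand-in repositories (a minimal sketch; names, file contents, and commit messages are illustrative):

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

# Original project with one commit, then a fork with its own local work
git init -q -b main upstream
git -C upstream commit -q --allow-empty -m "base"
git clone -q upstream fork
cd fork
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "fork: add feature"

# Upstream gains a fix; replaying the fork's work on top of it
# with rebase yields a linear history with no merge commit
git -C ../upstream commit -q --allow-empty -m "upstream: fix"
git fetch -q origin
git rebase -q origin/main

# Upstream later ships one urgent patch; cherry-pick copies
# just that single commit into the fork
(cd ../upstream && echo "patch" > patch.txt && git add patch.txt \
  && git commit -q -m "upstream: security patch")
git fetch -q origin
git cherry-pick origin/main
git log --format="%s"
```

Rebase rewrites the fork's commits onto the new base, so it suits unpublished local work, whereas cherry-pick leaves existing history untouched and is the safer choice for pulling isolated fixes into an already-shared fork.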