Wget
GNU Wget is a free and open-source command-line utility for non-interactively retrieving files from the World Wide Web using the HTTP, HTTPS, FTP, and FTPS protocols.[1] Originally developed in 1996 by Hrvoje Nikšić as part of the GNU Project, Wget enables users to download single files, mirror entire websites or FTP directories, and perform recursive retrievals while respecting server restrictions like robots.txt.[1][2] It supports features such as resuming interrupted downloads via REST and RANGE methods, handling HTTP proxies and cookies, converting links for offline viewing, and operating in the background for scripted or automated tasks.[2] Licensed under the GNU General Public License version 3 or later, Wget is maintained by a team including Tim Rühsen, Darshit Shah, and Giuseppe Scrivano, with the latest stable release being version 1.25.0, released in November 2024.[1][3] A successor project, GNU Wget2, is under active development to enhance performance and add modern features like HTTP/2 support.[1]

Background
Origins and History
Wget originated as a personal project by Hrvoje Nikšić in 1996, initially known as Geturl, aimed at creating a command-line tool for non-interactively retrieving files from the World Wide Web.[4] Nikšić, then a developer associated with XEmacs, sought to address the need for a simple, scriptable downloader that could operate without a graphical interface or browser.[2] The project focused initially on HTTP protocol support for basic file transfers.[2] The first public release occurred in 1996, when Nikšić integrated it into the GNU Project as GNU Wget, with the name derived from combining "Web" (referring to the World Wide Web) and "get" (the HTTP request method for retrieving resources).[1] Version 1.4.0, released in November 1996, was the first under the name Wget and distributed under the GNU GPL.[1] This marked Wget's transition from a solo endeavor to an open-source utility under the GNU umbrella, emphasizing free software principles. Early versions emphasized reliability in unstable network conditions and included support for FTP and recursive downloads, aligning with the growing accessibility of the internet.[2] Widespread adoption began in Unix-like systems by the late 1990s, as evidenced by its inclusion in major distributions like Debian starting with version 1.4.0 in February 1997.[5] A major rewrite in version 1.5, released in September 1998, improved stability and code architecture to handle evolving protocols and error recovery more robustly.[3] Entering the 2000s, maintenance shifted to a collaborative model amid increasing web complexity, with contributions from developers like Dan Harkless, who released version 1.6 in December 2000.[6][3] This period saw Wget establish itself as a standard tool in command-line environments, licensed under the GPL to encourage community involvement.[1]

Authors and Licensing
GNU Wget was originally written by Hrvoje Nikšić, who initiated its development and served as the primary author and maintainer in its early years.[6] The copyright for Wget has been held by the Free Software Foundation (FSF) since 1996, as part of its integration into the GNU Project, with all releases featuring FSF copyright notices.[2] Wget is distributed under the GNU General Public License (GPL) version 3 or later, which permits free modification and redistribution provided the source code is made available and derivative works remain under the same license terms.[1] Maintainership evolved over time: after Nikšić, Mauro Tortonesi maintained the project from 2004 to 2007, followed by Micah Cowan from 2007 to 2010 and Giuseppe Scrivano from 2010 to 2014.[6][7][8] Since around 2014, Tim Rühsen has served as the primary maintainer, with Darshit Shah joining as co-maintainer at about the same time and Giuseppe Scrivano continuing to contribute.[6][1][9] The GPL licensing has ensured no proprietary forks exist, as any modifications must adhere to the open-source requirements, preserving Wget's status as free software.[10]

Core Functionality
Download Protocols and Capabilities
Wget supports the HTTP, HTTPS (with SSL/TLS encryption), FTP, and FTPS protocols for retrieving files from remote servers, enabling secure and non-interactive downloads over these widely used internet standards.[2][11] It lacks native support for SFTP or BitTorrent, focusing instead on these core web and file transfer protocols to ensure broad compatibility without additional dependencies.[2] At its foundation, Wget enables single-file downloads by specifying a URL, recursive retrieval of directories to mirror site structures, wildcard pattern matching for selective file downloads (particularly in FTP contexts), and timestamping to compare remote file modification times against local copies, thereby avoiding unnecessary re-downloads of unchanged content.[2] These capabilities allow users to efficiently fetch resources while respecting server metadata for optimized transfers. Wget manages redirects by automatically following HTTP 3xx status codes up to a default limit of 20, configurable for extended chains, and integrates with proxy servers through environment variables such as http_proxy or https_proxy for routed access in networked environments.[2] For authentication, it handles HTTP basic authentication via username and password options or embedded URL credentials, alongside FTP login support using similar mechanisms or .netrc files.[2]
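For instance, proxying and basic authentication combine on one command line; the proxy host, port, and credentials below are hypothetical:

    # Route the download through an HTTP proxy and supply basic-auth credentials:
    https_proxy=http://proxy.example.com:3128 \
      wget --user=alice --password=secret https://example.com/protected/report.pdf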
A notable feature is Metalink support, which facilitates multi-source downloads by processing .metalink files (versions 3 and 4) to select optimal mirrors based on checksums and availability, introduced in Wget 1.17 in November 2015.[12][2] Additionally, interrupted downloads can be resumed precisely using HTTP range requests to continue from the exact byte offset or the FTP REST command, enhancing reliability over unstable connections.[2]
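Both mechanisms are sketched below; the file names are hypothetical, and Metalink handling assumes a build compiled with that support:

    # Resume a partial download from its current byte offset (server must support ranges):
    wget -c https://example.com/large-image.iso
    # Fetch from the mirrors listed in a Metalink file:
    wget --input-metalink=release.meta4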
Robustness and Reliability Features
Wget has been engineered for robustness, particularly in environments with slow or unstable network connections, where it automatically retries failed downloads due to transient issues such as timeouts or server errors in the 5xx range.[2] By default, Wget attempts up to 20 retries before giving up, excluding fatal errors like "connection refused" or HTTP 404 responses, and this count is configurable via the --tries option.[2] This mechanism ensures reliable completion of transfers without manual intervention, making it suitable for long-running or unattended operations over unreliable links.[2]
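For example, the retry budget can be tightened for scripted runs (the URL is a placeholder):

    # Give up after 3 attempts instead of the default 20:
    wget --tries=3 https://example.com/data/feed.xml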
To prevent overwhelming remote servers or local networks, Wget includes bandwidth limiting capabilities through the --limit-rate option, which caps download speeds to a specified rate, such as 20 kilobytes per second.[2] Additionally, it supports persistent HTTP connections by default, reusing a single TCP connection for multiple requests to the same server, which reduces connection setup overhead and improves efficiency in high-volume downloads.[13] Users can disable this with --no-http-keep-alive if server incompatibilities arise.[13]
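Both behaviors can be set on one command line; the rate and URL are illustrative:

    # Cap throughput at 20 KB/s and open a fresh connection per request:
    wget --limit-rate=20k --no-http-keep-alive https://example.com/archive.tar.gz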
For enhanced offline usability, Wget features automatic link conversion in downloaded HTML and CSS files, transforming absolute URLs to relative ones via the --convert-links option, allowing seamless local browsing without network access.[2] Regarding web etiquette, Wget honors the robots.txt standard by default to respect site crawling restrictions, but this can be overridden with the -e robots=off setting for cases where full access is permitted or required.[2]
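A combined sketch for preparing a site for offline reading (the URL is a placeholder; robots.txt should only be overridden where the site operator permits it):

    # Fetch recursively and rewrite links for local browsing:
    wget -r --convert-links -e robots=off https://example.com/docs/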
On the security front, Wget prioritizes secure protocols like HTTPS to protect against interception of sensitive data, such as credentials in URLs, which remain visible in command lines or logs if not encrypted.[2] A notable vulnerability, CVE-2024-10524, affected versions up to 1.24.5, enabling server-side request forgery (SSRF) through improper handling of shorthand FTP URLs with embedded credentials, potentially allowing requests to unintended hosts.[14] This was addressed in version 1.25.0 (released November 2024) by removing support for such shorthand formats, recommending explicit full URLs for safer operations.[15]
Advanced Features
Recursive Download and Mirroring
Wget supports recursive downloading, which allows users to retrieve entire directory structures or websites by following hyperlinks and FTP listings hierarchically. This mode is activated using the -r or --recursive option, enabling Wget to traverse links and download linked resources up to a specified depth. By default, the recursion depth is limited to 5 levels to prevent excessive downloading and potential infinite loops, but this can be adjusted with the -l or --level=depth option, where a value of 0 or inf permits unlimited depth.[16]
For complete site mirroring, the -m or --mirror option is used, which combines recursive retrieval with timestamp checking (-N), infinite recursion depth (-l inf), and preservation of FTP directory listings (--no-remove-listing). This setup ensures that only updated files are downloaded on subsequent runs, creating an efficient offline clone of the site while maintaining its structure. Mirroring also involves converting absolute links to relative ones for local browsing, facilitated by the -k or --convert-links option.[16]
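A typical mirroring invocation, with placeholder URL and target directory:

    # Create or refresh an offline copy, rewriting links for local viewing:
    wget -m -k -P /srv/mirror https://example.com/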
To control the scope of recursive downloads, Wget provides selectors and filters for accepting or rejecting specific file types and paths, as sketched after this paragraph. The -A or --accept option specifies comma-separated patterns or suffixes for files to include, such as *.html or *.css, using shell-like wildcards (enclosed in quotes to avoid expansion). Conversely, the -R or --reject option excludes patterns, for example, rejecting *.exe or *.zip files to avoid downloading executables or archives. Directory pruning allows skipping unwanted paths through the -X or --exclude-directories option, which takes a comma-separated list of directories (with wildcard support) to ignore; separately, --cut-dirs=number trims that many leading remote directory components from the local paths under which files are saved. These filters ensure targeted downloads, focusing on relevant content like web pages while omitting binaries or temporary directories.[17][18]
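These filters compose on a single command line; the patterns and paths below are illustrative:

    # Recurse three levels, keep pages and styles, skip archives and a temp directory:
    wget -r -l 3 -A '*.html,*.css' -R '*.zip,*.exe' -X '/tmp' https://example.com/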
By default, recursive downloads are restricted to the host from which retrieval begins, but the -H or --span-hosts option enables spanning multiple hosts by allowing links to external domains, useful for downloading linked resources across sites. For handling content that may lack proper file extensions, such as dynamically generated pages saved without .html, the --adjust-extension option appends appropriate suffixes based on content type, improving local file organization and browser compatibility.[19]
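When spanning hosts, restricting the allowed domains with -D keeps the crawl bounded; the domains here are placeholders:

    # Follow links onto an approved CDN and normalize extensions of fetched pages:
    wget -r -H -D example.com,cdn.example.net --adjust-extension https://example.com/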
Wget's recursive features are limited to static content, as it does not execute JavaScript or render dynamic elements, making it suitable for mirroring static websites but ineffective for single-page applications or AJAX-driven sites. Without quotas or careful configuration, recursive mirroring can lead to substantial disk usage, particularly on large sites with deep hierarchies, emphasizing the need for depth limits and filters to manage storage requirements.[20]
Non-Interactive and Automation Options
Wget is designed as a non-interactive command-line tool, making it ideal for automated and scripted operations where user input is neither required nor possible. Unlike graphical download managers, it operates entirely from the terminal or scripts, ensuring seamless integration into batch processes or scheduled tasks without interrupting workflows. This non-interactive approach relies on command-line options and configuration files to handle decisions such as file overwrites or directory placements, using predefined defaults to avoid any prompts.[2] One key feature for unattended execution is the -b or --background option, which detaches Wget from the controlling terminal, allowing it to run as a background process similar to a daemon mode. When invoked with -b, output is automatically redirected to a file named wget-log unless a custom log file is specified via the -o or --output-file option, enabling monitoring of progress and errors without real-time user supervision. This capability supports long-running downloads, leveraging Wget's inherent reliability for resuming interrupted transfers if needed.[21]
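A background run with an explicit log file might look like this (names are illustrative):

    # Detach from the terminal and log progress to mirror.log:
    wget -b -o mirror.log -m https://example.com/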
For batch processing multiple URLs, the -i or --input-file option allows Wget to read a list of URLs from a specified file or standard input, facilitating automated handling of large sets of resources without manual invocation for each one. This is particularly useful in scripts where URLs are dynamically generated or sourced from external data, streamlining operations like periodic data fetches.[22]
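For example, the URL list can come from a file or be piped in; urls.txt and the generate-urls command are hypothetical:

    # Read URLs line by line from a file:
    wget -i urls.txt
    # Or stream them from another command via standard input:
    generate-urls | wget -i -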
Wget's automation is further enhanced by its reliance on initialization files, such as ~/.wgetrc, which define default behaviors like custom user agents via the user_agent directive or output directories with dir_prefix, eliminating the need to repeat common parameters in every command. Additionally, session persistence is supported through the --load-cookies option, which imports cookies from a Netscape/Mozilla-compatible file to maintain authentication states across runs, crucial for accessing protected resources in automated sequences.[23][24]
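A sketch of reusing an exported browser session (the cookie file must be in the Netscape format; the path and URL are placeholders):

    # Present saved cookies to reach an authenticated area:
    wget --load-cookies cookies.txt https://example.com/members/export.csv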
In practice, Wget integrates well with system schedulers like cron for recurring tasks, such as daily mirroring of web content, by embedding download commands directly into crontab entries for hands-off execution. For handling API responses, the --content-disposition option respects server-provided filenames from HTTP headers, ensuring downloaded files are named appropriately without manual renaming in automated pipelines.[1][24]
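An illustrative crontab entry for a nightly mirror (schedule, path, and URL are examples):

    # Run quietly at 03:00 every day, mirroring into /srv/mirror:
    0 3 * * * wget -q -m -P /srv/mirror https://example.com/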
Distinguishing it from interactive tools, Wget forgoes confirmation prompts for actions like file overwrites, instead applying sensible defaults; for instance, the -N or --timestamping option checks remote timestamps against local files to skip redundant downloads, promoting efficiency in unattended environments.[25]
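For instance (placeholder URL):

    # Re-download only if the remote copy is newer than the local file:
    wget -N https://example.com/data/latest.csv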
Usage and Configuration
Basic Syntax and Commands
The basic syntax for invoking Wget is wget [option]... [URL]..., where multiple URLs can be specified on the command line, and Wget will download each one sequentially.[26] This command-line structure allows for simple retrieval of files from supported protocols such as HTTP, HTTPS, and FTP, with options modifying the default behavior.[26]
For a straightforward download, the essential command is wget https://example.com/file.txt, which retrieves the specified resource and saves it in the current directory under its original filename derived from the URL.[26] To customize the output filename, use the -O or --output-document option, as in wget -O output.txt https://example.com/file.txt, which directs the content to the named file instead of using the URL's basename; if -O - is specified, output is sent to standard output.[27] By default, without such options, Wget saves files to the current working directory with the filename taken from the last component of the URL path.[27]
Output control can be adjusted for directory placement using --directory-prefix or -P, which sets a base directory for all saved files; for example, wget --directory-prefix=/downloads https://example.com/file.txt stores the file in /downloads/file.txt rather than the current directory.[18] Verbose mode, enabled by default but adjustable with -v or --verbose, provides detailed progress logs including connection details and transfer statistics.[28] For reduced output, -q or --quiet suppresses all messages except fatal errors, while --no-verbose or -nv turns off verbosity but retains basic information and error reports.[28]
To access usage information, run wget --help, which displays a summary of all command-line options, or consult the manual page with man wget for full documentation.[29] Wget provides basic error handling through exit status codes: 0 indicates successful completion with no issues, 1 signifies a generic error such as invalid options or network failures, and 8 denotes a server-issued error response like a 4xx or 5xx HTTP status.[30] Lower-numbered codes take precedence if multiple errors occur during a run.[30]
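In shell scripts, the exit status can drive error handling; the URL is a placeholder:

    #!/bin/sh
    wget -q https://example.com/health.json
    case $? in
      0) echo "download succeeded" ;;
      8) echo "server returned an error response" ;;
      *) echo "download failed with another error" ;;
    esac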
Common Options and Examples
Wget provides several common options that enhance its utility for everyday downloading tasks, allowing users to handle interruptions, security concerns, and bandwidth limitations effectively. The --no-check-certificate option disables verification of SSL/TLS certificates, converting potential errors into warnings, which is useful for accessing sites with self-signed or invalid certificates, though it should be used cautiously to avoid security risks.[2] For resuming interrupted downloads, the -c or --continue option appends data to partially downloaded files, provided the server supports range requests, making it ideal for large files or unstable connections.[2] To manage network load, --limit-rate restricts the download speed, such as --limit-rate=100k to cap at 100 kilobytes per second, preventing overload on shared connections or servers.[2]
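These flags are frequently combined; the values below are illustrative:

    # Resume a large download while capping bandwidth at 100 KB/s:
    wget -c --limit-rate=100k https://example.com/large-dataset.tar.gz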
File type filtering is handled via the -A or --accept option, as in wget -r -A '.pdf,.txt' https://example.com to download only PDF and text files during a recursive retrieval (the accept list applies while recursing).[2] In API interactions, wget -q -O - "https://api.example.com/data?key=val" quietly fetches JSON or other data directly to stdout, facilitating integration into scripts for automated data pulls.[2] For monitoring large files, --progress=bar displays a progress bar during downloads, as in wget --progress=bar https://example.com/largefile.zip, providing visual feedback on completion status without verbose output.[2]
Best practices emphasize customization and caution: --user-agent allows mimicking a browser, e.g., wget --user-agent="Mozilla/5.0" https://example.com, to bypass sites that block default Wget identification.[2] These options support non-interactive automation by enabling scripted, reliable downloads with minimal intervention.[2]
Configuration Files
Wget supports configuration files to set default options persistently. The primary file is .wgetrc, using a simple syntax of variable = value for options (case-insensitive, with underscores and hyphens interchangeable). Comments begin with #, and empty lines are ignored.[31]
The global configuration file is typically located at /usr/local/etc/wgetrc, while the user-specific file is at ~/.wgetrc in the home directory. The location can be overridden by the WGETRC environment variable or the --config option. Command-line options take precedence over settings in these files.[32] This allows customization of behaviors like proxy settings, recursion limits, or default directories without specifying them each time.[32]
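A short illustrative ~/.wgetrc using directives the manual documents (all values are examples):

    # Default retry budget and download directory:
    tries = 3
    dir_prefix = /home/user/downloads
    # Identify as a specific client and cap bandwidth by default:
    user_agent = Mozilla/5.0
    limit_rate = 200k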
Portability and Integration
Supported Platforms
Wget is designed for high portability across Unix-like operating systems, where it compiles natively using standard tools like GNU Autoconf to ensure compatibility with POSIX standards. It has been tested and runs on a wide range of Unix variants, including GNU/Linux distributions, macOS (formerly Mac OS X), FreeBSD, NetBSD, OpenBSD, Solaris, SunOS 4.x, OSF/Tru64 Unix, Ultrix, IRIX, AIX, and Darwin-based systems. This emphasis on Unix standards dates back to its early versions, such as 1.0 released in 1996, prioritizing broad compatibility without reliance on proprietary features.[33]

On Microsoft Windows, Wget is supported through POSIX emulation layers like Cygwin or MSYS2, which provide a Unix-like environment for native compilation and execution, or via pre-built binaries compiled with tools such as MS Visual C++, Watcom C, Borland C, or MinGW GCC. These Windows ports enable core functionality but may lack full support for certain Unix-specific features, such as advanced signal handling, unless using Cygwin or MSYS2. Additionally, historical ports exist for MS-DOS via the DJGPP compiler and OpenVMS maintained by third parties.[33][34][35]

For secure protocol support, Wget requires either the OpenSSL or GnuTLS library to handle HTTPS connections, with the choice specified at compile time via options like --with-ssl=openssl or --with-ssl=gnutls. GnuTLS is optionally used for enhanced FTPS (FTP over TLS) capabilities, though OpenSSL also supports FTPS; neither Java nor any GUI libraries are needed, keeping the tool lightweight and command-line focused. Wget's non-interactive design further aids its portability by minimizing dependencies on interactive environments.[11]
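A sketch of selecting the TLS backend at build time, assuming a typical unpacked source tree:

    # Configure the build against GnuTLS instead of OpenSSL:
    ./configure --with-ssl=gnutls
    make && sudo make install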
Wget extends to embedded and mobile platforms, such as Android, where it runs in Unix-like environments like Termux, allowing compilation and use without root access. There is no official support for iOS due to platform restrictions on command-line tools and sandboxing, though unofficial or jailbroken implementations may exist.
Installation and Compatibility
The current stable version 1.25.0, released in November 2024, maintains backward compatibility with scripts dating back to the 1.0 era, allowing legacy automation tools to function without modification in most cases.[2] On Unix-like systems, Wget is pre-installed on most major distributions, such as Debian, Ubuntu, Fedora, and CentOS, where its presence can be verified with wget --version or installed via package managers if absent—for example, sudo apt install wget on Debian-based systems or sudo dnf install wget on Fedora. For custom builds, users can compile from source by downloading the tarball from the official GNU FTP site, extracting it, and running ./configure && make && sudo make install, which requires a compatible compiler like GCC 4.8 or newer to handle C99 features.[3]
For Windows, it can be installed via package managers like MSYS2, providing the latest version and seamless integration.[34] These binaries are fully compatible with PowerShell environments, though users may need to invoke them as wget.exe to avoid conflicts with PowerShell's built-in wget alias for Invoke-WebRequest.
On macOS, Wget is not included by default but can be installed using Homebrew via brew install wget or MacPorts with sudo port install wget, both of which build with support for TLS 1.2 and higher to handle modern HTTPS connections securely. These methods ensure compatibility with macOS versions from 10.13 onward, leveraging the system's OpenSSL or GnuTLS libraries.[36]
Notable compatibility considerations include the absence of HTTP/2 support throughout the 1.x series, which may affect performance on sites optimized for multiplexed transfers; HTTP/2 is available only in the successor project Wget2, though users of older 1.x installations should still upgrade to 1.25.0 for the latest security and protocol fixes. When compiling custom versions, matching the system's compiler version (e.g., GCC 4.8+) is recommended to avoid linkage issues with dependencies like libgnutls.
Development and Maintenance
Current Maintainers and Contributions
The primary maintainers of GNU Wget are Tim Rühsen, who has served as lead maintainer since 2014 and contributed extensively to fuzzing support and continuous integration; Darshit Shah, maintainer since 2014 with a focus on security-related patches and build processes; and Giuseppe Scrivano, who maintained the project from 2010 to 2014 and continues to contribute core fixes, particularly for HTTP and HTTPS functionality.[6][1] Contributions to Wget are managed through a structured process hosted on the GNU Savannah platform, where developers can clone the source repository using Git via git clone git://git.savannah.gnu.org/wget/wget.git or the HTTPS equivalent.[37] Patches and proposed changes must adhere to GNU coding standards, including style guidelines for C code, documentation requirements, and copyright assignment to the Free Software Foundation for significant contributions exceeding 15 lines. Submissions are sent via email to the active bug-wget mailing list at [email protected], where discussions on bugs, features, and testing occur.[38]
The project emphasizes collaborative development through this mailing list, which serves as the central hub for reporting issues and coordinating efforts, alongside Savannah's task management tools.[39] Historically, Wget has benefited from over 100 documented contributors, reflecting broad community involvement in enhancements and bug fixes.[6] Following high-profile security vulnerabilities in 2024, such as CVE-2024-38428 and CVE-2024-10524, the maintainers have intensified focus on security audits, resulting in more frequent releases prioritizing vulnerability remediation.[40]
Version History and Notable Releases
Wget's development began with its initial stable release, version 1.0, in 1996, which introduced core functionality for retrieving files via basic HTTP and FTP protocols, establishing it as a non-interactive downloader for command-line use.[3] Version 1.5, released in September 1998, was a major rewrite that improved stability and code architecture, and version 1.7, released in June 2001, added support for HTTPS through integration with SSL libraries, enabling secure downloads over encrypted connections and broadening its applicability to protected web resources. This addressed growing demands for secure protocol handling in an era of expanding web security standards. Version 1.10, released in June 2005, enhanced network compatibility with IPv6 support on dual-stack systems, alongside improvements in large file handling and NTLM authentication, facilitating downloads in modern IPv6-enabled environments.

Subsequent releases focused on protocol extensions and security. Version 1.14, released on August 6, 2012, introduced support for content disposition on error responses, WARC logging for web archiving, and TLS Server Name Indication (SNI), improving compatibility with virtual hosting and error handling in HTTP interactions.[41] Version 1.21, released on December 31, 2020, delivered numerous bug fixes, including corrections to certificate validation and progress reporting.[42] The most recent stable release, 1.25.0, arrived on November 11, 2024, primarily addressing security vulnerabilities, including the removal of shorthand URL parsing for credentials to mitigate CVE-2024-10524, a potential information disclosure issue in FTP and HTTP URLs, and fixes for CVE-2024-38428 related to URI semicolon handling.[43][40] It also refined input handling for non-blocking stdin reads and updated URI parsing to align with RFC 3986 standards. As of November 2025, no major updates have been issued beyond 1.25.0, with new feature development largely directed toward Wget2 while the 1.x branch receives primarily security-focused patches.[39]

Throughout its evolution, Wget transitioned from basic protocol implementations to a robust tool emphasizing speed and reliability, implemented in C to keep recursive downloads efficient without dependencies on scripting languages like Perl. Each release has preserved backward compatibility for command-line options, allowing seamless upgrades in scripts and automation workflows. This stability has made Wget instrumental in web archiving tools, such as those used by the Internet Archive, by providing reliable mirroring capabilities.

Successor Project
Overview of Wget2
Wget2 is the modern successor to the original GNU Wget, initiated in 2012 by developer Tim Rühsen as a complete rewrite in the C programming language to achieve greater modularity, improved performance, and support for contemporary web technologies.[44] Unlike its predecessor, which had accumulated a complex codebase over decades, Wget2 was designed from scratch to address limitations in scalability and extensibility while maintaining a familiar command-line interface for backward compatibility.[45] The project emphasizes a library-centric architecture, with the core functionality encapsulated in the reusable libwget library, enabling integration into other applications beyond standalone command-line use.[46]

Key goals of Wget2 include support for advanced protocols such as HTTP/3 via QUIC (planned but not yet implemented as of November 2025), asynchronous input/output operations for efficient handling of concurrent downloads, and Metalink version 4 for enhanced file integrity and multi-source retrieval.[47][48] These features aim to deliver significantly faster download speeds than the original Wget through multi-threading, HTTP/2 multiplexing, compression handling (including Brotli and Zstandard), and parallel connections, while bolstering security through modern TLS configurations like Perfect Forward Secrecy (PFS), HTTP Strict Transport Security (HSTS), and OCSP stapling.[46] The command-line syntax remains largely compatible with Wget 1.x, but the underlying internals rely on multi-threaded, non-blocking I/O, reducing latency in recursive website mirroring and large-scale file transfers.[46]

The latest stable release is version 2.2.0, released in November 2024, with ongoing active development hosted on GitLab, where contributors collaborate on features and bug fixes.[49] Despite its maturity, Wget2 is not yet the default wget implementation in most Linux distributions, remaining an alternative package in many repositories due to its relatively recent stabilization.[50] Adoption has progressed in select environments, notably integration into Fedora Linux as the primary wget starting with version 40 in April 2024; however, in May 2025, the original GNU Wget was reintroduced alongside Wget2 as a separate package due to compatibility issues, such as missing full FTP support and differences in command-line options.[51][52] Wget2 continues to be available as an enhanced alternative, though ongoing refinements address protocol support and compatibility edge cases.

Key Differences from Original Wget
Wget2 represents a complete rewrite of the original Wget, shifting from a monolithic architecture to a modular design centered around the libwget library, which enables easier integration and maintenance while providing a reusable API for developers.[45] Unlike the original Wget's single-threaded approach, Wget2 employs multi-threading for concurrent downloads, supporting up to five threads by default (configurable via --max-threads), which enhances performance over unstable networks by allowing parallel handling of multiple connections.[53] This event-driven concurrency model in libwget facilitates efficient resource management, contrasting with the original's sequential processing that could bottleneck large or recursive downloads.[54]
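An illustrative invocation (the URL and thread count are placeholders):

    # Fetch a site recursively with up to 10 parallel worker threads:
    wget2 --max-threads=10 -r https://example.com/downloads/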
Among its new features, Wget2 introduces native support for HTTP/2, enabling multiplexed streams and server push for faster retrievals compared to the original's HTTP/1.1 limitation.[55] It also decompresses Brotli-compressed responses (via Accept-Encoding: br), alongside other formats like gzip and zstd, reducing bandwidth usage and accelerating transfers where supported by servers.[45] Enhanced progress bars provide real-time visual feedback on download status, customizable with options like --progress=bar:force, improving user experience over the original's text-based output.[56] Additionally, Wget2 improves Metalink support, allowing specification of multiple mirror URLs in a single file for resilient, P2P-like downloads that automatically select optimal sources.[57]
In terms of compatibility, Wget2 retains a significant portion of the original Wget's command-line interface, including core options like --recursive, --mirror, and --output-document, but introduces new flags such as --http2 and --max-threads while omitting some legacy features like full FTP support.[53] This design promotes faster parsing of responses through HTTP/2 efficiencies and lower memory consumption via optimized threading, often resulting in quicker overall downloads without the original's resource overhead.[55] Like its predecessor, Wget2 is implemented purely in C with no reliance on external scripting languages, keeping the build process lean. It offers superior native Windows support through direct compilation without requiring Cygwin, enabling seamless operation on Windows environments.[46] The security model features stricter defaults, including automatic HSTS enforcement and OCSP stapling for certificate validation, reducing risks from outdated protocols compared to the original's more permissive settings.[57]
As a trade-off, Wget2 maintains a smaller, more maintainable codebase focused on modern protocols, prioritizing speed and modularity over exhaustive feature parity with the original. While recursive mirroring is supported with options like --recursive and --page-requisites, it does not yet fully replicate all nuances of the original's web spanning and robot exclusion handling as of version 2.1.0, though ongoing development addresses gaps in versions like 2.2.0.[56][58]