Xpdf
Xpdf is a free and open-source PDF viewer and toolkit based on the Qt framework, available for Linux and other Unix-like systems, Windows, and macOS, that lets users view, print, and extract content from Portable Document Format (PDF) files.[1] Developed by Derek Noonburg and first released in 1995, Xpdf consists of a graphical viewer application (xpdf) alongside a suite of command-line utilities, including pdftotext for converting PDFs to plain text, pdfimages for extracting embedded images, pdfinfo for retrieving document metadata, pdffonts for listing fonts, and pdfdetach for handling attachments.[1] The toolkit supports PDF rendering, text selection, searching, and printing.[1] Starting with version 4.00, the viewer transitioned to the Qt framework for improved cross-platform compatibility, with the latest open-source release being version 4.06 as of November 2025.[2] Xpdf is dual-licensed under the GNU General Public License versions 2 and 3, allowing free redistribution and modification of its source code, which is available from the official project website.[3] A related closed-source product, XpdfReader, builds on the open-source Xpdf viewer by incorporating proprietary enhancements like advanced color management and Windows-specific printing support.[1]
Introduction
Overview
Xpdf is a free, open-source PDF viewer and toolkit designed for displaying, extracting, and converting PDF content.[4] Its primary purpose is to enable viewing of PDF files across various operating systems, including Linux, Windows, macOS, FreeBSD, and OpenVMS, while also providing utilities for tasks like text extraction, image conversion, and metadata handling.[1][5][6] The toolkit encompasses a collection of command-line tools that support these functions without requiring a graphical interface.[4] The latest stable release, version 4.06, was issued on November 14, 2025, with native support for Linux, Windows, and macOS, alongside ports for FreeBSD and OpenVMS.[2][1][5][6] In version 4.00, Xpdf shifted from the X Window System and Motif framework to Qt for enhanced cross-platform compatibility.[7]
Development and Maintenance
Xpdf was originally created by Derek Noonburg in 1995 as the first open-source PDF viewer.[8] Noonburg has served as the primary developer since its inception, handling the core writing and ongoing enhancements.[1] Glyph & Cog, LLC, founded by Noonburg in 2002, specializes in PDF technology and maintains Xpdf as part of its portfolio of software components for viewing, printing, and text extraction.[9] As of November 2025, Xpdf remains actively maintained, with regular updates that fix bugs, address security vulnerabilities (such as those reported in CVE-2025-3154 and CVE-2025-2574, which are fixed in version 4.06), and improve compatibility with evolving PDF standards.[10][11][12] The development model centers on Noonburg as the lead contributor, supplemented by community input through open-source channels, including source code downloads and mirrors on platforms like GitHub.[2][13] Official resources include the primary website at xpdfreader.com for downloads and documentation, with source code available directly from the site and via GitHub mirrors for collaborative access.[4][2]
History
Origins and Early Development
Xpdf was created by Derek Noonburg as the first open-source PDF viewer, with its initial public release (version 0.2) occurring in December 1995.[8] This early prototype addressed the lack of freely available tools for viewing Portable Document Format (PDF) files on Unix-like systems, at a time when commercial alternatives like Adobe Acrobat were dominant and not widely accessible for open-source environments.[8] The software was built specifically for the X Window System, utilizing the Motif toolkit to provide a graphical interface suitable for Unix desktops.[14]
Early versions from 1.00, released in 1996, to the 3.xx series emphasized core PDF viewing capabilities, including decoding of LZW-compressed streams as specified in the initial PDF standards. These releases progressively stabilized basic functionality, such as page rendering and simple navigation, without advanced features like annotation or form handling. By the early 2000s, Xpdf gained traction as a foundational backend for other open-source projects, notably serving as the code base for KPDF, a KDE-integrated viewer released around 2002, and GPDF, a GNOME-based PDF viewer.[15][16] Noonburg continued leading its development through this period, laying the groundwork for broader adoption in Linux distributions.[8]
Major Versions and Evolution
Xpdf originated as a basic PDF viewer in the mid-1990s, with its first public release (version 0.2) occurring in December 1995, providing essential rendering capabilities for early PDF files under the X Window System.[1] Over the subsequent decade, it evolved into a more comprehensive toolkit by the 2000s, incorporating command-line utilities for text extraction, image conversion, and notably PostScript output via the pdftops tool, which facilitated broader integration in printing and publishing workflows.[1]
A significant milestone came with version 3.04, released on May 28, 2014, which introduced a completely rewritten text extractor for improved accuracy, including new modes for handling tabular and monospaced data, alongside a faster PDF rendering engine that maintained spec compliance while enhancing performance.[17] This version also advanced font handling through better support for complex text layouts and improved color separation with a DeviceN rasterizer for CMYK and spot colors.[17]
The project's architecture underwent a major overhaul in version 4.00, released on August 10, 2017, transitioning the viewer to the Qt framework via the XpdfWidget library, which eliminated the dependency on X/Motif and enabled superior cross-platform support for Windows, Linux, and other environments.[18] This update also added linear text selection, enhanced color management, and initial support for most PDF 2.0 (ISO 32000-2) features, marking a shift toward modern standards.[18]
Version 4.05, released on February 8, 2024, focused on refinements including security fixes for vulnerabilities like divide-by-zero errors in color space handling and issues with large page sizes, alongside new command-line options such as -overwrite for pdftohtml and expanded pdfinfo capabilities for metadata extraction.[19][12]
The release of Xpdf 3.0 in 2004 served as the basis for Poppler, a fork announced in 2005.[20][21][22]
Version 4.06, released on November 14, 2025, is primarily a bug-fix release that addresses several security vulnerabilities, including PDF object loops and integer overflows (e.g., CVE-2024-7866, CVE-2024-7867), and adds minor features such as the -listonly option for pdfimages.[2][23][12]
As of November 2025, Xpdf continues to emphasize performance optimizations, such as multi-pass DeviceN rasterization supporting up to 32 channels introduced in recent updates, and maintains strong compatibility with evolving PDF standards, including ongoing security enhancements to address contemporary threats.[19][12]
Features and Capabilities
PDF Viewing Functionality
The Xpdf graphical viewer offers a range of user interface elements designed for efficient PDF navigation and manipulation. It supports scrollable views through mouse dragging with the middle button, arrow keys for incremental movement, and Page Up/Page Down keys for larger jumps, enabling smooth traversal of multi-page documents. Zoom functionality includes dedicated buttons and a popup menu for options such as fit-to-width or fit-to-page, alongside keyboard shortcuts like Ctrl-+ for zoom in and Ctrl-- for zoom out, allowing users to adjust magnification dynamically. Page navigation is facilitated by a direct page number entry box, left/right arrow buttons for sequential movement, and Alt-left/right for history-based back/forward actions, with Home and End keys jumping to the first or last page. Search capabilities are provided via a find entry box, next/previous buttons, and the Ctrl-F shortcut to locate text within the document. Printing is accessible through the Ctrl-P shortcut or the print menu, which opens a standard dialog for output configuration.[24]
Interaction features in Xpdf emphasize Unix-like efficiency and user control. Basic annotation support allows display of existing highlights and notes in compatible PDFs, though editing is not available in the open-source version. Full-screen mode can be enabled via the -fullscreen command-line option or toggled with Ctrl-L, presenting a single-page view for immersive reading and exiting with the Esc key. The viewer incorporates numerous keyboard shortcuts for streamlined operation, such as Ctrl-O to open files, Ctrl-C to copy selected text, Ctrl-W to close tabs, and Ctrl-Q to quit, catering to power users on keyboard-centric workflows.[24][25]
The rendering engine in Xpdf delivers high-fidelity display of PDF content, accurately reproducing text, images, and vector graphics as specified in the file. Text rendering includes linear or block selection modes for copying, with anti-aliasing enabled by default for smoother appearance. Images are rendered without reversal in reverse video mode, preserving visual integrity. Vector elements benefit from configurable anti-aliasing to reduce jagged edges. Hyperlinks are fully supported as clickable elements, with left-click activating internal navigation and middle-click opening external URLs in a new tab or system browser, depending on configuration. Form filling is limited to display of interactive elements in compatible PDFs, without editing capabilities in the core viewer.[24][1]
Performance in Xpdf prioritizes a lightweight footprint, enabling quick loading and responsive operation even on older hardware, thanks to its efficient design built on the Qt framework. Anti-aliasing for fonts and vectors enhances readability without significant overhead, and the overall architecture ensures minimal resource usage for standard viewing tasks.[24][1]
Format Decoding and Compatibility
Xpdf's decoding capabilities encompass several key compression algorithms defined in the PDF specification, enabling it to process a wide range of document content efficiently. It supports LZW compression decoding, which is commonly used for text and vector data in older PDF files, allowing for the extraction and rendering of compressed streams without data loss.[26] Similarly, Flate (ZIP-based) decoding is fully implemented, handling the prevalent general-purpose compression for page content, images, and metadata streams that became standard in PDF 1.2 and later.[27] For image-specific compression, Xpdf includes a JBIG2 decoder, introduced in version 2.00, which processes bi-level (black-and-white) images with high efficiency, supporting both lossless and lossy modes as per the PDF specification.[28] Additionally, Xpdf handles encrypted PDFs by prompting for user or owner passwords during loading, decrypting the document using supported algorithms like RC4 and AES up to 256-bit keys, thereby enabling access to protected content while adhering to the encryption metadata.[29]
In terms of standards compliance, Xpdf provides full support for PDF 1.7 (ISO 32000-1), covering core features such as object streams, cross-reference streams, and hybrid file specifications, ensuring compatibility with the majority of legacy and contemporary documents generated by standard tools. For PDF 2.0 (ISO 32000-2), support is partial, with version 4.00 and later implementing most essential features relevant to viewing and text extraction, including enhanced digital signatures and AES-GCM encryption, though some advanced optional extensions remain unimplemented.[30] Regarding digital rights management (DRM), the official Xpdf implementation respects restrictions such as copy, print, and modification prohibitions embedded in PDF metadata, preventing unauthorized operations to comply with the standard; however, community patches exist to bypass these, and distributions like Debian incorporate modifications for broader usability, sometimes integrating tools like qpdf for decryption workflows.[26][31]
Xpdf demonstrates robust compatibility in rendering complex PDF elements, leveraging its internal graphics state management to handle transparency groups and blending modes introduced in PDF 1.4, which are rasterized appropriately during output to maintain visual fidelity without artifacts.[32] It also supports axial, radial, and free-form shadings (types 1 through 7), accurately interpolating color gradients across defined domains for smooth visual effects in vector graphics.[33] Embedded fonts, including subsets of Type 1, TrueType, and OpenType formats, are processed via FreeType integration, ensuring precise glyph rendering and substitution for missing characters while preserving kerning and ligatures where specified in the PDF font dictionary.[29] However, limitations arise with proprietary extensions, such as Adobe's XFA (XML Forms Architecture) for dynamic forms, which are not fully supported due to their non-standard nature, leading to fallback rendering or omission of interactive elements; similarly, certain vendor-specific annotations or multimedia integrations may degrade to basic display.[33]
For error recovery, Xpdf employs a resilient parsing strategy that allows graceful degradation when encountering malformed PDFs, such as invalid page tree references or syntax errors in object dictionaries, by logging warnings to stderr while continuing to process recoverable sections and avoiding crashes.[34] This approach enables partial functionality, for instance, extracting text or images from otherwise corrupted files, and prioritizes stability through bounded memory allocation and input validation during stream decoding.[33]
Components
Graphical Viewer
The graphical viewer in Xpdf is launched by executing the xpdf command from the terminal or command prompt, which opens a standalone window for displaying PDF files.[24] If no arguments are provided, it starts without loading a document; otherwise, specifying a PDF file path, such as xpdf document.pdf, loads that file immediately.[24] Command-line options allow customization of the initial view, including the -z flag for zoom levels (e.g., -z 100 for 100% scaling, -z page to fit the entire page to the window, or -z width to fit the page to the window width) and page selectors like document.pdf:18 to open directly to a specific page.[24]
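For scripted launches, the invocation described above can be assembled programmatically. The following is a minimal sketch rather than anything shipped with Xpdf: the helper name xpdf_command is hypothetical, and it covers only the -z option and the zoom values mentioned in the text.

```python
def xpdf_command(pdf_path, zoom=None):
    """Build an argument list for launching the xpdf viewer.

    `zoom` may be a positive number (a percentage, e.g. 100) or one of
    the keywords "page" or "width". This hypothetical helper only
    assembles the command; it does not start the viewer.
    """
    argv = ["xpdf"]
    if zoom is not None:
        if isinstance(zoom, str):
            if zoom not in ("page", "width"):
                raise ValueError(f"unsupported zoom keyword: {zoom!r}")
        elif zoom <= 0:
            raise ValueError("zoom percentage must be positive")
        argv += ["-z", str(zoom)]
    argv.append(pdf_path)
    return argv
```

The resulting list can be passed to Python's subprocess.Popen to open the viewer without blocking the calling script.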
Building the graphical viewer requires the Qt GUI toolkit for Xpdf versions 4.xx and later, supporting Qt 5.x or 6.x on platforms including Unix, macOS, and Windows.[24] Legacy versions 3.xx instead depend on the Motif library for the user interface.[35] The viewer can be used standalone or integrated into other applications through XpdfWidget, a commercial Qt component that embeds PDF viewing capabilities while providing developers control over rendering and interaction.[36]
User preferences and defaults for the graphical viewer are configured via the .xpdfrc file, typically located in the user's home directory on Linux/Unix/macOS systems.[32] This file allows customization of rasterization options, such as enabling font anti-aliasing with antialias yes (the default) and vector graphics anti-aliasing with vectorAntialias yes.[32] Screen resolution settings can be adjusted, for instance, via zoomScaleFactor actual to base zoom calculations on the display's DPI rather than the standard 72 dpi, ensuring appropriate scaling for fit-page or fit-width modes.[32]
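As an illustration of the file format, the settings named above can be written out as one "name value" pair per line. This is a sketch under that assumption: render_xpdfrc is a hypothetical helper, and only the three settings quoted in the text are used.

```python
def render_xpdfrc(settings):
    """Render a dict of settings as .xpdfrc text, one 'name value' per line.

    Booleans are written as yes/no, matching the examples in the text.
    """
    lines = []
    for name, value in settings.items():
        if isinstance(value, bool):
            value = "yes" if value else "no"
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Settings taken from the paragraph above.
example = render_xpdfrc({
    "antialias": True,
    "vectorAntialias": True,
    "zoomScaleFactor": "actual",
})
print(example, end="")
```

Saving the returned string as .xpdfrc in the home directory would apply these preferences the next time the viewer starts.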
Command-Line Utilities
Xpdf provides a suite of command-line utilities designed for batch processing and extraction of content from PDF files, collectively known as the xpdf-utils package. These tools enable users to manipulate PDFs without a graphical interface, supporting tasks such as text extraction, image retrieval, and metadata inspection on Unix-like systems, Windows, and macOS. Developed as part of the open-source Xpdf project, they leverage the same PDF parsing engine as the graphical viewer but focus on non-interactive workflows.[1]
The pdftotext utility extracts text from PDF files into plain text or Unicode formats, with options to preserve layout, handle tables, or maintain raw content stream order. For instance, the command pdftotext -layout input.pdf output.txt converts a PDF while retaining the physical arrangement of text, making it suitable for document analysis or search indexing. It supports page ranges via -f (first page) and -l (last page) flags, adjustable margins in points, and output encoding specifications like UTF-8, along with end-of-line conventions for Unix, DOS, or Mac systems. Password-protected files can be processed using owner (-opw) or user (-upw) passwords.[37]
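For batch conversion, the pdftotext options above can be assembled in a script. The sketch below is illustrative only: the helper names are hypothetical, and it uses just the -layout, -f, -l, and -upw flags described in the text.

```python
from pathlib import Path


def pdftotext_command(pdf_path, txt_path, first=None, last=None,
                      layout=False, user_password=None):
    """Build a pdftotext argument list (hypothetical helper)."""
    if first is not None and last is not None and first > last:
        raise ValueError("first page must not exceed last page")
    argv = ["pdftotext"]
    if layout:
        argv.append("-layout")
    if first is not None:
        argv += ["-f", str(first)]
    if last is not None:
        argv += ["-l", str(last)]
    if user_password is not None:
        argv += ["-upw", user_password]
    argv += [pdf_path, txt_path]
    return argv


def batch_commands(pdf_paths):
    """One layout-preserving command per input, writing NAME.txt beside NAME.pdf."""
    return [
        pdftotext_command(str(p), str(Path(p).with_suffix(".txt")), layout=True)
        for p in pdf_paths
    ]
```

Each list can be executed with subprocess.run(argv, check=True), which raises an exception if the tool exits with a non-zero status.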
The pdfimages utility extracts embedded images from PDFs, saving them in formats such as PPM, PGM, PBM, JPEG, or raw native formats without applying transformations like rotation or clipping. Users can run pdfimages -j -f 1 -l 5 document.pdf images to pull JPEG images from pages 1 through 5, naming outputs sequentially as images-0001.jpg. Key options include -u for unique images only, -list for a summary of image details (e.g., dimensions, color space, page), and support for JPEG 2000 via -J, with page-specific scanning and password handling.[27]
The pdftops utility converts PDF pages to PostScript, facilitating printing or integration with other graphics workflows, and supports levels 1, 2, or 3 PostScript with grayscale or separable color modes. The command pdftops -level2 -paper A4 input.pdf output.ps generates level-2 PostScript scaled to A4 paper, while -eps produces Encapsulated PostScript for vector graphics. Features encompass font embedding toggles for Type 1, TrueType, and CID fonts, page scaling and centering, OPI comments if compiled with support, and security options for encrypted files.[38]
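The PostScript level and paper options can be validated before the tool is invoked. This is a minimal sketch with a hypothetical helper name; it assumes the level flags follow the -level2 pattern shown above and uses only the -paper and -eps options named in the text.

```python
def pdftops_command(pdf_path, ps_path, level=2, paper=None, eps=False):
    """Build a pdftops argument list (hypothetical helper).

    `level` selects the PostScript language level (1, 2, or 3).
    """
    if level not in (1, 2, 3):
        raise ValueError("PostScript level must be 1, 2, or 3")
    argv = ["pdftops", f"-level{level}"]
    if eps:
        argv.append("-eps")
    if paper is not None:
        argv += ["-paper", paper]
    argv += [pdf_path, ps_path]
    return argv
```

Rejecting an unsupported level in the script gives a clearer error than letting the command fail on an unrecognized flag.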
Pdfinfo outputs metadata from the PDF's Info dictionary, including author, creation and modification dates, page count, encryption status, permissions, and file size, along with details like PDF version and JavaScript presence. Executing pdfinfo -meta document.pdf displays XMP metadata, while -box reveals page bounding boxes in points. It examines specified page ranges and reports tagged or form-based structures, with exit codes indicating errors such as permission denials.[39]
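Scripts that consume pdfinfo output typically need it as structured data. The sketch below assumes the tool prints one "Key: value" pair per line; the parser name and the sample text are illustrative and were not captured from a real run. It targets the key-value listing only, not the XMP block printed by -meta.

```python
def parse_pdfinfo(output):
    """Parse 'Key: value' lines into a dict.

    Splits on the first colon only, so values that contain colons
    (such as timestamps) stay intact. Lines without a colon are skipped.
    """
    info = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


# Illustrative sample only, not real tool output.
sample = (
    "Title:          Example Report\n"
    "Author:         A. Writer\n"
    "CreationDate:   Mon Jan  1 12:30:00 2024\n"
    "Pages:          12\n"
    "Encrypted:      no\n"
)
info = parse_pdfinfo(sample)
page_count = int(info["Pages"])
```

Using partition rather than split keeps the time portion of date fields attached to its value.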
Other utilities include pdftoppm, which renders PDF pages as PPM, PGM, or PBM images at customizable resolutions (default 150 DPI) and rotations, useful for rasterization; pdftohtml, which generates HTML representations with embedded images and optional form fields, supporting zoom levels and iframe indices; pdffonts, which lists embedded fonts by type (e.g., Type 1, TrueType), subset status, and Unicode mappings, aiding in font analysis; and pdfdetach, which lists or extracts attached files from PDFs, with encoding options for file names. These tools read configuration from ~/.xpdfrc or system-wide files and are optimized for scripting in automated pipelines.[1][29][40][41][42]