
DjVu

DjVu is a raster image file format and compression technique designed primarily for the efficient storage and distribution of high-resolution scanned documents, particularly those combining text, line drawings, and images, achieving significantly smaller file sizes than traditional formats like JPEG or TIFF while maintaining quality.^[1] Developed in the late 1990s, it separates document layers, such as foreground text and background images, for targeted compression, enabling progressive loading and viewing over low-bandwidth connections.^[2]

The format originated at AT&T Labs in 1996, where researchers including Yann LeCun, Léon Bottou, Patrick Haffner, and Paul G. Howard created it to address the challenge of delivering color scanned documents over the early Internet, compressing 300 DPI pages from megabytes to tens of kilobytes.^[3] Initial development focused on wavelet-based compression for images and pattern-matching for text (JB2), with the first public demonstration in 1998.^[2] In 2000, LizardTech acquired the technology for commercialization, releasing viewers and plugins, while an open-source implementation, DjVuLibre, emerged in 2001 under the GNU General Public License, ensuring long-term accessibility.^[3] The specification became an open standard, with the MIME type image/vnd.djvu, and trademarks held by entities like Celartem Inc. before management shifted to Cuminas Corporation.^[1]

Key features include multi-layer encoding, where bilevel masks isolate text for lossless compression and continuous-tone layers handle images with lossy wavelets, yielding 5-10x better ratios for color documents compared to JPEG.^[2] It supports multipage documents in bundled or indirect formats, requires minimal resources (e.g., 2 MB RAM), and facilitates interactive viewing with zooming and panning.^[1] DjVu has been widely adopted for digital libraries and archives, such as in book digitization projects, due to its balance of quality, speed, and size efficiency, though it competes with PDF for general use.^[3]

History

Origins and Development

DjVu's development began in 1996 at AT&T Labs-Research in Red Bank, New Jersey, led by a team of researchers including Yann LeCun, Léon Bottou, Patrick Haffner, and Paul G. Howard.^[2] The project originated as an effort to address the challenges of distributing high-resolution scanned documents over the slow internet connections of the era, targeting formats that combined text, drawings, and images on colored or textured backgrounds.^[4] The core innovation of the early DjVu work was the segmentation of document images into distinct layers: a foreground layer for bi-level text and line art, a background layer for images and paper texture, and a mask to delineate the two.^[2] Backgrounds were compressed using a wavelet-based method called IW44, while foregrounds employed JB2, a bi-level compression technique building on JBIG, both enhanced with arithmetic entropy coding via the ZP-coder.^[4] This layered approach allowed for efficient handling of hybrid text-image documents, such as scanned books and newspapers, which were tested in initial prototypes to validate performance.^[2] Prototypes demonstrated significant compression gains, reducing file sizes for 300 dpi color pages from around 25 MB uncompressed to 40-60 KB, achieving 5-10 times better ratios than JPEG equivalents and 3-8 times better than TIFF-G4 for black-and-white scans.^[5] These early systems also supported progressive rendering, enabling text visibility in seconds over a 56 kbps modem, with full images loading shortly after.^[2] Key milestones included internal demonstrations at AT&T in 1998, coinciding with the publication of foundational papers and the release of browser plug-ins for real-time decoding.^[4] By 1999, AT&T made the first public release of DjVu as an open research project, including software and specifications available via their website, paving the way for subsequent commercialization by LizardTech.^[6]
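The size figures quoted above can be checked with some quick arithmetic. The sketch below assumes a US Letter page (8.5 x 11 inches) in 24-bit color and takes 1 KB as 1000 bytes; neither assumption comes from the cited sources, and the function names are illustrative only.

```python
# Back-of-the-envelope check of the page sizes quoted in the text.
# Assumptions (not from the sources): US Letter page, 24-bit RGB, 1 KB = 1000 bytes.

def uncompressed_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Raw size of a scanned page stored as uncompressed pixels."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel

def transfer_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second

raw = uncompressed_bytes(8.5, 11, 300)   # 2550 x 3300 pixels, 3 bytes each
djvu = 50_000                            # midpoint of the quoted 40-60 KB range

print(raw)                                       # 25245000 bytes, about 25 MB
print(round(raw / djvu))                         # 505, the overall compression ratio
print(round(transfer_seconds(djvu, 56_000), 1))  # 7.1 seconds on a 56 kbps modem
```

Under these assumptions the raw page is about 25 MB, which matches the figure in the text, and a 50 KB DjVu page would take roughly seven seconds to download at 56 kbps.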

Commercialization and Open Sourcing

In March 2000, LizardTech Inc. acquired the DjVu technology from AT&T Labs-Research, transitioning the format from an experimental project to a commercial product. The company promptly released the first commercial DjVu software suite, which included encoders for creating DjVu files from scanned documents and viewers for rendering them, targeting web-based distribution of high-resolution images. This move aimed to capitalize on DjVu's compression advantages for scanned materials, positioning it as a competitor to formats like PDF for online publishing.^[7]^[8] LizardTech expanded its offerings with specialized products, including DjVu Solo—a consumer-oriented tool for small-scale document conversion from scanners, complete with OCR and hyperlink support—and Document Express, a full-featured suite for professional document publishing workflows. To promote adoption, LizardTech established partnerships with libraries and publishers, enabling the distribution of large-scale scanned collections, such as historical archives and academic materials, in the efficient DjVu format. In 2003, LizardTech was acquired by Celartem Inc., which continued support for DjVu products. AT&T and LizardTech maintained control over core compression techniques through patents, notably U.S. Patent No. 6,058,214, which covered wavelet-based methods essential to DjVu's performance.^[9]^[10]^[11]^[12] Despite these proprietary elements, LizardTech committed to broader accessibility by open-sourcing the technology. In October 2000, the company released the DjVu Reference Library version 2.0 under the GNU General Public License version 2, followed by version 3.5 in 2001, which included enhanced decoding and encoding capabilities. 
This initiative, driven by the desire to establish DjVu as an open standard for scan-to-web applications, spurred the formation of the DjVuLibre project by the format's original developers, providing a free, GPL-licensed implementation compatible with LizardTech's suite.^[12]^[13] Post-open-sourcing, maintenance shifted to community efforts hosted on SourceForge, where DjVuLibre has evolved through volunteer contributions. The project saw steady development, culminating in the latest stable release, version 3.5.29, on July 3, 2025, featuring minor updates for modern system compatibility and bug fixes without altering the core format. This ongoing open-source stewardship has ensured DjVu's longevity beyond commercial constraints.^[14]^[15]

Technical Design

Compression Methods

DjVu employs a layered compression approach to achieve high ratios for scanned documents, separating each page into a background layer for continuous-tone images like photographs and textures, a foreground layer for bi-level elements such as text and line drawings, and a mask layer that defines the placement of the foreground over the background. This segmentation allows specialized algorithms to target the distinct characteristics of each layer, resulting in efficient encoding that preserves visual fidelity while minimizing file size.^[4]^[5] The background layer, typically downsampled to 100-150 dpi, is compressed using the IW44 codec, a progressive wavelet-based method akin to JPEG 2000 but optimized for speed and low memory usage with Deslauriers-Dubuc interpolating wavelets sensitive to 4:4:4 color sampling. IW44 performs a five-stage lifting decomposition to transform the image into wavelet coefficients, which are then quantized and entropy-coded, enabling fast progressive refinement where low-resolution previews load quickly and details refine as more data arrives. The discrete wavelet transform underlying IW44 can be conceptually represented as a multi-resolution decomposition:

\begin{align*} c_{j+1,k} &= \sum_m h[m-2k]\, c_{j,m}, \\ d_{j+1,k} &= \sum_m g[m-2k]\, c_{j,m}, \end{align*}

where c_{j,k} are approximation coefficients, d_{j,k} detail coefficients, and h, g are low- and high-pass filters, respectively; this process iteratively splits the signal for scalable decoding.^[4]^[16]^[17] The foreground layer, retained at full 300 dpi resolution for sharpness, uses JB2 encoding, a vector quantization technique that builds a dictionary of repeating shapes (e.g., characters) via soft pattern matching and clusters similar instances to reduce redundancy, outperforming JBIG by handling noise and variations in scanned text. JB2 encodes prototypes in the dictionary and references their positions and refinements, with bit-rate estimation approximating the total as the sum of dictionary overhead plus per-instance coding: roughly 1-2 bits per pixel for sparse text, scaling down with shape reuse. For black-and-white pages without color elements, JB2 or JBIG2 can be applied directly, leveraging JBIG2's lossless bi-level compression for the entire image. The mask layer is compressed using JB2, a lossless bi-level compression method that uses pattern matching to efficiently encode shapes and positions.^[4]^[5]^[16] DjVu supports both lossy and lossless modes, with IW44 providing lossy compression on the background for aggressive size reduction (e.g., tuning quantization levels to balance PSNR above 30 dB) while JB2 and the mask remain lossless to maintain text legibility. This trade-off yields files typically 3-10 times smaller than PDF for black-and-white scans and 5-10 times smaller than JPEG for color pages at comparable quality, as demonstrated in benchmarks where a 300 dpi color magazine page compresses to 40-70 KB versus 300-500 KB for JPEG. Additionally, an optional hidden text layer from OCR is embedded losslessly, adding negligible size (under 5% for dense text) while enabling searchability without altering the visual compression. 
Progressive refinement in IW44 further enhances usability by prioritizing text and low-frequency details for rapid initial rendering.^[4]^[5]^[16]
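The multi-resolution decomposition described above can be illustrated with a short sketch. It uses the two-tap Haar filters purely for brevity; this is a stand-in, not the Deslauriers-Dubuc wavelets the text attributes to IW44, and the function names are illustrative. Samples past the end of the signal are simply dropped.

```python
import math

def analysis_step(c, h, g):
    """One level: c_{j+1,k} = sum_m h[m-2k] c_{j,m}, and likewise d with g."""
    approx, detail = [], []
    for k in range(len(c) // 2):
        a = d = 0.0
        for i in range(len(h)):      # i plays the role of m - 2k
            m = 2 * k + i
            if m < len(c):           # drop taps that fall past the signal end
                a += h[i] * c[m]
                d += g[i] * c[m]
        approx.append(a)
        detail.append(d)
    return approx, detail

def decompose(signal, levels, h, g):
    """Repeatedly split the approximation; returns (coarsest approx, details per level)."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = analysis_step(approx, h, g)
        details.append(d)
    return approx, details

S = 1 / math.sqrt(2)
HAAR_H = [S, S]    # low-pass filter (assumption: Haar, not the IW44 filter)
HAAR_G = [S, -S]   # high-pass filter

coarse, details = decompose([4.0, 6.0, 10.0, 12.0], 2, HAAR_H, HAAR_G)
print(coarse)    # approximately [16.0]
print(details)   # approximately [[-1.414, -1.414], [-6.0]]
```

A progressive decoder can show the coarse approximation first and add each detail level as it arrives, which is the behavior the text describes for background refinement.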

File Components and Structure

DjVu files employ a modular, chunk-based structure derived from the Interchange File Format (IFF) specification, enabling efficient storage and parsing of document components. Each file begins with a four-byte preamble identifier "AT&T" (hex: 41 54 26 54), followed by IFF-style chunks that encapsulate various data elements. These chunks are self-contained units, each starting with a four-character chunk ID, a four-byte length field (big-endian), and the payload data, padded to an even byte boundary if necessary. This design supports extensibility, as decoders are required to ignore unrecognized chunks, allowing backward compatibility across versions.^[6]^[18] Single-page DjVu files are organized as a FORM:DJVU container, which holds the primary image data and metadata. The foundational INFO chunk, mandatory and positioned first, specifies essential properties such as image width and height in pixels, resolution in dots per inch (typically 300 dpi), and the DjVu version number (e.g., 26 for modern files). Subsequent chunks include Sjbz for the shape data and mask, compressed using JB2 bi-level encoding to represent text and line art; FGbz for the foreground color data using JB2; BG44 for the background layer, using IW44 wavelet compression for continuous-tone images; and optionally shared JB2 dictionaries. For photographic content, IW44 chunks can directly encode full images, while extensibility allows alternatives like JPEG via BGjp or JBIG2 via INCL-referenced external data. Text layers for searchability are stored in separate TXTz chunks (BZZ-compressed UTF-8 text), often referenced by an INCL chunk within the main file. Thumbnails are managed in THUM chunks containing scaled-down IW44 images.^[6]^[19] Multi-page documents use a FORM:DJVM container, which bundles pages via a DIRM (directory) chunk listing offsets and identifiers for each page's DJVU sub-form. 
The DJVM also includes shared components like global color tables (COLR) or shape dictionaries (Djbz) to reduce redundancy across pages. For large files, an indirect mode separates the document into an index file (ending in .djv or .djvu) and individual page files, linked by file paths in the DIRM; this facilitates streaming and partial loading over networks. Navigation elements, such as a table of contents or hyperlinks, are encapsulated in ANTa chunks (uncompressed annotations) or ANTz (BZZ-compressed), supporting outlines and interactive metadata without built-in encryption; security relies on external file permissions or metadata tags.^[6]^[20]

The format's version history spans from DjVu version 2 in 1999, which introduced multi-page support and indirect storage, to version 3 in 2005, adding progressive loading capabilities through ordered chunk sequences. Chunk ordering is critical for streaming: INFO precedes visual data, with BG44 chunks arranged in a pyramid for progressive refinement, so that coarse low-resolution previews load first, followed by finer details. This structure prioritizes file size management, enabling documents to stream efficiently while maintaining modularity for tools like djvudump, which parses and displays the hierarchical chunk layout.^[6]^[21]
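The chunk layout described in this section can be walked with a few lines of code. The following is a minimal sketch based only on the rules stated above (a four-byte "AT&T" preamble, four-character IDs, big-endian four-byte lengths, padding to an even boundary). The helper names are hypothetical, and the sample bytes are synthetic placeholders rather than a decodable DjVu image.

```python
import struct

def iter_chunks(data, start, end):
    """Yield (chunk_id, payload_offset, length) for each chunk in data[start:end]."""
    pos = start
    while pos + 8 <= end:
        chunk_id = data[pos:pos + 4].decode("ascii")
        (length,) = struct.unpack(">I", data[pos + 4:pos + 8])  # big-endian length
        yield chunk_id, pos + 8, length
        pos += 8 + length + (length & 1)  # skip payload plus one pad byte if odd

def list_chunks(data):
    """Return (form_type, [(chunk_id, length), ...]) for the top-level FORM."""
    if data[:4] != b"AT&T":
        raise ValueError("missing 'AT&T' preamble")
    form_id, payload, length = next(iter_chunks(data, 4, len(data)))
    if form_id != "FORM":
        raise ValueError("expected a FORM container")
    form_type = data[payload:payload + 4].decode("ascii")
    children = [(cid, ln) for cid, _, ln in
                iter_chunks(data, payload + 4, payload + length)]
    return form_type, children

def make_chunk(chunk_id, body):
    """Build a chunk for the synthetic example (test helper, not part of any API)."""
    pad = b"\x00" if len(body) % 2 else b""
    return chunk_id + struct.pack(">I", len(body)) + body + pad

body = (b"DJVU" + make_chunk(b"INFO", bytes(10))
        + make_chunk(b"Sjbz", b"abc") + make_chunk(b"BG44", b"xy"))
sample = b"AT&T" + make_chunk(b"FORM", body)

print(list_chunks(sample))
# ('DJVU', [('INFO', 10), ('Sjbz', 3), ('BG44', 2)])
```

Because the walker only reads IDs and lengths, it skips chunks it does not recognize, which mirrors the requirement that decoders ignore unknown chunks.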

Features

Document Handling and Rendering

DjVu documents are rendered by decomposing each page into a background layer for images and textures, a foreground layer for text and line drawings, and a mask to determine where the foreground is applied over the background. This separation enables efficient rendering of anti-aliased text superimposed on complex backgrounds, preserving sharp edges for text while handling photographic or patterned elements separately.^[21]^[6] The foreground mask, typically encoded using JB2 compression for bilevel content, guides the stenciling process, allowing viewers to composite layers at the desired resolution.^[21] For fast previews at low zoom levels, DjVu employs sub-sampling, where the background layer is often reduced by a factor of 3 and the foreground by up to 12, enabling quick initial display without full decompression. Multi-resolution pyramids, implemented via the IW44 wavelet codec for continuous-tone layers, support progressive refinement: the background is displayed first as a low-resolution approximation, followed by successive details as data is decoded. This approach minimizes memory usage and allows rapid loading, with the full image refining in place.^[21]^[16] Zoom and navigation in DjVu support continuous magnification up to 999% without quality loss, thanks to the format's retention of high-resolution source data (typically from 300 DPI scans). Panning is achieved through smooth viewport adjustments, and page rotation (0°, 90°, 180°, or 270°) is handled by reorienting the rendered layers. Interactivity includes hyperlinks embedded in annotation chunks, which activate on clicks within designated zones, and bookmarks derived from the document outline for hierarchical navigation across multi-page files.^[22]^[6] DjVu supports 24-bit color for backgrounds and foregrounds using IW44 encoding, while grayscale scans are managed through bilevel or wavelet layers optimized for tonal variations. 
Progressive display prioritizes the text mask for immediate legibility, filling in background imagery and finer details as they become available, which enhances perceived responsiveness on slower connections. For printing, DjVu enables high-fidelity output matching the original scan resolution, typically 300 DPI, by scaling layers appropriately during conversion to PostScript or other formats.^[23]^[21]^[16] Unlike vector-based formats, DjVu lacks native support for interactive forms or scripting, concentrating instead on static representation of scanned content to ensure consistent rendering across devices. Searchability via hidden text layers can enhance navigation but is secondary to visual rendering.^[6]^[16]
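The stenciling step described in this section can be sketched as follows. This is an illustration under simplifying assumptions, not how any particular viewer is implemented: layers are plain nested lists of RGB tuples, the low-resolution layers are enlarged with nearest-neighbour replication (real viewers would likely interpolate more smoothly), and a subsampling factor of 2 is used only to keep the example small.

```python
WHITE, GREY, BLACK = (255, 255, 255), (128, 128, 128), (0, 0, 0)

def upsample(layer, factor):
    """Nearest-neighbour enlargement of a low-resolution layer (a simplification)."""
    out = []
    for row in layer:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def composite(mask, foreground, background):
    """Where the mask is set, take the foreground pixel; elsewhere the background."""
    return [
        [fg if m else bg for m, fg, bg in zip(mask_row, fg_row, bg_row)]
        for mask_row, fg_row, bg_row in zip(mask, foreground, background)
    ]

mask = [[0, 1, 1, 0],            # full-resolution bilevel mask (1 = ink)
        [0, 0, 1, 0]]
background = [[WHITE, GREY]]     # subsampled by 2 in each direction
foreground = [[BLACK, BLACK]]    # subsampled by 2 in each direction

page = composite(mask, upsample(foreground, 2), upsample(background, 2))
for row in page:
    print(row)
```

The mask stays at full resolution, so text edges remain sharp even though both color layers are stored at a fraction of the page resolution.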

Searchability and Metadata

DjVu documents support searchability through an optional hidden text layer, which consists of OCR-generated Unicode text stored in BZZ-compressed TXTz chunks (or uncompressed TXTa chunks). This layer includes positional information, such as coordinates for columns, regions, paragraphs, lines, and words, enabling precise text selection and highlighting during searches. The text is encoded in UTF-8 format, prefixed by a 24-bit length integer, and compressed using the BZZ algorithm in TXTz chunks to maintain file efficiency while preserving alignment with the image content.^[21]

Full-text indexing is facilitated by this hidden layer, allowing users to search across documents with support for diacritics and multiple languages, as the Unicode encoding handles international characters. DjVu viewers can perform in-document searches, copying, and pasting operations directly from the text layer, while server-side implementations enable broader indexing using tools like Perl scripts for integration with search engines. For large collections, the extractable text supports advanced processing, including word-level selection and compatibility with indexing frameworks such as Apache Solr or Lucene, where positional data aids in relevance ranking and result highlighting.^[21]^[5]^[24]

Metadata in DjVu files is embedded via ANTa or ANTz chunks, which store key-value pairs in UTF-8 format (e.g., creator information, page labels, and document properties like title or author) using a BibTeX-inspired syntax. These chunks can be shared across pages through INCL chunks, which reference included files (e.g., FORM:DJVI structures) containing annotations or dictionaries, ensuring consistent metadata application in multi-page documents. Annotations, including hyperlinks and viewer settings (such as initial zoom or page mode), are also housed in these chunks using a parenthesized notation that defines shapes and actions. 
Additionally, DjVu supports XML-based representations for metadata, hyperlinks, and hidden text editing via tools like djvuxml, allowing conversion to and from XML for enhanced interoperability.^[21]^[25] The hidden text layer enhances accessibility by providing a selectable text overlay compatible with screen readers, enabling audio narration of document content for visually impaired users when supported by the viewer software. Text extraction tools like djvutxt allow for reflowable output, suitable for conversion to e-book formats where layout reflow is needed without relying on the fixed image structure. However, the effectiveness of these features is limited by OCR accuracy, which varies with scan quality—poor resolution or artifacts can introduce errors in the hidden text, and older DjVu files lack mechanisms for automatic OCR updates or corrections.^[26]
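How a viewer might turn the hidden text layer into search highlights can be sketched briefly. The code below assumes an already-decompressed text payload that begins with the 24-bit length (read here as big-endian, which is an assumption) followed by UTF-8 text. The `WordZone` record is a hypothetical, simplified stand-in for the positional data, using character offsets and `(x, y, width, height)` boxes rather than the format's actual zone encoding.

```python
from typing import NamedTuple

class WordZone(NamedTuple):
    """Hypothetical simplified word record; not the on-disk zone layout."""
    start: int    # character offset into the decoded text
    length: int   # number of characters
    box: tuple    # (x, y, width, height) on the page image

def read_text(payload: bytes) -> str:
    """Read the 24-bit length prefix, then decode that many bytes as UTF-8."""
    size = int.from_bytes(payload[:3], "big")  # byte order is an assumption
    raw = payload[3:3 + size]
    if len(raw) != size:
        raise ValueError("text payload is truncated")
    return raw.decode("utf-8")

def find_boxes(text, zones, query):
    """Return the boxes of all words that match the query, ignoring case."""
    wanted = query.casefold()
    return [z.box for z in zones
            if text[z.start:z.start + z.length].casefold() == wanted]

encoded = "D\u00e9j\u00e0 vu d\u00e9j\u00e0".encode("utf-8")
payload = len(encoded).to_bytes(3, "big") + encoded + b"\x01"  # trailing byte ignored

text = read_text(payload)
zones = [WordZone(0, 4, (10, 20, 40, 12)),
         WordZone(5, 2, (55, 20, 20, 12)),
         WordZone(8, 4, (80, 20, 40, 12))]

print(find_boxes(text, zones, "D\u00c9J\u00c0"))
# [(10, 20, 40, 12), (80, 20, 40, 12)]
```

Case folding on the decoded Unicode string is what lets a query match accented words regardless of capitalization, in line with the diacritic support mentioned above.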

Software Support

Viewers and Plugins

DjVu files can be viewed using a variety of open-source and multi-format software applications across desktop, mobile, and web platforms. The primary open-source viewer is DjView4, part of the DjVuLibre project, which provides a cross-platform standalone application built on the Qt toolkit.^[14] DjView4 supports smooth zooming and panning, instant page turning, quick rendering, annotations, and thumbnails for efficient navigation of documents. It is available for Unix-like systems, Windows, and macOS, with the latest version 3.5.29 released on July 3, 2025, including compatibility patches for ARM64 architectures to ensure performance on modern devices like Apple Silicon and ARM-based servers.^[14] For Windows users seeking a lightweight alternative, WinDjView offers a compact viewer with a tabbed interface, continuous scrolling, and advanced printing options, optimized for fast performance without additional dependencies.^[27] Its extended edition maintains these core features while enhancing stability for larger files.^[28] Multi-format readers extend DjVu support to broader document workflows. SumatraPDF, a minimalistic and fast viewer, handles DjVu alongside PDF, EPUB, and other formats, emphasizing low resource usage and quick loading for everyday reading.^[29] STDU Viewer supports over 20 formats including DjVu, PDF, TIFF, and XPS, providing a tabbed interface tailored for technical and scientific documents with features like zooming and multi-page navigation.^[30] On Linux and Unix systems, integrated desktop environments offer native DjVu viewing. 
Okular, the KDE document viewer, includes full DjVu support through the DjVuLibre backend, enabling seamless integration with KDE workflows for rendering, annotations, and format conversion previews.^[31] Evince, the default GNOME document viewer, supports DjVu files natively, allowing users to open and navigate them alongside PDF and other formats with standard tools like search and print.^[32] Mobile support for DjVu is available through dedicated apps. On Android, the DjVu Reader & Viewer app provides fast rendering and touch-based navigation for DjVu documents, with additional PDF compatibility for hybrid libraries.^[33] For iOS, the DjVu Reader app offers comprehensive support for DjVu and PDF, featuring clear rendering of complex layouts and offline access.^[34] Web-based viewing has shifted from deprecated browser plugins (phased out after 2015 due to security concerns) to modern HTML5 and WebAssembly solutions. The DjVu.js library enables client-side decoding for embedding DjVu viewers in web pages, while the DjVu.js Viewer Plus extension for Firefox, updated February 4, 2025, allows direct opening of local and remote .djvu files with tagging and processing capabilities via WebAssembly.^[35]^[36]

Creation and Conversion Tools

DjVuLibre, the primary open-source implementation, includes command-line encoders for creating DjVu files from images. The cjb2 tool encodes bitonal (black-and-white) images using JB2 compression, accepting inputs like PBM or single-page TIFF files and supporting both lossless and limited lossy modes for document pages.^[37] The c44 encoder processes continuous-tone images with IW44 wavelet compression, converting PPM or JPEG files into DjVuPhoto components optimized for photographic or color elements.^[38] Complementing these, the ddjvu decoder extracts pages from DjVu files to image formats such as PNM or TIFF, enabling iterative workflows for testing or further processing.^[39]

Prior to widespread open-sourcing, LizardTech's Document Express suite served as a commercial solution for DjVu creation, particularly for batch scanning and conversion of large document collections into compressed multi-page files. This tool facilitated high-volume production with features like workflow automation for scanned inputs, but it was discontinued in the early 2000s following LizardTech's acquisition by Celartem Technology in 2003, shifting reliance to open-source alternatives.^[40]

Among open tools, the Any2DjVu service offers a web-based converter that transforms PDF, TIFF, JPEG, PNM, and similar formats into DjVu without local software installation, handling segmentation and compression server-side for user-uploaded files.^[41] ImageMagick provides DjVu decoding support via a delegate to DjVuLibre, allowing command-line conversion of DjVu files to raster formats like PNG or JPEG for integration into broader image pipelines.^[42]

Typical conversion workflows start with TIFF or PDF scans, where OCR engines like Tesseract generate hidden text layers for searchability, followed by DjVuLibre encoders to compress and assemble multi-page documents.^[43] Batch processing for books often involves segmenting images into foreground masks and backgrounds with tools like csepdjvu, embedding OCR output, and bundling pages via djvm for efficient digitization of archives.^[39] However, the absence of an official high-end encoder post-open-sourcing means community tools remain slower for production-scale tasks compared to optimized PDF workflows, often requiring manual optimization for large volumes.^[44] DjVu creation is also supported in ebook management software like Calibre, and it is commonly used in digital archives such as the Internet Archive for scanned book collections.^[45]^[46]
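A batch workflow of the kind described above might be scripted roughly as follows. This sketch covers only the simplest case, bitonal pages encoded with cjb2 and bundled with djvm, and leaves out OCR and color segmentation. The file names are hypothetical, and the `-dpi` and `-c` options are used as documented in the DjVuLibre manual pages; check them against the installed version before relying on this.

```python
import shutil
import subprocess

def build_commands(bitonal_pages, out_djvu, dpi=300):
    """Return argv lists: one cjb2 call per page, then one djvm call to bundle."""
    commands, page_files = [], []
    for index, pbm in enumerate(bitonal_pages, start=1):
        page = f"page{index:04d}.djvu"          # hypothetical intermediate name
        commands.append(["cjb2", "-dpi", str(dpi), pbm, page])
        page_files.append(page)
    commands.append(["djvm", "-c", out_djvu, *page_files])
    return commands

def run_all(commands):
    """Execute the commands in order; requires the DjVuLibre tools on PATH."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} not found; is DjVuLibre installed?")
        subprocess.run(argv, check=True)

cmds = build_commands(["scan1.pbm", "scan2.pbm"], "book.djvu")
for argv in cmds:
    print(" ".join(argv))
# cjb2 -dpi 300 scan1.pbm page0001.djvu
# cjb2 -dpi 300 scan2.pbm page0002.djvu
# djvm -c book.djvu page0001.djvu page0002.djvu
```

Building the command lists separately from running them makes the plan easy to inspect or log before any files are written; calling `run_all(cmds)` would then perform the conversion.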

Adoption

Early and Institutional Use

The Internet Archive became a prominent early adopter of the DjVu format, utilizing it extensively from 2005 to 2016 for storing millions of scanned books and documents. This approach allowed for significantly smaller file sizes compared to alternatives like PDF, facilitating easier downloads and access for users with limited bandwidth. By 2011, the Internet Archive had digitized over 3 million texts, many of which were available in DjVu to support efficient web distribution and preservation of public domain works.^[47]^[48]^[49]

Several major libraries and educational institutions tested and implemented DjVu for digitization projects during the 2000s, particularly in Europe where it was employed for archiving public domain materials. The format's advanced compression techniques enabled high-quality rendering of scanned documents at low file sizes, making it suitable for institutional repositories focused on heritage collections. For instance, national archives in Europe, such as those participating in IIIF initiatives, relied on DjVu for extensive image storage due to its efficiency in handling large volumes of scanned content.^[50]^[51]

In publishing and document scanning, DjVu saw integration with tools from companies like LizardTech, which developed software for creating compressed files from scanned originals, including support for hardware from Xerox and Canon scanners in early workflows. Web distribution benefited from browser plugins, enabling previews in early search engines such as Yahoo, where users could view DjVu files directly if the plugin was installed.^[50]

DjVu reached peak adoption in the 2000s, driven by its superior performance in the dial-up and early broadband era. The format offered 5-10 times better compression than JPEG for color documents, reducing file sizes to 40-80 KB per 300 dpi page and enabling faster loading over low-bandwidth connections. 
This bandwidth efficiency was crucial for institutions distributing large volumes of content without overwhelming users' limited internet capabilities.^[50]

Current Applications and Challenges

In niche applications, DjVu remains valued for data archiving, particularly among communities focused on preserving scanned documents due to its efficient compression for large collections. For instance, it is employed in academic repositories for handling mathematics papers and other scanned scholarly materials where high-quality image retention is essential alongside text layers. The format's small file sizes make it suitable for low-bandwidth environments, supporting open access initiatives in developing countries by enabling easier distribution of digitized resources without substantial infrastructure demands.^[1]^[52] Modern projects continue to leverage DjVu for established digital collections. This ongoing use underscores DjVu's role in institutional preservation efforts where legacy compatibility and compression efficiency are prioritized over newer standards, as of 2025.^[53] Despite these applications, DjVu faces significant challenges to broader adoption. The lack of native support in mainstream mobile operating systems and apps limits accessibility on portable devices, requiring third-party viewers that may not integrate seamlessly with popular ecosystems. PDF's dominance, bolstered by Adobe's widespread tools and compatibility, has overshadowed DjVu, as evidenced by the Internet Archive's 2016 decision to cease generating DjVu files for new uploads due to declining usage and creation errors. These factors contribute to DjVu's stable but declining presence in digital libraries as of 2025.^[33]^[48] DjVu retains advantages in specific contexts, such as being 2-5 times smaller than equivalent PDF files for scanned documents, which supports compact storage in fields like digital forensics for evidence preservation. Looking ahead, potential revival may come through community forks enhancing OCR capabilities, potentially integrating AI for improved text extraction from scans. 
Browser-based implementations, including JavaScript libraries, hint at future WebAssembly compatibility to revive in-browser viewing, with comparisons affirming DjVu's superiority for black-and-white documents over PDF in compression and rendering fidelity.^[54]^[55]^[56]^[57]

Licensing

Proprietary Origins

DjVu's proprietary origins trace back to its development at AT&T Laboratories, where key patents were filed between 1997 and 1999 to protect the core compression technologies. These patents primarily covered wavelet-based methods for handling partially masked images, such as the iterative encoding process that separates foreground and background layers for efficient compression of scanned documents. A seminal example is U.S. Patent No. 6,058,214, filed in January 1998 and issued in May 2000 to inventors Léon Bottou and Steven Pigeon, assigned to AT&T Corp., which detailed a multi-stage wavelet coding technique that identifies and cancels masked coefficients to achieve high-fidelity reconstruction with reduced computational overhead. Additionally, the JB2 method—a bi-level image compression scheme derived from AT&T's proposal to the JBIG2 standard—received patent protection during this period, enabling shape dictionary-based encoding for text and line art with superior ratios over traditional methods like G4 or MMR. AT&T retained exclusive rights to these technologies until commercialization, prohibiting commercial exploitation without authorization while permitting academic and research use under limited non-exclusive terms.^[58]^[12] In March 2000, LizardTech, Inc., a Seattle-based compression specialist, acquired the commercial rights to DjVu from AT&T, transitioning the technology into a proprietary licensing model from 2000 to 2001. Under this framework, LizardTech offered encoders and viewers for DjVu files through paid licenses tailored to enterprise users, particularly in digital publishing and archiving sectors, where fees were structured based on usage volume and integration needs to monetize the format's advantages in file size and web delivery. 
For instance, publishers could license the full suite—including the proprietary foreground-background separation algorithm—for creating compressed document libraries, with pricing reflecting the technology's edge in handling high-resolution scans at low bandwidths. The patents granted broad coverage to these core elements, such as wavelet decomposition (IW44) and JB2 encoding, but explicitly allowed non-commercial research applications without royalties, fostering initial experimentation while reserving revenue streams for licensed commercial deployments. This model emphasized enterprise adoption, with LizardTech providing SDKs and plugins for seamless integration into workflows like newspaper digitization.^[59]^[12] During the early 2000s, DjVu faced minor legal challenges stemming from perceived overlaps between its wavelet compression techniques and the emerging JPEG 2000 standard, particularly in tile-based processing and lossless modes. A notable case was LizardTech, Inc. v. Earth Resource Mapping, Inc. (2004), where a U.S. district court invalidated parts of LizardTech's related wavelet patents (e.g., U.S. Patent No. 5,710,835) for insufficient written description, amid arguments that the claims encroached on JPEG 2000's domain and lacked specificity for software implementations. These disputes highlighted tensions in patent scope for compression algorithms but did not directly halt DjVu's development, as the core AT&T patents remained intact for document-specific applications. Ultimately, declining sales of proprietary DjVu tools, pressured by competition from established formats like PDF, prompted LizardTech to initiate an open-source release in late 2000 to bolster the ecosystem's longevity.^[60]

Open Source Evolution

DjVuLibre, the primary open-source implementation of the DjVu format, was launched in 2002 by Léon Bottou and collaborators as an enhanced implementation of LizardTech's GPL-licensed DjVu Reference Library version 3.5, ensuring full compatibility while improving portability and build ease.^[61] This release was distributed under the GNU General Public License version 2 (GPL v2), accompanied by a royalty-free patent grant from LizardTech—the company holding rights to the original authors' patents—allowing users to create, use, and distribute derived works without patent infringement concerns related to core DjVu technologies like the ZP-coder and IW44 wavelet encoder.^[62] The project's evolution includes several key releases building on the 2001 baseline of version 3.5, with version 3.5.6 in 2002 addressing critical bug fixes and stability improvements for broader platform support.^[63] Maintenance continued steadily, reaching version 3.5.28 in 2020 with optimizations for modern systems, followed by version 3.5.29 in July 2025 incorporating security patches to mitigate vulnerabilities in decoding routines.^[14] These updates reflect ongoing efforts to preserve the format's viability amid evolving software ecosystems. 
Hosted primarily on SourceForge since its inception, with mirrors on GitHub for collaborative development, DjVuLibre has benefited from community contributions such as Unicode text handling for international documents and ports to mobile platforms such as Android.^[64] Related projects include DjvuNet, a cross-platform, fully managed .NET library for working with DjVu documents, initiated in the late 2010s and actively developed through the 2020s.^[65]
Under GPL v2, DjVuLibre permits commercial use provided that distributed derivative works are released under the same license with their source code available. The underlying patents, which covered essential DjVu algorithms, expired in the 2010s under standard 20-year terms, eliminating any remaining royalty obligations.^[62]^[66] This open-source framework has enabled free viewers, encoders, and utilities, sustaining DjVu's niche role in scanned document archiving and distribution despite its low mainstream adoption compared with formats like PDF.^[67]

References

[1]
What is DjVu - DjVu.org
DjVu (pronounced 'déjà vu') is a digital document format with advanced compression technology and high performance value.
[2]
[PDF] DjVu: a Compression Method for Distributing Scanned Documents ...
Abstract. We present a new image compression technique called "DjVu" that is specifically geared towards the compression of scanned documents in color at ...
[3]
Yann's DjVu Page
DjVu (pronounced déjà vu) is an image compression technique, a file format, and a software platform, designed to bring paper documents and high resolution photos ...
[4]
[PDF] High Quality Document Image Compression with DjVu - Leon Bottou
Jul 13, 1998 · Abstract. We present a new image compression technique called "DjVu" that is specifically geared towards the compression of high-resolution ...
[5]
DjVuLibre: Open Source DjVu library and viewer - SourceForge
For scanned document, DjVu file sizes are typically 3 to 10 times smaller than TIFF or PDF in black and white, and 5 to 10 times smaller than JPEG in color. A ...
[6]
DjVu Reference - SnDjVu
DjVu documents can span more than one page. There are two multi-page formats available: bundled (single file) and indirect (separate file for each page). See § ...
[7]
The ATT-LizardTech DjVu deal signature in Seattle, March 2000
In March 2000, AT&T sold the DjVu technology to LizardTech Inc., a move that proved disastrous for the dissemination of DjVu. AT&T CTO Dave Nagel signed to deal ...
[8]
[PDF] Document Imaging Report - Info-source.com
Its DjVu technology was first released by AT&T Labs in the late 1990s. AT&T sold the technology to Seattle-based compression specialist LizardTech in 2000 ...
[9]
DjVu Offers Alternative to Adobe PDF, JPEG, and GIF Files
"DjVu Solo makes it easy to convert small quantities of paper into our DjVu format. Customers can input directly from their scanners and add OCR and hyperlinks ...
[10]
[PDF] The DjVu complete solution - DjVu++
DjVu Products. Here are some companies that offer DjVu products and services. • LizardTech, a leading developer of imaging solutions for Internet applications ...
[11]
[PDF] Patent problems likely to increase - ALAIR - American Library ...
For the many digital library projects that use MrSID and DjVu images, the acquisition of LizardTech, Inc., by Celartem Technology USA in August is both good ...
[12]
LizardTech - DjVu Reference Library Licensing Statement
When LizardTech acquired the DjVu technology from AT&T in March of this year, we wanted to broaden access to DjVu. To further efforts to make DjVu the standard ...
[13]
FreshPorts -- graphics/djvulibre: DjVu base libraries and utilities
LizardTech released the reference implementation of DjVu under the GNU GPL in October 2000. DjVuLibre (which means free DjVu), is an enhanced version of that ...
[14]
DjVuLibre: Open Source DjVu library and viewer
DjVu is a web-centric format and software platform for distributing documents and images. DjVu can advantageously replace PDF, PS, TIFF, JPEG, and GIF for ...
[15]
DjVuLibre download | SourceForge.net
DjVu is a web-centric format for distributing documents and images. DjVu was created at AT&T Labs-Research and later sold to LizardTech Inc.
[16]
Yann's DjVu Page
To compress black and white images, DjVu uses a technique called JB2 that attempts to find repeating shapes on the page (such as multiple occurrences of a ...
[17]
DjVu: a Compression Method for Distributing Scanned Documents ...
With DjVu, scanned pages at 300dpi in full color can be compressed down to 30 to 60 KB files from 25 MB originals with excellent quality.
[18]
DJVU File Format
DjVu can achieve compression ratios about 5 – 10 better than existing methods such as JPEG & GIF for colour documents and 3 – 8 times better than TIFF in black ...
[19]
DJVUDUMP - DjVuLibre
Nov 10, 2001 · Each line contains a chunk ID followed by the chunk size. Lines are indented in order to reflect the hierarchical structure of the IFF ...
[20]
DJVM - DjVuLibre
Nov 10, 2001 · This program creates or modifies a bundled multi-page DjVu document. Multi-page bundled documents can be used directly or converted to indirect ...
[21]
djvu3changes.txt
The JB2 data compression model uses the soft pattern matching technique, which essentially consists of encoding each character by describing how it differs from ...
[22]
DJVIEW4
https://djvu.sourceforge.net/doc/man/djview4.html
[23]
DJVUPS
[24]
Efficient Search in Hidden Text of Large DjVu Documents
It allows in particular to convert to DjVu the PDF output of popular OCR programs like FineReader preserving the hidden text layer and some other features.
[25]
DJVUXML - DjVuLibre
Nov 15, 2002 · The DjVuLibre XML Tools provide for editing the metadata, hyperlinks and hidden text associated with DjVu files. Unlike djvused(1) the ...
[26]
Efficient Search in Hidden Text of Large DjVu Documents
Aug 7, 2025 · It allows in particular to convert to DjVu the PDF output of popular OCR programs like FineReader preserving the hidden text layer and some ...
[27]
WinDjView » Homepage - SourceForge
WinDjView is a fast, compact and powerful DjVu viewer for Windows with tabbed interface, continuous scrolling and advanced printing options.
[28]
WinDjView Extended download | SourceForge.net
Nov 1, 2025 · WinDjView is a fast, compact and powerful DjVu viewer for Windows with tabbed interface, continuous scrolling and advanced printing options.
[29]
Supported document formats - Sumatra PDF
DjVu (.djv, .djvu); Microsoft Compiled HTML Html (.chm); XPS (.xps, .oxps ... HEIF support. Ver 3.4+: SumatraPDF can open HEIF images but only if ...
[30]
STDU Viewer - Download
This utility allows users to open and view various file formats, including PDF, TIFF, DjVu, XPS, and more. Its user-friendly interface makes it accessible for ...
[31]
Okular - The Universal Document Viewer
Support for Many Formats. Okular supports many formats, including PDF, EPub, DjVU and MD for documents; JPEG, PNG, GIF, Tiff, WebP for images; CBR and CBZ ...
[32]
Apps/Evince – GNOME Wiki Archive
Evince is specifically designed to support the file following formats: PDF, Postscript, djvu, tiff, dvi, XPS, SyncTex support with gedit, comics books (cbr ...
[33]
https://play.google.com/store/apps/details?id=ru.androidtools.djvureaderdocviewer
[34]
DjVu Reader & Viewer - App Store - Apple
Jul 19, 2025 · DjVu Reader is the modern, elegant solution for reading technical and academic DjVu documents on your iOS device.
[35]
RussCoder/djvujs: DjVu.js is a program library for working ... - GitHub
DjVu.js is a program library for working with .djvu files online without any connection with the server. DjVu.js Viewer is a widget that allows viewing ...
[36]
DjVu.js Viewer Plus – Get this Extension for Firefox (en-US)
Feb 4, 2025 · Download DjVu.js Viewer Plus for Firefox. Opens links to .djvu files. Allows opening files from a local disk. Processes & tags.
[37]
CJB2 - DjVuLibre
Nov 10, 2001 · This is a simple encoder for bitonal files. Argument inputfile is the name of a PBM or bitonal TIFF file containing a single document image.
[38]
C44 - DjVuLibre
Nov 10, 2001 · The main design objective for the DjVu wavelets consisted of allowing progressive rendering and smooth scrolling of large images with limited ...
[39]
DjVuLibre: Open Source DjVu library and viewer
[40]
Any2Djvu: The Free On Line DjVu Conversion Service - DjVu.org
GZ, PDF, TIFF, JPEG, PNM, and other formats that may be converted to DjVu. The Any2DjVu server will handle all the details of conversion. Any2DjVu is maintained ...
[41]
Image Formats - ImageMagick
Used to support embedded images in compound formats like WMF. DJVU, R. DMR, RW, Digital media repository, Requires the MagickCache delegate library. Supported ...
[42]
intranda/goobi-workflow - GitHub
Support for different image file formats (e. g. TIF, JPEG, JPEG 2000, PNG, DjVu); OCR integration using ABBYY Finereader SDK, Tesseract and ABBYY Recognition ...
[43]
Features - DjVuLibre: Open Source DjVu library and viewer
Here is a non exhaustive list of the commands included with DjVuLibre: c44: a wavelet-based continuous-tone image encoder (à la JPEG-2000). cjb2: single page ...
[44]
3 Million Texts for Free | Internet Archive Blogs
Sep 17, 2011 · Our 3 millionth text is a Galileo pamphlet from the rare book collection of the University of Toronto. Internet Archive has been scanning books ...
[45]
djvu files for new uploads - Internet Archive Forums
Feb 26, 2016 · The Internet Archive will soon stop creating DJVU files for uploaded text files. The reasons for this are declining use, errors in the creation of new files.
[46]
Help:Internet Archive - Wikisource, the free online library
This page in a nutshell: DjVu files of scanned books can be uploaded from the Internet Archive. If the book you want to add to Wikisource is available, follow ...
[47]
(PDF) Electronic Document Publishing Using DjVu - ResearchGate
Aug 7, 2025 · Originally developed for scanned color documents, the DjVu technology was recently expanded to electronic documents. The small file sizes and ...
[48]
Paper 16 — IIIF | International Image Interoperability Framework
The IIIF-server accesses images both in JPEG- and DjVU-format. The DjVU-format has been in extensive use within the National Archives but because the format ...
[49]
DJVU file format - DjVu Image - File-Extensions.com
History and Development. The DjVu file format was initially developed by Yann LeCun, Léon Bottou, Patrick Haffner, and Paul G. Howard at AT&T Labs from 1996 ...
[50]
Category:Gallica scans - Wikimedia Commons
Jan 19, 2017 · Français : Collections électroniques d'images de la Bibliothèque nationale de France ... djvu 4,267 × 5,824, 644 pages; 37.97 MB. Apus and Octans ...
[51]
PDF vs DjVu benefits and disadvantages - Ebooks Stack Exchange
Mar 4, 2016 · DjVu uses the same JBIG2 for bi-level (black and white) images. For colors PDF can use JPX (JPEG 2000 Part 2), while DjVu is using IW44 (Image ...
[52]
Wikisource:DjVu vs. PDF
Mar 27, 2025 · For historical reasons, DjVu files are still preferred on Wikisource but either DjVus or PDFs can be used and there are advantages to both.
[53]
jwilk-archive/ocrodjvu: OCR for DjVu - GitHub
Oct 3, 2022 · Overview. ocrodjvu is a wrapper for OCR systems that allows you to perform OCR on DjVu files.
[54]
Open and view DjVu online: DjVu.js
DjVu.js is a program library for working with .djvu files online. It's written in JavaScript and can be run in a web browser without any connection ...
[55]
https://patents.google.com/patent/US6058214A/en
[56]
Software and Patent Scope: A Report from the Middle Innings
Aug 10, 2025 · I then look in detail at the recent LizardTech case, which applied the written description requirement to a software patent. This serves as an ...
[57]
[PDF] Software and Patent Scope: A Report from the Middle Innings
I then look in detail at the recent LizardTech case, which applied the written description requirement to a software patent. This serves as an interesting case ...
[58]
Credits - DjVuLibre: Open Source DjVu library and viewer
In 1999, AT&T released v2.0 of the DjVu Reference Library (mostly written by Leon Bottou) under some sort of complicated open source license (AT&T lawyers would ...
[59]
License - DjVuLibre: Open Source DjVu library and viewer
The DjVu Reference Library 3.5 was released by Lizardtech under the GNU General Public License version 2. DjVuLibre-3.5 was developed by Leon Bottou and ...
[60]
DjVuLibre - SourceForge
Nov 10, 2001 · The DjVu format has an official MIME type of image/vnd.djvu, which is the preferred content-type to be given by http servers for DjVu files.
[61]
djvulibre sourceforge repo clone, plus Debian packaging info - GitHub
Populate the zlib, jpeg and tiff directories with the open source code suggested in the README files.
[62]
DjvuNet is a cross platform fully managed .NET library for ... - GitHub
DjvuNet is a cross platform fully managed .NET library for working with Djvu documents which can run on Linux, macOS and Windows.
[63]
DjVuLibre Help - Patents - SourceForge
Sep 8, 2019 · Leon Bottou kindly informed me in a private mail of 6 Aug 2020: The most impactful patent on djvulibre was the zcoder because you cannot produce ...
[64]
[PDF] Scanned publications in digital libraries: new Open Source DjVu tools
Oct 5, 2012 · ... Open Source DjVu tools. Why DjVu? The legal status of the DjVu technology ... 4 freedoms. (http://www.gnu.org/philosophy/free-sw.html): The ...