BagIt

BagIt is a set of hierarchical file layout conventions designed to support the storage and transfer of arbitrary digital content, consisting of a "bag" that packages payload files alongside descriptive metadata tags without requiring knowledge of the content's semantics.^[1] A bag is structured as a base directory containing a required "data" subdirectory for the payload—treated as opaque octet streams—and tag files that include a "bagit.txt" file declaring the BagIt version, one or more manifest files listing payload file paths with cryptographic checksums (such as MD5 or SHA-256) for integrity verification, and optionally a "bag-info.txt" file with basic metadata like contact information or creation date.^[1] This format enables reliable validation of the bag's completeness and authenticity through checksum comparisons, supports direct access to individual files without unpacking the entire structure, and imposes no limits on file sizes or counts, making it suitable for large-scale digital preservation and data exchange.^[1] Originally developed by the Library of Congress in collaboration with the California Digital Library around 2007 to facilitate the transfer of born-digital materials, BagIt evolved through community contributions and was formalized as version 1.0 in RFC 8493 by the Internet Engineering Task Force (IETF) in October 2018.^[2] It has been widely adopted by cultural heritage institutions, research repositories, and data archives for packaging diverse content types, including digitized collections, research datasets, and software distributions, often integrated with tools like the open-source Bagger application for creation and validation.^[2] Extensions such as BagIt Profiles allow for customizable requirements on metadata and structure to meet specific institutional or community needs, enhancing interoperability while maintaining the core specification's simplicity and extensibility.^[3]

Introduction

Definition and Purpose

BagIt is a set of hierarchical file layout conventions designed to support disk-based storage and network transfer of arbitrary digital content.^[1] It provides a simple, standardized way to package digital files without imposing restrictions on the types or formats of the content included, making it applicable to a wide range of data such as documents, images, audio, video, and software artifacts.^[1] The core purpose of BagIt is to enable the secure packaging of payload files— the primary digital content—along with integrity checks that prevent data loss or corruption during transfer or long-term archiving.^[1] A "bag" serves as a self-describing container that bundles both the payload and descriptive metadata, known as tags, allowing recipients to verify the completeness and accuracy of the package without requiring specialized knowledge of the payload's semantics or structure.^[1] This approach ensures that the bag can be handled by generic tools, promoting interoperability across systems and institutions. In the context of digital preservation, BagIt addresses key challenges by focusing on bit-level integrity, where the exact reproduction of digital objects is paramount to maintaining their authenticity over time.^[1] By incorporating mechanisms for checksum-based validation, it allows for straightforward detection of alterations or errors, reducing the risk of silent data corruption in archival environments without necessitating complex proprietary software.^[1]

Key Components

A BagIt package is structured around four primary components that facilitate the secure packaging, transfer, and verification of digital content: the payload, tag files, manifest files, and the optional fetch file. These elements form a self-describing hierarchy that prioritizes data integrity without imposing restrictions on the content type or format.^[1]

The payload represents the core content of the bag, consisting of all files and subdirectories within the data/ directory at the root of the package. This component holds the arbitrary digital objects (such as documents, images, videos, or datasets) that are the primary focus for preservation, archiving, or transmission. By isolating the payload in this dedicated directory, BagIt ensures that the actual data remains distinct from descriptive or verification metadata, simplifying management and validation processes.^[1]

Tag files serve as metadata descriptors located directly in the bag's root directory, providing contextual information about the package as a whole. The mandatory bagit.txt file declares the BagIt version (e.g., 1.0) and the character encoding (typically UTF-8) used by the bag's tag files, establishing the foundational rules for interpretation. An optional bag-info.txt file supplements this with human-readable details, such as the creation date, contact information, or payload size, enhancing usability without affecting the bag's validity. These tags enable quick assessment of the package's provenance and properties.^[1]

Manifest files are checksum-based inventories essential for integrity assurance, also residing at the root level and named according to the hashing algorithm used, such as manifest-sha256.txt. Each manifest lists every file in the payload, identified by its path relative to the bag's root directory (for example, data/file.txt), paired with its computed checksum, allowing recipients to verify that no files have been altered, lost, or corrupted during transfer or storage. Optional tag manifests (e.g., tagmanifest-sha256.txt) extend this verification to the tag files themselves. The selection of checksum algorithms, like SHA-256 for robust collision resistance, supports reliable detection of modifications.^[1]

The optional fetch.txt file addresses scenarios with large or distributed datasets by specifying how to acquire payload files not included locally in the data/ directory. It contains entries detailing remote URLs, expected file sizes, and target paths within the payload, enabling the construction of "partial bags" that defer downloading until needed, which is particularly useful for bandwidth-constrained environments or collaborative data sharing. A bag that relies on fetch.txt becomes complete only once the listed files have been retrieved, at which point the associated manifests verify them like any other payload file.^[1]
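As an illustration of how a fetch.txt entry might be read, the following Python sketch assumes the three-field line layout (URL, length, file path) defined in RFC 8493, with "-" standing for an unspecified length. The names FetchEntry and parse_fetch_line are hypothetical and do not come from any BagIt library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FetchEntry:
    url: str                # where the payload file can be retrieved
    length: Optional[int]   # expected size in octets, or None if given as "-"
    filepath: str           # destination inside the bag, e.g. data/big.tif


def parse_fetch_line(line: str) -> FetchEntry:
    """Split one fetch.txt line into its URL, length, and file path fields."""
    # Split on the first two runs of whitespace only, so that a file path
    # containing spaces stays in one piece.
    url, length, filepath = line.rstrip("\r\n").split(maxsplit=2)
    return FetchEntry(url, None if length == "-" else int(length), filepath)
```

A caller could pass each non-empty line of fetch.txt to parse_fetch_line and then download entry.url to entry.filepath before running validation. The sketch does not decode percent-encoded characters in file paths, which a full implementation would need to handle.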

Specification

Bag Structure

A BagIt bag consists of a hierarchical directory structure designed for the storage and transfer of digital content. The root directory, also referred to as the base directory, must contain the mandatory "data" subdirectory along with tag files, such as the bagit.txt file that declares the BagIt version (detailed further in the Metadata and Manifest Files section). Optional tag directories may also reside at the root level to organize additional tag files, but no other files, payload content, or subdirectories are permitted directly under the root. The "data" subdirectory exclusively houses the payload files, which represent the primary digital content being packaged. These files retain their original relative paths from the source collection and may be organized with arbitrary nesting of subdirectories within "data/" to accommodate complex hierarchies, such as those found in software distributions or multimedia archives. Payload files are treated as opaque octet streams, with no inherent restrictions on their types, sizes, or internal structures beyond the overall bag constraints. All file paths referenced within the bag utilize forward slashes (/) as the directory separator to ensure cross-platform compatibility, irrespective of the underlying operating system. The specification assumes the use of regular files and directories to maintain a portable and verifiable structure. Tag files and path names adhere to UTF-8 encoding without byte order marks (BOMs), prohibiting non-standard encodings that could introduce compatibility issues. While BagIt is fundamentally a filesystem-based format, it supports serialization into monolithic archives, such as ZIP files, for efficient network transfer or storage. In a serialized bag, the archive must preserve the exact directory layout upon extraction, including the root-level tags and "data" subdirectory, to ensure compliance with the core structure.^[1]
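The forward-slash rule can be illustrated with Python's pathlib. The sketch below is only an illustration under the layout described above; the helper names to_bag_path and is_payload_path are hypothetical and not part of any BagIt tool.

```python
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def to_bag_path(base_dir: PurePath, file_path: PurePath) -> str:
    """Return file_path relative to the bag's base directory, using '/'."""
    # as_posix() converts Windows backslashes to forward slashes.
    return file_path.relative_to(base_dir).as_posix()


def is_payload_path(bag_path: str) -> bool:
    """True if bag_path names something inside data/ without '..' segments."""
    parts = PurePosixPath(bag_path).parts
    return len(parts) >= 2 and parts[0] == "data" and ".." not in parts


# A file bagged on Windows is still recorded with forward slashes.
windows_file = PureWindowsPath(r"C:\bags\b1\data\img\a.tif")
print(to_bag_path(PureWindowsPath(r"C:\bags\b1"), windows_file))
```

Using the pure path classes keeps the example independent of the operating system it runs on, which mirrors the portability goal of the bag layout itself.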

Metadata and Manifest Files

The BagIt specification requires a file named bagit.txt at the root of every bag to declare the BagIt version and the character encoding used for all tag files. This file must consist of exactly two lines: the first specifying the version in the format BagIt-Version: <major>.<minor>, such as BagIt-Version: 1.0, and the second indicating the encoding in the format Tag-File-Character-Encoding: <encoding-name>, with UTF-8 strongly recommended and no byte order mark (BOM) permitted.^[1]

An optional bag-info.txt file provides human-readable metadata about the bag in a simple key-value format, where each line consists of a label followed by a colon, optional whitespace, and a value. Common fields include Source-Organization (the entity creating the bag), Bagging-Date (in YYYY-MM-DD format), and Payload-Oxum (an octet-stream sum in the format <total-octets>.<file-count>, representing the total size and number of payload files, e.g., 279164409832.1198). This file preserves the order of entries and uses the encoding declared in bagit.txt.^[1]

Manifest files are required and provide integrity information for the payload by listing each payload file's relative path alongside its checksum, ensuring the content remains unaltered during transfer or storage. At least one such file must exist, named manifest-<algorithm>.txt (e.g., manifest-md5.txt), with each line formatted as <checksum> <relative-path>, where the checksum is a hex-encoded digest, conventionally written in lowercase (e.g., d41d8cd98f00b204e9800998ecf8427e data/file.txt). Supported algorithms include MD5, SHA-1, SHA-256, and SHA-512; software that validates bags is required to support SHA-256 and SHA-512, while MD5 and SHA-1 remain available for legacy compatibility. The file must include exactly one entry per payload file. Multiple manifest files using different algorithms may coexist. Filepaths in manifests must percent-encode Line Feed (LF), Carriage Return (CR), Carriage Return Line Feed (CRLF), or percent sign (%) characters following RFC 3986.^[1]

Optional tagmanifest files, named tagmanifest-<algorithm>.txt, extend integrity checks to the tag files themselves (excluding payload files) using the same format and supported algorithms as payload manifests. For example, a line might read 3b5f06b0b7d3f5a3e0a4d5f6e7b8c9d0 bag-info.txt. A tag manifest covers files such as bagit.txt, bag-info.txt, and the payload manifests. Because a file cannot meaningfully record its own checksum, tag manifests are generated last, after the other tag files are final.^[1]

An optional fetch.txt file allows for incomplete bags by specifying files to be fetched from remote URLs. Each line follows the format url length filepath, where url is the location to fetch from, length is the expected octet count (or "-" if unspecified), and filepath is the destination path within the bag, which begins with data/. This file uses the encoding declared in bagit.txt and enables bags to reference large or external payloads without including them initially.^[1]

These metadata and manifest files collectively enable basic validation of bag integrity, as detailed in the validation process.^[1]
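The manifest line format, the percent-encoding rule, and the Payload-Oxum value described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names are hypothetical, and the two-space separator between checksum and path is an assumption of the sketch (the fields only need to be separated by whitespace).

```python
import hashlib
from typing import List


def encode_manifest_path(path: str) -> str:
    """Percent-encode only the characters the manifest format reserves."""
    # Encode "%" first so the escapes added afterwards are not re-encoded.
    return path.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")


def manifest_line(path: str, content: bytes, algorithm: str = "sha256") -> str:
    """Build one '<checksum>  <path>' manifest entry for the given bytes."""
    digest = hashlib.new(algorithm, content).hexdigest()  # lowercase hex
    return f"{digest}  {encode_manifest_path(path)}"


def payload_oxum(file_sizes: List[int]) -> str:
    """Format Payload-Oxum as <total-octets>.<file-count>."""
    return f"{sum(file_sizes)}.{len(file_sizes)}"


print(manifest_line("data/file.txt", b"example payload"))
print("Payload-Oxum:", payload_oxum([15]))
```

Because the algorithm name is passed straight to hashlib.new, the same helper can produce entries for manifest-sha512.txt or, for legacy bags, manifest-md5.txt.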

Validation Process

The validation process for a BagIt package consists of two main stages: confirming completeness and verifying validity, ensuring the package's structure and content integrity.^[1]

Completeness checking begins by confirming the presence of essential components in the root directory: the bagit.txt file declaring the BagIt version, at least one payload manifest file (e.g., manifest-sha256.txt), and the data/ subdirectory containing the payload files. Beyond these required elements, the root directory is expected to hold only tag files and tag directories, such as optional tag manifests, bag-info.txt, or fetch.txt. Furthermore, every file referenced in the payload manifests must physically exist within the data/ directory, and under version 1.0 every payload file must appear in every payload manifest (earlier versions required only that each payload file appear in at least one). If a fetch.txt file is present, the bag remains incomplete until the listed remote files are retrieved and added to the data/ directory.^[1]

Validity checking proceeds only after completeness is established and involves recalculating checksums for all files listed in the payload manifests using the specified algorithm (e.g., SHA-256) and comparing them against the values in the manifests. If tag manifests are included, their checksums must similarly be verified against the corresponding tag files. This step confirms that no alterations or corruptions have occurred in the payload or metadata. For bags with fetch.txt, optional remote validation requires downloading files from the provided URLs, checking their byte lengths if specified, integrating them into the data/ directory, and then performing the checksum comparison as part of the payload manifests.^[1]

Bags failing completeness checks are classified as incomplete, often due to absent required elements, missing files, or unresolved fetches, while complete bags with checksum discrepancies are deemed invalid. Validation tools are expected to provide detailed reporting of specific failures, such as which files or checksums failed, to facilitate remediation.^[1]

Best practices recommend conducting validation immediately following any transfer of the bag to identify transmission errors or corruption early in the workflow. For serialized bags (archived or compressed forms such as ZIP or TAR), the package must first be deserialized into its full directory structure before applying the completeness and validity checks.^[1]^[4]
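The two stages can be sketched with the Python standard library as follows. This is a simplified illustration rather than a conforming validator: the function names are hypothetical, every payload file is required to appear in every payload manifest (the version 1.0 rule), and tag manifests, fetch.txt, and percent-decoding of paths are deliberately left out.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def read_manifest(manifest: Path) -> Dict[str, str]:
    """Map each path listed in a manifest to its expected hex digest."""
    entries = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if line.strip():
            digest, path = line.split(maxsplit=1)
            entries[path.strip()] = digest.lower()
    return entries


def check_bag(bag_dir: Path) -> Tuple[str, List[str]]:
    """Return ('valid' | 'invalid' | 'incomplete', list of problems)."""
    problems = []
    data_dir = bag_dir / "data"
    manifests = sorted(bag_dir.glob("manifest-*.txt"))

    # Stage 1: completeness (required elements and file listings).
    if not (bag_dir / "bagit.txt").is_file():
        problems.append("missing bagit.txt")
    if not data_dir.is_dir():
        problems.append("missing data/ directory")
    if not manifests:
        problems.append("no payload manifest")
    found = data_dir.rglob("*") if data_dir.is_dir() else []
    payload = {p.relative_to(bag_dir).as_posix() for p in found if p.is_file()}
    listed = {m.name: read_manifest(m) for m in manifests}
    for name, entries in listed.items():
        for path in sorted(set(entries) - payload):
            problems.append(f"{name}: listed file not found: {path}")
        for path in sorted(payload - set(entries)):
            problems.append(f"{name}: payload file not listed: {path}")
    if problems:
        return "incomplete", problems

    # Stage 2: validity (recompute and compare checksums).
    for name, entries in listed.items():
        algorithm = name[len("manifest-"):-len(".txt")]
        for path, expected in entries.items():
            # Reads each file fully; a real tool would hash in chunks.
            actual = hashlib.new(algorithm, (bag_dir / path).read_bytes()).hexdigest()
            if actual != expected:
                problems.append(f"{name}: checksum mismatch: {path}")
    return ("invalid", problems) if problems else ("valid", [])
```

Returning the list of problems alongside the status reflects the expectation that validators report which files or checksums failed, not merely that the bag as a whole did.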

History

Development Origins

BagIt originated in 2007 at the Library of Congress (LOC), where it was developed to facilitate the reliable transfer of digital collections between diverse systems and organizations.^[5] This effort was driven by the need to handle growing volumes of digital content, from gigabytes to petabytes, in a way that ensured integrity during handoffs without requiring complex software installations.^[2] The specification was co-created with the California Digital Library (CDL), reflecting collaborative needs for standardized packaging in digital preservation workflows.^[6] The core philosophy behind BagIt draws from the simple "bag it and tag it" approach, where digital content is bundled into a "bag" for transport and accompanied by "tags"—machine-readable metadata—for description, verification, and automated processing.^[6] This minimalist design emphasized ease of use across institutions, allowing content creators and recipients to validate packages using standard file system tools and checksums.^[7] Key early motivations stemmed from LOC's digitization initiatives, particularly the National Digital Newspaper Program (NDNP), which required error-free delivery of large newspaper collections from external partners.^[2] Other projects at LOC faced similar challenges in managing transfers via network or physical media, often involving terabytes of data from web archiving and cultural heritage sources.^[5] These drivers highlighted the limitations of unstructured methods, pushing for a convention that separated payload files from metadata while enabling quick integrity checks.^[7] By 2008, an initial informal specification had emerged, formalizing the structure from ad-hoc scripts used in NDNP transfers and evolving into a de facto standard for content packaging.^[5] This early version outlined basic directory layouts, manifest files, and validation steps, laying the groundwork for broader adoption in preservation practices.^[6]

Version History

The development of the BagIt specification began with a series of informal drafts in 2008, emerging from collaborations between the Library of Congress and the California Digital Library to facilitate reliable digital content transfer.^[8] The initial draft-00 was released on March 24, 2008, introducing core concepts such as hierarchical file packaging with manifests and checksums for integrity verification.^[8] Subsequent drafts refined these elements; for instance, draft-01 on May 30, 2008, simplified tag manifests, while draft-02 on July 11, 2008, standardized path separators and introduced the Payload-Oxum metadata field to describe payload content volume.^[8] Further iterations, including draft-05 in April 2011, added support for tag directories and clarified validity rules, culminating in version 0.97 released on April 2, 2012, which introduced optional multiple manifests for different checksum algorithms and optional UTF-8 encoding declaration in tag files.^[8] Version 1.0 of the BagIt specification was formalized as Internet Engineering Task Force (IETF) Request for Comments (RFC) 8493 in October 2018, marking its transition from draft status to a stable standard.^[1] This release imposed stricter syntax requirements, mandating UTF-8 encoding for all tag files to ensure consistent character handling across implementations, whereas previous versions treated it as optional.^[1] It also clarified serialization rules for bags, requiring that all payload files be listed in every manifest to prevent partial listings, and recommended SHA-512 as the default checksum algorithm while retaining support for MD5 and SHA-1 for legacy compatibility.^[1] These changes aimed to enhance interoperability and robustness without breaking existing bags. 
Separately from the core format, and before it reached RFC status, the BagIt Profiles specification was introduced in 2015 (as its own version 1.0) as an extension mechanism to define custom rules for bags, such as required metadata fields or file organization, without modifying the core BagIt standard.^[3] This allowed communities to enforce domain-specific constraints while maintaining BagIt conformance. The profiles evolved through subsequent releases, with version 1.2.0 introducing and requiring the BagIt-Profile-Version field, and the latest version 1.4.0 released in November 2023 incorporating refinements for better validation and serialization of profile documents.^[3] The core specification emphasizes backward compatibility, requiring implementations to support both version 0.97 and 1.0 bags; for example, version 1.0 parsers must tolerate multiple linear whitespace characters around colons in bag-info.txt headers, a leniency present in earlier versions.^[1] Upgrades from pre-1.0 bags can occur in place by adding new manifests with stronger checksums, ensuring minimal disruption to existing workflows.^[1]
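The in-place upgrade mentioned above could look roughly like the following Python sketch, which adds a payload manifest for a stronger algorithm to an existing bag directory. The function name add_payload_manifest is hypothetical, and the sketch covers only the new payload manifest: a full upgrade would also update the version line in bagit.txt and regenerate any tag manifests so that they list the new file.

```python
import hashlib
from pathlib import Path


def add_payload_manifest(bag_dir: Path, algorithm: str = "sha512") -> Path:
    """Write manifest-<algorithm>.txt covering every file under data/."""
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            # Reads each file fully; large payloads would be hashed in chunks.
            digest = hashlib.new(algorithm, path.read_bytes()).hexdigest()
            # Paths are written relative to the bag root, with forward slashes.
            lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}")
    manifest = bag_dir / f"manifest-{algorithm}.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Existing manifests, for example manifest-md5.txt, are left untouched, so recipients that only understand the older algorithm can still verify the bag.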

Implementations and Tools

Software Libraries

The BagIt Java library, known as bagit-java, is the original implementation developed by the Library of Congress to support the creation, manipulation, and validation of BagIt packages.^[9] Originating from the library's early adoption of the BagIt specification around 2007, it has been maintained and updated to align with evolving standards, with version 5.x representing a complete rewrite using modern Java practices for improved internationalization and linting capabilities (last updated June 2018).^[10]^[9] It supports BagIt versions from 0.93 to 0.97, providing APIs for generating manifests, tag files, and checksums, as well as validating package integrity without built-in serialization to archive formats.^[9] This library forms the basis for extensions like the BagIt Library (BIL), which integrates into broader workflows for automated bag handling.^[11]

The bagit-python library, also developed by the Library of Congress, offers a Python module and command-line interface for handling BagIt operations, available via PyPI for easy integration into scripts and applications (latest version 1.9.0 as of October 2024).^[12] It enables bag creation with custom metadata, parallel checksum computation for efficiency, and comprehensive validation against the IETF BagIt specification, making it suitable for both programmatic and utility-based use.^[13] Key APIs include make_bag() for packaging directories with tags like contact information, and is_valid() for integrity checks, ensuring cross-platform compatibility on systems with Python installed.^[13] While its exact inception ties to the Library's post-2010 digital preservation efforts, it remains actively maintained to support serialization and fixity generation in preservation pipelines.^[14]

Other implementations extend BagIt support to additional languages, including bagit.rb for Ruby, which provides library and command-line tools for bag creation, manifest generation, remote file fetching via fetch.txt, and validation per BagIt spec v0.97.^[15] Similarly, bagit-js for Node.js facilitates creation, modification, and validation of BagIt containers by wrapping core functionality, though it relies on underlying Python dependencies for full operation, promoting use in JavaScript environments.^[16] For scientific data applications, the research-object/bagit-ro extension defines a profile that serializes Research Objects as BagIt archives, embedding rich metadata like RO-Crate JSON files within the data directory while leveraging BagIt's checksums and structure for integrity and transfer.^[17]

These libraries emphasize standardized APIs for bagging files, generating manifests and tags, and validating packages, ensuring cross-platform reliability in diverse development workflows.^[15]^[16]^[17]

Graphical and Command-Line Tools

Bagger is a Java-based graphical user interface (GUI) tool developed by the Library of Congress for creating and validating BagIt packages, enabling non-programmers to package digital content through an intuitive interface (last updated April 2018).^[18] It supports drag-and-drop functionality for adding files to the bag's data directory, interactive entry of metadata such as bag-info.txt fields (e.g., contact information, creation date), and generation of required manifest files with checksums for integrity verification.^[19] Users can validate bags by checking checksums and structure compliance, with visual feedback on errors, and export completed bags to compressed formats like ZIP or TAR for easy transfer.^[20] As an open-source application licensed under the Apache License 2.0, Bagger is freely available and runs on multiple platforms including Windows, macOS, and Linux.^[18]

For command-line operations, bagit.py serves as a Python-based utility from the Library of Congress, providing scripting capabilities for automated bagging and validation processes suitable for batch handling of multiple files or directories.^[13] The tool installs via pip and offers commands like bagit.py /path/to/directory to convert a directory into a bag in place, writing checksum manifests such as SHA-256 for its contents, or bagit.py --validate /path/to/bag to verify completeness and integrity without a GUI.^[12] It recognizes fetch.txt entries in existing bags and reports validation results to the console or a log file, facilitating integration into workflows for large-scale digital preservation.^[13] Released under the public domain, bagit.py is lightweight and cross-platform, relying on underlying Python libraries for core functionality.^[12]

Archivematica, an open-source digital preservation platform, includes built-in BagIt support through integration with bagit-python, allowing users to ingest, process, and validate BagIt packages via its web-based interface without separate tool installation.^[21] This extension handles unzipped or zipped bags during transfer, automatically verifying manifests, generating PREMIS metadata for preservation events, and ensuring compliance with BagIt specifications during archival information package (AIP) creation.^[22] Features include reporting on validation outcomes and fetch handling for incomplete bags, with all components licensed under AGPLv3 for free community use.^[21]

These tools collectively emphasize accessibility for end-users, with capabilities for profile checking (e.g., against BagIt Profiles for serializations like ZIP), fetch implementation to acquire external payloads, and comprehensive reporting on bag status, all while remaining free and open-source to promote widespread adoption in digital archiving.^[13]^[19]

Applications

In Digital Preservation

BagIt serves as a standardized packaging format within the Open Archival Information System (OAIS) reference model, facilitating the creation and exchange of Submission Information Packages (SIPs) and Archival Information Packages (AIPs). In OAIS-compliant systems such as Archivematica, digital content is ingested as SIPs and transformed into BagIt-structured AIPs for long-term storage, ensuring that descriptive, representation, and preservation metadata are bundled with the payload files to support archival functionality.^[23] Similarly, institutions like the Library of Congress and Columbia University employ BagIt to package SIPs for transfer and generate AIPs for preservation repositories, aligning with OAIS principles for ingest, storage, and dissemination.^[24]^[25]

The format's manifest files enable fixity checks using checksums, which detect alterations or degradation such as bit-rot in stored digital objects, thereby supporting the detectability and repair of errors over time. This integrity mechanism is crucial for preservation, as it allows verification of content unchanged from the original submission, reducing risks during long-term storage on media prone to silent corruption.^[26] Additionally, BagIt's structure facilitates data migration to new formats or systems without loss, as pre- and post-migration fixity validation confirms the payload's fidelity, preserving the archival record's authenticity.^[4]

In preservation workflows, BagIt is utilized for ingest processes in distributed systems like LOCKSS networks, where bags are validated against their manifests upon receipt to confirm completeness and integrity before storage.^[27] Following ingest, these systems perform periodic re-verification using the embedded checksums to monitor for degradation, ensuring ongoing preservation across networked nodes as seen in LOCKSS-based cooperatives such as MetaArchive.^[28]

BagIt can be extended for enhanced provenance tracking by incorporating elements of the PREMIS preservation metadata standard into the bag-info.txt file, which holds descriptive tags for the bag's origin, creation date, and contact information.^[29] This integration allows simple key-value pairs in bag-info.txt to reference or summarize PREMIS events, such as ingest actions or fixity generations, while more detailed PREMIS XML records reside within the bag's payload, providing a lightweight yet standards-aligned approach to documenting preservation history.^[30]

Adoption and Use Cases

The Library of Congress has been a core adopter of BagIt since 2008, utilizing it to package digitized content for transfer to and from partners in the National Digital Newspaper Program (NDNP). Through NDNP, BagIt facilitates the distribution of bagged newspaper pages, which are made publicly accessible via the Chronicling America portal, ensuring reliable transfer and integrity verification of millions of historical documents.^[2]^[31] Other prominent adopters include the California Digital Library (CDL), which employs BagIt for large-scale content transfer between cultural institutions, simplifying the movement of digital assets without requiring specialized software installations. Archivematica, an open-source digital preservation platform, integrates BagIt for ingesting and packaging Archival Information Packages (AIPs), enabling automated workflows that support long-term authenticity and fixity checks. DSpace repositories, a widely used open-source platform for institutional repositories, leverage BagIt for submission packages through integrations like the Eclipse PASS deposit service, which assembles BagIt-compliant bundles for ingest into DSpace collections.^[6]^[22]^[32] BagIt supports diverse use cases, including research data sharing where it complements specifications like RO-Crate to package datasets with metadata for FAIR-compliant distribution, allowing external files to be fetched reliably during transfer. In cloud environments, organizations such as the MIT Libraries and Rockefeller Archive Center use BagIt for handoffs to AWS S3, where validation scripts confirm payload integrity post-upload to prevent data corruption during storage. 
Additionally, BagIt is integrated into preservation systems that align with the Open Archival Information System (OAIS) reference model defined in ISO 14721, facilitating standardized packaging in international digital archiving efforts.^[33]^[34]^[35]^[23] A key challenge in BagIt adoption involves handling large-scale bags, addressed through the fetch.txt file, which specifies URLs and sizes for remote retrieval of payload content, enabling efficient management of petabyte-scale digitization projects without embedding all files upfront.^[36] The community-maintained BagIt Conformance Suite, hosted by the Library of Congress, provides standardized test cases to validate implementations, ensuring interoperability and compliance across tools and reducing errors in production environments.^[37] As of 2025, BagIt continues to see adoption in new initiatives, including the Harvard Library Innovation Lab's archiving of Data.gov content using BagIt for authenticity and provenance, the University of Texas Libraries' 2024-2028 digital preservation strategy incorporating BagIt packaging, and Duke Kunshan University's implementation of Archivematica with BagIt for integrity validation.^[38]^[39]^[40]

References

[1]
RFC 8493 - The BagIt File Packaging Format (V1.0) - IETF Datatracker
This document describes BagIt, a set of hierarchical file layout conventions for storage and transfer of arbitrary digital content.
[2]
BagIt at the Library of Congress | The Signal
Apr 4, 2019 · BagIt provides a directory structure and specifies a set of files for transferring and storing files that includes clear delineations between the digital ...
[3]
BagIt Profiles Specification 1.4.0 - GitHub Pages
Nov 2, 2023 · The purpose of the BagIt Profiles Specification is to allow creators and consumers of Bags to agree on optional components of the Bags they are exchanging.
[4]
[PDF] The BagIt File Packaging Format (V0.97) - Digital Preservation
Apr 2, 2012 · This document specifies BagIt, a hierarchical file packaging format for storage and transfer of arbitrary digital content. A "bag" has.
[5]
[PDF] What is Fixity, and When Should I be Checking It?
When fixity information is provided with objects upfront, it can be used to validate that you have received what was intended for the collection.
[6]
Network Data Transfer to the Library of Congress - D-Lib Magazine
The Library of Congress and the California Digital Library jointly developed the BagIt specification based on the concept of "bag it and tag it," where digital ...
[7]
Bagit: Transferring Digital Content - California Digital Library
Jul 2, 2008 · The BagIt format specification is based on the concept of 'bag it and tag it,' where digital content is packaged (the bag) along with a small amount of machine ...
[8]
A Set of Transfer-Related Services - D-Lib Magazine
For example, the National Digital Newspaper Program (NDNP) project is focused on the collection of historical newspapers and is staffed by a project team from ...
[9]
Java library to support the BagIt specification. - GitHub
The BAGIT LIBRARY is a software library intended to support the creation, manipulation, and validation of bags. Its current version is 0.97.
[10]
The “End of Term” Was Only the Beginning - Library of Congress Blogs
Jul 26, 2011 · The Library has developed an open source tool called the BagIt Library based on the BagIt specification. BIL is a Java library for the Unix ...
[11]
BIL (BagIt Library) - COPTR
Oct 15, 2021 · BagIt Library is a Java software library that supports the creation, manipulation and validation of bags. Developed by Library of Congress.
[12]
bagit - PyPI
bagit is a Python library and command line utility for working with BagIt style packages. Installation bagit.py is a single-file python module that you can ...
[13]
bagit-python by LibraryOfCongress - GitHub Pages
bagit is a Python library and command line utility for working with BagIt style packages. BagIt is a minimalist packaging format for digital preservation.
[14]
Work with BagIt packages from Python. - GitHub
Jun 28, 2014 · This can be handy on multicore machines. Validation. If you would like to see if a bag is valid, use its is_valid method:
[15]
tipr/bagit: Ruby Library and Command Line tools for BagIt - GitHub
This is a Ruby library and command line utility for creating BagIt archives based on the BagIt spec v0.97.
[16]
bagit
bagit-js is a Node.js library that wraps the Library of Congress bagit-python for BagIt functionality in Node.js environments.
[17]
Implementation notes | Research Object Crate (RO-Crate)
[BagIt is] … a set of hierarchical file layout conventions for storage and transfer of arbitrary digital content. A “bag” has just enough structure to ...
[18]
The Bagger application packages data files according to ... - GitHub
The Bagger application was created for the US Library of Congress as a tool to produce a package of data files according to the BagIt specification.
[19]
Bagger's Enhancements for Digital Accessions | The Signal
Apr 26, 2016 · Bagger is a digital records packaging and validation tool based on the BagIt Specification. This BagIt-compliant software allows creators and recipients of ...
[20]
[PDF] Introduction Contents
Bagger is a desktop software tool developed by the Library of Congress using the BagIt specification. It helps aid digital preservation through packaging ...
[21]
External tools | Documentation (Archivematica 1.13.2)
BagIt. Standard and script to package digital objects and metadata for archival storage. Archivematica uses the bagit-python library. License: Public domain ...
[22]
Library of Congress Bagit format | Documentation (Archivematica 1.6)
Bags must be packaged in accordance with the Bagit specification. To ingest a zipped bag, user selects transfer type “Zipped bag” from the dropdown menu in the ...
[23]
AIP structure | Documentation (Archivematica 1.13.2)
Archivematica AIPs are structurally consistent regardless of variables in original content, processing, and storage. They are packaged into a bag.
[24]
Digital Content Transfer Tools - Digital Preservation (Library of ...
During 2008, the Library used these tools to add approximately 80 terabytes to its digital collections. BagIt video. From the video, "Bagit: Transferring ...
[25]
[PDF] Acquisition of Digital Records: - Columbia University Libraries
Digital Preservation Workflow. Preservation of bit-by-bit copy of the ... AIPs in Bagit format are ingested into Preservation Repository.
[26]
Protect Your Data: File Fixity and Data Integrity | The Signal
Apr 7, 2014 · Fixity, in the preservation sense, means the assurance that a digital file has remained unchanged, i.e. fixed.
[27]
How LOCKSS Works
LOCKSS covers the digital preservation lifecycle, ingesting, managing, preserving content by comparing copies, and delivering content via proxy, serving, or ...
[28]
Collaboratively Preserving Our Digital Memory
Distributed digital preservation cooperative. Founded 2004 ... LOCKSS against hashes provided by a BagIt manifest document. Skinner 2014.
[29]
[PDF] Create AIP with DataAccessioner and Bagger
This workflow allows staff to generate a basic AIP using free tools, the PREMIS metadata standard, and the BagIt specification. It is particularly useful ...
[30]
AIP structure | Documentation (Archivematica 1.10.2)
The role of the METS file is to link original objects to their preservation copies and to their descriptions and submission documentation, as well as to link ...
[31]
National Digital Newspaper Program - The Library of Congress
Aug 21, 2025 · A long-term effort to develop an Internet-based, searchable database of US newspapers with descriptive information and select digitization of historic pages.
[32]
Assemblers - Eclipse PASS Documentation
Mar 27, 2025 · For example, if you want to produce BagIt packages and DSpace METS packages, you would need two Assembler implementations, each responsible ...
[33]
Implementation notes | Research Object Crate (RO-Crate)
RO-Crate can be combined with BagIt simply by placing the RO-Crate files within the BagIt payload (data/) directory.
[34]
Lambda to validate a Bagit bag stored in S3. - GitHub
This application includes a CLI that is designed to invoke the deployed AWS Lambda. This supports running AIP validation from a command line context, while ...
[35]
Keeping Our Heads in the Cloud, Part 2: File Validation | Bits & Bytes
Sep 28, 2023 · Once delivered to the appropriate S3 bucket, the contents of the compressed archive files are extracted, and the BagIt validation command checks ...
[36]
Scaling BagIt Tools to Manage the Ingest of Petabytes of Digitization ...
Feb 20, 2019 · This paper details the Library's usage of the BagIt File Packaging Format during Quality Assurance and Audit Submissions functions as defined by ...
[37]
[PDF] Big Data Bags and Minimal Identifiers for Exchange of Large ...
The BagIt specification and the BDBag BagIt profile provide a framework for enumerating the files that make up a potentially large and distributed dataset, in a ...
[38]
GitHub - LibraryOfCongress/bagit-conformance-suite: Test cases for validating BagIt implementations
Bag Structure Requirements Summary