
ATAC-seq

Assay for transposase-accessible chromatin using sequencing (ATAC-seq) is a high-throughput sequencing method that maps regions of open chromatin across the genome, identifying regulatory elements such as promoters, enhancers, and insulators by probing DNA accessibility with a hyperactive Tn5 transposase. The technique uses a simple two-step protocol in which the transposase simultaneously fragments accessible DNA and inserts sequencing adapters at the fragment ends, enabling library preparation from as few as 500 cells and generating nucleotide-resolution profiles of chromatin structure, nucleosome positioning, and transcription factor binding sites. Developed to overcome limitations of prior methods such as DNase-seq, ATAC-seq provides rapid and sensitive epigenomic profiling without the need for antibodies or large cell numbers.

Introduced in 2013 by Buenrostro et al., ATAC-seq builds on earlier assays but distinguishes itself through its efficiency: it works with 500 to 50,000 cells, where traditional DNase hypersensitivity assays require millions, and can be completed in under three hours. Its advantages include high reproducibility, low input requirements suitable for clinical or rare samples, and compatibility with diverse sample types, including frozen tissues via optimized protocols such as Omni-ATAC. Compared to ChIP-seq, which targets specific proteins but demands 10^5–10^7 cells and specialized antibodies, ATAC-seq offers a broader, unbiased view of regulatory landscapes at lower cost and complexity.

ATAC-seq has transformed studies of gene regulation by enabling the profiling of cell-type-specific epigenomes. Key applications include profiling accessibility in embryonic tissues, immune cell responses, and cancer subtypes to uncover therapeutic targets. Single-cell variants, such as scATAC-seq introduced in 2015, extend its utility to heterogeneous populations, revealing cell fate transitions and supporting integration with other single-cell measurements for multi-omics analysis of regulatory networks. Recent advancements, including high-throughput methods such as sci-ATAC-seq and systematic benchmarks of protocols, continue to improve its resolution and scalability for large-scale epigenomic atlases.

Introduction and Background

Overview and Principles

ATAC-seq, or Assay for Transposase-Accessible Chromatin using sequencing, is a high-throughput sequencing method designed to map regions of open chromatin across the genome, identifying sites where transcription factors and other regulatory proteins can bind to influence gene expression. The technique exploits the natural accessibility of DNA in regulatory elements, such as promoters and enhancers, to provide insight into the epigenetic landscape that governs cellular identity and function.

Chromatin accessibility is a fundamental principle of gene regulation: the packaging of DNA into chromatin modulates access to genetic information. Euchromatin represents loosely packed, transcriptionally active regions with high accessibility, facilitating the binding of regulatory proteins, while heterochromatin is densely compacted and generally repressive, limiting such interactions. Nucleosome positioning plays a critical role in this process, as nucleosomes (histone octamers wrapped by DNA) act as barriers that, when repositioned or evicted, expose enhancers and promoters to enable precise control of transcriptional activity.

The biochemical foundation of ATAC-seq is the hyperactive Tn5 transposase, an engineered enzyme that catalyzes tagmentation, a coupled process of DNA cleavage and adapter insertion, specifically at accessible chromatin sites. The transposase preferentially targets nucleosome-free or nucleosome-depleted regions, fragmenting the DNA and tagging it with sequencing adapters in a single step, which allows mapping down to single-nucleotide resolution after next-generation sequencing. The method was first described in 2013 as a streamlined approach for epigenomic profiling.

Compared to earlier techniques such as DNase-seq, which uses DNase I digestion, or FAIRE-seq, which relies on formaldehyde-assisted isolation of regulatory elements, ATAC-seq offers significant advantages, including a rapid protocol that can be completed in under 3 hours and compatibility with low cell inputs ranging from 500 to 50,000 cells, making it suitable for scarce or precious samples. These features make it well suited to profiling dynamic chromatin states in diverse biological contexts.

Historical Development

ATAC-seq was initially developed in 2013 by Jason D. Buenrostro, Paul G. Giresi, Lisa C. Zaba, Howard Y. Chang, and William J. Greenleaf at Stanford University, introducing a method for assaying chromatin accessibility genome-wide using hyperactive Tn5 transposase to insert sequencing adapters directly into open chromatin regions. This innovation addressed key limitations of prior techniques, such as DNase-seq, which required millions of cells and complex enzymatic digestion steps, and ChIP-seq, which was limited to predefined protein targets and dependent on antibody quality. The original protocol enabled profiling with as few as 500 cells in under 3 hours, facilitating rapid epigenomic analysis in diverse biological contexts.

Key milestones followed swiftly, including the 2015 adaptation for single-cell resolution (scATAC-seq) by Buenrostro, Beijing Wu, Ulrike M. Litzenburger, and colleagues, which allowed mapping of chromatin accessibility in individual cells and revealed principles of regulatory variation across cell types. In 2017, M. Ryan Corces, Anshul Kundaje, Howard Y. Chang, and others refined the method into Omni-ATAC, optimizing it for a broad range of cell types and frozen tissues (typically starting from about 50,000 nuclei) while reducing mitochondrial background, thus broadening applicability to archival and clinical specimens. Post-2020 advances extended ATAC-seq spatially, as demonstrated by the 2022 spatial-ATAC-seq method of Deng et al., which combined tissue sectioning with spatial barcoding to profile chromatin accessibility at near-cellular resolution in intact tissue sections.

ATAC-seq's adoption accelerated rapidly, with integration into the ENCODE project's phase 3 by 2015, where it complemented DNase-seq for comprehensive mapping of regulatory elements across human cell lines and tissues. By 2025, ATAC-seq had been referenced in over 9,000 publications, reflecting its widespread use in genomics research. The Chang laboratory at Stanford played a pivotal role in multimodal extensions, combining ATAC-seq with RNA sequencing for joint profiling of accessibility and transcription. Meanwhile, contributions from the Broad Institute, including computational pipelines for peak calling and integration, improved the scalability and reproducibility of data analysis.

Methodology

Core Experimental Procedure

The core experimental procedure for ATAC-seq involves isolating nuclei from a small number of cells, followed by tagmentation to insert sequencing adapters into accessible regions, purification, and amplification to generate a sequencing-ready library. This protocol, originally developed for bulk samples, requires approximately 50,000 intact nuclei obtained from fresh or frozen cells or tissues; high-quality starting material is needed to preserve chromatin integrity. Nuclei are isolated by lysing cells in a hypotonic buffer containing a mild detergent such as NP-40 (IGEPAL CA-630), followed by centrifugation to pellet the nuclei while removing cytoplasmic contaminants; this step typically takes 10-15 minutes and is performed on ice to prevent degradation.

Tagmentation, the hallmark step of ATAC-seq, employs hyperactive Tn5 transposase from the Nextera kit in a specialized buffer to simultaneously fragment DNA and insert adapters preferentially at open chromatin sites. The isolated nuclei are resuspended in a transposition mix consisting of 2× TD buffer, Tn5 transposase, and nuclease-free water (total volume ~50 µL), then incubated at 37°C for 30 minutes to allow enzyme activity without excessive background cutting. This controlled reaction exploits the transposase's preference for accessible DNA, yielding fragments that reflect regulatory elements such as promoters and enhancers.

Following tagmentation, the reaction is stopped by adding EDTA and purified using a MinElute column to remove the enzyme, free adapters, and debris, resulting in a clean DNA eluate of ~10-20 µL. The purified tagmented DNA is then amplified by PCR using 1× NEBNext master mix and barcoded Nextera primers (1.25 µM each) to add full sequencing adapters and indices; the cycling conditions are an initial 72°C for 5 minutes and 98°C for 30 seconds, followed by 8-12 cycles of 98°C for 10 seconds, 63°C for 30 seconds, and 72°C for 1 minute, with a final 72°C extension. The cycle number is optimized by qPCR to avoid over-amplification, which can introduce bias, and the entire tagmentation-to-amplification process can be completed in 2-3 hours.

Library quality is assessed by quantifying DNA concentration using qPCR and evaluating the fragment size distribution with a Bioanalyzer or TapeStation, where nucleosome-free regions appear as a peak at 150-250 base pairs (including adapters) and mono-nucleosomal fragments at ~300 bp. Sequencing is performed on Illumina platforms with paired-end reads of 50-75 base pairs at a depth of 25-50 million reads per sample to achieve sufficient coverage for peak calling. Over-tagmentation is avoided by strict adherence to incubation times, as prolonged exposure can lead to excessive fragmentation and reduced library complexity; RNase treatment can be included if necessary to eliminate RNA contamination.
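The transposition mix described above is easy to scale for several samples. The following is a minimal sketch, not part of any published protocol: the function name is hypothetical, and the per-reaction volumes (25 µL of 2× TD buffer and 2.5 µL of Tn5 in a 50 µL reaction) are assumptions taken from commonly cited versions of the original protocol; the text above only specifies the components and the ~50 µL total.

```python
def transposition_master_mix(n_samples, overage=0.1,
                             td_buffer_ul=25.0, tn5_ul=2.5, total_ul=50.0):
    """Scale a per-reaction transposition mix to n_samples plus pipetting overage.

    Default per-reaction volumes are assumptions (see lead-in), not values
    stated in the article. Water makes up the remainder of the reaction.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    water_ul = total_ul - td_buffer_ul - tn5_ul
    if water_ul < 0:
        raise ValueError("component volumes exceed the total reaction volume")
    scale = n_samples * (1.0 + overage)  # e.g. 8 samples + 10% -> 8.8 reactions
    return {
        "2x TD buffer (uL)": round(td_buffer_ul * scale, 2),
        "Tn5 transposase (uL)": round(tn5_ul * scale, 2),
        "nuclease-free water (uL)": round(water_ul * scale, 2),
        "total (uL)": round(total_ul * scale, 2),
    }


mix = transposition_master_mix(n_samples=8)
for component, volume in mix.items():
    print(f"{component}: {volume}")
```

Because the buffer is supplied at 2× concentration, it always makes up half of the final volume; the enzyme amount is the parameter most often adjusted when the number of input nuclei changes.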

Protocol Variations and Optimizations

To address challenges with limited starting material, the Omni-ATAC protocol optimizes the standard ATAC-seq workflow for low-input samples, enabling reliable profiling from as few as 500 cells through refined nuclear lysis using a combination of mild detergents (NP-40, digitonin, and Tween-20) and adjusted concentrations that enhance tagmentation efficiency while minimizing duplicates and background noise. This adaptation maintains high correlation with bulk ATAC-seq data (r > 0.95) across diverse cell types, facilitating applications in rare populations without substantial loss in data quality.

Adaptations for fixed-tissue samples incorporate formaldehyde fixation to preserve chromatin structure in archival or clinical specimens, such as formalin-fixed paraffin-embedded (FFPE) tissues, followed by de-crosslinking steps and extended incubation times (up to 2 hours) during transposition to recover accessible regions comparable to fresh samples (correlation r ≈ 0.87). These modifications extend ATAC-seq to biobanked materials, though they may increase sequencing depth requirements by 20-50% to compensate for fixation-induced biases.

For high-throughput processing, automation via 96-well plate formats and robotic liquid handling systems, such as RoboATAC, supports parallel processing of up to 96 samples per run, streamlining library preparation while preserving signal-to-noise ratios equivalent to manual methods. Microfluidic integrations further enable scalable tagmentation, reducing hands-on time to under 2 hours and minimizing variability across batches.

Bias mitigation strategies include custom lysis and transposition buffers that suppress mitochondrial contamination by over 90% through selective permeabilization, as implemented in Omni-ATAC, thereby enriching for nuclear signal. Additionally, ATAC-STARR-seq combines ATAC-seq enrichment with self-transcribing active regulatory region sequencing to functionally validate enhancer activity in accessible regions, distinguishing active from poised elements with high specificity in reporter assays.

In the 2020s, accessibility profiling has also been extended to long-read platforms, for example through the PacBio-based Fiber-seq, which captures extended fragments (>10 kb) to phase haplotypes within accessible regions, revealing allele-specific patterns and structural variants not detectable by short-read methods. Troubleshooting high nucleosome occupancy, particularly in stem cells with compact chromatin, involves adding mild detergents at low concentration (around 0.1%) to the lysis buffer to gently disrupt membranes without altering nucleosome positioning, yielding up to 30% more peaks in such samples compared to standard conditions.

Applications

Fundamental Uses in Genomics

ATAC-seq serves as a foundational tool for mapping regulatory elements across the genome by identifying regions of open chromatin that correspond to enhancers, promoters, and insulators, which are typically associated with active transcription. These accessible sites, detected as peaks in sequencing reads, highlight DNA sequences bound by transcription factors and other regulatory proteins, providing insight into the architectural organization of the genome. For instance, in cultured cell lines, ATAC-seq peaks often overlap with known active promoters near transcription start sites and with distal enhancers that loop to influence gene expression, enabling the annotation of functional non-coding elements without prior knowledge of specific protein binding.

Beyond peak identification, ATAC-seq facilitates the inference of nucleosome positioning by analyzing the size distribution of sequenced fragments, where protected regions of approximately 147 base pairs indicate nucleosome cores and shorter fragments reveal accessible linker intervals. This approach allows high-resolution mapping of the +1 nucleosome immediately downstream of transcription start sites, as well as periodic nucleosome arrays in promoter and enhancer regions, which influence chromatin compaction and accessibility. Such patterns have revealed shifts in nucleosome occupancy during environmental responses, underscoring the role of chromatin structure in modulating regulatory potential.

At sub-nucleosomal resolution, ATAC-seq enables transcription factor (TF) footprinting by detecting localized reductions in accessibility within open peaks, corresponding to motifs where TFs bind and protect DNA from Tn5 insertion. Computational tools such as HINT-ATAC correct for Tn5 sequence biases to delineate these footprints accurately, revealing binding dynamics for hundreds of TFs in a single experiment. This has proven effective in identifying key regulatory motifs in immune cells, where footprints correlate with ChIP-seq validated sites, offering a cost-efficient alternative for genome-wide TF occupancy profiling.

In comparative epigenomics, ATAC-seq profiles accessibility differences across species, tissues, or conditions, as exemplified by datasets from diverse human cell lines, which catalog over 100,000 reproducible peaks per sample to reveal cell-type-specific regulatory landscapes. These comparisons highlight evolutionary conservation of open chromatin at core promoters while exposing condition-specific gains or losses at enhancers, aiding in the dissection of regulatory evolution and perturbation responses.

ATAC-seq integrates well with other epigenomic assays, such as H3K27ac ChIP-seq, to distinguish active enhancers from poised ones; regions with high ATAC accessibility and H3K27ac enrichment denote active regulatory elements driving transcription. Machine learning models trained on such paired data can predict active marks from ATAC-seq alone, improving the resolution of enhancer states in low-input samples. A notable application is a study of human T cell differentiation, where ATAC-seq revealed dynamic chromatin accessibility changes at cytokine loci, such as IL2 and IFNG, correlating with lineage-specific activation during immune responses. These shifts in accessibility preceded transcriptional upregulation, demonstrating how ATAC-seq captures regulatory remodeling in bulk immune populations.
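The fragment-size reasoning above can be illustrated with a small sketch that bins paired-end insert sizes into nucleosome-free and mono-nucleosomal classes. The bin boundaries (under 100 bp and 180 to 247 bp) are assumptions based on a commonly used convention, not values given in this article, and the function names are hypothetical.

```python
from collections import Counter

# Assumed length bins in base pairs (insert size, adapters excluded).
NUCLEOSOME_FREE_MAX = 100   # fragments shorter than this: nucleosome-free
MONO_RANGE = (180, 247)     # fragments in this range: one protected nucleosome


def classify_fragment(length):
    """Assign one fragment length to a coarse chromatin class."""
    if length < NUCLEOSOME_FREE_MAX:
        return "nucleosome_free"
    if MONO_RANGE[0] <= length <= MONO_RANGE[1]:
        return "mono_nucleosome"
    return "other"


def fragment_class_fractions(lengths):
    """Return the fraction of fragments falling in each class."""
    counts = Counter(classify_fragment(n) for n in lengths)
    total = sum(counts.values())
    if total == 0:
        return {}
    labels = ("nucleosome_free", "mono_nucleosome", "other")
    return {label: counts[label] / total for label in labels}


# Toy insert sizes; real input would come from aligned read pairs.
lengths = [45, 60, 88, 150, 190, 200, 230, 320, 75, 410]
fractions = fragment_class_fractions(lengths)
print(fractions)
```

Dedicated nucleosome-calling tools model the full size distribution rather than using hard cutoffs, so a binning like this is best treated as a quick quality check.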

Applications in Disease and Development

ATAC-seq has been instrumental in profiling chromatin accessibility alterations in cancer, enabling the identification of regulatory changes driving tumorigenesis. A landmark study generated genome-wide accessibility profiles for 410 tumor samples across 23 cancer types from The Cancer Genome Atlas (TCGA), revealing cancer-specific open regions enriched near oncogenes. Such analyses have uncovered tissue-specific enhancer landscapes that distinguish tumor subtypes and inform potential therapeutic vulnerabilities.

In developmental biology, ATAC-seq facilitates tracking dynamic epigenetic reprogramming during embryogenesis by capturing waves of accessibility that coincide with key lineage specification events. For instance, a multi-omics study of mouse gastrulation at single-cell resolution demonstrated sequential accessibility changes in promoter and enhancer regions, marking the transition from pluripotent to differentiated states around embryonic day 6.5 to 7.5. These accessibility waves were linked to the activation of developmental transcription factors, providing insight into the regulatory mechanisms orchestrating this transition. Quantitative assessments often employ fold-change metrics, such as Δaccessibility = log₂(peak intensity_disease / peak intensity_control), to quantify these shifts relative to a baseline state, highlighting 2- to 4-fold increases in accessibility at lineage-specific loci.

ATAC-seq has revealed aberrant chromatin landscapes in neurological disorders, particularly in Alzheimer's disease (AD), where accessibility changes at enhancers contribute to pathological gene dysregulation. In AD, snATAC-seq profiling of brain tissue identified altered accessibility in neuronal enhancers, correlating with disease progression. Differential accessibility scores showed significant changes (e.g., log₂ fold-change > 1.5) at disease-associated loci compared to controls. Such patterns underscore ATAC-seq's utility in dissecting epigenetic drivers of neurodegeneration. As of 2024, integrations of ATAC-seq with single-nucleus multi-omics have identified disease-critical cell types in brain-related disorders.

In infectious diseases, ATAC-seq elucidates host-pathogen interactions by mapping chromatin accessibility at viral integration or latency sites. For HIV, analyses of latently infected CD4+ T cells demonstrated reduced proviral chromatin accessibility in latent reservoirs, identifying closed regions that enforce viral silencing and evade immune detection. These latency-associated sites exhibited lower ATAC-seq signal, with up to 3-fold decreased accessibility relative to productively infected cells, informing strategies for reservoir reactivation.

Clinically, ATAC-seq-derived accessibility signatures are emerging as biomarkers for predicting immunotherapy responses in tumors. A 2024 multi-omics study integrated ATAC-seq data to identify differential accessibility patterns in immune-related enhancers, associating open chromatin signatures with improved outcomes under checkpoint inhibitors such as PD-1 blockers. These signatures, quantified via fold-change in peak intensity between responders and non-responders, could support prognostic tools and guide personalized treatment.
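The fold-change metric used throughout this section can be sketched in a few lines. The function name and the optional pseudocount are assumptions added for illustration; with a pseudocount of 0 the function reduces to the log₂ ratio of peak intensities given above.

```python
import math


def delta_accessibility(intensity_disease, intensity_control, pseudocount=0.0):
    """log2 ratio of peak intensities between two conditions.

    A small positive pseudocount (an assumption, not part of the formula in
    the text) avoids division by zero when a peak has no signal in one
    condition.
    """
    if intensity_disease < 0 or intensity_control < 0:
        raise ValueError("intensities must be non-negative")
    numerator = intensity_disease + pseudocount
    denominator = intensity_control + pseudocount
    if numerator <= 0 or denominator <= 0:
        raise ValueError("zero intensity requires a positive pseudocount")
    return math.log2(numerator / denominator)


# A peak with 4-fold higher signal gives a log2 fold-change of 2.
gain = delta_accessibility(40.0, 10.0)
# A 3-fold decrease, as in the latency example, gives about -1.58.
loss = delta_accessibility(10.0, 30.0)
print(gain, round(loss, 2))
```

On this scale a 2- to 4-fold increase corresponds to values between 1 and 2, and a threshold of 1.5 corresponds to roughly a 2.8-fold change.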

Variants and Extensions

Single-Cell ATAC-seq

Single-cell ATAC-seq (scATAC-seq) adapts the bulk ATAC-seq protocol to profile chromatin accessibility at the resolution of individual cells, enabling the study of regulatory heterogeneity within populations. The method originated in 2015 with a pioneering approach by Buenrostro et al., who isolated single cells and performed tagmentation on each, yielding sparse but informative accessibility profiles from hundreds of cells. Combinatorial indexing strategies, such as sci-ATAC-seq introduced in 2015 by Cusanovich et al., barcode nuclei across multiple rounds of splitting and pooling to scale up to tens of thousands of cells without physical isolation. Droplet-based encapsulation protocols, exemplified by commercial platforms such as 10x Genomics Chromium, further improved scalability by partitioning nuclei into emulsion droplets for barcoding, typically requiring an input of 5,000–10,000 nuclei per sample to generate libraries from thousands to tens of thousands of cells.

scATAC-seq data exhibit high sparsity, with individual cells covering only 1–10% of accessible sites because few fragments are detected per cell, often resulting in binary matrices where peaks are scored as present or absent. To address amplification biases introduced during library preparation, some protocols incorporate unique molecular identifiers (UMIs) to remove duplicate artifacts and improve quantification accuracy, particularly in low-input settings. These features distinguish scATAC-seq from bulk methods, which average signals across populations, by revealing cell-to-cell variability in open chromatin regions.

Key applications of scATAC-seq include deconvoluting cell types in complex tissues, such as identifying rare immune subsets like exhausted T cells within tumor microenvironments, as demonstrated in profiling of cancer samples. It also supports trajectory inference to map differentiation paths, for instance tracing hematopoietic progression from progenitors to mature cells by linking accessibility changes to developmental stages. When integrated with single-cell RNA-seq, scATAC-seq enhances regulatory inference, though it is also informative on its own for studying epigenetic heterogeneity. A landmark example is the 2021 human atlas by Zhang et al., which, combined with the fetal atlas of Domcke et al., covered about 1.3 million nuclei across 30 adult tissue types and uncovered subtype-specific regulatory elements driving tissue-specific gene programs.
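The sparsity described above can be made concrete with a toy cell-by-peak matrix. This is a minimal sketch with hypothetical function names and made-up counts; real matrices hold thousands of cells and hundreds of thousands of peaks and are stored in sparse formats.

```python
def binarize(counts):
    """Convert fragment counts per (cell, peak) into present/absent calls."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]


def per_cell_coverage(binary):
    """Fraction of peaks detected in each cell."""
    return [sum(row) / len(row) for row in binary]


def sparsity(binary):
    """Fraction of zero entries in the whole matrix."""
    total = sum(len(row) for row in binary)
    nonzero = sum(sum(row) for row in binary)
    return 1 - nonzero / total


# Toy data: 3 cells (rows) x 10 peaks (columns).
counts = [
    [0, 2, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 3, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 0, 1],
]
binary = binarize(counts)
coverage = per_cell_coverage(binary)
matrix_sparsity = sparsity(binary)
print(coverage, round(matrix_sparsity, 2))
```

Binarization discards count magnitude, which is often acceptable because a diploid cell contributes at most a few fragments per site.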

Spatial and Multimodal ATAC-seq

Spatial ATAC-seq extends traditional ATAC-seq by incorporating spatial barcoding to map chromatin accessibility while preserving tissue architecture, enabling the study of epigenetic variation in situ. One seminal method, spatial-ATAC-seq, uses microfluidic channels to deliver barcodes to fixed tissue sections, combined with Tn5 tagmentation and sequencing, achieving resolutions down to 20 μm per pixel. This approach has revealed region-specific accessibility patterns in embryos and other tissues, such as differential accessibility in progenitor populations during development. More recent innovations, such as SPACE-seq introduced in 2025, adapt ATAC-seq with polyA-tailed transposomes on platforms such as 10x Visium, combining spatial epigenomics with transcriptomics and lineage tracing in frozen tissues at similar micron-scale resolutions.

Multimodal ATAC-seq variants integrate chromatin accessibility with other molecular layers, typically gene expression, to link regulatory elements to transcriptional states in single cells, providing foundational data for spatial extensions. sci-CAR, developed in 2018, employs split-pool barcoding to co-profile ATAC-seq and RNA-seq in thousands of nuclei, yielding thousands of unique fragments and UMIs per cell to infer cis-regulatory correlations, such as in pseudotemporal dynamics of treated cell lines. Building on this, SHARE-seq (2020) uses iterative hybridization for high-throughput joint profiling, detecting over 8,000 ATAC fragments and 2,500 UMIs per cell across tissues such as mouse skin, where it identifies domains of regulatory chromatin (DORCs) that overlap super-enhancers and predict cell fate potential. ISSAAC-seq (2022) further enhances sensitivity through in situ RNA-DNA hybrid sequencing after ATAC tagmentation, enabling flexible plate- or droplet-based workflows that capture chromatin heterogeneity within expression-defined cell types, such as during oligodendrocyte maturation.

Protocol variations in these spatial and multimodal methods emphasize in situ tagmentation on slides or split-pool barcoding to minimize artifacts, with resolutions typically ranging from 10 to 100 μm to align epigenetic signal with histological features. Key insights include spatially resolved accessibility gradients that highlight the role of tissue architecture in development, and heterogeneous chromatin states in tumor microenvironments that reveal immune organization relative to lymphoid structures. Data outputs consist of spatially binned accessibility matrices, often co-registered with tissue images, facilitating overlays of peak calls with RNA or protein markers.

Advances in 2024-2025 have incorporated computational methods for multimodal fusion, such as SIMO, which integrates spatial data with single-cell ATAC and other omics data using optimal transport algorithms to predict regulatory interactions and map modalities at accuracies exceeding 80% in complex tissues. These tools enable inference of 3D contacts from fused epigenomic layers, improving predictions of long-range interactions in developmental and disease contexts without direct measurements.

Data Analysis and Computational Tools

Preprocessing and Peak Identification

Raw ATAC-seq data are typically obtained as FASTQ files from Illumina sequencing platforms, consisting of paired-end reads with a minimum length of 45 base pairs. Initial preprocessing involves quality assessment using tools such as FastQC to evaluate base quality scores, GC content distribution, and adapter contamination, followed by adapter trimming and removal of low-quality bases with software such as Trimmomatic or Cutadapt. Reads are then aligned to a reference genome, such as hg38 for human samples, using aligners such as BWA-MEM or Bowtie2, targeting a unique mapping rate exceeding 80%. Post-alignment processing includes removal of PCR duplicates via MarkDuplicates and filtering out mitochondrial reads to ensure less than 5% mitochondrial content, as higher levels indicate poor cell quality or contamination. Aligned reads are stored in BAM format, with at least 50 million non-duplicate, non-mitochondrial reads recommended for reliable open chromatin profiling.

Quality control metrics are essential for validating the data. The fraction of reads in peaks (FRiP) measures the proportion of aligned reads overlapping called peaks, with values greater than 30% indicating high-quality enrichment for accessible regions. The transcription start site (TSS) enrichment score assesses the accumulation of nucleosome-free fragments at promoters, where scores above 5 (ideally >7 for hg38) confirm effective tagmentation and low background noise. These metrics, computed using tools such as ATACseqQC, help filter suboptimal samples before proceeding.

Peak calling identifies regions of open chromatin by detecting significant read enrichments. The widely adopted MACS2 is commonly run with parameters tailored for ATAC-seq, such as -f BAM -g hs --nomodel --shift -75 --extsize 150, which treats each read's 5' end as a Tn5 cut site and centers a 150 base pair window on it (shifting 75 base pairs upstream, then extending 150 base pairs). For paired-end data, -f BAMPE can be used instead to pile up whole fragments; in that mode the --shift and --extsize settings do not apply. Peak calling typically yields 50,000 to 100,000 peaks per sample, depending on cell type and sequencing depth.
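The FRiP metric defined above is simple to compute once fragments and peaks are available as genomic intervals. The following is a minimal sketch with hypothetical function names, assuming BED-style half-open coordinates; it uses a quadratic scan for clarity, whereas production pipelines use indexed interval tools.

```python
def overlaps(fragment, peak):
    """True if two (chrom, start, end) half-open intervals share a base."""
    f_chrom, f_start, f_end = fragment
    p_chrom, p_start, p_end = peak
    return f_chrom == p_chrom and f_start < p_end and p_start < f_end


def frip(fragments, peaks):
    """Fraction of fragments overlapping at least one peak."""
    if not fragments:
        return 0.0
    in_peaks = sum(
        1 for frag in fragments if any(overlaps(frag, pk) for pk in peaks)
    )
    return in_peaks / len(fragments)


# Toy intervals for illustration only.
peaks = [("chr1", 100, 300), ("chr2", 500, 700)]
fragments = [
    ("chr1", 90, 150),     # overlaps the first peak
    ("chr1", 300, 360),    # starts exactly at the peak end: no overlap
    ("chr1", 1000, 1100),  # outside any peak
    ("chr2", 650, 800),    # overlaps the second peak
    ("chr3", 100, 200),    # chromosome without peaks
]
score = frip(fragments, peaks)
print(score)
```

A value of 0.4 on real data would pass the 30% guideline mentioned above, although acceptable thresholds vary with cell type and peak-calling stringency.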
Peak significance is determined using a binomial model, where the p-value reflects the probability of observing a read count at least as large as the one seen under a null background distribution: -\log_{10}(p\text{-value}) = -\log_{10}\left( P(X \ge k \mid n, \lambda) \right). Here, k is the observed read count in the region, n is the total number of reads, and \lambda is the expected count based on local background, so that X is a binomial count over n reads with mean \lambda; for large n this is closely approximated by a Poisson distribution with mean \lambda, the form used by MACS2. To mitigate biases, downsampling is applied to normalize for variations in sequencing depth, and peaks overlapping blacklist regions (known for artefactual enrichments such as mappability issues) are removed. Final outputs include BED files listing peak coordinates and scores for downstream use, as well as bigWig files generated from normalized read coverage for visualization in genome browsers such as IGV.
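As a worked illustration of the significance score, the sketch below computes -log10 of the upper-tail probability P(X ≥ k) under a Poisson distribution with mean λ, which is the large-n limit of a binomial with n trials and success probability λ/n. It is an illustrative calculation, not MACS2's implementation; the function name is hypothetical, and the choice to return 0 for regions with k ≤ λ is a simplification made here.

```python
import math


def neg_log10_poisson_pvalue(k, lam, rel_tol=1e-15, max_terms=100000):
    """-log10 of P(X >= k) for X ~ Poisson(lam), evaluated in log space.

    k is the observed read count (integer) and lam the expected count.
    Regions with k <= lam are not enriched; for those this sketch returns
    0.0 instead of computing the exact tail probability.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k <= lam:
        return 0.0
    # Log of the first tail term, P(X = k) = exp(-lam) * lam**k / k!
    log_first = -lam + k * math.log(lam) - math.lgamma(k + 1)
    # Sum the remaining terms relative to the first: 1 + lam/(k+1) + ...
    ratio, rel_sum, i = 1.0, 1.0, k
    for _ in range(max_terms):
        i += 1
        ratio *= lam / i
        rel_sum += ratio
        if ratio < rel_tol * rel_sum:
            break
    log_p = log_first + math.log(rel_sum)
    return -log_p / math.log(10)


# 10 reads observed where 2 are expected from local background.
score = neg_log10_poisson_pvalue(k=10, lam=2.0)
print(round(score, 2))
```

Working in log space keeps the score finite for strongly enriched regions, where the raw p-value would underflow to zero in floating point.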

Advanced Analysis and Software Resources

Advanced analysis of ATAC-seq data extends beyond initial peak identification to uncover regulatory mechanisms, compare accessibility across conditions, and integrate with other datasets.

Motif analysis identifies potential transcription factor (TF) binding sites within accessible peaks, aiding in the prediction of regulatory elements. Tools such as HOMER and the MEME suite perform motif discovery and enrichment testing, scanning peak sequences against known TF motifs to highlight enriched binding patterns. For instance, HOMER employs a hypergeometric test to assess motif frequencies in peaks relative to background regions, while MEME's AME module uses a rank-sum test for similar comparisons. Footprinting analysis further refines these predictions by detecting TF occupancy through local depletion of Tn5 insertions around motifs; dedicated footprinting frameworks process aligned reads and motif positions to quantify footprint scores and compare binding across cell types or conditions.

Differential accessibility analysis quantifies changes in chromatin openness between samples, such as treated versus control conditions, to identify condition-specific regulatory regions. Statistical frameworks adapted from RNA-seq analysis, including DESeq2 and edgeR, model count data from peak regions using negative binomial distributions to compute log2 fold-changes and false discovery rates (FDR), typically thresholding at FDR < 0.05 for significance. DESeq2 incorporates size factor normalization to account for library depth and sequencing biases, enabling robust detection of accessibility shifts, while edgeR applies empirical Bayes moderation of dispersion estimates, which helps in smaller sample sets. These methods have been benchmarked on ATAC-seq datasets, showing comparable performance in identifying biologically relevant regions when proper normalization precedes analysis.

Integrative pipelines facilitate the analysis of single-cell ATAC-seq (scATAC-seq) data by combining accessibility with gene expression or other modalities for clustering and pseudotime inference. ArchR is a scalable R-based package that performs dimensionality reduction via latent semantic indexing, identifies marker peaks, and supports trajectory analysis through imputation of gene activity scores from accessible promoters and enhancers. Similarly, Signac extends the Seurat ecosystem for scATAC-seq, enabling joint embedding of accessibility and gene expression profiles, followed by shared nearest neighbor clustering and downstream trajectory analysis. These tools streamline workflows from raw fragments to interpretable cell states, with ArchR emphasizing memory-efficient processing for large datasets and Signac focusing on multi-omics integration.

Visualization tools and public databases aid the interpretation of ATAC-seq results by providing comparative context and interactive views. Genome browsers such as the UCSC Genome Browser support uploading ATAC-seq tracks in bigWig format for overlaying peaks and signal intensities alongside reference annotations, facilitating inspection of regulatory landscapes. The Roadmap Epigenomics Consortium database offers processed chromatin accessibility data (largely from DNase-seq) and related epigenomic profiles from diverse human tissues, enabling cross-sample comparisons of accessibility patterns through downloadable signal tracks and chromatin state segmentations.

Recent machine learning approaches address challenges in scATAC-seq, particularly the imputation of sparse data to recover missing accessibility signals. SCRIPro, introduced in 2024, infers gene regulatory networks by integrating scATAC-seq with multi-omics inputs, imputing sparse profiles through variational autoencoders that model cell-type-specific accessibility distributions.

A range of open-source resources supports reproducible ATAC-seq analysis. The ATACseqQC package generates diagnostic plots for post-alignment quality, including nucleosome positioning and transcription start site enrichment metrics. Galaxy provides web-based workflows that encapsulate peak calling, differential analysis, and visualization steps, making the analysis accessible to non-experts while keeping parameters traceable. An example end-to-end workflow begins with identified peaks as input, followed by motif enrichment using HOMER to nominate TFs, differential testing via DESeq2 to select condition-specific regions, and functional annotation through the GREAT tool, which associates peaks with nearby genes and performs ontology enrichment to link accessibility changes to biological processes.
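The hypergeometric motif enrichment test mentioned above can be sketched with the standard library alone. This illustrates the statistical idea only and is not the implementation used by HOMER or any other tool; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(peaks_with_motif, n_peaks,
                                background_with_motif, n_background):
    """P(X >= peaks_with_motif) under a hypergeometric null.

    Peak and background sequences are pooled; the null hypothesis is that
    the n_peaks peak sequences are a random draw (without replacement) from
    the pool, so motif-containing sequences land in peaks only by chance.
    """
    pool = n_peaks + n_background
    motif_total = peaks_with_motif + background_with_motif
    upper = min(motif_total, n_peaks)
    tail = sum(
        comb(motif_total, i) * comb(pool - motif_total, n_peaks - i)
        for i in range(peaks_with_motif, upper + 1)
    )
    return tail / comb(pool, n_peaks)


# Toy example: motif found in 8 of 10 peaks but only 2 of 10 background regions.
p_value = hypergeom_enrichment_pvalue(8, 10, 2, 10)
print(round(p_value, 4))
```

Exact integer arithmetic keeps the sketch simple but becomes slow for tens of thousands of sequences, where log-space or library implementations are preferable; p-values across many motifs also need multiple-testing correction, as with the FDR thresholds described for differential accessibility.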

Limitations and Future Directions

Technical and Biological Challenges

ATAC-seq is subject to several technical biases that can distort the interpretation of chromatin accessibility landscapes. The Tn5 transposase, central to the assay, exhibits sequence preferences that lead to uneven cleavage, creating hot spots at motifs like TA dinucleotides and cold spots elsewhere, which artificially inflates signal in preferred regions and underrepresents others. Additionally, PCR amplification during library preparation generates duplicates that inflate read counts in accessible regions, potentially skewing quantification; this is partially mitigated by incorporating unique molecular identifiers (UMIs) to distinguish true molecules from amplification artifacts, rescuing up to 20% of reads otherwise discarded as duplicates. Sample preparation poses further limitations, particularly for diverse biological materials. The method performs poorly on non-mammalian organisms due to differences in structure and nuclear properties, resulting in high and low fraction of reads in (FRiP scores often below 0.3). Similarly, highly cross-linked or frozen tissues yield elevated mitochondrial reads and nonspecific signal, as the original struggles with fixation-induced barriers to Tn5 access. In low-input scenarios, such as rare types, increases substantially, reducing signal-to-noise ratios and complicating reliable peak detection. At the data level, single-cell ATAC-seq (scATAC-seq) suffers from extreme sparsity, with dropout rates exceeding 90% of features per cell due to limited DNA capture and technical noise, which hampers downstream analyses like clustering and . Moreover, accessibility changes detected may reflect indirect effects, such as secondary responses to perturbations, rather than direct regulatory alterations, challenging causal inferences without orthogonal validation. 
Reproducibility is compromised by batch effects arising from variations in transposition efficiency or sequencing runs, so at least three biological replicates per condition are recommended to improve reliability and limit false discoveries. Peak calling without stringent filters can yield more false positives because uncorrected biases propagate into called regions. The overall cost, including library preparation and next-generation sequencing to sufficient depth (typically 25-50 million reads), varies by facility and infrastructure and can limit access in resource-constrained settings. Computational tools can correct some biases, for example by modeling Tn5 sequence preference, but the remaining challenges call for cautious interpretation.
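As an illustration of the kind of quality filter discussed in this section, the fraction of reads in peaks (FRiP) can be computed as in the sketch below. It assumes peaks are non-overlapping half-open intervals and that each read is reduced to a single position such as its insertion site; the function names and toy data are hypothetical.

```python
from bisect import bisect_right


def build_index(peaks):
    """Group non-overlapping (chrom, start, end) peaks into sorted per-chromosome lists."""
    by_chrom = {}
    for chrom, start, end in sorted(peaks):
        starts, ends = by_chrom.setdefault(chrom, ([], []))
        starts.append(start)
        ends.append(end)
    return by_chrom


def in_peak(index, chrom, pos):
    """True if pos lies inside a half-open peak interval [start, end)."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last peak starting at or before pos
    return i >= 0 and pos < ends[i]


def frip(read_positions, peaks):
    """Fraction of (chrom, pos) reads that fall inside any peak."""
    index = build_index(peaks)
    reads = list(read_positions)
    if not reads:
        return 0.0
    hits = sum(in_peak(index, chrom, pos) for chrom, pos in reads)
    return hits / len(reads)


# Toy data: 4 of the 10 read positions fall inside a peak.
PEAKS = [("chr1", 100, 200), ("chr1", 500, 600)]
READS = [
    ("chr1", 150), ("chr1", 199), ("chr1", 200), ("chr1", 550), ("chr2", 150),
    ("chr1", 50), ("chr1", 300), ("chr1", 599), ("chr1", 700), ("chr1", 10),
]

score = frip(READS, PEAKS)
print(f"FRiP = {score:.2f}")
```

A sample could then be flagged when `score` falls below a chosen cutoff, for instance the 0.3 figure quoted above, although acceptable values vary with cell type and protocol.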

Emerging Developments and Integrations

Recent advances in long-read sequencing have enabled the integration of ATAC-seq with platforms such as Oxford Nanopore Technologies (ONT) to achieve full-length fragment phasing and improved detection of structural variants within open chromatin regions. For instance, scNanoATAC-seq uses long reads to profile accessibility and genetic variants simultaneously at single-cell resolution, overcoming limitations of short-read sequencing in resolving complex genomic architectures. A 2025 update, scNanoATAC-seq2, further improves the depiction of accessibility in early embryonic development. Similarly, Fiber-seq, a PacBio-based long-read assay, extends beyond traditional ATAC-seq by providing high-resolution mapping of accessibility alongside nucleosome positioning and DNA methylation, facilitating the identification of structural variants in regulatory elements. These integrations improve the ability to phase haplotypes and to detect insertions, deletions, and inversions in accessible regions, which are critical for understanding disease-associated variants.

AI-driven models increasingly incorporate ATAC-seq data to predict and design novel regulatory elements. Extensions of the Enformer architecture, such as REnformer, use single-cell ATAC-seq profiles to make cell-type-specific predictions by modeling regulatory interactions from accessibility patterns, with reported gains in distinguishing subtle regulatory differences across cell states. In parallel, diffusion-based generative models such as DNA-Diffusion enable de novo design of DNA sequences that modulate chromatin accessibility, allowing the creation of synthetic enhancers with targeted regulatory functions that can be validated through ATAC-seq assays. These approaches, including iterative frameworks for enhancer design, take orientation and spacing into account to generate sequences that predictably alter open chromatin landscapes.

ATAC-seq is also being integrated into broader multi-omics ecosystems that link chromatin accessibility with other molecular and spatial profiles. Initiatives such as the Human BioMolecular Atlas Program (HuBMAP) incorporate ATAC-seq into spatial multi-omics atlases, generating 3D maps of chromatin accessibility alongside transcriptomics and other modalities across human tissues to elucidate organ-level regulatory networks. These efforts make it possible to correlate accessibility changes with protein abundance, advancing comprehensive models of cellular function.

In therapeutic applications, ATAC-seq-identified enhancers are guiding CRISPR-based editing for precision medicine. By pinpointing disease-specific open chromatin regions, ATAC-seq informs the design of CRISPR-Cas9 screens that target enhancer elements to modulate gene expression, as demonstrated in studies that dissect super-enhancers through targeted deletions. This approach has shown promise in disrupting enhancer-promoter interactions in cancer cells, with the aim of improving therapeutic specificity and reducing off-target effects in personalized treatments.

Standardization efforts are promoting interoperability and reproducibility in ATAC-seq research. The ENCODE project provides standards and processing pipelines for quality control, alignment, and peak calling, supporting reproducible analyses across datasets. Community-driven guidelines from initiatives such as the International Human Epigenome Consortium emphasize metadata requirements and deposition in public repositories, fostering large-scale meta-analyses of accessibility. These standards have accelerated the integration of ATAC-seq data into global databases, supporting collaborative discoveries.

Looking ahead, ATAC-seq holds potential for real-time assessment of chromatin dynamics using portable nanopore sequencers from ONT, which could enable on-site profiling through miniaturized assays. Such developments may transform diagnostics by capturing accessibility changes during disease progression. Emerging trends in ATAC-seq studies are driven by expanding multi-omics applications and computational advances.

References

  1. [1]
    Transposition of native chromatin for fast and sensitive epigenomic ...
    Oct 6, 2013 · ATAC-seq queries the location of open chromatin, the binding of DNA-associated proteins and chromatin compaction at nucleotide resolution.
  2. [2]
    insights into the development of ChIP-seq and ATAC-seq - PMC - NIH
    We discuss the development of epigenome sequencing technologies, especially ChIP-seq & ATAC-seq and their current applications in scientific research.
  3. [3]
    From reads to insight: a hitchhiker's guide to ATAC-seq data analysis
    Feb 3, 2020 · Here, we discuss the major steps in ATAC-seq data analysis, including pre-analysis (quality check and alignment), core analysis (peak calling), and advanced ...
  4. [4]
    The chromatin accessibility landscape of primary human cancers
    Oct 26, 2018 · ATAC-seq enables the genome-wide profiling of TF binding events that orchestrate gene expression programs and give a cell its identity.
  5. [5]
    Systematic benchmarking of single-cell ATAC-sequencing protocols
    Aug 3, 2023 · In this study, we benchmark the performance of eight scATAC-seq methods across 47 experiments using human peripheral blood mononuclear cells ( ...
  6. [6]
    Transposition of native chromatin for fast and sensitive epigenomic ...
    ATAC-seq captures open chromatin sites using a simple two-step protocol with 500-50,000 cells and reveals the interplay between genomic locations of open ...
  7. [7]
    Chromatin accessibility: methods, mechanisms, and biological insights
    An important aspect of chromatin-based regulation lies in the controlled ability of factors to gain and maintain physical access to DNA – often referred to as ...
  8. [8]
    Molecular Complexes at Euchromatin, Heterochromatin and ... - MDPI
    Euchromatin is characterized by active genes, wider spacing between nucleosomes, higher accessibility to transcription machinery, histone modifications and ...
  9. [9]
    Chromatin accessibility: biological functions, molecular mechanisms ...
    Dec 4, 2024 · As the basic chromatin structure, nucleosome positioning is closely associated with chromatin accessibility. ... enhancers upper ETV4 promoter ...
  10. [10]
    Single-cell chromatin accessibility reveals principles of regulatory ...
    Jun 17, 2015 · We have developed a single-cell assay for transposase-accessible chromatin (scATAC-seq). ATAC-seq is an ensemble measure of open chromatin that ...
  11. [11]
    Spatial profiling of chromatin accessibility in mouse and human tissues
    Aug 17, 2022 · Here we describe a method for spatially resolved chromatin accessibility profiling of tissue sections using next-generation sequencing (spatial-ATAC-seq)
  12. [12]
    ATAC-seq Data Standards and Processing Pipeline - ENCODE
    The ENCODE ATAC-seq pipeline is used for quality control and statistical signal processing of short-read sequencing data, producing alignments and measures of ...
  13. [13]
    High-resolution genome-wide functional dissection of transcriptional ...
    Dec 19, 2018 · The experimental component of HiDRA is the combination of ATAC-seq and STARR-seq (i.e., ATAC-STARR-seq): fragments are enriched from open ...
  14. [14]
    [PDF] Application note – Fiber-seq: High-resolution long-read chromatin ...
    Oct 2, 2025 · Fiber-seq not only captures genetic variants for long-range haplotype phasing, but also provides methylation and chromatin accessibility.
  15. [15]
    Identification of transcription factor binding sites using ATAC-seq
    Feb 26, 2019 · Cleavage events in small linker DNA between nucleosomes are possible ... Position 1 corresponds to the start position of the ATAC/DNase-seq read.
  16. [16]
    Deep learning-based enhancement of epigenomics data ... - Nature
    Mar 8, 2021 · ATAC-seq is a widely-applied assay used to measure genome-wide chromatin accessibility; however, its ability to detect active regulatory ...
  17. [17]
    The chromatin accessibility landscape of primary human cancers - NIH
    We profiled the chromatin accessibility landscape for 23 types of primary human cancers, represented by 410 tumor samples derived from 404 donors from TCGA.
  18. [18]
    Multi-omics profiling of mouse gastrulation at single cell resolution
    We avoid binarising DNA methylation and chromatin accessibility values into “low” or “high” states as it is not a good representation of the continuous nature ...
  19. [19]
    Characterization of the chromatin accessibility in an Alzheimer's ...
    Mar 23, 2020 · ... (ATAC-seq) was used to investigate the AD ... Increased CSF E-Selectin in clinical Alzheimer's disease without altered CSF Aβ42 and tau.
  20. [20]
    The three-dimensional landscape of cortical chromatin accessibility ...
    Apr 1, 2023 · To characterize the dysregulation of chromatin accessibility in Alzheimer's disease (AD), we generated 636 ATAC-seq libraries in neurons and ...
  21. [21]
    Epigenomic characterization of latent HIV infection identifies latency ...
    Using Assay of Transposon-Accessible Chromatin sequencing (ATACseq) we found that latently infected cells exhibit greatly reduced proviral accessibility, ...
  22. [22]
    Multi-omics profiling highlights karyopherin subunit alpha 2 as a ...
    Utilizing ATAC-seq as one of the multi-omics technologies, we explored the tumor heterogeneity in ACC and sought to identify aberrantly accessible differential ...
  23. [23]
    Assessment of computational methods for the analysis of single-cell ...
    Nov 18, 2019 · The low copy number results in an inherent per-cell data sparsity, where only 1–10% of expected accessible peaks are detected in single cells ...
  24. [24]
    ATAC-seq with unique molecular identifiers improves quantification ...
    Nov 13, 2020 · ... unique molecular identifiers (UMIs) ... A plate-based single-cell ATAC-seq workflow for fast and robust profiling of chromatin accessibility.
  25. [25]
    Unified molecular approach for spatial epigenome, transcriptome ...
    Apr 18, 2025 · SPACE-seq offers a unified molecular approach, providing a versatile solution for studying the multiomics landscape of complex tissues.
  26. [26]
    Joint profiling of chromatin accessibility and gene expression in ...
    sci-CAR effectively combines sci–ATAC sequencing (sci-ATAC-seq) and sci-RNA-seq into a single protocol (Fig. 1) by the following steps: (i) Nuclei are extracted ...
  27. [27]
  28. [28]
  29. [29]
    Spatial integration of multi-omics single-cell data with SIMO - Nature
    Feb 1, 2025 · We introduce SIMO, a computational tool for spatial transcriptomics with multiple non-spatial single-cell omics data, such as RNA, ATAC, and DNA methylation.
  30. [30]
    ATAC-Seq data analysis - Galaxy Training!
    Sep 2, 2019 · In this tutorial we will use data from the study of Buenrostro et al. 2013, the first paper on the ATAC-Seq method. The data is from a human ...
  31. [31]
    ATAC-Seq Services - End-to-End Open Chromatin Analysis Service
    FRiP scores will vary depending on cell type. FRiP scores of >30% are a good indication of success. However, lower FRiP scores are acceptable in more ...
  32. [32]
    CebolaLab/ATAC-seq: Analysis pipeline for ATAC-seq data - GitHub
    TSS enrichment should be ideally >10 for hg19 and not <6 and ideally >7 and not <5 for GRCh38. HOMER can be used for genomic annotation. FRiP score should be > ...
  33. [33]
    Help understanding MACS2 --extsize and --shift - Biostars
    Aug 16, 2016 · So I'm trying to understand the --shift and --extsize parameters in MACS2. I have inherited an ATAC-seq that used the following MACS2 command.
  34. [34]
    Comparison of differential accessibility analysis strategies for ATAC ...
    Jun 23, 2020 · ATAC-seq is widely used to measure chromatin accessibility and identify open chromatin regions (OCRs). OCRs usually indicate active ...
  35. [35]
    ATAC-seq normalization method can significantly affect differential ...
    Apr 22, 2020 · We argue that researchers should systematically compare multiple normalization methods before continuing with differential accessibility analysis.
  36. [36]
    ArchR is a scalable software package for integrative single-cell ...
    Feb 25, 2021 · ArchR provides a user-focused interface for complex scATAC-seq analysis, such as marker feature identification, transcription factor (TF) ...
  37. [37]
    UCSC Genome Browser Home
    Genome Browser - Interactively visualize genomic data ; BLAT - Rapidly align sequences to the genome ; In-Silico PCR - Rapidly align PCR primer pairs to the ...
  38. [38]
    grid visualization - NIH Roadmap Epigenomics
    The project has generated high-quality, genome-wide maps of several key histone modifications, chromatin accessibility, DNA methylation and mRNA expression ...
  39. [39]
    Single-cell and spatial multiomic inference of gene regulatory ...
    Here, we present SCRIPro, a comprehensive computational framework that robustly infers GRNs for both single-cell and spatial multiomics data.
  40. [40]
    ATACseqQC - Bioconductor
    ATAC-seq, an assay for Transposase-Accessible Chromatin using sequencing, is a rapid and sensitive method for chromatin accessibility analysis. It was developed ...
  41. [41]
    GREAT: Genomic Regions Enrichment of Annotations Tool
  42. [42]
    Intrinsic bias estimation for improved analysis of bulk and single-cell ...
    Sep 21, 2022 · The sequence preferences of DNaseI or Tn5 cleavage can be better reflected when the enzymes are applied to deproteinized naked genomic DNA. We ...
  43. [43]
    ATAC-seq with unique molecular identifiers improves quantification ...
    UMI-ATAC-seq can rescue about 20% of reads that are mistaken as PCR duplicates in standard ATAC-seq in our study. We demonstrate that UMI-ATAC-seq could more ...
  44. [44]
    ATAC‐seq in Emerging Model Organisms: Challenges and Strategies
    Jun 1, 2025 · ATAC-seq begins with the isolation of cell nuclei and utilizes a hyperactive Tn5 transposase, which preferentially binds to accessible nuclear ...
  45. [45]
    Investigating chromatin accessibility during development and ... - NIH
    May 23, 2022 · Limitations of ATAC-seq include ... (2017) An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues.
  46. [46]
    A hierarchical, count-based model highlights challenges in scATAC ...
    Sep 17, 2025 · However, computational analyses of said data are exceptionally challenging due to the data readout of scATAC-seq being sparse, with over 90% of ...
  47. [47]
    Chromatin accessibility profiling by ATAC-seq - PubMed Central
    In summary, ATAC-seq is an effective technique for uncovering the gene regulatory changes that govern why cells express certain genes and how gene expression ...
  48. [48]
    Review and Evaluate the Bioinformatics Analysis Strategies of ATAC ...
    Sep 10, 2024 · Here, we conducted a comprehensive benchmarking analysis to evaluate the performance of eight popular software for processing ATAC-seq and CUT&Tag data.
  49. [49]
    Services | Emory University | Atlanta GA
    Services ; ATAC-Seq (Assisted) **, $155, $241 ; ChIPSeq/CUT&RUN (Assisted) **, $183, $286.
  50. [50]
    scNanoATAC-seq: a long-read single-cell ATAC sequencing ...
    Oct 11, 2022 · A long-read single-cell ATAC sequencing method to detect chromatin accessibility and genetic variants simultaneously within an individual cell.
  51. [51]
    Fiber-seq 101: A Multiomic Assay That Goes Beyond ATAC-seq
    Oct 13, 2025 · Fiber-seq is a powerful new long-read sequencing assay to simultaneously study genetic and epigenetic features. Read more!
  52. [52]
    REnformer, a single-cell ATAC-seq predicting model to investigate ...
    Jun 4, 2025 · Scientists have developed a new predictive model called REnformer that improves the understanding of gene regulation at the single-cell level.
  53. [53]
    DNA-Diffusion: Leveraging Generative Models for Controlling ...
    Feb 1, 2024 · To evaluate how DNA sequences generated by the diffusion model affect chromatin accessibility, we inserted them within known or putative ...
  54. [54]
    Human BioMolecular Atlas Program (HuBMAP): 3D Human ... - Nature
    Mar 13, 2025 · The Human BioMolecular Atlas Program (HuBMAP) aims to construct a comprehensive reference model of the healthy ('non-diseased') human body across all levels.
  55. [55]
    CRISPR screening identifies regulators of enhancer-mediated ...
    Feb 14, 2025 · We used CRISPR-Cas9 screening to identify transcription factors that bind to the AR enhancer and modulate enhancer-mediated AR transcription.
  56. [56]
    Resolution of a human super-enhancer by targeted genome ...
    Jan 14, 2025 · (a) A single ATAC-seq peak within the R5-2 domain, identified as the central active element of the OTX2 super-enhancer, contains consensus DNA ...
  57. [57]
    How nanopore sequencing works | Oxford Nanopore Technologies
    It is the only sequencing technology that offers real-time analysis (for rapid insights), in fully scalable formats from pocket to population scale.
  58. [58]
    Multi-omics Data Analysis and Integration Guide - Biostate.ai
    Aug 1, 2025 · The global multiomics market size is projected to jump from USD3.9 billion in 2024 to USD15.3 billion by 2034, a 14.8%CAGR. This rapid growth ...