CTCF

CCCTC-binding factor (CTCF) is a highly conserved, multifunctional protein that acts as a pivotal architectural regulator of chromatin organization and gene expression across eukaryotic genomes. Comprising 11 central zinc finger domains flanked by unstructured N- and C-terminal regions, CTCF binds to approximately 50,000 genomic sites per genome, recognizing a complex DNA motif that can lie in forward or reverse orientation, with pairs of sites often arranged convergently, through multivalent interactions. First identified in 1990 as a transcriptional repressor of the c-myc gene, CTCF is ubiquitously expressed and exhibits over 95% identity in its DNA-binding domains among vertebrates, underscoring its evolutionary importance.

In genome architecture, CTCF plays a central role in forming topologically associating domains (TADs) and chromatin loops, often in cooperation with the cohesin complex, to establish stable three-dimensional genome structures that compartmentalize regulatory elements. These loops facilitate or restrict long-range promoter-enhancer interactions, ensuring precise spatial organization of the genome within the nucleus; for instance, convergent CTCF motifs at TAD boundaries anchor loops formed by loop extrusion, with about 92% of such sites oriented accordingly. As an insulator, CTCF blocks enhancer-promoter communication, such as at the mammalian β-globin and H19-Igf2 loci, thereby preventing ectopic gene activation and maintaining domain autonomy. Additionally, CTCF functions as a chromatin barrier, separating euchromatin from heterochromatin and limiting the spread of repressive marks, which is critical for processes like X-chromosome inactivation and genomic imprinting.

Beyond structural roles, CTCF directly influences transcription as a versatile regulator, capable of both activating and repressing genes depending on context, such as at promoters or distal enhancers. Its binding activity is modulated by DNA methylation, typically favoring unmethylated CpG-rich sequences, and by interactions with partner proteins like YY1 or Oct4, which fine-tune accessibility and positioning. Approximately 50% of CTCF sites are intergenic, 35% intronic, and the remainder (about 15%) near promoters, allowing it to influence imprinting and overall gene-expression control. Dysregulation of CTCF, through mutations or altered binding, has been implicated in developmental disorders and cancers, highlighting its essentiality in health.

Gene and Protein Overview

Genomic Location and Expression

The human CTCF gene is located on the long arm of chromosome 16 at cytogenetic band 16q22.1, spanning approximately 77 kb from position 67,562,526 to 67,639,177 on the GRCh38 reference assembly. This genomic region contains 13 exons, with the majority encoding the functional protein domains. Alternative splicing of the CTCF primary transcript produces multiple isoforms, at least five of which have been annotated in humans, arising from variations in exon inclusion, particularly in the 5' and 3' untranslated regions as well as in coding sequences. The canonical isoform, represented by transcript variant 1 (NM_006565.4), encodes a 727-amino-acid protein that includes the full complement of 11 zinc finger domains essential for DNA binding. These isoforms may contribute to tissue-specific regulatory nuances, though the canonical form predominates in most cell types.

CTCF exhibits ubiquitous expression across human tissues and developmental stages, detectable as a ~4-kb mRNA transcript in analyses of various cell lines and organs. Expression levels vary, with elevated abundance observed during embryogenesis and in neural tissues such as the brain, where it supports early chromatin organization. This patterned expression is governed by multiple promoters and distal enhancers within the gene locus, which integrate developmental and environmental signals to fine-tune transcript output.

The CTCF coding sequence and protein are highly conserved among vertebrates, reflecting its fundamental role in genome organization. Notably, the 11 zinc finger domains display about 99% sequence identity between vertebrate orthologs, underscoring the evolutionary stability of DNA-binding specificity.
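As a quick check of the "approximately 77 kb" figure, the span can be computed from the two GRCh38 positions quoted above. This is a minimal sketch that assumes the coordinates are 1-based and inclusive (a common convention for reported gene positions, but an assumption here); the function name `span_length` is purely illustrative.

```python
# GRCh38 positions quoted in the text (chromosome 16).
# Assumption: the coordinates are 1-based and inclusive.
CTCF_START = 67_562_526
CTCF_END = 67_639_177


def span_length(start: int, end: int) -> int:
    """Return the length in base pairs of a 1-based, inclusive interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


length_bp = span_length(CTCF_START, CTCF_END)
print(f"{length_bp} bp = {length_bp / 1000:.1f} kb")  # 76652 bp = 76.7 kb
```

Under that convention the locus covers 76,652 bp, which rounds to the 77 kb stated in the text.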

Protein Structure and Domains

CTCF is an approximately 82 kDa protein encoded by the human CTCF gene, characterized by a modular architecture comprising an N-terminal domain (NTD), a central DNA-binding domain consisting of 11 tandem zinc fingers (ZFs), and a C-terminal domain (CTD). This tripartite structure enables CTCF to perform diverse functions in genome regulation, with each domain contributing specific biochemical properties. The NTD and CTD are intrinsically disordered regions that facilitate protein-protein interactions, while the central ZF domain provides sequence-specific DNA recognition.

The central domain features 11 C2H2-type zinc fingers, each coordinating a zinc ion via two conserved cysteine and two histidine residues to form a compact ββα fold that inserts into the DNA major groove. Zinc fingers 3–7 (ZF3-7) primarily contact the core binding motif of DNA, recognizing sequential base triplets, whereas ZF8-11 interact with upstream motifs, allowing CTCF to accommodate variable DNA sequences through combinatorial usage of these fingers. Crystallographic studies of CTCF ZF-DNA complexes, such as those in PDB entry 8SSS (capturing ZF1-7 bound to a 23 bp DNA duplex) and 8SSQ (ZF3-11 with a 35 bp DNA), illustrate the domain's extended, right-handed helical arrangement along the DNA, with ZF8 acting as a flexible spacer across the minor groove to position downstream fingers for cross-strand contacts. These structures highlight the ZF domain's versatility, as subtle residue variations in the fingers enable high-affinity binding to diverse motifs without rigid specificity.

The NTD, spanning roughly the first 200 residues, is largely unstructured but capable of self-association and dimerization, promoting CTCF multimerization that supports long-range chromatin interactions. Phosphorylation at sites such as Ser224 and at threonines including Thr289, Thr317, Thr346, and Thr374 regulates CTCF's activity by influencing its localization and binding dynamics. In contrast, the CTD engages other regulatory proteins, though its precise partners vary by context, underscoring CTCF's adaptability in genome regulation.
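The two-cysteine, two-histidine arrangement described above can be illustrated with a simple pattern search. The sketch below uses a simplified spacing rule often quoted for C2H2 fingers (C, 2 to 4 residues, C, 12 residues, H, 3 to 5 residues, H); this rule is an approximation, and the peptide `TOY_FINGER` is a made-up example, not CTCF's actual sequence.

```python
import re

# Simplified C2H2 spacing rule (an approximation, not a curated profile):
# Cys, 2-4 residues, Cys, 12 residues, His, 3-5 residues, His.
C2H2_PATTERN = re.compile(r"C.{2,4}C.{12}H.{3,5}H")

# Hypothetical toy peptide built to satisfy the rule; NOT taken from CTCF.
TOY_FINGER = "YKCPDCDKAFSRSDELTRHIRIHTGEKP"


def count_c2h2_fingers(protein: str) -> int:
    """Count non-overlapping matches of the simplified C2H2 pattern."""
    return sum(1 for _ in C2H2_PATTERN.finditer(protein.upper()))


toy_protein = TOY_FINGER * 2 + "MAAAA"  # two tandem toy fingers plus a tail
print(count_c2h2_fingers(toy_protein))  # 2
```

A real annotation pipeline would rely on curated domain profiles rather than a single regular expression; the point here is only how the conserved Cys and His positions define each finger.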

Discovery and Characterization

Initial Identification

CTCF was first identified in 1990 as a sequence-specific DNA-binding protein that interacts with three regularly spaced direct repeats of the CCCTC motif located in the silencer region of the chicken c-myc gene promoter. This nuclear factor was purified to near homogeneity from chicken oviduct nuclear extracts using sequence-specific DNA affinity chromatography, revealing a polypeptide of approximately 130 kDa that specifically recognizes the CCCTC elements and inhibits c-myc transcription, thereby acting as a repressor. Electrophoretic mobility shift assays (EMSA) demonstrated the protein's high-affinity, sequence-specific binding to these motifs, with protection from chemical cleavage confirming the footprint of interaction across the CCCTC repeats. The protein was named CCCTC-binding factor (CTCF) for its recognition of the conserved CCCTC core sequence, which was essential for tight binding and also present in analogous positions in the mouse and human c-myc promoters.

The initial cloning of CTCF cDNA was achieved from a chicken oviduct λgt11 expression library screened with a multimerized probe containing the CCCTC motifs, yielding full-length clones that encoded a protein with a predicted mass of 82 kDa and 11 zinc finger domains; the larger apparent size of about 130 kDa reflects anomalous migration in gel electrophoresis. These zinc fingers were predicted to mediate the DNA-binding specificity, consistent with the binding patterns observed in EMSA experiments, where CTCF required intact CCCTC sequences and was sensitive to methylation at CpG dinucleotides within the binding site. Functional studies using reporter assays in cells further confirmed CTCF's role as a transcriptional repressor, as its overexpression reduced c-myc promoter activity in a dose-dependent manner.

The human homolog of CTCF was identified shortly thereafter through cross-species hybridization, using the chicken CTCF cDNA as a probe to screen a human cell library, resulting in the isolation of full-length human CTCF (hCTCF) clones in 1996. The hCTCF protein, a 727-amino-acid polypeptide containing 11 zinc fingers, shares 93% overall amino acid sequence identity with its chicken counterpart and over 95% identity in the DNA-binding region, and it exhibits conserved binding specificity to both avian and mammalian c-myc silencer elements, as verified by EMSA. This sequence similarity underscored CTCF's evolutionary conservation as a regulator of c-myc expression across vertebrates. The foundational characterization of CTCF as a factor binding to CCCTC motifs paved the way for subsequent explorations of its multifaceted roles in gene regulation.

Key Experimental Advances

A landmark advancement in mapping CTCF's genomic occupancy came from a 2007 chromatin immunoprecipitation-on-chip (ChIP-chip) study by Kim et al., which identified 13,804 CTCF-binding sites across the human genome in primary fibroblasts, demonstrating its widespread role as an insulator protein and establishing the foundation for genome-wide analyses. This approach revealed that CTCF sites are enriched near transcription start sites and CpG islands, highlighting its potential to act in more than one regulatory context.

Subsequent high-throughput chromatin conformation capture techniques further elucidated CTCF's architectural functions. In 2012, Dixon et al. applied Hi-C to mouse and human cell lines, identifying topologically associating domains (TADs) whose boundaries were strongly enriched for CTCF binding, indicating CTCF's role in partitioning chromatin into stable domains. Complementing this, Nora et al. used 5C in mouse embryonic stem cells that same year to resolve finer interactions at the X-inactivation center, confirming CTCF's enrichment at TAD edges and its contribution to enhancer-promoter insulation. These studies collectively shifted the paradigm from CTCF as a simple insulator to a key organizer of three-dimensional genome structure.

Recent methodological innovations have enabled more precise manipulations and observations of CTCF dynamics. In 2023, Hyle et al. introduced the auxin-inducible degron 2 (AID2) system in human B-cell acute lymphoblastic leukemia cells, achieving rapid, near-complete CTCF degradation within 30 minutes of auxin addition, which allowed dissection of its domain-specific roles without the off-target effects seen with earlier AID1 versions. Building on this, a 2025 high-resolution footprinting analysis of CTCF MNase HiChIP data introduced the CAMEL tool to map binding at near base-pair resolution, revealing how active chromatin states, such as those with H3K27ac marks, modulate CTCF occupancy and influence cohesin-mediated loop extrusion efficiency.

Mutational studies have similarly advanced insights into CTCF's chromatin interactions. A 2025 investigation by Do et al. engineered binding-domain mutations in CTCF, including those mimicking disease-associated variants, and demonstrated through genome editing in cell lines that these alterations disrupt chromatin accessibility and looping at specific loci, underscoring CTCF's direct contributions to regulatory landscapes beyond mere DNA binding.

DNA Binding Properties

Binding Motifs and Sites

CTCF primarily recognizes a degenerate DNA consensus motif with a 14-base-pair core sequence, 5'-CCGCGNGGNGGCAG-3', where N denotes any nucleotide. This motif is bound by the central zinc fingers (3-11) of the CTCF protein, with the sequence's asymmetry enabling directional binding that influences chromatin interactions. Approximately 75-80% of identified CTCF binding sites in the human genome contain this or a highly similar motif, underscoring its prevalence in CTCF-DNA recognition.

Genome-wide, CTCF occupies 50,000 to 65,000 sites in mammalian genomes, with binding enriched at promoters, enhancers, and topological domain boundaries. Occupancy is dynamic and cell-type-specific, so reported counts vary by study and cellular context, with one estimate averaging around 66,000 bound sites per cell type. Convergent CTCF motifs, where paired binding sites are oriented toward each other (one forward and one reverse), are particularly common at loop anchors and facilitate chromatin loop formation by promoting interactions between distant genomic regions.

CTCF binding motifs are strongly conserved across vertebrates, with many orthologous sites and motif sequences preserved between species and a subset retained in more distantly related species, as evidenced by cross-species alignments showing retained CTCF occupancy at orthologous loci. DNA methylation at CpG dinucleotides within these motifs can reduce binding affinity, though this is modulated by additional factors.
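To make motif orientation concrete, the sketch below scans a sequence for the core consensus quoted above on both strands and lists convergent pairs. It is a minimal illustration: exact consensus matching is far stricter than the position weight matrices used in practice, the toy sequence is invented, and the convention that a convergent pair is a forward (+) site upstream of a reverse (-) site is an assumption of this example.

```python
import re

CONSENSUS = "CCGCGNGGNGGCAG"  # core consensus quoted in the text; N = any base
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA string over the alphabet A, C, G, T, N."""
    return seq.translate(_COMPLEMENT)[::-1]


def _to_regex(motif: str) -> re.Pattern:
    # Lookahead so that overlapping occurrences are also reported.
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")


def scan(seq: str) -> list[tuple[int, str]]:
    """Return sorted (0-based start, strand) hits for the consensus."""
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", CONSENSUS), ("-", reverse_complement(CONSENSUS))):
        hits.extend((m.start(), strand) for m in _to_regex(motif).finditer(seq))
    return sorted(hits)


def convergent_pairs(hits: list[tuple[int, str]]) -> list[tuple[int, int]]:
    """Pairs in which an upstream '+' site faces a downstream '-' site."""
    return [
        (a, b)
        for i, (a, strand_a) in enumerate(hits)
        for b, strand_b in hits[i + 1:]
        if strand_a == "+" and strand_b == "-"
    ]


# Invented toy sequence: one forward site, a spacer, one reverse site.
toy = "AT" + "CCGCGAGGTGGCAG" + "TTTTT" + "CTGCCACCTCGCGG" + "AT"
hits = scan(toy)
print(hits)                    # [(2, '+'), (21, '-')]
print(convergent_pairs(hits))  # [(2, 21)]
```

Swapping the two sites in the toy sequence would yield a divergent arrangement and an empty list of convergent pairs.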

Factors Influencing Binding

DNA methylation at CpG sites within CTCF motifs significantly reduces the protein's affinity for DNA, thereby modulating its occupancy and regulatory functions. This methylation-sensitive binding is particularly evident at imprinted loci, such as the H19/Igf2 imprinting control region (ICR), where the unmethylated maternal allele allows CTCF to enforce enhancer blocking and monoallelic expression, while paternal methylation prevents occupancy and permits Igf2 expression. Studies have shown that methylation of specific CTCF sites in the H19/Igf2 ICR abolishes binding and disrupts insulator activity, highlighting methylation as a key epigenetic switch for imprinting.

Nucleosome positioning also plays a critical role in CTCF binding dynamics, as the protein preferentially occupies sites that facilitate nucleosome displacement, thereby enhancing chromatin accessibility. CTCF binding at its motifs often repositions surrounding nucleosomes asymmetrically, creating phased arrays that promote open chromatin states conducive to further regulatory interactions. For instance, nucleosome-mapping analyses reveal that CTCF anchors position up to 20 nucleosomes around binding sites, with the core motif influencing the boundaries of the nucleosome-free region to maintain accessibility. This displacement mechanism ensures that CTCF can access DNA in nucleosome-occupied regions, particularly following DNA replication, when chromatin is reassembled.

The broader chromatin state, including histone modifications, further influences CTCF binding affinity and site occupancy. Active histone marks are associated with enhanced CTCF recruitment, as their stable presence at promoters and enhancers correlates with increased binding probability and supports chromatin organization. Conversely, recent high-resolution studies indicate that active regulatory elements marked by such modifications can impede cohesin-mediated extrusion at CTCF sites, indirectly affecting binding stability by altering local tension and accessibility. These findings underscore how chromatin states dynamically tune CTCF's architectural roles without altering core sequence preferences.

Additional factors, including cell cycle progression and sequence variations, contribute to variability in CTCF binding. CTCF occupancy exhibits cell cycle-dependent fluctuations, with increased dynamics in factor binding and nucleosome positioning during S-phase, when replication transiently reduces site accessibility before it is restored. Sequence variants within CTCF motifs can alter binding specificity, as demonstrated in 2024 enhancer-blocking assays where disruptions in motif sequences diminished occupancy and compromised barrier functions against ectopic enhancer-promoter contacts. These cell cycle and mutational effects highlight CTCF's responsiveness to both temporal and genetic contexts in maintaining precise genomic regulation.

Regulatory Functions

Transcriptional Regulation

CTCF plays a pivotal role in transcriptional regulation by acting as both an activator and repressor of gene expression, primarily through its binding to promoter regions and modulation of enhancer-promoter interactions. As a zinc finger transcription factor, CTCF influences the initiation and efficiency of transcription at specific loci by directly interacting with DNA sequences near promoters. Its regulatory effects are context-dependent, varying with the genomic location and the epigenetic modifications at binding sites.

In transcriptional repression, CTCF binds to promoter-proximal regions to inhibit transcription, as exemplified by its action on the chicken c-myc gene. Initially identified in chicken cells, CTCF specifically binds to three CCCTC motifs in the 5'-flanking sequence of the c-myc promoter, suppressing transcription and thereby regulating cell growth. This repression mechanism involves CTCF competing with or blocking access by other transcription factors to the promoter, preventing activation of the gene. Similarly, at the human IGF2 locus, CTCF represses transcription by binding near the promoter and interfering with enhancer-driven activation.

CTCF can also function as a transcriptional activator at certain promoters, enhancing gene expression through direct binding and recruitment of co-activators. For instance, CTCF binds to the GC-rich APBβ site (-93/-82) in the promoter of the amyloid precursor protein (APP) gene, promoting its transcription in neuronal cells. This activation often occurs by facilitating interactions with distant activators, thereby boosting promoter activity at specific loci.

A key example of CTCF's regulatory role is its involvement in genomic imprinting at the H19/Igf2 locus, where methylation-sensitive binding controls allele-specific expression. On the maternal chromosome, CTCF binds to the unmethylated imprinting control region (ICR) upstream of H19, repressing Igf2 transcription by blocking enhancer access to its promoter. In contrast, methylation of the paternal ICR prevents CTCF binding, allowing enhancers to activate Igf2 expression. This differential binding underscores CTCF's sensitivity to DNA methylation, which dictates its repressive function in imprinting control.
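The allele-specific logic at the H19/Igf2 locus can be written as a small truth table. The sketch below is a deliberately binary simplification of the description above (methylated ICR means no CTCF binding, and CTCF binding means the enhancers cannot reach Igf2); the class and function names are illustrative, not established terminology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allele:
    parent: str           # "maternal" or "paternal"
    icr_methylated: bool  # methylation state of the imprinting control region


def ctcf_bound(allele: Allele) -> bool:
    """Simplification: CTCF binds the ICR only when it is unmethylated."""
    return not allele.icr_methylated


def igf2_expressed(allele: Allele) -> bool:
    """Simplification: bound CTCF blocks enhancer access to the Igf2 promoter."""
    return not ctcf_bound(allele)


maternal = Allele("maternal", icr_methylated=False)
paternal = Allele("paternal", icr_methylated=True)

for allele in (maternal, paternal):
    print(f"{allele.parent}: CTCF bound={ctcf_bound(allele)}, "
          f"Igf2 expressed={igf2_expressed(allele)}")
```

Under these assumptions only the paternal allele expresses Igf2, which matches the monoallelic pattern described in the text; real loci show graded rather than all-or-none behavior.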

Insulator and Barrier Activity

CTCF functions as a chromatin insulator by binding to specific DNA sequences that prevent enhancers from inappropriately activating promoters of non-target genes, thereby maintaining the specificity of gene regulation. This enhancer-blocking activity is exemplified in the chicken β-globin locus, where the 5′HS4 insulator element, bound by CTCF, shields the locus control region (LCR) from activating unrelated genes while allowing proper erythroid-specific expression. In mammalian systems, similar insulation occurs at the H19/Igf2 imprinted locus, where CTCF binding to the imprinting control region blocks maternal Igf2 expression by preventing enhancer access.

In addition to enhancer blocking, CTCF exhibits barrier activity by protecting euchromatic regions from the invasive spread of repressive heterochromatin. A prominent example is the Tsix/Xist locus on the X chromosome, where a conserved CTCF-binding element (RS14) at the Tsix-Xist boundary acts as a barrier to prevent heterochromatin propagation from the Tsix promoter into the Xist domain, ensuring proper initiation of X-chromosome inactivation in female cells. This function is critical during development, as disruption of CTCF binding at such sites leads to aberrant silencing patterns and impaired dosage compensation.

The directionality of CTCF-mediated insulation arises from the asymmetric nature of its binding motifs, which allow CTCF to orient specifically on DNA and enforce unidirectional blocking of regulatory signals. Structural studies reveal that CTCF's zinc finger domains recognize an asymmetric core sequence, with the N-terminal fingers binding the 5′ end and the C-terminal fingers the 3′ end, enabling context-dependent insulation that favors one-way prevention of enhancer-promoter crosstalk. This asymmetry is essential for the polarity observed in insulator function, distinguishing it from bidirectional interactions.

Recent studies from 2024 have highlighted how clusters of CTCF-binding sites interposed between enhancers and promoters cooperatively insulate regulatory domains, particularly for developmental genes located near topologically associating domain (TAD) boundaries. In embryonic cells, these CTCF clusters at gene-poor TADs enhance insulation through a combination of physical barriers and promoter competition, preventing ectopic activation while permitting ordered gene expression (e.g., at the Gbx2 and Six3 loci). Furthermore, analyses of CTCF site characteristics in reporter assays demonstrate that binding strength, orientation, and nearby sequence features, rather than motif conservation alone, dictate enhancer-blocking efficacy, underscoring the nuanced role of CTCF elements in fine-tuning gene expression.
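The positional requirement for enhancer blocking, namely that CTCF sites sit between the enhancer and the promoter, can be expressed as a simple interval check. This is a toy sketch with invented coordinates; it deliberately ignores binding strength, orientation, and flanking sequence, which the text notes also shape insulation, and the `min_sites` threshold is a hypothetical knob for modelling clusters.

```python
def interposed_ctcf_sites(enhancer: int, promoter: int,
                          ctcf_sites: list[int]) -> list[int]:
    """Return CTCF site positions lying strictly between enhancer and promoter."""
    low, high = sorted((enhancer, promoter))
    return sorted(pos for pos in ctcf_sites if low < pos < high)


def is_insulated(enhancer: int, promoter: int, ctcf_sites: list[int],
                 min_sites: int = 1) -> bool:
    """Toy rule: insulated when at least `min_sites` CTCF sites are interposed."""
    return len(interposed_ctcf_sites(enhancer, promoter, ctcf_sites)) >= min_sites


# Invented coordinates (bp) for illustration only.
sites = [12_000, 48_000, 50_500, 52_000, 90_000]

print(interposed_ctcf_sites(40_000, 60_000, sites))      # [48000, 50500, 52000]
print(is_insulated(40_000, 60_000, sites))               # True
print(is_insulated(60_000, 80_000, sites))               # False
print(is_insulated(40_000, 60_000, sites, min_sites=4))  # False
```

Raising `min_sites` mimics the idea that clusters of sites insulate more robustly than a single site, although the actual relationship is quantitative rather than a hard threshold.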

Chromatin Architecture and Looping

CTCF plays a pivotal role in organizing the three-dimensional structure of the genome by defining the boundaries of topologically associating domains (TADs), which are self-interacting regions typically spanning approximately 1 Mb in mammalian cells. These domains partition the genome into insulated regulatory territories that constrain enhancer-promoter interactions and maintain stable gene expression patterns. TAD boundaries are highly enriched for CTCF binding sites, where the protein acts to restrict interactions across domain borders, thereby preventing ectopic regulatory influences.

In the loop extrusion model, cohesin complexes actively extrude chromatin loops in an ATP-dependent manner, starting from random loading sites and progressively enlarging the loop until extrusion is halted by CTCF bound to DNA in convergent orientation. This process generates chromatin loops that connect distal regulatory elements, with CTCF serving as a directional barrier that specifically impedes cohesin progression when its binding motifs face each other across the extruded loop. The convergence of CTCF motifs ensures precise loop anchoring, promoting the formation of stable higher-order structures essential for gene regulation.

Loop anchoring is facilitated by CTCF homodimerization, in which the N- and C-terminal domains enable direct CTCF-CTCF interactions at paired binding sites, stabilizing the extruded loops in cooperation with cohesin. These homodimers, often co-occupied by cohesin, form the structural basis for insulated neighborhoods within TADs, encapsulating promoters and enhancers to focus regulatory signals.

Recent studies have elucidated how chromatin states modulate extrusion dynamics, with active regulatory elements such as enhancers marked by H3K27ac impeding cohesin progression, resulting in shorter loops averaging ~140 kb compared to ~250 kb in quiescent regions. Additionally, CTCF depletion disrupts multi-way hubs involving enhancers, promoters, and other factors but preserves pairwise enhancer-promoter contacts, indicating that CTCF scaffolds cooperative 3D interactions independently of basic looping. These findings highlight CTCF's nuanced role in fine-tuning genome architecture for gene regulation.
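The directional stalling at the heart of the loop extrusion model can be sketched as a one-dimensional toy. The code below assumes a fully deterministic picture: cohesin loads at one position, its left leg moves toward lower coordinates until it meets a forward (+) CTCF site, and its right leg moves toward higher coordinates until it meets a reverse (-) site, while sites in the other orientation are passed. Perfect blocking, the orientation convention, and all coordinates are assumptions of this illustration, not measured behavior.

```python
def extrude(load_pos: int, sites: list[tuple[int, str]],
            chrom_len: int) -> tuple[int, int]:
    """Return (left_anchor, right_anchor) of the loop extruded from load_pos.

    sites: (position, strand) pairs; '+' sites block the leftward-moving leg,
    '-' sites block the rightward-moving leg (convergent anchoring).
    A leg that meets no blocking site runs to the chromosome end.
    """
    left = max((p for p, s in sites if p <= load_pos and s == "+"), default=0)
    right = min((p for p, s in sites if p >= load_pos and s == "-"),
                default=chrom_len)
    return left, right


# Invented CTCF sites (position in kb, motif strand).
sites = [(100, "+"), (300, "-"), (500, "+"), (900, "-")]

print(extrude(200, sites, chrom_len=1000))  # (100, 300)
print(extrude(400, sites, chrom_len=1000))  # (100, 900)
print(extrude(950, sites, chrom_len=1000))  # (500, 1000)
```

In this toy, loading between the sites at 300 and 500 yields a larger loop anchored by the outer convergent pair, because the legs pass sites that point away from them. Real extrusion is stochastic, and CTCF sites are only partially blocking.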

RNA Splicing Modulation

CTCF plays a pivotal role in modulating alternative splicing by influencing the co-transcriptional processing of pre-mRNA, distinct from its functions in transcriptional initiation. Through its binding to specific DNA sites near exons, CTCF affects splice site selection, promoting either exon inclusion or exclusion depending on the genomic context.

One primary mechanism involves CTCF-mediated RNA polymerase II (Pol II) pausing at intragenic sites, which slows transcriptional elongation and allows sufficient time for spliceosome assembly on weak exons. For instance, in the CD45 gene, CTCF binds unmethylated DNA at exon 5, inducing Pol II pausing that enhances exon inclusion; DNA methylation at these sites disrupts CTCF binding, leading to exon exclusion. This process links epigenetic modifications directly to splicing outcomes. Similarly, CTCF promotes the inclusion of weak upstream exons by facilitating local pausing, as demonstrated in cellular models where CTCF depletion reduces splicing efficiency.

CTCF also regulates splicing through chromatin looping that brings distal regulatory elements into proximity with splice sites. In the protocadherin (Pcdh) gene cluster, critical for neural diversity, CTCF and cohesin form loops between enhancers and alternative promoters, enabling exon choice and isoform diversity in neurons. This looping mechanism ensures precise selection among mutually exclusive exons. Another example is the Cacna1b gene, where CTCF influences the selection of mutually exclusive exons in neuronal transcripts, affecting synaptic function.

Beyond these DNA-centric roles, CTCF exhibits RNA-binding capability via distinct domains, allowing potential interactions with nascent transcripts that may stabilize splicing complexes. However, direct evidence for persistent CTCF association with mature mRNA in splicing regulation remains limited, with most effects occurring co-transcriptionally. Recent studies highlight CTCF's role in regulating global chromatin accessibility and transcription during rod photoreceptor differentiation, supporting retinal maturation.

Molecular Interactions

Protein-Protein Partnerships

CTCF engages in diverse protein-protein interactions that modulate its regulatory functions in chromatin organization and gene expression. These partnerships often involve the protein's N-terminal domain (NTD), which facilitates binding to other factors, and its central zinc-finger domain, which can influence association dynamics. Key interactions include those with the cohesin complex, transcription factors, and components of the transcriptional machinery, enabling CTCF to coordinate long-range contacts and local regulatory events.

A prominent interaction occurs between CTCF and the cohesin complex, particularly through the RAD21 subunit, which binds to the CTCF NTD. This association is crucial for stabilizing chromatin loops, as a 79-amino-acid region of the NTD positions cohesin at CTCF-bound sites to promote loop extrusion and loop formation. Depletion of RAD21 disrupts these loops, underscoring the partnership's role in maintaining higher-order structures. CTCF also forms homodimers via self-interactions between its N- and C-terminal domains, which are intrinsically disordered regions that support long-distance bridging without requiring DNA binding. These homodimers enhance CTCF's capacity for multivalent contacts in three-dimensional organization.

Additional partners include YB-1 (Y-box binding protein 1), which cooperates with CTCF to repress transcription at target loci such as c-myc. Co-expression of YB-1 and CTCF amplifies repression, with CTCF blocking YB-1's access to certain DNA sequences, thereby fine-tuning gene expression. Similarly, CTCF interacts with PARP1 (poly(ADP-ribose) polymerase 1): PARylation by PARP1 stabilizes CTCF's binding to insulators and enhances barrier activity against heterochromatin spreading, as seen at the Igf2/H19 locus. This modification is essential for CTCF's insulator function in imprinting.

CTCF also forms transient interactions with RNA polymerase II (Pol II), particularly its largest subunit RPB1, to influence co-transcriptional processes. This association recruits Pol II to CTCF-bound sites, promoting transcriptional activation or pausing depending on the context. Post-translational modification states of CTCF may modulate this interaction, allowing dynamic control over Pol II progression. These partnerships collectively enable CTCF to integrate soluble protein networks for precise genomic regulation.

Interactions with Chromatin Components

CTCF engages with nucleosome remodeling complexes to modulate chromatin structure at its binding sites, facilitating proper positioning of nucleosomes and enhancing insulator function. Specifically, CTCF interacts with chromodomain helicase DNA-binding protein 8 (CHD8), a member of the CHD family of ATP-dependent remodelers, to maintain epigenetic marks and active insulation. This interaction occurs at CTCF-bound loci such as the H19 differentially methylated region (DMR) and the beta-globin locus control region, where CHD8 recruitment by CTCF repositions nucleosomes, preventing ectopic gene activation and preserving chromatin boundaries. Depletion of CHD8 disrupts these processes, leading to loss of insulator activity and altered histone acetylation near CTCF sites. Similarly, CTCF binding influences nucleosome organization through interactions with ISWI family remodelers, including SNF2H (SMARCA5), which arrays nucleosomes adjacent to CTCF motifs to promote accessibility for regulatory factors.

In addition to remodelers, CTCF recruits histone-modifying complexes to enforce transcriptional repression and stable silencing. CTCF associates with histone deacetylases (HDACs) via the corepressor SIN3A, directing deacetylation of histones at target promoters to compact chromatin and inhibit gene expression. This mechanism is evident in the repression of genes like c-myc, where CTCF-mediated HDAC recruitment reduces histone H4 acetylation, thereby limiting transcriptional initiation. Furthermore, CTCF links to Polycomb group proteins for long-term silencing, particularly through direct binding to SUZ12, a core component of Polycomb repressive complex 2 (PRC2). This interaction at imprinting control regions, such as the IGF2/H19 locus, enables PRC2 to deposit repressive marks on the maternal allele, silencing IGF2 expression and maintaining imprinting. Disruption of the CTCF-SUZ12 interface abolishes PRC2 recruitment and reactivates silenced alleles.

CTCF also facilitates the recruitment of cohesin-loading factors to its binding sites, influencing cohesin dynamics without directly altering three-dimensional structures. Notably, CTCF promotes the activity of nipped-B homolog (NIPBL), the primary loader for cohesin complexes, at CTCF-occupied regions to support localized chromatin organization. This NIPBL-CTCF synergy enhances the efficiency of cohesin loading, contributing to stable chromatin states.

Recent studies highlight CTCF's broader role in global chromatin accessibility through synergistic actions with multiple remodelers. For instance, in developing rod photoreceptors and erythroid cells, CTCF depletion reduces chromatin openness at thousands of sites, underscoring its coordination with ATP-dependent remodelers of the CHD and ISWI families to maintain accessible domains before overt phenotypic changes appear. These findings emphasize CTCF's partnerships with enzymatic complexes in tuning chromatin landscapes for precise gene regulation.

Physiological and Pathological Roles

Roles in Development and Physiology

CTCF plays a critical role in embryogenesis, particularly in the process of X-chromosome inactivation (XCI), where it acts as an insulator at the Tsix locus to prevent the spread of repressive marks across the Tsix-Xist boundary. This boundary element between Tsix and Xist binds CTCF, ensuring that Tsix transcription represses Xist and maintains the active state of the future active X chromosome by preventing Xist-mediated silencing on that chromosome during early embryonic development. Disruption of this CTCF binding leads to improper initiation of XCI, highlighting its essential function in dosage compensation for X-linked genes in female mammals.

In tissue-specific contexts, CTCF exhibits high expression in the brain, where it regulates neural gene expression and supports neuronal development. For instance, CTCF is required for the expression of clustered protocadherin (Pcdh) genes, which are vital for neural circuit formation and individual neuronal identity. Similarly, in the retina, CTCF regulates global chromatin accessibility and transcription during rod photoreceptor differentiation, promoting the maturation of these cells, which are essential for low-light vision. These roles underscore CTCF's contribution to specialized cellular functions in neural and sensory tissues.

CTCF is indispensable for maintaining genomic imprinting, where it enforces parent-of-origin-specific gene expression at multiple loci by acting as a methylation-sensitive insulator. At imprinting control regions (ICRs), such as the one regulating Igf2 and H19, CTCF binds the unmethylated maternal allele to block enhancer-promoter interactions on the maternal chromosome, thereby silencing the maternal copy of Igf2 while permitting maternal expression of H19. This mechanism operates across various imprinted clusters, including those involved in growth and metabolism, ensuring the monoallelic expression critical for embryonic viability.

In broader physiological processes, CTCF maintains topologically associating domains (TADs) that preserve cell identity and genomic stability throughout development and adulthood. By anchoring chromatin loops and preventing ectopic interactions, CTCF ensures compartmentalized regulation that defines tissue-specific transcriptomes. Complete knockout of CTCF in mice results in embryonic lethality around E3.5 to E5.5, as the loss of TAD integrity disrupts essential early patterning and development.

Associations with Diseases and Mutations

Mutations in the CTCF gene, particularly missense variants in the zinc finger (ZF) domain, are associated with CTCF-related disorder (CRD), an autosomal dominant neurodevelopmental disorder characterized by intellectual disability, developmental delays, and additional features such as growth retardation. These variants disrupt CTCF's DNA-binding affinity, leading to altered chromatin architecture and gene dysregulation in neuronal cells. As of 2024, over 70 pathogenic CTCF variants have been cataloged, predominantly affecting ZF domains, with phenotypes ranging from mild to severe intellectual disability.

In cancer, dysregulation of CTCF often involves hypermethylation of its binding sites, which impairs insulator function and allows ectopic enhancer-promoter interactions that activate oncogenes. For instance, in gliomas, hypermethylation at CTCF sites leads to TAD boundary disruption and enhancer hijacking by oncogenes such as PDGFRA. Similarly, in breast cancer, CTCF loss or aberrant binding, influenced by DNA methylation at loci such as IGF2/H19, promotes tumor progression by altering chromatin looping and gene expression. CTCF has also been linked to global DNA methylation instability in various cancers, which exacerbates oncogenic transformation.

Beyond neurodevelopment and cancer, CTCF dysfunction contributes to male infertility through epigenetic defects in sperm. In humans, altered DNA methylation at CTCF binding sites in sperm is associated with severe defects in sperm morphology, motility, and concentration, leading to subfertility. Mouse models with CTCF knockout in germ cells exhibit impaired spermiogenesis and infertility due to disrupted chromatin organization during spermatogenesis.

In autism spectrum disorder (ASD), CTCF mutations, including loss-of-function and missense types, correlate with chromatin folding abnormalities that dysregulate neurodevelopmental genes, increasing ASD risk.

Recent studies as of 2025 show how specific CTCF mutations cause binding defects that lead to gene dysregulation in disease contexts. For example, brain- and cancer-associated mutations in the ZF DNA-binding domain impair CTCF's role as a chromatin organizer, resulting in altered 3D genome structure and transcriptional imbalances. A 2024 review emphasizes CTCF's roles outside cohesin-mediated loop extrusion, such as in DNA repair and replication, where mutations exacerbate genomic instability in pathologies like cancer and neurodevelopmental disorders.
