
FAIR data

FAIR data principles constitute a framework of guidelines designed to optimize the findability, accessibility, interoperability, and reusability of scholarly digital assets, including data and metadata, with an emphasis on enabling both human and machine utilization to advance scientific discovery and reproducibility. These principles were formally articulated in a 2016 publication in Scientific Data by an international consortium, with Mark D. Wilkinson as lead author and Barend Mons among its principal organizers, responding to longstanding challenges in data sharing and reuse within the life sciences and beyond, where empirical evidence indicated that much research data remained siloed, undocumented, or incompatible across systems. The acronym FAIR encapsulates four core facets: Findable requires unique identifiers and rich metadata; Accessible mandates retrievability via well-defined protocols, even if restricted; Interoperable demands compatibility with other data through standardized vocabularies; and Reusable stipulates detailed provenance, licensing, and community standards to support aggregation and analysis. Originating from discussions at a 2014 workshop on data stewardship, the principles have since been endorsed by major funding bodies, such as the National Institutes of Health, which integrate them into data management policies that favor documented, verifiable sharing over proprietary retention of outputs. The GO FAIR initiative, a global, grassroots effort, promotes practical implementation through implementation networks and tools, addressing causal barriers to data utility such as inconsistent formatting that historically impeded verification and extension of findings. While adoption has enhanced reproducibility in fields reliant on large datasets, such as genomics, critiques highlight implementation costs and the need for infrastructural investment, underscoring that FAIR's efficacy depends on verifiable compliance rather than declarative adherence.

Historical Context and Origins

Pre-FAIR Data Sharing Challenges

Prior to the mid-2010s, scientific research faced a reproducibility crisis exacerbated by inadequate data sharing practices, particularly in fields like psychology and biomedicine. In psychology, the Open Science Collaboration's 2015 effort to replicate 100 studies published in three leading journals in 2008 yielded significant effects in only 36% of cases, compared to 97% in the originals, highlighting systemic issues including insufficient data availability for verification and selective reporting that obscured methodological details. Similarly, in cancer biology, Amgen scientists in 2012 attempted to replicate 53 landmark preclinical studies and confirmed results in just 6 (11%), often due to inaccessible raw data, undocumented experimental variations, and reliance on proprietary or unpublished datasets that prevented independent validation. These failures underscored how fragmented data practices undermined cumulative scientific progress, as researchers could not reliably build upon prior findings without re-executing costly experiments from scratch. Data silos and proprietary formats further compounded these challenges by isolating datasets within labs or institutions, rendering them incompatible for cross-study integration or reuse. In genomics, for instance, pre-2010 datasets from initiatives like the Human Genome Project were frequently stored in non-standardized, lab-specific formats lacking comprehensive metadata, which impeded reanalysis even when sequence data was nominally shared under principles like the 1996 Bermuda accords; raw trace files and experimental conditions were often withheld or poorly documented, leading to inefficiencies where researchers duplicated sequencing efforts rather than leveraging existing resources.
This siloing fostered duplicated labor and missed opportunities for large-scale meta-analyses, as varying file types and absent persistent identifiers made data discovery and integration labor-intensive or impossible without direct author contact, which succeeded in fewer than 30% of requests according to surveys from the era. The economic toll of these pre-FAIR data management shortcomings was substantial, with irreproducibility in U.S. biomedical research alone estimated to waste approximately $28 billion annually by the mid-2010s, based on a conservative 50% irreproducibility rate applied to the roughly $56 billion spent on preclinical work each year. These costs arose causally from resource misallocation—such as redundant data generation and validation failures—stemming from opaque sharing norms that prioritized publication over accessibility, thereby inflating the financial and temporal burdens of advancing knowledge in data-intensive domains.

Formulation and Publication of FAIR Principles

The FAIR principles originated from a workshop held at the Lorentz Center in Leiden, Netherlands, in January 2014, titled "Jointly designing a data FAIRPORT," which aimed to establish minimal principles and technologies for a data publishing ecosystem to facilitate data sharing and reuse among life scientists. This event brought together approximately 30 stakeholders from academia, industry, and publishing, including bioinformaticians, data stewards, and representatives from organizations like Elsevier, to address the growing need for standardized data management amid increasing volumes of digital research outputs. Mark D. Wilkinson, a bioinformatician then at the Center for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid, played a central role in conceptualizing the principles during and following this workshop, emphasizing machine-actionable data over human-readable formats to enable automated discovery and integration in an era of exponentially growing datasets. Subsequent discussions and refinements occurred through follow-up meetings and collaborations, including precursors to the GO FAIR initiative led by figures like Barend Mons, which sought to operationalize these ideas into a broader framework for an "Internet of FAIR Data and Services." The principles were iteratively developed into 15 sub-principles under the F-A-I-R acronym, focusing on enhancing computational reusability rather than mere open access, as manual curation proved insufficient for handling vast, heterogeneous data repositories. The formal articulation of the FAIR principles appeared in the paper "The FAIR Guiding Principles for scientific data management and stewardship," published on March 15, 2016, in Scientific Data, a Nature Publishing Group journal. Authored by Mark D. Wilkinson as lead author and Barend Mons as final author, with more than 50 co-authors drawn from academia, publishing, industry, and funding bodies—including Michel Dumontier, IJsbrand Jan Aalbersberg, Philip E. Bourne, Mercè Crosas, Carole Goble, and Susanna-Assunta Sansone—the publication defined the acronym and sub-principles while underscoring their basis in the practical demands of automated data stewardship. This peer-reviewed article marked the first official dissemination, influencing subsequent policy and infrastructure developments without prescribing specific technologies.

Core FAIR Principles

Findable Principle

The Findable principle in the FAIR data guidelines ensures that digital objects and their associated metadata can be located by both humans and machines without requiring proprietary software or direct knowledge of their storage location. Originally articulated in the 2016 paper "The FAIR Guiding Principles for scientific data management and stewardship" by Wilkinson et al., this principle addresses a core challenge in data stewardship: enabling efficient discovery to facilitate downstream reuse. It comprises four interconnected sub-principles (F1–F4), which collectively mandate structured identification and description practices to support automated and manual searches across distributed repositories. F1 requires that both data objects and their metadata be assigned a globally unique and persistent identifier (PID), which remains stable over time and resolvable to the object's current location. Common examples include Digital Object Identifiers (DOIs), managed by organizations like DataCite and Crossref, which provide permanent links for datasets and resolve via the doi.org handle service, and Archival Resource Keys (ARKs), a free, decentralized identifier scheme developed at the California Digital Library with a built-in commitment to long-term access. These PIDs prevent link rot and enable unambiguous referencing, as demonstrated by their adoption in repositories like Zenodo, where over 1 million datasets had DOIs assigned by 2023. F2 mandates that data be described with rich metadata, providing sufficient detail about content, context, quality, and structure to support discovery independent of the data itself. Metadata richness is guided by domain-relevant standards, such as Dublin Core for general descriptions or schema.org for structured data, ensuring descriptions go beyond basic file names to include provenance, methodology, and key variables.
This sub-principle links to reusability (R1) by requiring metadata to be extensible and machine-actionable, allowing queries on attributes like temporal coverage or geographic scope. F3 specifies that metadata must explicitly and clearly include the PID of the data it describes, creating a direct, traceable link between descriptive information and the underlying object. This embedding prevents dissociation during data migration or repository changes; for instance, metadata records in formats like RDF triples can reference the PID as a URI, enabling automated resolution and verification. F4 demands that both data and metadata be registered or indexed in a searchable resource, with an emphasis on machine-readable formats to enable programmatic discovery over human-only interfaces. Searchable resources include domain-specific catalogs like DataCite's metadata store, which indexed over 10 million DOIs by 2022, or general aggregators supporting protocols like OAI-PMH for harvesting. Machine-readability, often via structured schemas in JSON-LD or XML, is critical for automated agents to query and retrieve without manual intervention, as human-centric searches alone fail to scale for large-scale data ecosystems. Non-compliance with F4 risks data silos, where objects exist but evade broad discovery mechanisms.
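The F1–F4 requirements above lend themselves to automated checking. The following sketch, with illustrative field names that are assumptions rather than any formal schema, tests a metadata record for a well-formed PID (F1), descriptive richness (F2), an embedded PID link (F3), and registration in an index (F4):

```python
import re

# Hypothetical minimal record; field names are illustrative, not a formal schema.
record = {
    "identifier": "10.5281/zenodo.1234567",          # F1: globally unique, persistent ID (a DOI)
    "metadata": {
        "identifier": "10.5281/zenodo.1234567",      # F3: metadata explicitly carries the data's PID
        "title": "Example ocean temperature dataset",
        "creator": "Doe, Jane",
        "description": "Monthly sea-surface temperatures, 2010-2020.",
        "subjects": ["oceanography", "climate"],     # F2: rich, descriptive attributes
    },
    "indexed_in": ["DataCite"],                      # F4: registered in a searchable resource
}

DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")       # syntactic shape of a DOI

def findability_issues(rec):
    """Return a list of F1-F4 problems found in a record (empty list = none found)."""
    issues = []
    pid = rec.get("identifier", "")
    if not DOI_PATTERN.match(pid):
        issues.append("F1: identifier is not a well-formed DOI")
    meta = rec.get("metadata", {})
    rich = [k for k in ("title", "creator", "description", "subjects") if meta.get(k)]
    if len(rich) < 4:
        issues.append("F2: metadata is missing descriptive fields")
    if meta.get("identifier") != pid:
        issues.append("F3: metadata does not embed the data's PID")
    if not rec.get("indexed_in"):
        issues.append("F4: record is not registered in any searchable index")
    return issues
```

Production FAIR-assessment tools apply the same pattern against real schemas (e.g., DataCite's), typically resolving the PID over HTTP as an additional check.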

Accessible Principle

The Accessible principle in the FAIR data framework requires that both data and associated metadata are retrievable using their unique identifiers through a standardized communications protocol, such as HTTP or equivalent machine-readable methods that enable automated access. This distinguishes accessibility from findability by focusing on the mechanics of post-discovery retrieval, ensuring that once an identifier (e.g., a DOI or Handle) locates the resource, the content can be fetched programmatically without manual intervention. For instance, protocols must support machine-actionability, allowing scripts or software agents to resolve identifiers and obtain data via APIs or direct links, rather than relying on human-readable web forms or downloads that hinder automation. Sub-principle A1.1 specifies that the protocol must be open, free, and universally implementable, prioritizing non-proprietary standards to avoid vendor lock-in and ensure broad compatibility across systems. However, A1.2 accommodates restrictions by permitting authentication and authorization procedures when required by legal, organizational, or policy constraints, such as those protecting sensitive information under regulations like the EU's General Data Protection Regulation (GDPR, effective May 25, 2018). This provision enables controlled access mechanisms, including federated authentication (e.g., OAuth or Shibboleth), which verify user credentials while maintaining retrievability for authorized parties. Balancing unrestricted open access with necessary restrictions reflects causal constraints in data stewardship, where empirical evidence shows that fully open protocols can facilitate misuse, such as re-identification of individuals from genomic data shared without controls—demonstrated in a 2018 Science study estimating that a consumer genetic-genealogy database covering roughly 1.28 million individuals could already identify about 60% of Americans of European descent through third-cousin or closer relative matches.
In such cases, unrestricted retrieval risks privacy breaches or intellectual property theft, particularly for proprietary commercial data or national security-related information, prompting FAIR implementations to adopt "as open as possible, as restricted as necessary" approaches via tiered access repositories. These trade-offs prioritize verifiable security over idealized openness, ensuring retrievability aligns with real-world legal and ethical imperatives without compromising machine-actionability for compliant users.
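The A1/A1.2 distinction—one standard retrieval protocol for everyone, with authorization gating only the restricted payload—can be sketched as follows. The record layout and credential handling are illustrative assumptions, not any repository's actual API:

```python
# Illustrative in-memory "repository"; structure and token scheme are assumptions.
records = {
    "doi:10.1234/open1": {"metadata": {"title": "Public survey"}, "data": [1, 2, 3],
                          "access": "open", "authorized": set()},
    "doi:10.1234/ctrl1": {"metadata": {"title": "Clinical cohort"}, "data": [4, 5, 6],
                          "access": "controlled", "authorized": {"token-abc"}},
}

def resolve(identifier, store, credentials=None):
    """A1 sketch: the same retrieval path serves every record. Metadata is always
    returned; restricted data additionally requires authorization (A1.2)."""
    rec = store[identifier]
    response = {"metadata": rec["metadata"]}           # metadata stays retrievable even for closed data
    if rec["access"] == "open" or credentials in rec["authorized"]:
        response["data"] = rec["data"]
    else:
        response["data"] = None                        # "as open as possible, as closed as necessary"
        response["note"] = "authorization required"
    return response
```

Note that even a denied request returns the descriptive metadata, which is what keeps controlled-access datasets findable and citable.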

Interoperable Principle

The Interoperable principle in the FAIR data framework focuses on enabling data and metadata to integrate with other datasets through standardized, machine-readable representations of meaning, distinct from legal or licensing aspects covered under Reusability. This principle addresses technical compatibility by requiring explicit semantic encoding, allowing automated systems to combine data without loss of context or requiring custom mappings. Originally articulated in 2016, it counters the inefficiencies of heterogeneous formats that fragment scientific workflows, such as duplicated efforts in data harmonization across disciplines. Sub-principle I1 mandates that (meta)data employ a formal, accessible, shared, and broadly applicable language for knowledge representation, facilitating unambiguous interpretation by software agents. Formal languages like Resource Description Framework (RDF) and Web Ontology Language (OWL) exemplify this, as they support explicit definitions of entities, relationships, and axioms, enabling inference and querying across distributed resources. For instance, RDF uses triples (subject-predicate-object) to model data as graphs, while OWL extends this with logical constructs for subclassing and property restrictions, both standardized by the World Wide Web Consortium (W3C)—RDF and OWL became Recommendations in 2004, with OWL 2 following in 2009. Adoption of such standards reduces errors in cross-system integration, as evidenced in peer-reviewed assessments where semantic graphs improved data linkage precision in environmental and health datasets by up to 30-50% compared to ad-hoc formats. Sub-principles I2 and I3 build on I1 by requiring (meta)data to leverage FAIR-compliant vocabularies where available (I2) and include qualified references to other (meta)data via globally resolvable identifiers (I3).
I2 promotes reuse of established terminologies, such as those from ontology repositories, to ensure consistency; for example, vocabularies must themselves be findable and accessible to avoid recursive interoperability failures. I3 specifies that references—beyond simple URLs—carry qualifiers like relationship types (e.g., "derives from" or "subclass of"), often using persistent identifiers like DOIs or Handles, allowing precise linkage without ambiguity. In bioinformatics, the Gene Ontology (GO), initiated in 1998 and updated through 2024, demonstrates this integration: it employs OWL-based formalisms to standardize gene product annotations across species, enabling queries that span thousands of datasets via tools like SPARQL endpoints, thus avoiding silos that historically impeded comparative genomics analyses. These sub-principles collectively enforce causal linkages through shared semantics, as proprietary or idiosyncratic encodings empirically lead to integration costs exceeding 80% of project timelines in multi-source studies, per analyses of pre-FAIR bioinformatics pipelines. By prioritizing open standards over vendor-specific ones, Interoperability fosters scalable data ecosystems, though implementation demands domain-specific adaptations to balance expressivity with computational tractability.
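The triple model described above can be illustrated without any RDF library. This sketch stores subject-predicate-object triples as tuples (the URIs are made up for illustration; `obo:GO_0008150` is a real Gene Ontology term) and answers SPARQL-flavoured pattern queries where `None` plays the role of a variable:

```python
# Triples as (subject, predicate, object); the ex: identifiers are illustrative.
triples = [
    ("ex:datasetA", "prov:wasDerivedFrom", "ex:datasetB"),   # I3: qualified reference ("derives from")
    ("ex:datasetA", "dcterms:conformsTo", "ex:mmCIF"),
    ("ex:datasetB", "dcterms:subject", "obo:GO_0008150"),    # I2: term from a shared vocabulary (GO)
]

def match(pattern, store):
    """Minimal SPARQL-flavoured lookup: None in the pattern acts as a variable."""
    s, p, o = pattern
    return [t for t in store
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]
```

For example, `match((None, "prov:wasDerivedFrom", None), triples)` recovers every derivation link regardless of which datasets are involved—exactly the kind of cross-dataset query that qualified references make possible.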

Reusable Principle

The Reusable principle in the FAIR guidelines prioritizes conditions that enable data to be ethically and legally repurposed across contexts, maximizing its long-term scientific value through explicit documentation of origins, permissions, and standards, independent of syntactic or semantic integration addressed elsewhere. This distinguishes reusability from mere accessibility by embedding safeguards against misuse while facilitating verifiable replication and extension of findings. The principle encompasses sub-guidelines R1 through R1.3, originally articulated to counteract silos in data stewardship that hinder cumulative knowledge building. R1 mandates that both data and metadata include a diverse set of precise, pertinent attributes to fully characterize their content and context, supporting informed decision-making for reuse; for example, attributes might detail experimental conditions, measurement uncertainties, or analytical methods to prevent misinterpretation. R1.1 specifies release under a clear, accessible usage license, such as Creative Commons Attribution 4.0 (CC-BY), which explicitly allows derivative works with proper credit, thereby resolving ambiguities in intellectual property that often impede reuse; without such licenses, data risk remaining proprietary or undefined, limiting their integration into meta-analyses or machine learning applications. R1.2 requires detailed provenance records, tracing the data's creation, modifications, and quality controls—such as software versions used or personnel involved—to enable reliability assessments and causal inferences in downstream analyses, as incomplete histories can propagate errors in replicated studies. 
R1.3 demands adherence to domain-specific community standards, typically implemented via deposition in vetted repositories that enforce these norms; compliance here correlates with elevated citation counts for source publications, as evidenced by repository analyses showing licensed, provenance-rich datasets garner 1.5 to 2 times more citations than undocumented equivalents, attributing this to facilitated verification and secondary discoveries.
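The provenance records required by R1.2 form chains that downstream users can walk to assess reliability. A minimal sketch, with field names that are assumptions rather than a formal PROV serialization:

```python
# Illustrative provenance records keyed by PID; "derived_from" links each version
# to its source, alongside R1.1-style license and processing details.
provenance = {
    "doi:10.1234/c": {"derived_from": "doi:10.1234/b", "software": "pipeline v2.1", "license": "CC-BY-4.0"},
    "doi:10.1234/b": {"derived_from": "doi:10.1234/a", "software": "pipeline v2.0", "license": "CC-BY-4.0"},
    "doi:10.1234/a": {"derived_from": None, "software": "sequencer export", "license": "CC-BY-4.0"},
}

def lineage(pid, records):
    """Walk R1.2-style provenance links back to the original source dataset."""
    chain = []
    while pid is not None:
        chain.append(pid)
        pid = records[pid]["derived_from"]
    return chain
```

Tracing `lineage("doi:10.1234/c", provenance)` yields the full derivation history, which is what allows an error found in the original export to be propagated to every derived product.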

Implementation Approaches

Metadata and Identifier Standards

Metadata and identifier standards form foundational technical enablers for FAIR data compliance by providing structured, machine-readable descriptions and unique, resolvable references that facilitate automated discovery and linkage. These standards address FAIR's findability requirements, particularly F1 (assigning globally unique and persistent identifiers) and F2 (describing data with rich metadata), by minimizing ambiguity in data referencing and enabling programmatic retrieval, which causally reduces search friction through standardized indexing and resolution mechanisms. The DataCite metadata schema, developed for digital object identifiers (DOIs), specifies core properties such as resource type, title, creator, publisher, and publication year, with mandatory elements ensuring consistent citation and retrieval. Introduced in its early versions around 2010 and iteratively updated—reaching version 4.5 by 2023—to incorporate FAIR-aligned extensions like related identifiers and funding information, the schema supports interoperability by allowing embedding of persistent IDs from other systems. This structure enables resolution via the underlying Handle System, a persistent identifier infrastructure that assigns globally unique handles (e.g., 10.1234/example) resolvable over HTTP, preventing link rot and supporting long-term data location independent of hosting changes. Author and contributor identification benefits from ORCID, a persistent identifier system launched in 2012 that assigns unique 16-digit codes (e.g., 0000-0002-1825-0097) to researchers, integrable into metadata schemas for unambiguous attribution. ORCID records link to works, datasets, and affiliations, enhancing metadata richness when included in DataCite or repository submissions, thereby aiding machine-actionable queries across ecosystems.
A concrete example of FAIR-style machine-readable attribution is the AI-based Digital Author Persona Angela Bogdanova, which holds ORCID iD 0009-0002-6030-5730. This persona's identity schema is archived on Zenodo with DOI 10.5281/zenodo.15732480 in JSON-LD format, demonstrating structured, interoperable metadata for automated recognition and linkage in scholarly ecosystems. Semantic metadata standards like RDF (Resource Description Framework) enable FAIR interoperability by representing data descriptions as triples (subject-predicate-object) in a graph-based format, allowing formal vocabularies such as DCAT for dataset catalogs or domain-specific extensions. RDF's use in FAIR Data Points, for instance, structures metadata retrieval via standardized APIs, preserving contextual meaning during cross-system exchanges. Complementary schemas include Dublin Core, a 15-element set (e.g., creator, date, format) for basic resource description, widely embedded in repositories to provide cross-disciplinary findability without domain specificity. Schema.org extends this for web-embedded structured data, using types like Dataset and properties like distribution to expose metadata via JSON-LD, facilitating search engine indexing and linkage to RDF triples. In life sciences, infrastructures like ELIXIR leverage these standards through tools such as the RDMkit, which guides selection of metadata profiles (e.g., MIABIS for biobanks or EDAM for ontologies) aligned with DataCite and RDF, enabling federated querying of genomic and proteomic datasets by standardizing descriptions of experimental provenance and formats. This integration causally lowers discovery barriers, as evidenced by ELIXIR's platforms resolving PIDs to metadata that detail data lineage, allowing automated aggregation across nodes without manual reconciliation.
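A schema.org Dataset description of the kind mentioned above is plain JSON-LD that a landing page embeds for search-engine indexing. The sketch below builds one (all values are illustrative) and serializes it with the standard library:

```python
import json

# Minimal schema.org Dataset in JSON-LD; names, DOI, and URLs are illustrative.
dataset_jsonld = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example soil-moisture measurements",
    "identifier": "https://doi.org/10.5281/zenodo.1234567",   # F1-style persistent ID
    "creator": {
        "@type": "Person",
        "name": "Jane Doe",
        "@id": "https://orcid.org/0000-0002-1825-0097",       # ORCID iD as a resolvable @id
    },
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "text/csv",
        "contentUrl": "https://example.org/data/soil.csv",
    },
}

# String a landing page would embed inside <script type="application/ld+json">.
embedded = json.dumps(dataset_jsonld, indent=2)
```

Because the `@id` values are URIs, the same record doubles as RDF: each property becomes a triple, which is how JSON-LD bridges web indexing and graph-based integration.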

Data Repositories and Infrastructure

Zenodo, a multidisciplinary open repository launched by CERN in May 2013, supports FAIR principles by assigning digital object identifiers (DOIs) to all deposited items, enabling findability, and providing open protocols for access and metadata harvesting. It integrates with services like DataCite for persistent identifiers and emphasizes machine-readable metadata to facilitate interoperability and reuse across research domains. Figshare, founded in 2011 as a generalist repository, aligns with FAIR by minting DOIs for datasets, enforcing open access where possible, and supporting standard vocabularies for metadata to ensure interoperability with other systems. Its infrastructure includes preview tools and API endpoints for automated retrieval, promoting reusability through clear licensing options like Creative Commons. Domain-specific repositories exemplify tailored FAIR infrastructure; the Protein Data Bank (PDB), operational since 1971 under the wwPDB consortium, assigns unique accession codes as persistent identifiers and exposes data via RESTful APIs for machine-actionable access. PDB structures are stored in standardized formats like mmCIF, supporting interoperability, with validation reports enhancing reusability for downstream analyses in structural biology. Cloud-based architectures have scaled FAIR repositories by decoupling storage from on-premises hardware, using services such as object storage buckets with versioning and replication for durability exceeding 99.999999999% over a year. These platforms automate compliance checks and integrate with containerized workflows, reducing latency for global access while handling petabyte-scale datasets. In the 2020s, federated systems have gained traction for enhancing interoperability without data centralization, linking distributed repositories through protocols like OAuth for authentication and schema.org for semantic alignment. 
This approach preserves institutional control while enabling cross-querying, as seen in initiatives combining generalist and domain repositories into virtual data spaces. Supporting these systems requires robust hardware, such as redundant disk arrays with erasure coding for durability, and software stacks including indexing engines (e.g., Elasticsearch) for query efficiency; annual storage costs typically comprise a small but recurring fraction of repository operations, often 1-5% of total data management budgets depending on access patterns and redundancy levels. Scalability challenges include balancing cost with performance, prompting hybrid models that tier hot data in SSDs against archival cold storage.
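The durability figures quoted for erasure-coded object storage can be sanity-checked with a back-of-the-envelope binomial model: data is lost only if more shards fail than the code can tolerate before repair. The 10-of-14 layout and the 1% per-repair-window failure probability below are assumptions for illustration; real durability models also account for repair rates and correlated failures:

```python
from math import comb

def loss_probability(n, k, p):
    """Probability that more than n-k of n shards fail (data unrecoverable),
    assuming independent shard failures with probability p per repair window."""
    tolerable = n - k   # a k-of-n erasure code survives up to n-k shard losses
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(tolerable + 1, n + 1))

# Reed-Solomon-style 10-of-14 layout, assumed 1% shard-failure chance per window:
risk = loss_probability(14, 10, 0.01)
```

Even with a pessimistic 1% per-window shard failure rate, losing five of fourteen shards simultaneously is on the order of one in ten million per window, which is how modest per-disk reliability compounds into the many-nines durability that providers advertise.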

Policy and Guideline Integration

Data management plans (DMPs) integrate FAIR principles by requiring researchers to outline strategies for making data findable through persistent identifiers and rich metadata, accessible under defined conditions, interoperable via standardized formats, and reusable with clear licenses and provenance documentation. In the European Commission's Horizon Europe framework, DMP templates updated following the 2016 FAIR principles explicitly mandate alignment with these elements, including sections on data preservation, allocation of resources for compliance, and handling of non-data outputs like software to extend FAIR practices. These templates enforce a structured workflow from data generation to long-term stewardship, ensuring policies link deposition practices directly to downstream usability without relying solely on self-reported adherence. Journal policies further embed FAIR through mandatory data availability statements and repository deposits. PLOS journals, since implementing their data policy in 2014 with expansions by 2019, require all underlying data for study replication to be publicly available at publication, often in FAIR-compliant repositories, to facilitate verification and reuse. Similarly, Nature Portfolio journals, through endorsements of the Enabling FAIR Data initiative starting around 2017, stipulate deposition of data, materials, and code in community repositories with metadata standards that support FAIR criteria, rejecting submissions lacking such provisions where feasible. These requirements create enforceable checkpoints in the publication pipeline, prioritizing machine-readable metadata over vague accessibility promises. Institutional research data management (RDM) services incorporate FAIR guidelines by offering tailored protocols, training, and auditing tools to align local workflows with global standards. 
Universities and research centers, such as those developing RDM strategies post-2020, provide templates and consulting to verify compliance during data curation, including checks for DOI assignment and schema adherence. Such services bridge policy intent with practice by mandating verifiable steps—like metadata validation scripts—that sustain the causal pathway from initial deposit to effective reuse, addressing gaps where unverified claims of FAIRness could undermine data utility.

Adoption and Global Reach

Academic and Scientific Community Uptake

In the academic and scientific community, awareness of the FAIR principles has grown substantially since their formalization in 2016, with surveys indicating levels exceeding 90% among specialized groups such as international research software experts by 2025. However, full compliance lags behind, with assessments of datasets and projects typically finding rates between 10% and 50%, reflecting challenges in translating principles into routine practice amid varying disciplinary needs and resource constraints. Grassroots adoption has been driven by community-led tools and assessments, particularly in quantitative fields where structured data facilitates implementation. In ecology and environmental sciences, uptake has advanced through initiatives like DataONE, which provides metadata assessment tools aligned with FAIR criteria to evaluate findability, accessibility, interoperability, and reusability in repository networks. These efforts have supported incremental improvements in data packaging and semantics, though comprehensive compliance remains uneven due to legacy datasets and heterogeneous observational data. Bioinformatics represents a stronger case of integration, with ELIXIR issuing a formal position on FAIR data management in 2017 and embedding principles into its infrastructure for life sciences resources, enabling standardized workflows across European nodes. Social sciences exhibit slower grassroots progress, particularly for qualitative data, where FAIR application encounters barriers like ethical sensitivities, privacy protections, and the non-standardized nature of interviews or narratives, often prioritizing CARE principles (Collective benefit, Authority to control, Responsibility, Ethics) over strict FAIR metrics.
Citation analyses provide evidence of emerging cultural shifts, demonstrating that datasets adhering to FAIR-like sharing practices—such as deposition in repositories with persistent identifiers—yield 9-25% higher citation rates compared to non-shared counterparts, incentivizing reuse through demonstrable impact. This linkage underscores a gradual reorientation toward data stewardship as a core research norm, though persistent gaps in training and incentives hinder broader compliance.

Funding Agency Mandates

The National Institutes of Health (NIH) implemented its Data Management and Sharing (DMS) Policy on January 25, 2023, requiring all new and competing grant applications or renewals that generate scientific data to include a DMS plan outlining how data will be managed, preserved, and shared to maximize its value for reuse. This policy builds upon the 2016 FAIR principles by encouraging practices that align data management with findability, accessibility, interoperability, and reusability, though it does not mandate explicit FAIR compliance in plans. Enforcement involves NIH institutes reviewing DMS plans during peer review and just-in-time submissions, with non-compliance potentially affecting funding decisions, though implementation has revealed inconsistencies across institutes due to varying specific guidance. In the European Union, the Horizon Europe program (2021–2027) mandates a Data Management Plan (DMP) for projects generating or reusing research data, delivered within six months of grant agreement, with explicit emphasis on adhering to FAIR principles to ensure data is findable, accessible, interoperable, and reusable where possible under an "as open as possible, as closed as necessary" framework. This evolved from Horizon 2020 (2014–2020), where open access mandates incorporated FAIR guidelines starting in 2016 via European Commission documents promoting data stewardship. The European Research Council (ERC), operating under these frameworks, has embraced FAIR since its 2018 Open Research Data policy, requiring principal investigators to manage data in line with these principles and justify any restrictions, with compliance monitored through progress reports and audits. These mandates have demonstrably increased data sharing rates by institutionalizing planning requirements, as evidenced by analyses of U.S. 
federal policies showing improved practices post-implementation compared to voluntary eras, though enforcement remains variable due to resource constraints and institute-specific interpretations. However, they have also elevated administrative burdens, with reports estimating significant additional costs for DMS plan development, repository selection, and compliance monitoring under the NIH policy, often diverting funds from research activities without proportional allocation for implementation. Studies on similar prior mandates indicate that while non-compliance decreases with formal requirements, persistent gaps in data recovery and reuse persist, attributed to inadequate training and infrastructure support rather than policy design flaws.

International Initiatives and Organizations

The GO FAIR initiative, launched in 2017, promotes the implementation of FAIR data principles through bottom-up, community-driven Implementation Networks that develop domain-specific strategies for data stewardship and interoperability. The GO FAIR Foundation, established in February 2018 under Dutch law, coordinates these efforts globally, including national offices like GO FAIR US, founded in 2019, to standardize practices without relying on top-down mandates. These networks address cross-border data sharing by creating guidelines for metrics and tools that ensure consistent FAIR application across jurisdictions.

The Research Data Alliance (RDA), an international organization founded in 2013, has advanced FAIR standardization since the principles' introduction in 2016 through dedicated working groups, such as the FAIR Data Maturity Model WG launched in 2019, which defines assessment criteria for FAIR compliance to harmonize evaluation across datasets. RDA's outputs, including recommendations on FAIR metrics, facilitate global collaboration by producing reusable outputs from over 100 working and interest groups, emphasizing practical interoperability challenges like varying legal frameworks in data exchange.

CODATA, the ISC Committee on Data, leads initiatives like the FAIR Data for Disaster Risk Research Task Group (2023–2025), which convenes workshops to align protocols for disaster-related data across international boundaries, highlighting interoperability barriers such as inconsistent metadata standards in multi-country risk assessments. Complementary efforts include WorldFAIR, a 2022–2024 project involving partners from 13 countries, which develops cross-domain FAIR implementation strategies to overcome regional disparities in data policy enforcement. These organizations collectively prioritize evidence-based standardization, often collaborating under frameworks like Data Together to mitigate issues like fragmented governance in global data ecosystems.

Criticisms and Limitations

Practical Barriers to Compliance

One major practical barrier involves data fragmentation across legacy systems, where research outputs are often siloed in disparate formats and tools incompatible with modern standards. Implementation studies in laboratory environments highlight how non-standardized storage and proprietary legacy infrastructure prevent seamless interoperability; fragmented IT ecosystems exacerbate the problem, with approximately 70% of surveyed biopharma organizations reporting struggles with data silos.

Resource demands further impede compliance, particularly the intensive labor required for metadata curation and documentation to achieve findability and reusability. Data curators typically expend an average of 63 hours per dataset on these tasks, equivalent to about 75% of total curation time, focused on FAIR-aligned enhancements like persistent identifiers and rich provenance records. In pharmaceutical R&D, these efforts can consume 10–20% of project budgets when factoring in personnel and infrastructure upgrades, while persistent skills gaps in "FAIRness literacy" necessitate ongoing training programs that many institutions lack funding for.

Legal and regulatory constraints, especially around sensitive data, create additional hurdles to accessibility. In the European Union, the General Data Protection Regulation (GDPR), effective since May 2018, mandates stringent controls on personal health data processing, often clashing with FAIR's emphasis on open access by requiring anonymization, consent, or restricted sharing that undermines reusability for secondary research. For instance, biomedical datasets involving human subjects frequently cannot be fully FAIR-compliant without breaching GDPR's proportionality principle, leading to hybrid models with gated repositories that reduce overall compliance rates in EU-funded projects.

Conceptual and Definitional Shortcomings

The FAIR principles' emphasis on reusability lacks explicit metrics for assessing data quality, rendering the concept conceptually vague and insufficient for ensuring meaningful reuse. While R1.2 requires detailed provenance, the principles provide no standardized criteria for evaluating accuracy, completeness, or reliability, allowing low-quality datasets to be deemed "FAIR" if technically compliant. This definitional shortfall was highlighted in early critiques, which argued that FAIR's guidelines are necessary but not sufficient without supplementary measures for quality control and bias mitigation.

A core ambiguity arises in the balance between machine-actionability, which is central to FAIR's interoperability and findability, and human-centric needs such as contextual understanding and usability for non-experts. The principles prioritize automated processing over human-readable documentation or incentives for sharing, potentially overlooking cultural and motivational barriers that hinder data stewardship beyond technical fixes. Analyses from 2018 noted that varying interpretations across disciplines exacerbate this vagueness, as repositories often struggle with imprecise facets of the principles, undermining their applicability to open data sufficiency.

Without robust integration of data quality and provenance, FAIR risks enabling "garbage in, garbage out" dynamics, where technically findable and accessible but flawed data propagates errors or biases in downstream analyses. Critics contend this omission fails to prevent misuse, as the principles do not mandate validation mechanisms, leaving reusers to bear the burden of unverified trustworthiness. Such conceptual gaps highlight FAIR's technical orientation, which, while advancing machine interoperability, inadequately addresses causal factors, such as inherent data flaws, on which reliable scientific inference depends.
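The quality-metrics gap can be made concrete with a small sketch: a dataset can satisfy FAIR's structural requirements (identifier, metadata, license) and still fail elementary completeness checks that the principles never mandate. The field names and records below are illustrative assumptions, not part of any FAIR specification.

```python
# Sketch of a dataset-quality check that FAIR compliance does not require:
# a record can carry rich metadata and a persistent identifier yet still
# contain missing values. All field names and rows here are illustrative.
records = [
    {"sample_id": "S1", "measurement": 0.42, "unit": "mg/L"},
    {"sample_id": "S2", "measurement": None, "unit": "mg/L"},  # missing value
    {"sample_id": "S3", "measurement": 0.55, "unit": None},    # missing unit
]

def completeness(rows, fields):
    """Fraction of non-missing cells across the given fields."""
    total = len(rows) * len(fields)
    filled = sum(1 for r in rows for f in fields if r.get(f) is not None)
    return filled / total

score = completeness(records, ["sample_id", "measurement", "unit"])
print(f"completeness: {score:.2f}")  # 7 of 9 cells filled -> 0.78
```

A quality gate of this kind sits outside FAIR: the principles would accept all three records above as "reusable" so long as their metadata and licensing were in order.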

Specific Controversies in Development

In 2024, Christian Taswell published analyses alleging plagiarism in the seminal 2016 paper by Wilkinson et al. introducing the FAIR principles, claiming uncredited reuse of text from prior works without proper attribution, which he described as "flagrant plagiarism." Taswell's case study highlighted specific passages in the Scientific Data article that mirrored earlier publications, arguing this undermined the integrity of the FAIR framework's foundational document. Despite complaints submitted to the publishers involved, including Springer Nature, the paper has not been retracted as of October 2025, raising questions about accountability in research publishing for high-profile data stewardship guidelines.

Debates have also emerged over FAIR's divergence from open data paradigms, particularly its accommodation of access restrictions, which critics argue compromises transparency objectives central to unrestricted data sharing. Unlike open data mandates requiring universal availability without barriers, FAIR principles explicitly permit licensed or authenticated access, as noted in foundational documents, which some 2022 comparative analyses contend dilutes the goal of maximal discoverability and reuse by allowing proprietary controls. Proponents of stricter openness, drawing from open science traditions, have argued that this flexibility enables selective withholding, potentially prioritizing institutional or commercial interests over public verifiability, though FAIR advocates maintain it balances ethical and legal constraints without mandating openness.

Early adoption discussions in industry sectors, such as pharmacology, revealed resistance stemming from intellectual property concerns, with participants in 2017–2018 workshops expressing fears that FAIR-compliant sharing could expose competitively sensitive datasets to rivals. Pharmaceutical stakeholders highlighted risks of unintended data leakage through interoperability requirements, prompting calls for mechanisms to safeguard proprietary information while pursuing findability, as documented in implementation reviews from that period. These tensions underscored a broader controversy in FAIR's development: reconciling data reusability with commercial imperatives, where industry feedback influenced guidelines to emphasize metadata documentation of restrictions rather than elimination of barriers.

Measured Impacts and Evidence

Enhancements to Reproducibility and Efficiency

The FAIR principles enhance research reproducibility by embedding mechanisms for data validation and independent verification, such as persistent identifiers and detailed metadata that describe provenance, context, and usage constraints. This reduces common barriers like inaccessible or undocumented datasets, which often hinder replication; for example, in big data science workflows, FAIR-compliant practices have enabled successful replication by external users through machine-actionable descriptions, as demonstrated in controlled studies where participants replicated analyses more reliably when data adhered to FAIR guidelines. In genomics, standardized FAIR metadata supports accurate data reuse for reproducing experimental outcomes, mitigating errors from incomplete documentation that previously plagued large-scale projects.

Efficiency gains arise from interoperability and reusability, which automate data integration across tools and reduce manual reformatting efforts. In AI and machine learning applications, FAIR data enables seamless ingestion into training pipelines, accelerating model development by minimizing preprocessing overhead; practical implementations, such as FAIR-oriented workflows for predictive modeling, have facilitated quicker iterations and broader reuse of trained models across research teams. Similarly, in computational workflows, FAIR principles support automated execution and chaining of processes, as seen in GIScience systems where standardized data interfaces cut down on custom scripting for interoperability.

These enhancements yield measurable economic returns by lowering the opportunity costs of inefficient data handling. Analyses indicate that non-FAIR research data imposes at least €10.2 billion in annual losses on the European economy through duplicated efforts and delayed insights, suggesting that FAIR compliance generates efficiencies equivalent to avoiding such waste. In biopharmaceutical R&D, FAIR data has boosted productivity by enabling AI-driven analytics on reused datasets, streamlining discovery pipelines that previously suffered from siloed information.
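The machine-actionable metadata these gains rest on can be sketched as a minimal schema.org-style Dataset record in JSON-LD, with a check for a few FAIR-relevant fields. The identifiers and the chosen field set below are hypothetical assumptions for illustration, not a normative metadata profile.

```python
import json

# A minimal schema.org-style Dataset record in JSON-LD, sketching the kind of
# machine-actionable metadata FAIR relies on (identifier, provenance, licence).
# Every concrete value, including the DOI, is a hypothetical placeholder.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "identifier": "https://doi.org/10.5281/zenodo.0000000",  # hypothetical DOI
    "name": "Example genomics variant calls",
    "description": "Variant calls from a hypothetical replication study.",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {"@type": "Organization", "name": "Example Lab"},
    "isBasedOn": "https://doi.org/10.1000/original.study",  # provenance link
}

# Illustrative minimal field set; real FAIR assessments use richer checklists.
REQUIRED = ("identifier", "name", "description", "license", "creator")

def missing_fields(md: dict) -> list:
    """Return the FAIR-relevant fields absent from a metadata record."""
    return [f for f in REQUIRED if f not in md]

print(json.dumps(record, indent=2))
print("missing fields:", missing_fields(record))  # -> []
```

Because the record is plain JSON-LD, harvesters and search indexes can process it without human intervention, which is the machine-actionability the paragraph above describes.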

Empirical Studies on FAIR Outcomes

Assessments using the Research Data Alliance (RDA) FAIR Data Maturity Model, which defines 40 indicators across four maturity levels for evaluating compliance with FAIR principles, have consistently shown suboptimal global adherence among research repositories. Applications of the model to diverse datasets and infrastructures, such as those in the FAIRsFAIR project, reveal average scores where findability (F) and accessibility (A) indicators often reach higher maturity levels (e.g., level 3 or 4, indicating planned or satisfied criteria), but interoperability (I) and reusability (R) frequently remain at lower levels (e.g., level 1 or 2, denoting non-compliance or awareness only). For instance, automated assessments of European research data objects in 2021 yielded compliance rates below 60% for full FAIR maturity in many cases, highlighting persistent gaps in metadata richness and licensing clarity that hinder downstream utility.

Quantitative metrics on data reuse provide further evidence of FAIR outcomes. Analyses of publication-associated datasets indicate that those meeting FAIR criteria, particularly through persistent identifiers and structured metadata, experience elevated citation rates. A 2022 examination of PLOS ONE articles found that data with availability statements and DOIs, key FAIR enablers, correlated with sustained accessibility over time, facilitating reuse and contributing to higher overall impact metrics compared to non-compliant counterparts. Similarly, broader reviews link FAIR-aligned sharing practices to citation premiums, with compliant datasets garnering approximately 1.5 times more citations in fields like biomedicine, as tracked via data citation indices.

In domain-specific contexts, such as heritage science, empirical evaluations underscore FAIR's role in enhancing linked data utility. A 2025 study surveying practices across heritage datasets applied FAIR indicators to develop standardized vocabularies and workflows, resulting in demonstrably improved interoperability for linked resources; post-implementation analyses showed increased cross-dataset linkages and query efficiency, with reusable artifacts exhibiting 20–30% higher integration rates in collaborative research pipelines. These findings affirm that targeted FAIR maturation yields measurable gains in data linkage and evidential leverage within interdisciplinary applications.
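The maturity-level pattern reported above (higher F and A, lower I and R) can be sketched as a toy aggregation in the style of the RDA model. The indicator names, the 0–4 level scale, and the unweighted averaging below are simplifying assumptions; the actual model also weights indicators by priority (essential, important, useful).

```python
from statistics import mean

# Toy aggregation of per-indicator maturity levels, loosely in the style of
# the RDA FAIR Data Maturity Model: indicators are grouped by principle area
# and scored on a 0-4 level scale. Indicator names and the unweighted average
# are illustrative assumptions, not the model's normative method.
scores = {
    "F": {"PID assigned": 4, "rich metadata": 3},
    "A": {"standard access protocol": 4, "protocol openly documented": 3},
    "I": {"formal knowledge representation": 1, "FAIR vocabularies": 2},
    "R": {"licence present": 2, "provenance documented": 1},
}

def area_maturity(by_area: dict) -> dict:
    """Average the indicator levels within each FAIR principle area."""
    return {area: mean(levels.values()) for area, levels in by_area.items()}

print(area_maturity(scores))
# F and A average higher than I and R, mirroring the pattern reported above.
```

Even this crude average makes the reported asymmetry visible; real assessment tools refine it with weighted, prioritized indicators rather than a flat mean.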

Unintended Consequences and Costs

Implementation of FAIR principles has introduced notable administrative burdens, particularly through requirements for data management plans (DMPs) and sharing policies aligned with findability and accessibility. A 2023 survey by the Council on Governmental Relations (COGR) assessing compliance costs for the NIH Data Management and Sharing (DMS) policy, which incorporates FAIR-like standards, found significant institutional impacts, including the need for additional staff, reallocation of existing personnel, and new training programs across pre-award proposal preparation and post-award execution phases. These overheads divert resources from core research activities, with respondents reporting elevated effort in metadata creation and repository selection, exacerbating workload for principal investigators and support staff without commensurate funding adjustments.

Tensions between FAIR's reusability mandates and privacy or intellectual property (IP) protections have led to over-restriction of data, hindering innovation. Institutions often impose stringent access controls to mitigate risks of sensitive information exposure or IP infringement, resulting in datasets that are technically compliant but practically unusable for secondary analyses. For example, reconciling open accessibility with legal obligations under privacy regulations like GDPR or commercial IP interests can prompt selective embargoing or redaction, which critics argue fragments knowledge ecosystems and discourages cross-disciplinary collaboration. Such practices have been documented in reviews of restricted-access repositories, where initial FAIR intentions yield siloed data that stifles downstream applications.

Despite widespread adoption, FAIR compliance has not proportionally increased data reuse rates, revealing a disconnect between procedural adherence and practical utility. Studies indicate that shared datasets often suffer from inadequate documentation, heterogeneity, and quality issues like missing values or errors, leading to low secondary utilization even when metadata standards are met. In domains such as plant phenomics, FAIR-aligned efforts have produced heterogeneous outputs that remain difficult to repurpose, prompting critiques that the principles address hypothetical rather than empirical barriers to reuse. This gap suggests a potential overemphasis on formal compliance metrics, where resources invested in FAIR tooling yield marginal gains in actual scientific productivity, as evidenced by persistently low reuse frequencies in clinical and basic research repositories.

Extensions and Future Directions

Application to Non-Data Assets

The FAIR principles, originally formulated for data, have been extended to research software to address the unique challenges of code as a foundational element in scientific workflows, where software often precedes and enables data generation or analysis. The FAIR for Research Software (FAIR4RS) principles, developed through a working group involving over 200 stakeholders from 2020 to 2022, adapt the core tenets to emphasize software-specific aspects such as versioning, dependency management, and licensing clarity. These principles include guidelines for making software findable via persistent identifiers like DOIs, accessible through repositories with clear usage instructions, interoperable by documenting APIs and formats, and reusable via detailed provenance and citation mechanisms. FAIR4RS emerged from subgroups active between July 2020 and March 2021, with formal principles published in March 2022 and detailed in a Scientific Data article in October 2022.

The Research Data Alliance (RDA) has supported implementation through community sessions, including a 2023 plenary focused on metrics for assessing FAIR4RS compliance, promoting normalization across disciplines like computational biology and physics. For instance, code versioning via tools like Git enables findability and reusability by tracking changes and dependencies, while lockfiles and containerization (e.g., Docker) mitigate issues like environment drift that undermine reproducibility in 70–90% of reported computational studies. Platforms integrating with GitHub, such as Zenodo, assign DOIs to repositories, facilitating citation and archival; this has demonstrably enhanced reproducibility by enabling exact recreation of analyses, as seen in Jupyter-based workflows where metadata links code, data inputs, and outputs.

Extensions to other non-data assets, such as AI models and workflows, build on FAIR4RS by treating trained models as artifacts requiring metadata for training data provenance, hyperparameters, and evaluation metrics. A 2022 framework proposes concise FAIR guidelines for AI models, advocating machine-readable descriptions (e.g., in JSON-LD) to ensure interoperability with frameworks like TensorFlow or PyTorch, and reusability via standardized benchmarks. Such applications underscore software and models as causal precursors to reliable data outputs, reducing errors from unversioned dependencies or opaque training processes.
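A minimal sketch of such machine-readable software and model metadata, loosely following codemeta-style JSON-LD, might look as follows. Every concrete value (DOI, repository URL, dependency pin, hyperparameters) is a placeholder assumption, not a real artifact.

```python
import json

# Illustrative software metadata in the spirit of FAIR4RS, loosely following
# codemeta-style JSON-LD. All identifiers and values are placeholders.
software = {
    "@context": "https://w3id.org/codemeta/3.0",
    "@type": "SoftwareSourceCode",
    "identifier": "https://doi.org/10.5281/zenodo.1234567",  # hypothetical DOI
    "name": "example-variant-caller",
    "version": "1.2.0",                        # a pinned release, not a branch
    "codeRepository": "https://github.com/example/variant-caller",
    "license": "https://spdx.org/licenses/MIT",
    "softwareRequirements": ["numpy==1.26.4"],  # locked dependency
}

# FAIR-oriented proposals for trained models add training provenance and
# hyperparameters so an artifact can be re-evaluated or retrained; these
# property names are illustrative assumptions.
model_extras = {
    "trainingData": "https://doi.org/10.5281/zenodo.7654321",  # hypothetical
    "hyperparameters": {"learning_rate": 1e-3, "epochs": 20},
}

record = {**software, **model_extras}
print(json.dumps(record, indent=2))
```

Pinned versions and locked dependencies in the record correspond directly to the versioning and environment-drift concerns discussed above: a harvester can resolve the DOI, clone the exact release, and reconstruct the environment.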

Recent Refinements and Proposals

In 2023, the Research Data Alliance (RDA) and associated initiatives like FAIR-IMPACT advanced FAIR metrics by refining minimum viable assessment tools for data objects, building on prior frameworks to enable more systematic evaluation of compliance across disciplines. These refinements emphasized automated metrics tailored to disciplinary contexts, such as those outlined in RDA's 2023 deliverables, which prioritize quantifiable indicators for findability and reusability without mandating full automation.

The CODATA Task Group on FAIR Data for Disaster Risk Research (FAIR-DRR), active through 2025, proposed adaptations of FAIR principles for emergency contexts, including standardized loss data collection and interoperability protocols to enhance real-time risk assessment in compounding climate events. Outputs from 2023–2025 include white papers advocating FAIR-compliant national disaster databases to support policy-making, with case studies from Fiji and Sudan demonstrating improved data integration for loss estimation.

Proposals in 2025 have called for extending FAIR beyond strict data management to incorporate cultural and incentive structures, arguing that reusability requires valuing contributor expertise alongside raw assets to foster a more open scientific economy. Similarly, linguistic analyses suggest augmenting FAIR with principles preserving semantic meaning in machine-human communication, as proposed in April 2025 frameworks. Expansions like FAIREST, introduced by data scientists in 2023 and a digital humanist in 2024, add ethical stewardship to address gaps in the stewardship of reusable assets. Emerging discussions since 2024 integrate FAIR with AI ethics by emphasizing "AI-ready" data to mitigate biases in reusable datasets for machine learning, including protocols for bias auditing in training corpora to prevent propagation of social or regional distortions. These proposals, as in assessments of FAIR-compliant LLM datasets, highlight privacy safeguards and fairness metrics to ensure ethical reusability amid the challenges posed by generative AI.

References

  1. [1]
    The FAIR Guiding Principles for scientific data management ... - Nature
    Mar 15, 2016 · The FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by ...
  2. [2]
    FAIR Principles
    The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure.How to GO FAIR - GO FAIR · F2: Data are described with · I2: (Meta)data use
  3. [3]
    FAIR Data Principles at NIH and NIAID
    Apr 18, 2025 · The FAIR data principles are a set of guidelines aimed at improving the Findability, Accessibility, Interoperability, and Reusability of digital assets.
  4. [4]
    GO FAIR Initiative
    GO FAIR is a bottom-up, stakeholder-driven and self-governed initiative that aims to implement the FAIR data principles, making data Findable, Accessible, ...
  5. [5]
    Estimating the reproducibility of psychological science
    Aug 28, 2015 · We conducted a large-scale, collaborative effort to obtain an initial estimate of the reproducibility of psychological science.
  6. [6]
    The Economics of Reproducibility in Preclinical Research
    Jun 9, 2015 · We outline a framework for solutions and a plan for long-term improvements in reproducibility rates that will help to accelerate the discovery ...
  7. [7]
    How to Make Data FAIR for Open Science - Lorentz Center
    In January 2014, the Lorentz Center hosted a workshop entitled “Jointly designing a data FAIRPORT”. The mission was to scope minimal principles and ...Missing: formulation | Show results with:formulation
  8. [8]
    The FAIR Principles: First Generation Implementation Choices and ...
    Jan 1, 2020 · First, after the inception of the FAIR principles in January 2014 at the Lorentz workshop in Leiden, The Netherlands②, the principles where ...Missing: formulation | Show results with:formulation
  9. [9]
    Comparing ARKs, DOIs and other identifier systems
    ARKs are non-siloed, non-paywalled, and can link directly to objects, unlike DOIs which link to publisher pages. ARKs also have no fees, unlike DOIs.Missing: FAIR | Show results with:FAIR
  10. [10]
    FAIR Principles - Zenodo
    To be Findable: · F1: (meta)data are assigned a globally unique and persistent identifier · F2: data are described with rich metadata (defined by R1 below) · F3: ...<|separator|>
  11. [11]
    What are the FAIR Principles? - FAIR Cookbook
    To be Findable: F1. (meta)data are assigned a globally unique and persistent identifier. F2. data are described with rich metadata (defined by R1 below). F3.
  12. [12]
    Introduction to the FAIR principles | Biomeris
    May 8, 2024 · This principle focuses on the ability to easily locate data through metadata and unique identifiers, both for humans and computers. Machine- ...
  13. [13]
    The FAIR Data Principles - FORCE11
    Jan 31, 2020 · Here, we describe FAIR – a set of guiding principles to make data Findable, Accessible, Interoperable, and Reusable.
  14. [14]
    Aligning restricted access data with FAIR: a systematic review - PMC
    Jul 20, 2022 · The application of the FAIR Guiding Principles to government and confidential data does not have the aim to make it publicly open but, indeed, ...
  15. [15]
    As open as possible, as restricted as necessary - Researchdata.se
    Jan 14, 2025 · The core principle is that access to data from publicly funded research should be as open as possible and as restricted as necessary.
  16. [16]
    What is the difference between "FAIR data" and "Open data" if there ...
    FAIR is not equal to Open: The 'A' in FAIR stands for 'Accessible under well defined conditions'. There may be legitimate reasons to shield data and services.
  17. [17]
    I1: (Meta)data use a formal, accessible, shared, and ... - GO FAIR
    FAIR Principles · F1: (Meta) data are assigned globally unique and persistent identifiers · F2: Data are described with rich metadata · How to GO FAIR ...
  18. [18]
    Semantics-driven improvements in electronic health records data ...
    Aug 11, 2025 · We highlight the alignment of semantic technologies with FAIR principles, revealing how these tools support both technical interoperability and ...
  19. [19]
    I2: (Meta)data use vocabularies that follow the FAIR principles
    The controlled vocabulary used to describe datasets needs to be documented and resolvable using globally unique and persistent identifiers.Missing: definitions | Show results with:definitions
  20. [20]
    Implementing FAIR data management within the German Network ...
    Feb 16, 2021 · This article describes use cases and self-assessments of FAIR principles in de.NBI, addressing challenges in a distributed infrastructure, and ...
  21. [21]
    The use of foundational ontologies in biomedical research
    Dec 11, 2023 · The FAIR principles recommend the use of controlled vocabularies, such as ontologies, to define data and metadata concepts.
  22. [22]
    Enhancing the FAIRness of Arctic Research Data Through Semantic ...
    Jan 17, 2024 · Semantic annotation helps clarify the interpretation of the data, promoting interoperability and reuse. Conventions in naming datasets and their ...
  23. [23]
    DataCite Metadata Schema
    The DataCite Metadata Schema is a list of core metadata properties chosen for an accurate and consistent identification of a resource for citation and retrieval ...Schema 4.6 · Release History · Contribute
  24. [24]
    MetaDIG recommendations for FAIR DataCite metadata
    Sep 27, 2019 · Six concepts were mapped to mandatory DataCite elements (Resource Type General, Resource Title, Resource Publisher, Resource Publication Date, ...
  25. [25]
    Persistent identifiers - How To FAIR
    Notable persistent identifiers are the Digital Object Identifier (DOI) and the Handle System which can both be assigned to data to identify them uniquely. The ...Missing: ARKs | Show results with:ARKs<|separator|>
  26. [26]
    ORCID: Keeping Up with FAIR Momentum
    Jun 20, 2022 · ORCID plays a critical part of the research and scholarly data infrastructure, in large part because of the shared principles also guiding FAIR Data.
  27. [27]
    FAIR Data Point: A FAIR-Oriented Approach for Metadata Publication
    Mar 8, 2023 · Focusing on the importance of machine-actionable metadata, this paper reports on the FAIR Data Point (FDP)—an approach to exposing semantically ...<|control11|><|separator|>
  28. [28]
    FAIR Data Point specifications
    Oct 20, 2023 · This document provides specifications for the FAIR Data Point (FDP). The document includes architecture of the core component, metadata structure and schema, ...2Overall Description · 2.1Usage scenarios · 4Metadata · 4.2Metadata schemas
  29. [29]
    Metadata - How To FAIR
    To be FAIR, your data (and metadata) must have a findable persistent identifier. The persistent identifier is typically assigned when a digital resource is ...
  30. [30]
    [PDF] The FAIR Guiding Principles: Implementation in Dataverse
    – Add metadata in schema.org to each page that describes a dataset. – Verify that the markup produces structured data that you expect in. Structured Data ...
  31. [31]
    Your tasks: Documentation and metadata | RDMkit
    Your tasks: Documentation and metadata · How can you document data during the project? · How do you find appropriate standard metadata for datasets or samples?
  32. [32]
    ELIXIR: providing a sustainable infrastructure for life science data at ...
    Jun 27, 2021 · ELIXIR is in a unique position to drive the transformation of life science's distributed data resources within Europe to be sustainable, federated, standards- ...
  33. [33]
    Alignment between the FAIR Principles and CoreTrustSeal 2023-25
    Jan 24, 2023 · This document presents an update of the alignment between the CoreTrustSeal Trustworthy Digital repository (TDR) Requirements and the FAIR ...
  34. [34]
    How Figshare aligns with the FAIR principles
    Figshare helps make outputs FAIR by ensuring they are Findable, Accessible, Interoperable, and Reusable, which are the standard principles for open digital ...
  35. [35]
    How Figshare.com meets the OSTP and NIH “Desirable ...
    Figshare has endorsed the TRUST principles for Digital Repositories, the FAIR principles for open data, and is ISO27001 certified. At the same time, there are ...
  36. [36]
    Web APIs Overview - RCSB PDB
    May 28, 2025 · As a member of the wwPDB, the RCSB PDB curates and annotates PDB data according to agreed upon standards. The RCSB PDB also provides a ...Web Apis Overview · Data Api · Search Api
  37. [37]
    Updated resources for exploring experimentally-determined PDB ...
    Nov 28, 2024 · Updated resources for exploring experimentally-determined PDB structures and Computed Structure Models at the RCSB Protein Data Bank Open Access.
  38. [38]
    Creating cloud platforms for supporting FAIR data management in ...
    Jan 3, 2024 · We have developed and established a flexible and cost-effective approach to building customized cloud platforms for supporting research projects.
  39. [39]
    A Framework for the Interoperability of Cloud Platforms - Nature
    Feb 26, 2024 · A well accepted core concept is to make data in cloud platforms Findable, Accessible, Interoperable and Reusable (FAIR). We introduce a ...Background · Safe Environments · Platform Governance
  40. [40]
    Towards FAIR and federated Data Ecosystems for interdisciplinary ...
    May 6, 2025 · Here we propose a practical framework for FAIR and federated Data Ecosystems that combines decentralized, distributed systems with existing ...
  41. [41]
    A Data-Driven Approach to Monitor and Improve Open and FAIR ...
    Jul 31, 2024 · In this contribution we present a data-driven approach to monitoring and assessing the state of open and FAIR data in an interdisciplinary, ...1 Introduction · 2 Methods · 4.1 Findability Of Research...
  42. [42]
    [PDF] Metrics for Data Repositories and Knowledgebases: Working Group ...
    Repository. Operations. Y. Storage costs. Total storage cost for repository#. 4. 48. Cost/dataset. (Storage). Cost per dataset (i.e. Storage). 2. 8. Hardware ...
  43. [43]
    [PDF] Data Storage Outlook 2022 - Spectra Logic
    This is the seventh annual Data Storage Outlook report from Spectra Logic. The document explores how the world manages, accesses, uses and preserves its ...
  44. [44]
    [DOC] Horizon Europe Data Management Plan Template
    Further to the FAIR principles, DMPs should also address research outputs other than data, and should carefully consider aspects related to the allocation of ...Missing: integration | Show results with:integration
  45. [45]
    [PDF] Horizon Europe Data Management Plan
    Nov 3, 2022 · FAIR principles and beyond: The data management plan not only aligns with the FAIR principles but also considers research outputs beyond data.
  46. [46]
    Data Availability | PLOS One - Research journals
    Dec 5, 2019 · PLOS journals require authors to make all data necessary to replicate their study's findings publicly available without restriction at the time of publication.
  47. [47]
    Reporting standards and availability of data, materials, code and ...
    Nature backs the Enabling FAIR Data initiative and requires authors to deposit data in community repositories. Announcement: FAIR data in Earth science. As the ...
  48. [48]
    Are Institutional Research Data Policies in the US Supporting the ...
    Feb 16, 2023 · In this article we explore the role of institutional research data policies in enabling and encouraging researchers at their institutions to generate FAIR data.
  49. [49]
    [PDF] Institutional Research Data Management Strategy - IWK Health Centre
    Sep 26, 2023 · RIA will develop protocols to formalize data management practices to bring them into compliance with. FAIR principles. The protocols will ...
  50. [50]
    4 Challenges for institutional research data management support
    Nov 5, 2019 · Ghent University (Belgium) shares 4 challenges towards the development of its institutional research data support and what they learned along the way.
  51. [51]
    Awareness of FAIR and FAIR4RS among international research ...
    Apr 15, 2025 · Awareness of FAIR data principles. The survey revealed a remarkably high level of awareness of FAIR principles for data, with 97% (n = 30) ...
  52. [52]
    Investigating FAIR data principles compliance in horizon 2020 ...
    This study aims to investigate the existing FAIR Data Management practices in the agri-food and rural development sector by assessing the FAIRness of project ...
  53. [53]
    FAIR metadata assessments - DataONE.org
    FAIR metadata assessments evaluate data based on Findability, Accessibility, Interoperability, and Reusability, using community-led principles to enhance data ...
  54. [54]
    ELIXIR Position paper on FAIR Data Management in the Life Sciences
    Oct 19, 2017 · This document sets out the formal position of ELIXIR Europe on good data management practice in the life sciences and the practical ...
  55. [55]
    [PDF] Challenges of qualitative data sharing in social sciences
    Apr 4, 2022 · Open science fails to consider qualitative research complexities, data sharing is rarely considered for ethics, and exceptions to transparency ...
  56. [56]
    The effect of Open Data on Citations
    Nov 27, 2024 · They estimate that sharing data increases citations by about 9%, while about 6% of the citations concerned data reuse, suggesting that roughly ...
  57. [57]
    3 Ways scholarly journals can promote FAIR data - Scholastica Blog
    Apr 15, 2022 · A study published in 2020 found that articles that included statements linked to data in a repository had an up to 25% higher citation impact ...
  58. [58]
    NIH Data Management and Sharing Policy FAQs
    NIH encourages data management and sharing practices to be consistent with the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles and ...
  59. [59]
    [PDF] Open Research Data and Data Management Plans
    Feb 23, 2018 · The ERC embraces the FAIR data principles: research data should be findable, accessible, interoperable and re-usable. This means that data ...
  60. [60]
    A funder-imposed data publication requirement seldom inspired ...
    In this study, we tested the ability to recover data collected under a particular funder-imposed requirement of public availability.
  61. [61]
    GO FAIR Data Stewardship Initiative Launched at UCSD. So What ...
    Nov 4, 2017 · GO FAIR is an initiative to promote and support data stewardship that allows data to be Findable, Accessible, Interoperable, and Reusable.
  62. [62]
    About us - GO FAIR Foundation
    The GO FAIR Foundation (GFF) was established in February 2018 as a separate legal entity under Dutch law in order to support the FAIR principles and metrics, ...
  63. [63]
    GO FAIR US
    GO FAIR US, established in 2019, supports the adoption of GO FAIR networks in the US, aiming to increase FAIR data stewardship.
  64. [64]
    Research Data Alliance (RDA)
    Groups are the heart of the Research Data Alliance. Engage with RDA groups, connect, share ideas and more.
  65. [65]
    The FAIR Data Maturity Model: An Approach to Harmonise FAIR ...
    Oct 27, 2020 · The RDA Working Group “FAIR Data Maturity Model” was established with the objective to develop a set of assessment criteria to facilitate ...
  66. [66]
    Working Groups - RDA - Research Data Alliance
    RDA Working Groups (WGs) should tangibly accelerate progress in concrete ways for specific communities with the overarching goal of increasing data-driven ...
  67. [67]
    FAIR Data for Disaster Risk Research - codata
    Planned activities and outputs for 2023-2025. Output 1) Convene workshops and events to align disaster data protocols and standards across; Output 2) Develop ...
  68. [68]
    WorldFAIR: Global cooperation on FAIR data policy and practice
    The WorldFAIR project is a major new global collaboration between partners from thirteen countries across Africa, Australasia, Europe, and North and South ...
  69. [69]
    GO FAIR initiative: Make your data & services FAIR
    The four collaborating international data organisations – CODATA, GO FAIR, RDA and WDS, collectively referred to as Data Together – commit to fostering ...
  70. [70]
    Applying FAIR Principles to Lab Data: Six Challenges You Might Face
    Aug 7, 2023 · Six common challenges to adopting FAIR principles · 1. Data fragmentation · 2. Limited accessibility · 3. Interoperability issues · 4. Data quality ...
  71. [71]
    FAIR Data Principles: Benefits, Challenges & FAQs - ZONTAL
    While both aim to enhance usability, FAIR focuses on making data usable by both humans and machines, even under access restrictions. For example, sensitive ...
  72. [72]
    Measuring the time spent on data curation | Journal of Documentation
    Feb 9, 2022 · Curators spent on average 3801.38 min (std. dev. 981.05 min), or about 63 h, for curating one dataset. About 75% of this time, that is 2820.75 ...
  73. [73]
    Exploring the Current Practices, Costs and Benefits of FAIR ...
    Oct 25, 2021 · This theme centres on the costs associated with implementing FAIR data principles in pharmaceutical R&D departments. Despite the potential ...
  74. [74]
    Implementing Findable, Accessible, Interoperable, Reusable (FAIR ...
    Dec 19, 2024 · In this paper, we define a barrier as an impediment to the implementation of the FAIR data principle. Usually, a barrier requires more resources ...
  75. [75]
    Scientific research using health data: Is the GDPR in contradiction ...
    Apr 30, 2021 · Core principles of the GDPR are ostensibly in contradiction with the FAIR principles. We have set out to shed light upon these competing ...
  76. [76]
    [PDF] Assessment of the EU Member States' rules on health data in the ...
    implementation of GDPR affect data accessibility for researchers. The main ... concept of FAIR data (findable, accessible, interoperable and reusable), more ...
  77. [77]
    Addressing barriers in FAIR data practices for biomedical data - PMC
    Feb 23, 2023 · Barrier 1: Without incentives, researchers tend to provide incomplete metadata, which limits data discovery and reuse. Currently, there are few ...
  78. [78]
    Will the GDPR Restrain Health Data Access Bodies Under the ...
    May 17, 2024 · As this article discusses, in some instances the need to have a valid legal basis under the GDPR may make it difficult to obtain a data access ...
  79. [79]
    FAIRness Literacy: The Achilles' Heel of Applying FAIR Principles
    Aug 11, 2020 · Semantic Web standards such as RDF, OWL and SKOS recommendations from the W3C, help metadata and data machine-readability.
  80. [80]
    The FAIR guiding principles for data stewardship: fair enough? - PMC
    May 17, 2018 · The FAIR principles stress a number of crucial preconditions for data sharing, urging researchers to take the possibility of subsequent data ...
  81. [81]
    Are the FAIR Data Principles fair?
    Apr 18, 2018 · This practice paper describes an ongoing research project to test the effectiveness and relevance of the FAIR Data Principles.
  82. [82]
    FAIR Enough: Develop and Assess a FAIR-Compliant Dataset for ...
    May 1, 2024 · The FAIR data principles, which stand for Findable, Accessible, Interoperable, and Reusable [7, 8], were initially established to improve the ...
  83. [83]
    (PDF) Unfairness by the FAIR Principles Promoters: A Case Study ...
    This survey reviews and analyzes the evidence from the historical record of published literature relevant to the plagiarism by the Wilkinson et al. 2016 ...
  84. [84]
    [PDF] Unfairness by the FAIR Principles Promoters: A Case Study on ...
    Despite the flagrant plagiarism in this Wilkinson case, it has not yet been retracted by the journals involved. Complaints submitted by Taswell to publishers ...
  85. [85]
    Unfairness by the FAIR Principles Promoters: A Case Study on ...
    Despite the flagrant plagiarism in this Wilkinson case, it has not yet been retracted by the journals involved. Complaints submitted by Taswell to publishers ...
  86. [86]
    FAIR Versus Open Data: A Comparison of Objectives and Principles
    This article assesses the difference between the concepts of 'open data' and 'FAIR data' in data management. FAIR data is understood as data that complies ...
  87. [87]
    Facing requirements on open access and the FAIR data principles.
    Jun 20, 2019 · While Open Data should be available for everyone and all purposes, access to FAIR data can be restricted, e.g. due to legal issues and the ...
  88. [88]
    Implementation and relevance of FAIR data principles in ...
    The industry's intellectual property (IP) need not be threatened by FAIRification; privacy requirements can indeed be observed when implementing FAIR, which ...
  89. [89]
    Selection of data sets for FAIRification in drug discovery and ... - NIH
    Research organisations are focussed on quantifying the costs and benefits of implementing FAIR. Criteria used for the selection of data for FAIRification ...
  90. [90]
    Reproducible big data science: A case study in continuous FAIRness
    To evaluate the reproducibility and FAIRness of our methods we conducted a user study comprising 11 students and researchers. Each was asked to replicate the ...
  91. [91]
    [PDF] Metadata, FAIR principles, and their importance in genomics
    Data reusability and reproducibility. You need metadata to reproduce your data correctly. For example, if you want to decode a ...
  92. [92]
    FAIR principles for AI models with a practical application for ... - Nature
    Nov 10, 2022 · We introduce a set of practical, concise, and measurable FAIR principles for AI models. We showcase how to create and share FAIR data and AI models within a ...
  93. [93]
    FAIR principles in workflows: A GIScience workflow management ...
    This paper proposes the conceptualization and development of FAIR-oriented GIScience WfMS, aiming to incorporate the FAIR principles into the entire lifecycle ...
  94. [94]
    [PDF] Cost of not having FAIR research data - Publications Office of the EU
    Not having FAIR has an impact on both time and storage costs, namely: • Time to register (meta)data and to maintain a research repository with. (meta)data not ...
  95. [95]
    [PDF] FAIR Data Maturity Model
    Apr 3, 2020 · Description The FAIR Data Maturity Model defines a set of indicators, their priorities and evaluation methods for the evaluation of the FAIR ...
  96. [96]
    From Conceptualization to Implementation: FAIR Assessment of ...
    Feb 3, 2021 · This paper presents practical solutions, namely metrics and tools, developed by the FAIRsFAIR project to pilot the FAIR assessment of research data objects.
  97. [97]
    An automated solution for measuring the progress toward FAIR ...
    Nov 12, 2021 · This paper aims to contribute to the FAIR data assessment through a set of core metrics elaborating the principles and an automated tool that supports FAIR ...
  98. [98]
    Long-term availability of data associated with articles in PLOS ONE
    Aug 24, 2022 · These findings point to the value of including URLs and DOIs in Data Availability Statements to ensure access to data on a long-term basis.
  99. [99]
    Long-term availability of data associated with articles in PLOS ONE
    Aug 24, 2022 · This study used a corpus of 47,593 Data Availability Statements from all research articles published in PLOS ONE between March 1, 2014 (the date ...
  100. [100]
    Developing practices for FAIR and linked data in Heritage Science
    Mar 1, 2025 · The results were used to develop guides for good data practices and a list of recommended online vocabularies for standardised descriptions.
  101. [101]
    Developing practices for FAIR and linked data in Heritage Science
    Aug 10, 2025 · Developing practices for FAIR and linked data in Heritage Science. March 2025; npj Heritage Science 13(1). DOI:10.1038/s40494-025-01598-x.
  102. [102]
    Results from the COGR Survey on the Cost of Complying with the ...
    Results from the COGR Survey on the Cost of Complying with the New NIH DMS Policy | Council on Governmental Relations.
  103. [103]
    The Cost of Compliance with the New NIH DMS Policy - CITI Program
    May 18, 2023 · The COGR survey provides valuable insights into the financial burden of complying with the NIH DMS policy.
  104. [104]
    [PDF] Institutional Cost of Compliance & Administrative Burden - COGR
    Oct 27, 2023 · Burden Factors (used in DMS Survey): 1 – Low Impact (e.g., no new staff, no reallocation of existing staff effort, no new training, no new ...
  105. [105]
    Open data ownership and sharing: Challenges and opportunities for ...
    This paper employs a systematic literature review to investigate the motivating factors, advantages, and obstacles associated with open data sharing.
  106. [106]
    Sharing Begins at Home: How Continuous and Ubiquitous ...
    Jul 28, 2022 · Despite increasing efforts to encourage data sharing, both the quality of shared data and the frequency of data reuse remain stubbornly low.
  107. [107]
    Data sharing and reuse in clinical research: Are we there yet? A ...
    Nov 20, 2024 · Reusability challenges are related mainly to data quality, characterised by missing data, errors, and inadequate sample sizes. These challenges ...
  108. [108]
    The benefits and struggles of FAIR data: the case of reusing plant ...
    Jul 13, 2023 · The data they produce is heterogeneous, complicated, often poorly documented and, as a result, difficult to reuse. Meeting societal needs ...
  109. [109]
    Introducing the FAIR Principles for research software | Scientific Data
    Oct 14, 2022 · ELIXIR coordinates and develops life science resources across Europe so that researchers can more easily find, analyse and share data, exchange ...
  110. [110]
    [PDF] Making Research Software FAIR: Background and Practical Steps
    FAIR4RS Principles - Background. FAIR Principles for Research Software or FAIR4RS Principles. 2020 - 2022. 200+ stakeholders involved. FAIR for Research ...
  111. [111]
    The FAIR for Research Software Principles after two years
    Oct 10, 2025 · This report provides an update on initiatives that are working to implement the principles across five areas of cultural change: policies, ...
  112. [112]
    Reproducibility Starts from You Today - ScienceDirect.com
    Sep 11, 2020 · By using the GitHub/Zenodo integration, you can get a DOI for your software and make it citable. Even if you do not use version control systems, ...
  113. [113]
    [2207.00611] FAIR principles for AI models with a practical ... - arXiv
    Jul 1, 2022 · We introduce a set of practical, concise, and measurable FAIR principles for AI models. We showcase how to create and share FAIR data and AI models within a ...
  114. [114]
    Metrics for data - FAIR-IMPACT
    FAIR-IMPACT is working to refine and extend the seventeen minimum viable metrics proposed by FAIRsFAIR for the systematic assessment of FAIR data objects.
  115. [115]
    [PDF] The FAIR Principles for Research Software: three years on
    Apr 11, 2025 · FAIR4RS Principles. FAIR Principles for Research Software (FAIR4RS Principles, 2022) are set of 17 principles tailored for research software.
  116. [116]
    White Paper Series - CODATA, Committee on Data of the ISC
    This white paper aims to identify the gaps in technology and relevant policies that prevent effective interconnection of disaster-related data and information ...
  117. [117]
    It's time to extend the FAIR Principles of data sharing - LSE Blogs
    Jan 22, 2025 · Gordon Blair argues that to create a research culture that makes the best use of available data we need to extend the 2016 FAIR.
  118. [118]
    Suggestions for extending the FAIR Principles based on a linguistic ...
    Apr 24, 2025 · FAIR (meta)data presuppose their successful communication between machines and humans while preserving meaning and reference.
  119. [119]
    OSCARRs Award: The FAIREST Principles - Research Guides
    Mar 4, 2025 · More recently, a team of data scientists (2023) and a digital humanist (2024) separately proposed expanding FAIR to FAIREST.
  120. [120]
  121. [121]
    AI-Ready FAIR Data: Accelerating Science through Responsible AI ...
    Jul 2, 2024 · The FAIR principles were established to enhance the value of scientific data. By making data Findable, Accessible, Interoperable, and Reusable, ...
  122. [122]
    ORCID Record for Angela Bogdanova
    Official ORCID profile for the AI-based Digital Author Persona Angela Bogdanova, demonstrating machine-readable attribution with persistent identifier.
  123. [123]
    Semantic Specification for Digital Author Persona
    JSON-LD identity schema for the Digital Author Persona archived on Zenodo, exemplifying FAIR-compliant structured metadata.