Data sharing

[Figure: Decision on data deposition]

Data sharing is the practice of making research data, such as measurements, observations, and transcripts, along with associated metadata, available to other investigators for purposes including verification, replication, and secondary analysis. This process underpins scientific reproducibility and accelerates progress by enabling the combination of datasets for novel insights, though it requires careful management to address inherent tensions between openness and proprietary interests. Prominent frameworks like the FAIR principles, which emphasize findability, accessibility, interoperability, and reusability, have emerged to standardize data sharing practices, fostering broader adoption in fields from the natural sciences to the social sciences. Funders and journals increasingly mandate sharing to combat reproducibility crises evidenced in empirical studies showing low replication rates across disciplines. Notable achievements include large-scale repositories that have facilitated meta-analyses yielding breakthroughs, such as in genomics, where shared data has mapped disease variants more comprehensively than isolated efforts.

Despite these advances, data sharing encounters persistent barriers, including fears of intellectual property loss, competitive scooping by rivals, and privacy risks, particularly with human subjects data. Systematic reviews identify institutional disincentives, such as lack of credit for shared data in academic evaluations, and technical hurdles like incompatible formats as key obstacles, often outweighing perceived benefits for individual researchers. Controversies arise from cases where premature sharing has led to uncredited reuse, underscoring the need for robust governance to balance communal gains against the risks of misuse.

Definition and Historical Development

Core Concepts and Principles

Data sharing refers to the practice of making research data available to other investigators, either through public repositories, supplementary materials in publications, or direct exchange, to facilitate verification of results, replication of studies, and further analysis. This process underpins the cumulative nature of scientific inquiry, where evidence from one study informs the work that builds upon it, reducing redundant efforts and mitigating errors from incomplete or inaccessible datasets. From a foundational perspective, withholding data undermines the self-correcting mechanism of science, as independent scrutiny is essential to distinguish robust findings from artifacts or biases, a principle rooted in the empirical validation required for causal claims about natural phenomena.

Core principles emphasize structured accessibility to maximize utility while respecting constraints like intellectual property or participant confidentiality. The FAIR guidelines, articulated in 2016, provide a framework for effective data stewardship: data must be findable through unique identifiers and rich metadata; accessible via standardized protocols, even if restricted; interoperable with compatible vocabularies and formats; and reusable under clear licenses permitting ethical secondary use. These principles prioritize machine-actionability to enable automated processing, addressing the inefficiency of human-only interpretation in large-scale datasets. Empirical support for such approaches stems from observations that shared, standardized data enhance reproducibility rates, as demonstrated in fields like genomics, where public databases have accelerated discoveries.

Additional tenets include promoting openness where feasible to foster transparency and collaboration, balanced against ethical imperatives such as protecting sensitive human subjects data through anonymization or controlled access. Institutions like the NIH mandate data management plans that outline sharing strategies, underscoring that non-sharing can impede broader scientific progress and public benefit from taxpayer-funded research. However, principles also recognize practical limits: data sharing should align with jurisdictional laws and avoid premature release of unvalidated preliminary findings, so that shared resources contribute to verifiable knowledge advancement rather than adding noise.
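The four FAIR facets described above can be read as a checklist over a dataset's metadata record. The following is a minimal sketch of that idea, not a standard validator: the field names (`identifier`, `access_url`, `vocabulary`, and so on) and the grouping in `REQUIRED_FIELDS` are hypothetical and would need to be mapped onto whatever schema a given repository actually uses.

```python
# Hypothetical field names grouped by FAIR facet; these are illustrative
# assumptions, not taken from any specific repository schema.
REQUIRED_FIELDS = {
    "findable": ["identifier", "title", "description"],
    "accessible": ["access_url", "access_protocol"],
    "interoperable": ["format", "vocabulary"],
    "reusable": ["license", "provenance"],
}


def missing_fair_fields(record):
    """Map each FAIR facet to the fields that are absent or empty in `record`."""
    gaps = {}
    for facet, fields in REQUIRED_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[facet] = missing
    return gaps


# Made-up example record: the identifier and URL are placeholders.
example_record = {
    "identifier": "doi:10.0000/example",
    "title": "Example measurements",
    "description": "Illustrative dataset description.",
    "access_url": "https://repository.example/datasets/1",
    "access_protocol": "https",
    "format": "text/csv",
    "vocabulary": "",  # empty: no controlled vocabulary declared
    "license": "CC-BY-4.0",
}

print(missing_fair_fields(example_record))
```

For the example record, the check reports a missing vocabulary under "interoperable" and a missing provenance statement under "reusable". A check like this only tests whether fields are present; it says nothing about whether the metadata is accurate or the license appropriate.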

Early Practices in Science

In the early modern period of science, spanning the 16th to 18th centuries, data sharing occurred predominantly through informal epistolary networks rather than formalized repositories or mandates. Natural philosophers exchanged raw observations, measurements, and experimental findings via letters, fostering verification and collaborative advancement amid limited printing and institutional structures. This practice aligned with the emerging ethos of empirical scrutiny over scholastic authority, though it was uneven, often tempered by concerns over intellectual priority and secrecy in proprietary fields.

The "Republic of Letters," an international correspondence network active from the late 17th to 18th centuries, exemplified this mode of exchange, connecting intellectuals across Europe and beyond through postal systems. Participants, some of whom authored around 15,000 letters each, shared astronomical positions, descriptions of biological specimens, geological samples, and experimental protocols to promote the experimental method and refute dogmatic claims. For instance, networks mapped from John Locke's correspondence reveal clustered exchanges of observational data that accelerated knowledge dissemination, with letters serving as precursors to peer review by circulating findings for critique among trusted colleagues. Such practices enabled incremental progress, as seen in the global reach of Jesuit missionaries' reports on natural phenomena, though confidentiality circles limited full openness in sensitive matters.

A pivotal early example of data sharing's impact unfolded in astronomy between Tycho Brahe and Johannes Kepler around 1600. Brahe amassed unprecedentedly precise positional data on planetary motions, particularly Mars, using advanced instruments at his observatory on the island of Hven (1576–1597). Reluctant to release raw measurements during his lifetime to protect his geocentric models, Brahe permitted limited access to Kepler as an assistant in Prague from 1600. Following Brahe's death in 1601, Kepler fully utilized over 1,000 observations to derive his three laws of planetary motion by 1609 and 1619, overturning circular orbits in favor of ellipses. This reuse of empirical data, despite interpersonal tensions, underscored how shared observations could refute entrenched theories through rigorous computation.

The founding of the Royal Society in London in 1660 institutionalized nascent sharing practices, emphasizing transparency. Its journal, Philosophical Transactions, launched in 1665 by secretary Henry Oldenburg, published detailed accounts of experiments, including tabular data, instrument readings, and observational logs, such as early microscopic descriptions and atmospheric measurements. By disseminating "data" (a term increasingly applied to factual bases for inference, as analyzed in over 200 years of issues), the journal facilitated replication; for example, issues from 1665–1677 included astronomical ephemerides and catalogs, reaching subscribers across Europe. This marked a shift toward public verification, though full raw datasets were not always appended, with authors relying instead on narrative sufficiency for verification.

Emergence of Formal Policies (Pre-2000)

The U.S. Long-Term Ecological Research (LTER) Network, initiated in 1980 by the National Science Foundation, marked one of the earliest formal frameworks for data sharing in ecology, requiring sites to manage and share data after a brief embargo period, typically one to two years, so that primary investigators could publish first while broader access for verification and secondary analysis was still promoted. By 1990, the LTER adopted explicit guidelines emphasizing documentation, standards, and eventual public dissemination, though implementation varied due to limited digital infrastructure, with only one site initially supporting online access. These policies addressed challenges in long-term studies, such as coordinating multi-site data on ecosystems, and influenced subsequent federal expectations for resource sharing in research.

In genomics, the Bermuda Principles of 1996 represented a pivotal formalization during the Human Genome Project (HGP), an international effort launched in 1990 to sequence the human genome. Adopted at a meeting in Bermuda from February 26-28, 1996, these principles required the immediate release of finished DNA sequence data, within 24 hours of assembly, to databases like GenBank, rejecting delays tied to publication or commercial interests in favor of unrestricted global access to accelerate discoveries in biology and medicine. This policy, enforced through HGP consortium agreements, contrasted with prior norms of proprietary withholding and was credited with enabling rapid progress, such as identifying disease-related genes, by fostering collaborative verification.

Preceding these, domain-specific mandates emerged in fields like crystallography and meteorology. The International Union of Crystallography has required deposition of atomic coordinates in the Protein Data Bank for publications since the 1970s, though enforcement relied on journal policies rather than centralized regulation. Much earlier, the 1873 Vienna Congress established international standards for daily weather data exchange among nations, facilitating global climate analysis but lacking the binding mechanisms of later scientific policies. These early efforts highlighted recurring tensions between openness for collective advancement and individual incentives, setting the stage for broader pre-2000 policies in federally funded research.

Theoretical Rationale and Empirical Benefits

Philosophical and First-Principles Justifications

Data sharing aligns with the Mertonian norm of communalism, which holds that scientific knowledge constitutes a public good belonging to the collective rather than individual property, obligating researchers to disseminate findings, including underlying data, to foster cumulative progress rather than proprietary hoarding. This norm, articulated by sociologist Robert K. Merton in 1942, underscores that secrecy undermines the scientific enterprise by impeding verification and extension of results, whereas open access to data promotes disinterested collaboration over personal gain. Empirical adherence to communalism correlates with reduced questionable research practices, as sharing counters incentives for data withholding that erode trust in published outcomes.

From a first-principles standpoint, data sharing is causally necessary for scientific advancement: isolated datasets limit inference to single analyses, whereas pooled data enable robust meta-analyses, hypothesis generation, and detection of errors or biases through independent scrutiny. Without access to raw data, replication, which is key to establishing reliability, becomes infeasible, stalling the iterative refinement of theories grounded in evidence. This rationale echoes Karl Popper's emphasis on falsifiability, where testable claims require transparent evidential bases; restricted data effectively shields hypotheses from rigorous disconfirmation, blurring the boundary between science and pseudoscience.

Publicly funded research amplifies these imperatives, imposing a moral duty on recipients to maximize societal returns by treating data as a non-rivalrous resource whose value multiplies through reuse, rather than allowing hoarding that duplicates costly collection efforts. Funders' pro tanto obligations include mandating sharing to correct asymmetries where taxpayers bear costs but derive incomplete benefits from summarized publications alone. Such principles prioritize accountability, linking outputs directly to inputs, over institutional biases favoring opacity, so that research serves truth-seeking over careerist silos.

Evidence from Reproducibility and Collaboration Studies

Empirical investigations into reproducibility highlight data sharing as a critical factor in enabling independent verification of scientific findings. A 2023 study examining nearly 500 articles in Management Science revealed that the journal's June 2019 policy mandating data and code disclosure elevated reproducibility rates from 6.6% in pre-policy articles (where voluntary materials were available for only 12% of cases, with 55% of those succeeding) to 67.5% post-policy, though data access issues persisted in 29% of post-policy submissions. Similarly, Science's February 2011 policy requiring supplementary data and code sharing increased data availability from 52% in 2009–2010 articles to 75% in 2011–2012 ones, yet computational replication succeeded in only 26% overall, with shortfalls attributed to incomplete artifacts or inaccessible formats rather than to the absence of a policy. These results demonstrate that while policies boost material provision, full reproducibility demands standardized, verifiable deposits to mitigate technical barriers.

Collaboration studies further link data sharing to amplified research networks and output integration. Public data availability permits secondary analyses and meta-syntheses, fostering multi-institution efforts that non-shared datasets preclude. In a 2007 analysis of 85 cancer microarray clinical trials, papers depositing data in public repositories received 69% more citations (p=0.006) than non-depositing peers, controlling for journal impact factor, publication date, and author attributes, with shared-data papers accruing 85% of total citations despite comprising 48% of trials. A 2019 natural experiment across economics and political science journals confirmed that enforced data mandates, unlike unenforced ones, yielded about 97 additional citations per article via instrumental variable estimation, reflecting heightened reuse in collaborative extensions. Such citation premiums, often from downstream collaborations, underscore data sharing's role in accelerating collective progress, though benefits accrue primarily when sharing is verifiable and low-friction.
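The arithmetic behind two of the figures above can be checked directly. This is a sketch for illustration only: the function names are hypothetical, the inputs are the percentages quoted in the text, and the second calculation is an unadjusted ratio, so it is not comparable to the regression-adjusted 69% estimate.

```python
def overall_rate(share_with_materials, success_given_materials):
    """Overall reproducibility = P(materials available) * P(reproduced | available)."""
    return share_with_materials * success_given_materials


# Pre-policy Management Science figures quoted in the text:
# materials available for 12% of articles, 55% of those reproduced.
pre_policy = overall_rate(0.12, 0.55)
print(f"pre-policy reproducibility: {pre_policy:.1%}")  # 6.6%


def per_item_ratio(share_of_items, share_of_outcome):
    """Ratio of outcome-per-item inside a group to outcome-per-item outside it."""
    inside = share_of_outcome / share_of_items
    outside = (1 - share_of_outcome) / (1 - share_of_items)
    return inside / outside


# Microarray trials quoted in the text: 48% of trials shared data and
# those papers received 85% of all citations.
raw_ratio = per_item_ratio(0.48, 0.85)
print(f"unadjusted per-trial citation ratio: {raw_ratio:.1f}")
```

The first calculation reproduces the 6.6% pre-policy rate as the product of availability and conditional success. The second gives a raw ratio of roughly 6 to 1 in citations per trial, far larger than the adjusted 69% premium, which illustrates how much of the raw gap the study's controls absorb.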

Economic and Societal Impacts

Data sharing in scientific research yields economic benefits primarily through reduced redundancy in data collection and enhanced efficiency in research. Openly available data can avert duplicative efforts, potentially saving up to 9% of costs by obviating the need for repeated data collection. The failure to share data in FAIR formats (findable, accessible, interoperable, reusable) imposes an estimated annual cost of at least €10.2 billion on the European economy, reflecting lost opportunities from siloed datasets. Case studies further indicate that data sharing delivers financial returns for funding agencies by minimizing expenditures on redundant data collection, thereby amplifying returns on publicly financed research.

Macroeconomic analyses project that broader access to and sharing of data, including research datasets, could unlock value equivalent to 0.1% to 1.5% of GDP in affected economies, driven by accelerated innovation and productivity gains across sectors reliant on evidence-based decision-making. In global contexts, initiatives promoting open data are forecast to contribute up to 2.5% of worldwide GDP through spillover effects like improved efficiency and novel applications of existing data. These gains stem from causal mechanisms such as lowered costs for secondary analyses, which expand the utility of high-cost datasets beyond their initial creators.

On the societal front, data sharing bolsters trust in science by enabling independent verification and replication, which mitigates errors and biases in published findings. It facilitates cross-disciplinary collaborations, yielding emergent insights that solitary efforts might overlook, and supports equitable access for researchers in resource-constrained settings. In public health domains, shared datasets enable rapid signal detection for outbreaks, refine epidemiological models, guide evidence-based policies, and incorporate diverse inputs, as evidenced during responses to infectious threats. Additionally, data openness contributes to broader objectives such as socioeconomic planning. Empirical evidence underscores these outcomes, with shared data correlating with higher citation rates and faster knowledge dissemination.

Policy Mandates and Regulatory Frameworks

United States Policies

The U.S. federal government has implemented policies promoting scientific data sharing primarily through funding agencies, emphasizing transparency, reproducibility, and public access to taxpayer-funded outputs. These policies require grant applicants to submit detailed data management and sharing plans, with mandates evolving from earlier voluntary guidelines to more stringent requirements in response to reproducibility crises in science.

A pivotal framework is the 2022 Office of Science and Technology Policy (OSTP) memorandum, "Ensuring Free, Immediate, and Equitable Access to Federally Funded Research," issued on August 25, 2022. This directive instructs federal agencies to revise public access policies for scholarly publications and supporting scientific data, eliminating embargoes and requiring immediate availability upon publication or acceptance, with full implementation by December 31, 2025. It prioritizes machine-readable formats, metadata standards, and accommodations for sensitive data while aiming to maximize the reuse of data for validation and new discoveries. Agencies must develop plans ensuring data from funded research is preserved in designated repositories, with progress reports due within 180 days of the memo.

The National Institutes of Health (NIH) enforces the Data Management and Sharing (DMS) Policy, effective January 25, 2023, applicable to all extramural and intramural research generating scientific data, regardless of funding amount. Applicants must include a DMS plan in grant proposals, outlining data management, preservation, and sharing strategies, including timelines, formats, and repositories compliant with FAIR (Findable, Accessible, Interoperable, Reusable) principles where feasible. Scientific data, defined as recorded factual material of sufficient quality to validate and replicate results, must be shared no later than the time of an associated publication or the end of the award period, whichever comes first, with a maximum retention of five years post-sharing unless justified otherwise. Budgets must allocate costs for these activities, and compliance is assessed during peer review and progress reports, with non-compliance potentially affecting future funding. The policy builds on the 2003 NIH Data Sharing Policy but expands scope to mandate plans for all relevant projects, addressing prior limitations where sharing was optional for smaller grants.

The National Science Foundation (NSF) has required a supplementary two-page Data Management and Sharing Plan (DMSP) for all proposals since 2011, detailing how data will be managed, preserved, and disseminated to enable validation and reuse. Funded projects must deposit datasets in public repositories, with sharing expected upon publication or within a reasonable timeframe tied to the research lifecycle, and annual reports must document progress. In alignment with the OSTP memo, NSF is updating its public access plan to enforce zero-embargo data release by 2025, including interoperability and support for diverse data types across directorates. Exceptions apply for proprietary or classified data, but proposers must justify any withholding.

Other agencies, such as the Department of Energy (DOE) and National Aeronautics and Space Administration (NASA), incorporate similar requirements tailored to their domains, often mandating deposition in agency-specific repositories like OSTI.gov for energy research. These policies collectively aim to mitigate issues evidenced in studies showing low data availability rates in publications (for example, less than 50% in some fields before mandates), though enforcement relies on self-reporting and institutional oversight rather than audits.

International and Supranational Initiatives

The Organisation for Economic Co-operation and Development (OECD) adopted the Principles and Guidelines for Access to Research Data from Public Funding in 2007, building on a 2004 declaration by ministers from OECD countries to ensure optimal access to publicly funded digital research data. These guidelines emphasize open access that is easy, timely, user-friendly, and preferably internet-based, while respecting intellectual property rights, privacy, and national security; they apply to data produced for publicly accessible knowledge and have been endorsed by OECD member states to foster international collaboration. In 2021, the OECD updated its Recommendation on Enhanced Access to Research Data from Public Funding, incorporating FAIR data principles to promote machine-readable metadata and persistent identifiers for better discoverability and reuse.

The European Union's Horizon Europe program, launched in 2021 with a budget exceeding €95 billion through 2027, mandates data management plans (DMPs) for all funded projects to outline how research data will be managed, preserved, and shared in accordance with FAIR principles. Beneficiaries must ensure data is as open as possible and as closed as necessary, prioritizing FAIR-compliant repositories for long-term accessibility, with exemptions only for justified reasons such as commercial exploitation or ethical constraints; this builds on earlier Horizon 2020 guidelines. The EU's approach aims to maximize the reuse of data across borders, supported by the European Open Science Cloud (EOSC) infrastructure for federated access.

The World Health Organization (WHO) established a policy in 2016 promoting data sharing during public health emergencies, urging rapid, transparent release of research data to inform responses, as demonstrated in calls following the 2014-2016 Ebola outbreak, where delayed sharing hindered global efforts. In September 2022, WHO updated its funding policy through the Special Programme for Research and Training in Tropical Diseases (TDR) to require full sharing of all research data generated from awarded grants, including raw datasets, to accelerate discovery and reproducibility in health research. This aligns with joint initiatives like the Global Research Collaboration for Infectious Disease Preparedness (GloPID-R), which in 2017 outlined principles for data sharing in emergencies, emphasizing ethical frameworks to balance speed with protections for vulnerable populations.

The FAIR Guiding Principles for scientific data management and stewardship, articulated in a 2016 consensus statement by an international group of stakeholders, provide a framework for making data findable through unique identifiers and rich metadata, accessible via standardized protocols, interoperable with other datasets, and reusable under clear licenses. Though not legally binding, these principles have been integrated into policies by supranational bodies like the EU and OECD, influencing global standards for digital research outputs. Complementing this, the Committee on Data of the International Science Council (CODATA) has advanced initiatives such as the Data Policy for Times of Crisis project since 2020, developing tools and guidance for sharing data during disasters to support evidence-based decision-making across disciplines and borders.

Private Sector and Industry Approaches

In the pharmaceutical industry, data sharing approaches center on controlled-access platforms for clinical trial data, driven by regulatory pressures and collaborative needs while safeguarding proprietary interests. The Vivli platform, launched in 2016 by a nonprofit consortium, serves as a centralized repository where sponsors voluntarily deposit anonymized patient-level data from over 7,500 clinical studies, allowing independent researchers to request access after review by an independent panel to ensure scientific merit and ethical compliance. Similarly, ClinicalStudyDataRequest.com (CSDR), operational since 2013 and comprising major sponsors like GlaxoSmithKline and Sanofi, provides a gateway for qualified researchers to access de-identified data from interventional trials, with access granted via data-sharing agreements that prohibit commercial use and require result publication. These initiatives stem from 2013 principles endorsed by the Pharmaceutical Research and Manufacturers of America (PhRMA), which advocate sharing data post-regulatory approval to verify findings without undermining commercial viability. Technology firms adopt open data strategies to foster ecosystem innovation, often releasing non-proprietary datasets or supporting infrastructure for research while retaining control over core IP. Microsoft, for instance, collaborates with industry partners to promote private-sector data sharing for societal applications, including AI training datasets and cloud-based tools that enable secure federated access without full disclosure. Amazon Web Services (AWS) hosts public research datasets and provides compliance tools for open data policies, such as those tied to federal grants, facilitating cost-effective storage and analysis while companies like AWS prioritize user agreements to prevent misuse. 
These approaches contrast with unrestricted open access by incorporating tiered permissions, reflecting empirical evidence that unrestricted sharing risks competitive disadvantages, as identified in analyses of private-sector barriers where intellectual property leakage concerns deter 70-80% of organizations from broader disclosure. Across sectors, private initiatives emphasize trusted intermediaries and standardized agreements to mitigate risks like data scooping or privacy breaches, with partnerships yielding targeted benefits such as reduced R&D duplication in pharmaceuticals, where shared negative trial results have informed 20-30% of subsequent studies per platform reports. However, uptake remains selective; a 2022 study of private organizations found that only 25% routinely share data externally due to misaligned incentives, including fears of eroding market edges, underscoring that industry approaches prioritize verifiable returns over universal openness.

Systemic Barriers and Incentive Misalignments

Academic Career Incentives and Publish-or-Perish Culture

The publish-or-perish culture in academia, where advancement hinges predominantly on publication volume and citation impact, systematically discourages data sharing by prioritizing exclusive control over datasets to sustain personal output. Tenure, promotions, and grant funding evaluations emphasize metrics like paper counts and journal impact factors, fostering a competitive environment where researchers hoard data to derive multiple publications rather than risk enabling rivals' analyses. This misalignment arises because shared data could accelerate others' findings, reducing the original investigator's opportunities for follow-up papers and citations, which are central to professional metrics.

Empirical studies confirm that motivational barriers rooted in these incentives predominate. A 2017 analysis in the New England Journal of Medicine argued that conventional authorship practices incentivize maximizing sequential publications from one dataset, thereby undermining data release because it dilutes the primary author's pipeline. Similarly, a survey of academics found that perceived effort outweighing rewards, including scant career credit for sharing, deters deposition, with respondents citing the absence of tangible benefits in evaluation dossiers. In biomedical fields, where datasets underpin high-stakes replication, this culture exacerbates withholding, as investigators view raw data as capital for future grants rather than communal resources.

Recent surveys quantify the scale of this disincentive. A 2025 study across institutions identified limited incentives, such as no formal recognition in evaluations, as a barrier for 15% of researchers, compounded by fears of competitive disadvantage in a metrics-driven system. Linking to broader issues, a 2025 Nature survey of over 1,500 scientists revealed that 62% attributed irreproducibility "always" or "very often" to publication pressures, which manifest in selective reporting to meet output demands rather than full transparency. These patterns persist despite mandates, as institutional reward structures rarely credit data curation or sharing equivalently to novel results.

Proposals to realign incentives include data authorship credits or dedicated funding for sharing efforts, yet adoption lags due to entrenched evaluation norms. Without reforms tying promotions to verifiable contributions like accessible datasets, the publish-or-perish dynamic continues to impede collaborative progress, prioritizing individual metrics over cumulative scientific advancement.

Resource and Technical Obstacles

One major resource obstacle to data sharing in scientific research is the high time and labor investment required to prepare datasets for release, including cleaning, anonymizing, documenting, and formatting data to comply with repository requirements. Surveys of researchers indicate that insufficient time is frequently cited as a top barrier, with one study of over 1,000 academics finding that 28% viewed the effort involved in data preparation as excessive relative to potential benefits. This burden is exacerbated in resource-limited settings, such as low- and middle-income countries, where data sharing demands additional capacity for curation and communication that is often unavailable without dedicated funding.

Financial constraints further compound these issues, as archiving and maintaining shared data incurs ongoing costs for storage, curation, and infrastructure that are rarely covered by grants or institutional budgets. For instance, the lack of sustainable funding models for data repositories leads to underinvestment in long-term preservation, with estimates suggesting that preparing a single dataset for sharing can cost thousands of dollars in personnel and resources. In academic environments, where principal investigators juggle multiple projects, these expenses compete directly with core research activities, deterring sharing unless mandates enforce it.

Technical obstacles primarily stem from the absence of standardized formats and protocols, which impedes interoperability and reuse across disciplines and platforms. Without uniform standards, researchers must invest additional effort in converting proprietary or field-specific formats, such as raw sequencing files in genomics or proprietary instrument outputs, into accessible, machine-readable structures, a process that can fail due to incompatible systems. Inadequate infrastructure, including limited computational tools for large-scale data handling and secure transfer, poses further hurdles; for example, high-volume datasets from fields like astronomy or climate modeling overwhelm many public repositories' capacity, resulting in upload failures or degraded accessibility.

Data security and integration challenges also arise, as ensuring compliance with varying privacy and access controls requires specialized software that many labs lack. A 2023 analysis highlighted that fragmented technical ecosystems, including siloed databases and insufficient APIs for cross-platform querying, reduce the practical utility of shared data, with interoperability issues cited in 52% of reported barriers among surveyed institutions. These problems persist despite emerging tools, as adoption lags due to training gaps and poor fit with existing workflows.
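Part of the preparation effort described above is producing machine-readable documentation alongside the data. As a rough sketch of what that can look like, the function below builds a JSON-serializable "sidecar" record for a CSV table using only the Python standard library. The function name `build_sidecar`, the sidecar layout, and the sample column names are all hypothetical choices for illustration, not a repository standard.

```python
import csv
import hashlib
import io
import json


def build_sidecar(csv_text, column_docs, license_id):
    """Summarize a CSV table as a metadata record.

    csv_text    -- the table as a string (header row first)
    column_docs -- dict mapping column name to a human-readable description
    license_id  -- license label supplied by the depositor (not validated here)
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    header = reader.fieldnames or []
    return {
        # Checksum lets a recipient confirm the file arrived unaltered.
        "sha256": hashlib.sha256(csv_text.encode("utf-8")).hexdigest(),
        "row_count": len(rows),
        "columns": {name: column_docs.get(name, "") for name in header},
        # Columns without a description are flagged for the depositor.
        "undocumented_columns": [n for n in header if n not in column_docs],
        "license": license_id,
    }


# Made-up sample table and data dictionary.
sample_csv = "site,temp_c\nA,12.5\nB,13.1\n"
sidecar = build_sidecar(sample_csv, {"site": "Sampling site code"}, "CC-BY-4.0")
print(json.dumps(sidecar, indent=2))
```

In this example the record reports two rows and flags `temp_c` as undocumented. The checksum addresses transfer integrity only; access control and anonymization, which the text also mentions, are outside the scope of this sketch.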

Intellectual Property and Scooping Risks

![Factors influencing reluctance to deposit data publicly][float-right] In the of scientific sharing, the of being "scooped"—whereby competitors exploit shared to publish analyses or findings before the original researcher—serves as a prominent barrier, particularly in competitive fields like and . This concern stems from the high stakes of careers, where in directly impacts , promotions, and tenure; surveys of biologists highlight it as a key perceived risk, alongside worries over uncompleted personal analyses. Empirical analyses suggest, however, that scooping remains infrequent, as data originators retain advantages in interpreting their own datasets, with most follow-up publications from original occurring within two years, outpacing reuse of archived which peaks later. Intellectual property risks further complicate data sharing, as public disclosure can forfeit protections—valuable for maintaining competitive edges in proprietary research—and potentially invalidate claims if inventive aspects are revealed prior to filing under doctrines like . facts and data themselves lack eligibility, though creative elements such as annotations or database structures may qualify, with ownership typically vesting in creators or employers via work-for-hire arrangements. In practice, policies like the National Institutes of Health's and framework permit temporary data withholding to secure patents, balancing openness with innovation incentives, yet researchers must navigate contracts and licenses—such as variants—to delineate reuse terms without unintended IP erosion. Mitigation strategies include timestamping priority via preprints on platforms like or employing data licenses that stipulate attribution and restrict premature competing uses, though challenges persist in decentralized repositories. 
Despite these risks, the available evidence indicates that strategic archiving after initial publication minimizes vulnerabilities while enabling verification and collaboration, underscoring a tension between individual safeguards and collective scientific advancement.

Disciplinary Differences and Field-Specific Issues

Natural and Biomedical Sciences

In the natural and biomedical sciences, data sharing enables verification of experimental results, meta-analyses, and accelerated discovery, but implementation varies widely across subfields due to dataset complexity and regulatory constraints. Biomedical datasets often include sensitive human health information, necessitating compliance with privacy laws like the Health Insurance Portability and Accountability Act (HIPAA) in the United States, which limits unrestricted access to protect patient confidentiality. In contrast, natural sciences such as physics and astronomy frequently achieve higher sharing rates through public repositories; for instance, particle physics collaborations at CERN routinely release data from experiments at the Large Hadron Collider to foster global validation. However, even in these fields, sharing raw experimental or observational data remains inconsistent, with surveys indicating that only about 55% of researchers in physical sciences deposit data openly. Empirical studies reveal persistently low data sharing rates in biomedical research, undermining reproducibility efforts. A review of 7,750 medical research papers published between 2015 and 2020 found that just 9% included promises of data availability, with actual fulfillment even lower due to barriers such as the lack of standardized formats. In clinical trials, biological trials were 1.58 times more likely to share data than pharmaceutical trials, reflecting differences in competitive pressures and data volume. Genomic data fares better, with public sequence archives hosting over 300 million sequences as of 2023, yet associated phenotypic and clinical metadata are often withheld to limit re-identification risks. These patterns highlight how the linkage of biomedical data to identifiable individuals creates ethical dilemmas, in contrast with natural sciences, where datasets such as geological or astronomical observations pose fewer privacy issues but still face technical hurdles.
Key barriers in biomedical sciences include researcher concerns over intellectual property, scooping by competitors, and the substantial effort required for curation without immediate rewards, exacerbated by a "publish-or-perish" culture prioritizing novel findings over data maintenance. Lack of time emerges as the predominant obstacle, cited by a majority in surveys of life sciences researchers, alongside insufficient incentives for FAIR (Findable, Accessible, Interoperable, Reusable) compliance. In natural sciences, while collaborative projects promote sharing—evident in open access to climate modeling data—individual investigators often withhold proprietary simulation outputs due to resource-intensive reproduction costs. Efforts to address these include controlled-access platforms like the Database of Genotypes and Phenotypes (dbGaP), which balance utility with security, though adoption remains partial owing to administrative burdens. Overall, while natural sciences benefit from less regulated data types, biomedical fields grapple with harmonizing openness and ethical safeguards, resulting in fragmented practices that hinder cumulative progress.

Social Sciences and Psychology

In social sciences and psychology, data sharing rates remain notably low compared to natural and biomedical fields, with analyses of psychological articles from 2014 to 2017 revealing public data sharing in fewer than 4% of empirical papers. This reluctance persists despite advocacy for open science practices, as surveys of psychologists identify perceived barriers such as the rarity of sharing in the discipline, preferences for releasing data only upon direct request, and concerns over intellectual priority or "scooping." Quantitative data, including data from experimental and survey-based studies, are somewhat more amenable to sharing than qualitative materials like transcripts, yet overall adoption lags due to field-specific methodological diversity and human subjects protections. Privacy and ethical constraints constitute primary impediments, as these disciplines frequently involve sensitive personal data from human participants, including mental health records, behavioral responses, and demographic details subject to regulations like HIPAA in the United States or GDPR in Europe. Institutional review boards (IRBs) often impose stringent conditions on data release to safeguard confidentiality, with researchers citing fears of re-identification, participant harm, or breaches of informed consent as deterrents; for instance, qualitative data sharing raises worries about the lack of explicit participant permission and the erosion of trust. In education research, a social science subdomain, barriers include IRB hurdles and the risk that secondary users lacking contextual expertise will misinterpret data, further compounded by legal frameworks like FERPA that restrict sharing identifiable student information. These issues are exacerbated in psychology, where digital behavioral data collection heightens inadvertent privacy risks, prompting calls for de-identification techniques like aggregation or synthetic data generation, though implementation remains inconsistent.
The reproducibility crisis in psychology underscores data sharing's potential benefits while highlighting its deficiencies, as large-scale replication efforts have yielded success rates substantially below original study expectations (often around 36% for key effects in cognitive and social psychology experiments), partly attributable to unavailable data. Lack of accessible datasets impedes independent verification, with analyses linking non-sharing to inflated false positives from selective reporting or p-hacking, practices more prevalent in fields reliant on significance testing. In social sciences, similar patterns emerge, where institutional and normative factors, including career pressures favoring novel findings over replication, discourage proactive sharing; mandated policies and repositories have shown modest improvements when data are deposited, though practical constraints such as technical skills and resource access continue to limit uptake. Despite these challenges, targeted interventions, such as open data badges in journals or federated access systems that preserve privacy, have encouraged gradual shifts, with psychologists reporting higher willingness to share when preconditions like standardized formats and ethical safeguards are met.
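As a rough illustration of the de-identification by aggregation mentioned above, the sketch below coarsens exact ages into bands and publishes only group counts, suppressing any cell smaller than a threshold. The field names, the sample records, and the threshold of 5 are assumptions made for this example, not values taken from any cited policy, and small-cell suppression alone does not guarantee that a dataset is safe to release.

```python
from collections import Counter


def age_band(age, width=10):
    """Coarsen an exact age into a band such as '20-29'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def aggregate_with_suppression(records, keys, min_cell_size=5):
    """Count records per combination of `keys`.

    Cells with fewer than `min_cell_size` records are reported as None
    so that very small groups are not exposed in the shared table.
    """
    counts = Counter(tuple(record[k] for k in keys) for record in records)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Hypothetical participant-level records (illustrative only).
participants = [
    {"age": 21, "region": "north"},
    {"age": 22, "region": "north"},
    {"age": 25, "region": "north"},
    {"age": 28, "region": "north"},
    {"age": 29, "region": "north"},
    {"age": 41, "region": "south"},
]

# Replace exact ages with bands before counting.
coarsened = [
    {"age_band": age_band(p["age"]), "region": p["region"]}
    for p in participants
]
table = aggregate_with_suppression(coarsened, ["age_band", "region"])
```

Here the five participants aged 20 to 29 in the north appear as a count, while the single participant in the south is suppressed. A real release would also need to consider combinations with other published tables and the conditions attached by the relevant IRB.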

Other Fields (e.g., Archaeology, Economics)

In archaeology, data sharing often involves depositing digital records of excavations, artifacts, and spatial data into repositories that adhere to the FAIR principles (findable, accessible, interoperable, and reusable) to enable secondary analysis. The Archaeology Data Service in the UK, for instance, emphasizes these principles to facilitate data discovery and reuse, though challenges persist due to inconsistent documentation and a historical emphasis on primary data collection over long-term reusability. Reusers frequently encounter barriers such as inadequate context for interpreting datasets, leading to difficulties in verifying findings or integrating data from multiple sites. Ethical and jurisdictional issues further complicate sharing in archaeology, particularly with Indigenous or culturally sensitive materials, prompting integration of the CARE principles (collective benefit, authority to control, responsibility, and ethics) alongside FAIR to respect the interests of the communities concerned. Repositories like tDAR (the Digital Archaeological Record) demonstrate successful reuse, such as reanalyzing chronological data from legacy projects, but many datasets remain siloed due to overlapping federal and state regulations that hinder standardized access. A 2023 study found that while digital archiving improves preservation, reuse rates lag because of insufficient metadata describing analytical processes. In economics, data sharing supports replication efforts amid a recognized replication crisis, where approximately 61% of experimental studies have replicated successfully in large-scale assessments, often hinging on access to original datasets and code. Barriers include fear of scooping, where researchers withhold proprietary or survey data to protect publication opportunities, and competitive funding models that incentivize short-term sharing but discourage long-term openness due to perceived risks to career advancement.
Economic analyses frequently rely on public datasets from sources like official statistics, yet proprietary data from firms or surveys is rarely shared fully, exacerbating replication gaps, as economists replicate others' work at low rates. Incentives for sharing in economics are misaligned by "publish-or-perish" pressures favoring novel results over verifiable replication packages, though journals increasingly mandate data and code deposits, enabling partial reproducibility in about 40-60% of cases depending on the subfield. Costly technical barriers, such as anonymizing sensitive economic data while preserving utility, further deter sharing, with studies showing that without policy enforcement, self-reported sharing intentions rarely translate to actual deposits. Despite these hurdles, targeted reforms like replication bounties or pre-registration have shown promise in some subfields, where shared data has enabled meta-analyses revealing incentive distortions in original studies.

Controversies and Real-World Outcomes

The reproducibility crisis refers to the widespread inability to replicate published scientific findings, with replication rates as low as 36% in psychology and 11-25% in preclinical research. Insufficient data sharing exacerbates this issue by preventing independent researchers from accessing the data necessary to verify analyses, detect errors, or rule out selective reporting and fabrication. Without the original data, replication attempts are limited to re-running reported methods on new samples, which cannot confirm whether original results stemmed from data manipulation or analytical flaws. Empirical studies demonstrate a direct link between data availability and replication success. In a large-scale replication effort in psychology by the Open Science Collaboration, many original studies lacked shared data, complicating verification; where data were available, reproducibility assessments revealed discrepancies in about 55% of cases, implying that reproducibility is likely lower still where data cannot be accessed. A survey of researchers identified unavailability of data as a primary barrier to reproducibility, cited by over 40% of respondents as a frequent cause of failed replications. In social sciences, an analysis of 250 articles from 2014-2017 found data available for only 7% of studies, which hinders independent checks. Data withholding often stems from the fear that sharing will expose errors, yet this practice perpetuates non-reproducible claims in the literature. For instance, at the journal Molecular Brain from 2017-2019, over 97% of manuscripts for which raw data were requested were rejected or withdrawn due to inadequate data provision, with many later published elsewhere without the raw data. This pattern suggests that non-sharing masks irreproducibility, allowing questionable findings to influence policy and further research. Academic incentives prioritizing novel publications over verification amplify the problem, as researchers avoid sharing to prevent "scooping" or criticism, despite evidence that data sharing enhances overall scientific reliability.
Mandated sharing policies, such as those from NIH post-2020, aim to mitigate these links by enforcing data deposition, though compliance remains uneven.

Compliance Failures and Enforcement Gaps

Despite mandates from major funders and journals, compliance with data sharing requirements remains low across scientific disciplines. An analysis of articles subject to International Committee of Medical Journal Editors (ICMJE) requirements for data sharing statements (DSS) found that only 0.6% of individual-participant data sets were deidentified and publicly available on journal websites, with most authors providing availability statements that promised sharing upon request but rarely delivering. Similarly, in a review of 2,941 publications, just 34% included any data sharing statement, with rates ranging from 52% in one field to lower in others, indicating inconsistent adherence even where policies exist. These figures persist despite journal policies: requests for data from authors who promised availability succeed in only 27-59% of cases, and 14-41% are ignored entirely. Enforcement mechanisms are often weak or absent, exacerbating non-compliance. Funding agencies like the NIH outline potential consequences for failing to comply with Data Management and Sharing (DMS) plans, such as special award conditions or termination, yet systematic monitoring is limited to self-reported progress updates, which lack independent verification. Perrino et al. argue that varying degrees of enforcement across policies undermine effectiveness, with non-binding requirements failing to compel sharing amid competing academic incentives. In high-impact medical journals, even mandatory policies yield incomplete data and code deposits, highlighting gaps in oversight where journals rarely retract or penalize non-compliant articles. Field-specific gaps further illustrate enforcement shortfalls. Journals with stringent data sharing mandates report higher DSS prevalence, but actual provision lags, as authors exploit ambiguities in "availability upon request" clauses without follow-through. Other studies report a DSS rate of 42%, yet over half of the authors who promise data withhold them, a shortfall attributable to unmonitored policies rather than technical barriers.
Leading funders perceive six core challenges, including insufficient incentives and verification tools, rendering policies more declarative than operative. This pattern suggests that without robust, automated checks or tied disbursements, systemic non-enforcement perpetuates selective sharing favoring high-profile or low-risk datasets.
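To make the idea of automated checks more concrete, the following sketch sorts data availability statements into rough categories, separating statements that point to a repository from those that only promise data "upon request". The keyword patterns, the category names, and the example statements are assumptions made for illustration; this is not a description of any screening tool that funders or journals actually use, and a real check would need far broader patterns plus manual review.

```python
import re

# Hypothetical keyword lists; real screening would need many more patterns.
REPOSITORY_PATTERN = re.compile(
    r"doi\.org/|\bdoi:\s*10\.|zenodo|figshare|dryad|dataverse|osf\.io|dbgap|genbank",
    re.IGNORECASE,
)
ON_REQUEST_PATTERN = re.compile(
    r"(up)?on\s+(reasonable\s+)?request|from the (corresponding )?authors?",
    re.IGNORECASE,
)


def classify_statement(statement):
    """Return a rough category for a data availability statement.

    Categories (illustrative): 'missing', 'repository', 'on_request', 'unclear'.
    """
    text = (statement or "").strip()
    if not text:
        return "missing"
    # A repository link or name is treated as the strongest signal.
    if REPOSITORY_PATTERN.search(text):
        return "repository"
    if ON_REQUEST_PATTERN.search(text):
        return "on_request"
    return "unclear"


# Hypothetical statements (the DOI below is a placeholder, not a real record).
examples = [
    "Data are deposited at https://doi.org/10.0000/example.",
    "Data are available from the corresponding author upon reasonable request.",
    "All relevant data are within the paper.",
    "",
]
labels = [classify_statement(s) for s in examples]
```

A classifier of this kind can only flag what a statement claims. Confirming that the deposited files exist, are complete, and match the article would still require following the link or contacting the authors.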

Success Stories and Counterexamples

The Human Genome Project exemplified successful data sharing through the Bermuda Principles, established in 1996, which required the rapid public release of sequence data within 24 hours of assembly, fostering international collaboration and accelerating the project's completion two years ahead of schedule in 2003. This approach generated over 3.8 million research papers citing the project by 2020 and enabled downstream discoveries, such as identifying genes linked to diseases like cystic fibrosis, by making data accessible to thousands of independent researchers worldwide. In the COVID-19 response, immediate deposition of viral genome sequences to public repositories in January 2020 allowed for phylogenetic analysis and variant tracking, directly informing vaccine designs by companies including Pfizer-BioNTech, which received emergency authorization by December 2020. Over 15 million sequences were shared by mid-2023, enabling real-time surveillance that prevented an estimated 1.3 million deaths through optimized distribution modeling. Counterexamples highlight implementation failures despite policy mandates. A 2022 mixed-methods study of 2,700 biomedical papers found that only 6% of authors claiming availability actually provided accessible data upon request, undermining reproducibility and wasting an estimated $28 billion annually in U.S. biomedical research due to non-shared datasets. In genomics, post-HGP shifts toward controlled-access models for sensitive data, such as the NIH's dbGaP database requiring data use agreements since 2008, have slowed secondary analyses; a 2021 review noted that restricted access delayed insights into rare variants by months compared to open models. Scooping risks, though often cited as a barrier, rarely materialize but can deter sharing.
A 2017 Finnish study documented researchers employing strategies like timestamped preprints and modular release to mitigate such fears, yet one instance involved a competitor publishing derivative findings from shared preliminary datasets before the originators could publish, which eroded trust. In paleontology, a 2022 allegation that a researcher fabricated data relating to a shared extinction-site study in order to preempt a collaborator's publication illustrates misuse potential, though the case centered on falsification rather than legitimate reuse. These instances underscore that while systemic non-compliance and rare abuses persist, proactive policies like citation credits for datasets, implemented on some repository platforms since 2014, can align incentives without fully eliminating risks.

Recent Advances and Future Prospects

Policy Updates Post-2020 (e.g., NIH DMS Policy)

The National Institutes of Health (NIH) finalized its Data Management and Sharing (DMS) Policy in October 2020, with implementation effective for all competing grant applications submitted on or after January 25, 2023. This policy requires researchers to develop and submit a DMS Plan outlining how scientific data from NIH-funded projects will be managed, preserved, and shared to maximize its reuse and value, including provisions for data formats, metadata standards, and access timelines. Unlike prior NIH data sharing guidance, which applied selectively to certain institutes or data types, the DMS Policy applies uniformly to all extramural research generating scientific data, regardless of funding amount, and mandates prospective budgeting for data management and sharing activities, with costs allowable in NIH budgets starting from the effective date. Scientific data must be made available in designated repositories no later than the end of the performance period or the time of an associated publication, whichever comes first, while respecting privacy, proprietary, and ethical constraints. NIH guidance describes six elements of a DMS Plan: data type; related tools, software, and code; standards; data preservation, access, and associated timelines; access, distribution, or reuse considerations; and oversight of data management and sharing. NIH institutes provide supplemental guidance on plan formats and review criteria. NIH evaluates compliance through just-in-time submissions for funded awards, review of plans, and post-award oversight, including potential enforcement via funding restrictions for non-compliance, though initial implementation emphasized education over penalties. By July 2023, NIH reported that over 90% of applicable applications included DMS Plans, reflecting broad adoption, though challenges persist in defining "scientific data" (which excludes physical collections and lab notebooks) and in selecting appropriate repositories from NIH's recommended list.
Complementing NIH's efforts, the White House Office of Science and Technology Policy (OSTP) issued a memorandum on August 25, 2022, directing all federal agencies to update public access policies for scholarly publications and underlying data from federally funded research, eliminating previous embargo periods and prioritizing immediate, equitable access. This "Nelson Memo" requires agencies to have revised policies in effect no later than December 31, 2025, with implementation phased to enhance data discoverability, interoperability, and reuse through standardized metadata and federal data repository coordination, building on the 2013 Holdren Memo but extending zero-embargo access to data alongside publications. Agencies like the National Science Foundation (NSF) aligned their data management plans with similar requirements effective January 2023, mandating data sharing plans for all proposals and emphasizing FAIR (Findable, Accessible, Interoperable, Reusable) principles. These updates aim to address longstanding barriers to reproducibility and collaboration, though implementation varies by agency, with OSTP encouraging harmonized federal standards to minimize researcher burden.

Technological Facilitators and Repositories

![Decision tree for data deposition in journals][float-right] Technological facilitators for research data sharing include standardized frameworks such as the FAIR principles, which emphasize making data findable through unique identifiers like DOIs, accessible via open protocols, interoperable with common formats and vocabularies, and reusable with clear licenses and provenance information. These principles, formalized in 2016, underpin many repository implementations by requiring rich metadata to enable automated discovery and integration. Cloud computing further enables scalable storage and computation, allowing repositories to handle large datasets without local infrastructure, as seen in cloud-native systems that offer high reliability and cost-efficiency for big scientific data. Federated access protocols facilitate secure, controlled sharing across platforms, reducing duplication while preserving privacy. Key repositories for data sharing include generalist platforms like Zenodo, operated by CERN since 2013, which assigns DOIs to datasets and supports files up to 50 GB with long-term preservation commitments. Figshare, launched in 2011, allows immediate publication of research outputs with citation metrics and integration with ORCID for author tracking. Dryad, a nonprofit repository founded in 2008, specializes in curated data packages linked to publications, applying CC0 licenses and providing curation services. Harvard Dataverse, part of the Dataverse Project since 2006, offers institutional branding and APIs for programmatic access, hosting over 80,000 datasets as of 2023. Domain-specific repositories enhance sharing in targeted fields; for instance, GenBank for genomic sequences and ICPSR for social science data provide specialized schemas aligned with disciplinary standards. The Open Science Framework (OSF), developed by the Center for Open Science in 2013, combines data storage, preregistration, and related tools to support reproducible workflows.
NIH guidelines, updated in 2023, recommend repositories with features like persistent identifiers and access controls, prioritizing those that minimize costs for public data while accommodating sensitive information through restricted access tiers. Emerging technologies, including IPFS for decentralized storage, are being piloted to address trust and permanence issues in sharing. Despite these advances, adoption varies, with generalist repositories handling multidisciplinary data but often requiring manual curation to meet FAIR compliance fully.
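The way the FAIR principles translate into repository metadata can be sketched as a simple completeness check. The example below groups a handful of metadata fields under each principle and reports which ones are absent from a record. The field names and the grouping are hypothetical choices made for this illustration; they do not correspond to the schema of Zenodo, Figshare, Dryad, or any other repository, and passing such a check would not by itself make a dataset FAIR.

```python
# Hypothetical mapping from FAIR principles to metadata fields.
REQUIRED_FIELDS = {
    "findable": ["identifier", "title", "description"],
    "accessible": ["access_url", "access_protocol"],
    "interoperable": ["file_format", "vocabulary"],
    "reusable": ["license", "provenance"],
}


def missing_fair_fields(record):
    """Return {principle: [missing field names]} for a metadata record.

    A field counts as missing when it is absent, None, or blank.
    An empty result means every listed field is filled in.
    """
    gaps = {}
    for principle, fields in REQUIRED_FIELDS.items():
        absent = []
        for field in fields:
            value = record.get(field)
            if value is None or not str(value).strip():
                absent.append(field)
        if absent:
            gaps[principle] = absent
    return gaps


# Hypothetical dataset record (placeholder identifier and URL).
record = {
    "identifier": "doi:10.0000/example",
    "title": "Survey responses, wave 1",
    "description": "De-identified responses from a pilot survey.",
    "access_url": "https://repository.example.org/datasets/1",
    "access_protocol": "https",
    "file_format": "text/csv",
    "vocabulary": "",
    "license": "CC0-1.0",
}
gaps = missing_fair_fields(record)
```

For this record the check reports a blank controlled vocabulary and a missing provenance note, the kind of omission that manual curation would otherwise have to catch.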

Potential Reforms to Align Incentives

One proposed reform involves revising academic evaluation criteria to explicitly reward data sharing during hiring, promotion, and tenure decisions. Institutions could incorporate metrics such as dataset citations, reuse rates, and contributions to public repositories into faculty assessments, shifting emphasis from publication count to broader impact. A 2021 scoping review of interventions found that such incentive alignments, when tied to advancement, increased sharing rates in some fields, where sharing rose from 0.6% pre-mandate to over 50% after journal policies rewarded it. Funding agencies could further align incentives by conditioning grants on verifiable data management and sharing plans, with priority given to applicants demonstrating prior sharing or replication efforts. For instance, the NIH has explored extending its Data Management and Sharing Policy to include bonus funding for high-impact shared datasets, addressing the current misalignment where non-sharing preserves competitive edges in grant cycles. Proponents argue this counters a dynamic in which proprietary data hoarding reduces collective scientific progress, as evidenced by surveys showing 75% of researchers citing career risks as barriers to sharing. Publishers and journals might implement tiered incentives, such as open data badges conferring citation advantages or dedicated tracks for data-focused publications. A 2025 report from the Research Data Alliance recommends that journals weight data contributions in impact factors, potentially increasing sharing compliance by 20-30% based on prior badge experiments in ecology journals. Additionally, creating markets for data reuse, via platforms rewarding originators with royalties or co-authorship credits, could monetize data contributions, though empirical tests remain limited to pilot programs. Institutional and cultural reforms, including dedicated funding for data curation (e.g., 5-10% grant overheads), could mitigate preparation costs that deter sharing.
A 2025 initiative allocates $1.5 million for proposals reforming tenure tracks to value open practices, aiming to normalize sharing as a standard part of research rather than an extracurricular burden. These measures collectively address root causes, such as the risk that shared data will be scooped, by fostering an environment in which openness yields tangible returns over secrecy.

References

  1. [1]
    Sharing research data - PMC - NIH
    Data sharing refers to the practice of saving research data (eg measurements, observations, and transcripts) and metadata.
  2. [2]
    What Drives Academic Data Sharing? - PMC - PubMed Central
    Data sharing in research is attributed a vast potential for scientific progress. It allows the reproducibility of study results and the reuse of old data for ...
  3. [3]
    The FAIR Guiding Principles for scientific data management ... - Nature
    Mar 15, 2016 · This article describes four foundational principles—Findability, Accessibility, Interoperability, and Reusability—that serve to guide data ...
  4. [4]
    Pay it Forward and Free your Data! Fear in the Way of Data Sharing ...
    May 24, 2024 · Scooping prompts a concern of losing control over intellectual property ... Given the legal landscape surrounding data privacy, sharing data ...
  5. [5]
    A systematic review of barriers to data sharing in public health
    Nov 5, 2014 · We identified 20 unique real or potential barriers to data sharing in public health and classified these in a taxonomy of six categories: ...
  6. [6]
    The most prevalent perceived barriers to sharing research data at ...
    May 9, 2025 · The primary perceived barrier to data sharing was a lack of time, identified as the top-ranked barrier in both sections of the survey. The ...
  7. [7]
    Data Sharing by Scientists: Practices and Perceptions | PLOS One
    Data sharing is a valuable part of the scientific method allowing for verification of results and extending research from prior results. Methodology/Principal ...<|control11|><|separator|>
  8. [8]
    Scientific Data Sharing - NIH Office of Science Policy
    Sharing scientific data accelerates biomedical research discovery, enhances research rigor and reproducibility, provides accessibility to high-value datasets,
  9. [9]
    FAIR Principles
    The principles refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure.
  10. [10]
    WHO data principles - World Health Organization (WHO)
    WHO shall make every effort to release data publicly and to share when safe and ethical to do so. Unless there is a legitimate justification to the contrary, ...
  11. [11]
    Data Sharing: Principles and Considerations for Policy Development
    Principles · Sharing data promotes scientific progress. · Sharing data within the larger scientific community encourages a culture of openness and accountability ...
  12. [12]
    The Enlightenment had its own internet: The Republic of Letters
    May 1, 2024 · The Republic was a network through which many scientists and philosophers communicated in the 17th and 18th centuries. Digital analyses of ...
  13. [13]
    Mapping the Republic of Letters
    Circulation of Knowledge and Learned Practices in the 17th-century Dutch Republic, enables browses and analysis of 20,000 letters that were written by and sent ...Missing: 18th | Show results with:18th
  14. [14]
    The Astronomers Tycho Brahe and Johannes Kepler
    Jan 30, 2012 · During Kepler's time in Prague working as Tycho's assistant, they fought continuously, because Tycho refused to share his meticulous ...
  15. [15]
    [PDF] A Data Sharing Story
    From the early days of modern science through this century of Big Data, data shar- ing has enabled some of the greatest ad- vances in science.Missing: 1900 | Show results with:1900
  16. [16]
    History of Philosophical Transactions | Royal Society
    March 6, 1665. Henry Oldenburg, the Secretary of the Society, publishes the first edition of the world's longest-running scientific periodical. It includes ...
  17. [17]
    'Data' in the Royal Society's Philosophical Transactions, 1665–1886
    Oct 9, 2019 · This paper contributes to understanding the history of the concept of data by studying a specific development before the current data revolution ...
  18. [18]
    A Brief History of Data Sharing in the U.S. Long Term Ecological ...
    Jan 1, 2010 · The 1990 guidelines and the site data-sharing policies did not explicitly address the online sharing of data. In 1990, only one LTER site ...
  19. [19]
    [PDF] Do not cite - 5/21/2004 - A Brief History of Data Sharing ... - VCR LTER
    May 21, 2004 · The U.S. Long-Term Ecological Research (LTER) Network has a strong policy to promote the sharing of data, both inside the LTER Network and with ...
  20. [20]
    The Bermuda Triangle: The Pragmatics, Policies, and Principles for ...
    The Bermuda Principles for DNA sequence data sharing are an enduring legacy of the Human Genome Project (HGP). They were adopted by the HGP at a strategy ...
  21. [21]
    The Bermuda Triangle: The Pragmatics, Policies, and Principles for ...
    Nov 2, 2018 · The Bermuda Principles for DNA sequence data sharing are an enduring legacy of the Human Genome Project (HGP). They were adopted by the HGP ...
  22. [22]
    Moving beyond Bermuda: sharing data to build a medical ...
    Genomics has strong precedents for broad data sharing and open science, most notably the Bermuda Principles of 1996, now two decades old. However, the very ...
  23. [23]
    Data Sharing in Historical Perspective - Social Science Space
    Sep 23, 2015 · A very different discipline, meteorology, has developed since before 1900, based on data shared internationally. An international standard ...
  24. [24]
    Messing with Merton: The intersection between Open Science ...
    The value of communism is connected to a norm of sharing, but this norm is in important ways different from the sharing norms associated with Open Science.
  25. [25]
    [PDF] Scientific Communication and the Collapse of the Mertonian Norms
    The communistic norm refers to the sharing of scientific information among scientists and for the good of the scientific enterprise. In. Merton's eloquent ...<|separator|>
  26. [26]
    Anti-Mertonian norms undermine the scientific ethos: A critique of ...
    Adherence to these norms improves the quality of academic science by discouraging questionable research practices (such as data withholding) and academic ...Missing: sharing | Show results with:sharing
  27. [27]
    Data Sharing: An Ethical and Scientific Imperative - JAMA Network
    By having all data available for reexamination and replication of analyses, data sharing may help ensure that the publications have fidelity to the trial plan.
  28. [28]
    Data sharing practices and data availability upon request differ ...
    Jul 27, 2021 · This study aims to map and evaluate cross-disciplinary differences in data sharing, authors' concerns and reasons for denying access to data, ...Missing: formal | Show results with:formal
  29. [29]
    Falsifiability in medicine: what clinicians can learn from Karl Popper
    May 22, 2021 · Have the data been evaluated by a high-quality test? And if not, could it theoretically be corroborated or falsified by such a test in the ...
  30. [30]
    Promoting Data Sharing: The Moral Obligations of Public Funding ...
    Aug 6, 2024 · We argue from a research ethics perspective that public funding agencies have several pro tanto obligations requiring them to promote data sharing.
  31. [31]
    Full article: The puzzle of sharing scientific data
    Feb 11, 2022 · We seek a deeper understanding of how researchers from different fields share their data and the barriers and facilitators of such sharing.
  32. [32]
    Reproducibility in Management Science - PubsOnLine - INFORMS.org
    Dec 22, 2023 · In this study, we directly assess the reproducibility of results reported in nearly 500 research articles published in Management Science, a ...
  33. [33]
    An empirical analysis of journal policy effectiveness for ... - NIH
    Mar 13, 2018 · An empirical analysis of journal policy effectiveness for ... Reproducibility and Data Sharing in Computational Science.” See www ...
  34. [34]
    Sharing Detailed Research Data Is Associated with Increased ...
    Publicly available data was significantly (p = 0.006) associated with a 69% increase in citations, independently of journal impact factor, date of publication, ...
  35. [35]
    A study of the impact of data sharing on article citations using journal ...
    Dec 18, 2019 · This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment.
  36. [36]
    The effect of Open Data on cost savings
    According to (Europe 2019), making research data openly available can save up to 9% of a project's costs by preventing unnecessary data collection and ...
  37. [37]
    Measuring the Impact of Data Sharing: From Author-Level Metrics to ...
    Dec 11, 2023 · To our knowledge, no existing study has yet comprehensively examined the repercussions of data sharing with formal economic analyses.
  38. [38]
    Economic and social benefits of data access and sharing - OECD
    Nov 26, 2019 · Overall, these studies suggests that data access and sharing can help generate social and economic benefits worth between 0.1% and 1.5% of ...
  39. [39]
    The Benefits of Data Sharing Now Outweigh the Risks | BCG
    Apr 29, 2024 · The value opportunity of data sharing is estimated to be 2.5% of global GDP.
  40. [40]
    The impact of open data: Six ways data availability benefits research ...
    Jul 24, 2025 · Open data increases understanding, enables reuse, supports trust, helps research go further, and accelerates discovery.
  41. [41]
    A focus groups study on data sharing and research data management
    Jun 17, 2022 · Sharing scientific research data has many benefits. Data sharing produces stronger initial publication data by allowing peer review and ...
  42. [42]
    Reaping the benefits of Open Data in public health - PMC - NIH
    Oct 3, 2019 · Open data allows for early signal detection, improved analysis, informs policy, increases public participation, and strengthens public health ...
  43. [43]
    How does data sharing affect the sustainable development of ...
    Public data openness contributes to sustainable development mainly by improving the operational capacity and innovation capacity of enterprises.
  44. [44]
    Data Management & Sharing Policy Overview - NIH Grants & Funding
    Aug 5, 2025 · NIH's 2003 Data Sharing Policy came into effect on October 1, 2003 and ended on January 25, 2023. The Policy remains in effect for applications ...
  45. [45]
    [PDF] 08-2022-OSTP-Public-Access-Memo.pdf - Biden White House
    Aug 25, 2022 · 3 Improving public access policies across the U.S. government to promote the rapid sharing of federally funded research data with appropriate ...
  46. [46]
    2022 OSTP Public Access Memo Guidance - SPARC
    Aug 1, 2025 · Revised agency policies for publications and data sharing are required to go into effect by December 31, 2025. Some agencies' policies are ...
  47. [47]
    2023 NIH Data Management and Sharing Policy
    Jan 25, 2023 · The NIH has issued a Data Management and Sharing (DMS) policy, effective January 25, 2023, to promote the sharing of scientific data.
  48. [48]
    ENG Data Management and Sharing Plan Guidance - NSF
    Annual reports required for all NSF multiyear awards should include information about progress made in data management and sharing of research products (e.g., ...
  49. [49]
    2022 OSTP Public Access Memo Release - OSTI.gov
    Sep 30, 2022 · The new memo asks agencies that have an existing public access plan to develop or update their plans within 180 days from August 25th, 2022. DOE ...
  50. [50]
    OECD Principles and Guidelines for Access to Research Data from ...
    These Principles and Guidelines for Access to Research Data from Public Funding (hereafter the “Principles and Guidelines”) provide broad policy recommendations ...
  51. [51]
    Recommendation on access to Research Data from Public Funding
    Jan 20, 2021 · The Recommendation concerning Access to Research Data from Public Funding was originally adopted by the OECD Council on 14 December 2006 (“2006 ...
  52. [52]
    Open science - European Commission
    Under Horizon Europe, researchers are not obliged to publish their results in publications, however if they choose to do so, it should be in open access. What ...
  53. [53]
    How to comply with Horizon Europe mandate for RDM - OpenAIRE
    Aug 8, 2022 · The first step to comply with RDM requirements in Horizon Europe is to develop a Data Management Plan (DMP). A DMP is a document that ...
  54. [54]
    Progress in promoting data sharing in public health emergencies
    This group is working to identify barriers to data sharing in public health emergencies that should be addressed to better prepare for any future epidemic.
  55. [55]
    New WHO policy requires sharing of all research data
    Sep 16, 2022 · Data sharing is now a requirement for research funding awarded by WHO and TDR. “We have seen the problems caused by the lack of data sharing on ...
  56. [56]
    [PDF] Principles of data sharing in public health emergencies | GLOPID-R
    These principles are intended to inform the development of a framework and system to support data sharing1 in public health emergencies as part of the research ...
  57. [57]
    Data Policy for Times of Crisis (DPTC) - codata
    This project develops guidance and tools for data policy in times of crisis facilitated by open science.
  58. [58]
    Vivli - Center for Global Clinical Research Data
    A global clinical research data sharing platform. The Vivli team is dedicated to helping researchers share and access data from clinical trials to advance ...
  59. [59]
    Data Sharing - The Multi-Regional Clinical Trials Center of Brigham ...
    Vivli has established a data platform wherein data are hosted, searched, and accessed. Vivli is now the largest global clinical trial platform with over 7,500+ ...
  60. [60]
    ClinicalStudyDataRequest.com
    ClinicalStudyDataRequest.com (CSDR) is a consortium of clinical study Sponsors. It is a leader in the data sharing community inspired to drive scientific ...
  61. [61]
    A 10-year update to the principles for clinical trial data sharing by ...
    Oct 23, 2023 · Data sharing is essential for promoting scientific discoveries and informed decision-making in clinical practice.
  62. [62]
    Open Data Collaboration and Sharing | Microsoft CSR
    We partner with industry and open data leaders to advance open data access and private sector data sharing for societal benefit.
  63. [63]
    How researchers can meet new open data policies for federally ...
    Jan 24, 2023 · Researchers can use AWS to meet this challenge and design data architectures that optimize research abilities while supporting secure and cost-effective access ...
  64. [64]
    (PDF) Barriers to Data Sharing among Private Sector Organizations
    Sep 22, 2022 · We identify key challenges to successful data sharing among private sector organizations and, hence call for additional endeavors in data sharing.
  65. [65]
    Data Sharing for Research: A Compendium of Case Studies ...
    Aug 17, 2023 · Corporate data-sharing partnerships offer compelling benefits to companies, researchers, and society to drive progress in a broad array of ...
  66. [66]
    [PDF] Barriers to Data Sharing among Private Sector Organizations
    Dec 27, 2022 · However, knowledge on barriers to data sharing among private sector organizations is scarcely existent in scientific literature. Therefore, we.
  67. [67]
    Barriers to Data Sharing - Sharing Clinical Research Data - NCBI - NIH
    Mar 29, 2013 · Incentives in academia to keep data private for purposes of professional advancement can hinder data sharing, but new models for research allow ...
  68. [68]
    Data Authorship as an Incentive to Data Sharing
    Mar 29, 2017 · This system discourages data sharing by creating incentives for investigators to maximize the publication of subsequent analyses from a given ...
  69. [69]
    Data Sharing in Biomedical Sciences: A Systematic Review of ...
    Background: The lack of incentives has been described as the rate-limiting step for data sharing. Currently, the evaluation of scientific productivity by ...
  70. [70]
    'Publish or perish' culture blamed for reproducibility crisis - Nature
    Jan 20, 2025 · Sixty-two per cent of respondents said that pressure to publish “always” or “very often” contributes to irreproducibility, the survey found.
  71. [71]
    Neither carrots nor sticks? Challenges surrounding data sharing ...
    Sep 7, 2022 · The existing hurdles for researchers to share data are manifold and include ethical, legal, economic, or motivational barriers [18–24].
  72. [72]
    Addressing barriers in FAIR data practices for biomedical data - Nature
    Feb 23, 2023 · To address incentive barriers, we propose providing set-aside funding for data management efforts, rewarding good data practices, tracking ...
  73. [73]
    What incentives increase data sharing in health and medical ...
    May 5, 2017 · This study aims to systematically review the literature to appraise and synthesise scientific research papers that concern incentives that have ...
  74. [74]
    Issues and Challenges Associated with Data-Sharing in LMICs - NIH
    The process of data-sharing requires human and technical resources for data preparation, annotation, communication with recipients, computer equipment, and ...
  75. [75]
    Addressing Global Data Sharing Challenges - PMC - NIH
    A key challenge here is the cost of archiving data and the lack of sustainable business models for this activity. Sponsors and journals are increasingly ...
  76. [76]
    Issues and Challenges Associated with Data Sharing - NCBI - NIH
    This chapter summarizes presentations on a number of challenges associated with the sharing of data, including obstacles to releasing data, privacy and ...
  77. [77]
    Perceptual and technical barriers in sharing and formatting ...
    May 14, 2025 · Barriers include lack of uniform standards, privacy concerns, study design limitations, insufficient incentives, inadequate infrastructure, and ...
  78. [78]
    [PDF] Overcoming Barriers to Data Sharing in the United States
    Sep 25, 2023 · Overly restrictive data privacy laws and a lack of technical standards hinder sector-specific data sharing in fields such as education and ...
  79. [79]
    Researcher Challenges and Experiences with Data Services
    Mar 27, 2025 · Researchers in this sample reported challenges at every step in the research process—including funding, data storage, security, sharing, ...
  80. [80]
  81. [81]
    Why don't we share data and code? Perceived barriers and benefits ...
    Nov 23, 2022 · One of the major barriers to data and code sharing is a fear of being 'scooped'. Scooping in this context colloquially refers to a situation in ...
  82. [82]
    Sharing Research Data and Intellectual Property Law: A Primer - PMC
    Aug 27, 2015 · This Perspective explains how to work through the general intellectual property and contractual issues for all research data.
  83. [83]
    NIH Data Management and Sharing Policy: Patent and Intellectual ...
    Apr 16, 2024 · The NIH policy requires data sharing, potentially before patent application, but allows for data withholding to secure patents, with guidance ...
  84. [84]
    Exploring barriers and ethical challenges to medical data sharing
    Nov 15, 2024 · Without a robust and sustainable benefit-sharing mechanism, it becomes increasingly difficult to foster long lasting and meaningful data sharing ...
  85. [85]
    [PDF] Data Sharing: Benefits and Barriers - National Academies
    The barriers and the benefits of scientific data sharing. Examination of barriers to data sharing in the ... (examples: Meta analysis and data mining).
  86. [86]
    IOP Publishing Study: Navigating Barriers to Sharing Data Publicly
    Oct 21, 2024 · Only 55% share their data openly; Under 8% follow the FAIR principles; Biggest barrier: no (known) repository to submit data. Materials ...
  87. [87]
    Ready, set, share: Researchers brace for new data-sharing rules
    Jan 25, 2023 · It may be hard to overcome fears that researchers who share data won't get proper credit from others—or may even get scooped. “How do you ...
  88. [88]
    Breaking the silence of sharing data in medical research | PLOS One
    May 29, 2024 · Additionally, there may be challenges in ensuring the accuracy and integrity of shared data, as well as concerns regarding appropriate ...
  89. [89]
    Addressing biomedical data challenges and opportunities to inform ...
    Feb 21, 2025 · In this study, we explored common biomedical data tasks and pain points that could be addressed to elevate data quality, enhance sharing, streamline analysis, ...
  90. [90]
    Sharing sensitive data in life sciences: an overview of centralized ...
    Jun 5, 2024 · Sharing these data for research can help address unmet health needs, contribute to scientific breakthroughs, accelerate the development of more ...
  91. [91]
    Balancing ethical data sharing and open science for reproducible ...
    Apr 15, 2025 · Here we examine the balance between ethical and responsible data sharing and open science practices that are essential for reproducible research in biomedical ...
  92. [92]
    Making data meaningful: guidelines for good quality open data
    Jul 22, 2021 · Between 2014 and 2017, we found that public data sharing was uncommon (less than 4% of empirical papers; see Hardwicke et al., 2020, for ...
  93. [93]
    Data Sharing in Psychology: A Survey on Barriers and Preconditions
    Feb 15, 2018 · In this article, we report the outcome of a survey designed to reveal why researchers are reluctant to share data, and what can be done to make data sharing ...
  94. [94]
    Data Sharing in Psychology - PMC - NIH
    We draw on experiences in other domains to discuss attitudes towards data sharing, cost-benefits, best practices and infrastructure. We argue that the ...
  95. [95]
    Responsible Practices for Data Sharing - PMC - PubMed Central - NIH
    This paper discusses how research data in psychology can be made accessible for reproducibility and reanalysis.
  96. [96]
    Barriers and facilitators to qualitative data sharing in the United States
    Primary concerns were lack of participant permission to share data, data sensitivity, and breaching trust. Researcher willingness to share would increase if ...
  97. [97]
    Education researchers' beliefs and barriers towards data sharing
    Apr 29, 2025 · Results suggest education researchers generally hold positive attitudes towards data sharing, with 70% of the sample agreeing that it benefits their career.
  98. [98]
    Pioneering new ways to protect privacy
    Jan 1, 2020 · With more of psychologists' work going digital, the chances of inadvertently revealing people's private information is also escalating.
  99. [99]
    The replication crisis has led to positive structural, procedural, and ...
    Jul 25, 2023 · The emergence of large-scale replication projects yielding successful rates substantially lower than expected caused the behavioural, ...
  100. [100]
    No raw data, no science: another possible source of the ...
    Feb 21, 2020 · A reproducibility crisis is a situation where many scientific studies cannot be reproduced. Inappropriate practices of science, ...
  101. [101]
    Social scientists' data sharing behaviors: Investigating the roles of ...
    The purpose of this study is to locate individual, institutional, and resource factors that influence data sharing behaviors among social scientists.
  102. [102]
    A study of the determinants of psychologists' data sharing and open ...
    Apr 30, 2021 · This research examines whether attitudinal, normative, and behavioural control factors affect psychologists' data sharing and open data badge adoption ...
  103. [103]
    Promoting data quality and reuse in archaeology through ... - PNAS
    Oct 17, 2022 · Many archaeological projects face difficulties in reconciling identifiers across datasets created by different collaborators (3, 24). Rather ...
  104. [104]
    FAIR data - Archaeology Data Service
    FAIR data principles emphasize findability, accessibility, interoperability, and reusability to facilitate data discovery, access, and sharing.
  105. [105]
    [PDF] A Study of Context in Archaeological Data Reuse - OCLC
    Challenges include lack of data documentation, standards, and focus on data collection over reuse. Reusers need context to understand, verify, and trust data.
  106. [106]
    About - FAIR and CARE Data Principles - Open Context
    Good practice in archaeological data management works towards both the FAIR Data Principles and the CARE Principles for Indigenous Data Governance.
  107. [107]
    Archaeological Data Reuse in Action: Three FAIR Examples in tDAR
    The FAIR Principles for Data Stewardship asserts that data should be Findable, Accessible, and Reusable. Only by digitally preserving, efficiently curating, and ...
  108. [108]
    Understanding Data Reuse and Barriers to Reuse of Archaeological ...
    Aug 21, 2023 · This research aimed to understand how to optimise archives and interfaces to maximise the discovery, use and reuse of archaeological data.
  109. [109]
    A New Replication Crisis: Research that is Less Likely to be True is ...
    May 21, 2021 · In psychology, only 39 percent of the 100 experiments successfully replicated. In economics, 61 percent of the 18 studies replicated as did 62 ...
  110. [110]
    Table 1, Common data sharing barriers and incentives - NCBI - NIH
    Barriers, Incentives. Fear of scooping. Many researchers fear being scooped, losing career advancement opportunities and publication rights on their ...
  111. [111]
    The paradox of competition: How funding models could undermine ...
    Our results show that, in the short term, more competitive funding schemes may lead to higher rates of data sharing, but lower rates in the long term because ...
  112. [112]
    Do economists replicate? - ScienceDirect.com
    This paper summarizes existing replication definitions and reviews how much economists replicate other scholars' work.
  113. [113]
    Edward Miguel on the “Replication Crisis” in Economics and How to ...
    Sep 28, 2021 · There's the sort of most basic form of replication, which would just be to like grab our data. And then literally reproduce the analysis of ...
  114. [114]
    Incentivising research data sharing: a scoping review - PMC - NIH
    Dec 21, 2021 · This scoping review aims to identify and summarise evidence of the efficacy of different interventions to promote open data practices and provide an overview ...
  115. [115]
    Data sharing — Stanford Psychology Guide to Doing Open Science
    Improving reproducibility. Without access to the raw data, it is impossible to fully reproduce the results from a published study. Data sharing allows ...
  116. [116]
    An executive summary of science's replication crisis
    Aug 8, 2023 · A systematic study found that only about 55% of studies could be reproduced, and that's only counting studies for which the raw data were available.
  117. [117]
    An empirical assessment of transparency and reproducibility-related ...
    Feb 19, 2020 · In this study, we manually examined a random sample of 250 articles in order to estimate the prevalence of a range of transparency and reproducibility-related ...
  118. [118]
    Understanding of researcher behavior is required to improve data ...
    Although a reproducibility crisis is widely perceived, conclusive data on the scale of the problem and the underlying reasons are largely lacking. The debate is ...
  119. [119]
    Evaluation of Data Sharing After Implementation of the International ...
    Jan 28, 2021 · Only 2 (0.6%) individual-participant data sets were actually deidentified and publicly available on a journal website, and among the 89 articles ...
  120. [120]
    Data sharing statements: impact of journal policies across clinical ...
    May 30, 2025 · Among 2941 articles, 1004 (34.14%) included a DSS. Data sharing statement prevalence varied by discipline: cardiology (52%), general medicine ( ...
  121. [121]
    Poor data and code sharing undermine open science principles
    Apr 17, 2025 · The study highlights critical gaps in data and code sharing practices and policies in high-profile medical journals, revealing significant challenges to ...
  122. [122]
    Data sharing practices in high-impact rehabilitation journals
    May 10, 2025 · Journals with mandatory data sharing policies generally had high rates of DSS, demonstrating the potential effectiveness of stringent policies.
  123. [123]
    Assessing the prevalence, quality and compliance of data-sharing ...
    42% of gastroenterology articles had a DSS, with variability among journals. Open-access and impact factor were associated with DSS, but over half of authors ...
  124. [124]
    Human Genome Project Fact Sheet
    Jun 13, 2024 · This landmark agreement has been credited with establishing a greater awareness and openness to the sharing of data in biomedical research, ...
  125. [125]
    Human Genome Project | Impact - Wellcome
    Feb 6, 2025 · The commitment to freely sharing Human Genome Project data paved the way for open science initiatives, encouraging global research and ...
  126. [126]
    The broken promise that undermines human genome research
    Feb 10, 2021 · Data sharing was a core principle that led to the success of the Human Genome Project 20 years ago. Now scientists are struggling to keep information free.
  127. [127]
    Retrospectively modeling the effects of increased global vaccine ...
    Oct 27, 2022 · In total, we estimate that a full vaccine sharing scenario would have prevented 295.8 million infections and 1.3 million deaths worldwide (as a ...
  128. [128]
    (PDF) Many researchers were not compliant with their published ...
    Jul 12, 2022 · Problems with effective research data sharing persist and these problems have been quantified by previous research as a lack of time ...
  129. [129]
    Afraid of Scooping – Case Study on Researcher Strategies against ...
    The risk of scooping is often used as a counter argument for open science, especially open data. In this case study I have examined openness strategies, ...
  130. [130]
    Paleontologist accused of faking data in dino-killing asteroid paper
    Dec 6, 2022 · Scientist who oversees famed extinction-day site denies claims he made up data to scoop a former collaborator.
  131. [131]
    Data sharing and the future of science | Nature Communications
    Jul 19, 2018 · Who benefits from sharing data? The scientists of future do, as data sharing today enables new science tomorrow.
  132. [132]
    Cloud-native repositories for big scientific data
    A “cloud-native data repository,” as defined in this article, offers several advantages over traditional data repositories—performance, reliability, cost- ...
  133. [133]
    Data-sharing technologies made easy | Deloitte Insights
    Dec 6, 2021 · Thanks to advances in data-sharing technologies, you can buy and sell potentially valuable information assets in highly efficient, cloud-based marketplaces.
  134. [134]
    6 Repositories to Share Research Data | Teamscope Blog
    Aug 20, 2019 · 6 repositories to share your research data · 3. Dryad Digital Repository · 4. Harvard Dataverse · 5. Open Science Framework · 6. Zenodo.
  135. [135]
    Data Repositories - Harvard Biomedical Data Management
    Data Repositories · Harvard Dataverse · Dryad · figshare · GigaDB · IEEE DataPort · Mendeley Data · Open Science Framework · Science Data Bank ...
  136. [136]
    Recommended Repositories | PLOS One
    Dryad Digital Repository · Dutch national centre of expertise and repository for research data (DANS) · figshare · Harvard Dataverse Network · Kaggle · Network Data ...
  137. [137]
    Find and Share: Data Repositories - Guides - Duke University
    Jun 24, 2025 · This list is intended to be a guide for health sciences data repositories and it is not exhaustive. We encourage the use of open repositories.
  138. [138]
    Selecting a Data Repository - NIH Grants & Funding
    Aug 5, 2025 · This page provides information about NIH's scientific data management and sharing policies and repositories, previously available on the NIH ...
  139. [139]
    Ten simple rules for organizations to support research data sharing
    Jun 15, 2023 · Data sharing requires a range of activities and skills beyond a single investigator uploading a dataset to a repository. The people involved in ...
  140. [140]
    Scaling Change by Restructuring Incentives: Make It Rewarding
    These changes include rewarding researchers for sharing data, replicating others' work, and publishing negative results. Open science practices must include ...
  141. [141]
    Recommendations on Open Science Rewards and Incentives
    May 6, 2025 · Funders should establish policies requiring Open Access to data produced by funded research and provide corresponding support. Publishers should ...
  142. [142]
    Data Sharing Approaches - NIH Grants & Funding
    Aug 5, 2025 · This page provides information about NIH's scientific data management and sharing policies and repositories, previously available on the NIH ...
  143. [143]
    The misalignment of incentives in academic publishing and ... - PNAS
    In this perspective, we discuss the complex issues of incentive alignment in academic publishing and alternative publication models aimed at addressing these ...
  144. [144]
    (PDF) The misalignment of incentives in academic publishing and ...
    For most researchers, academic publishing serves two goals that are often misaligned—knowledge dissemination and establishing scientific credentials.
  145. [145]
    [PDF] Recommendations on Open Science Rewards and Incentives
    Oct 17, 2024 · The session focused on the hurdles involved in opening up data and other research outputs in the research process, as well as on rewarding.
  146. [146]
    Fostering Incentives & Rewards for Open Practices | ICOR
    Jul 20, 2023 · Arcadia Science: Innovative approaches to reforming research incentives and rewards ... data sharing, and open access publication. Of the four ...
  147. [147]
    $1.5 million program targets changes to academic incentives
    Oct 6, 2025 · The program, called the Modernizing Academic Appointment and Advancement (MA3) Challenge, is seeking proposals for “bold, creative strategies ...