Wikidata
Wikidata is a free and open collaborative knowledge base hosted and maintained by the Wikimedia Foundation, operating as a multilingual central storage repository for structured data that is editable by both humans and machines.[1][2] Launched on October 29, 2012, it supports Wikimedia sister projects such as Wikipedia, Wikimedia Commons, Wikivoyage, Wiktionary, and Wikisource by providing shared factual data, managing interlanguage links, and enabling automated updates for elements like infoboxes and lists.[3][2] As of November 2025, Wikidata contains over 119 million data items, making it the largest Wikimedia project by content volume.[1]
The project's data model represents knowledge through items (entities like people, places, or concepts), properties (relations or attributes, with over 13,000 defined, many as external identifiers), and statements structured as subject-predicate-object triples, such as "Tim Berners-Lee (Q80) instance of (P31) human (Q5)."[4] Statements can include qualifiers for additional context (e.g., specifying a time period) and references to sources for verifiability, with ranks (preferred, normal, or deprecated) to indicate reliability or preference.[4] This flexible, extensible structure supports 12 core data types (e.g., strings, quantities, URLs) and integrates with extensions like WikibaseLexeme for linguistic data, allowing complex, computable representations of real-world knowledge.[4][5]
Initially funded by the Allen Institute for AI, the Gordon and Betty Moore Foundation, and Google, Inc., Wikidata has evolved into a key resource for open data initiatives, interlinking with external datasets and enabling tools like the Wikidata Query Service for SPARQL-based queries.[2][6] All content is released under the Creative Commons Zero (CC0) dedication, permitting unrestricted reuse and modification.[1] Its growth reflects community-driven contributions, fostering applications in research, cultural heritage, and beyond while addressing challenges like data quality and multilingual coverage.[2][4]
History
Inception and Early Development
In 2011, Wikimedia Deutschland proposed the creation of Wikidata as a central repository to address key challenges in Wikipedia maintenance, particularly the decentralized management of interlanguage links and the repetitive updating of infobox content across multiple language editions.[7] This initiative aimed to centralize structured data, reducing duplication and errors that arose from editors manually synchronizing links and facts in over 280 language versions of Wikipedia at the time.[7] The proposal outlined a phased approach, starting with interlanguage links to streamline navigation between articles on the same topic in different languages, thereby easing the burden on volunteer editors, especially in smaller Wikipedias.[7]
Development of Wikidata officially began on April 1, 2012, in Berlin, under the leadership of Denny Vrandečić and Markus Krötzsch, who had earlier explored semantic enhancements for Wikipedia.[8] The project was initiated by Wikimedia Deutschland, with initial funding secured from Google, the Allen Institute for Artificial Intelligence, the Gordon and Betty Moore Foundation, and support from the Wikimedia Foundation, totaling approximately €1.3 million for the early stages.[9] During these initial months, the team focused on designing the core data model, which relied on simple property-value pairs to represent entities, such as linking a city to its population or coordinates, allowing for flexible, machine-readable storage without rigid schemas.[10]
The beta version of Wikidata launched on October 29, 2012, initially restricting editing to the creation of items and their connections via interlanguage links to Wikipedia articles, marking the project's first operational phase.[11] This limited scope enabled early testing of the centralized linking system while laying the groundwork for broader structured data integration in subsequent phases.[10]
Key Milestones and Rollouts
Wikidata's development was structured around three primary phases, each building on the previous to expand its functionality and integration with Wikimedia projects.
Phase 1, from 2012 to 2013, established the foundational infrastructure for centralizing interlanguage links, replacing the fragmented system where each Wikipedia maintained its own links to other language versions. The project launched in beta on October 29, 2012, initially permitting users to create items (unique identifiers for concepts) and add sitelinks connecting them to corresponding articles across Wikimedia sites.[10] Pilot testing began in January 2013 with the Hungarian, English, and French Wikipedias, and by March 6, 2013, interlanguage links were enabled across all Wikipedias, streamlining maintenance and improving multilingual navigation.[10][12]
Phase 2 in 2013 introduced the core data model, enabling Wikidata to store structured facts beyond mere linking. Statements, consisting of a property-value pair with optional qualifiers and references, were first added on February 4, 2013, initially supporting limited data types such as items and Wikimedia Commons media files. Properties, which define the types of relationships (e.g., "instance of" or "country"), and sitelinks were integrated as essential components, allowing items to represent real-world entities with verifiable claims. Editing of statements was opened to the public on February 4, 2013, marking the transition to full community-driven content creation and significantly increasing contributions.[10]
Phase 3, beginning in 2014, focused on practical applications and broader interoperability, including the integration of Wikidata data into Wikipedia infoboxes and the central coordination of external identifiers. In July 2014, the English Wikipedia began widespread adoption of Lua modules to pull data from Wikidata into infobox templates, automating the display of structured information like birth dates or occupations while reducing redundancy across articles. This integration relied on properties dedicated to external identifiers (e.g., ISBN or GeoNames ID), positioning Wikidata as a hub for linking to external databases and enhancing data reuse beyond Wikimedia.[10]
Subsequent milestones extended Wikidata's scope to specialized data types. In May 2018, lexemes were introduced to support linguistic data, allowing the storage of words, forms, senses, and etymologies in multiple languages, thereby complementing Wiktionary and enabling queries on lexical relationships.[13] In 2019, entity schemas were launched using the Shape Expressions (ShEx) language to define and validate data models, helping enforce constraints on item structures and improve data quality through community-defined templates.[14] These advancements solidified Wikidata's role as a versatile, multilingual knowledge graph through 2023.
Recent Advancements (2024–2025)
A comprehensive community survey conducted in July 2024, with results released on October 13, 2025, revealed key insights into contributors' backgrounds and involvement patterns. The report highlighted that editing data remains the most common activity, while priorities such as research with Wikidata and building applications are growing, informing future developments including enhanced data quality frameworks to support more reliable and reusable structured information.[15]
Technical enhancements continued with the launch of the "Search by entity type" feature on June 19, 2025, which introduced typeahead compatibility in the Wikidata search box and allowed users to filter results specifically to items, properties, or lexemes. This update significantly improved navigation for users seeking particular data classes, streamlining access to the database's diverse entity types.[1] Accessibility efforts advanced through the introduction of a mobile editing prototype on June 12, 2025, enabling statement editing on items directly from mobile devices, a long-requested capability. Community feedback was actively solicited via video demonstrations and discussions, aiming to refine the tool for broader usability and inclusivity among editors on the go.[1]
The WikidataCon 2025 conference, held online from October 31 to November 2, 2025, and organized by Wikimedia Deutschland, gathered developers, editors, and organizations to explore advancements in linked open data, with a strong emphasis on AI integrations and collaborative tools for enhanced data connectivity.[16]
Wikimedia Deutschland's 2025 plan, outlined in February 2025 and aligned with strategic goals through 2030, prioritized scalability for Wikidata as part of broader linked open data infrastructure, targeting a doubling of technology contributors by 2030 to handle expanding data volumes and global participation. The plan also supported machine editing initiatives by improving editing experiences and productivity tools, facilitating automated contributions while maintaining community oversight.[17]
On October 29, 2025, Wikidata received official recognition as a digital public good from the Digital Public Goods Alliance, affirming its role as an openly licensed, collaborative knowledge base with over 1.6 billion facts that promotes equitable access to information worldwide, the second Wikimedia project after Wikipedia to earn this distinction.[18]
Core Concepts
Items and Identifiers
Items are the primary entities in Wikidata, representing real-world topics, concepts, or objects such as people, places, events, or abstract ideas.[19] Each item serves as a unique container for structured data about its subject, enabling the centralized storage and reuse of information across Wikimedia projects without redundancy.[20] For instance, the item for Douglas Adams is identified as Q42, which encapsulates all relevant data about the author in one place.[20]
Every item is assigned a unique identifier known as a Q-ID, consisting of the letter "Q" followed by a sequential numeric code, such as Q42 or Q7186 for Marie Curie.[19] These Q-IDs ensure global uniqueness within Wikidata, preventing duplication by providing a single reference point for each entity regardless of language or project.[20] As of November 2025, Wikidata contains over 119 million items, forming the foundational scale of its knowledge base.[1]
The structure of an item includes labels, which provide the primary name in each supported language (e.g., "Douglas Adams" in English for Q42); descriptions, offering a brief disambiguating summary (e.g., "English writer and humorist (1952–2001)"); and aliases, listing alternative names or variants (e.g., "DNA" as an alias for Q42).[19] Additionally, sitelinks connect the item to corresponding pages on other Wikimedia sites, such as linking Q42 to the English Wikipedia article on Douglas Adams, facilitating seamless cross-project navigation and data synchronization.[20] This structure lets a single Q-ID act as a stable anchor for linking across projects; for example, one item such as Q8470 for the 1988 Summer Olympics can be referenced uniformly in multiple Wikipedias, avoiding the need for separate, inconsistent entries.[19]
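As a concrete illustration of this structure, the following minimal sketch (assuming the third-party Python requests package) retrieves Q42's label, description, aliases, and sitelinks through the wbgetentities module of the MediaWiki Action API, which is covered further under Query Services and Data Access; field names follow the standard Wikibase JSON output.

    # Minimal sketch: read the label, description, aliases, and sitelinks of an
    # item (Q42, Douglas Adams) from the public MediaWiki Action API.
    import requests

    API = "https://www.wikidata.org/w/api.php"
    params = {
        "action": "wbgetentities",
        "ids": "Q42",                                      # the item's Q-ID
        "props": "labels|descriptions|aliases|sitelinks",  # parts of the item to return
        "languages": "en",
        "format": "json",
    }
    entity = requests.get(API, params=params, timeout=30).json()["entities"]["Q42"]

    print(entity["labels"]["en"]["value"])        # primary English name
    print(entity["descriptions"]["en"]["value"])  # short disambiguating description
    print([a["value"] for a in entity.get("aliases", {}).get("en", [])])  # alternative names
    print(entity["sitelinks"]["enwiki"]["title"]) # linked English Wikipedia article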
Properties
Properties in Wikidata are reusable attributes that function as unique descriptors to define relationships and values for items, forming the predicates in the knowledge graph's triples. Each property has a dedicated page on Wikidata and is identified by a unique alphanumeric label consisting of the prefix "P" followed by a sequential number, referred to as a P-ID; for example, P31 denotes "instance of," which classifies an item as belonging to a specific class or category.[21]
The creation of properties follows a structured, community-governed process to maintain relevance and avoid redundancy. Proposals for new properties are submitted to the dedicated Property proposal forum, where editors discuss their necessity, proposed datatype, and constraints; approval requires consensus or sufficient support from the community. Upon approval, the property is created by users with appropriate permissions, such as property creators or administrators, and assigned the next available sequential numeric ID starting from P1. This process ensures that properties are introduced only when they address a clear need in describing entities.[21]
Properties are categorized by their datatype, which dictates the structure and validation of values they accept, enabling diverse representations of information. Common datatypes include those for geographical coordinates (e.g., P625, coordinate location), dates and times (e.g., P569, date of birth), and external identifiers that link to external databases (e.g., P345 for IMDb ID or P214 for VIAF ID). These types support interoperability with other linked data systems. The most extensively used property is "cites work" (P2860), used on over 312 million item pages as of November 2025, primarily for bibliographic citations in scholarly and creative works.[21][22]
To promote data integrity, properties incorporate constraints that enforce validation rules on associated statements. Examples include format constraints to ensure values match expected patterns (e.g., ISO 8601 for dates), uniqueness constraints limiting a property to a single value per item (e.g., for identifiers like ISBN), and type constraints verifying that values align with specified classes or formats. These are defined on the property's page and checked automatically during editing, aiding in error detection and quality control.
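To show how a property's datatype is exposed programmatically, the sketch below (a non-authoritative example using Python's requests package) asks wbgetentities for the labels and datatypes of a few of the properties mentioned above; it assumes the module's documented datatype prop.

    # Sketch: look up the English label and datatype of several properties.
    # Property pages are ordinary entities addressed by their P-IDs.
    import requests

    API = "https://www.wikidata.org/w/api.php"
    params = {
        "action": "wbgetentities",
        "ids": "P31|P569|P625|P214",   # instance of, date of birth, coordinate location, VIAF ID
        "props": "labels|datatype",
        "languages": "en",
        "format": "json",
    }
    entities = requests.get(API, params=params, timeout=30).json()["entities"]
    for pid, entity in sorted(entities.items()):
        # Each property declares exactly one datatype, e.g. "wikibase-item",
        # "time", "globe-coordinate", or "external-id".
        print(pid, entity["labels"]["en"]["value"], entity["datatype"])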
Statements
In Wikidata, statements form the fundamental units of structured data, representing assertions about entities through a subject-predicate-object triple structure. The subject is an item, such as a person, place, or concept identified by a unique Q-number (e.g., Q42 for Douglas Adams); the predicate is a property (e.g., P31 for "instance of"); and the object is the value, which can be another item (e.g., Q5 for "human"), a string, a quantity, a date, or other supported data types.[23] This triple-based model aligns with linked data principles, enabling interconnections across the knowledge base.[4] A representative example is the statement for Douglas Adams (Q42): instance of (P31) human (Q5), which establishes the item's classification as a person.[23]
An item often carries multiple statements for the same property to accommodate real-world complexity, such as attributes that vary over time or by context, with each statement assigned a rank to indicate its status: preferred for the most reliable or current information, normal as the default, or deprecated for outdated or incorrect data.[23][24] Ranks help editors and consumers prioritize information without deleting historical details. Statements can also be enhanced with qualifiers for additional context, such as specifying a time period or location, though core assertions remain self-contained.[23]
As of April 2025, Wikidata encompasses approximately 1.65 billion statements, supporting complex queries and integrations across Wikimedia projects and beyond; this scale underscores Wikidata's role as a vast, collaborative repository of verifiable facts.[25]
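The sketch below (illustrative only, using Python and the public Special:EntityData endpoint) shows how a statement appears in the Wikibase JSON output: each claim bundles a main property-value pair with its rank and any qualifiers and references.

    # Sketch: inspect the "instance of" (P31) statements of Q42 as served by
    # Special:EntityData, printing each statement's value, rank, qualifier
    # properties, and reference count.
    import requests

    url = "https://www.wikidata.org/wiki/Special:EntityData/Q42.json"
    entity = requests.get(url, timeout=30).json()["entities"]["Q42"]

    for claim in entity["claims"]["P31"]:
        snak = claim["mainsnak"]
        value = snak["datavalue"]["value"]["id"]        # e.g. "Q5" (human)
        print(snak["property"], value, claim["rank"])   # rank: preferred, normal, or deprecated
        print("qualifiers:", sorted(claim.get("qualifiers", {})))
        print("references:", len(claim.get("references", [])))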
Lexemes
Lexemes were introduced to Wikidata in May 2018 to extend its data model beyond encyclopedic concepts, enabling the structured storage of linguistic and lexical information for words, phrases, and their variations across languages.[26] Unlike general items (Q-IDs), lexemes are specialized entities identified by unique L-IDs, such as L7 for the English noun "cat," allowing for the representation of language-specific lexical units.[27] This addition supports the integration of dictionary-like data, complementing projects like Wiktionary through shared identifiers and tools for cross-referencing.[26]
The core structure of a lexeme centers on its lemma, the canonical base form of the word (e.g., "cat" for L7), associated with a specific language item (such as Q1860 for English) and a lexical category denoting its grammatical role, like noun or verb.[28] Senses capture the distinct meanings of the lemma, each with a gloss and optional statements linking to related concepts; for instance, L7 includes senses for the domesticated animal and the musical instrument.[27] Forms represent inflected or derived variants, such as "cats" or "cat's," including textual representations and grammatical features like number (plural) or case, drawn from ontologies in the Linguistic Linked Open Data community.[28] These components allow lexemes to link to broader Wikidata items via statements, facilitating connections between lexical and encyclopedic knowledge.[29]
As of 2025, Wikidata's lexeme collection includes over 1.3 million entries across hundreds of languages, reflecting rapid community contributions.[30] This growth underscores lexemes' role in supporting Wiktionary integration, where data can be imported or exported to enrich dictionary entries.[26]
Lexemes enable detailed linguistic annotations, such as etymological links tracing word origins; for example, the Afrikaans lexeme for "hond" (dog, L208466) connects through derivations to Dutch, Middle Dutch, Old Dutch, Proto-Germanic, and ultimately Proto-Indo-European roots.[31] Pronunciation data, including International Phonetic Alphabet (IPA) transcriptions, is attached to forms and qualified by senses to specify contexts like regional accents.[28] These features promote applications in natural language processing and multilingual research by providing verifiable, interconnected lexical data.[31]
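As an illustration, the following sketch (Python, using the same entity-data endpoint; L7 is the lexeme cited above) walks the main parts of a lexeme record, lemma, language, lexical category, forms, and senses, as they appear in the Wikibase lexeme JSON.

    # Sketch: print the structure of a lexeme from Special:EntityData.
    import requests

    url = "https://www.wikidata.org/wiki/Special:EntityData/L7.json"
    lexeme = requests.get(url, timeout=30).json()["entities"]["L7"]

    print(lexeme["lemmas"])                                # canonical form(s), keyed by language code
    print(lexeme["language"], lexeme["lexicalCategory"])   # Q-IDs of the language and lexical category
    for form in lexeme["forms"]:                           # inflected variants such as a plural
        print(form["id"], form["representations"], form["grammaticalFeatures"])
    for sense in lexeme["senses"]:                         # distinct meanings with glosses
        print(sense["id"], sense["glosses"].get("en", {}).get("value"))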
Entity Schemas
Entity schemas in Wikidata are declarative models that define the expected structure and constraints for classes of entities, enabling validation to ensure data consistency and quality. Launched on May 28, 2019, they utilize the Shape Expressions (ShEx) language, expressed in ShExC syntax, and are stored as a dedicated entity type in the EntitySchema namespace, identifiable by the prefix "E". This infrastructure allows users to specify required properties, their cardinalities, allowed values, and relationships for specific item classes, such as mandating a birth date property for items of the class "person" (Q215627). Unlike property constraints, which focus on individual properties, entity schemas provide holistic shape definitions for entire entity sets, including qualifiers and references.[14][32]
The primary purpose of entity schemas is to model and validate RDF-based data structures within Wikidata, facilitating the detection of inconsistencies or errors during editing. Community members propose and develop schemas through the WikiProject Schemas, with versioning supported via Wikidata's page history mechanism, allowing revisions and tracking of changes over time. Integration with editing tools enhances usability; for instance, ShExStatements enables schema generation from CSV files and validation against Wikidata items, while tools like Entityshape and WikiShape provide visual interfaces for creation and testing. These features promote collaborative maintenance, where schemas can be proposed via requests for comment (RfC) to standardize data structures for particular subjects.[32][33][34]
In practice, entity schemas support domain-specific applications, particularly in biomedicine, where they ensure consistent representation of entities like genes, proteins, and virus strains. For example, schemas for molecular biology entities define mandatory properties such as sequence data or taxonomic classifications, aiding in the integration of biomedical ontologies and reducing variability in knowledge graph subsets. Research initiatives have proposed expanding these schemas to cover clinical entities, enhancing Wikidata's utility in health-related data modeling and validation.[35][36][37]
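For orientation, the fragment below embeds a hypothetical ShExC shape of the kind stored in the EntitySchema namespace as a Python string; the shape name, properties, and cardinalities are illustrative assumptions rather than a copy of any published schema.

    # Illustrative ShExC shape for items describing humans, kept as plain text.
    HUMAN_SHAPE = """
    PREFIX wd:  <http://www.wikidata.org/entity/>
    PREFIX wdt: <http://www.wikidata.org/prop/direct/>
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

    <#human> {
      wdt:P31  [ wd:Q5 ] ;        # must be declared an instance of human (Q5)
      wdt:P569 xsd:dateTime ? ;   # at most one date of birth
      wdt:P19  IRI *              # any number of place-of-birth items
    }
    """
    # A shape like this can be pasted into an EntitySchema page or checked with
    # an external ShEx validator against the RDF of individual items.
    print(HUMAN_SHAPE)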
Data Structure and Management
Qualifiers, References, and Constraints
In Wikidata, qualifiers, references, and constraints serve as essential mechanisms to add context, verifiability, and validation to statements, which are the core property-value pairs representing knowledge about entities.[23] Qualifiers provide additional details to refine a statement's meaning, references link statements to supporting sources for credibility, and constraints enforce rules to maintain data consistency and prevent errors. Together, these features enhance the reliability and usability of Wikidata's knowledge graph by allowing nuanced, sourced, and structured information without altering the primary statement structure.
Qualifiers are property-value pairs attached to a statement to expand, annotate, or contextualize its main value, offering further description or refinement without creating separate statements.[38] For instance, a statement about the population of France (66,600,000) might include qualifiers such as "excluding Adélie Land" to specify territorial scope, or for Berlin's population (3,500,000), qualifiers like "point in time: 2005" and "method of estimation" clarify temporal and methodological aspects.[38] Similarly, a statement designating Louis XIV as King of France could be qualified with "start time: 14 May 1643" and "end time: 1 September 1715" to denote the duration of his reign.[38] These qualifiers modify the statement's interpretation, such as constraining its validity to a specific period or context, while avoiding ambiguity by not altering other qualifiers on the same statement. By enabling such precision, qualifiers help resolve multiple possible values for a property and support community consensus on disputed facts through ranking mechanisms.[38]
References in Wikidata consist of property-value pairs that cite sources to back up a statement, ensuring its verifiability and traceability to reliable origins.[39] They typically employ properties like "stated in (P248)" to reference publications or items (e.g., books or journals) and "reference URL (P854)" for online sources, often supplemented with details such as author, publication date, or retrieval date.[39] For example, a statement about a scientific fact might reference the CRC Handbook of Chemistry and Physics via its Wikidata item, or an online claim could cite a specific webpage URL with the access date to account for potential changes.[39] References are required for most statements, except those involving common knowledge or self-evident data, and can be shared across multiple statements to promote efficiency. This sourcing practice upholds Wikidata's commitment to reliability, allowing users to verify claims against primary or authoritative materials like academic journals or official databases.[39]
Constraints are predefined rules applied to properties via the "property constraint (P2302)" property, functioning as editorial guidelines to ensure appropriate usage and detect inconsistencies in data entry.[40] Implemented through the Wikibase Quality Constraints extension, these rules, of which there are over 30 types, categorize into datatype-independent (e.g., single-value, which limits a property like place of birth to one value per entity) and datatype-specific (e.g., format, which validates identifiers against patterns like ISBN or email syntax).[40] For instance, a single-value constraint prevents duplicate entries for unique attributes, while a format constraint ensures telephone numbers adhere to expected structures. Violations are reported to logged-in editors via tools like the constraint report, though exceptions can be explicitly noted using qualifiers like "exception to constraint (P2303)" for edge cases, such as a fictional entity defying real-world rules. By providing these checks, constraints proactively prevent errors, promote data quality, and guide contributors toward consistent modeling, ultimately bolstering the graph's integrity without imposing rigid enforcement.[40]
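To show how qualifiers and references surface in queries, the sketch below (Python with the requests package; an illustrative, non-authoritative example) uses the full statement model of the Wikidata Query Service, where the p:, ps:, pq:, and pr: prefixes expose statement nodes, main values, qualifiers, and reference details for Douglas Adams's "educated at" (P69) statements.

    # Sketch: list Q42's "educated at" statements with their start-time
    # qualifier (P580) and any reference URL (P854), where present.
    import requests

    SPARQL = """
    SELECT ?school ?start ?refURL WHERE {
      wd:Q42 p:P69 ?statement .                  # the statement node itself
      ?statement ps:P69 ?school .                # main value of the statement
      OPTIONAL { ?statement pq:P580 ?start . }   # qualifier: start time
      OPTIONAL {
        ?statement prov:wasDerivedFrom ?ref .    # attached reference node
        ?ref pr:P854 ?refURL .                   # reference URL (P854)
      }
    }
    """
    resp = requests.get(
        "https://query.wikidata.org/sparql",
        params={"query": SPARQL, "format": "json"},
        headers={"User-Agent": "wikidata-article-example/0.1 (illustrative)"},
        timeout=60,
    )
    for row in resp.json()["results"]["bindings"]:
        print(row["school"]["value"],
              row.get("start", {}).get("value"),
              row.get("refURL", {}).get("value"))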
Editing Processes and Tools
Human editing on Wikidata primarily occurs through a web-based interface accessible via the project's main site, where users can search for existing items using titles or identifiers and create new ones if none exist.[41] To create an item, editors enter a label (the primary name in a chosen language) and a brief description to disambiguate it, followed by adding aliases for alternative names and interwiki links to corresponding Wikipedia articles in various languages.[41] Once created, statements, structured triples consisting of an item, property, and value, can be added directly in the interface, with options to include qualifiers and references for precision.[42]
For larger-scale human contributions, tools like QuickStatements enable batch uploads by allowing editors to input simple text commands or CSV files to add or modify labels, descriptions, aliases, statements, qualifiers, and sources across multiple items.[43] This tool, developed by Magnus Manske, processes commands sequentially via an import interface or external scripts, making it suitable for importing data from spreadsheets without needing programming knowledge, though users must supply existing item identifiers (QIDs) for accurate targeting.[43] Similarly, OpenRefine supports reconciliation of external datasets with Wikidata by matching values in tabular data (e.g., names or identifiers) to existing items through a dedicated service, flagging ambiguities for manual review and enabling bulk additions of new statements or links.[44] OpenRefine's process involves selecting a reconciliation endpoint (such as the Wikidata-specific API), restricting matches by entity types or languages, and using property paths to pull in details like labels or sitelinks for augmentation.[44]
Machine editing on Wikidata is governed by strict guidelines to ensure quality and prevent disruption, with bots, automated or semi-automated scripts, requiring separate accounts flagged as "bot" and operator contact information.[45] Bot flags are granted through community requests for permissions, where proposals detail the bot's purpose, such as importing identifiers from external databases (e.g., ISBNs or GeoNames IDs), and undergo review for compliance with edit frequency limits and error-handling mechanisms; global bots approved on Meta-Wiki may receive automatic authorization for specific tasks like interwiki maintenance.[45] Once approved, bots operate under reduced visibility in recent changes to avoid overwhelming human editors, but they must pause or be blocked if malfunctions occur, with flags revocable after discussion or prolonged inactivity.[45]
Collaboration and maintenance rely on version history, which tracks all edits to an item with timestamps, user attributions, and diffs for comparison, allowing reversion to prior states via the "history" tab.[42] Talk pages associated with each item facilitate discussions on proposed changes, disputes, or improvements, mirroring Wikimedia's broader discussion norms.[42] Reversion tools integrated into the interface enable quick undoing of errors or vandalism, often used in tandem with watchlists to monitor items.[42] Wikidata's community upholds norms emphasizing notability for items, requiring that they support Wikimedia projects, link to reliable sources, or fill structural roles, while promoting neutrality through unbiased descriptions and balanced statements.[42] All claims must be sourced to verifiable references, such as published works or databases, with unsourced statements discouraged and subject to removal; editors are encouraged to join WikiProjects for coordinated adherence to these standards.[42]
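As a rough illustration of the QuickStatements batch format described above, the sketch below assembles version 1 style commands (tab-separated columns) in Python; the items, properties, and source column are examples only, and the exact syntax accepted by the tool should be checked against its help page.

    # Sketch: assemble a small QuickStatements (V1) batch as tab-separated text.
    rows = [
        ("CREATE",),                                   # create a new item
        ("LAST", "Len", '"Example writer"'),           # English label for the new item
        ("LAST", "Den", '"hypothetical person used for illustration"'),
        ("LAST", "P31", "Q5"),                         # instance of (P31) human (Q5)
        ("Q4115189", "P31", "Q5", "S854",
         '"https://example.org/source"'),              # statement on the sandbox item with a reference URL
    ]
    batch = "\n".join("\t".join(row) for row in rows)
    print(batch)   # paste into the QuickStatements import box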
Content Scope and Quality Control
Wikidata's content scope encompasses structured data across diverse domains, including over 10 million biographies of humans marked as instances of the "human" class (Q5), detailed geographic entities such as locations and administrative divisions, medical concepts like diseases and treatments, and scholarly metadata through initiatives like WikiCite for citations and references. This breadth supports interoperability with Wikimedia projects and external applications while adhering to strict verifiability standards, ensuring all entries draw from reliable, published sources rather than primary data collection. Notably, Wikidata explicitly excludes original research, personal opinions, or unpublished material, positioning it as a secondary knowledge base that aggregates and links to authoritative references such as academic publications, official databases, and news outlets.[46][47][48]
To facilitate coordinated development within these thematic areas, Wikidata relies on community-driven WikiProjects that focus on specific domains, providing guidelines, property standards, and collaborative tasks. For instance, WikiProject Music standardizes properties like performer (P175), instrument (P1303), and release identifiers (e.g., Discogs master ID, P1954) to enhance coverage of compositions, artists, albums, and genres, while enabling cross-project data mapping from Wikipedia and Commons. These projects promote thematic consistency by organizing SPARQL queries for gap analysis, encouraging contributor participation through chat channels and task lists, and ensuring alignment with broader Wikidata schemas without imposing rigid notability criteria.[49][50]
Quality control mechanisms emphasize proactive detection and community oversight to uphold data integrity. Database reports, such as those tracking constraint violations, systematically scan for non-compliance with predefined rules, like mandatory qualifiers or format constraints, listing affected items and statements for editors to review and resolve, thereby preventing structural degradation. Community-voted deletions further support maintenance, allowing proposals for removing redundant or erroneous properties and items through dedicated request pages, where consensus guides administrative action. These tools integrate with editing interfaces to flag issues in real-time, drawing on templates like the Constraint template for automated validation.[51][52][53]
Despite these safeguards, challenges persist in maintaining accuracy, particularly vandalism detection and multilingual consistency. Vandalism, often involving disruptive edits like false statements or mass deletions, is mitigated through machine learning classifiers that analyze revision features, such as edit patterns and abuse filter tags, to identify 89% of cases while reducing patroller workload by 98%, as demonstrated in research prototypes adaptable to Wikidata's abuse filter. Multilingual consistency presents another hurdle, with studies revealing issues like duplicate entities, missing triples, and taxonomic inconsistencies across language versions, exacerbated by varying editorial priorities and source availability, though constraint checks and cross-lingual queries help address them.[54][55]
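The gap-analysis queries mentioned above can be as simple as the following sketch, which looks for people recorded as astronauts but lacking a date of birth; the occupation item used here (Q11631, astronaut) and the overall shape of the query are illustrative assumptions, not a query used by any particular WikiProject.

    # Sketch of a WikiProject-style gap-analysis query for the Wikidata Query
    # Service: humans with occupation astronaut but no date of birth (P569).
    GAP_QUERY = """
    SELECT ?person ?personLabel WHERE {
      ?person wdt:P31 wd:Q5 ;                          # instance of human
              wdt:P106 wd:Q11631 .                     # occupation: astronaut
      FILTER NOT EXISTS { ?person wdt:P569 ?dob . }    # no date of birth recorded
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    LIMIT 100
    """
    # Run it in the web query editor or send it over HTTP as in the earlier sketches.
    print(GAP_QUERY)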
Technical Infrastructure
Software Foundation
Wikidata's software foundation is built upon the MediaWiki platform, which provides the core wiki engine for collaborative editing and version control.[56] The Wikibase extension suite transforms MediaWiki into a structured data repository, enabling the creation, management, and querying of entities such as items and properties in a versioned, multilingual format.[57] This integration allows Wikidata to leverage MediaWiki's established infrastructure while adding specialized capabilities for knowledge graph operations.[58]
Data storage in Wikidata relies on a dual approach to handle both relational and graph-based needs. Items and properties are primarily stored in a MySQL database, which supports the revision history, entity metadata, and structured attributes through Wikibase's schema.[59] For RDF representations, the system uses Blazegraph as a triplestore to manage billions of RDF triples derived from Wikidata entities, facilitating efficient SPARQL queries via the Wikidata Query Service.[6] This separation ensures robust handling of both editable content and semantic linkages.
To address the scale of Wikidata's growing dataset, the infrastructure incorporates scalability features such as sharding and caching. Horizontal sharding partitions data across multiple Blazegraph nodes to distribute query loads and manage edit propagation, with ongoing efforts to optimize entity-based splitting.[60] Caching mechanisms, including in-memory stores and diff-based updates, reduce latency by minimizing redundant computations during data synchronization.[60] The entire system is hosted on servers managed by the Wikimedia Foundation in data centers across multiple locations, ensuring high availability and global access.[61]
Wikibase and its components are released under the GNU General Public License version 2.0 or later (GPL-2.0-or-later), promoting open-source development and allowing independent installations of Wikibase repositories beyond Wikidata.[62] This licensing aligns with MediaWiki's copyleft model, fostering community contributions and reuse in diverse structured data projects.
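The dual storage described above is visible from the outside: the same entity can be fetched as canonical Wikibase JSON (the form kept in the wiki database) and as RDF Turtle (the form loaded into the triplestore). The sketch below, assuming the Python requests package, retrieves both renderings from the public Special:EntityData endpoint.

    # Sketch: fetch one entity in the two serializations discussed above.
    import requests

    base = "https://www.wikidata.org/wiki/Special:EntityData/Q42"
    as_json = requests.get(base + ".json", timeout=30).json()  # canonical Wikibase JSON
    as_ttl = requests.get(base + ".ttl", timeout=30).text      # RDF in Turtle syntax

    print(list(as_json["entities"]["Q42"].keys()))  # labels, descriptions, claims, sitelinks, ...
    print(as_ttl[:300])                             # first prefixes and triples of the RDF view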
Query Services and Data Access
Wikidata provides several mechanisms for retrieving and manipulating its structured data, enabling users and applications to access the knowledge graph efficiently. The primary query service is the Wikidata Query Service (WDQS), consisting of SPARQL endpoints launched in September 2015 that support complex, federated queries across Wikidata's RDF triples and external linked data sources.
In May 2025, to enhance scalability, the WDQS backend was updated to split the dataset into a main graph (accessible via query-main.wikidata.org or the redirected query.wikidata.org) and a scholarly graph (query-scholarly.wikidata.org), with a legacy full-graph endpoint (query-legacy-full.wikidata.org) available until December 2025. Queries spanning both graphs now require SPARQL federation. The Wikimedia Foundation is also searching for a replacement for Blazegraph, the current triplestore backend, due to its lack of updates since 2018.[63][64][65][6]
This service allows for sophisticated pattern matching and filtering, such as retrieving all instances of cities with a population exceeding 1 million, by leveraging predicates like wdt:P1082 for population and wdt:P31 for instance-of relations.[66]
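A minimal sketch of that example, sent to the public endpoint with Python's requests package, is shown below; it assumes Q515 (city) as the target class and uses the label service for readable names.

    # Sketch: cities with a population over one million, via the WDQS endpoint.
    import requests

    QUERY = """
    SELECT ?city ?cityLabel ?population WHERE {
      ?city wdt:P31/wdt:P279* wd:Q515 .     # instance of city, or of a subclass of city
      ?city wdt:P1082 ?population .         # population (P1082)
      FILTER(?population > 1000000)
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    ORDER BY DESC(?population)
    LIMIT 20
    """
    resp = requests.get(
        "https://query.wikidata.org/sparql",   # redirects to the main-graph endpoint
        params={"query": QUERY, "format": "json"},
        headers={"User-Agent": "wikidata-article-example/0.1 (illustrative)"},
        timeout=60,
    )
    for row in resp.json()["results"]["bindings"]:
        print(row["cityLabel"]["value"], row["population"]["value"])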
In addition to SPARQL, Wikidata offers programmatic access through APIs tailored for different operations. The MediaWiki Action API facilitates both read and write interactions with entities, supporting actions like fetching entity data via wbgetentities or editing statements through wbeditentity.[67] Complementing this, the Wikibase REST API provides a modern, stateless interface primarily for entity retrieval, such as obtaining JSON representations of items or properties without the overhead of session-based authentication.[68] These APIs adhere to standard HTTP practices, with endpoints like https://www.wikidata.org/w/api.php for the Action API and https://www.wikidata.org/w/rest.php for REST operations, ensuring compatibility with a wide range of client libraries and tools.[30]
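A brief sketch of both access paths follows; the Action API call (wbsearchentities) is standard, while the Wikibase REST API route shown here, including its version segment, is an assumption based on current documentation and may need adjusting.

    # Sketch: resolve a label to a Q-ID with the Action API, then fetch the item
    # through the Wikibase REST API.
    import requests

    search = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbsearchentities", "search": "Douglas Adams",
                "language": "en", "format": "json"},
        timeout=30,
    ).json()
    qid = search["search"][0]["id"]   # e.g. "Q42"

    # REST route assumed from the documented /w/rest.php/wikibase/... pattern.
    item = requests.get(
        f"https://www.wikidata.org/w/rest.php/wikibase/v1/entities/items/{qid}",
        headers={"User-Agent": "wikidata-article-example/0.1 (illustrative)"},
        timeout=30,
    ).json()
    print(qid, item.get("labels", {}).get("en"))   # English label from the REST response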
To illustrate basic querying, SPARQL SELECT patterns form the foundation of WDQS interactions. A simple example retrieves all humans born in the 20th century:
    SELECT ?human ?humanLabel ?birthDate WHERE {
      ?human wdt:P31 wd:Q5 .          # instance of human (Q5)
      ?human wdt:P569 ?birthDate .    # date of birth
      FILTER(YEAR(?birthDate) >= 1900 && YEAR(?birthDate) < 2000)
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    LIMIT 10

This pattern binds variables to subjects, predicates, and objects while applying filters for precision, drawing on Wikidata's statement structure where properties link items to values.[69] More advanced queries can federate with external endpoints using the SERVICE keyword, expanding results beyond Wikidata's core dataset.[6]
The query services incorporate safeguards to maintain performance and reliability. WDQS enforces timeouts, typically set to 60 seconds for public queries, to prevent resource exhaustion from computationally intensive operations, alongside result limits such as a maximum of 10,000 rows per response to balance load.[70] Ongoing improvements include query optimization techniques, like index utilization in the underlying Blazegraph engine, and integration with user-friendly interfaces such as the Wikidata Query Service's built-in editor, which offers syntax highlighting, prefix autocompletion, and visualization of results as tables or graphs.[6] These enhancements, combined with tools like Query Helper for visual query building, lower the barrier for non-experts while supporting advanced federated explorations.[71]