
Coreference

Coreference is a linguistic phenomenon in which two or more expressions within a text or discourse, such as noun phrases, pronouns, or proper names, refer to the same real-world entity, person, object, or event, thereby establishing referential identity across mentions. This relation is essential for discourse cohesion, allowing speakers or writers to avoid repetition by reusing references to previously introduced entities, and it operates on a continuum from full identity to partial overlap, influenced by context and pragmatics. In linguistic theory, coreference encompasses various subtypes, including anaphora, where a subsequent expression (the anaphor) links back to an earlier antecedent, and cataphora, a forward-pointing reference resolved by later context. Additional forms involve discourse deixis, which references prior segments of the text, and predicative relations, where expressions attribute properties to the same entity. Theoretical models, such as those drawing on mental space theory, explain shifts in coreference through cognitive operations like specification (adding details), refocusing (changing perspective), and neutralization (reducing specificity), highlighting its dynamic and context-dependent nature. Within natural language processing (NLP), coreference resolution denotes the computational task of automatically detecting and clustering these coreferential mentions to enhance text understanding. Originating in the 1970s with early heuristic systems, the field advanced through annotated corpora from initiatives like the Message Understanding Conferences (MUC) and Automatic Content Extraction (ACE) in the 1990s and 2000s, enabling supervised approaches such as mention-pair models and ranking algorithms. Key challenges persist, including ambiguity in non-referential phrases, handling singletons (entities with only one mention, comprising 60-86% of cases), and domain-specific issues such as temporal variation in specialized texts. Applications span information extraction, machine translation, summarization, and clinical natural language processing, where accurate resolution improves system performance despite ongoing needs for robust evaluation metrics like B³ for assessing clustering quality.

Fundamentals

Definition

Coreference is a linguistic relation in which two or more noun phrases (NPs) or other expressions within a discourse refer to the same real-world entity, abstract concept, or event. This enables efficient communication by linking expressions that share referential identity, such as a proper name and a subsequent pronoun. For instance, in the discourse "John entered the room. He sat down," the pronoun "He" corefers with the proper name "John," both denoting the same individual. Central to coreference are referring expressions, which include proper names (e.g., "John"), definite NPs (e.g., "the man"), and pronouns (e.g., "he"), all of which point to entities in the discourse model. These expressions are distinguished by their roles: an antecedent is the initial expression that introduces or establishes the entity, while subsequent coreferents refer back to it. Key terminology includes "mention," which denotes any surface-level expression (typically an NP) that refers to an entity; "entity," the underlying object, person, or concept being referenced; and "coreference chain," a sequence of two or more mentions linked as referring to the identical entity. Coreference plays a vital role in maintaining discourse cohesion, allowing speakers and writers to avoid repetition by substituting full NPs with shorter forms like pronouns, thereby facilitating smoother text flow and reader comprehension. This mechanism supports the construction of a shared mental model of the discourse, where entities persist across sentences. Coreference encompasses subtypes such as anaphora (backward reference) and cataphora (forward reference), though these are explored in greater detail elsewhere.
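The terminology above can be made concrete with a minimal data-structure sketch (the names `Mention` and `chain` are illustrative, not from any particular library): a mention is a text span, and a coreference chain is the set of mentions that share one entity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    start: int   # token index where the span starts (inclusive)
    end: int     # token index where the span ends (exclusive)
    text: str

tokens = ["John", "entered", "the", "room", ".", "He", "sat", "down", "."]

# One coreference chain: both mentions refer to the same entity (John).
chain = [Mention(0, 1, "John"), Mention(5, 6, "He")]

# The antecedent is the earliest mention of the entity in document order.
antecedent = min(chain, key=lambda m: m.start)
assert antecedent.text == "John"
```

Representing chains as sets of spans, rather than pairwise links, mirrors how annotated corpora typically store coreference.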

Historical Development

In the early 20th century, structuralism, pioneered by Ferdinand de Saussure, shifted focus to synchronic analysis of language as a system of signs, laying foundational ideas on how signifiers relate to signified entities and influencing later explorations of referential relations in discourse. The mid-20th century marked a pivotal advancement through generative grammar, as Noam Chomsky integrated coreference into syntactic theory during the 1950s and 1960s. In works like Syntactic Structures (1957) and Aspects of the Theory of Syntax (1965), Chomsky linked coreference to deep structure and transformational operations, introducing referential indices to track co-referential elements across sentences and distinguishing them from bound variables. This approach treated coreference as a derivational phenomenon, assuming syntactic identity often aligned with semantic equivalence, though challenges from quantifiers soon highlighted limitations. From the 1970s to the 1980s, attention turned to semantics and pragmatics, with scholars like Edward L. Keenan examining definite descriptions and their role in establishing coreference. Keenan's 1971 work examined two kinds of presupposition in natural language, including those carried by definite descriptions regarding existence and uniqueness. This period solidified coreference's place in semantic theory, as seen in Peter Sells' 1985 Lectures on Contemporary Syntactic Theories, which synthesized accounts of anaphora across frameworks like government-binding theory and generalized phrase structure grammar. By the 1990s, coreference transitioned into computational linguistics with the rise of natural language processing (NLP), shifting from theoretical description to applied systems for resolving references in text. Early supervised methods, such as those using decision-tree classifiers trained on annotated corpora, emerged around 1995, enabling automated clustering of mentions and marking a key evolution toward practical discourse understanding.

Types and Examples

Anaphoric Coreference

Anaphoric coreference, commonly referred to as anaphora, occurs when a linguistic expression, such as a pronoun or definite noun phrase, follows an antecedent in the discourse and derives its interpretation from that prior element. In this backward-referring relation, the anaphor points back to the antecedent to establish identity of reference, enabling efficient communication by avoiding repetition of full descriptions. For instance, in the sentence "The dog chased the cat. It was fast," the pronoun "it" serves as the anaphor referring to "the dog" as its antecedent, though ambiguity arises because "it" could plausibly refer to "the cat" in context, highlighting the need for pragmatic resolution to disambiguate coreferent from non-coreferent interpretations. Linguistic constraints on anaphora are primarily governed by binding theory, a framework in generative syntax that regulates the structural relations between antecedents and anaphors. Principle A of binding theory stipulates that anaphors, such as reflexives (e.g., "himself") or reciprocals (e.g., "each other"), must be bound by a c-commanding antecedent within a local domain, typically the same clause, ensuring the antecedent structurally dominates the anaphor. For example, in "John saw himself," "himself" is bound by "John" because "John" c-commands it from a higher structural position, whereas "*Himself saw John" violates Principle A due to the lack of such binding. These principles rule out illicit coreference, such as cases where a reflexive would have to be bound by a non-c-commanding element, distinguishing grammatical anaphora from ungrammatical attempts at coreference. In discourse, anaphora plays a crucial role in maintaining cohesion by linking sentences and reducing redundancy, allowing speakers to track entities across a text without restating them explicitly. This mechanism fosters textual unity, as seen in extended narratives where repeated pronouns or definite descriptions signal continuity of reference, enhancing readability and flow.
Anaphora thus contributes to the overall coherence of discourse by presupposing shared knowledge of antecedents, which listeners or readers recover to interpret subsequent expressions. Anaphora varies between surface and deep forms, distinguished by whether the anaphoric process relies on syntactic structure or deeper semantic interpretation. Surface anaphora, such as simple pronominal reference, requires a linguistically overt antecedent and is controlled by syntactic rules, whereas deep anaphora involves pragmatic or interpretive reconstruction without strict syntactic parallelism. A classic test case in this debate is verb phrase (VP) ellipsis, as in "John likes apples, and Mary does too," where the elided VP under "does" is interpreted as identical to "likes apples" through semantic recovery rather than surface form. This distinction underscores how anaphora can operate at different levels of linguistic processing to achieve coreference. In contrast to cataphora, which points forward to a forthcoming antecedent, anaphora is inherently backward-directed in its textual progression.
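The c-command relation at the heart of Principle A can be sketched computationally. The following toy code (hypothetical helpers, not a real parser) represents a parse tree as nested tuples and uses a common simplification: A c-commands B if neither dominates the other and B falls under A's immediate parent.

```python
def paths(tree, target, path=()):
    """Yield the path (tuple of child indices) to each node equal to target."""
    if tree == target:
        yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, target, path + (i,))

def c_commands(tree, a, b):
    """Toy c-command check: neither node dominates the other, and b is
    dominated by a's immediate parent (a simplification of the usual
    'first branching node' formulation)."""
    pa = next(paths(tree, a))
    pb = next(paths(tree, b))
    # Dominance in either direction blocks c-command.
    if pa == pb[:len(pa)] or pb == pa[:len(pb)]:
        return False
    return pb[:len(pa) - 1] == pa[:-1]

# "John saw himself": the subject NP c-commands the object reflexive,
# so Principle A is satisfied; the reverse does not hold.
tree = ("S", ("NP", "John"), ("VP", ("V", "saw"), ("NP", "himself")))
assert c_commands(tree, ("NP", "John"), ("NP", "himself"))
assert not c_commands(tree, ("NP", "himself"), ("NP", "John"))
```

The asymmetry of the two assertions is exactly what licenses "John saw himself" while ruling out "*Himself saw John."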

Cataphoric Coreference

Cataphoric coreference, or cataphora, occurs when a linguistic expression, such as a pronoun, precedes and refers forward to a subsequent antecedent that provides its full interpretation. This contrasts with the more prevalent anaphoric coreference, where reference points backward to a prior expression. Cataphora is less common than anaphora due to the cognitive processing demands it imposes on language users, as resolving the reference requires holding the initial expression in working memory until the antecedent appears. Syntactically, it is constrained and typically occurs in specific structural contexts, such as preposed subordinate clauses or lists, where the forward reference can be anticipated within a bounded domain. For instance, in relative clauses introduced early, cataphora facilitates integration without violating locality principles. A classic example is the sentence "When he arrived, John was tired," where the pronoun "he" cataphorically refers to the later noun "John." In more complex sentences, such as those involving embedded clauses, cataphora can lead to ambiguous readings if multiple potential antecedents follow, as in "If she wins the award, Mary will celebrate with her team," where "she" anticipates "Mary" but could momentarily suggest another entity. In discourse, cataphora serves anticipatory functions by building suspense or structuring information flow, particularly in stylistic writing like fiction or technical descriptions, where it signals upcoming details to engage the reader. For example, novelists use it to delay character introductions, heightening narrative tension. Identifying cataphora presents challenges due to its higher processing cost relative to anaphora, as the lack of an immediate antecedent increases the risk of misresolution during comprehension. Corpus analyses of literary texts, such as those in the Anaphoric Treebank, reveal that cataphoric instances are rarer and often context-dependent, complicating annotation and analysis in extended narratives like novels.

Theoretical Aspects

Relation to Bound Variables

In formal semantics, coreference refers to the relation in which two expressions denote the same individual in the domain of discourse, establishing referential identity independent of syntactic structure. In contrast, bound variables involve scope-dependent assignments where a pronoun or anaphor is interpreted as a variable governed by a quantifier or lambda operator, as seen in frameworks like Montague grammar and predicate logic. This distinction is crucial because bound variables do not imply coreference; instead, they allow the pronoun's interpretation to covary with the quantifier's scope without referring to a fixed individual. A key example illustrates this difference: in the sentence "John loves his wife," the pronoun "his" is coreferential with "John," denoting the same specific individual. However, in "Every man loves his wife," "his" is not coreferential with "every man" but functions as a bound variable, interpreted such that each man in the domain loves his own wife, with the pronoun's reference varying across instances. This bound reading arises from the quantifier's scope, avoiding a unified referential link. This theoretical framework originates in Montague grammar, developed in the 1970s, which integrates syntax with model-theoretic semantics to handle quantification and pronouns. Montague's approach treats pronouns systematically as bound variables when under quantifier scope, using translations that embed them within lambda abstractions or predicate-logic operators, without invoking coreference. For instance, the logical form of "Every man loves his wife" can be represented as:

\forall x \, [\text{man}(x) \to \exists y \, [\text{wife}(y,x) \land \text{love}(x,y)]]

Here, the pronoun "his" corresponds to the bound variable x, scoped by the universal quantifier, demonstrating how binding operates without referential identity to a single individual. This representation highlights the scope dependency, where the existential quantifier for the wife is subordinated to the universal quantifier over men.
The implications of this separation are significant for semantic analysis, as conflating coreference with binding can lead to errors in resolving ambiguities or quantifier interactions. By distinguishing referential links from operator scope, this framework ensures compositional meanings that accurately capture variable interpretations in quantified contexts, influencing broader formal semantic theories.
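The two readings can be written out side by side in the same predicate notation; here j is a constant denoting John (an assumption introduced for the comparison, not part of the original text):

```latex
% Coreferential reading: "his" picks out the same fixed individual j as "John".
\text{John loves his wife:} \quad \exists y\,[\text{wife}(y, j) \land \text{love}(j, y)]

% Bound reading: "his" is the variable x bound by the universal quantifier.
\text{Every man loves his wife:} \quad \forall x\,[\text{man}(x) \to \exists y\,[\text{wife}(y, x) \land \text{love}(x, y)]]
```

The contrast is visible in the pronoun's translation: a constant in the coreferential case, a bound occurrence of the quantified variable in the other.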

Coreference in Formal Semantics

In formal semantics, coreference is modeled through dynamic frameworks that represent how utterances update a shared discourse context, tracking entities and their relations across sentences. Discourse Representation Theory (DRT), introduced by Hans Kamp in 1981, provides a foundational approach by constructing Discourse Representation Structures (DRSs) to handle anaphoric relations. In DRT, coreferring expressions link to the same discourse referent, a variable representing an entity in the mental model of the discourse. This allows for systematic resolution of pronouns and definite descriptions by embedding conditions that equate or subordinate referents. DRT employs a box notation to visually represent these structures, where referents are listed above a horizontal line and conditions below it. For instance, the discourse "John entered the room. He sat down" begins with a DRS for the first sentence introducing referent x with the conditions John(x) \land entered(x, room):

\begin{array}{c} x \\ \hline John(x) \land entered(x, room) \end{array}

The second sentence updates this DRS by adding referent y with conditions he(y) and y = x, along with sat(y), yielding:

\begin{array}{c} x, y \\ \hline John(x) \land entered(x, room) \land he(y) \land y = x \land sat(y) \end{array}

The equation y = x captures coreference, ensuring the pronoun "he" refers back to John. A related framework, File Change Semantics, developed by Irene Heim in 1982, treats the discourse context as a "file" of indexed entries for entities, where coreferents update the same file card rather than introducing new ones. In this system, anaphors succeed only if their descriptive content matches an existing entry, promoting incremental context updates. Advanced mechanisms in these theories address unresolved coreference via accommodation, as elaborated in Heim's 1983 work on presupposition projection.
Accommodation permits the addition of presupposed referents to the context when they are not explicitly introduced, enabling felicitous coreference in underspecified discourses; for example, hearers can accommodate the referent of "His sister is waiting" even without prior mention of a sister. This process ensures semantic coherence by globally or locally adjusting the file to satisfy referential presuppositions. Cross-linguistically, formal semantic models like DRT adapt to variations in pronominal systems, such as in null-subject languages like Italian, where pro-drop allows omitted subjects to corefer via implicit arguments without overt pronouns. In such languages, null subjects preferentially resume topic-continuous antecedents, modeled by restricting referent resolution in the DRS to high-salience positions, unlike in non-pro-drop languages, which require explicit pronouns for the same links. This variation highlights how morphological options influence coreference resolution within unified dynamic frameworks.
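The incremental DRS updates described above can be sketched in a few lines of code. This is an illustrative toy (the `DRS` class is an assumption for exposition, not a DRT implementation): a DRS holds referents and conditions, and each sentence extends both, with the pronoun resolved by an equality condition.

```python
class DRS:
    """Toy Discourse Representation Structure: referents + conditions."""
    def __init__(self):
        self.referents = []
        self.conditions = []

    def add_referent(self, name):
        self.referents.append(name)
        return name

    def add_condition(self, cond):
        self.conditions.append(cond)

drs = DRS()

# "John entered the room": introduce referent x with its conditions.
x = drs.add_referent("x")
drs.add_condition("John(x)")
drs.add_condition("entered(x, room)")

# "He sat down": introduce y, then resolve the pronoun by equating y with x.
y = drs.add_referent("y")
drs.add_condition("he(y)")
drs.add_condition("y = x")   # the coreference link
drs.add_condition("sat(y)")

assert drs.referents == ["x", "y"]
assert "y = x" in drs.conditions
```

The equality condition is the computational analogue of drawing both mentions' arrows to one discourse referent in the box notation.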

Computational Approaches

Coreference Resolution Task

Coreference resolution is a fundamental task in natural language processing (NLP) that involves identifying all mentions (linguistic expressions such as noun phrases) in a document that refer to the same real-world entity and partitioning them into coreference chains. For example, in the sentence "John entered the room. He sat down," the chain would group "John" and "He" as referring to the same individual. The goal is to produce an output that accurately clusters these mentions while handling variations in form, such as pronouns, definite descriptions, or proper names. The task typically breaks into subtasks, beginning with mention detection, which identifies candidate spans in the text that could serve as referring expressions. This step is a crucial prerequisite, since coreference linking operates on detected mentions, and errors here propagate to subsequent clustering. Following detection, the core linking subtask groups compatible mentions into equivalence classes, effectively resolving which referents are identical without linking to external knowledge bases in the standard setup. Key datasets have standardized evaluation and training for the task. OntoNotes, introduced in the mid-2000s, provides a large-scale, multilingual corpus with coreference annotations integrated alongside other layers like syntactic parses and word senses, covering genres such as news and conversational text. The CoNLL-2011 shared task established benchmarks using OntoNotes data, focusing on unrestricted coreference in English, and its 2012 successor extended coverage to Arabic and Chinese, promoting consistent annotation schemes that include both nominal and pronominal mentions. These resources shifted the field toward end-to-end systems that jointly handle mention detection and resolution on diverse, real-world texts. Evaluation metrics assess the quality of predicted coreference chains against gold-standard annotations, emphasizing precision and recall in linking mentions.
The Message Understanding Conference (MUC) metric, introduced in 1995, measures link-based performance by counting the links that must be added or removed to align predicted and gold coreference sets, rewarding systems that avoid spurious merges while penalizing missed connections. B-Cubed, proposed in 1998, is a mention-centric approach that computes precision and recall for each individual mention based on the proportion of correctly clustered co-referents in its entity, then averages across all mentions for an overall F1 score. The Constrained Entity-Alignment F-measure (CEAF), from 2005, aligns predicted and gold chains via bipartite matching to optimize entity overlap, providing a more robust measure that accounts for both mention boundaries and partition structure. Shared tasks like CoNLL-2011 often report the average of these three F1 scores as a composite metric to balance their perspectives. Challenges in the task setup arise from inherent linguistic ambiguities, particularly in long-distance coreference where mentions are separated by intervening text, making contextual dependencies harder to capture. Nested entities, such as a noun phrase embedded within a larger one (e.g., "the man's mother," where the inner NP "the man" and the full NP denote different entities), complicate mention detection and partitioning by introducing overlapping spans that defy simple linear clustering. These issues underscore the need for models robust to structural complexity under annotation guidelines like those of OntoNotes.
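The B-Cubed computation is simple enough to sketch directly. The following is a simplified illustration (it assumes gold and predicted clusterings cover the same mention set, which real scorers do not require): for each mention, precision is the fraction of its predicted cluster that is genuinely co-referent with it, and recall is the analogous fraction of its gold cluster.

```python
def b_cubed(gold_clusters, pred_clusters):
    """B³ precision, recall, F1 over two clusterings of the same mentions."""
    gold_of = {m: frozenset(c) for c in gold_clusters for m in c}
    pred_of = {m: frozenset(c) for c in pred_clusters for m in c}
    mentions = list(gold_of)
    # Per-mention scores, averaged over all mentions.
    precision = sum(len(gold_of[m] & pred_of[m]) / len(pred_of[m])
                    for m in mentions) / len(mentions)
    recall = sum(len(gold_of[m] & pred_of[m]) / len(gold_of[m])
                 for m in mentions) / len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = [{"John", "he"}, {"Mary"}]
pred = [{"John", "he", "Mary"}]   # spurious merge of "Mary" into the chain
p, r, f = b_cubed(gold, pred)
# Recall stays perfect (every gold co-referent is recovered), while the
# spurious merge lowers every mention's precision.
```

On this example recall is 1.0 and precision drops to 5/9, showing how B³ penalizes over-merging per mention rather than per link.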

Algorithms and Models

Early methods for coreference resolution relied on rule-based systems that leveraged syntactic parsing to identify antecedents, particularly for pronouns. A seminal example is Hobbs' algorithm, introduced in 1978, which operates on surface parse trees by traversing the syntactic structure from the pronoun upward and then downward to find potential antecedents based on recency and grammatical constraints. This approach, developed in the 1970s and refined through the 1990s, achieved notable success in pronoun resolution without requiring deep semantic analysis, serving as a baseline for decades due to its simplicity and efficiency. In the 2000s, statistical approaches shifted the paradigm toward data-driven models, using classifiers to predict coreference links between noun phrases. A foundational work is Soon et al. (2001), who employed decision-tree classifiers trained on annotated corpora like MUC-7, incorporating features such as agreement in number and gender, proper-name match, and distance metrics; this system achieved approximately 60.4% F1 on the MUC-7 benchmark. These methods improved over rule-based systems by learning from examples, often outperforming them on diverse texts, though they were limited by pairwise decisions that required post-processing to form full chains. The advent of neural networks in the late 2010s introduced end-to-end models that jointly predict mentions and coreference clusters without a separate mention-detection pipeline. Lee et al. (2017) proposed the first such system, using a bidirectional LSTM encoder to score candidate antecedent spans for each mention, trained to maximize the marginal likelihood of gold clusters; on the OntoNotes benchmark, it attained 67.2% average F1, surpassing prior statistical methods by eliminating reliance on hand-crafted features or parsers. Building on this, transformer-based models from 2018 onward integrated pretrained architectures for contextual embeddings, enabling more nuanced antecedent prediction. For instance, Joshi et al. (2019) fine-tuned BERT with a coreference head atop its representations, yielding 76.9% F1 on OntoNotes and highlighting pretrained encoders' ability to capture long-range dependencies. Advanced techniques have further refined these neural foundations, incorporating graph-based partitioning to cluster mentions into chains and specialized embeddings for better span handling. Denis and Baldridge (2010) introduced graph partitioning for end-to-end clustering, where mentions form vertices and edges encode compatibility scores, allowing global inference of clusters in a single step. Integration of contextual embeddings like ELMo (in extensions of Lee et al., 2018, reaching 70.4% F1 on OntoNotes) and SpanBERT (Joshi et al., 2020, achieving 79.6% F1) has enhanced representation of variable-length spans, with SpanBERT's pretraining on masked spans proving particularly effective for coreference by emphasizing contiguous text units. Performance trends reflect this evolution, with systems advancing from around 60% F1 in the 2000s on benchmarks like MUC to over 80% in the 2020s on OntoNotes, driven by neural architectures and pretrained models that better capture context; recent efficient DeBERTa-based pipelines (Xia et al., 2024) report around 83.6% F1, with ongoing advances incorporating large language models as of 2025.
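To make the rule-based starting point concrete, here is a toy recency-plus-agreement baseline (illustrative only, and far simpler than Hobbs' tree-walking algorithm, which operates on parse trees): each pronoun is linked to the most recent preceding mention with compatible gender and number features.

```python
# Assumed toy lexicon of pronoun features (gender, number); None = unmarked.
PRONOUN_FEATURES = {
    "he": ("masc", "sg"), "she": ("fem", "sg"),
    "it": ("neut", "sg"), "they": (None, "pl"),
}

def resolve(mentions):
    """mentions: list of (text, gender, number) in document order.
    Returns {pronoun_index: antecedent_index} for each resolvable pronoun."""
    links = {}
    for i, (text, _gender, _number) in enumerate(mentions):
        feats = PRONOUN_FEATURES.get(text.lower())
        if feats is None:
            continue                       # not a pronoun, nothing to resolve
        p_gender, p_number = feats
        for j in range(i - 1, -1, -1):     # walk backward: most recent first
            _, a_gender, a_number = mentions[j]
            if p_gender in (None, a_gender) and p_number == a_number:
                links[i] = j
                break
    return links

mentions = [("John", "masc", "sg"), ("the room", "neut", "sg"), ("He", "masc", "sg")]
assert resolve(mentions) == {2: 0}   # "He" -> "John", skipping "the room"
```

Even this crude heuristic resolves the running example correctly; the statistical and neural models above can be seen as learned, soft versions of the same search over candidate antecedents.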

Applications and Challenges

Practical Applications

Coreference resolution plays a pivotal role in information extraction by linking multiple mentions of the same entity, thereby enhancing the accuracy of entity recognition and relation extraction across diverse texts. In search engines, it improves query understanding by resolving ambiguous references, such as pronouns in user inputs, allowing systems to better interpret intent and retrieve relevant results. In machine translation, coreference resolution supports coherence by correctly linking pronouns and entities across languages, particularly in neural MT systems where differences in gender marking or sentence structure can lead to errors. By identifying and preserving coreferential relationships in source texts, it helps systems maintain entity consistency in translated outputs, such as resolving "it" to the appropriate antecedent when translating from English into gendered languages like German or French. This integration has been shown to boost translation quality in document-level models. Dialogue systems benefit from coreference resolution through improved tracking of referents in multi-turn conversations, enabling chatbots to respond contextually to earlier mentions. Coreference helps resolve ellipses and anaphora in spoken dialogues, such as linking "that one" to a previously mentioned item, which enhances natural interaction and reduces misunderstanding in voice assistants. Coreference chains support text summarization by consolidating scattered mentions of entities into unified representations, allowing for more concise and coherent abstracts while preserving key identities. This is particularly useful in abstractive summarization, where unresolved references could lead to fragmented or repetitive outputs. In domain-specific applications, coreference resolution aids legal document analysis by linking entities and events in contracts, facilitating information extraction and compliance checks through datasets like LegalCore.
Similarly, in biomedical text mining, it connects mentions of proteins, genes, and diseases across scientific texts, improving knowledge extraction from literature such as the abstracts in the GENIA corpus.

Open Challenges

Despite significant progress in coreference resolution, theoretical gaps persist, particularly in handling implicit coreference that requires inferring connections from world knowledge beyond explicit textual cues. These implicit relationships challenge formal models, as they demand integration of external knowledge, which current semantic frameworks struggle to incorporate systematically. Similarly, multimodal coreference introduces unresolved issues in aligning coreferential elements across text and images, where fine-grained cross-modal associations and inherent ambiguities in visual descriptions hinder accurate resolution. For instance, resolving a pronoun in text to an object depicted in an accompanying image often fails due to limited annotated multimodal datasets and the complexity of encoding inter-modal dependencies. Computationally, coreference resolution faces challenges from imbalanced training data, leading to lower performance on less common mention types or reference patterns. Long-distance references pose another limitation, as models degrade in accuracy when antecedents and anaphors span large textual distances, complicating modeling over extended contexts. In low-resource languages, performance drops further owing to insufficient labeled data and linguistic disparities that generic models cannot adapt to effectively. Cross-lingual coreference faces substantial hurdles from variations in linguistic structures, notably in pro-drop languages like Chinese or Italian, where omitted subjects necessitate implicit inference not captured by annotation schemes designed for non-pro-drop languages. This is exacerbated by the scarcity of diverse, multilingual datasets that adequately represent such patterns, limiting the development of robust approaches. Ethical concerns arise from biases in coreference models, which can amplify societal stereotypes, particularly gender bias, where training data imbalances lead to preferential linking of occupational roles to male entities.
For example, models may erroneously corefer "doctor" with "he" more often than "she," perpetuating gender disparities in downstream applications like text summarization. Looking ahead, integrating large language models (LLMs) such as the GPT series offers promising directions for zero-shot coreference resolution, enabling inference without task-specific fine-tuning, though post-2020 advances reveal persistent issues like hallucination and suboptimal handling of complex chains. These developments underscore the need for hybrid approaches that combine LLM capabilities with specialized modules to address lingering gaps in robustness and inclusivity. Historical undercoverage of cataphora, where references precede antecedents, exemplifies a broader theoretical oversight in coreference research.

References

  1. [1]
    [PDF] Coreference: Theory, Annotation, Resolution and Evaluation - UB
    Coreference relations, as commonly defined, occur between linguistic expressions that refer to the same person, object or event.
  2. [2]
    [PDF] Coreference Resolution
    Coreference, anaphors, cataphors. • Coreference is when two mentions refer to the same entity in the world. • The relation of anaphora is when a term. (anaphor) ...
  3. [3]
    Coreference resolution: A review of general methodologies and ...
    Coreference resolution is the task of determining linguistic expressions that refer to the same real-world entity in natural language.
  4. [4]
    [PDF] Coreference Resolution and Entity Linking - Stanford University
    We introduce the four types of referring expressions (definite and indefinite NPs, pronouns, and names), describe how these are used to evoke and access ...
  5. [5]
    [PDF] A Machine Learning Approach to Coreference Resolution of Noun ...
    Coreference resolution is the process of determining whether two expressions in nat- ural language refer to the same entity in the world. It is an important ...
  6. [6]
    Coreference and Lexical Repetition: Mechanisms of Discourse ...
    Two linguistic expressions are said to be coreferential if they refer to the same semantic entity; the first expression (the antecedent) introduces the entity ...
  7. [7]
    [PDF] DISCOURSE REFERENTS - ACL Anthology
    Discourse referents are established when an indefinite noun phrase justifies a later pronoun or definite noun phrase reference. This is a linguistic problem.
  8. [8]
    [PDF] An outline of the history of linguistics - CSULB
    Saussure championed the idea that language is a system of arbitrary signs, and his conceptualisation of the sign (see Figure 1.1, p.6) has been highly ...Missing: anaphora coreference
  9. [9]
    [PDF] Lecture 7. Anaphora and its History. Reference, Coreference, and ...
    Apr 4, 2012 · Some of the strongest linguistic arguments for limiting sentence-grammar to dealing with bound variable anaphora came from Tanya Reinhart (1983a ...
  10. [10]
    [PDF] Lecture 11: Introduction to Pragmatics. Formal Semantics and ...
    Apr 28, 2011 · 3.1 Presuppositions of definite descriptions. ... presupposition (Keenan 1971, Levinson 1983), but arguments against considering it a ...
  11. [11]
    Supervised Noun Phrase Coreference Research: The First Fifteen ...
    The research focus of computational coreference resolution has exhibited a shift from heuristic approaches to machine learning approaches in the past decade ...
  12. [12]
    Anaphora - Stanford Encyclopedia of Philosophy
    Feb 24, 2004 · Anaphora is sometimes characterized as the phenomenon whereby the interpretation of an occurrence of one expression depends on the interpretation of an ...Unproblematic Anaphora · Recent Theories of... · Anaphora in Sign Language
  13. [13]
    [PDF] Lecture 10. Introduction to Issues in Anaphora
    Apr 21, 2011 · Binding theory is the branch of linguistic theory that explains the behavior of sentence- internal anaphora, which is labelled 'bound anaphora' ...
  14. [14]
    [PDF] A Note on the Binding Theory - Scholars at Harvard
    Chomsky assumes that PRO is a pronominal anaphor because it is on a par with both pronouns and anaphors. According to. (la) and (lb), if PRO has a governing ...
  15. [15]
    [PDF] What is the Right Binding Theory? - University of Delaware
    The “traditional” Binding Theory of Chomsky (1981): • Anaphors (reflexives and reciprocals) need a local c-commanding antecedent. • Pronouns may not have a ...
  16. [16]
    Handouts on Brown & Yule, Halliday & Hasan - MIT Media Lab
    Anaphora is a type of co-reference. COHESION is the internal continuity or network of points of continuity within a text. As Halliday & Hasan say, " The ...Missing: coreference | Show results with:coreference
  17. [17]
    [PDF] Cohesion - conceptualizations and systemic features of English and ...
    Within endophoric reference, Halliday & Hasan (1976) differentiate further between anaphoric reference (to preceding text) and cataphoric reference (to ...Missing: coreference | Show results with:coreference
  18. [18]
    [PDF] Deep and Surface Anaphora
    Summary of Arguments In this article we investigate this difference between syntactically and pragmatically controlled anaphora, and show that anaphoric ...
  19. [19]
    [PDF] Yet another look at deep and surface anaphora
    Merchant 2001; (41)-(42) represent P-stranding languages (as seen in the (b) controls), while (43)-(45) illustrate non-P-stranding languages.
  20. [20]
    [PDF] Sameness, Ellipsis and Anaphora - UC Berkeley Linguistics
    4 In this example VPE (VP ellipsis) is acceptable, in addition to the other VP anaphora forms. VPE is not acceptable for example (1), presumably because of ...<|control11|><|separator|>
  21. [21]
    [PDF] Deep and surface properties of the Dutch dat doen -anaphor - CRISSP
    Jan 23, 2023 · the DDA) has properties of both a deep and a surface anaphor, according to the distinction made by Hankamer and Sag (1976).
  22. [22]
    What is a Cataphora | Glossary of Linguistic Terms - SIL Global
    Definition: Cataphora is the coreference of one expression with another expression which follows it. The following expression provides the information ...
  23. [23]
    On the Motivations and Pragmatic Functions of Cataphora in Natural ...
    May 27, 2023 · 1. Introduction. Cataphora, also termed “backward anaphora”, is “the process or result of a linguistic unit referring forward to another unit” ...
  24. [24]
    (PDF) Processing Differences for Anaphoric and Cataphoric Pronouns
    Aug 6, 2025 · The results showed that anaphoric pronouns were resolved more rapidly than cataphoric pronouns when a co-referent interpretation was possible, ...
  25. [25]
    When is cataphoric reference recognised? - ScienceDirect.com
    When a pronoun appears in a preposed subordinate clause (as in, Before she began to sing, Susan stood up), incremental interpretation is suspended.
  26. [26]
    Structural constraints on cataphora - Language Log
    Jan 6, 2011 · If he refers to another, then (d), of course, is a fine sentence. Linguists can supply the requisite labels of cataphora, exophora, etc., at ...
  27. [27]
    Definition and Examples of Cataphora in English Grammar
    Jun 19, 2019 · In English grammar, cataphora is the use of a pronoun or other linguistic unit to refer ahead to another word in a sentence (i.e., the referent).
  28. [28]
    (PDF) The Influence of Morphological Information on Cataphoric ...
    Oct 9, 2025 · potential coreference relations between the cataphoric pronoun and a following noun phrase, such as NP2 assignment, are computed simultaneously ...
  29. [29]
    [PDF] On the Motivations and Pragmatic Functions of Cataphora in Natural ...
    May 27, 2023 · Abstract. This paper examines the motivations and pragmatic functions of cataphora in natural conversations. It is found ...
  30. [30]
    [PDF] Cataphora, backgrounding and accessibility in discourse
    Cataphora is when pronouns (he, she, it, they) precede their referents, and this paper examines discourse factors and accessibility related to it.
  31. [31]
    [PDF] The Value of an Annotated Corpus in The Investigation of Anaphoric ...
    This thesis investigates English personal pronoun reference in particular focusing on cataphora (backwards anaphora), using the Anaphoric Treebank (AT), which ...
  32. [32]
    [PDF] The Innateness of Binding and Coreference
    The intuition behind (20) is that if the structure could allow bound variable anaphora, coreference is preferred only if it is motivated; in other words, only if ...
  33. [33]
    Montague Semantics - Stanford Encyclopedia of Philosophy
    Nov 7, 2011 · Montague semantics is a theory of natural language semantics and of its relation with syntax. It was originally developed by the logician Richard Montague.
  34. [34]
    [PDF] The Proper Treatment of Quantification in Ordinary English
    Also, by S1 and S2, every man ∈ P_T; and hence, by S14, every man loves a woman such that she loves him ∈ P_t. ... those arguments are understood to be the ...
  35. [35]
    Discourse Representation Theory
    May 22, 2007 · 1. Introduction. This article concerns Discourse Representation Theory narrowly defined as work in the tradition descending from Kamp (1981a).
  36. [36]
  37. [37]
    On Coreference Resolution Performance Metrics - ACL Anthology
    Xiaoqiang Luo. 2005. On Coreference Resolution Performance Metrics. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods ...
  38. [38]
    OntoNotes: The 90% Solution - ACL Anthology
    Eduard Hovy, Mitchell Marcus, Martha Palmer, Lance Ramshaw, and Ralph Weischedel. 2006. OntoNotes: The 90% Solution. In Proceedings of the Human Language ...
  39. [39]
    [PDF] A Joint Framework for Coreference Resolution and Mention Head ...
    Abstract. In coreference resolution, a fair amount of research treats mention detection as a pre-processed step and focuses on developing ...
  40. [40]
    CoNLL-2011 Shared Task: Modeling Unrestricted Coreference in ...
    2011. CoNLL-2011 Shared Task: Modeling Unrestricted Coreference in OntoNotes. In Proceedings of the Fifteenth Conference on Computational Natural Language ...
  41. [41]
    A Model-Theoretic Coreference Scoring Scheme - ACL Anthology
    Marc Vilain, John Burger, John Aberdeen, Dennis Connolly, and Lynette Hirschman. 1995. A Model-Theoretic Coreference Scoring Scheme. In Sixth Message ...
  42. [42]
    [PDF] Review of coreference resolution in English and Persian - arXiv
    Coreference resolution (CR) is a critical task in NLP, defined as "the problem of identifying all noun phrases or in-text mentions that refer to the same real- ...
  43. [43]
    A Machine Learning Approach to Coreference Resolution of Noun ...
    Wee Meng Soon, Hwee Tou Ng, and Daniel Chung Yong Lim. 2001. A Machine Learning Approach to Coreference Resolution of Noun Phrases. Computational Linguistics, ...
  44. [44]
    End-to-end Neural Coreference Resolution - ACL Anthology
    We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser.
  45. [45]
    BERT for Coreference Resolution: Baselines and Analysis
    Abstract. We apply BERT to coreference resolution, achieving a new state of the art on the GAP (+11.5 F1) and OntoNotes (+3.9 F1) benchmarks.
  46. [46]
  47. [47]
    Coreference resolution - NLP-progress
    Coreference resolution is the task of clustering mentions in text that refer to the same underlying real world entities.
  48. [48]
    [PDF] State-of-the-art NLP Approaches to Coreference Resolution
    Aug 2, 2009 · Applications and future directions. We present an overview of NLP applications which have been shown to profit from coreference information, ...
  49. [49]
    Evaluating and Improving the Coreference Capabilities of Machine ...
    We develop an evaluation methodology that derives coreference clusters from MT output and evaluates them without requiring annotations in the target language.
  50. [50]
    CREAD: Combined Resolution of Ellipses and Anaphora in Dialogues
    In this work, we propose a novel joint learning framework of modeling coreference resolution and query rewriting for complex, multi-turn dialogue understanding.
  51. [51]
    Online Coreference Resolution for Dialogue Processing: Improving ...
    This paper suggests a direction of coreference resolution for online decoding on actively generated input such as dialogue.
  52. [52]
    [PDF] Using Coreference Chains for Text Summarization - ACL Anthology
    Coreference resolution is carried out by attempting to merge each newly added instance with instances already present in the discourse model. The basic ...
  53. [53]
    A Dataset for Event Coreference Resolution in Legal Documents
    In this paper, we present the first dataset for the legal domain, LegalCore, which has been annotated with comprehensive event and event coreference ...
  54. [54]
    Coreference Resolution for the Biomedical Domain: A Survey
    Abstract. Issues with coreference resolution are one of the most frequently mentioned challenges for information extraction from the biomedical literature.
  55. [55]
    [PDF] A brief survey on recent advances in coreference resolution
    The task of resolving repeated objects in natural languages is known as coreference resolution, and it is an important part of modern natural language ...
  56. [56]
    [PDF] Using an Implicit Method for Coreference Resolution and Ellipsis ...
    From a technical standpoint, it is difficult mainly because it requires natural language understanding, which is a challenging task because it requires world ...
  57. [57]
    Semi-supervised multimodal coreference resolution in image ...
    This poses significant challenges due to fine-grained image-text alignment, inherent ambiguity present in narrative language, and unavailability of large ...
  58. [58]
    A brief survey on recent advances in coreference resolution
    Predicting coreference connections and identifying mentions/triggers are the major challenges in coreference resolution, because these implicit relationships ...
  59. [59]
    Coreference Resolution Based on High-Dimensional Multi-Scale ...
    This paper first compares the impact of commonly used methods to improve the global information collection ability of the model on the BERT encoding ...
  60. [60]
    (PDF) Coreference Resolution: Toward End-to-End and Cross ...
    Jan 28, 2020 · ... challenges of less-resourced languages. Finally, we discussed the main challenges and open issues faced by coreference resolution systems.
  61. [61]
  62. [62]
    Toward Gender-Inclusive Coreference Resolution: An Analysis of ...
    Nov 3, 2021 · Gender bias in NLP has been considered more broadly than just in coreference resolution, including, for instance, natural language inference ...
  63. [63]
    Gender Bias in Coreference Resolution: Evaluation and Debiasing ...
    Apr 18, 2018 · We introduce a new benchmark, WinoBias, for coreference resolution focused on gender bias. Our corpus contains Winograd-schema style sentences.
  64. [64]
    [PDF] Cataphora detection and resolution: Advancements and Challenges ...
    Cataphora is when a pronoun or noun phrase points forward to a yet-to-be-mentioned entity, the reverse of anaphora.