Conversation analysis
Conversation analysis (CA) is an interdisciplinary method originating in sociology and linguistics that systematically examines the structure and organization of naturally occurring talk-in-interaction, revealing how participants collaboratively accomplish social actions such as questioning, agreeing, or repairing misunderstandings.[1][2] Developed in the late 1960s at the University of California, Irvine, by Harvey Sacks in collaboration with Emanuel Schegloff and Gail Jefferson, CA emphasizes the empirical analysis of audio- or video-recorded interactions rather than theoretical speculation or experimental data.[1][2] At its core, CA identifies key organizational principles of interaction, including turn-taking, where speakers alternate with minimal gaps or overlaps to maintain orderly exchanges; adjacency pairs (e.g., question-answer or greeting-response), which form the building blocks of sequences; and repair mechanisms, through which participants correct troubles in speaking, hearing, or understanding.[3][1] These principles are analyzed using the Jefferson transcription system, a detailed notation that captures not only words but also prosody, pauses, overlaps, and non-verbal elements to preserve the richness of interactional details.[4] Seminal work, such as Sacks, Schegloff, and Jefferson's 1974 paper on turn-taking, demonstrates how a simple "systematics" governs speaker transitions across diverse contexts, from casual chats to institutional settings. 
CA's methods rely on inductive, data-driven analysis: researchers collect corpora of naturally occurring interactions, produce verbatim transcripts, and identify recurrent patterns through repeated close inspection, validating findings internally within the data rather than through external variables.[1][2] Rooted in ethnomethodology (the study of how people produce social order in everyday life), CA treats talk as a methodical, accountable practice that participants themselves orient to.[1] This approach has expanded beyond spoken conversation to include multimodal interactions involving gestures, gaze, and embodiment, particularly in digital and institutional environments.[3]
Applications of CA span fields like medicine, education, law, and politics, informing how communication shapes outcomes: for instance, in primary care consultations where question design influences patient responses, or in classrooms where teacher-student turns affect learning dynamics.[1] Influential scholars such as John Heritage, Douglas Maynard, and Tanya Stivers have advanced CA by applying it to institutional talk, demonstrating its utility in improving interactional practices and revealing power asymmetries in settings like doctor-patient encounters.[1] Today, CA remains a cornerstone of language and social interaction studies, with ongoing developments in computational tools for transcription and analysis.[3]
Overview
Definition and Core Concepts
Conversation analysis (CA) is an empirical approach to the study of the sequential organization of naturally occurring talk-in-interaction, examining how participants collaboratively produce and interpret social actions through language.[5] Rooted in ethnomethodology, CA investigates the methods by which ordinary members of society make sense of and account for their everyday conduct in interaction.[6] At its core, CA treats talk as a structured and accountable social activity, where every utterance is designed to perform specific actions and is held accountable to the expectations of co-participants.[7] Utterances exhibit indexicality: their sense and import are inherently tied to the immediate context of the ongoing interaction, requiring participants to draw on shared knowledge and prior turns for interpretation.[5] Participants routinely orient to normative expectations in interaction, displaying adherence or deviation through their responses, which ensures the orderly progression of talk and reveals underlying social rules.[5]
CA rests on several basic assumptions about interaction. Participants publicly display their mutual understanding through their conduct, particularly in how they respond to prior actions, allowing analysts to infer comprehension from observable behaviors rather than internal states.[7] Analysis prioritizes how social actions (such as informing, requesting, or assessing) are accomplished and recognized via the sequential placement and design of utterances, emphasizing the collaborative and emergent nature of meaning-making.[8]
Simple examples illustrate these principles in everyday conversation. A greeting like "Hello" functions as the first part of an adjacency pair, expecting a reciprocal greeting such as "Hello" to confirm mutual recognition and open the interaction without further elaboration.[9] Likewise, a question such as "What time is it?" prompts a response like "It's three o'clock," demonstrating the recipient's orientation to the normative expectation of providing an answer, thereby achieving sequential relevance and shared understanding.[5]
Importance and Applications
Conversation analysis (CA) is pivotal in demonstrating how social order emerges through the collaborative and sequential organization of talk-in-interaction, illustrating that everyday interactions are not random but governed by shared practices that participants orient to in real time.[1] By examining the fine details of how turns are allocated, sequences are built, and actions are accomplished, CA reveals the methodical ways in which individuals co-construct meaning, accountability, and social reality without relying on external rules or structures.[10] This approach underscores the interactional achievement of social norms, showing that phenomena like agreement, disagreement, or repair are handled through observable conversational devices rather than presupposed intentions.[11] CA challenges foundational assumptions in linguistics by rejecting decontextualized analyses of language, instead emphasizing that context is dynamically produced and renewed within the interaction itself. Traditional views often treat utterances as isolated or governed by abstract grammars, but CA demonstrates that meaning arises from how speakers respond to prior turns, thereby integrating prosody, timing, and sequential positioning as integral to linguistic practice.[12] This shift highlights talk as a primary site for social action, influencing fields beyond linguistics to include sociology, psychology, and anthropology in studying human conduct.[13] In institutional settings, CA offers critical insights into how talk enacts organizational goals, such as in courts where it analyzes questioning sequences to uncover biases in witness interviews; in medicine, where it examines doctor-patient consultations to enhance shared decision-making; and in education, where it reveals how teacher-student interactions shape learning opportunities.[14] These applications extend to improving communication practices, enabling professionals to refine protocols for more effective and equitable 
exchanges.[15] For instance, in patient education, CA identifies how formulations and confirmations facilitate comprehension, informing targeted interventions to reduce misunderstandings.[16]
Emerging applications address digital communication, where CA adapts to video calls and social media by exploring multimodal adaptations like delayed turn-taking or emoji use in sequential organization.[17] In AI development since 2020, CA researchers have worked with engineers to design conversational agents that mimic natural repair and sequence practices, as seen in the training of models such as Dora to handle interruptions and alignment in a more human-like way. Studies from 2024 and 2025 have explored using large language models (LLMs) to automate aspects of CA, such as identifying interactional patterns in large datasets, and have applied CA to assess how LLMs simulate human-like talk-in-interaction.[18][19] CA also informs therapist training by dissecting session sequences to teach reformulation techniques that foster client progress, and it guides call center operators in handling complaint sequences to de-escalate and resolve issues efficiently.[20]
Historical Development
Origins in Ethnomethodology
Ethnomethodology, founded by sociologist Harold Garfinkel, emerged in the 1960s as a paradigm shift in sociology, emphasizing the study of everyday reasoning and the methods individuals use to produce and account for social actions in mundane settings. Garfinkel's seminal work, Studies in Ethnomethodology (1967), critiqued the structural functionalism of Talcott Parsons by rejecting top-down theoretical impositions, instead advocating for an "emic" analysis of how people reflexively make sense of their social world through indexical expressions and the documentary method of interpretation. This approach highlighted accountability as a fundamental feature of social interaction, where actions are oriented to and made intelligible within ongoing contexts, drawing heavily from phenomenological influences such as Alfred Schütz's emphasis on the lived experience of intersubjectivity.[21]
Conversation analysis (CA) developed as a specialized branch of ethnomethodology in the late 1960s and 1970s, narrowing the focus from broad mundane activities to the sequential organization of talk-in-interaction as the primordial site of social order.[22] Unlike ethnomethodology's wider ethnographic inquiries into practices like jury deliberations or medical consultations, CA prioritized naturally occurring audio recordings of conversations to reveal the rule-governed, accountable methods participants employ in real time.[7] This emergence was also shaped by symbolic interactionism, particularly Erving Goffman's conception of the interaction order, which underscored how micro-level encounters constitute the fabric of social structure.[21]
Key developments in the 1960s and 1970s included Harvey Sacks's early empirical work at the University of California, Los Angeles, analyzing telephone calls from a suicide prevention center to uncover patterns in membership categorization and sequential implicativeness.[22] By the early 1970s, initial studies on telephone conversation openings and closings demonstrated how interactants collaboratively manage transitions without explicit rules, laying the groundwork for CA's methodological rigor. Sacks, Emanuel Schegloff, and Gail Jefferson collaborated on these foundational efforts, culminating in the 1974 publication of their turn-taking model.[23]
Key Figures and Milestones
Harvey Sacks is regarded as the founder of conversation analysis, having laid its theoretical groundwork through lectures delivered from 1964 to 1972 at the University of California, Los Angeles, and later at Irvine.[24] In these lectures, Sacks conceptualized adjacency pairs as fundamental building blocks of interaction, such as greetings and responses or invitations and acceptances/declinations, emphasizing how ordinary talk is systematically organized.[25] His approach prioritized empirical analysis of naturally occurring conversations, drawing briefly on ethnomethodological roots to examine everyday social actions. Emanuel Schegloff, Sacks's longtime collaborator, advanced CA through his work on interactional structures, most notably co-authoring the 1974 paper "A Simplest Systematics for the Organization of Turn-Taking for Conversation," which proposed a rule-based model for how participants minimize gaps and overlaps in talk. Schegloff further shaped the field with his contributions to repair mechanisms, detailed in the 1977 publication "The Preference for Self-Correction in the Organization of Repair in Conversation," co-authored with Gail Jefferson and Sacks, which identifies sequences for addressing communicative troubles like mishearings or errors. Schegloff passed away in 2024.[26] Gail Jefferson, a pioneering student of Sacks, developed the Jefferson Transcription System, a detailed notation for capturing prosodic, temporal, and non-verbal features of talk, enabling precise sequential analysis. Her system, refined over decades and formalized in a 2004 glossary, remains the standard for CA transcription, supporting the field's emphasis on fine-grained data examination. Jefferson passed away in 2008. Key milestones include the 1974 turn-taking paper, which established CA's methodological rigor and attracted interdisciplinary attention. 
Sacks's Lectures on Conversation, edited by Jefferson and published posthumously in 1992 across two volumes, disseminated his unpublished teachings and solidified foundational principles.[25] The 1977 repair paper similarly marked a breakthrough, highlighting CA's focus on interactional accountability.
Post-2000 developments expanded CA to multimodality, integrating analyses of gesture, gaze, and embodiment, as exemplified in studies from the 2010s that examined how bodily conduct coordinates with talk. Scholars also adapted CA to digital contexts, addressing challenges like asynchronous messaging and emoji use in online interactions to explore evolving interactional norms.[17]
Methodological Foundations
Data Collection and Analysis
Conversation analysis (CA) relies exclusively on naturally occurring interactions as its primary data source, captured through audio or video recordings to preserve the authentic sequential organization of talk-in-interaction. This methodological commitment avoids elicited or experimental data, such as role-plays or interviews, which might impose artificial constraints on participants' natural conduct. Researchers collect recordings from diverse everyday and institutional settings, ensuring that the data reflect how participants themselves orient to and accomplish social actions without external prompting. For instance, video recordings are preferred when possible to capture not only verbal elements but also nonverbal behaviors like gaze and gesture, which are integral to interactional organization.[27][10] The analysis process in CA involves an iterative, close examination of both original recordings and detailed transcripts, emphasizing sequential implicativeness—how each turn shapes the relevance and trajectory of subsequent actions—and participant orientation, whereby findings are grounded in evidence of how interactants demonstrably respond to one another's conduct. Analysts begin with unmotivated looking, an inductive approach that involves open-ended scrutiny of the data without preconceived hypotheses to identify recurring patterns of interaction. This is followed by assembling collections of similar cases, where instances of a particular practice (e.g., turn-taking transitions) are gathered and compared to discern underlying rules or mechanisms. To refine these observations, deviant case analysis is employed, systematically investigating exceptions or variations that challenge initial patterns, thereby strengthening the robustness of the identified structures by revealing contextual contingencies. Transcripts for this process are typically prepared using the Jeffersonian system to capture prosodic and temporal details. 
Throughout, analysis prioritizes the endogenous methods participants use to organize their interactions, often validated through data sessions where peers review and debate interpretations.[27][10]
Ethical considerations are paramount in CA due to the intimate nature of recorded interactions, with researchers obligated to secure informed consent while minimizing intrusion to maintain naturalism. Anonymity is rigorously protected in publications by altering identifiers, voices, and visual details in transcripts and excerpts, ensuring participants cannot be recognized. Special care is taken with sensitive institutional data, such as medical consultations, where power imbalances may complicate consent; here, protocols often include post-recording debriefing and secure data storage to mitigate risks of harm or coercion. Curating and sharing datasets, increasingly encouraged for replicability, must balance open access with privacy through trusted research environments that restrict sensitive materials. These practices align with broader principles of doing no harm while advancing understanding of interactional practices.[28][10]
Transcription Systems
The Jeffersonian transcription system, developed by Gail Jefferson in the 1960s as a foundational tool for conversation analysis, employs a set of symbols to capture prosodic, paralinguistic, and interactional features of spoken interaction beyond orthographic representation.[29] This system emerged from early collaborative work with Harvey Sacks and Emanuel Schegloff, enabling researchers to document the precise timing, intonation, and overlap in talk that underpin social actions.[30] Jefferson formalized many conventions in her 2004 glossary, which remains the standard reference, emphasizing notations that render audible details visible on the page for analytic scrutiny.[31]
The primary purpose of Jeffersonian transcription is to reveal the accountable details of talk-in-interaction (such as pauses, pitch shifts, and simultaneous speech) that audio recordings alone cannot fully convey, allowing analysts to examine how participants orient to these elements in real time.[4] By prioritizing the interactional relevance of delivery features over phonetic precision, the system supports investigations into sequence organization and turn-taking without imposing external linguistic categories.[30] For instance, it highlights how subtle prosodic cues contribute to action formation, making transcripts a central artifact in conversation analytic methodology.[13]
Key symbols in the Jeffersonian system address temporal, prosodic, and vocal aspects of talk. The following table summarizes core notations with descriptions and examples, drawn from Jefferson's conventions:
| Symbol | Description | Example |
|---|---|---|
| [ ] | Overlapping talk; square brackets aligned across lines mark onset and offset of simultaneous speech. | A: [hello] B: [hi there] |
| = | Latching; no discernible gap between utterances, often within or across speakers. | A: okay.= B: =yeah. |
| (.) | Micropause; brief silence, approximately 0.2 seconds or less. | (.) hmmm |
| (0.5) | Timed pause; silence measured in tenths of seconds. | (0.5) well, |
| ↑ ↓ | Pitch movement; arrows indicate marked rise or fall in intonation. | that's ↑great↓ |
| CAPS | Increased volume or emphasis. | NO WAY |
| underline | Stress or emphasis on a sound or word. | <u>real</u>ly |
| °° | Quiet or decreased volume. | °sorry° |
| > < | Speeded-up talk. | >like this< |
| < > | Slowed-down talk. | <oh:::kay> |
| : | Prolongation; extended sound, with length indicated by colons. | y:eah |
| .hh hh (h) | Inbreath (.hh) or outbreath (hh), with the number of h's indicating duration; (h) inside a word marks plosive breathiness, as in laughter. | .hh he(h)llo |
| . , ? | Falling (.), continuing (,), or rising (?) intonation. | yes. wait, really? |
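As a rough illustration of how the notation in the table can be read mechanically, the following Python sketch picks a few of these symbols out of a single transcript line. The function name `describe_line`, the regular expressions, and the `Speaker: talk` line layout are assumptions made for this example; they are not part of Jefferson's system or of any established CA tool, and the sketch ignores most of the notation (pitch arrows, volume, tempo, and so on).

```python
import re

# Hypothetical patterns covering only a handful of the symbols in the table.
SPEAKER_LABEL = re.compile(r"^(\w+):\s+(.*)$")     # e.g. "A: okay.="
TIMED_PAUSE = re.compile(r"\((\d+(?:\.\d+)?)\)")   # e.g. (0.5)
MICROPAUSE = re.compile(r"\(\.\)")                 # (.)
OVERLAP = re.compile(r"\[([^\]]*)\]")              # [overlapping talk]
PROLONGATION = re.compile(r":+")                   # y:eah, oh:::kay


def describe_line(line: str) -> dict:
    """Summarize a few Jefferson-style features found in one transcript line."""
    match = SPEAKER_LABEL.match(line)
    # A line without a "Speaker: " prefix is treated as unlabelled talk.
    speaker, talk = (match.group(1), match.group(2)) if match else ("", line)
    talk = talk.strip()
    stretches = PROLONGATION.findall(talk)
    return {
        "speaker": speaker,
        "timed_pauses": [float(s) for s in TIMED_PAUSE.findall(talk)],
        "micropauses": len(MICROPAUSE.findall(talk)),
        "overlaps": OVERLAP.findall(talk),
        "latched_start": talk.startswith("="),
        "latched_end": talk.endswith("="),
        # Length of the longest run of colons, as a crude proxy for stretch.
        "longest_prolongation": max((len(s) for s in stretches), default=0),
    }


if __name__ == "__main__":
    print(describe_line("A: (0.5) well, y::eah (.) [hi there]="))
```

A counter of this kind can help when assembling a collection of candidate cases, but it does not replace listening to the recording: the symbols only index features that the analyst has already heard and judged relevant.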
Core Structures of Interaction
Turn-Taking Mechanisms
Turn-taking mechanisms form a core aspect of conversation analysis, describing the orderly allocation of speaking opportunities among participants to minimize gaps and overlaps while enabling collaborative interaction. The seminal model, developed by Harvey Sacks, Emanuel A. Schegloff, and Gail Jefferson, posits that turn-taking is locally managed, recipient-designed, and governed by a simple yet robust set of rules that operate at specific points in talk. This system treats conversation as a speech-exchange system where participants collaboratively construct and transition turns, revealing the interactional competence inherent in everyday talk.
At the heart of the model are turn-constructional units (TCUs), the basic building blocks of turns, which include complete syntactic, prosodic, or pragmatic units such as declaratives, questions, or exclamations that signal their own possible completion. The end of a TCU constitutes a transition-relevance place (TRP), a projected boundary where speaker change becomes relevant and turns can be allocated without disrupting the ongoing unit. Projectability is key: participants anticipate TRPs through syntactic structure, intonation, and pragmatics, allowing preemptive actions like self-selection just before completion. For example, in a greeting sequence, a TCU like "Hello, how are you?" reaches its TRP at the falling intonation, inviting an immediate response.
Turn allocation at TRPs follows a hierarchical set of three rules, applied sequentially to determine the next speaker:
- If the current speaker selects a next speaker—through gaze, address terms, or questions—that recipient is obliged to take the turn.
- If no selection occurs, any participant may self-select, with the first to begin speaking (often via a sharp onset) securing the turn.
- If neither selection nor self-selection happens, the current speaker may extend the turn with another TCU.
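The ordering of these three rules can be sketched as a small Python function. This is a deliberately simplified illustration rather than the model itself: the name `next_speaker`, its parameters, and the idea of representing self-selection as an ordered list of would-be speakers are all assumptions made for the example, and the third rule is reduced to "the current speaker keeps the floor" even though continuing is optional in the description above.

```python
from typing import Optional, Sequence


def next_speaker(
    current: str,
    selected: Optional[str],
    self_selectors: Sequence[str],
) -> str:
    """Apply the three allocation rules, in order, at one transition-relevance place.

    current        -- the participant whose turn is reaching possible completion
    selected       -- the participant the current speaker selected, if any
    self_selectors -- participants who start talking, earliest starter first
    (All three parameters are hypothetical inputs for this sketch.)
    """
    # Rule 1: a next speaker selected by the current speaker takes the turn.
    if selected is not None and selected != current:
        return selected
    # Rule 2: otherwise the first participant to self-select gets the turn.
    for candidate in self_selectors:
        if candidate != current:
            return candidate
    # Rule 3: otherwise the current speaker may go on with another unit.
    return current


if __name__ == "__main__":
    print(next_speaker("A", "B", ["C"]))        # rule 1 applies
    print(next_speaker("A", None, ["C", "B"]))  # rule 2 applies
    print(next_speaker("A", None, []))          # rule 3 applies
```

Writing the rules as an ordered chain of checks makes the hierarchy explicit: self-selection is only considered when no one has been selected, and continuation only when no one self-selects.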