LangChain
LangChain is an open-source framework designed to simplify the development of applications powered by large language models (LLMs), enabling the creation of AI agents that integrate LLMs with external data sources, tools, and APIs.[1] It provides modular components, pre-built architectures, and over 1,000 integrations with models, databases, and services to streamline the full lifecycle of LLM applications, from prototyping to production deployment.[2] Available in both Python and JavaScript versions, LangChain emphasizes flexibility, avoiding vendor lock-in while supporting reliable agent engineering.[3]

LangChain originated as a side project started in October 2022 by Harrison Chase, a former machine learning engineer at Robust Intelligence, and quickly gained traction following the launch of ChatGPT in November 2022.[4] The project was incorporated in January 2023, with Chase as CEO and Ankush Gola as co-founder, and has since evolved into a comprehensive platform used by companies such as Rakuten, Cisco, and Moody's for production AI systems.[4] Key milestones include a $10 million seed round in April 2023 led by Benchmark, a $25 million Series A in February 2024 led by Sequoia Capital at a post-money valuation of $200 million, and a $125 million Series B in October 2025 at a post-money valuation of $1.25 billion.[4][5] The framework's GitHub repository has amassed over 118,000 stars as of November 2025, reflecting its active community and widespread adoption.[6]

At its core, LangChain offers abstractions such as chains, agents, and retrieval-augmented generation (RAG) to handle complex workflows, including chatbots, document question-answering, and data extraction.[1] It integrates with related tools such as LangGraph for building stateful, multi-actor applications and LangSmith for observability, debugging, and monitoring of LLM traces.[1] Released under the MIT license, LangChain remains freely available, while the company provides enterprise features through its platform.[7]
Overview
Definition and Purpose
LangChain is a modular open-source framework implemented in Python and JavaScript, designed for orchestrating large language models (LLMs) within applications. It enables developers to compose and integrate key components—such as prompts, models, memory, and output parsers—into cohesive systems that leverage LLMs for tasks like reasoning and context-aware processing.[8][9][3] The framework's core purpose is to streamline the development of LLM-powered applications, from rapid prototyping to scalable production deployment, by abstracting the intricacies of multi-step LLM interactions, external data integration, and state management. This abstraction reduces boilerplate code and promotes reusability, allowing developers to focus on application logic rather than low-level API handling.[8][3]

Emerging in 2022 amid the growing prominence of LLMs like GPT-3, LangChain addresses early adoption challenges, particularly the absence of standardized approaches for building reliable multi-step workflows that chain LLM calls effectively.[5][4][10] In a typical high-level workflow, user inputs are formatted into prompts that invoke LLMs, with subsequent output parsing to refine results and enable further processing.[8]
Key Principles
LangChain's foundational design emphasizes modularity, enabling developers to construct applications by composing interchangeable components such as prompts, models, and retrievers in a flexible manner. This approach allows for seamless mixing and matching of elements without vendor lock-in, promoting reusability across different LLM ecosystems.[11] Extensibility is another core principle, rooted in LangChain's open-source architecture, which facilitates custom integrations and extensions by providing high-level abstractions for key elements like prompts, models, and retrievers. These abstractions accommodate a wide range of LLMs from various providers, empowering developers to tailor solutions to specific needs without starting from scratch.[3]

To address the inherent non-determinism of LLMs, LangChain incorporates observability as a key focus, offering built-in support for tracing and evaluation mechanisms that allow debugging and monitoring of complex chains. This feature, integrated early through tools like LangSmith, enables detailed inspection of application behavior, iteration on performance, and identification of issues in production environments.[12] Interoperability is prioritized through standardized interfaces for tools, data sources, and external components, ensuring compatibility across diverse systems. In 2025, these principles evolved to better support multi-framework ecosystems, enhancing integration with emerging agentic and graph-based workflows.[13] For instance, retrieval-augmented generation (RAG) exemplifies modularity by combining retrieval modules with LLM chains to ground responses in external data.[1]
History
Origins and Founding
LangChain was founded by Harrison Chase, a graduate of Harvard University with degrees in statistics and computer science, who had previously worked as a machine learning engineer at Kensho Technologies, a fintech startup focused on AI applications in finance.[14][15][16] In 2022, while employed at Robust Intelligence, Chase initiated LangChain as a personal side project to tackle the emerging challenges in building applications with large language models (LLMs).[4][17] The framework originated from Chase's recognition of the limitations in early LLM experimentation, where developers relied on custom, one-off scripts to handle prompt formatting, model invocations, and output parsing, lacking standardized tools for more complex, production-ready integrations.[18]

The project's timing aligned closely with the explosive rise of generative AI, particularly following the public launch of ChatGPT by OpenAI on November 30, 2022, which sparked widespread interest in LLM-powered applications but highlighted the need for reusable patterns to streamline development.[5][19] LangChain addressed this by providing modular abstractions for chaining LLM calls with external data sources, memory management, and agentic behaviors, enabling developers to compose sophisticated workflows without reinventing core functionalities.[3] This focus on practicality emerged from Chase's hands-on experience in ML engineering, where he observed the ad-hoc nature of LLM interactions hindering scalable app building amid the post-ChatGPT hype.[20]

Chase open-sourced LangChain on GitHub in October 2022, weeks before ChatGPT's launch, releasing it as a lightweight Python library centered on prompt templating and basic LLM orchestration.[5][4] The repository quickly gained traction, amassing over 5,000 stars by February 2023 and tripling to 18,000 by April, reflecting the community's enthusiasm for a tool that simplified LLM app prototyping during the early generative AI boom.[4] At its inception, LangChain operated without a formal company structure, relying on community contributions and Chase's solo efforts to iterate on core utilities for LLM chaining and integration.[5] This grassroots approach laid the groundwork for its evolution into a structured organization in subsequent years.
Development Milestones
LangChain Inc. was formally incorporated on January 31, 2023,[21] with Harrison Chase as CEO and Ankush Gola as co-founder, to support the ongoing development of the open-source project, enabling the founders to transition to full-time work on the framework. This move was followed shortly by a $10 million seed funding round in April 2023, led by Benchmark, which provided the resources to expand the team and accelerate feature development.[22] In February 2024, the company raised a $25 million Series A round led by Sequoia Capital, achieving a post-money valuation of $200 million. In October 2025, LangChain secured a $125 million Series B round also led by Sequoia Capital, reaching a $1.25 billion valuation and unicorn status.[23]

In July 2023, LangChain released LangSmith, a platform designed to address key challenges in building LLM applications by offering tools for debugging, testing, evaluating, and monitoring chains and agents.[24] This launch filled a critical observability gap in the ecosystem, allowing developers to trace LLM interactions and iterate more effectively on production-grade systems. The following year, in January 2024, the team introduced LangGraph as an extension to the core framework, providing a library for constructing stateful, graph-based workflows that support cyclical processes and multi-actor agent interactions.

LangChain achieved significant version stability with the release of v1.0 in October 2025, marking a milestone for both the core library and LangGraph. The LangChain v1.0 update introduced a standardized core agent loop integrated with middleware for enhanced extensibility, while LangGraph v1.0 added support for stateful persistence, improving reliability for long-running applications.[25] These releases emphasized production readiness, with a focus on interoperability across models, tools, and deployment environments to facilitate scalable agentic AI systems. By 2025, the project had grown substantially in the developer community, surpassing 100,000 GitHub stars and contributing to emerging standards in multi-agent orchestration through open-source collaborations.[4]
Core Components
Chains and Prompts
In LangChain, the foundational mechanism for orchestrating interactions with language models and other components is the LangChain Expression Language (LCEL), a declarative syntax for composing chains using the Runnable interface. LCEL enables modular, composable workflows by sequencing elements such as prompts, model invocations, and output parsers with the pipe operator (|), e.g., chain = prompt | model | parser. This approach supports linear execution for tasks like text generation or data extraction, promoting reusability and reducing custom glue code for integrations. Legacy chain classes, such as LLMChain and SequentialChain, have been deprecated since version 0.1.17 and should be replaced with LCEL for new development.[26][27]
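A minimal sketch of such an LCEL chain, assuming the langchain-core and langchain-openai packages are installed and an OpenAI API key is configured (the model name and prompt are illustrative):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Prompt, model, and parser composed left to right with the pipe operator.
prompt = ChatPromptTemplate.from_template("Translate the following text to French: {text}")
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # illustrative model name
parser = StrOutputParser()

chain = prompt | model | parser

# The composed chain is itself a Runnable and executes end to end.
print(chain.invoke({"text": "The weather is nice today."}))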
LCEL chains emphasize deterministic execution, where components are invoked sequentially, with outputs from one feeding into the next, facilitating multi-stage reasoning or transformations. All LCEL components implement the Runnable interface, allowing standardized invocation, streaming, batching, and async support.[28]
Prompts serve as the input structuring layer within chains, using the PromptTemplate class to define reusable string formats that incorporate dynamic placeholders, such as {user_input} or {context}, via Python's f-string syntax or partial formatting methods. This templating ensures consistent guidance for the LLM, improving response reliability across invocations; for example, a prompt might specify "Summarize the following text in three bullet points: {text}" to direct concise output. PromptTemplate objects are directly integrable into LCEL chains, where they are populated with runtime variables before model execution.[29][30]
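A short sketch of prompt templating, including the partial formatting mentioned above (the template wording is illustrative):

from langchain_core.prompts import PromptTemplate

# Reusable template with dynamic placeholders filled at runtime.
template = PromptTemplate.from_template(
    "Summarize the following text in three bullet points for a {audience} audience:\n{text}"
)

# Partial formatting pre-fills some variables, leaving the rest for invocation time.
technical_template = template.partial(audience="technical")

print(technical_template.format(text="LangChain composes prompts, models, and parsers into chains."))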
Output parsers handle the post-processing of LLM responses to enforce structure, transforming unstructured text into parseable formats like dictionaries or objects. The PydanticOutputParser, for instance, leverages Pydantic models to define and validate JSON schemas, raising errors on invalid outputs (which retry-style wrapper parsers can then correct), making it particularly useful for extracting fields like "summary" or "key_points" from free-form generations. In an LCEL chain, the parser follows the model step, ensuring downstream components receive clean, typed data.[31][32]
A representative workflow is the construction of a summarization chain using LCEL, where a PromptTemplate instructs the LLM to condense input text—e.g., "Write a concise summary of the following: {input_text}"—followed by model invocation and parsing via PydanticOutputParser to yield a structured JSON object with attributes like "main_idea" and "supporting_points". This can be assembled as chain = prompt | llm | parser for single-document handling or extended with document loaders and splitters for multi-document scenarios, using a "stuff" approach that aggregates content into a single prompt for efficient processing of moderate-length texts. Such LCEL chains exemplify linear LLM interactions, scalable to extensions like agents for non-linear decision-making.[26]
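A sketch of such a summarization chain, assuming an OpenAI chat model; the Pydantic schema and prompt wording follow the example above but are otherwise illustrative:

from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# Target schema for the structured summary.
class Summary(BaseModel):
    main_idea: str = Field(description="One-sentence main idea of the text")
    supporting_points: list[str] = Field(description="Key supporting points")

parser = PydanticOutputParser(pydantic_object=Summary)

prompt = PromptTemplate(
    template="Write a concise summary of the following:\n{input_text}\n{format_instructions}",
    input_variables=["input_text"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # illustrative model name

chain = prompt | llm | parser
result = chain.invoke({"input_text": "LangChain is an open-source framework for building LLM applications ..."})
print(result.main_idea, result.supporting_points)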
Models and Outputs
LangChain provides abstractions for integrating with large language models (LLMs) through standardized interfaces, enabling seamless interaction with various providers. The ChatOpenAI class, for instance, facilitates access to OpenAI's chat models such as GPT-3.5-turbo or GPT-4, handling API calls via the OpenAI API key and supporting parameters like temperature, max tokens for output length limits, and frequency penalties to control generation.[33] Similarly, the HuggingFaceHub integration allows invocation of open-source models hosted on Hugging Face, requiring an API token and supporting tasks like text generation and summarization through the Inference API.[34] These abstractions implement the Runnable interface, which manages underlying API requests, error handling, and response formatting, while also enabling streaming outputs for real-time applications by yielding tokens incrementally during generation.[35]
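A brief sketch of invoking and streaming a chat model through this abstraction (parameter values and the model name are illustrative):

from langchain_openai import ChatOpenAI

# Chat model abstraction; parameters control generation behavior.
llm = ChatOpenAI(
    model="gpt-4o-mini",   # illustrative model name
    temperature=0.2,       # lower values yield more deterministic output
    max_tokens=256,        # cap on output length
)

# Standard single invocation returns a message object.
reply = llm.invoke("Explain retrieval-augmented generation in one sentence.")
print(reply.content)

# Streaming yields chunks incrementally as they are generated.
for chunk in llm.stream("List three uses of vector databases."):
    print(chunk.content, end="", flush=True)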
Beyond chat models, LangChain supports non-generative models for tasks like embeddings, which convert text into vector representations for similarity searches. The HuggingFaceEmbeddings class leverages sentence-transformer models, such as the default "sentence-transformers/all-mpnet-base-v2," to produce dense embeddings locally or via the Hugging Face hub, integrating with libraries like transformers for efficient computation without external API dependencies.[36] This abstraction handles batch embedding of multiple texts and normalizes vectors as needed, making it suitable for downstream applications like retrieval-augmented generation (RAG) pipelines where embeddings facilitate document ranking.
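A sketch of local embedding with this class, assuming the langchain-huggingface and sentence-transformers packages are installed (no external API key required):

from langchain_huggingface import HuggingFaceEmbeddings

# Uses the default sentence-transformers model unless another is specified.
embedder = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

# Batch-embed documents and embed a query for similarity comparison.
doc_vectors = embedder.embed_documents(["LangChain overview", "Vector stores index embeddings"])
query_vector = embedder.embed_query("How are embeddings indexed?")
print(len(doc_vectors), len(query_vector))  # number of documents, embedding dimension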
Output handling in LangChain transforms raw, unstructured LLM responses into usable formats through dedicated parsers, mitigating issues like inconsistent or erroneous outputs. Classes such as JsonOutputParser and PydanticOutputParser extract structured data, converting free-form text into JSON objects or validated Pydantic models, respectively, by defining schemas that guide parsing.[32] For reliability, the RetryOutputParser wraps other parsers and employs an LLM to iteratively fix parsing failures, such as malformed JSON due to hallucinations, by prompting the model to correct errors up to a configurable retry limit.[37] This approach ensures robust post-processing, with parsers implementing the Runnable interface for chaining with models.
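A rough sketch of wrapping a parser with RetryOutputParser; the exact import path and keyword arguments (such as max_retries) may vary between LangChain versions:

from langchain.output_parsers import RetryOutputParser
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel

class Answer(BaseModel):
    summary: str
    key_points: list[str]

base_parser = PydanticOutputParser(pydantic_object=Answer)
retry_parser = RetryOutputParser.from_llm(
    parser=base_parser,
    llm=ChatOpenAI(model="gpt-4o-mini", temperature=0),  # illustrative model name
    max_retries=2,  # assumed retry budget
)

prompt_value = PromptTemplate.from_template("Summarize: {text}").format_prompt(
    text="LangChain structures LLM outputs ..."
)

# A malformed completion (truncated JSON) is passed back to the LLM for repair.
malformed = '{"summary": "LangChain structures outputs"'
print(retry_parser.parse_with_prompt(malformed, prompt_value))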
To optimize for production environments, LangChain's model invocations support efficient patterns like batch processing and asynchronous execution. The batch method processes multiple inputs in parallel using thread pools, reducing latency for high-volume tasks, while abatch and ainvoke enable concurrent async operations via asyncio, ideal for scalable deployments handling numerous requests.[35] These features, combined with token management and streaming, allow developers to build responsive applications without custom orchestration code.
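A sketch of batched and asynchronous invocation on an LCEL chain (the model name and inputs are illustrative):

import asyncio
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

chain = (
    ChatPromptTemplate.from_template("Give a one-line definition of {term}.")
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

# Synchronous batching: inputs are processed in parallel worker threads.
print(chain.batch([{"term": "embedding"}, {"term": "vector store"}, {"term": "agent"}]))

# Asynchronous invocation for concurrent workloads.
async def main():
    results = await asyncio.gather(
        chain.ainvoke({"term": "RAG"}),
        chain.ainvoke({"term": "LCEL"}),
    )
    print(results)

asyncio.run(main())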
Advanced Features
Agents and Tools
In LangChain, agents are autonomous systems powered by large language models (LLMs) that dynamically decide on sequences of actions to accomplish tasks, rather than following predefined chains of operations. These agents leverage the LLM as a reasoning engine to select appropriate tools, execute them, and iterate based on observations until reaching a resolution. This approach enables more flexible and adaptive interactions with external environments, such as querying databases or performing computations, by interleaving thought processes with actionable steps.[38] A prominent agent type in LangChain is the ReAct agent, which implements the ReAct paradigm of synergizing reasoning and acting.[39] As of LangChain v1.0 (October 2025), legacy functions like create_react_agent have been deprecated; the recommended way to create ReAct agents is via create_agent in langchain.agents, which integrates an LLM, a prompt template, and a list of tools. For more advanced agent workflows, including stateful and multi-actor applications, see the LangGraph section.[40]
Tools in LangChain serve as the interfaces that allow agents to interact with external systems or perform computations beyond the LLM's inherent capabilities. Each tool is defined with a name, description, and schema for inputs and outputs, enabling the agent to select and invoke it based on the task context. Pre-built tools include integrations like SerpAPI for web search, which enables real-time information retrieval, and mathematical utilities such as a calculator for arithmetic operations. Custom tools can be created by wrapping Python functions with the @tool decorator, which automatically generates the necessary schema from function signatures and docstrings; for example, a function to fetch weather data could be decorated and bound to an agent for location-based queries. These tools are passed as a list to the agent during initialization, allowing the LLM to reference their descriptions in its decision-making process.
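A sketch combining a custom @tool-decorated function with the v1.0 create_agent entry point; the keyword arguments (such as system_prompt) and the message-based invocation format are assumptions that may differ across releases:

from langchain.agents import create_agent
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Return a short weather report for the given city."""
    # Placeholder implementation; a real tool would call a weather API.
    return f"It is currently sunny in {city}."

agent = create_agent(
    ChatOpenAI(model="gpt-4o-mini"),  # illustrative model name
    tools=[get_weather],
    system_prompt="You are a helpful assistant. Use tools when needed.",  # assumed keyword
)

# The agent decides when to call the tool, observes the result, and answers.
result = agent.invoke({"messages": [{"role": "user", "content": "What's the weather in Paris?"}]})
print(result["messages"][-1].content)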
The legacy AgentExecutor class, deprecated in LangChain v1.0 (October 2025), previously managed the runtime loop for agent execution by handling cycles of observation, reasoning, action, and feedback until task completion, with safeguards like maximum iterations to prevent infinite loops. Current agent execution uses LCEL runnables or LangGraph for more robust handling, supporting asynchronous execution and callbacks for monitoring progress. For example, in a ReAct setup, the system would parse the LLM's output for tool calls, run the selected tool, and append the result to the conversation history for subsequent reasoning steps.[40]
As of LangChain v1.0 (October 2025), basic multi-agent configurations in core LangChain—where a central agent coordinates specialized sub-agents for divided tasks—are deprecated. Recommended multi-agent systems, including supervisor-worker hierarchies, are built using LangGraph's graph-based orchestration (see the LangGraph section). Legacy core setups used simple prompting and execution loops for collaborative problem-solving, such as one agent handling data retrieval while another performs analysis.[25]
Memory and Retrieval
LangChain incorporates memory mechanisms to manage state across interactions, enabling large language model (LLM) applications to retain and utilize context from previous exchanges. This state management is crucial for conversational agents and chains that require awareness of prior inputs and outputs. As of LangChain v1.0 (October 2025), many legacy memory types have been deprecated in favor of LCEL-compatible runnables and LangGraph checkpointers for persistence. Memory types are designed to handle short-term chat history or extract and store key information for longer-term recall, preventing the need to reprocess full histories in each LLM call.[41]

A fundamental memory type is ConversationBufferMemory, which maintains a simple buffer of the entire conversation history as a list of messages or a formatted string. It appends new human and AI messages to this buffer after each interaction and returns the full history when prompted, making it suitable for short-term retention in straightforward chat applications. For instance, in a conversational chain, this memory ensures the LLM receives the cumulative dialogue as context, though it may become inefficient for extended sessions due to token limits. For production use, integrate with LangGraph's InMemorySaver or database-backed checkpointers for thread-level persistence.[42][43][44]
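A sketch of thread-scoped conversation memory using LangGraph's InMemorySaver with a v1.0 agent; the checkpointer keyword and message format are assumptions based on the pattern described above:

from langchain.agents import create_agent
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver

agent = create_agent(
    ChatOpenAI(model="gpt-4o-mini"),  # illustrative model name
    tools=[],
    checkpointer=InMemorySaver(),  # keeps per-thread conversation state in memory
)

config = {"configurable": {"thread_id": "user-42"}}  # identifies the conversation thread
agent.invoke({"messages": [{"role": "user", "content": "My name is Ada."}]}, config)
reply = agent.invoke({"messages": [{"role": "user", "content": "What is my name?"}]}, config)
print(reply["messages"][-1].content)  # the buffered history lets the model recall the name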
The legacy ConversationEntityMemory, deprecated since LangChain v0.3.4 and removed in v1.0.0, previously extracted named entities—such as people, places, or concepts—from recent chat history using an LLM and generated concise summaries for each, stored in an entity store for retrieval across sessions. Current alternatives include custom entity extraction via LLM tools or LangGraph state management with custom schemas to track entities.[45]
Retrieval capabilities in LangChain augment LLM context by integrating external knowledge sources through semantic search, addressing limitations in models' parametric knowledge. Vector stores, such as FAISS (Facebook AI Similarity Search), index high-dimensional embeddings of documents to enable fast approximate nearest-neighbor searches based on query similarity. Embeddings, generated by models like those from OpenAI or Hugging Face, represent text chunks as vectors capturing semantic meaning, allowing retrieval of contextually relevant passages even for paraphrased queries. This forms the basis for Retrieval-Augmented Generation (RAG), where retrieved documents are inserted into LLM prompts to ground responses in up-to-date or domain-specific information, reducing hallucinations.[46]
Building effective retrieval indexes involves document loaders, text splitters, and embedders to create searchable corpora from raw data. Document loaders ingest unstructured content from diverse formats, including PDFs, web pages, or databases, converting them into LangChain Document objects with metadata. Text splitters then partition these documents into overlapping chunks—typically by character count, tokens, or semantic boundaries—to fit LLM context windows while preserving meaning. Embedders subsequently transform these chunks into vectors, which are persisted in the vector store for querying via retrievers that return the top-k most similar documents. This pipeline enables scalable knowledge bases for RAG applications.
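A sketch of this indexing pipeline using commonly installed packages (langchain-community, langchain-text-splitters, langchain-huggingface, beautifulsoup4, and faiss-cpu); the source URL is a placeholder:

from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load raw documents (here, a web page) into Document objects with metadata.
docs = WebBaseLoader("https://example.com/article.html").load()  # placeholder URL

# 2. Split into overlapping chunks sized for the model's context window.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(docs)

# 3. Embed the chunks and index them in a FAISS vector store.
vector_store = FAISS.from_documents(chunks, HuggingFaceEmbeddings())

# 4. Expose the store as a retriever that returns the top-k most similar chunks.
retriever = vector_store.as_retriever(search_kwargs={"k": 4})
print(retriever.invoke("What problems does LangChain solve?"))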
To extend memory beyond ephemeral sessions, LangChain supports long-term persistence through integrations with databases, facilitating episodic (event-based) or procedural (step-based) recall. For entity tracking, use custom state in LangGraph with stores like Redis or SQLite. Chat message histories can persist full conversation buffers in databases such as PostgreSQL, Azure CosmosDB, or Xata, allowing retrieval of past interactions for multi-session continuity via checkpointers. These persistent stores ensure memory survives application restarts, supporting production-scale applications where context spans user sessions or evolves over time. In agent loops, such retrieval can supply historical context that informs tool selection or action planning.[47][48][44]
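A rough sketch of durable, cross-session memory using a SQLite-backed LangGraph checkpointer; it assumes the langgraph-checkpoint-sqlite package and the same assumed create_agent keywords as earlier examples, and database backends such as Postgres follow the same pattern:

import sqlite3
from langchain.agents import create_agent
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.sqlite import SqliteSaver

# Conversation state is written to disk, so it survives application restarts.
checkpointer = SqliteSaver(sqlite3.connect("conversations.db", check_same_thread=False))

agent = create_agent(
    ChatOpenAI(model="gpt-4o-mini"),  # illustrative model name
    tools=[],
    checkpointer=checkpointer,
)

# Reusing the same thread_id in a later session retrieves the stored history.
config = {"configurable": {"thread_id": "customer-1234"}}
agent.invoke({"messages": [{"role": "user", "content": "Remember that my order ID is 789."}]}, config)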