
XMPP

XMPP (Extensible Messaging and Presence Protocol) is an open, XML-based communications protocol for near-real-time exchange of structured data between network entities. It primarily enables instant messaging, presence information, multi-party chat, voice and video calls, and other collaborative applications. It operates on a decentralized client-server architecture similar to email: users connect to servers that route messages across a federated network of independent domains, with connections secured by TLS encryption and authenticated through SASL mechanisms.

Originally developed in 1998 by Jeremie Miller as the open-source Jabber project, a decentralized alternative to proprietary systems, XMPP was formalized as an IETF standard in 2004 through RFC 3920 (core protocol) and RFC 3921 (instant messaging and presence). Significant updates followed in 2011 via RFC 6120 and RFC 6121, and in 2015 via RFC 7622, which revised the address format. The protocol's design emphasizes extensibility: developers add features through XMPP Extension Protocols (XEPs) maintained by the XMPP Standards Foundation (XSF), such as Multi-User Chat (MUC) for group discussions, PubSub for publish-subscribe notifications, and Jingle for peer-to-peer multimedia sessions.

XMPP powers a wide array of applications beyond traditional chat, including Internet of Things (IoT) device coordination, online gaming, and social networking, with millions of users connected via tens of thousands of public servers worldwide. Its open nature fosters interoperability among diverse software clients and servers, and federation without central control helps preserve privacy, making it a resilient foundation for mobile and other communications.

Introduction

Definition and Purpose

The Extensible Messaging and Presence Protocol (XMPP) is an open protocol standardized by the Internet Engineering Task Force (IETF) for near-real-time, structured data exchange between network entities, with XML as its foundational format. Defined in core RFCs such as RFC 6120 (protocol core) and RFC 6121 (instant messaging and presence), XMPP provides asynchronous, bidirectional communication streams that support a variety of applications beyond traditional instant messaging. Its design emphasizes simplicity and extensibility, making it suitable for environments that require low-latency interactions without reliance on proprietary systems.

At its core, XMPP serves three purposes: instant messaging for text-based exchanges, presence information to indicate user availability (such as online status), and roster management for maintaining contact lists. These features support real-time interaction in both one-to-one and multi-user scenarios. XMPP's extensible nature also lets it underpin other applications, such as Internet of Things (IoT) device communication for efficient, secure data distribution across networks, and voice/video calls through protocol extensions like Jingle.

XMPP originated in 1998 as the open-source Jabber project, initiated by Jeremie Miller to create a decentralized alternative to closed systems, and was formalized as an IETF standard in 2004 through RFCs 3920 and 3921. As of 2025, it continues to support decentralized communication globally, powering federated networks where servers interoperate without central control, with a resurgence driven by sovereignty efforts and applications in secure, interoperable healthcare chat. Key advantages include its status as an open standard, which prevents vendor lock-in by allowing users to select from multiple interoperable implementations, and its inherent support for federation, enabling cross-server connectivity akin to email systems.

Federated Architecture

The federated architecture of XMPP relies on a decentralized client-server model, in which users connect to their home server via client-to-server (C2S) streams for authentication, presence management, and initial routing. These C2S connections use XML streams over TCP, secured with TLS and authenticated via SASL mechanisms such as SCRAM-SHA-1 or PLAIN, allowing clients to bind resources and exchange data with the server. For communication beyond a single domain, servers establish server-to-server (S2S) streams to route messages, presence, and other stanzas across independent networks, so users can reach other domains without switching servers.

This design promotes decentralization by eliminating the need for a central authority, much like email: users on different servers (e.g., example.com and otherdomain.org) can communicate as if on a unified network. Benefits include enhanced resilience, since the absence of a single point of control distributes risk, and per-domain policy, since each operator can customize its own rules. Servers discover each other via DNS SRV records, and federation is optional and subject to administrative choice, so no central registry or overseer is required.

Federation involves establishing S2S links with trust verification to prevent spoofing and ensure domain authenticity. Servers typically use Server Dialback for lightweight verification, where the receiving server asks the sending domain's authoritative server to confirm a dialback key, or Public Key Infrastructure (PKI) via TLS certificates and SASL EXTERNAL for stronger mutual authentication. These methods (Dialback is detailed in XEP-0220) allow dynamic, on-demand connections directly between the domains involved, so the network as a whole has no single point of failure.
In contrast to centralized messaging services, which route all traffic through proprietary servers under a single provider's control, XMPP's federation supports self-hosting by individuals or organizations, promoting interoperability among diverse clients and servers without vendor lock-in. This open model, grounded in domain-based addressing via Jabber IDs (JIDs), enables cross-network messaging while keeping data under the control of each domain.
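The DNS-based discovery step described above can be sketched as follows. This is a minimal illustration, not a resolver: the `SrvRecord` type and both helper functions are hypothetical names, the records are assumed to have been fetched already by some DNS library, and the ordering is a simplification (lowest priority first, then highest weight) rather than the weighted random selection that SRV processing normally uses. The fallback to the bare domain on port 5269 is also an assumption of this sketch.

```python
from typing import List, NamedTuple, Tuple

S2S_DEFAULT_PORT = 5269  # registered server-to-server port


class SrvRecord(NamedTuple):
    """One already-resolved SRV answer (hypothetical container type)."""
    priority: int
    weight: int
    port: int
    target: str


def srv_query_name(domain: str, server_to_server: bool = True) -> str:
    """Build the SRV owner name to look up for an XMPP domain."""
    service = "_xmpp-server" if server_to_server else "_xmpp-client"
    return f"{service}._tcp.{domain}"


def connection_candidates(domain: str, records: List[SrvRecord]) -> List[Tuple[str, int]]:
    """Return (host, port) pairs to try, in order, for an S2S connection."""
    if not records:
        # Assumed fallback: connect to the domain itself on the default port.
        return [(domain, S2S_DEFAULT_PORT)]
    # Simplified ordering: lowest priority value first, then highest weight.
    ordered = sorted(records, key=lambda r: (r.priority, -r.weight))
    return [(r.target.rstrip("."), r.port) for r in ordered]


if __name__ == "__main__":
    answers = [
        SrvRecord(20, 0, 5269, "backup.otherdomain.org."),
        SrvRecord(10, 5, 5269, "xmpp.otherdomain.org."),
    ]
    print(srv_query_name("otherdomain.org"))
    print(connection_candidates("otherdomain.org", answers))
```

Because each domain publishes its own records, no central registry is consulted at any point, which is the property the federation model depends on.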

Protocol Architecture

Client-Server and Server-Server Communication

In XMPP, client-to-server (C2S) communication establishes a bidirectional XML stream between a client and its server, enabling the exchange of stanzas for messaging and presence. The process begins with stream initiation: the client opens a TCP connection to the server on port 5222 and sends an opening stream header, such as <stream:stream from='juliet@im.example.com' to='im.example.com' version='1.0' xml:lang='en' xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams'>. The server responds with its own stream header, including a unique stream ID, and advertises supported features via a <stream:features/> element.

The client and server then negotiate Transport Layer Security (TLS) for channel encryption using the <starttls/> mechanism; after a successful TLS handshake the stream is restarted so that all further traffic is protected. Authentication follows via Simple Authentication and Security Layer (SASL) mechanisms, such as SCRAM-SHA-1 or PLAIN (only over TLS): the client sends an <auth/> element, the server may issue one or more challenges depending on the mechanism, and on success the server returns <success/>, after which the stream is restarted again.

After authentication, the client binds a resource identifier to its Jabber ID (JID) by sending an <iq/> stanza that contains a <bind/> element, e.g., <iq type='set' id='bind-1'><bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'><resource>balcony</resource></bind></iq>, and the server returns the full JID, such as juliet@im.example.com/balcony. Once the session is established, the client can send presence stanzas, such as <presence/> without a to attribute to broadcast availability, which the server routes to subscribed contacts, and message stanzas like <message from='juliet@im.example.com/balcony' to='romeo@example.net' type='chat'><body>Wherefore art thou, Romeo?</body></message>, which the server delivers or stores for offline users. Stream compression, using methods like zlib advertised in <stream:features/>, may optionally be negotiated to reduce bandwidth.
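The negotiation order described above (TLS first, then SASL, then resource binding) can be sketched as a small decision function over the server's advertised features. This is an illustrative sketch only: `next_step` and `PREFERRED_MECHANISMS` are hypothetical names, the preference order is an assumption, and a real client would also handle required versus optional features, compression, and errors.

```python
import xml.etree.ElementTree as ET

NS_TLS = "urn:ietf:params:xml:ns:xmpp-tls"
NS_SASL = "urn:ietf:params:xml:ns:xmpp-sasl"
NS_BIND = "urn:ietf:params:xml:ns:xmpp-bind"

# Assumed client preference: salted challenge-response before plaintext.
PREFERRED_MECHANISMS = ["SCRAM-SHA-1", "PLAIN"]


def next_step(features_xml, tls_active):
    """Decide the next negotiation action from a <stream:features/> element.

    Returns a tuple (action, detail) where action is one of
    "starttls", "auth", "bind", "abort", or "done".
    """
    features = ET.fromstring(features_xml)
    if not tls_active and features.find(f"{{{NS_TLS}}}starttls") is not None:
        return ("starttls", None)
    mechanisms = features.find(f"{{{NS_SASL}}}mechanisms")
    if mechanisms is not None:
        offered = [m.text for m in mechanisms.findall(f"{{{NS_SASL}}}mechanism")]
        for name in PREFERRED_MECHANISMS:
            if name == "PLAIN" and not tls_active:
                continue  # never send a plaintext password on an unencrypted stream
            if name in offered:
                return ("auth", name)
        return ("abort", None)
    if features.find(f"{{{NS_BIND}}}bind") is not None:
        return ("bind", None)
    return ("done", None)


if __name__ == "__main__":
    after_tls = (
        "<stream:features xmlns:stream='http://etherx.jabber.org/streams'>"
        "<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>"
        "<mechanism>SCRAM-SHA-1</mechanism><mechanism>PLAIN</mechanism>"
        "</mechanisms></stream:features>"
    )
    print(next_step(after_tls, tls_active=True))
```

Each call corresponds to one round of features after a stream restart, so a client would call it again with the new features once the chosen step completes.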
Server-to-server (S2S) communication facilitates federation by allowing servers to exchange stanzas across domains, typically over port 5269. The initiating server opens an XML stream with a header like <stream:stream from='example.com' to='example.net' version='1.0' xml:lang='en' xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams'>, using the jabber:server namespace, and the receiving server responds with its own header, which carries a stream ID. Authentication relies on either Server Dialback for lightweight verification or SASL EXTERNAL with TLS certificates for stronger identity assurance. In Server Dialback, the initiating server sends a <db:result/> element with a generated key to the receiving server, which then asks the authoritative server for the sending domain (over a separate connection) to check the key using <db:verify/>; the key is typically derived with HMAC-SHA-256, and the answer is <db:verify type='valid'/> or type='invalid'. TLS is mandatory to implement, and most deployments require it on S2S links to protect against eavesdropping and tampering. After successful authentication, servers route stanzas, such as messages or presence, based on the domainpart of the to JID, re-scoping namespaces as needed and ensuring in-order delivery.

Stream negotiation across both C2S and S2S uses the <stream:features/> element for capability advertisement: mandatory features like TLS must be completed before proceeding, while optional ones like compression can be skipped. Errors are signaled via stream-level <stream:error/> elements or stanza-level <error/> conditions. For instance, a malformed JID triggers <jid-malformed/>, a from address that does not match the authenticated entity triggers <invalid-from/>, and a resource conflict during binding returns <conflict/>, prompting the client to select a different resource. Stanzas addressed to non-existent users result in <service-unavailable/> without revealing presence information. These mechanisms provide robust, secure inter-domain communication in XMPP's decentralized architecture.
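The Dialback key check can be illustrated with a short sketch. The derivation used here (an HMAC-SHA-256 over the receiving domain, originating domain, and stream ID, keyed with a hash of a server-side secret) follows the general approach recommended for Dialback, but the exact encoding of the inputs is an assumption of this sketch and should be checked against XEP-0220 before use. The function names and the sample secret are hypothetical.

```python
import hashlib
import hmac


def dialback_key(secret, receiving_domain, originating_domain, stream_id):
    """Derive a dialback key for one stream (input encoding is assumed)."""
    # Assumption: the HMAC key is the hex digest of SHA-256(secret).
    hashed_secret = hashlib.sha256(secret.encode("utf-8")).hexdigest().encode("ascii")
    text = f"{receiving_domain} {originating_domain} {stream_id}".encode("utf-8")
    return hmac.new(hashed_secret, text, hashlib.sha256).hexdigest()


def verify_dialback_key(secret, receiving_domain, originating_domain, stream_id, presented):
    """What the authoritative server does when it receives <db:verify/>."""
    expected = dialback_key(secret, receiving_domain, originating_domain, stream_id)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, presented)


if __name__ == "__main__":
    key = dialback_key("example-secret", "example.net", "example.com", "D60000229F")
    ok = verify_dialback_key("example-secret", "example.net", "example.com", "D60000229F", key)
    print("valid" if ok else "invalid")
```

Because the key depends on the stream ID, a key observed on one stream cannot be replayed on another, and the authoritative server needs no per-connection state beyond its secret.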

Addressing and JIDs

In XMPP, addressing relies on Jabber Identifiers (JIDs), which serve as unique identifiers for entities such as users, resources, and services within the protocol's federated network. A JID follows the syntactic form localpart@domainpart/resourcepart, where the localpart represents the user or node identifier, the domainpart specifies the authoritative server, and the resourcepart denotes a specific client instance or connection. For example, a full JID might appear as alice@example.com/mobile, indicating user Alice on the example.com server using a mobile client. The domainpart is mandatory and limited to 1-1023 octets, processed according to the IDNA2008 standard for internationalized domain names. The localpart and resourcepart are optional, each up to 1023 octets, and follow the UsernameCaseMapped and OpaqueString profiles of PRECIS for string preparation, respectively.

Routing in XMPP uses bare and full JIDs to direct stanzas appropriately. A bare JID consists of only the localpart and domainpart (e.g., alice@example.com) and is typically used for presence information or messages intended for the user regardless of the specific resource. In contrast, a full JID includes the resourcepart for targeted delivery to a particular client instance, such as a message sent exclusively to alice@example.com/mobile when multiple resources are connected. The domainpart determines the destination for initial routing: if the sender and recipient share the same domain, the stanza is delivered locally over client-to-server (C2S) streams; otherwise, the sender's server resolves the recipient's domain and routes the stanza over a server-to-server (S2S) stream.

Beyond user accounts, JIDs use the localpart as a node identifier to address non-user entities, such as multi-user chat (MUC) rooms or gateway services. For instance, a MUC room is typically addressed with a domainpart such as conference.example.com, the service subdomain that handles room instances, with the room name as the localpart. This structure allows services to manage multiple entities under a single domain without conflicting with user JIDs.
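The split into localpart, domainpart, and resourcepart, and the routing decision that follows from the domainpart, can be sketched as below. This is a simplified illustration: it checks only presence and octet length, and it does not apply the PRECIS or IDNA2008 preparation that a conforming implementation needs. `JID`, `parse_jid`, and `next_hop` are hypothetical names.

```python
from typing import NamedTuple, Optional

MAX_PART_OCTETS = 1023


class JID(NamedTuple):
    local: Optional[str]
    domain: str
    resource: Optional[str]

    @property
    def bare(self) -> str:
        """The JID without its resourcepart."""
        return f"{self.local}@{self.domain}" if self.local else self.domain


def parse_jid(text: str) -> JID:
    """Split a JID string into its three parts (no PRECIS/IDNA processing)."""
    # The resourcepart starts at the first '/', and may itself contain '/' or '@'.
    rest, slash, resource = text.partition("/")
    local, at, domain = rest.partition("@")
    if not at:
        local, domain = "", rest
    parts = {
        "localpart": (local, bool(at)),
        "domainpart": (domain, True),
        "resourcepart": (resource, bool(slash)),
    }
    for name, (value, present) in parts.items():
        if present and not 1 <= len(value.encode("utf-8")) <= MAX_PART_OCTETS:
            raise ValueError(f"{name} must be 1-{MAX_PART_OCTETS} octets: {text!r}")
    return JID(local or None, domain, resource or None)


def next_hop(recipient: str, local_domain: str) -> str:
    """Choose local delivery or federation based on the domainpart."""
    same = parse_jid(recipient).domain.lower() == local_domain.lower()
    return "local delivery" if same else "server-to-server"


if __name__ == "__main__":
    jid = parse_jid("alice@example.com/mobile")
    print(jid, jid.bare)
    print(next_hop("alice@example.com/mobile", "example.com"))
    print(next_hop("bob@otherdomain.org", "example.com"))
```

Keeping the bare form as a derived property mirrors how servers treat it: the same parsed address serves both resource-specific delivery and per-account operations.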
To keep addresses valid and compatible, XMPP defines escaping rules for the localpart, accommodating characters that the base string preparation profile disallows. An escaped character is written as a backslash followed by two hexadecimal digits (e.g., \20 for a space, \40 for an at sign), and the rule applies only to the localpart, not to the domainpart or resourcepart. This mechanism, defined in XEP-0106, prevents problems with spaces or punctuation in usernames and maintains PRECIS compliance while enabling interoperability with external systems. In federation, the domainpart is resolved via DNS SRV records (as per RFC 6120), so cross-domain routing never needs to interpret the escaped localpart.
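A minimal sketch of this escaping scheme follows. The character table reflects the escape sequences commonly listed for XEP-0106; the handling of literal backslashes (escaped only when they would otherwise be read as the start of an escape sequence) is this sketch's reading of the rule and should be verified against the specification. The function names are hypothetical.

```python
# Characters disallowed in a localpart and their escape sequences.
ESCAPES = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}
# Two-digit codes that form a recognized escape sequence (5c is the backslash).
CODES = {seq[1:] for seq in ESCAPES.values()} | {"5c"}


def escape_localpart(text: str) -> str:
    """Escape characters that may not appear literally in a localpart."""
    out = []
    for i, ch in enumerate(text):
        if ch in ESCAPES:
            out.append(ESCAPES[ch])
        elif ch == "\\" and text[i + 1:i + 3].lower() in CODES:
            # A literal backslash that would be misread as an escape sequence.
            out.append("\\5c")
        else:
            out.append(ch)
    return "".join(out)


def unescape_localpart(text: str) -> str:
    """Reverse escape_localpart."""
    out = []
    i = 0
    while i < len(text):
        code = text[i + 1:i + 3].lower()
        if text[i] == "\\" and code in CODES:
            out.append(chr(int(code, 16)))
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for name in ["space cadet", "d'artagnan", "c:\\net"]:
        escaped = escape_localpart(name)
        print(name, "->", escaped, "->", unescape_localpart(escaped))
```

Only the localpart is passed through these functions; the domainpart and resourcepart are left untouched, as the text above describes.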

Stanzas and XML Structure

XMPP communications are structured around XML stanzas, which serve as the atomic units of data exchange between entities. These stanzas are encapsulated within bidirectional XML streams, enabling asynchronous messaging over long-lived connections. The protocol mandates well-formed XML, and stanzas are processed in the order received to maintain sequence integrity.

There are three primary stanza types, each with a distinct semantic purpose: <message/> for delivering unstructured or semi-structured content such as chat messages; <presence/> for conveying availability and status information; and <iq/> (short for "Info/Query") for structured request-response interactions, such as retrieving contact lists or setting preferences. The <message/> stanza typically includes a <body/> child element containing the primary text, while <presence/> may include <show/> and <status/> elements to indicate states like "away" or custom descriptions. The <iq/> stanza requires a type attribute specifying "get", "set", "result", or "error", and often encloses query-specific payloads in namespaced elements, such as <query xmlns='jabber:iq:roster'/> for roster requests. All stanza types are qualified by the jabber:client namespace for client-to-server exchanges or jabber:server for server-to-server exchanges.

Stanzas follow a consistent XML structure, with common attributes for addressing and tracking: to and from for JID-based addressing; id for correlating requests and responses (mandatory for <iq/>); type to refine semantics (e.g., "chat" for <message/> or "unavailable" for <presence/>); and xml:lang for specifying the natural language of content. Child elements within stanzas carry the payload, and the root stanza element may be self-closing or a container as needed. The enclosing XML stream is initiated by an opening <stream:stream> tag with attributes like version='1.0', xmlns='jabber:client', and xmlns:stream='http://etherx.jabber.org/streams', establishing a session that persists until terminated by </stream:stream> or an error. 
After TLS or SASL negotiation, a restarted stream header is required before stanza exchange can resume. Extensibility is achieved by embedding custom XML payloads in stanzas using additional namespaces, allowing for protocol extensions without altering the core structure. These extensions, specified in XMPP Extension Protocols (XEPs), define new child elements or attributes for specialized functionality, such as requesting delivery receipts for messages. For instance, a basic chat message might appear as:
<message to='romeo@example.net' from='juliet@im.example.com/balcony' type='chat' id='msg1'>
  <body>Art thou not Romeo, and a Montague?</body>
</message>
Here, the <body/> holds the text payload, while extensions can insert further child elements qualified by a XEP-defined namespace. Unsupported extensions in <message/> or <presence/> are simply ignored, but an <iq/> query in an unknown namespace triggers an error, so the requester always receives a reply.
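The stanza structure described in this section can be inspected with a standard XML parser. The sketch below is illustrative only: `describe_stanza` is a hypothetical helper, the input is assumed to be a single complete stanza with its namespace declared (on a live stream the namespace is inherited from the stream header), and `urn:example:unknown` is a made-up extension namespace.

```python
import xml.etree.ElementTree as ET

NS_CLIENT = "jabber:client"
STANZA_KINDS = {"message", "presence", "iq"}
IQ_TYPES = {"get", "set", "result", "error"}


def describe_stanza(xml_text):
    """Parse one client-namespace stanza and summarize its common parts."""
    element = ET.fromstring(xml_text)
    if not element.tag.startswith("{"):
        raise ValueError("stanza is not namespace-qualified")
    namespace, _, kind = element.tag[1:].partition("}")
    if namespace != NS_CLIENT or kind not in STANZA_KINDS:
        raise ValueError(f"not a stanza: {element.tag}")
    if kind == "iq" and (element.get("id") is None or element.get("type") not in IQ_TYPES):
        raise ValueError("<iq/> requires an id and a valid type")
    # Child elements outside jabber:client are extension payloads.
    extensions = [c.tag for c in element if not c.tag.startswith(f"{{{NS_CLIENT}}}")]
    return {
        "kind": kind,
        "type": element.get("type"),
        "to": element.get("to"),
        "from": element.get("from"),
        "id": element.get("id"),
        "body": element.findtext(f"{{{NS_CLIENT}}}body"),
        "extensions": extensions,
    }


if __name__ == "__main__":
    sample = (
        "<message xmlns='jabber:client' to='romeo@example.net' "
        "from='juliet@im.example.com/balcony' type='chat' id='msg1'>"
        "<body>Art thou not Romeo, and a Montague?</body>"
        "<x xmlns='urn:example:unknown'/>"
        "</message>"
    )
    print(describe_stanza(sample))
```

A receiver that does not understand the extension namespace can still read the body, which is the ignore-what-you-do-not-understand behavior described for <message/> and <presence/>.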

Transport and Extensions

Standard Transports

The primary transport for XMPP is TCP, which establishes direct, persistent connections between clients and servers or between servers themselves. Client-to-server (C2S) communications default to TCP port 5222, while server-to-server (S2S) communications use TCP port 5269; these ports are registered with IANA for XMPP use. TCP connections support an upgrade to TLS for channel encryption via the STARTTLS mechanism, enabling secure stream negotiation after initial connection establishment. This transport prioritizes efficiency through long-lived, bidirectional streams that minimize overhead for real-time messaging and presence updates.

For environments where direct TCP connections are impractical, such as web browsers or networks behind restrictive firewalls, XMPP employs HTTP-based bindings. Bidirectional-streams Over Synchronous HTTP (BOSH), defined in XEP-0124, emulates a persistent connection using long-polling: clients issue HTTP requests that the server holds open until data is available or a timeout occurs, allowing bidirectional XML exchange over standard HTTP/1.1. BOSH facilitates compatibility with browser-based clients by avoiding the need for persistent sockets, though it introduces some latency due to request-response cycles.

A more modern HTTP-compatible alternative is XMPP over WebSocket, standardized in RFC 7395, which provides low-overhead, full-duplex communication by framing XMPP streams within WebSocket messages using a specific 'xmpp' subprotocol. This binding supports real-time bidirectional data exchange in web contexts and offers improved performance over BOSH by reducing polling overhead and enabling persistent connections akin to TCP. Less common are serverless transports, such as those outlined in XEP-0174 for direct messaging between peers without intermediary servers, typically over local networks using direct TLS connections for security. 
These rely on peer discovery via multicast DNS (mDNS) and DNS-based service discovery, and they support flexible ports via DNS SRV records to navigate firewalls, but adoption remains limited due to their niche focus on ad-hoc scenarios. Overall, TCP remains the most efficient transport for native applications, while HTTP bindings ensure broad interoperability; XMPP lacks native UDP support, relying exclusively on connection-oriented protocols.
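The WebSocket binding replaces the open-ended <stream:stream> tag with self-contained framing elements, because each WebSocket message has to carry a complete XML element. The sketch below builds and classifies such frames; it covers only the framing, not the WebSocket connection itself, and the helper names are hypothetical. The framing namespace and the 'xmpp' subprotocol name are the ones used by RFC 7395.

```python
import xml.etree.ElementTree as ET

NS_FRAMING = "urn:ietf:params:xml:ns:xmpp-framing"
SUBPROTOCOL = "xmpp"  # value offered in the Sec-WebSocket-Protocol header


def open_frame(domain):
    """Build the <open/> element that starts (or restarts) a stream."""
    element = ET.Element("open", {"xmlns": NS_FRAMING, "to": domain, "version": "1.0"})
    return ET.tostring(element, encoding="unicode")


def close_frame():
    """Build the <close/> element that ends a stream."""
    return ET.tostring(ET.Element("close", {"xmlns": NS_FRAMING}), encoding="unicode")


def classify_frame(text):
    """Classify one WebSocket text message as 'open', 'close', or 'stanza'.

    Raises xml.etree.ElementTree.ParseError if the message is not a single
    complete XML element.
    """
    root = ET.fromstring(text)
    if root.tag == f"{{{NS_FRAMING}}}open":
        return "open"
    if root.tag == f"{{{NS_FRAMING}}}close":
        return "close"
    return "stanza"


if __name__ == "__main__":
    print(open_frame("example.com"))
    print(classify_frame("<presence xmlns='jabber:client'/>"))
    print(close_frame())
```

Since every frame parses on its own, a receiver needs no incremental stream parser, which is one reason this binding is simpler to implement in browsers than raw TCP streams would be.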

Extensibility via XEPs

XMPP's extensibility is primarily achieved through XMPP Extension Protocols (XEPs), which are additional specifications developed by the XMPP Standards Foundation (XSF) to introduce new features and capabilities without modifying the core protocol defined in RFC 6120 and RFC 6121. These protocols allow the XMPP ecosystem to evolve modularly, supporting diverse applications from instant messaging enhancements to integration with other technologies.

The XEP development process begins with submission to the XMPP Extensions Editor, followed by community review through the Standards Special Interest Group (SIG). Approved proposals are published as Experimental XEPs, entering a lifecycle that includes stages such as Proposed (for active discussion), Stable (after widespread testing and two or more independent implementations), and Final (for mature, unchanging standards). Other statuses include Deferred (for inactive ideas after 12 months), Deprecated (for discouraged but supported extensions), and Obsolete or Rejected (for superseded or unviable proposals). As of 2025, the XSF has published 491 XEPs across various types, including Standards Track (for wire protocols), Informational (for best practices), and Procedural (for processes).

XEPs integrate into XMPP streams by embedding payload elements within core stanzas, such as <message/>, <presence/>, or <iq/>, using distinct XML namespaces to avoid conflicts with the base 'jabber:client' or 'jabber:server' namespaces. Servers declare support for stream-level extensions during stream negotiation via the <stream:features/> element, which lists feature namespaces (e.g., for stream management) and may mark them as required using a <required/> child element. This mechanism preserves backward compatibility, as entities can ignore unrecognized namespaces without disrupting the stream; errors like <invalid-namespace/> arise only if the stream's own namespace rules are violated. 
For instance, XEP-0096 enables direct file transfers by defining a Stream Initiation (SI) profile that negotiates transport methods (e.g., in-band bytestreams or SOCKS5 bytestreams) within IQ stanzas, allowing peers to exchange files without altering XMPP's foundational messaging structure. Similarly, extensions for reliability (e.g., XEP-0198 for stream management) or multi-user chat (e.g., XEP-0045) build on this framework to add reliability and group features.

Governance of XEPs is handled by the XSF's XMPP Council, which votes on advancements (+1 for approval, 0 for abstention, -1 for opposition), requiring a majority of +1 votes with no unresolved -1s for progression. The XSF Board oversees Procedural XEPs, while the XMPP Registrar manages namespace registrations upon reaching Stable or Active status to avoid conflicts between extensions. This open, consensus-driven process, detailed in XEP-0001, fosters innovation while ensuring extensions remain royalty-free under the XSF's IPR policy.
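Namespace-based extensibility can be sketched as a dispatch table keyed by namespace: payloads whose namespace has a registered handler are processed, and the rest are ignored. The function names are hypothetical, `urn:example:unsupported` is a made-up namespace, and `urn:xmpp:receipts` is used only as a familiar example of an extension namespace.

```python
import xml.etree.ElementTree as ET


def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    return element.tag[1:].partition("}")[0] if element.tag.startswith("{") else ""


def dispatch_payloads(stanza_xml, handlers):
    """Run handlers for known extension payloads and list ignored namespaces.

    handlers maps a namespace URI to a callable that takes the payload element.
    """
    stanza = ET.fromstring(stanza_xml)
    core_namespace = namespace_of(stanza)
    handled, ignored = [], []
    for child in stanza:
        namespace = namespace_of(child)
        if namespace in handlers:
            handled.append(handlers[namespace](child))
        elif namespace != core_namespace:
            # Unknown extension: skipped without disturbing the rest of the stanza.
            ignored.append(namespace)
    return handled, ignored


if __name__ == "__main__":
    handlers = {"urn:xmpp:receipts": lambda el: ("receipts", el.tag.partition("}")[2])}
    stanza = (
        "<message xmlns='jabber:client' id='m1' type='chat'>"
        "<body>hello</body>"
        "<request xmlns='urn:xmpp:receipts'/>"
        "<x xmlns='urn:example:unsupported'/>"
        "</message>"
    )
    print(dispatch_payloads(stanza, handlers))
```

Adding support for another extension then means registering one more handler, with no change to the code that parses the core stanza.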

Limitations and Challenges

One significant limitation of the XMPP protocol stems from its use of XML for encoding stanzas, which introduces verbosity and results in higher bandwidth consumption compared to more compact formats employed by protocols like Matrix's JSON-based structure. This overhead is particularly noticeable in scenarios involving frequent small messages, where XML markup can inflate sizes by several factors. For instance, a simple presence update or short message may require hundreds of bytes due to tags and attributes, exacerbating data usage on low-bandwidth networks.

The protocol's design also presents a steep learning curve for developers, primarily due to the intricacies of XML parsing, validation, and the vast ecosystem of 491 XMPP Extension Protocols (XEPs) that extend core functionality. Implementing compliant clients or servers requires navigating multi-stage stream negotiations, including TLS handshake, SASL authentication, and resource binding, which add layers of complexity beyond simpler text-based protocols. While this extensibility enables customization, the optional nature of many XEPs often leads to inconsistent implementations across software, complicating development and testing.

Adoption barriers further hinder XMPP's widespread use, including fragmentation caused by the optional XEPs, which frequently results in interoperability issues between clients and servers supporting different subsets of extensions. For example, features like multi-device synchronization or other advanced capabilities may not function seamlessly across all implementations, leading to user frustration and reduced ecosystem cohesion. By 2025, XMPP's presence in mainstream messaging has declined, overshadowed by centralized platforms such as Signal, with open federated alternatives like Matrix gaining traction for their more unified feature sets and easier onboarding. 
Scalability poses challenges in very large deployments, as the core protocol lacks native support for horizontal clustering or distributed processing, requiring server-specific extensions like those in ejabberd or MongooseIM to handle millions of concurrent users. Without such configurations, servers may encounter resource constraints, such as limits on simultaneous connections or stanza processing rates, potentially leading to stream closures under high load. Additionally, message archiving and history retrieval are not built into the base specification, relying instead on XEP-0313 (Message Archive Management) for offloading storage and query functions to external modules, which introduces further architectural overhead. Security critiques highlight that the base XMPP protocol does not include end-to-end encryption (E2EE), depending on extensions like OMEMO (XEP-0384) for such capabilities, which must be explicitly enabled and supported by all parties. While OMEMO provides multi-device E2EE based on the Signal protocol, recent 2024 analyses have questioned its maturity, citing issues with key management, implementation inconsistencies, and vulnerability to certain active attacks in fragmented deployments. These extensions, though mitigating risks, underscore the protocol's foundational reliance on add-ons for modern privacy standards.
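The framing overhead discussed in this section is easy to estimate for a given message. The sketch below compares the size of a minimal chat stanza with the size of its text payload; the addresses are illustrative defaults, `stanza_overhead` is a hypothetical helper, and real stanzas usually carry additional attributes and extension payloads that increase the overhead further.

```python
from xml.sax.saxutils import escape


def stanza_overhead(body, sender="juliet@example.com/balcony", recipient="romeo@example.net"):
    """Return (stanza_bytes, payload_bytes, ratio) for a minimal chat message."""
    if not body:
        raise ValueError("body must not be empty")
    stanza = (
        f"<message from='{sender}' to='{recipient}' type='chat'>"
        f"<body>{escape(body)}</body></message>"
    )
    stanza_bytes = len(stanza.encode("utf-8"))
    payload_bytes = len(body.encode("utf-8"))
    return stanza_bytes, payload_bytes, stanza_bytes / payload_bytes


if __name__ == "__main__":
    for text in ["ok", "See you at the balcony at nine.", "x" * 500]:
        total, payload, ratio = stanza_overhead(text)
        print(f"{payload:4d} payload bytes -> {total:4d} stanza bytes (x{ratio:.1f})")
```

The ratio shrinks as the body grows, which matches the observation above that the cost is felt most with frequent small messages.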

Features

Presence and Basic Messaging

Presence in XMPP is managed through <presence/> stanzas, which allow users to broadcast their status to subscribed contacts. These stanzas are sent without a 'type' attribute to indicate availability or with 'type="unavailable"' to signal disconnection. Availability states are specified via the <show/> child element, including online (no <show/> element), away (<show>away</show>), and do not disturb (<show>dnd</show>); additional states like chatty (<show>chat</show>) or extended away (<show>xa</show>) may also be used. Upon connecting, a client sends an initial presence stanza to its server, which broadcasts it to all entities with a subscription to the user's presence, ensuring approved contacts receive updates on availability.

Subscriptions for presence are handled via the user's roster, a server-maintained contact list queried using <iq/> stanzas in the 'jabber:iq:roster' namespace. A client retrieves its roster by sending an <iq type='get'/> query, to which the server responds with an <iq type='result'/> containing <item/> elements for each contact, including attributes for their JID and subscription state. Subscription states include 'none' (no subscription in either direction), 'to' (user receives contact's presence), 'from' (contact receives user's presence), and 'both' (mutual subscription), enabling controlled sharing of presence information. To request a subscription, a user sends a <presence type='subscribe'/> stanza to the contact's bare JID, which the recipient can approve with <presence type='subscribed'/> or deny with <presence type='unsubscribed'/>.

Basic messaging in XMPP relies on <message/> stanzas, categorized by the 'type' attribute to denote purpose: 'chat' for conversational messages, 'normal' for standalone messages (the default), 'headline' for notifications that expect no reply, and 'error' for reporting failures; 'groupchat' is used for multi-user chat. Each message includes a 'to' attribute for the recipient's JID, a 'from' attribute for the sender, and a <body/> element containing the text payload; an optional 'id' attribute aids tracking. 
Servers route these stanzas between clients, supporting directed delivery to specific resources or delivery based on availability. For enhanced reliability, XEP-0184 introduces delivery receipts, where a sender requests confirmation by including <request xmlns='urn:xmpp:receipts'/> in the message; the recipient's client responds with <received xmlns='urn:xmpp:receipts' id='message-id'/> upon delivery.

To support multi-device synchronization, XEP-0280 defines message carbons, enabling the server to copy incoming and outgoing messages to all connected resources of the user. Clients enable this feature by sending an <iq type='set'/> with <enable xmlns='urn:xmpp:carbons:2'/>, after which the server forwards eligible messages, wrapped in <received/> for inbound or <sent/> for outbound, using stanza forwarding (XEP-0297), excluding private or self-sent messages to prevent loops. This keeps conversation history consistent across devices without additional client-side polling.

Offline message handling adds reliability by having the server store undeliverable <message/> stanzas of type 'chat' or 'normal' until the recipient connects with an available resource. Upon reconnection, the server delivers stored messages in the order received, and XEP-0013 lets clients retrieve a count and manage the stored messages; 'headline' and 'error' types are discarded, while 'groupchat' triggers an immediate error. This mechanism, combined with presence subscriptions, forms the core of XMPP's real-time, resilient communication model.
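Two of the behaviors described above can be sketched compactly: answering a delivery receipt request, and the per-type offline handling rule. This is a minimal illustration with hypothetical function names; a real client would also apply the rules for when receipts must not be sent (for example, for messages that are themselves receipts).

```python
import xml.etree.ElementTree as ET

NS_RECEIPTS = "urn:xmpp:receipts"


def receipt_for(message_xml):
    """Build the receipt reply for a message that requests one, else None."""
    message = ET.fromstring(message_xml)
    sender = message.get("from")
    message_id = message.get("id")
    requested = message.find(f"{{{NS_RECEIPTS}}}request") is not None
    if not requested or sender is None or message_id is None:
        return None
    reply = ET.Element("message", {"to": sender})
    # The receipt echoes the id of the message being acknowledged.
    ET.SubElement(reply, "received", {"xmlns": NS_RECEIPTS, "id": message_id})
    return ET.tostring(reply, encoding="unicode")


def offline_action(message_type):
    """Server-side handling of a message for a user with no available resource."""
    kind = message_type or "normal"  # 'normal' is the default type
    if kind in ("chat", "normal"):
        return "store"
    if kind == "groupchat":
        return "error"
    return "discard"  # 'headline' and 'error'


if __name__ == "__main__":
    incoming = (
        "<message xmlns='jabber:client' from='juliet@example.com/balcony' "
        "to='romeo@example.net' id='msg1' type='chat'>"
        "<body>hello</body><request xmlns='urn:xmpp:receipts'/></message>"
    )
    print(receipt_for(incoming))
    print(offline_action("chat"), offline_action("headline"), offline_action("groupchat"))
```

The receipt is itself an ordinary <message/> stanza, so it travels over the same routing path as the message it acknowledges.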

Multi-User Chat

Multi-User Chat (MUC) in XMPP enables group conversations and conferencing through a dedicated service that allows multiple users to interact in virtual rooms. Defined in XEP-0045, this extension supports scalable, real-time text-based discussions by treating rooms as addressable entities and building on XMPP's core mechanisms for participation and messaging.

Rooms in MUC are addressed using Jabber IDs (JIDs) in the form room@service, where service is typically a subdomain like conference.domain dedicated to MUC services (e.g., coven@chat.shakespeare.lit). Occupants within a room are identified by occupant JIDs such as room@service/nickname (e.g., coven@chat.shakespeare.lit/thirdwitch), enabling precise targeting in interactions. To join a room, a user sends a directed presence stanza to their desired occupant JID, including an <x/> element in the http://jabber.org/protocol/muc namespace, which signals intent to participate and prompts the service to assign initial roles and affiliations.

MUC distinguishes between roles, which are transient privileges tied to an occupant's session, and affiliations, which represent persistent relationships to the room itself. Roles include moderator (highest privileges, including moderation actions), participant (able to send messages), and visitor (observation-only, without messaging capability); a none role indicates no active status. Affiliations encompass owner (full room control), admin (elevated administrative rights), member (entry permitted as participant), outcast (banned), and none (no special status). These are managed via IQ stanzas in the http://jabber.org/protocol/muc#admin or #owner namespaces.
| Role | Description | Privileges |
|---|---|---|
| Moderator | Highest in-room authority | Full moderation, including kicking occupants |
| Participant | Standard active member | Send messages to the room |
| Visitor | Restricted observer | Observe only, no messaging |
| None | No active role (e.g., pending join) | None |

| Affiliation | Description | Scope |
|---|---|---|
| Owner | Complete control over room | Persistent, across sessions |
| Admin | Administrative oversight | Persistent, configurable |
| Member | Authorized to join as participant | Persistent, list-based |
| Outcast | Permanently banned | Persistent exclusion |
| None | Default, no privileges | No special access |

Rooms can be configured as persistent, surviving the departure of all occupants and retaining configuration such as member lists, or temporary, in which case they are destroyed automatically when empty. This distinction supports both ad-hoc discussions and ongoing communities.

Message handling in MUC supports room-wide broadcasts via groupchat-type messages sent to the room JID, which the service delivers to all occupants unless moderation prevents it. Private messages to individuals use chat-type stanzas directed at an occupant JID, allowing side conversations within the room context. Because occupants are addressed by nickname, the room service maps each nickname to the occupant's real JID for routing.

Key features enhance usability and management. Room subjects can be set or changed by authorized users through a subject child element in a groupchat message, which broadcasts the update to all participants. Upon joining, occupants may receive recent history, limited by parameters like maxchars (a cap on total characters, e.g., 65000) to control how much is sent. Moderation actions such as kicking an occupant or changing a role are carried out with IQ stanzas, which many clients expose as text commands (e.g., /kick nickname), enabling owners and moderators to enforce rules dynamically.
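The join step and the role distinctions above can be sketched as follows. The function names and the `ROLE_CAN_SPEAK` table are hypothetical and simply encode the role descriptions given in this section; the room address in the usage example is illustrative.

```python
import xml.etree.ElementTree as ET

NS_MUC = "http://jabber.org/protocol/muc"

# Whether an occupant with each role may send messages to the room,
# following the role descriptions in this section.
ROLE_CAN_SPEAK = {"moderator": True, "participant": True, "visitor": False, "none": False}


def muc_join(room_jid, nickname, maxchars=None):
    """Build the directed presence that requests entry to a room.

    maxchars, if given, asks the service to limit the discussion history
    it sends on entry to that many characters.
    """
    presence = ET.Element("presence", {"to": f"{room_jid}/{nickname}"})
    x = ET.SubElement(presence, "x", {"xmlns": NS_MUC})
    if maxchars is not None:
        ET.SubElement(x, "history", {"maxchars": str(maxchars)})
    return ET.tostring(presence, encoding="unicode")


def may_send_to_room(role):
    """Return True if an occupant with this role can post groupchat messages."""
    return ROLE_CAN_SPEAK.get(role, False)


if __name__ == "__main__":
    print(muc_join("coven@chat.shakespeare.lit", "thirdwitch", maxchars=65000))
    print(may_send_to_room("participant"), may_send_to_room("visitor"))
```

The occupant JID in the `to` attribute doubles as the nickname request, which is why joining needs no separate registration step.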

Voice and Video Calls (Jingle)

The Jingle framework, defined in XEP-0166, provides an XMPP extension for negotiating and managing media sessions between entities using IQ stanzas for signaling. It enables the establishment of direct peer-to-peer connections for media exchange, avoiding reliance on centralized servers for the actual data transfer where possible. Session initiation follows an offer-answer model inspired by SIP, where the initiator sends a session-initiate IQ containing proposed content types (such as audio or video) and transport details, and the responder replies with a session-accept IQ to confirm or modify the parameters. The process includes actions like transport-info for exchanging ICE candidates to facilitate NAT traversal via STUN and TURN protocols, ensuring connectivity in varied network environments. Media transport typically uses RTP (with RTCP) over UDP for real-time audio and video streams, with support extending to file transfers through dedicated application types.

Key extensions enhance Jingle's capabilities, including XEP-0176 for the ICE-UDP transport method, which implements Interactive Connectivity Establishment (ICE) to gather and prioritize connection candidates for robust setup. Security is bolstered by XEP-0320, which integrates DTLS-SRTP for encrypting RTP media streams through a process involving fingerprint verification in session payloads. For group scenarios, Multiparty Jingle (XEP-0272) coordinates multiple one-to-one sessions via Multi-User Chat rooms, where participants advertise stream capabilities in presence updates and initiate pairwise Jingle negotiations.

In practice, Jingle powers voice and video calling in several XMPP clients, and because signaling travels over ordinary XMPP streams, calls can be set up across federated servers. Bandwidth management is addressed through optional <bandwidth> elements in session descriptions, specifying limits in kilobits per second (e.g., 128 kbps for audio) to optimize resource usage and adapt to network constraints.
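The shape of a session-initiate offer can be sketched as below. This is a skeleton only: it declares one audio content with an RTP description and an empty ICE-UDP transport, without the candidates, credentials, or DTLS fingerprint that a working call would need. The function name, stanza id format, and addresses are hypothetical; PCMU is included because payload type 0 at 8000 Hz is its static RTP assignment.

```python
import xml.etree.ElementTree as ET

NS_JINGLE = "urn:xmpp:jingle:1"
NS_RTP = "urn:xmpp:jingle:apps:rtp:1"
NS_ICE_UDP = "urn:xmpp:jingle:transports:ice-udp:1"


def session_initiate(initiator, responder, sid):
    """Build a minimal audio session-initiate IQ (no ICE candidates yet)."""
    iq = ET.Element("iq", {
        "from": initiator, "to": responder, "type": "set", "id": f"jingle-{sid}",
    })
    jingle = ET.SubElement(iq, "jingle", {
        "xmlns": NS_JINGLE, "action": "session-initiate",
        "initiator": initiator, "sid": sid,
    })
    content = ET.SubElement(jingle, "content", {"creator": "initiator", "name": "voice"})
    description = ET.SubElement(content, "description", {"xmlns": NS_RTP, "media": "audio"})
    # One offered codec; a real offer would usually list several.
    ET.SubElement(description, "payload-type", {"id": "0", "name": "PCMU", "clockrate": "8000"})
    # Candidates would be added here or sent later in transport-info actions.
    ET.SubElement(content, "transport", {"xmlns": NS_ICE_UDP})
    return ET.tostring(iq, encoding="unicode")


if __name__ == "__main__":
    print(session_initiate("romeo@example.net/orchard", "juliet@example.com/balcony", "a73sjjvkla37jfea"))
```

The responder's session-accept has the same structure with a different action value, which is what makes the exchange an offer-answer negotiation.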

Security and Encryption

XMPP provides security through layered protections, beginning with transport-level encryption and authentication to secure communications between clients and servers, as well as between servers. Transport Layer Security (TLS) is mandatory to negotiate, ensuring the confidentiality and integrity of XML streams; implementations must support secure TLS versions and cipher suites that provide forward secrecy, as outlined in RFC 7590. Server authentication via certificates is required, with clients verifying server identity using methods such as PKIX validation to prevent man-in-the-middle attacks.
Following TLS, the Simple Authentication and Security Layer (SASL) handles authentication using mechanisms such as PLAIN (used over TLS to protect credentials) and the mandatory-to-implement SCRAM-SHA-1 and SCRAM-SHA-1-PLUS for salted, password-based authentication. SASL negotiation takes place after TLS so that authentication is bound to the encrypted channel, and the stream is restarted after each step to ensure secure session establishment.
For end-to-end encryption (E2EE) beyond transport security, XMPP relies on extensions, since the core protocol lacks native support. OMEMO (XEP-0384) is the primary E2EE protocol, adapting the Signal Protocol's Double Ratchet and X3DH key agreement to provide forward secrecy, post-compromise security, and deniability in one-to-one and multi-user chats. It supports multi-device synchronization by publishing key bundles (identity keys, signed prekeys, and one-time prekeys) via the Personal Eventing Protocol (XEP-0163) and Publish-Subscribe (XEP-0060), allowing devices to retrieve them and establish encrypted sessions independently. As of 2025, OMEMO remains an Experimental XEP but provides robust protection against passive and active server-side attackers, though it does not protect metadata or prevent traffic analysis. Also as of 2025, work is progressing on integrating the Messaging Layer Security (MLS) protocol into XMPP through a proposed XEP, with the aim of improving group encryption and interoperability with other secure messaging standards.
Additional security features include channel binding in SASL mechanisms such as SCRAM-SHA-1-PLUS, which ties authentication to the TLS channel to mitigate downgrade attacks and improve resistance to compromised certificates; XEP-0440 specifies how servers advertise the channel-binding types they support. Key verification in OMEMO occurs through out-of-band methods, such as QR code scanning or fingerprint comparison, to confirm device identities and detect compromises. Earlier protocols like Off-the-Record (OTR) offered E2EE but handled movement between clients poorly and lacked multi-device support, which has made them legacy options relative to OMEMO.
Best practices for XMPP security emphasize obtaining valid server certificates from trusted authorities, so that clients can verify them without warnings, and configuring TLS cipher suites that enforce forward secrecy to protect past sessions from future key compromises. Servers should limit SASL retry attempts to counter dictionary attacks, and clients must enforce TLS negotiation while validating certificates rigorously. Jingle sessions for voice and video can add DTLS-based media encryption on top of these foundations.
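To illustrate why SCRAM-SHA-1 counts as a salted mechanism, the sketch below derives the client's final message and the expected server signature as laid out in RFC 5802, using only `hashlib` and `hmac`. It assumes no channel binding (the `c=biws` header of plain SCRAM-SHA-1, not the -PLUS variant) and an ASCII password, so SASLprep normalization is skipped; the function name is illustrative.

```python
import base64
import hashlib
import hmac
from typing import Tuple


def _attrs(message: str) -> dict:
    """Split a SCRAM message such as 'r=...,s=...,i=4096' into a dict."""
    return dict(part.split("=", 1) for part in message.split(","))


def scram_sha1_client_final(password: str, client_first_bare: str,
                            server_first: str) -> Tuple[str, str]:
    """Return (client-final-message, expected base64 server signature)."""
    server = _attrs(server_first)
    client_nonce = _attrs(client_first_bare)["r"]
    if not server["r"].startswith(client_nonce):
        raise ValueError("server nonce does not extend the client nonce")

    # SaltedPassword := Hi(password, salt, i), which is PBKDF2 with HMAC-SHA-1.
    salted = hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), base64.b64decode(server["s"]), int(server["i"])
    )
    client_key = hmac.new(salted, b"Client Key", hashlib.sha1).digest()
    stored_key = hashlib.sha1(client_key).digest()

    # "biws" is base64("n,,"): the GS2 header when no channel binding is used.
    without_proof = "c=biws,r=" + server["r"]
    auth_message = ",".join((client_first_bare, server_first, without_proof)).encode("utf-8")

    client_signature = hmac.new(stored_key, auth_message, hashlib.sha1).digest()
    proof = bytes(k ^ s for k, s in zip(client_key, client_signature))

    server_key = hmac.new(salted, b"Server Key", hashlib.sha1).digest()
    server_signature = hmac.new(server_key, auth_message, hashlib.sha1).digest()

    client_final = without_proof + ",p=" + base64.b64encode(proof).decode("ascii")
    return client_final, base64.b64encode(server_signature).decode("ascii")


if __name__ == "__main__":
    # Inputs taken from the worked example in RFC 5802, section 5.
    final, signature = scram_sha1_client_final(
        "pencil",
        "n=user,r=fyko+d2lbbFgONRv9qkxdawL",
        "r=fyko+d2lbbFgONRv9qkxdawL3rfcNHYJY1ZVvWVs7j,s=QSXCR+Q6sek8bf92,i=4096",
    )
    print(final)
    print("expect server to send v=" + signature)
```

The password never crosses the wire: the server checks the proof against its stored key, and the client checks the server's `v=` value against the signature computed here, which gives mutual authentication. The printed values can be compared with the exchange shown in RFC 5802.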

Service Discovery and PubSub

Service Discovery in XMPP enables entities to query and retrieve information about other entities' capabilities, identities, and associated items through standardized IQ stanzas. Defined in XEP-0030, this extension uses two primary query types: disco#info for discovering an entity's identity (a category and type, e.g., "conference" for a multi-user chat service) and supported features (e.g., "http://jabber.org/protocol/muc" for room support), and disco#items for listing child items or services linked to the entity, such as available chat rooms or nodes. Queries are directed to a Jabber ID (JID), optionally specifying a node for targeted discovery, and responses include structured elements such as <identity> for self-description and <feature> or <item> for capabilities and associations.
Publish-Subscribe (PubSub), specified in XEP-0060, extends XMPP with a node-based framework for disseminating updates from publishers to subscribers, supporting applications like news feeds and event notifications. Entities create or access nodes at a PubSub service, where publishers post items via <publish> IQ stanzas, triggering notifications to subscribers through <message> stanzas that contain <event> elements with item details or retractions. Subscriptions are managed with states such as "subscribed" or "pending," and multiple subscriptions to the same node are distinguished by SubIDs; temporary subscriptions (pubsub#tempsub) are short-lived and are not persisted. Access models include open (public subscription), authorize (owner approval required), presence (tied to presence subscriptions), roster (limited to roster groups), and whitelist (explicit list), ensuring controlled information flow.
PubSub integrates with Service Discovery: PubSub nodes and features can be queried via disco#items and disco#info, and the same mechanism is used to discover multi-user chat (MUC) rooms and roster group affiliations.
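As a rough illustration of the disco#info round trip, the sketch below builds a query and extracts identities and features from a reply. The function names, JIDs, and the sample response are made up for the example; only the namespace and element names come from XEP-0030.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

DISCO_INFO = "http://jabber.org/protocol/disco#info"


def disco_info_query(to: str, iq_id: str, node: Optional[str] = None) -> ET.Element:
    """IQ-get asking an entity (optionally one of its nodes) to describe itself."""
    iq = ET.Element("iq", {"type": "get", "to": to, "id": iq_id})
    attrs = {"xmlns": DISCO_INFO}
    if node is not None:
        attrs["node"] = node
    ET.SubElement(iq, "query", attrs)
    return iq


def parse_disco_info(xml_text: str) -> Tuple[List[Tuple[str, str, Optional[str]]], List[str]]:
    """Return ([(category, type, name), ...], [feature var, ...]) from a reply."""
    root = ET.fromstring(xml_text)
    query = root.find(f"{{{DISCO_INFO}}}query")
    if query is None:
        raise ValueError("no disco#info payload in stanza")
    identities = [
        (i.get("category"), i.get("type"), i.get("name"))
        for i in query.findall(f"{{{DISCO_INFO}}}identity")
    ]
    features = [f.get("var") for f in query.findall(f"{{{DISCO_INFO}}}feature")]
    return identities, features


# Hypothetical reply from a multi-user chat service.
SAMPLE_RESULT = """
<iq type='result' from='chat.example.org' id='disco1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <identity category='conference' type='text' name='Chatrooms'/>
    <feature var='http://jabber.org/protocol/disco#info'/>
    <feature var='http://jabber.org/protocol/muc'/>
  </query>
</iq>
"""

if __name__ == "__main__":
    print(ET.tostring(disco_info_query("chat.example.org", "disco1"), encoding="unicode"))
    identities, features = parse_disco_info(SAMPLE_RESULT)
    supports_muc = "http://jabber.org/protocol/muc" in features
    print(identities, supports_muc)
```

A client would typically cache the result per JID and consult the feature list before offering a capability, for example before showing a "join room" option for a service.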
Personal Eventing Protocol (PEP, XEP-0163) profiles PubSub for user-centric applications, treating a user's bare JID as a virtual PubSub service that broadcasts personal updates to roster contacts via auto-subscriptions based on presence. PubSub also supports hierarchical nodes through collection nodes (XEP-0248), where subscribers can aggregate notifications from child nodes with configurable depth limits (e.g., "all" or integer values), and item retraction via <retract> elements, which remove published content while optionally notifying subscribers.
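The publish and notify flow can be sketched as one builder and one parser. This is a minimal illustration that assumes a hypothetical service `pubsub.example.org`, a node named `news`, and an Atom entry as payload; the helper names are not part of any library, while the namespaces and element names follow XEP-0060.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

PUBSUB_NS = "http://jabber.org/protocol/pubsub"
EVENT_NS = "http://jabber.org/protocol/pubsub#event"


def publish_iq(service: str, node: str, item_id: str,
               payload: ET.Element, iq_id: str) -> ET.Element:
    """IQ-set that publishes one item (with an XML payload) to a node."""
    iq = ET.Element("iq", {"type": "set", "to": service, "id": iq_id})
    pubsub = ET.SubElement(iq, "pubsub", {"xmlns": PUBSUB_NS})
    publish = ET.SubElement(pubsub, "publish", {"node": node})
    item = ET.SubElement(publish, "item", {"id": item_id})
    item.append(payload)
    return iq


def parse_event(xml_text: str) -> Optional[Tuple[str, List[str], List[str]]]:
    """Return (node, published item ids, retracted item ids), or None if no event."""
    message = ET.fromstring(xml_text)
    items = message.find(f"{{{EVENT_NS}}}event/{{{EVENT_NS}}}items")
    if items is None:
        return None
    published = [i.get("id") for i in items.findall(f"{{{EVENT_NS}}}item")]
    retracted = [r.get("id") for r in items.findall(f"{{{EVENT_NS}}}retract")]
    return items.get("node"), published, retracted


# Hypothetical notification as a subscriber might receive it.
SAMPLE_EVENT = """
<message from='pubsub.example.org' to='reader@example.org'>
  <event xmlns='http://jabber.org/protocol/pubsub#event'>
    <items node='news'>
      <item id='a1'>
        <entry xmlns='http://www.w3.org/2005/Atom'><title>Hello</title></entry>
      </item>
      <retract id='a0'/>
    </items>
  </event>
</message>
"""

if __name__ == "__main__":
    entry = ET.Element("entry", {"xmlns": "http://www.w3.org/2005/Atom"})
    ET.SubElement(entry, "title").text = "Hello"
    request = publish_iq("pubsub.example.org", "news", "a1", entry, "pub1")
    print(ET.tostring(request, encoding="unicode"))
    print(parse_event(SAMPLE_EVENT))
```

With PEP the same stanzas apply, except that the service address is the publishing user's own bare JID rather than a dedicated PubSub component.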

Implementations

Servers

XMPP servers are software implementations that handle the core presence and messaging functions of the protocol, enabling users to connect, communicate, and federate across domains. Prominent open-source options include ejabberd, Prosody, Openfire, MongooseIM, and Tigase, each offering distinct strengths in performance, ease of use, and extensibility.
ejabberd, developed by ProcessOne and written in Erlang, is known for its high scalability and fault tolerance, making it suitable for large-scale deployments. It supports native clustering to distribute load across multiple nodes, and 2016 benchmark tests demonstrated more than 2 million concurrent users on a single node. The server has a modular architecture with plugins for numerous XMPP Extension Protocols (XEPs), including multi-user chat. As an active project in 2025, ejabberd remains widely used for both community and enterprise self-hosting, with straightforward configuration of federation via s2s (server-to-server) connections. A commercial variant, ejabberd Business Edition, provides additional support and optimizations while building on the open-source core.
Prosody, implemented in Lua, emphasizes lightweight resource usage and simplicity, which suits small to medium-sized setups where efficiency is key. It includes built-in clustering capabilities for scaling and a flexible plugin system that eases integration of XEPs for features like pubsub. Deployment is rapid, often achievable in minutes, with clear documentation for self-hosting on various platforms and for enabling federation through simple module configurations. Prosody continues as an actively maintained project in 2025, with version 13.0.2 released in May, supporting modern XMPP compliance suites.
Openfire, a Java-based server from the Ignite Realtime community, focuses on enterprise-oriented features such as a web-based administration interface and a robust plugin ecosystem for extending functionality with XEPs.
Openfire supports clustering through plugins for redundancy and load balancing, accommodating scalable real-time collaboration in organizational environments. Self-hosting is possible across multiple operating systems, with federation configured via built-in s2s support. The project remains active in 2025, as shown by the release of version 5.0.2 in September.
MongooseIM, also Erlang-based and developed by Erlang Solutions, targets mobile and IoT applications, with strong support for high-availability clustering and metrics integration, and handles large-scale push notifications and syncing. Tigase, written in Java, prioritizes performance and security for high-throughput environments, with built-in support for HTTP/BOSH and WebSockets alongside extensive XEP compliance for enterprise messaging.
| Server | Language | Key Strengths | Clustering Support | XEP Modularity | Performance Example |
| --- | --- | --- | --- | --- | --- |
| ejabberd | Erlang | Scalability, fault tolerance | Native | High | 2M+ concurrent users/node (2016 benchmark) |
| Prosody | Lua | Lightweight, easy setup | Built-in | Flexible plugins | Suitable for small to medium deployments (thousands of users) |
| Openfire | Java | Enterprise admin, plugins | Via plugins | Extensive ecosystem | Scalable for medium deployments |
| MongooseIM | Erlang | Mobile/IoT, high availability | Native | High | Optimized for push and real-time |
| Tigase | Java | Performance, security | Built-in | Extensive | High-throughput environments |
These servers can be self-hosted with minimal overhead, typically requiring only basic DNS setup for the served domains and TLS certificates for secure connections, which also ensures interoperability with other XMPP instances. While commercial offerings such as Cisco's IM and Presence service use XMPP for messaging, the open-source variants dominate due to their flexibility and community-driven updates.

Clients

XMPP clients are end-user applications that enable individuals to connect to XMPP servers for instant messaging, presence sharing, and other real-time communication features. These clients vary by platform, emphasizing usability, security, and compatibility with XMPP Extension Protocols (XEPs) to support functionality such as multi-user chat.
Desktop clients provide robust interfaces for prolonged sessions on personal computers. Gajim, developed in Python, is a full-featured client with comprehensive support for numerous XEPs, including advanced capabilities such as message archiving, making it suitable for users seeking extensibility. Pidgin is a lightweight, multi-protocol messenger that integrates XMPP alongside other networks such as IRC, so users can reach several networks from one application without heavy resource demands.
On mobile devices, clients prioritize battery efficiency and native integration. Conversations, an open-source Android application, focuses on security through built-in support for OMEMO encryption (OMEMO Multi-End Message and Object Encryption, based on the Signal Protocol), alongside features like offline message delivery. Monal is a user-friendly client with a modern interface for both iOS and macOS, with ongoing onboarding and UI/UX improvements funded by initiatives like NLnet.
Web-based clients provide browser access without installation, using HTTP-based transports. Converse.js is a JavaScript client that embeds XMPP functionality into websites, connecting via BOSH (bidirectional streams over HTTP) or WebSockets. Jitsi Meet uses XMPP for signaling in its open-source video conferencing platform, relying on the protocol to manage authentication, in-call chat, and media negotiation across sessions.
As of 2025, XMPP client development trends emphasize end-to-end encryption (E2EE) protocols like OMEMO to enhance privacy, alongside push notifications for reliable mobile alerting even when apps are inactive. The user base remains stable and privacy-oriented, with adoption spread across federated communication use cases, though XMPP clients face competition from centralized alternatives.

Libraries and Development Tools

XMPP libraries provide reusable components for developers to implement the protocol in various programming languages, supporting core specifications like RFC 6120 for stream management and authentication, as well as key extensions such as XEP-0045 for multi-user chat and XEP-0166 for Jingle sessions.
One prominent library is Smack, an open-source Java-based XMPP client library developed by the Ignite Realtime community, which is highly modular and portable across Java SE and Android environments. Smack supports the core RFCs, RFC 6120 and RFC 6121, covering resource binding and instant messaging, along with numerous XEPs such as XEP-0030 for service discovery and XEP-0198 for stream management to enable reliable connections. It supports real-time communication features like presence updates and message exchange without requiring developers to handle low-level XML parsing.
For Python developers, SlixMPP is a modern, event-based XMPP library forked from SleekXMPP, offering asynchronous support via Python's asyncio framework for efficient handling of concurrent connections. It complies with core RFCs like RFC 6120 and supports essential XEPs including XEP-0115 for entity capabilities and XEP-0280 for message carbons, making it suitable for building bots, clients, or components with non-blocking I/O. The library emphasizes threadless operation, which lets it integrate into async applications while maintaining compatibility with Python 3.7 and later.
In the JavaScript ecosystem, Stanza (also known as StanzaJS) provides a modern XMPP library for Node.js and browser environments, abstracting the protocol into a JSON-friendly API to simplify development without direct XML manipulation. It supports fundamental RFCs such as RFC 6120 for stream setup and key XEPs like XEP-0082 for date formatting and XEP-0199 for XMPP ping, enabling features like real-time presence and messaging in web or server-side applications. Stanza's design focuses on type safety with TypeScript bindings, supporting scalable implementations for Node.js-based services.
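Stream management (XEP-0198), which libraries such as Smack implement for reliable connections, rests on a simple counting rule: each side counts the stanzas it has handled and reports that count as `h`, and the sender discards its copies of anything covered by the count. The class below is a hypothetical, library-independent sketch of the sender's bookkeeping only; the class and method names are assumptions, and enabling, resumption, and the wire format are left out.

```python
from collections import deque
from typing import Deque, List


class OutboundAckTracker:
    """Keeps sent stanzas until the peer's <a h='N'/> confirms it handled them."""

    MODULUS = 2 ** 32  # the h counter wraps around at 2^32 in XEP-0198

    def __init__(self) -> None:
        self.last_acked = 0               # last h value received from the peer
        self.unacked: Deque[str] = deque()

    def sent(self, stanza: str) -> None:
        """Record a stanza that was written to the stream."""
        self.unacked.append(stanza)

    def handle_ack(self, h: int) -> int:
        """Process an ack with counter h; return how many stanzas it confirmed."""
        newly_acked = (h - self.last_acked) % self.MODULUS
        if newly_acked > len(self.unacked):
            raise ValueError("peer acknowledged more stanzas than were sent")
        for _ in range(newly_acked):
            self.unacked.popleft()
        self.last_acked = h
        return newly_acked

    def pending(self) -> List[str]:
        """Stanzas to resend if the stream is resumed after a disconnect."""
        return list(self.unacked)


if __name__ == "__main__":
    tracker = OutboundAckTracker()
    for text in ("<message id='1'/>", "<message id='2'/>", "<message id='3'/>"):
        tracker.sent(text)
    tracker.handle_ack(2)      # peer reports it has handled two stanzas
    print(tracker.pending())   # only the third stanza is still unconfirmed
```

Because the counter is cumulative, a lost or delayed ack is harmless: the next one confirms everything up to its `h` value.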
Development tools aid in ensuring protocol adherence and troubleshooting. The XMPP Compliance Tester, an open-source web-based and command-line utility maintained by the Conversations.im project, evaluates servers and clients against XSF-defined suites like XEP-0459 (2022) and XEP-0479 (2023), testing support for the XEPs required in each category. Compliance levels include Core (basic functionality like TLS and SASL) and Advanced (enhanced features like OMEMO via XEP-0384), helping developers verify conformance without manual testing. For debugging, Wireshark includes a built-in XMPP dissector that parses stanzas on port 5222, allowing packet inspection for issues in stream negotiation or extension handling.
The XSF fosters a robust development ecosystem through resources like the xmpp.org software directory, which catalogs libraries and tools, and detailed compliance guidelines to promote consistent implementations. Libraries often integrate with WebRTC for voice and video, as seen in examples using the Jingle WebRTC package to negotiate peer-to-peer sessions via XEP-0340 and XEP-0343 for data channels.
As of 2025, the ecosystem has seen enhancements in asynchronous capabilities and new bindings; for instance, the go-xmpp library released version 0.2.18 in October, improving support for modern Go concurrency patterns in XMPP clients and components. Rust bindings like xmpp-rs use Tokio for async I/O, providing type-safe APIs for core protocol elements and extensions, while work continues on full compliance with recent XEPs.

History and Development

Origins as Jabber

Jeremie Miller began the Jabber project in 1998 as an open-source initiative to develop a decentralized instant messaging system, motivated by the need for an interoperable alternative to proprietary services like AOL Instant Messenger that dominated the market and restricted cross-network communication. Announced publicly in January 1999, the project emphasized community collaboration and drew inspiration from existing open technologies to enable real-time messaging and presence awareness without central control.
Early development centered on creating extensible protocols using XML streams, which allowed for flexible data exchange and left room for evolving needs. The initial server software, jabberd 1.0, was released in May 2000, providing a stable foundation for core functions such as message routing and user presence, while the community rapidly contributed clients, libraries, and enhancements. The XML focus made parsing and extension straightforward, in line with the project's goal of broad applicability beyond basic chat.
Key milestones included the establishment of the first federated network in 2000, enabled by the server dialback mechanism introduced in jabberd 1.2 in October, which allowed secure inter-server communication and prevented spoofing across domains. By 2002, the network had expanded significantly, with deployments on thousands of domains and over a million users worldwide, demonstrating the viability of a distributed, open ecosystem. This growth underscored Jabber's success as a counter to centralized proprietary systems, paving the way for its formal standardization as XMPP.

IETF Standardization

In 2002, the Jabber Software Foundation (JSF) submitted the Jabber protocols to the Internet Engineering Task Force (IETF) as an Internet-Draft, marking the formal beginning of efforts to standardize the technology for near-real-time messaging and presence. In late 2002, the IETF chartered the XMPP Working Group to adapt the base Jabber protocol into a suitable specification for instant messaging and presence, focusing on core features like XML stream management, authentication, and resource binding. The working group emphasized interoperability through rigorous testing, including a major event in 2005 organized under the IETF's NEWTRK area to validate implementations across diverse entities.
This process culminated in October 2004 with the publication of RFC 3920, which defined the core XMPP protocol including stream setup, teardown, encryption via TLS, and SASL authentication, and RFC 3921, which specified instant messaging and presence extensions building on the core. These documents established XMPP as a Proposed Standard, providing a stable foundation for decentralized, federated communication while allowing extensibility through XML namespaces. Following the RFC publications, the JSF continued managing protocol development and renamed itself the XMPP Standards Foundation (XSF) in 2007 to align with the IETF-adopted terminology.
Over time, experience from deployments revealed areas for refinement, leading to revisions in 2011. RFC 6120 obsoleted RFC 3920 by clarifying core mechanisms such as stream negotiation and error handling, while RFC 6121 obsoleted RFC 3921 with updates to instant messaging and presence. Standardization significantly broadened XMPP's adoption, notably enabling the launch of Google Talk in August 2005, which federated with public XMPP services until interoperability restrictions were imposed in 2013.

Modern Evolution and IoT Applications

In the 2020s, XMPP has experienced a resurgence driven by growing concerns over centralized platforms' practices and monopolization, positioning it as a key enabler for decentralized communication ecosystems. This revival aligns with broader movements toward digital sovereignty, particularly where regulations emphasize open standards and vendor independence. For instance, XMPP's federated architecture supports self-hosted servers and privacy-enhancing transports, giving users greater control over their data without relying on third-party services.
A prominent example of this trend is its adoption in federated social networking, exemplified by Movim, an open-source platform that uses XMPP for decentralized blogging, chat, and community features across interconnected servers. Movim acts as a frontend for the XMPP network, enabling users to engage in social interactions while remaining interoperable with other XMPP-based services, and so offers a resilient, censorship-resistant alternative to centralized social media. This application highlights XMPP's extensibility in supporting modern social paradigms without compromising its core decentralized principles.
In IoT applications, XMPP has evolved to support secure, real-time communication among constrained devices, using extensions like PubSub (XEP-0060) for efficient event publishing and subscription, which allows sensors and actuators to broadcast updates without constant polling. Authentication via SASL/EXTERNAL enables machine-to-machine (M2M) interactions with certificate-based trust, supporting scalability in resource-limited environments. Practical deployments include smart home systems, such as the Logitech Harmony Hub, which has connected millions of devices since 2010 using XMPP for control and status reporting, and sensor networks on microcontrollers optimized for low-bandwidth, constrained networks.
These integrations demonstrate XMPP's robustness for IoT, supporting bidirectional data flows in healthcare monitoring and industrial automation.
Further modern evolutions include Message Archive Management (MAM, XEP-0313), which provides server-side storage and retrieval of messages with filtering by time, sender, or ID, improving reliability for mobile and intermittent connections by syncing conversation history across devices. Complementing this, Push Notifications (XEP-0357) enable offline alerting through a two-tier system that connects XMPP PubSub with third-party push services, ensuring timely delivery of events such as new messages. XMPP also plays a role in voice and video applications via the Jingle extension (XEP-0166), which handles signaling for audio/video sessions, and in online gaming for real-time presence and chat in multi-user environments.
Despite these advancements, XMPP faces competition from protocols like Matrix, which offers built-in end-to-end encryption and synchronized room state that appeal to modern chat apps, potentially drawing users away from XMPP and its XML-based overhead. However, XMPP maintains an edge in IoT due to its maturity, lightweight extensions for constrained devices, and proven use in M2M scenarios, where Matrix's higher resource demands can pose challenges.
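The filtering that MAM offers is expressed as a small data form inside an IQ. The sketch below assembles such a query with optional `with`, `start`, and `end` filters and a page size; the function and argument names and the example values are assumptions, while the `urn:xmpp:mam:2`, `jabber:x:data`, and result-set-management namespaces are those used by XEP-0313. Timestamps must be timezone-aware and are rendered in UTC.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from typing import Optional

MAM_NS = "urn:xmpp:mam:2"
DATA_NS = "jabber:x:data"
RSM_NS = "http://jabber.org/protocol/rsm"


def _add_field(form: ET.Element, var: str, value: str,
               field_type: Optional[str] = None) -> None:
    attrs = {"var": var}
    if field_type is not None:
        attrs["type"] = field_type
    field = ET.SubElement(form, "field", attrs)
    ET.SubElement(field, "value").text = value


def _utc_stamp(moment: datetime) -> str:
    """Format an aware datetime as an XEP-0082 style UTC timestamp."""
    if moment.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def mam_query(iq_id: str, query_id: str, with_jid: Optional[str] = None,
              start: Optional[datetime] = None, end: Optional[datetime] = None,
              max_items: Optional[int] = None) -> ET.Element:
    """IQ-set asking the archive for messages that match the given filters."""
    iq = ET.Element("iq", {"type": "set", "id": iq_id})
    query = ET.SubElement(iq, "query", {"xmlns": MAM_NS, "queryid": query_id})
    form = ET.SubElement(query, "x", {"xmlns": DATA_NS, "type": "submit"})
    _add_field(form, "FORM_TYPE", MAM_NS, "hidden")
    if with_jid is not None:
        _add_field(form, "with", with_jid)
    if start is not None:
        _add_field(form, "start", _utc_stamp(start))
    if end is not None:
        _add_field(form, "end", _utc_stamp(end))
    if max_items is not None:
        # Result Set Management caps the number of messages per page.
        rsm = ET.SubElement(query, "set", {"xmlns": RSM_NS})
        ET.SubElement(rsm, "max").text = str(max_items)
    return iq


if __name__ == "__main__":
    since = datetime(2025, 1, 1, tzinfo=timezone.utc)
    request = mam_query("mam1", "q1", with_jid="juliet@capulet.example",
                        start=since, max_items=20)
    print(ET.tostring(request, encoding="unicode"))
```

The archive answers with a series of messages tagged with the same `queryid`, followed by an IQ-result that says whether more pages remain; handling those replies is outside this sketch.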

Standards and Interoperability

Core RFC Specifications

The core specifications for the Extensible Messaging and Presence Protocol (XMPP) are defined in a series of Internet Engineering Task Force (IETF) Request for Comments (RFCs) that establish the foundational architecture for real-time XML-based communication. These documents outline the protocol's mechanisms for stream management, data exchange, security, and basic functionality, serving as the mandatory baseline for XMPP implementations. All core RFCs hold Proposed Standard status within the IETF's standards track, with no substantive revisions to the primary documents since their 2011 publication, though errata have been applied and complementary RFCs have addressed specific aspects like addressing and transport bindings.
RFC 6120, titled "Extensible Messaging and Presence Protocol (XMPP): Core," published in March 2011 and authored by Peter Saint-Andre, defines the essential protocol elements for XMPP, including the setup and teardown of bidirectional XML streams over TCP connections. It specifies stream headers with attributes such as to, from, id, version, and xml:lang to initiate communication between clients and servers or between servers, enabling feature negotiation for security and capabilities. The document introduces stanzas as the fundamental units of communication: <message/> for message payloads, <presence/> for status updates, and <iq/> (Info/Query) for request-response interactions. Each stanza carries routing attributes like to, from, id, type, and xml:lang, and stanzas are processed in strict order to ensure reliability. Authentication is handled through integration with the Simple Authentication and Security Layer (SASL) as per RFC 4422, supporting mechanisms such as SCRAM-SHA-1, EXTERNAL, and PLAIN after mandatory Transport Layer Security (TLS) negotiation using cipher suites like TLS_RSA_WITH_AES_128_CBC_SHA; successful SASL authentication triggers a stream restart. Error handling for streams and stanzas is also detailed, promoting interoperability in near-real-time exchanges.
RFC 6120 obsoletes the earlier RFC 3920 from 2004, refining core methods without altering the XML streaming paradigm.
Complementing the core, RFC 6121, also published in March 2011 and authored by Peter Saint-Andre, extends XMPP to support instant messaging (IM) and presence features in conformance with the requirements for IM and presence services set out in RFC 2779. It details roster management via IQ stanzas in the jabber:iq:roster namespace, allowing entities to add, update, or delete contacts with subscription states such as "none," "to," "from," or "both," and includes versioning via a ver attribute for efficient synchronization. Presence handling involves subscription requests and responses using presence stanzas with types like "subscribe," "subscribed," "unsubscribe," and "unsubscribed," enabling directed presence, presence probes, and broadcasts of availability (e.g., initial presence on login, subsequent updates, or "unavailable" notifications to subscribed resources). Messaging semantics cover one-to-one exchanges through message stanzas with types such as "chat," "normal," or "groupchat," incorporating elements like <body/> for text, <subject/> for topics, and <thread/> for conversation threading, with rules for delivery based on recipient availability and resource selection. This specification obsoletes RFC 3921 from 2004, emphasizing semantic clarity for IM sessions and presence subscriptions to support contact list maintenance and status sharing.
Several related RFCs build on these foundations to address specific protocol aspects, ensuring robustness in addressing, security, and transport. RFC 7622 (September 2015), authored by Peter Saint-Andre, standardizes the XMPP address format (Jabber ID or JID) as localpart@domainpart/resourcepart, extending support for internationalized domain names and non-ASCII code points, which enhances global usability.
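The localpart@domainpart/resourcepart layout can be split mechanically: everything after the first "/" is the resourcepart, and within the remainder everything before the first "@" is the localpart. The sketch below follows that order and enforces the 1023-octet limit per part from RFC 7622. It deliberately omits the PRECIS string preparation and domain validation that the RFC also requires, and the `JID` tuple and `parse_jid` name are illustrative.

```python
from typing import NamedTuple, Optional

MAX_PART_BYTES = 1023  # per-part size limit in RFC 7622, counted in UTF-8 octets


class JID(NamedTuple):
    local: Optional[str]
    domain: str
    resource: Optional[str]

    @property
    def bare(self) -> str:
        """The address without its resourcepart."""
        return f"{self.local}@{self.domain}" if self.local else self.domain


def parse_jid(text: str) -> JID:
    """Split a JID into its parts without applying PRECIS normalization."""
    # The resourcepart starts after the FIRST '/', and may itself contain '/' or '@'.
    rest, slash, resource = text.partition("/")
    if slash and not resource:
        raise ValueError("empty resourcepart")
    local: Optional[str]
    local, at, domain = rest.partition("@")
    if at:
        if not local:
            raise ValueError("empty localpart")
    else:
        # No '@' present: the whole remainder is the domainpart.
        local, domain = None, rest
    if not domain:
        raise ValueError("empty domainpart")
    for part in (local, domain, resource or None):
        if part is not None and len(part.encode("utf-8")) > MAX_PART_BYTES:
            raise ValueError("JID part exceeds 1023 octets")
    return JID(local, domain, resource or None)


if __name__ == "__main__":
    jid = parse_jid("juliet@example.com/balcony")
    print(jid, jid.bare)
    print(parse_jid("example.com"))
    print(parse_jid("room@muc.example.org/nick@home/pc"))
```

Splitting on the first "/" before looking for "@" matters because a resourcepart may legally contain both characters, as the last example shows.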
RFC 7590 (June 2015), authored by Peter Saint-Andre and Thijs Alkemade, sets out how TLS is to be used in XMPP streams, including support for TLS 1.2 and strong cipher suites, updating RFC 6120's requirements to counter evolving threats. Additionally, RFC 7395 (October 2014), authored by Lance Stout, Jack Moffitt, and Eric Cestari, defines a subprotocol for binding XMPP to WebSocket (RFC 6455), allowing browser-based clients to establish streams over a WebSocket endpoint while preserving stanza semantics, thus enabling web integration without altering core behaviors. These documents collectively maintain XMPP's extensibility while enforcing a stable, secure baseline.
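The stream model in RFC 6120 (one long-lived XML document whose root is the stream header and whose first-level children are stanzas) maps naturally onto an incremental parser. The sketch below uses `xml.etree.ElementTree.XMLPullParser` from the standard library to record the header attributes and hand back each stanza once its closing tag arrives. The `StanzaReader` class is a hypothetical illustration: it never discards parsed stanzas from the tree and does no TLS, SASL, or error handling, so it is not suitable for long-running use as written.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional


class StanzaReader:
    """Feeds raw stream text to a pull parser and yields completed stanzas."""

    def __init__(self) -> None:
        self._parser = ET.XMLPullParser(events=("start", "end"))
        self._depth = 0
        self.stream_attrs: Optional[Dict[str, str]] = None

    def feed(self, data: str) -> List[ET.Element]:
        """Parse another chunk; return the stanzas that were completed by it."""
        self._parser.feed(data)
        stanzas = []
        for event, element in self._parser.read_events():
            if event == "start":
                self._depth += 1
                if self._depth == 1:
                    # The root element is the stream header itself.
                    self.stream_attrs = dict(element.attrib)
            else:
                self._depth -= 1
                if self._depth == 1:
                    # A first-level child just closed: that is one whole stanza.
                    stanzas.append(element)
        return stanzas


if __name__ == "__main__":
    reader = StanzaReader()
    reader.feed(
        "<stream:stream xmlns='jabber:client' "
        "xmlns:stream='http://etherx.jabber.org/streams' "
        "from='example.com' id='s1' version='1.0'>"
    )
    print(reader.stream_attrs)
    chunk = (
        "<presence/>"
        "<message to='juliet@example.com' type='chat'><body>hi</body></message>"
    )
    for stanza in reader.feed(chunk):
        print(stanza.tag)
```

Stanzas come back in arrival order, which matches the in-order processing rule of RFC 6120; element tags carry the stream's default namespace, for example `{jabber:client}message`.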

XMPP Extension Protocols

XMPP Extension Protocols (XEPs) are the primary mechanism for extending the core XMPP specifications, developed and maintained by the XMPP Standards Foundation (XSF). These protocols are documented as numbered specifications, beginning with XEP-0001, which describes the XEP process and serves as the foundational document for the entire series. Each XEP follows a structured template, including sections for legal notices, author information, and technical content, ensuring consistency across the collection.
The lifecycle of a XEP progresses through defined stages to promote rigorous review and stability: it begins as Experimental for initial development and testing; advances to Proposed after community feedback during a period of at least 14 days; reaches Stable following extensive review and a minimum six-month period; and achieves Final status after another six months in Stable, requiring at least two independent implementations (one open-source) and a Call for Experience. XEPs can also be marked as Deprecated if superseded by newer protocols, potentially with an expiration date to encourage migration. As of November 2024, there are 47 Experimental, 76 Stable, and 11 Final XEPs (134 active across these statuses), out of 490 total documents.
Prominent examples include XEP-0384 for OMEMO Encryption, which provides end-to-end encryption using a double-ratchet scheme inspired by Signal for secure multi-device messaging; XEP-0313 for Message Archive Management (MAM), enabling server-side storage and retrieval of chat histories to support offline access and synchronization; and XEP-0166 for Jingle, which supports media sessions like voice and video calls over XMPP signaling.
The development process starts with ProtoXEPs, informal proposals submitted by community members to the XSF Editor for initial formatting and placement on the XMPP Council's agenda.
The Council, composed of elected technical experts, votes on advancement: a +1 vote (without -1 vetoes) is required to move a ProtoXEP to Experimental status and assign it a number, while further progression to Stable or Final demands broader consensus, implementation evidence, and documentation. Interoperability is emphasized through requirements for multiple independent implementations and community-driven testing during the Call for Experience phase, ensuring practical deployability across diverse software ecosystems. This extensible framework allows XMPP to incorporate new features, such as encrypted messaging, media negotiation, and archival persistence, without modifying the foundational RFCs, forming the building blocks for a wide array of modern applications while preserving backward compatibility.
XMPP faces competition from several protocols designed for real-time communication, each offering distinct advantages in simplicity, scalability, or specialization. Matrix, a decentralized protocol using JSON over HTTP, emphasizes eventual consistency and room-based synchronization, making it suitable for modern, persistent chat applications with features like end-to-end encryption via Olm/Megolm. In contrast to XMPP's XML-based streams, Matrix's architecture integrates more easily with web technologies but can make federation more complex because of its state synchronization model.
Internet Relay Chat (IRC) serves as a lightweight alternative for basic text-based group communication, lacking native support for presence information and other features that XMPP provides through extensions. IRC's channel-oriented model excels in low-overhead, server-centric environments but requires add-ons for more advanced features, positioning it as a simpler yet less extensible option compared to XMPP's federated ecosystem. For voice and video, the Session Initiation Protocol (SIP) competes directly in VoIP scenarios, focusing on call setup and media negotiation rather than XMPP's broader messaging and presence capabilities.
While both SIP and XMPP are IETF standards, SIP is well established in telephony integrations, whereas XMPP extensions like Jingle enable similar functionality alongside messaging and presence. In IoT contexts, Message Queuing Telemetry Transport (MQTT) offers a lighter-weight alternative to XMPP, employing a publish-subscribe model optimized for low-bandwidth, constrained devices, with minimal overhead and configurable quality-of-service levels. MQTT's simplicity makes it preferable for constrained networks, though it lacks XMPP's built-in presence and its extensibility for bidirectional, human-readable interactions.
XMPP supports interoperability with other systems through gateways, described in XEP-0100, which outlines best practices for proxying connections to legacy or non-native services for cross-platform messaging. These gateways allow XMPP users to interact with external networks by emulating user agents and translating messages, presence, and rosters where native federation is unavailable. XMPP's native federation enables server-to-server communication across independent domains, a core strength compared with Matrix's reliance on bridges for connecting to non-Matrix protocols, which can introduce latency and metadata leakage. However, as services like Signal prioritize end-to-end encryption and minimal metadata retention, XMPP's adoption has waned in consumer spaces due to ecosystem fragmentation. By 2025, XMPP maintains a niche in IoT applications for its secure, real-time device orchestration, while Matrix gains traction in social and collaborative environments through improved client ecosystems and bridge support.