Class diagram
A class diagram is a type of static structure diagram in the Unified Modeling Language (UML) that describes the structure of a system by showing its classes, their attributes, operations (or methods), and the relationships among those classes.[1] Class diagrams play a central role in object-oriented software engineering by providing a visual representation of the static architecture of a system, enabling designers to model entities, their properties, behaviors, and interactions without specifying dynamic behavior.[2] They are essential for requirements analysis, system design, and documentation, facilitating communication between developers, architects, and stakeholders while serving as a blueprint for implementation in languages like Java or C++.[3]
Key elements of a class diagram include classes, depicted as rectangles divided into three compartments: the top for the class name, the middle for attributes (e.g., data members with visibility indicators like public or private), and the bottom for operations (e.g., methods with parameters and return types).[4] Relationships are shown via connecting lines, such as associations (general links between classes), generalization (inheritance, using a hollow arrow), aggregation and composition (whole-part relationships, with diamond symbols), and dependencies (one class relying on another, shown as a dashed arrow).[5] These diagrams adhere to the UML standard maintained by the Object Management Group (OMG), with version 2.5.1 being the current specification as of 2017.[6]
Overview
Definition and Purpose
A class diagram is a static structure diagram in the Unified Modeling Language (UML) that depicts the static aspects of a system by illustrating its classes, along with their attributes, operations, and relationships between classes.[4] It provides a blueprint for the system's architecture at the level of classes and interfaces, including their features, constraints, and associations.[7] As defined in the UML specification maintained by the Object Management Group (OMG), class diagrams are part of the graphical notation for visualizing, specifying, constructing, and documenting software artifacts.[6]
The primary purposes of class diagrams include modeling domain concepts to capture the essential elements of a problem space, designing system architecture by outlining how classes interact, documenting existing code structures for maintenance and onboarding, and facilitating communication among stakeholders such as developers, analysts, and clients.[8] These diagrams enable teams to represent object-oriented designs in a standardized way, supporting the transition from requirements to implementation.[9]
Key benefits of using class diagrams lie in their ability to visualize inheritance hierarchies, associations, and dependencies, which helps in identifying design flaws, redundancies, or inconsistencies early in the development lifecycle before coding begins.[10] For example, a basic class diagram for a "Vehicle" might show the class with attributes like speed (of type integer) and operations such as accelerate(speed: int): void, providing a clear view of the object's structure and behavior without delving into dynamic interactions.
Historical Development
The development of class diagrams traces its roots to the emergence of object-oriented modeling methods in the late 1980s and early 1990s, which sought to represent software systems through classes, attributes, and relationships. Pioneering approaches included the Booch method by Grady Booch, introduced in his 1991 book Object-Oriented Design with Applications, which used cloud-like notations for classes and emphasized iterative design with graphical representations of object interactions and hierarchies. Similarly, James Rumbaugh's Object Modeling Technique (OMT), detailed in the 1991 book Object-Oriented Modeling and Design, employed entity-relationship style diagrams to depict classes, their attributes, operations, and associations, influencing static structure modeling. Ivar Jacobson's Object-Oriented Software Engineering (OOSE), outlined in his 1992 book Object-Oriented Software Engineering: A Use Case Driven Approach, incorporated object diagrams that evolved into class representations, focusing on use cases and object lifecycles. These methods proliferated as object-oriented programming gained traction, but their notations varied, prompting calls for unification.[11][12]
Other influences on class diagram concepts came from structured object-oriented methods like Shlaer-Mellor and Fusion. The Shlaer-Mellor approach, introduced in Sally Shlaer and Stephen Mellor's 1988 book Object-Oriented Systems Analysis, utilized Object-Oriented Design Language (OODLE) diagrams to model classes as data stores with information structures, emphasizing executable models and state transitions, which laid groundwork for static class representations. The Fusion method, developed by Derek Coleman and colleagues and published in their 1994 book Object-Oriented Development: The Fusion Method, integrated analysis and design phases with class diagrams that included roles, scenarios, and visibility, bridging structured and object-oriented paradigms to support collaborative modeling.
These precursors highlighted the need for standardized notations to facilitate communication in large-scale software projects.[12][13]
The standardization of class diagrams occurred through the Unified Modeling Language (UML), culminating in UML 1.0 proposed to the Object Management Group (OMG) in January 1997 by the UML Partners consortium, including Rational Software leaders Booch, Rumbaugh, and Jacobson. Adopted by OMG in November 1997 as UML 1.1, it established class diagrams as a core static structure diagram type for depicting classes, interfaces, relationships, and collaborations in object-oriented systems, unifying prior notations into a vendor-neutral standard. This marked a pivotal shift, enabling widespread adoption in software engineering for visualizing system architecture.[14][15]
Subsequent evolutions refined class diagrams within UML. UML 2.0, finalized by OMG in July 2005, introduced enhancements such as improved support for stereotypes (custom extensions to metamodel elements) and profiles, which enable domain-specific adaptations through tagged values and constraints, facilitating more flexible modeling of complex systems while maintaining backward compatibility. These changes addressed limitations in earlier versions by aligning UML more closely with the Meta-Object Facility (MOF) for better extensibility. UML 2.5, released by OMG in June 2015, focused on refinements for a lighter, more readable specification, reorganizing content to reduce redundancy and clarify notations without altering core semantics, making class diagrams easier to apply in agile and model-driven development.[16][17]
As of 2025, UML remains under ongoing maintenance by OMG, with the current version at 2.5.1, emphasizing integration with digital tools and with related standards such as SysML. SysML 1.x extends UML class diagrams for systems engineering by adding blocks and requirements modeling to support model-based systems engineering (MBSE), while SysML v2.0, adopted in July 2025, is defined on its own kernel modeling language (KerML) rather than as a UML profile.
This evolution ensures class diagrams continue to serve as a foundational tool in software and systems design amid advancing automation and interoperability needs.[18][19]
Core Elements
Class Notation
In UML class diagrams, a class is visually represented as a solid-outline rectangle, which may be divided into up to three compartments separated by horizontal lines to organize its contents. The top compartment contains the class name, the middle one lists attributes, and the bottom one enumerates operations, though compartments can be suppressed or expanded as needed for clarity.[20] This rectangular notation provides a compact way to depict the static structure of a class without implying implementation details.
The class name compartment is the mandatory top section, where the name is displayed in bold font and centered horizontally to emphasize its significance as the primary identifier.[21] For abstract classes, the name is rendered in italics to indicate that the class cannot be instantiated directly.[20] If the class represents static features or is part of a utility context, the name may be underlined.[20]
Stereotypes offer a mechanism to extend the semantics of a class without altering the core UML metamodel, denoted by guillemets enclosing the stereotype keyword, such as «entity» or «interface», placed above the class name within the name compartment.[21] This notation allows modelers to apply domain-specific categorizations, like «persistent» for classes involved in data storage.[22]
Optional iconic representations can enhance readability in domain-specific models, where a simple icon or symbol (e.g., a gear for a utility class) is placed in the top-right corner of the class rectangle alongside the name.[20] These icons are not part of the standard UML notation but are permitted to provide intuitive visual cues without conflicting with the primary rectangular form.
Naming conventions for classes follow UML guidelines to ensure consistency and uniqueness: class names use PascalCase (e.g., BankAccount), starting with an uppercase letter, and must be unique within their enclosing namespace or package to avoid ambiguity in the model.[23] This approach promotes clear, readable identifiers that align with object-oriented programming practices.
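The compartment layout described in this section can be illustrated with a small text renderer. The following Python sketch is not part of UML or of any modeling tool; the function name `render_class_box` and the ASCII border style are assumptions chosen purely for illustration. It centers the class name, left-aligns members, and suppresses empty compartments.

```python
def render_class_box(name, attributes=(), operations=()):
    """Return an ASCII sketch of a UML class rectangle (hypothetical helper)."""
    compartments = [[name]]
    # Empty attribute or operation compartments are suppressed.
    if attributes:
        compartments.append(list(attributes))
    if operations:
        compartments.append(list(operations))

    width = max(len(line) for comp in compartments for line in comp)
    border = "+" + "-" * (width + 2) + "+"

    rows = [border]
    for index, comp in enumerate(compartments):
        for line in comp:
            # The class name is centered; members are left-aligned.
            text = line.center(width) if index == 0 else line.ljust(width)
            rows.append("| " + text + " |")
        rows.append(border)
    return "\n".join(rows)


box = render_class_box(
    "BankAccount",
    attributes=["-balance: double = 0.0"],
    operations=["+deposit(amount: double): void", "+getBalance(): double"],
)
print(box)
```

Calling the function with only a name yields a single-compartment box, which mirrors how a diagram may show a class by name alone.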
Attributes
In UML class diagrams, attributes represent the data components or properties of a class, defining the state of its instances. They are displayed in the middle compartment of the class rectangle, positioned below the class name and above the operations compartment, with each attribute listed on a separate line for clarity.[24] The standard notation for an attribute is visibility name : type = defaultValue, where visibility is indicated by symbols such as + for public, - for private, # for protected, or ~ for package; the type specifies the data type (e.g., int, String); and an optional default value provides an initial assignment. For example, -age: int = 0 denotes a private integer attribute initialized to zero. This syntax allows modelers to capture essential structural details without ambiguity.[24][25]
Derived attributes, which are computed from other attributes or relationships rather than stored directly, are prefixed with a forward slash (/) in their notation. For instance, /fullName: String indicates a derived string attribute calculated, perhaps, by concatenating first and last name attributes. These are useful for representing dependent values in the model.[24][26]
Static attributes, shared across all instances of the class rather than belonging to individual objects, are denoted by underlining the entire attribute string in the diagram. An example is totalInstances: int, drawn underlined to signify class-level scope. Visibility indicators, such as + or -, apply similarly to static attributes to control access.[24][20]
Read-only attributes, which cannot be modified after initialization, are annotated with the property tag {readOnly} appended to the attribute notation. For example, id: String {readOnly} specifies an immutable identifier. This tag enforces constraints on the attribute's mutability within the model's semantics.[24][25]
Although multiplicity is more commonly associated with relationships, it can be applied to individual attributes to indicate the allowable number of values, typically enclosed in square brackets after the type (e.g., options: String [0..1] for an optional single value). This usage is rare for simple attributes but supports modeling optional or variable cardinality in structured data.[24][27]
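One way to read the attribute notations above is to map them onto a class in code. The following Python sketch is an illustration under assumptions, not a prescribed mapping: the class `Person` and the attribute `nickname` are hypothetical, and Python expresses private or read-only access by convention and properties rather than by enforced keywords.

```python
from typing import Optional


class Person:
    """Hypothetical class illustrating UML attribute notations."""

    def __init__(self, person_id: str, first_name: str, last_name: str) -> None:
        self._id = person_id                  # id: String {readOnly}
        self.first_name = first_name          # +firstName: String
        self.last_name = last_name            # +lastName: String
        self.__age = 0                        # -age: int = 0 (default value)
        self.nickname: Optional[str] = None   # +nickname: String [0..1]

    @property
    def id(self) -> str:
        # No setter is defined, so assigning to id raises AttributeError.
        return self._id

    @property
    def full_name(self) -> str:               # /fullName: String (derived)
        # Computed on demand from other attributes, never stored.
        return f"{self.first_name} {self.last_name}"

    @property
    def age(self) -> int:
        # Read access to the private attribute, exposed only for this example.
        return self.__age
```

Because `full_name` is recomputed on every access, changing `last_name` is reflected immediately, which is the behavior the slash prefix is meant to signal.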
Operations
In UML class diagrams, operations represent the behaviors or methods that a class can perform, depicted in the third compartment of the class rectangle, located at the bottom of the class symbol. This compartment lists operations in a specific syntactic format: operationName(parameterList): returnType, where the parameter list includes zero or more parameters separated by commas, and the return type is optional if void. For example, an operation to retrieve an age might be notated as getAge(): int, illustrating a parameterless method returning an integer value.
Parameters in operations are specified with their direction (in, out, inout, or return), name, type, and optional default value, listed in the order they appear in the signature. The direction indicates how the parameter interacts with the operation: 'in' for input only, 'out' for output only, 'inout' for both, and 'return' for the result. An example is setName(in newName: String = ''): void, where 'in' specifies an input parameter of type String with a default empty string, and the operation returns nothing. Multiplicity or effects can also be annotated for parameters, but the core notation prioritizes clarity in direction and typing.
Abstract operations, intended for implementation in subclasses, are denoted by italicizing the operation name or the entire signature, signifying that the operation is declared but not defined in the current class. This notation supports polymorphism in inheritance hierarchies, where subclasses provide concrete realizations. For instance, an abstract draw() operation in a Shape class would appear in italics to indicate it must be overridden.
Static operations, which belong to the class rather than instances and can be invoked without creating an object, are underlined in the notation. This distinguishes class-level behaviors, such as utility methods, from instance methods; for example, Math.sqrt(value: double): double would be underlined to show it operates on the class scope.
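The operation notations described above can be sketched in code. The following Python example is a minimal illustration under assumptions: `Shape` and `draw()` come from the text, while `Square`, `diagonal`, and the string return type of `draw` are hypothetical additions. Python has no direct counterpart to the out and inout parameter directions, so only an input parameter with a default value is shown.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    def __init__(self, name: str = "") -> None:
        self._name = name

    def set_name(self, new_name: str = "") -> None:   # setName(in newName: String = ''): void
        self._name = new_name

    def get_name(self) -> str:                        # getName(): String
        return self._name

    @abstractmethod
    def draw(self) -> str:                            # draw() in italics: abstract operation
        ...

    @staticmethod
    def diagonal(width: float, height: float) -> float:   # underlined: static operation
        # Callable on the class itself, without creating an instance.
        return math.sqrt(width ** 2 + height ** 2)


class Square(Shape):
    def __init__(self, side: float) -> None:
        super().__init__("square")
        self.side = side

    def draw(self) -> str:
        # Concrete realization of the abstract operation.
        return f"square with side {self.side}"
```

Attempting `Shape()` fails with a `TypeError` because `draw` has no implementation there, which matches the meaning of the italic notation.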
Static operations are indicated by underlining the entire operation signature in the diagram.[20]
Exceptions that an operation may raise are documented using a tagged value in curly braces immediately following the operation signature, such as calculate(): double {throws ArithmeticException}. This {throws ExceptionType} tag lists possible runtime errors, aiding in understanding the operation's fault tolerance without specifying full exception semantics. Multiple exceptions can be comma-separated within the braces.
Member Properties
Visibility Indicators
In UML class diagrams, visibility indicators specify the accessibility of class members such as attributes and operations, controlling which elements can access them based on their namespace and inheritance relationships. These indicators are represented by specific symbols placed before the name of the member in the class notation. The four standard visibility kinds defined in UML are public, private, protected, and package.[28]
The symbols for these visibility kinds are as follows: '+' for public visibility, which allows access from any element in any namespace; '-' for private visibility, restricting access to only within the owning class; '#' for protected visibility, permitting access from the owning class and its subclasses; and '~' for package visibility, limiting access to elements within the same package as the owning class.[2][8] For example, an attribute might be notated as -balance: double, indicating private visibility for the balance attribute of type double.[29] Similarly, operations follow the same convention, with the symbol preceding the operation name, such as +getBalance(): double for a public getter method.
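If no visibility symbol is shown for a class member, its visibility is simply unspecified in the diagram: UML defines no default for attributes and operations, so an omitted symbol should not be read as public. Individual tools or team conventions may assume a default, but that assumption belongs to the tool or convention rather than to the notation.[24]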
Visibility indicators also apply to relationships in class diagrams, particularly on association ends, where they denote the accessibility of the referenced element as a property. For instance, a private reference in an association might be shown with a '-' symbol near the end connected to the target class, restricting access to that reference from outside the owning class.[2]
In UML 2.x, extensions to visibility notation include the use of property strings enclosed in curly braces {} after the member declaration to specify additional constraints, though core visibility remains governed by the standard symbols. While custom visibilities are not directly supported beyond the four kinds, property tags like {unique} can complement visibility by adding qualifiers such as uniqueness to properties, including those on association ends.[20]
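As a rough illustration of how the visibility symbols might carry over to code, the following Python sketch uses the language's naming conventions. This is an assumption-laden mapping: Python does not enforce visibility, a leading double underscore only triggers name mangling, and there is no direct equivalent of package (~) visibility. The members `_audit_log` and `_record`, and the class `SavingsAccount`, are hypothetical.

```python
class Account:
    def __init__(self, balance: float = 0.0) -> None:
        self.__balance = balance     # -balance: double (mangled to _Account__balance)
        self._audit_log = []         # #auditLog: protected by convention (hypothetical)

    def get_balance(self) -> float:  # +getBalance(): double
        return self.__balance

    def _record(self, entry: str) -> None:   # #record(entry: String) (hypothetical)
        self._audit_log.append(entry)


class SavingsAccount(Account):
    def add_interest(self, rate: float) -> float:
        # A subclass may use protected members of its parent.
        self._record(f"interest at {rate}")
        # Private state is reached only through the public operation.
        return self.get_balance() * (1 + rate)
```

Outside the class, `account.__balance` raises `AttributeError`, which approximates private visibility, although the mangled name remains reachable by anyone who looks for it.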
Scope Modifiers
In Unified Modeling Language (UML) class diagrams, members such as attributes and operations can exhibit one of two scopes: instance scope or classifier scope. Instance scope is the default, wherein each object (instance of the class) maintains its own independent value or behavior for the member, allowing for individualized state or actions per object. This scoping is essential for modeling object-oriented principles like encapsulation, where attributes like an employee's salary vary across instances but remain private to each.
Classifier scope, in contrast, applies to members shared across all instances of the class, akin to static members in programming languages. These features belong to the class itself rather than to individual objects, enabling shared state or utility functions that do not depend on instance-specific data. For example, a static attribute might track the total number of instances created (e.g., a counter incremented upon object instantiation), while a static operation could provide class-level utilities like mathematical functions in a Math utility class. Such scoping supports efficient modeling of global or collective behaviors without requiring object instantiation.
The notation for scope in class diagrams is straightforward and applies uniformly to both attributes and operations. Members with instance scope use standard, non-underlined text for their names and signatures. Classifier-scoped members are denoted by underlining the member, which in practice usually means the whole attribute or operation string; for instance, an attribute might appear as totalInstances: Integer in plain text for instance scope, or as the same text underlined for classifier scope. This underlining convention, inherited from earlier UML versions, ensures visual distinction while maintaining diagram clarity.
The implications of classifier scope extend to design and implementation: these members affect or are accessible to all instances uniformly, promoting reuse for constants, counters, or factory methods, but requiring careful management to avoid unintended global side effects. Unlike visibility indicators (e.g., + for public or - for private), which control access permissions, scope determines binding level and is orthogonal to visibility—a member can thus be both private and classifier-scoped, restricting access to the shared feature while keeping it instance-independent. This separation allows precise modeling of complex systems where access control and scoping needs intersect without overlap.
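The difference between the two scopes, and the fact that scope is independent of visibility, can be sketched as follows. This Python example is illustrative only; the `Employee` class and its member names are assumptions based on the salary and instance-counter examples in the text.

```python
class Employee:
    # Classifier scope: one value shared by the class (underlined in a diagram).
    # The double underscore also makes it private by convention, showing that
    # a member can be both private and classifier-scoped.
    __total_instances = 0

    def __init__(self, salary: float) -> None:
        # Instance scope: each object holds its own value.
        self.salary = salary
        Employee.__total_instances += 1

    @classmethod
    def total_instances(cls) -> int:
        # Classifier-scoped operation: callable without any instance.
        return cls.__total_instances
```

Each call to the constructor changes the shared counter for every instance at once, which is the kind of global side effect the text advises managing carefully.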
Relationships
Association and Multiplicity
In UML class diagrams, an association represents a structural relationship between two classifiers, indicating that instances of those classifiers are connected or interact in some way. The notation for an association is a solid line connecting the classifier symbols, with an optional association name placed near the center of the line to describe the nature of the connection. Role names may be specified at each end of the association to denote the role played by instances of the classifier at that end, such as "employer" at the Company end of a link between Employee and Company.[30][2]
Multiplicity specifies the number of instances that may participate in the association from each end, constraining the cardinality of the relationship. It is denoted by textual indicators placed near each association end, using formats like a single number for exact count (e.g., "1" for exactly one instance), ranges for variable bounds (e.g., "0..1" for zero or one, "1..*" for one or more), or "*" as shorthand for "0..*" (zero or more). Additional modifiers can indicate ordered collections with {ordered} and uniqueness with {unique}, ensuring the related instances are sequenced or distinct as needed.[8][31]
Navigability determines the direction in which the association can be traversed, reflecting whether instances of one class can access or reference instances of the other. An association drawn with no arrowheads is commonly read as navigable in both directions, although UML 2.x formally treats it as having unspecified navigability; an open arrowhead at one end indicates navigability toward the class at that end. A small cross (×) at an end denotes non-navigability, preventing traversal in that direction.[30][2]
A qualified association refines a standard association by including a qualifier attribute on one end, which acts as a key to select specific instances from the target set, often reducing multiplicity to 0..1 or 1 for efficient lookups.
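Multiplicity, navigability, and qualifiers often show up in code as the choice of field type. The following Python sketch is one possible reading under assumptions, not a canonical mapping: the `Library`, `Book`, `Company`, and `Employee` classes echo examples used in this section, while the field and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Book:
    isbn: str
    title: str


@dataclass
class Library:
    # Qualified association: the qualifier (isbn) selects at most one Book (0..1).
    _books_by_isbn: Dict[str, Book] = field(default_factory=dict)

    def add(self, book: Book) -> None:
        self._books_by_isbn[book.isbn] = book

    def find(self, isbn: str) -> Optional[Book]:
        return self._books_by_isbn.get(isbn)


@dataclass
class Employee:
    # Holds no reference to Company: the association is not navigable this way.
    name: str


@dataclass
class Company:
    name: str
    # Role "employees" with multiplicity 0..*: a possibly empty collection.
    employees: List[Employee] = field(default_factory=list)
```

A dictionary keyed by the qualifier gives the lookup behavior that the reduced multiplicity describes, while a plain list is enough for an unqualified 0..* end.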
The notation for a qualified association features a small rectangle attached to the qualified end of the line, containing the qualifier attribute name (e.g., [ISBN] for selecting a Book instance in a Library-Book association). For instance, a Company class might qualify its employees via a [SSN] attribute to uniquely identify one employee per social security number.[20][32]
Reflexive associations model relationships where instances of a single class connect to other instances of the same class, such as a self-referential link for hierarchical structures. The notation is a solid line originating and terminating at the same classifier, potentially with multiplicity, role names, and navigability adornments; for example, a Person class might have a reflexive association labeled "parentOf" with multiplicity 0..* to represent family trees. This allows representation of intra-class connections without introducing additional classifiers.[2][33]
Aggregation and Composition
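As a rough orientation for the aggregation and composition relationships, the difference between shared and exclusive ownership can be sketched in code. The following Python example is a simplified illustration under assumptions: the `Car`, `Engine`, and `Wheel` classes follow the usual textbook example, the attribute names are hypothetical, and Python's garbage collection only approximates the lifecycle rules that composition implies.

```python
class Engine:
    def __init__(self, power_kw: int) -> None:
        self.power_kw = power_kw


class Wheel:
    def __init__(self, position: str) -> None:
        self.position = position


class Car:
    def __init__(self, power_kw: int, wheels: list) -> None:
        # Composition (filled diamond): the Car creates its Engine and is
        # its only owner, so the Engine is not handed out or shared.
        self._engine = Engine(power_kw)
        # Aggregation (hollow diamond): the Wheels are created elsewhere
        # and merely referenced, so they can outlive this Car.
        self.wheels = list(wheels)

    def engine_power(self) -> int:
        return self._engine.power_kw


wheels = [Wheel(p) for p in ("FL", "FR", "RL", "RR")]
car = Car(power_kw=90, wheels=wheels)
```

If `car` is discarded, the wheel objects remain usable through the `wheels` list, whereas the engine has no other owner and disappears with the car.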
In UML class diagrams, aggregation represents a "whole-part" or "has-a" relationship where the part can exist independently of the whole, indicating shared ownership among multiple wholes. Graphically, it is denoted by a hollow diamond attached to the whole class at one end of the association line, with the line connecting to the part class. For example, a Car class might aggregate Wheel instances, as wheels can be detached and shared or reused with other vehicles without affecting their existence.[34][35]
Composition, in contrast, depicts a stronger form of whole-part relationship with exclusive ownership, where the lifecycle of the parts is tightly bound to the whole: parts are created with the whole and destroyed when it is. It is illustrated by a filled black diamond on the whole side of the association line. A classic example is a Car composing an Engine, where the engine cannot exist separately and is integral to the car's structure, ceasing to exist if the car is destroyed. Multiplicity on the whole side for composition is at most 1 (that is, 0..1 or 1), ensuring exclusive control and preventing sharing.[36][35][37]
The key distinction lies in ownership strength: aggregation implies a loose, non-exclusive "has-a" association allowing parts independent lifecycles and potential sharing, while composition enforces a tight "contains-a" bond with coincident lifecycles and no sharing. Additionally, compositions form directed acyclic graphs to avoid circular ownership dependencies that could complicate deletion semantics. These relationships build on basic associations by adding ownership semantics, but without implying generalization or mere usage.[38][39]
Generalization and Realization
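The two relationships named in this heading can be contrasted in a short code sketch: generalization inherits features from a parent class, while realization only promises to fulfil a contract. The following Python example is illustrative and rests on assumptions: `Vehicle` and `accelerate` echo the example from the overview, `Runnable` is loosely modeled on the Java interface of that name (here returning a string, unlike the Java original), and `Car`, `Job`, and `describe` are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


@runtime_checkable
class Runnable(Protocol):        # «interface» Runnable (hypothetical signature)
    def run(self) -> str: ...


class Vehicle(ABC):              # abstract superclass (name shown in italics)
    def __init__(self, speed: int = 0) -> None:
        self.speed = speed

    def accelerate(self, delta: int) -> None:
        self.speed += delta

    @abstractmethod
    def describe(self) -> str: ...


class Car(Vehicle):              # generalization: Car is-a Vehicle
    def describe(self) -> str:   # supplies the operation declared by the parent
        return f"car at {self.speed} km/h"


class Job:                       # realization: Job fulfils Runnable's contract
    def run(self) -> str:
        return "job done"
```

`Car` reuses `accelerate` without redefining it, whereas `Job` inherits nothing from `Runnable` and merely provides the operation the interface requires.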
In UML class diagrams, generalization represents a taxonomic "is-a" relationship between a more general classifier, known as the superclass or parent, and a more specific classifier, known as the subclass or child.[40] This relationship enables the subclass to inherit structural and behavioral features, such as attributes and operations, from the superclass, promoting reuse and establishing a hierarchy of classifiers.[41] The notation for generalization is a solid line ending in a hollow triangle arrowhead, directed from the subclass toward the superclass.[20]
UML supports multiple inheritance through generalization, allowing a subclass to extend multiple superclasses simultaneously, which is depicted by multiple generalization arrows leaving the subclass, one pointing to each superclass.[40] This can lead to complex hierarchies, often resolved using generalization sets to manage overlapping or disjoint specializations, though care must be taken to avoid ambiguities like the diamond problem in implementation.[8]
Abstract classes and interfaces in generalization hierarchies are indicated by rendering their names in italics or annotating them with the {abstract} keyword, signifying that they cannot be instantiated directly and serve as blueprints for concrete subclasses.[20] Subclasses may override or refine inherited operations from the superclass, explicitly marked with the {redefines} constraint to indicate that the subclass operation specializes the parent's behavior while conforming to its contract.[42]
Realization, in contrast, models an abstraction relationship where a client classifier implements or refines a specification provided by a supplier classifier, often an interface defining a contract of operations and properties.[43] This is particularly used for interface realization, where a class commits to fulfilling the interface's requirements, such as implementing all specified operations, as in a class realizing the Runnable interface to enable threading in Java.[8] The notation consists of a dashed line with a hollow triangle arrowhead pointing from the implementing class (client) to the interface (supplier), emphasizing the one-way refinement without inheritance of implementation details.[43] Unlike generalization, realization does not imply substitutability in the same hierarchical sense but ensures contractual compliance, supporting design patterns like adapters or strategies.[2]
Dependency
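A dependency is the weakest of the relationships covered here, and in code it typically appears as a parameter or local variable rather than a stored field. The following Python sketch is a minimal illustration; the `Report` and `Printer` classes and their methods are hypothetical names introduced for this example.

```python
class Printer:                         # supplier
    def render(self, text: str) -> str:
        return f"[printed] {text}"


class Report:                          # client
    def __init__(self, title: str) -> None:
        # No Printer attribute is stored, so there is no association.
        self.title = title

    def print_with(self, printer: Printer) -> str:
        # «use» dependency: Printer is needed only for the duration of this call.
        return printer.render(self.title)
```

If `Report` kept the printer in a field for later use, the relationship would be better modeled as an association than as a dependency.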
In UML class diagrams, a dependency relationship represents a weaker form of coupling where one element (the client) relies on another (the supplier) for its behavior or structure, but without implying ownership or a permanent structural link. This is typically depicted as a dashed line with an open arrowhead pointing from the client to the supplier, optionally annotated with a stereotype such as «use» or «import» to specify the nature of the dependency.[20][8]
Dependencies can manifest at various levels and types, including usage dependencies where the client requires the supplier at runtime to perform operations, such as passing an object of the supplier class as a method parameter, and import dependencies that involve namespace or visibility extensions between packages or elements. At the instance level, the relationship indicates that objects of the client class temporarily utilize objects of the supplier class during execution, for example, when a method in one class invokes a service from another without maintaining a reference. In contrast, class-level dependencies occur when the client's structure or operations reference the supplier in a more static way, such as using the supplier's type in attribute declarations or method signatures, potentially affecting compilation if the supplier changes. The keyword «use» may be applied optionally to clarify usage types, emphasizing runtime needs over structural integration.[44][8]
To maintain clarity in diagrams, dependencies should be used judiciously for non-structural, changeable relationships and avoided for persistent or navigational ties, where an association would be more appropriate to denote fixed links between classes. Overuse of dependencies can clutter diagrams and obscure stronger relationships, so they are best reserved for scenarios like temporary collaborations or external library usages that do not define the core architecture.[20][8]
Advanced Features
Stereotypes
Stereotypes provide a key extensibility mechanism in UML for customizing the semantics of model elements, such as classes, without altering the core language definition. They allow modelers to introduce domain-specific or methodology-specific variations by associating additional meaning, properties, or notations with standard elements. As defined in the UML Superstructure Specification, a stereotype is a profile construct that specifies extensions to a metaclass, enabling the creation of new element types that inherit the structure of their base but add tailored semantics.[45]
In class diagrams, stereotypes are applied to classes to refine their role or behavior in a particular context. Standard UML stereotypes for classes include «interface», which denotes a class that declares public operations without providing their implementation, focusing solely on contracts for interacting components; «type», which represents an abstract specification of common characteristics for a set of objects, often used in foundational modeling; and «utility», applied to classes that encapsulate global variables and procedures without instances, typically for stateless helper functions. These predefined stereotypes, outlined in the UML standard, facilitate common modeling needs like defining service contracts or auxiliary computations in class diagrams.[45][23]
Beyond standard stereotypes, UML profiles enable the creation of custom ones tailored to specific domains or processes, extending class semantics for specialized applications. For instance, in object-oriented analysis, the «entity» stereotype may be used for classes representing persistent data objects with identity and state, distinguishing them from boundary or control classes in robustness diagrams. Such profile-based extensions are grouped into reusable packages, allowing consistent application across models while maintaining compatibility with core UML.[45][46]
The notation for stereotypes in class diagrams places the keyword within guillemets («stereotype») directly above the class name, inside the name compartment. Multiple stereotypes can be listed if applicable, separated by commas. Additionally, stereotypes often include tagged values, which are key-value pairs in curly braces such as {persistence = database}, to specify further properties or constraints associated with the extension. This notation integrates with class notation without disrupting the standard rectangular shape.[45][22]
Stereotypes enhance the expressiveness of class diagrams by supporting lightweight customization, making them integral to approaches like Model-Driven Architecture (MDA), where they bridge platform-independent and platform-specific models. For example, a class stereotyped as «actor» can indicate an external entity relevant to use case interactions, tying structural models to behavioral ones without requiring new diagram types. This mechanism ensures UML remains adaptable to evolving software engineering practices while preserving its foundational rigor.[45]
Constraints and Notes
In UML class diagrams, constraints specify conditions, restrictions, or assertions that must hold true for model elements such as attributes, operations, or associations. These are commonly notated using curly braces enclosing a textual expression directly adjacent to the constrained element; for instance, an attribute might appear as age: Integer {age > 0} to enforce that the value is positive.[47]
For more intricate rules, the Object Constraint Language (OCL) provides a formal, type-safe mechanism to express constraints precisely, often prefixed with the element name or using self to reference the instance context, such as {self.age > 0} on an attribute. OCL constraints are particularly useful for defining navigable paths across associations and evaluating boolean conditions.[48][47]
Class-level invariants represent conditions that must remain true for every instance of a class across its lifecycle, typically annotated as {invariant: condition} near the class symbol or within a dedicated constraint box. These invariants ensure consistency in object states and behaviors.[47]
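An attribute constraint or class invariant has no behavior of its own in a diagram, but an implementation usually has to check it somewhere. The following Python sketch shows one possible approach under assumptions: the `Person` class, the `set_age` operation, and the choice of `ValueError` are hypothetical, and the check mirrors the {self.age > 0} constraint used as an example above.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int

    def __post_init__(self) -> None:
        # The invariant must already hold when the object is created.
        self._check_invariant()

    def _check_invariant(self) -> None:
        # Corresponds to the constraint {self.age > 0}.
        if not self.age > 0:
            raise ValueError(f"invariant violated: age must be > 0, got {self.age}")

    def set_age(self, new_age: int) -> None:
        previous = self.age
        self.age = new_age
        try:
            self._check_invariant()
        except ValueError:
            # Restore the last valid state before reporting the violation.
            self.age = previous
            raise
```

Checking after every state change keeps each instance consistent across its lifecycle, at the cost of repeating the check in each mutating operation.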
Note elements offer unstructured textual commentary for clarifications, supplementary details, or information outside standard UML semantics, such as implementation hints or rationale. Rendered as a dog-eared rectangle, a note attaches to the target element via a dashed line, allowing multiple connections if annotating several items.[21]
For constraints involving multiple model elements, a dashed line connects the elements, with the constraint in curly braces placed as a label on or near the line. Complex expressions can be enclosed in a note, attached via a dashed line to the relevant elements for clarity in dense diagrams.[47][49]
In analysis contexts, stereotypes like «entity» mark classes as persistent domain objects, often facilitating integration with entity-relationship models by implying data storage requirements.[50]