Fact-checked by Grok 2 weeks ago

Tree structure

A tree structure is a way of representing the hierarchical nature of a structure in a graphical or conceptual form, consisting of nodes connected by edges, forming a with a single and branches representing substructures, with no cycles. In , a tree is a nonlinear consisting of connected by directed edges, with a single node and zero or more subtrees, where each non- node has exactly one . Trees are defined recursively, as each contains a value and pointers to its subtrees, making them ideal for representing hierarchical relationships such as organizational charts, file systems, or biological phylogenies. Key properties of trees include the distinction between (the top node with no parent), leaves (nodes with no children), internal nodes (those with children), and concepts like depth (the number of edges from the root to a node) and (the longest path from a node to a leaf). Nodes sharing the same parent are called siblings, and trees can be empty or contain a single node. Unlike linear structures like arrays or linked lists, trees allow multiple children per node, enabling efficient organization of data with branching relationships. Trees encompass various types, including binary trees (where each has at most two children, often used in search algorithms) and general trees (with any number of children). Common applications include modeling file directories, implementing database indexes for fast querying, representing decision processes in algorithms, organizational hierarchies, and biological models such as evolutionary trees, with efficient operations like search and insertion in structures such as binary search trees, which achieve O(log n) when balanced.

Fundamentals

Definition

In , a graph consists of a of vertices, also known as nodes, and a of edges that connect pairs of vertices. Vertices represent entities, while edges represent relationships or connections between them. A is formally defined as an undirected that is connected and contains no cycles, meaning there is exactly one between any pair of distinct vertices. Equivalently, a with n vertices has exactly n-1 edges and is acyclic. An alternative formulation describes a rooted tree as a in which one is designated as the with no incoming edges, every other has exactly one incoming edge, and there are no cycles. In contrast, a free tree, also known as an unrooted tree, is the undirected version without a designated . The concept of trees in mathematics was introduced by in 1857, who used them to enumerate certain algebraic structures, and he coined the term "tree" for these acyclic connected graphs.

Terminology

In tree structures, which are acyclic connected graphs, standard terminology describes the nodes and their relationships, particularly in the context of rooted trees where a distinguished node serves as the origin. The is the unique starting node in a rooted tree, with no and from which all other nodes are reachable via directed paths away from it. A , or terminal node, is a node with no children, representing an in the tree. Parent-child relationships form the hierarchical connections: a node is one that has one or more child nodes connected directly to it via an directed away from the parent, while siblings are nodes that share the same . A subtree rooted at a given consists of that node and all its descendants, forming a smaller tree embedded within the larger structure. The depth of a node is the length of the unique path from the root to that node, often measured as the number of edges, while the level of a node corresponds to its depth, grouping nodes at the same distance from the root. The height of a node is the length of the longest path from that node to a leaf in its subtree, and the height of the entire tree is the height of the root. In undirected trees, edges lack inherent direction, but rooted trees impose an orientation where edges point away from the root toward the leaves, distinguishing them from undirected representations. The of a in an undirected is the total number of edges incident to it, reflecting its ; in rooted trees, the out-degree specifically denotes the number of children. A in a is a unique sequence of distinct connected by edges, with no cycles. Ancestor-descendant relations follow this: an of a is any on the unique from the to that (excluding the node itself), and a is any reachable from it via a directed away from the .

Properties

A tree with n nodes contains exactly n-1 edges. This follows from the connectedness and acyclicity of the , ensuring minimal without redundancy. The absence of cycles in a tree implies acyclicity by definition, preventing any closed loops. Additionally, there exists a unique path between any two nodes in a tree, which underscores its connectivity without alternative routes. From these basics arise key structural theorems. An adaptation of the handshaking lemma for trees states that the sum of the degrees of all nodes equals $2(n-1), as this is twice the number of edges. Furthermore, every tree with at least two nodes has at least two leaves (nodes of degree one). This property arises because, in a connected acyclic , endpoints of the longest must be leaves, and no tree can have all nodes of degree greater than one without forming a . In rooted trees, the height is defined as the length of the longest from the to a . The of a tree, whether rooted or unrooted, is the length of the longest between any two nodes. A fundamental bound relates these measures: in a rooted of h, the is at most $2h, since any between two nodes passes through their , with each segment to the ancestor no longer than h. Trees are isomorphic if there exists a between their sets that preserves adjacency, meaning they share the identical structural arrangement. For labeled trees, quantifies the total number: there are exactly n^{n-2} distinct trees on n labeled s. This seminal result, counting spanning trees of the K_n, highlights the combinatorial richness of tree structures.

Types and Variations

Binary Trees

A binary tree is a tree data structure in which each node has at most two children, typically distinguished as the left child and the right child, allowing for an ordered structure where the position of subtrees matters. In this ordered variant, the left and right subtrees are not interchangeable, enabling applications that rely on positional semantics, such as binary search trees. Unordered binary trees, by contrast, treat the two children without left-right distinction, resembling free trees with degree at most three but lacking the planar embedding common in computational contexts. Binary trees exhibit specific structural properties that differentiate them from general trees. The maximum number of nodes at any level k (starting from level 0 at the ) is $2^k, leading to a perfect binary tree having exactly $2^{h+1} - 1 nodes for h. A full binary tree requires every node to have either zero or two children, ensuring no nodes with a single child. A complete binary tree fills all levels except possibly the last, which is filled from left to right, optimizing space in representations. A perfect binary tree extends completeness by fully populating every level, with all leaves at the same depth. The height of a , defined as the longest path from to , has bounds that highlight efficiency potential. For n nodes, the minimum height is \lceil \log_2 (n+1) \rceil - 1, achieved in a complete or perfect configuration, while the maximum height is n-1 in a degenerate (linear) case. This minimum height bound underpins balanced binary search trees, where rotations or restructuring maintain logarithmic height to guarantee O(\log n) time for insertions, deletions, and lookups. The enumeration of trees follows a combinatorial given by the . The number of distinct ordered trees with n nodes is the nth : C_n = \frac{1}{n+1} \binom{2n}{n}, which counts all possible shapes considering left-right ordering and nodes with 0, 1, or 2 children. For example, C_3 = 5, corresponding to the five possible structures with three nodes. This sequence arises in various recursive decompositions and has been central to analyzing tree-generating processes since its formulation in the .

Multi-way Trees

Multi-way trees, also known as n-ary trees, are rooted tree structures in which each node can have an arbitrary number of children, generalizing beyond the two-child limit of binary trees. In an n-ary tree, the number of children per node varies flexibly, with no inherent restriction on the , allowing for representations of complex hierarchies where nodes may have zero or more subtrees. These trees are commonly implemented by storing the children of each node in dynamic lists or arrays, which accommodate the variable efficiently without predefined slots for each child. A key property of multi-way trees is the absence of a fixed , enabling their use in scenarios where the number of children is not predetermined. They can be ordered, where the sequence of children matters and is preserved (e.g., via indexed access), or unordered, where only the parent-child relationships are significant without positioning. This flexibility contrasts with binary trees by removing left-right distinctions, focusing instead on the collection of subtrees. Multi-way trees support operations like insertion and traversal adapted to handle variable children, often using recursive methods on the child list or . B-trees serve as a prominent example of balanced multi-way trees, widely used in database indexing to manage large datasets on disk. In a B-tree of order m (where m > 1), every except the root has between \lceil m/2 \rceil and m children, ensuring a minimum occupancy that keeps the tree height low—typically logarithmic in the number of keys—for efficient search, insertion, and deletion. The has at least two children, and all leaves are at the same level, promoting balanced growth and reducing I/O operations in external storage systems. Multi-way trees relate to through representational transformations, such as the left-child right-sibling , which converts an n-ary into a for implementation using standard structures. In this approach, the left pointer of a indicates its first in the original , while the right pointer links to the next , effectively encoding the multi-way as a one without loss of . This technique allows algorithms designed for to operate on multi-way trees by treating siblings as a right-linked chain.

Specialized Trees

Specialized trees are advanced variations of tree structures designed to optimize performance for specific operations, such as maintaining during insertions and deletions or efficiently storing and retrieving strings. These structures build on basic or multi-way trees by incorporating additional constraints or properties to guarantee logarithmic for key operations, making them suitable for dynamic datasets where worst-case performance is critical. AVL trees are s that ensure the s of the two subtrees of any differ by at most one, a property known as balance factor. This strict height constraint prevents the tree from degenerating into a linear structure, guaranteeing O(log n) time for search, insertion, and deletion operations. To maintain balance after modifications, AVL trees use rotations—single or double rotations on the affected and its ancestors—adjusting the structure locally without rebuilding the entire . The concept was introduced by and Evgenii Landis in their seminal 1962 paper, marking the first . Red-black trees are another type of that enforces balance through coloring—each is either —and a set of five properties: every has a color, the is black, red s cannot have red children (no two adjacent reds), every path from a to its leaves has the same number of black s, and all leaves are black. These rules ensure that the longest path from to is no more than twice the shortest, bounding the height at 2 log(n+1) and thus achieving O(log n) performance for insertions, deletions, and searches. Unlike AVL trees, red-black trees allow temporary imbalances but use color flips and rotations to restore properties, often resulting in fewer rotations overall. The structure originated from Rudolf Bayer's work on symmetric binary B-trees and was formalized with the red-black coloring in a 1978 paper by Leonidas J. Guibas and . Heaps, often implemented as , are complete binary trees that satisfy the : in a min-heap, the value of each node is less than or equal to the values of its children, ensuring the smallest element is always at the . This allows efficient of the minimum (or maximum in a max-heap) in O(log n) time via heapify operations, which bubble elements up or down the tree. Heaps are typically array-based for space efficiency, with parent-child access via index calculations (e.g., child of index i at 2i+1 and 2i+2). They serve as the foundation for priority queues, where elements have associated priorities, and are integral to algorithms like . The was introduced by J.W.J. Williams in 1964 as part of the . Tries, also known as trees, are multi-way trees optimized for storing strings or sequences, where each represents a and edges are labeled with characters, allowing paths from to to spell out complete keys. This structure enables prefix-based searches, such as or lookups, in O(m) time where m is the key length, independent of the number of keys n, by traversing character by character without comparisons. Nodes may have up to the size in children, and end-of-word markers distinguish complete words from prefixes. To reduce space for sparse data, compressed variants like tries eliminate single-child nodes by storing edge labels as substrings, effectively merging chains. Tries were first proposed by René de la Briandais in 1959 for file searching with variable-length keys, with introducing the term "trie" (from "retrieval") and the variant in 1960.

Applications and Examples

Organizational Hierarchies

Tree structures provide a natural framework for modeling organizational hierarchies in and abstract systems, where entities are arranged in levels of or relatedness, with a single at the and branching paths downward to nodes. In such representations, the denotes the highest-level element, such as a central , while leaves represent endpoints like individual units without further subordinates. This hierarchical arrangement facilitates clear delineation of relationships and responsibilities, enabling efficient communication and decision propagation from top to bottom. Corporate hierarchies exemplify tree structures, with the (CEO) serving as the , departments or divisions as immediate , and individual employees as at the base. This model establishes unambiguous reporting lines, where each subordinate reports to exactly one superior, promoting and streamlined oversight in large organizations. For instance, in multinational corporations, this tree-like setup allows for scalable management, from executive leadership branching into functional units like and operations, ultimately terminating in frontline staff. The advantages include reduced ambiguity in authority and faster escalation of issues, as information flows unidirectionally along the branches. Family trees, or charts, are directed trees that trace from ancestral roots to leaves, with edges representing parent-child relationships oriented from ancestors to progeny. In this structure, each individual () typically has one or two parents, forming a downward-branching that visualizes generational . However, real-world family trees often deviate from strict trees due to phenomena like marriages, resulting in directed acyclic graphs (DAGs) rather than pure trees, as cycles would imply impossible temporal loops. This adaptation maintains the hierarchical essence while accommodating complexities in kinship networks. Taxonomic systems further illustrate tree structures in organizing knowledge, as seen in the Linnaean classification, which arranges biological entities hierarchically from broad kingdoms at the root to specific at the leaves. Developed by in the 18th century, this system uses nested ranks—such as , , , , , and —to create a branching tree that reflects presumed natural relationships among organisms. Complementing this, employs phylogenetic trees to depict evolutionary divergences, where branches represent shared ancestry and splitting events, prioritizing monophyletic groups (clades) over Linnaean ranks for more accurate reconstructions of descent. These tree-based taxonomies enable systematic identification and comparison, underpinning fields from to . In contexts, decision trees model complex choices as branching structures, starting from a problem or initial decision and extending to nodes that signify possible outcomes or resolutions. Each internal represents a decision point with for alternative actions, such as "invest" or "divest," weighted by probabilities and impacts to evaluate paths. This approach aids by visualizing trade-offs, like versus reward in expansions, allowing managers to select the yielding optimal results. By sequential choices, decision trees enhance foresight and reduce oversight in uncertain environments.

Data Structures in Computing

In computing, tree structures are fundamental for organizing data hierarchically in software systems, enabling efficient storage, retrieval, and manipulation without cycles or multiple paths between nodes. One prominent application is in file systems, where directories form a tree to represent the organization of files and folders. For instance, the filesystem, as defined by the (FHS), structures the entire as an inverted tree starting from the denoted by /, with subdirectories serving as internal nodes and files as nodes. This allows via path notation, such as /home/user/documents/[file](/page/File).txt, where each segment separated by / traverses deeper into the from the . The / contains essential subdirectories like /bin for binaries and /etc for configuration files, ensuring the system can and operate coherently. Another key implementation is the (DOM) for XML and HTML documents, which parses markup into a representation for dynamic processing in web browsers and applications. According to the W3C DOM Level 2 Core specification, the DOM models documents as a of objects, where nodes correspond to tags (e.g., <html> as the ), and child nodes represent nested tags like <body>. Attributes are handled as Attr nodes attached to nodes via a NamedNodeMap, allowing properties like id="main" to be accessed and modified without altering the tree's core structure. This enables parsing tools to traverse the document hierarchically, supporting operations like querying specific elements or updating content in real-time during rendering. Trees also serve as abstract data types (ADTs) in programming, providing a formal interface for hierarchical data storage that abstracts away implementation details. As described in Donald Knuth's The Art of Computer Programming, Volume 3: Sorting and Searching, a tree ADT supports operations like insertion, deletion, and traversal while maintaining a root node with ordered children, distinguishing it from linear structures like lists (which lack branching) or graphs (which permit cycles and multiple paths). Trees are preferable for scenarios requiring unique paths from root to leaves, such as modeling containment relationships, whereas lists suit sequential access and graphs handle interconnected data. This abstraction allows implementations in various languages, emphasizing conceptual hierarchy over specific memory layouts. A specialized use of trees in is binary search trees (BSTs), which organize ordered for fast retrieval in applications like and symbol tables. In a BST, each internal has at most two children, with left subtrees containing smaller keys and right subtrees larger keys, enabling search, insertion, and deletion operations. If balanced, the tree's height remains O(log n) for n nodes, allowing lookups in average O(log n) time by halving the search space at each level, as analyzed in standard structures curricula. This efficiency makes BSTs ideal for dynamic datasets where order matters, outperforming unsorted arrays for frequent queries.

Biological and Scientific Models

In biology, phylogenetic trees model the evolutionary relationships among or taxa, depicting a common at the and subsequent divergences along branches that represent events. These trees illustrate shared ancestry and the hierarchical patterns of , with terminal nodes (tips) corresponding to extant or . The structure emphasizes branching patterns over linear progression, providing a framework for understanding and historical contingencies in . Two primary variants distinguish phylogenetic trees by scaling: cladograms, which display only the topological relationships without branch lengths, focusing on the order of branching to infer recency of common ancestry; and phylograms, where branch lengths are proportional to the amount of evolutionary change or elapsed time, incorporating quantitative data such as . Cladograms prioritize qualitative groupings based on shared derived characteristics, as formalized in cladistic methods, while phylograms integrate metrics like estimates for more precise temporal scaling. These models underpin systematic , enabling of evolutionary histories from genetic, morphological, or data. Dendrograms serve as tree structures in , visualizing the results of algorithms by arranging data points into nested groups based on similarity measures, such as or . In this representation, leaves denote individual observations or clusters, while internal nodes mark merge points, with branch heights indicating the dissimilarity level at which clusters combine, often through agglomerative methods that progressively join the most similar pairs. This tree-like reveals natural groupings in datasets, such as profiles in bioinformatics or ecological communities, without assuming a predefined number of clusters. In , syntax trees, also known as , model the grammatical of sentences as hierarchical trees, where internal nodes represent non-terminal categories like noun phrases or verb phrases, and leaves are terminal words or morphemes. These trees derive from context-free grammars, capturing that govern how constituents combine, such as subject-verb-object hierarchies, to ensure syntactic well-formedness. Similarly, in design, abstract syntax trees (ASTs) abstract the parse tree by eliminating redundant terminals and focusing on essential syntactic constructs, with non-terminals as nodes for operators, expressions, or statements, facilitating semantic analysis and . This shared tree paradigm enables recursive decomposition of complex structures in both domains. Game trees formalize sequential decision-making in and , representing game states as nodes and legal moves as directed edges connecting successor states, typically in perfect-information, two-player zero-sum scenarios like chess. The root node denotes the initial position, with reflecting available actions, and leaf nodes evaluated by terminal payoffs. These trees support algorithms like , which recursively propagate optimal values up the tree by alternating maximization and minimization over opponent responses, enabling strategic foresight in adversarial settings. Originating from foundational work in extensive-form games, they model uncertainty and rationality in competitive interactions.

Representation Methods

Graph-Based Representations

In , trees are represented as undirected acyclic connected , where nodes correspond to vertices and parent-child relationships to edges. Graph-based representations leverage standard graph data structures to model these relationships, facilitating algorithmic operations such as traversal and modification in computational settings. These methods treat the tree as a graph with n vertices and n-1 edges, emphasizing efficiency for sparse structures like . Adjacency lists are a common representation where each vertex maintains a list of its adjacent vertices, typically implemented as an array of linked lists or dynamic arrays. In a tree, this structure points from each node to its children (or parent, if bidirectional), capturing the hierarchical connections without cycles. This approach uses O(n) space for a tree with n nodes, since the total number of edges is n-1, making it efficient for sparse graphs. It supports fast neighbor access in O(degree) time, ideal for tree algorithms like depth-first search. Adjacency matrices represent the tree using an n × n two-dimensional , where the entry at row i and column j indicates an between vertices i and j (e.g., 1 for connected, 0 otherwise). For undirected , the matrix is symmetric. However, this requires O(n²) space, which is inefficient for large due to their sparsity, as most entries remain zero. It is rarely used except for small or when rapid existence checks in O(1) time are prioritized over space. Neighbor enumeration takes O(n) time, scanning the row. Edge lists provide a straightforward representation as a collection of n-1 pairs, each denoting a parent-child (or undirected) edge, often stored in an array or . This uses O(n) space and is simple for or input/output in algorithms. It excels in scenarios requiring edge but is less efficient for vertex-centric operations, as checking adjacency requires scanning the list in O(n) time. Edge lists are particularly useful for constructing trees from external data sources. Incidence matrices offer another graph-theoretic view, with rows for n vertices and columns for n-1 edges; each column has exactly two non-zero entries (1's for undirected trees, indicating endpoints). This O(n²) space structure is suited for undirected trees and supports operations like finding connected components or analyzing edge-vertex incidences. However, its density makes it impractical for large-scale implementations compared to adjacency lists, though it aids in linear algebra-based analysis.
RepresentationSpace ComplexityEdge Check TimeNeighbor Access TimeBest For Trees When
Adjacency ListO(n)O(degree)O(degree)Sparse, large-scale computation
Adjacency MatrixO(n²)O(1)O(n)Small trees, fast queries
Edge ListO(n)O(n)O(n)Serialization, simple construction
Incidence MatrixO(n²)O(n)O(n)Undirected analysis, linear methods

Nested and Indented Formats

Nested and indented formats are textual methods for encoding tree structures, emphasizing compactness and readability for data interchange, configuration files, and documentation. These representations exploit through nesting or spatial alignment via indentation to imply parent-child relationships, avoiding explicit edge definitions found in graph-based approaches. They are particularly valuable in scenarios requiring serializable, human-interpretable hierarchies, such as programming languages, markup standards, and database models. The encodes trees by assigning each two integer values—left and right—that delimit the contiguous interval spanning its entire subtree in a depth-first traversal order. For instance, in a simple with (left=1, right=6), child1 (left=2, right=3), and child2 (left=4, right=5), containment queries like "is A an ancestor of B?" reduce to checking if A's interval fully encompasses B's, enabling efficient operations without recursive joins. This approach, also known as the nested intervals model, originated in for handling hierarchical data and supports fast subtree retrieval, though insertions may require renumbering to maintain interval integrity. Nested parentheses provide an isomorphic, prefix-notation representation of , where the precedes subtrees enclosed in matched pairs of parentheses, recursively mirroring the . A basic example is (root (child1) (child2 (subchild))), which directly corresponds to the tree's branching without additional . This underpins symbolic expression (S-expression) systems in , where trees represent both programs and data through balanced nesting, facilitating parsing and evaluation. It also forms the basis of the Newick standard for phylogenetic trees, allowing compact serialization of evolutionary hierarchies. Indented outlines convey through progressive spacing or tabs, typically augmented with bullets, numbers, or dashes to delineate nodes at each level. For example:
    • Child1
    • Child2
      • Subchild
This text-based method enhances in documents and supports easy editing, commonly employed in and outline processors to model organizational or procedural structures. JSON and XML leverage recursive nesting to serialize trees, with JSON using key-value objects and arrays for nodes and children, while XML employs tagged elements. A JSON tree might appear as:
{
  "name": "root",
  "children": [
    {"name": "child1"},
    {
      "name": "child2",
      "children": [{"name": "subchild"}]
    }
  ]
}
XML equivalents use opening/closing tags for similar , forming the (DOM) as a traversable tree. These formats standardize hierarchical data exchange in web APIs and configurations, with favored for brevity and XML for schema validation.

Visual and Diagrammatic Techniques

Visual and diagrammatic techniques provide graphical representations of tree structures, enabling intuitive comprehension of hierarchical relationships, depths, and node attributes through spatial arrangements. These methods leverage two-dimensional or three-dimensional layouts to depict nodes and their connections, facilitating tasks such as , navigation, and analysis in fields like , information visualization, and data exploration. Unlike linear encodings, these approaches emphasize geometric properties to convey structure, often optimizing for screen space and user interaction. Node-link diagrams represent the most conventional visual form for , where individual nodes are depicted as geometric shapes—typically circles or rectangles—and as straight or curved lines connecting and nodes. In rooted , the layout is commonly oriented top-down, with the at the top and leaves at the bottom, arranged in layers corresponding to depth levels to minimize crossings and highlight ancestry. This approach excels in revealing local and paths but can become cluttered for deep or bushy , as overlaps obscure relationships. Seminal work on layered node-link layouts for includes the Reingold-Tilford algorithm, which computes tidy, non-overlapping drawings by aligning siblings and positioning subtrees compactly. Layered icicle diagrams offer a space-filling , portraying levels as horizontal bars stacked vertically, with each bar subdivided into contiguous sub-bars representing , sized proportionally to attributes like or count. This results in a rectangular resembling hanging , where adjacency encodes sibling relationships and vertical alignment shows depth, making it ideal for dense trees with large fan-outs. Originating from displays, icicle plots prioritize linear ordering and enclosure to avoid the line clutter of node-link methods, though they may distort perceived sizes in unbalanced trees. Enhancements include interactive zooming to reveal deeper levels without losing context. Radial tree layouts arrange the structure in a circular fashion, positioning the at the center and radiating branches outward along concentric rings that correspond to depth levels, with spacing allocated to subtrees based on their size or number of nodes. Edges follow curved or straight paths from to , creating a symmetrical, overview-friendly particularly suited to balanced where uniform expansion prevents peripheral crowding. This supports focus-plus-context , such as animated transitions to recenter on selected nodes, enhancing in large hierarchies. Early implementations drew from principles to compute radial coordinates, ensuring minimal distortion in polar space. Treemaps utilize nested rectangles to fill available completely, partitioning the display area recursively according to values, where each rectangle's area represents a quantitative attribute (e.g., ) and nesting illustrates parent-child containment. The slice-and-dice variant, the original , alternates horizontal and vertical subdivisions to create a grid-like pattern, preserving aspect ratios but potentially yielding elongated shapes in skewed trees. This space-efficient technique supports rapid comparison of magnitudes across the and has been extended with like squarified layouts for more uniform rectangles, improving readability for applications in browsing and . Introduced to address disk challenges, treemaps emphasize enclosure over explicit links, trading path-tracing ease for global overviews.

Operations and Algorithms

Traversal Methods

Traversal methods provide systematic ways to visit all nodes in a tree, which is fundamental for tasks such as searching, evaluating expressions, or serializing the structure without modifying it. These algorithms ensure every node is processed exactly once, typically following either a depth-first or breadth-first strategy. Depth-first approaches prioritize exploring deeper into the tree before backtracking, while breadth-first methods process nodes layer by layer from the root. Both categories support recursive implementations that leverage the tree's recursive nature and iterative versions using stacks or queues to manage the order of visitation. Depth-first traversal (DFT), also known as when applied to trees, begins at the and recursively explores one subtree fully before moving to the next. It can be realized recursively by calling the traversal function on subtrees, or iteratively using an explicit to simulate the stack. The primary variants differ in the order of visiting the root relative to its subtrees: , in-order, and post-order. These are particularly relevant for binary trees, where in-order traversal aligns with the natural ordering of keys in structures like binary search trees. In traversal, the is visited first, followed by the left subtree and then the right subtree. This order is useful for creating a prefix notation or copying the tree structure. The recursive is:
[PREORDER](/page/Pre-order)(node):
    if node is null: return
    visit([node](/page/Node))
    [PREORDER](/page/Pre-order)(node.left)
    [PREORDER](/page/Pre-order)(node.right)
An iterative version uses a to push right and left children in reverse order after visiting the . In-order traversal visits the left subtree, then the , and finally the right subtree, which for binary search trees yields s in sorted order. The recursive form is:
INORDER(node):
    if node is null: return
    INORDER(node.left)
    visit(node)
    INORDER(node.right)
Iterative implementations often employ a to track along the leftmost path, visiting each after exhausting its left subtree. Post-order traversal processes the left subtree, then the right, and visits the last. This is ideal for operations like postfix expression or post-traversal computations such as subtree sizes. Recursively:
POSTORDER(node):
    if node is null: return
    POSTORDER(node.left)
    POSTORDER(node.right)
    visit(node)
Iteratively, it requires and careful tracking of visited children, often using two stacks or a single stack with node states. Breadth-first traversal, or level-order traversal, visits all nodes at the current depth before proceeding to the next, starting from the root. It uses to enqueue children in left-to-right order after dequeuing and visiting a parent. This method is implemented iteratively as follows:
LEVELORDER(root):
    if root is null: return
    queue = new Queue()
    queue.enqueue(root)
    while queue is not empty:
        node = queue.dequeue()
        visit(node)
        if node.left: queue.enqueue(node.left)
        if node.right: queue.enqueue(node.right)
Level-order is particularly useful for finding the shortest in unweighted trees or processing nodes by depth. The extends depth-first traversal by recording a that simulates a "" of the , capturing each traversal in both directions. In a DFS, entry and exit times are assigned to each , creating a linear where subtrees appear as contiguous segments; this enables efficient queries like finding the via range minimum queries on the tour. The technique represents the tree as a balanced over the tour , supporting dynamic updates. It was key in developing fully dynamic algorithms for and other tree properties. All standard traversal methods—pre-order, in-order, post-order, level-order, and Euler tour—achieve time complexity in the worst case, where n is the number of , since each and its edges are examined a constant number of times. Space complexity is O(h) for recursive depth-first methods (h being the ) or in the worst case for iterative versions with explicit stacks or queues.

Modification Operations

Modification operations on tree structures enable dynamic updates to the by adding, removing, or reorganizing while preserving key properties such as acyclicity and, in ordered trees, traversal order. Insertion adds a new node to the tree. In general trees represented with adjacency lists, inserting a leaf node involves appending it to a parent's child list, which takes constant time O(1). In binary search trees (BSTs), insertion begins by traversing from the root following key comparisons to locate the correct leaf position, then attaches the new node there to maintain the sorted in-order sequence; this requires O(h) time, where h is the tree height, yielding O(log n) for balanced trees with n nodes. Inserting an internal node in general trees may require restructuring child pointers but similarly achieves O(1) if the position is known. Deletion removes a specified . For nodes in any , deletion simply unlinks the from its , taking O(1) time. For internal nodes with one , the replaces the directly. In BSTs, deleting an internal with two children involves finding its in-order successor (the minimum in the right subtree), copying that successor's value to the , and then deleting the successor (which has at most one ); this ensures the sorted order is preserved and runs in O(h) time, or O(log n) if balanced. In self-balancing variants like AVL or red-black trees, additional rotations may follow to restore balance factors after deletion. Pruning and handle subtree-level changes. Pruning removes an entire subtree rooted at a by unlinking it from the , which is O(1) and maintains acyclicity since no new edges are added. attaches an existing subtree as a new child to a or internal , also O(1) via pointer updates, provided the attachment point and subtree are disjoint to avoid cycles; this operation is common in representations converting ordered trees to forms, where siblings become right children and first children left children. Rotation restructures local subtrees to balance heights without altering the logical order. In AVL trees, a right on node x with left child y promotes y to x's parent position, attaches x as y's right child, and moves y's former right child (if any) as x's left child:
    y              x
   / \            / \
  x   c    =>    a   y
 / \                / \
a   b              b   c
This reduces x's subtree height by one while preserving in-order traversal. A left is the symmetric on a right-heavy node. Double rotations (left-right or right-left) combine these for imbalances. Rotations take O(1) time and are used post-insertion or deletion in balanced trees like AVL to ensure heights differ by at most one.

Analysis and Complexity

Tree structures, in their basic form, require space proportional to the number of nodes they contain. For a tree with n nodes, the space complexity is O(n) in most representations, as each node typically stores its value, references to child nodes, and possibly a parent pointer, leading to a constant amount of space per node regardless of the tree's shape. The time complexity of fundamental operations such as search, insertion, and deletion in a depends on the tree's height h. These operations traverse a path from the root to a leaf or internal node, resulting in O(h) time complexity in the average and worst cases for a given tree. In the worst case, a skewed tree degenerates to a with h = O(n), yielding O(n) time for these operations; however, in a balanced tree where the height property ensures h = O(\log n), the complexity improves to O(\log n). Tree traversals, including in-order, pre-order, and post-order, visit each node exactly once, achieving O(n) time complexity overall. When implemented recursively, these traversals incur additional space complexity of O(h) due to the recursion stack, which in the worst case (skewed tree) is O(n) but O(\log n) for balanced trees; iterative implementations typically require O(h) space for depth-first traversals using a stack or O(w) for breadth-first using a queue, where h is the height and w the maximum width. For self-balancing trees, such as AVL or red-black trees, operations maintain balance through occasional structural adjustments like rotations, ensuring the height remains O(\log n). reveals that the average time per operation is O(\log n), as the cost of rebalancing is spread across a sequence of operations despite higher costs for individual adjustments.

References

  1. [1]
    Tree Data Structure
    A tree is a collection of nodes connected by directed (or undirected) edges. A tree is a nonlinear data structure, compared to arrays, linked lists, stacks and ...
  2. [2]
    Trees - Computer Science
    Oct 26, 2023 · Trees are the most common non-linear data structure in computer science. Trees are useful in representing things that naturally occur in hierarchies.
  3. [3]
    [PDF] Department of Computer Science Data Structures Tree Definitions
    Sep 15, 2025 · Trees. A non-linear data structure defined recursively as a collection of nodes where each node contains a value and zero or more connected ...
  4. [4]
    [PDF] Trees - UNL School of Computing
    In computer science, trees are useful data structures for searching, indexing, and other applications handling data. Terminology I. Like any plant, we can ...
  5. [5]
    6. Trees
    May 31, 2022 · One of the most important applications of binary trees is the binary tree search algorithm, a method that uses binary trees explicitly to ...
  6. [6]
    [PDF] graph theory: basic definitions and theorems
    Definition 18. A tree is a connected, simple graph that has no cycles. Vertices of degree 1 in a tree are called the leaves of the tree.Missing: formal | Show results with:formal
  7. [7]
    [PDF] 1 Basic Definitions and Concepts in Graph Theory
    A tree is a connected graph with no cycles. A forest is a graph where each connected component is a tree. A node in a forest with degree 1 is called a leaf ...
  8. [8]
    [PPT] CS1102: Data Structures and Algorithms
    A undirected graph that is connected and has no cycle is a tree. A tree with n nodes have exactly n-1 edges. A connected undirected graph with n nodes must have ...<|separator|>
  9. [9]
    [PDF] Chapter 2 Mathematical Preliminaries
    A rooted tree is a tree with one vertex designated as the root. For a directed graph the edges are typically all directed toward the root or away from the root.
  10. [10]
    [PDF] CMSC 420: Lecture 3 Rooted Trees and Binary Trees
    A tree is a special class of graph. Recall from your previous courses that a graph G = (V,E) consists of a finite set of vertices (or nodes) V and a finite set ...
  11. [11]
    Branching Silhouettes—Corals, Cacti, and the Oaks - Oxford Academic
    Mar 11, 2017 · In mathematics, the same term coined by Cayley (1857) refers to a graph theoretical construct composed of n vertices (nodes) and n –1 edges ...
  12. [12]
    [PDF] Graphs - Dartmouth Computer Science
    We adopt the language of family trees—ancestor, descendant, parent, and child—to describe ... A vertex with no children in a rooted tree is called a leaf vertex ...<|control11|><|separator|>
  13. [13]
    16. Trees and their Iterators | CS 2110
    Definition: Parent, Child, Leaf. Within a tree, a parent node has outgoing connections to its child nodes. A leaf node has no outgoing connections. In your ...
  14. [14]
    [PDF] Chapter 11 - Computer Science
    Definition: A rooted tree is a tree in which one vertex has been designated as the root and every edge is directed away from the root. An unrooted tree is ...
  15. [15]
    [PDF] 11 Graphs and Trees - Computer Science
    Apr 5, 2021 · In this section, we'll introduce trees formally—including definitions, properties, algorithms, and applications—as a special type of graph.
  16. [16]
    [PDF] CMSC 420: Lecture 5
    Height & Depth​​ The height of node u is the length of the longest path from u to a leaf. The depth of node u is the length of the path from the root to u. ...Missing: theory degree
  17. [17]
    [PDF] 3.1 Characterizations and Properties of Trees 3.2 Rooted Trees ...
    GRAPH THEORY – LECTURE 4: TREES. Def 2.8. A leaf in a rooted tree is any vertex having no children. Def 2.9. An internal vertex in a rooted tree is any vertex ...
  18. [18]
    [PDF] Module 8: Trees and Graphs - Purdue Computer Science
    An ordered rooted tree is a rooted tree where the children of each internal node are ordered. We usually order the subtrees from left to right. Therefore, for ...
  19. [19]
    [PDF] Graph Theory Fundamentals
    Each node of a directed graph has an in degree (number of incoming edges) and an out degree (number of outgoing edges). 1. 2. 3. 4. 5 !" #$.
  20. [20]
    [PDF] GRAPH THEORY WITH APPLICATIONS
    This book is intended as an introduction to graph theory. Our aim has been to present what we consider to be the basic material, together with a wide.
  21. [21]
    5.5 Trees - Graph Theory
    If G has n−1 edges, which must be the edges of its spanning tree, then G is a tree. Theorem 5.5.10 G is a ...
  22. [22]
    [PDF] Graph Theory and Cayley's Formula - UChicago Math
    Aug 10, 2006 · Cayley's Formula tells how many different trees can be constructed on n vertices, or how many ways to make a tree from a complete graph on n ...
  23. [23]
    7.2. Binary Trees — CS3 Data Structures & Algorithms - OpenDSA
    7. 2.1. Definitions and Properties. A binary tree is made up of a finite set of elements called nodes. This set either is empty or consists of a node called ...
  24. [24]
    [PDF] Full and Complete Binary Trees
    A full binary tree has each node either a leaf or two children. A complete binary tree has all levels full except possibly the last, with the last level nodes ...Missing: perfect | Show results with:perfect<|control11|><|separator|>
  25. [25]
    [PDF] Module 6: Binary Trees - Jackson State University
    What is the minimum possible height of the binary tree? – Min Height (h) = log. 2. (N+1) – 1 = log.
  26. [26]
    Binary Trees - andrew.cmu.ed
    A binary tree is made of nodes, where each node contains a "left" reference, a "right" reference, and a data element.Missing: properties | Show results with:properties
  27. [27]
    [PDF] Query Optimization - Database System Concepts
    The number of binary trees with n nodes is: 1 n + 1. 2n n. This number is known as the Catalan number, and its derivation can be found in any standard ...
  28. [28]
    ICS 46 Spring 2022, Notes and Examples: N-ary and Binary Trees
    A binary tree is an N-ary tree of order 2, in which one of the two subtrees of each internal node is the left subtree and the other is the right subtree.Missing: multi- | Show results with:multi-
  29. [29]
    [PDF] Chapter 12 Trees
    An m- ary tree allows each node to have up to m children. Trees with “fat” nodes with a large bound on the number of children (e.g. 16) occur in some storage.Missing: multi- | Show results with:multi-
  30. [30]
    [PDF] CSE 374 Lecture 15 - Washington
    Trinary trees have branching number of three. For arbitrarily large branching numbers, arrays can make more sense than lists of named pointers. Page 5 ...
  31. [31]
    [PDF] CS 234: Data Types and Structures
    Types of rooted trees. A tree is unordered if there is no order specified on the children of a node, and ordered otherwise. A binary tree is a tree in which ...<|separator|>
  32. [32]
    B-tree
    Definition: A balanced search tree in which every node has between ⌈ m/2⌉ and m children, where m>1 is a fixed integer. m is the order. The root may have as few ...
  33. [33]
    binary tree representation of trees
    The binary tree representation of a multiway tree or k-ary tree is based on first child-next sibling representation of the tree.
  34. [34]
    Symmetric binary B-Trees: Data structure and maintenance algorithms
    Symmetric binary B-Trees: Data structure and maintenance algorithms · Summary · Article PDF · References · Author information · Additional information · Rights and ...
  35. [35]
    [PDF] An algorithm for the organization of information by GM Adel'son-Vel ...
    An algorithm for the organization of information by G. M. Adel'son-Vel'skii and E. M. Landis. Soviet Mathematics Doklady, 3, 1259-1263, 1962.
  36. [36]
    Binary Search Trees and File Organization | ACM Computing Surveys
    PDF. References. [1]. ADELSON-VELsKII, G. M.; AND LANDIS, YE. M. "An algorithm for the organization of information." Dokl. Akad. Naulc SSSR 146 (1962), 263-266 ...Missing: Velsky | Show results with:Velsky
  37. [37]
    [PDF] A dichromatic framework for balanced trees - Robert Sedgewick
    I() this paper we present a uniform framework for the implementation and study of halanced tree algorithms. \Ve show how to imhcd in this.
  38. [38]
    Algorithm 232: Heapsort | Communications of the ACM
    May 2, 2025 · Algorithm 232: Heapsort. Author: J.W.J. Williams. J.W.J. Williams ... Data structures design and analysis · Sorting and searching · Theory ...
  39. [39]
    File searching using variable length keys - ACM Digital Library
    File searching using variable length keys. Author: Rene De La Briandais. Rene ... View or Download as a PDF file. PDF. eReader. View online with eReader ...
  40. [40]
    Trie memory | Communications of the ACM
    R. DE LA BRIANDAIS. File searching using variable length keys. Proceedings, Western Joint Computer Conference, 1959, pp. 295-298. ... E. FREDKIN, Trie memory.Missing: original | Show results with:original
  41. [41]
    13.1. General Trees — CS3 Data Structures & Algorithms - OpenDSA
    If we wanted to model this company with a data structure, it would be natural to think of the president in the root node of a tree, the vice presidents at ...<|separator|>
  42. [42]
    Lecture 8: Hierarchical structures
    Hierarchical structures are tree-like data with composition, where some data is composed of other instances of its own, like an organization with managers and ...
  43. [43]
    [PDF] Directed Acyclic Graphs - Swarthmore College
    Despite the name, a family tree is usually not a tree, since people commonly marry distant cousins, knowingly or unknowingly. However, it is always a DAG, ...
  44. [44]
    6.1: Appendix I- Graph theory - Workforce LibreTexts
    Mar 4, 2023 · In the graph of a family tree, for example, the vertices represent the family members and the edges their relationships (Figure 1). Any part of ...<|separator|>
  45. [45]
    5.1: Linnaean Classification - Biology LibreTexts
    Mar 5, 2021 · The Linnaean system of classification consists of a hierarchy of groupings, called taxa(singular, taxon). Taxa range from the kingdom to the ...
  46. [46]
    There shall be order. The legacy of Linnaeus in the age of molecular ...
    Linnaeus' gift to science was taxonomy: a classification system for the natural world to standardize the naming of species and order them.
  47. [47]
    Reconstructing trees: Cladistics - Understanding Evolution
    Cladistics is a method of hypothesizing relationships among organisms and reconstructing evolutionary trees using data on traits. The result is a tree ...
  48. [48]
    7.7: Phylogeny and Cladistics - Biology LibreTexts
    Sep 24, 2022 · Phylogenetic trees are diagrams used to reflect evolutionary relationships among organisms or groups of organisms.
  49. [49]
    Using a Decision Tree | Principles of Management - Lumen Learning
    A decision tree is a branched flowchart showing multiple pathways for potential decisions and outcomes. The tree starts with what is called a decision node, ...
  50. [50]
    Branching Out: Harnessing the Power of Decision Trees
    Aug 8, 2024 · Constructing a decision tree forces the decision-maker to think through the implications of management decisions and the inherent risk.
  51. [51]
    [PDF] Filesystem Hierarchy Standard - Linux Foundation
    Mar 19, 2015 · This standard consists of a set of requirements and guidelines for file and directory placement under UNIX-like operating systems. The ...
  52. [52]
  53. [53]
    The Art of Computer Programming (TAOCP)
    The Art of Computer Programming (TAOCP). by Donald E. Knuth.
  54. [54]
    ICS 311 #8: Binary Search Trees - AlgoPARC
    A binary search tree (BST) is a binary tree that satisfies the binary search tree property: if y is in the left subtree of x then y.key ≤ x.key. if y is in the ...Missing: source | Show results with:source
  55. [55]
    Understanding phylogenies - Understanding Evolution
    A phylogeny is like a family tree, with the root being the ancestor and tips the descendants. It traces shared ancestry and shows branching at speciation.Missing: cladograms | Show results with:cladograms
  56. [56]
    Reading trees: A quick review - Understanding Evolution
    A phylogeny, or evolutionary tree, represents the evolutionary relationships among a set of organisms or groups of organisms, called taxa (singular: taxon).
  57. [57]
    How Phenograms and Cladograms Became Molecular Phylogenetic ...
    Aug 30, 2024 · Today, cladograms are still used and interpreted as specific types of molecular phylogenetic trees. While phenograms and cladograms represented ...
  58. [58]
    Phylogenetic Inference - Stanford Encyclopedia of Philosophy
    Dec 8, 2021 · Phylogenetics is the study of the evolutionary history and relationships among individuals, groups of organisms (e.g., populations, species, ...
  59. [59]
    Using Hierarchical Clustering and Dendrograms to Quantify ... - NIH
    Dendrograms are tree diagrams that are a graphical representation of a hierarchical clustering of a data set. They are often used in computational biology to ...
  60. [60]
    [PDF] Reading Dendrograms - Wheaton College
    A dendrogram is a branching diagram showing similarity. Clades are branches, leaves are terminal ends. Branch height indicates similarity; read top-down for ...
  61. [61]
    [PDF] Tree Syntax of Natural Language - Cornell: Computer Science
    In linguistics and natural language processing, it is common to attribute labeled tree structures called syntactic trees or parse trees to phrases and ...
  62. [62]
  63. [63]
    [PDF] Introduction to AI Techniques - MIT
    Jun 8, 2009 · We will use graphs with nodes representing game “states” (game position, ... and edges representing a move by a player that moves the game.
  64. [64]
    Game Theory - Stanford Encyclopedia of Philosophy
    Jan 25, 1997 · Game theory is the study of the ways in which interacting choices of economic agents produce outcomes with respect to the preferences (or utilities) of those ...
  65. [65]
    19.1. Graphs Chapter Introduction - OpenDSA
    Both the adjacency matrix and the adjacency list can be used to store directed or undirected graphs. Each edge of an undirected graph connecting Vertices u and ...
  66. [66]
    Representing trees – Clayton Cafiero
    Here we present how to represent a tree, including adjacency list, adjacency matrix, incidence matrix, and simple L/R object-oriented approach. Representing ...
  67. [67]
    [PDF] Lecture 23 Representing Graphs - Carnegie Mellon University in Qatar
    Apr 12, 2022 · Algorithms and Data Structures: We see two basic ways to represent graphs: using adjacency matrices and by means of adjacency lists. ...
  68. [68]
    CS312 Lecture 14: Graph Representations. Graph Traversals
    Adjacency lists can be used to represent both directed and undirected graphs. Of course, in the case of undirected graphs the list will be redundant, since if a ...
  69. [69]
    Data Structures: Lecture 15
    Adjacency Lists. Another representation is to have an array of linked lists. Each linked list contains the vertices adjacent from the corresponding array index.
  70. [70]
    Outlining - Clemson University
    The main ideas should be represented by Roman numerals (I, II, III), subpoints by capital letters (A, B, C), and further details or examples by Arabic numerals ...Missing: origin | Show results with:origin
  71. [71]
    Icicle Plots: Better Displays for Hierarchical Clustering
    An icicle plot is a method for presenting a hierarchical clustering, and their benefits are illustrated using a clustering of 48 objects.Missing: original | Show results with:original
  72. [72]
    [PDF] Animated exploration of dynamic graphs with radial layout
    We describe a new animation technique for supporting interactive exploration of a graph. We use the well- known radial tree layout method, in which the view ...
  73. [73]
    [PDF] Tree-maps: a space-filling approach to the visualization of ... - UMD CS
    Tree-maps map hierarchical information onto a rectangular space in a space-filling manner, providing an overall view of the entire hierarchy.
  74. [74]
    Randomized fully dynamic graph algorithms with polylogarithmic ...
    Abstract. This paper solves a longstanding open problem in fully dynamic algorithms: We present the first fully dynamic algorithms that maintain ...
  75. [75]
    [PDF] Problem Set 4 Solutions
    The normal binary-search-tree insertion algorithm TREE-INSERT always places the new item at a new leaf of tree. Therefore the time to insert an item into a ...
  76. [76]
    Tutorial 12: Binary Search Tree
    Study and complete the code for MaxTreeFinder and MinTreeFinder. These two visitors will be needed in the binary search tree deletion algorithm. BSTDeleter.
  77. [77]
    CS 2100: Data Structures - Department of Computer Science
    Tree Grafting 2. Trees have many applications in computer science. Perhaps the most commonly used trees are rooted binary trees, but there are other types of ...Missing: pruning | Show results with:pruning
  78. [78]
    [PDF] The AVL Tree Rotations Tutorial - UF CISE
    A double right rotation, or right-left rotation, or simply RL, is a rotation that must be performed when attempting to balance a tree which has a left subtree, ...
  79. [79]
    Tree Algorithms :: CC 310 Textbook - Textbooks
    When examining the performance of an algorithm we will look at the time and the space that it will require. Time: To analyze the time of an algorithm, we ...
  80. [80]
    [PDF] Binary Search Trees (BST)
    The main goal is to have insertions and deletions that: – Are efficient (at most logarithmic time). – Leave the tree balanced, to support efficient search ...
  81. [81]
    [PDF] CS 310: Recursion and Tree Traversals - GMU CS Department
    ▷ Tree Traversals. ▷ Recursive traversals. ▷ Recursion practice for ... ▷ Time complexity? ▷ Space complexity? ▷ What makes this so easy? Page 21 ...
  82. [82]
    [PDF] Amortized Computational Complexity - cs.Princeton
    A powerful technique in the complexity analysis of data structures is amortization, or averaging over time. Amortized running time is a realistic but robust ...