Fact-checked by Grok 2 weeks ago

Bisection bandwidth

Bisection bandwidth is a fundamental metric in computer networking and parallel computing that measures the aggregate capacity of the minimum set of links required to partition a network into two equal-sized subsets of nodes, representing the total bandwidth available for data transfer across that cut. This concept quantifies a network's ability to handle balanced communication loads between the two halves, serving as a key indicator of potential bottlenecks, scalability, and overall performance in large-scale systems such as high-performance computing clusters and data centers.^[1]^[2] The calculation of bisection bandwidth involves identifying the bipartition of the network that minimizes the crossing bandwidth—typically the product of the number of such links (bisection width) and their individual capacities—and summing those values to yield the total. Unlike bisection width, which counts only the links, bisection bandwidth accounts for varying link speeds, making it a more precise gauge of throughput under worst-case traffic scenarios, such as all-to-all permutations where half the nodes communicate with the other half. It provides a theoretical lower bound on sustainable aggregate bandwidth, influencing design choices in interconnection networks to avoid congestion in distributed memory systems.^[3]^[2] In practice, bisection bandwidth varies significantly across topologies: a fully connected network achieves the maximum of N^2/4 bidirectional links for N nodes, while a bus is constrained to one link; multistage networks like fat trees offer scalable alternatives with bisection bandwidth that decreases with size but remains efficient for commodity clusters; and toroidal meshes, such as the 3D torus in the IBM Blue Gene/L supercomputer with 65,536 nodes arranged in a 64 × 32 × 32 topology, provide 350 MB/s aggregate bandwidth per bidirectional link, supporting high node counts through virtual channels and cut-through routing. High bisection bandwidth is essential for modern applications, as seen in systems like the Frontier supercomputer, where it reaches 540 TB/s to enable exascale performance without undue latency.^[1]^[3]^[2]

Fundamentals

Definition

Bisection bandwidth is a fundamental metric in computer network topology that quantifies the total communication capacity across the minimum cut dividing a network into two partitions of roughly equal size, typically measured as the aggregate data rate that can be sustained between these partitions under full link utilization.^[4] This minimum cut represents the narrowest boundary separating the network into halves with approximately half the nodes on each side, thereby capturing the bottleneck throughput for inter-partition traffic.^[5] The concept breaks down into core elements: "bisection" describes the balanced partitioning of nodes into two equal or near-equal groups to simulate worst-case data exchange scenarios; "width" refers to the count of edges or links traversing this cut; and "bandwidth" extends this by factoring in the transmission capacity of each link, yielding a measure of overall transfer rate rather than mere connectivity.^[6] For instance, in networks with uniform link speeds, the bisection bandwidth scales directly with the number of crossing links, emphasizing scalable designs that maximize this value to support efficient collective operations.^[4] It is distinct from bisection width, which solely tallies the minimum number of links in the cut without regard to their capacities, whereas bisection bandwidth integrates these capacities—often expressed in bits per second (e.g., Gbps)—to provide a more practical assessment of sustained performance.^[5] The basic relation is given by
\text{Bisection Bandwidth} = (\text{Minimum Bisection Width}) \times (\text{Bandwidth per Link}),
where the per-link bandwidth accounts for directional capacities in bidirectional networks.^[4] This formulation highlights bisection bandwidth's role as an upper bound on aggregate traffic, such as in all-to-all communication patterns.^[6]

Historical Context

The concept of bisection bandwidth emerged in the 1980s amid the rapid development of parallel computing architectures, where researchers analyzed interconnection networks to assess their scalability and communication efficiency in multiprocessor systems. Early investigations focused on hypercube designs, which offered a bisection width of N/2 for N nodes, enabling balanced data transfer across network partitions; such designs influenced foundational projects like the hypercube-based Caltech Cosmic Cube (1985).^[7]^[8] This metric quickly became central to evaluating how well networks could handle collective communication patterns without bottlenecks, as low bisection bandwidth could severely limit overall system performance in distributed workloads.^[7] A pivotal milestone occurred in 1985 with Charles E. Leiserson's introduction of fat-tree topologies, which achieved constant or O(N) bisection bandwidth, addressing the limitations of conventional tree structures that suffered from diminishing inter-group communication capacity as systems scaled. This work, part of broader theoretical models for hardware-efficient supercomputing, emphasized bisection bandwidth as a key indicator of network robustness for parallel processing. Concurrently, the Connection Machine project by Thinking Machines Corporation in the mid-1980s incorporated hypercube-based interconnects in models like the CM-1 and CM-2, where bisection analysis guided designs to support massive parallelism in SIMD architectures, paving the way for practical implementations in early supercomputers.^[9]^[10] By the 1990s, bisection bandwidth had transitioned from academic theory to a standard benchmark in supercomputing evaluations, and in analyses of systems appearing on the TOP500 list starting from its inception in 1993, where it helps quantify network contributions to overall system balance alongside computational metrics like Linpack performance. Theoretical papers from this era, such as those on k-ary n-cube networks, further refined scalability models using bisection constraints to predict latency and throughput under varying wire bisections.^[11]^[12] In the post-2000 era, the metric evolved into a practical design criterion for data center networks, driven by the need for high-throughput clusters supporting cloud computing and big data. Fat-tree topologies, originally proposed for supercomputers, were adapted to provide full bisection bandwidth in commodity hardware, as demonstrated in scalable architectures that minimized oversubscription and maximized aggregate throughput between server partitions. This shift marked bisection bandwidth's role in bridging theoretical parallel computing with modern distributed systems.^[13]

Calculation

Principles of Bisection

The principles of bisection in interconnection networks are rooted in graph theory, where the network is modeled as an undirected graph with nodes representing endpoints such as processors or switches, and edges representing links weighted by their bandwidth capacities. The bisection bandwidth is computed as the value of the minimum cut that partitions the graph into two subsets of nodes with sizes as equal as possible, specifically where one subset has size floor(N/2) and the other ceil(N/2) for a total of N nodes. This min-cut approach identifies the narrowest bandwidth across balanced divisions, providing a measure of the network's aggregate communication capacity under worst-case partitioning scenarios.^[14]^[1] Under typical assumptions of uniform link capacities throughout the network, the calculation simplifies by focusing on the bisection width—the minimum number of edges crossing any balanced partition—multiplied by the per-link bandwidth. These assumptions hold for many regular topologies but may require adjustment for heterogeneous capacities, where edge weights are summed directly. The balanced partition requirement ensures relevance to communication patterns where approximately half the nodes exchange data with the other half, avoiding skewed cuts that overestimate capacity.^[14]^[15] The step-by-step process begins with enumerating or algorithmically identifying all possible balanced partitions of the node set. For each partition, the number of crossing links (or their total weighted bandwidth) is calculated, and the minimum value across all such partitions yields the bisection width or bandwidth. In practice, this involves graph partitioning techniques to approximate the min-cut under the balance constraint, especially for large-scale networks where exhaustive search is infeasible. The final bisection bandwidth is then the bisection width multiplied by the uniform link bandwidth, establishing a theoretical limit on inter-partition throughput.^[14]^[1] Normalization of bisection bandwidth is commonly applied to enable comparisons across networks of varying sizes, such as expressing it per node (divided by N) or as a fraction of the total aggregate bandwidth. A network exhibits full bisection bandwidth when this metric reaches N/2 times the link bandwidth, signifying optimal balance with no reduction in capacity due to topological constraints. Such normalizations highlight scalability and efficiency in design evaluations.^[14]

Examples in Common Topologies

In hypercube topologies, the bisection bandwidth is calculated by partitioning the network into two equal-sized sets of nodes that differ in exactly one bit position. For an n-dimensional hypercube with $2^n nodes connected by unit bandwidth links, this partition cuts along one dimension, severing $2^{n-1} links, yielding a bisection width of $2^{n-1} and thus a bisection bandwidth of $2^{n-1} \times link bandwidth.^[15] For example, a 3-dimensional hypercube with 8 nodes has a bisection bandwidth of $4 \times link bandwidth, as partitioning nodes based on the third bit position crosses 4 edges. This minimum arises because any balanced bipartition must separate at least half the nodes connected via each dimension, and the single-dimension cut achieves the tight bound.^[15] Fat-tree topologies achieve high bisection bandwidth through graduated link capacities that increase toward the root, avoiding bottlenecks in standard trees. In a full k-ary fat-tree with N = k^3 nodes and unit bandwidth at the leaves, the design ensures a bisection bandwidth of (N/2) \times link bandwidth by allocating multiple parallel paths across the middle level, equivalent to a non-blocking crossbar.^[9] This is computed by bisecting at the core switches, where k^2 uplinks from edge groups provide aggregate capacity for half the nodes to communicate fully; non-full versions, with uniform bandwidth, reduce this to O(\sqrt{N}) by underprovisioning higher levels.^[9] The partitioning minimizes the cut by balancing load across the tree's recursive structure, ensuring the minimum over all balanced divisions matches the ideal. In mesh and torus topologies, bisection involves a straight cut along the middle of one dimension, revealing limited inter-partition connectivity. For a 2D mesh with \sqrt{N} \times \sqrt{N} nodes and unit bandwidth links, the bisection width is \sqrt{N}, so the bisection bandwidth is \sqrt{N} \times link bandwidth, as a vertical cut through the central column severs only \sqrt{N} edges.^[15] This minimum holds because alternative partitions, such as diagonal cuts, cross more links but the narrowest balanced division is the linear cut, highlighting poor scalability as bandwidth grows only as O(\sqrt{N}). Torus variants improve slightly via wraparound links, doubling the width in that dimension to $2\sqrt{N}, but retain similar sublinear scaling.^[15]

Significance

Role in Network Performance

Bisection bandwidth serves as a key indicator of network scalability by establishing an upper bound on the maximum throughput achievable for collective operations that involve data exchange across network partitions, such as all-to-all and all-reduce. In these operations, where every node communicates with every other node or aggregates data globally, the bisection bandwidth limits the aggregate data transfer rate between the two halves of the network, preventing the system from scaling linearly with the number of nodes. For instance, performance models for MPI implementations demonstrate that in networks with full effective bisection bandwidth, the communication time for all-to-all operations is bounded by factors incorporating the network's gap parameter and congestion terms, ensuring predictable scalability for large-scale parallel computations.^[16] Low bisection bandwidth acts as a signal for potential bottlenecks, particularly in balanced workloads where traffic is evenly distributed, leading to congestion at the minimal cut between partitions. This is especially evident in non-oversubscribed networks, where achieving full bisection bandwidth—equivalent to (N/2) times the per-link bandwidth for N nodes—allows for efficient, contention-free communication across the network halves, minimizing latency and maximizing utilization. In contrast, insufficient bisection bandwidth results in high contention during global data transfers, forcing algorithm redesigns to reduce communication intensity and thereby compromising overall system performance.^[17]^[18] The metric is particularly critical for communication patterns involving random traffic or shuffle exchanges, such as those in machine learning training or parallel sorting, where data must cross partitions unpredictably. For all-to-all operations, the sustained bandwidth per node is fundamentally limited by the bisection bandwidth divided by (N/2), reflecting the inherent locality that requires only half the full bisection capacity due to symmetric sending and receiving patterns. Optimized scheduling algorithms can achieve near this bound, enabling high-performance exchanges even in cost-effective topologies.^[19] Unlike total aggregate bandwidth, which measures the sum of all link capacities and can appear high in oversubscribed designs, bisection bandwidth reveals whether the network provides balanced performance under worst-case partitioning, highlighting that excessive aggregate capacity alone does not ensure equitable data flow or prevent hotspots. This distinction is vital for evaluating overall network capability, as topologies with high aggregate but low bisection bandwidth may underperform in partition-crossing workloads despite their raw throughput potential.^[17]

Applications in Computing Systems

In high-performance computing (HPC) systems, bisection bandwidth serves as a critical metric for evaluating interconnect performance in supercomputers listed on the TOP500, where InfiniBand networks often employ fat-tree topologies to achieve non-blocking communication and full bisection bandwidth, enabling efficient scaling for large-scale simulations.^[20] For instance, the Summit supercomputer utilizes a Mellanox InfiniBand EDR network in a non-blocking fat-tree configuration, delivering 115 TB/s of bisection bandwidth across its nodes to support collective operations like those in the Message Passing Interface (MPI).^[21] Similarly, exascale machines such as Frontier (deployed in 2022) incorporate high bisection bandwidth—540 TB/s via the Slingshot-11 interconnect in a dragonfly topology—to facilitate MPI collectives and all-to-all communication patterns essential for scientific workloads.^[22] In data center environments, bisection bandwidth quantifies east-west traffic capacity, guiding the design of cloud-scale networks to handle intra-cluster data flows between servers. Google's Jupiter fabric, a Clos-based topology, exemplifies this by scaling to 13 Pb/s of bisection bandwidth in its fifth-generation implementation, supporting dynamic workloads across over 100,000 servers and optimizing for high-volume internal traffic patterns.^[23] In spine-leaf architectures common to these data centers, oversubscription ratios derived from bisection bandwidth assessments—such as 3:1, where aggregate server uplink bandwidth exceeds core capacity—balance cost and performance, ensuring moderate loads (up to 60% of available bisection) do not degrade throughput while allowing efficient resource utilization.^[24]^[25] For parallel processing in GPU clusters, bisection bandwidth ensures sufficient interconnect capacity for model parallelism during AI training, where data must flow rapidly between accelerators. NVIDIA's DGX-2 system leverages NVLink and NVSwitch to provide 2.4 TB/s of bisection bandwidth across 16 V100 GPUs, enabling unrestricted model parallelism for deep learning tasks involving large datasets and complex neural networks.^[26] This high intra-node bandwidth reduces communication bottlenecks in distributed training, as demonstrated in large language model implementations where effective bisection supports pipeline-parallel communication at scales of thousands of GPUs.^[27] Bisection bandwidth also plays a key role in HPC benchmarks, where insufficient capacity can limit scaling efficiency and overall system performance. In the OSU Bandwidth (OSU BW) benchmark, which measures MPI point-to-point and collective communication throughput, low bisection bandwidth reveals network contention, directly impacting results in fat-tree or Clos topologies by capping achievable aggregate bandwidth during all-to-all tests.^[28] Similarly, the High Performance Conjugate Gradient (HPCG) benchmark, which stresses sparse matrix operations and irregular communication, experiences reduced efficiency on systems with suboptimal bisection, as seen in TOP500 analyses where network balance (injection-to-bisection ratio) correlates with HPCG scores below 5% of peak FLOPS.^[29]^[30]

Limitations

Shortcomings

Bisection bandwidth assumes a worst-case balanced partition of the network into two equal halves, minimizing the aggregate capacity across the cut; however, this metric is insensitive to the specific partition choices that align with actual workloads, often overlooking uneven traffic distributions where communication patterns do not uniformly span the network.^[31] In real data center scenarios, traffic may concentrate on particular subsets of nodes, rendering the minimum cut an overly pessimistic or irrelevant bound that fails to capture achievable throughput under typical demands.^[32] For instance, evaluations of topologies like Jellyfish show that bisection bandwidth ranks them favorably against Clos networks, but actual flow-based throughput reveals significant discrepancies due to partition-specific bottlenecks not accounted for in the static cut analysis.^[31] As a static graph-theoretic measure, bisection bandwidth oversimplifies network performance by ignoring critical dynamic factors such as latency, routing algorithms, and non-uniform link speeds, which can drastically alter effective bandwidth in operational settings.^[33] It treats the network as a fixed entity without considering adaptive routing or load balancing, leading to misleading assessments where high bisection values do not translate to low-latency communication under congested conditions.^[32] This limitation is particularly evident in modern architectures, where processor latencies (e.g., 800 ns for inter-node interconnects) and concurrency constraints make exploiting full bisection bandwidth impractical without workload-specific optimizations.^[33] Computing the exact bisection bandwidth is computationally challenging, as determining the minimum balanced cut in general graphs is NP-hard, necessitating approximations that introduce further inaccuracies for large-scale networks.^[34] Practical tools often rely on heuristics like spectral partitioning, but these can deviate substantially from the true minimum, especially in complex topologies with thousands of nodes, complicating precise design evaluations.^[35] In contemporary data centers, bisection bandwidth mismatches hierarchical or bursty traffic patterns prevalent in AI training workloads, where all-to-all or permutation-based communications dominate over uniform bisections assumed by the metric.^[32] Traditional high-performance computing (HPC) environments benefited from its simplicity for collective operations, but 2020s AI systems exhibit irregular, scale-out demands that expose its inadequacies.^[31] This has led to a shift toward flow-based metrics in design, as bisection alone fails to predict performance in bursty, non-uniform scenarios.^[32] In network topology analysis, several metrics complement bisection bandwidth by addressing different aspects of performance, such as latency, total capacity, expansion properties, and real-world throughput under traffic loads. These indicators provide a more holistic evaluation of interconnection networks, revealing limitations that static bandwidth measures like bisection alone cannot capture.^[36] The diameter of a network, defined as the longest shortest path between any two nodes, quantifies the maximum latency in terms of hops, contrasting with bisection bandwidth's focus on aggregate cut capacity. For instance, in an n-dimensional hypercube topology, the diameter is exactly n, indicating that messages may traverse up to n links, which impacts communication delay in parallel computing systems. This metric is particularly useful for assessing worst-case propagation delays in low-latency applications, such as real-time simulations.^[37]^[38] Aggregate bandwidth measures the total link capacity across the entire network, offering a high-level view of overall throughput potential but failing to identify internal bottlenecks that bisection bandwidth highlights. Unlike bisection, which targets the minimum cut dividing the network evenly, aggregate bandwidth sums all edge capacities without considering partitioning, potentially overestimating usable performance in unbalanced traffic scenarios. In fat-tree topologies, for example, aggregate bandwidth scales with the number of ports, yet practical utilization often falls short due to oversubscription at higher levels.^[36]^[4] The expansion ratio, closely related to the isoperimetric number, generalizes bisection by evaluating the minimum edge boundary relative to subset sizes, making it suitable for analyzing connectivity in uneven partitions or small clusters. The isoperimetric number is formally the infimum over all nonempty proper subsets S of vertices of the edge boundary |∂S| divided by min(|S|, |V| - |S|), where V is the vertex set; a higher value indicates better expansion properties, as seen in expander graphs used for fault-tolerant networks. This metric extends bisection's even-cut focus to arbitrary subsets, aiding in the design of scalable data center fabrics where localized traffic dominates.^[39]^[40] Throughput under load introduces dynamic considerations, such as effective bisection bandwidth (EBB), which measures achievable aggregate transfer rates across bisections via simulations or benchmarks, accounting for routing inefficiencies and contention absent in static bisection calculations. In studies of data center topologies, EBB often achieves only 55-60% of nominal bisection bandwidth under application-like traffic patterns, as demonstrated in multistage switch evaluations. Tools like Netgauge compute EBB by averaging bandwidth over random bisection patterns, providing insights into real-world performance; for example, a 2014 analysis showed throughput closely correlating with sparsest-cut metrics but deviating up to 56% from pure bisection in irregular networks.^[41]^[42]^[43]

References

[1]
[PDF] Interconnection Networks - MIT
Aggregate bandwith refers to the total data bandwidth supplied by the network, and effective bandwith or throughput is the fraction of aggregate bandwidth ...<|control11|><|separator|>
[2]
Bisection Bandwidth - an overview | ScienceDirect Topics
Bisection bandwidth refers to the collective bandwidth over the minimum number of wires that must be cut when a network is divided into two equal sets of ...Introduction · Bisection Bandwidth Across... · Impact of Bisection Bandwidth...
[3]
[PDF] Bisection Bandwidth
Bisection width = minimum number of links cut to. divide the network into two halves. ● Bisection bandwidth = bandwidth of above. ●
[4]
[PDF] arXiv:1202.6291v1 [cs.NI] 28 Feb 2012
Feb 28, 2012 · The bisection width of interconnection networks has always been im- portant in parallel computing, since it bounds the amount of information ...
[5]
[PDF] CS 514: Computer Networks Lecture 13: Introduction to Datacenter ...
Bisection Bandwidth. Another equivalent definition of bisection bandwidth: ○ Partition the network into two halves. ○ All hosts in one half want to ...
[6]
[PDF] CS470 Networks - Computer Science - JMU
– Bisection width: number of simultaneous communications. between the two partitions. – Bisection bandwidth: total data rate between the partitions.
[7]
[PDF] Algorithmic Techniques For Regular Networks of Processors
The bisection bandwidth of the network corresponds to the minimum number of communication links that need to be removed (or cut) in order to partition the ...Missing: origins | Show results with:origins
[8]
[PDF] Fat-Trees: Universal Networks for Hardware-Efficient Supercomputing
CHARLES E. LEISERSON, MEMBER, IEEE. Abstract-This paper presents a new class of universal routing networks called fat-trees, which might be used to ...
[9]
[PDF] The Network Architecture of the Connection Machine CM-5
Aug 10, 1989 · Each processor has 2 connections to the data network, corresponding to a raw bandwidth of 40 megabytes/second in and out of each processing node ...
[10]
TOP500 AND HPCC BENCHMARKS - WHAT THEY CAN ... - HPCwire
Jun 25, 2004 · Actually, it measures operand bisection bandwidth, which is the minimum of hardware interconnection-network bisection bandwidth and application ...
[11]
[PDF] Performance Analysis of K-Ary N-Cube Interconnection Networks.
This paper analyzes communication networks of varying dimension under the assumption of constant wire bisection. Expressions for the latency, average case.Missing: original | Show results with:original
[12]
[PDF] A Scalable, Commodity Data Center Network Architecture
Aug 22, 2008 · The ideal bisection bandwidth for the fat-tree network is 1.536Gbps. • Staggered Prob (SubnetP, P odP): Where a host will send to another ...
[13]
[PDF] Principles and Practices of Interconnection Networks
The need for this book has grown with the increasing impact of interconnects on computer system performance and cost.
[14]
[PDF] Chapter 2. Parallel Architectures and Interconnection Networks
The bisection width is low, which means it is poor. It is possible to split the tree into two sets differing by at most one node in size by deleting either ...
[15]
[PDF] Toward Performance Models of MPI Implementations for ...
An all-to-all com- munication could be expressed as Ta2a = α(P)+S ·β(P). In a network with full effective bisection bandwidth, one could set β(P) = G. Startup ...
[16]
https://web.cels.anl.gov/~thakur/papers/mpi-model.pdf
[17]
[PDF] Impact of Bisection Bandwidth Constraints on Software ... - DTIC
However, bisection bandwidth is related to connectivity as both depend on network topology; i.e. networks with a greater ratio of links/processor will have ...
[18]
[PDF] Bandwidth-optimal All-to-all Exchanges in Fat Tree Networks
Jun 10, 2013 · In this paper, we show that all-to-all only requires half bisection bandwidth due to its inherent locality. We then demonstrate how to ...
[19]
[PDF] InfiniBand Clustering - Networking
InfiniBand clustering uses Clos networks, also known as 'Fat Tree' or CBB networks, which maintain identical bandwidth at each level.
[20]
[PDF] Summit and Frontier at the Oak Ridge Leadership Computing Facility
Jul 29, 2019 · • Mellanox EDR Network with non-blocking fat-tree topology. – Bisection bandwidth 115 TB/s. – 2 physical ports per node (4 virtual) at 25 GB/s.
[21]
[PDF] Report on the Oak Ridge National Laboratory's Frontier System
May 30, 2022 · The interconnect provides a bisection bandwidth of 540 TB/s. Each compute node has four. Cassini NICs, each providing 200 Gb/s (25 GB/s) for ...
[22]
Jupiter now scales to 13 Petabits per second | Google Cloud Blog
Oct 30, 2024 · Our current fifth-generation Jupiter data center network architecture, which now scales to 13 Petabits/sec of bisectional bandwidth.
[23]
Cisco Massively Scalable Data Center Network Fabric Design and ...
Therefore, the spine-to-super-spine oversubscription ratio is 3:1 (48x100G: 16x100G). Or 1:1.3 undersubscribed if using 16x400G instead. The super-spine ...Missing: bisection | Show results with:bisection
[24]
[PDF] On the Data Path Performance of Leaf-Spine Datacenter Fabrics
2) Oversubscription does not negatively impact perfor- mance for moderate levels of load (e.g., up to 60% of the available bisection bandwidth), but it makes ...
[25]
DGX-2 : AI Servers for Solving Complex AI Challenges | NVIDIA
Revolutionary AI Network Fabric. The NVSwitch networking fabric enables greater model parallelism with 2.4 terabytes per second (TB/s) of bisection bandwidth, ...
[26]
[PDF] Efficient Large-Scale Language Model Training on GPU Clusters ...
When training a trillion-parameter model on 3072 GPUs, our implementation used an effective bisection bandwidth of 892. GB/s for pipeline-parallel communication ...
[27]
[PDF] Benchmarks Overview SCC23 - SC23
https://www.hpcg-benchmark.org/software/. ○. 9. Page 10. Building and ... The OSU benchmarks measure performance of various MPI operations. ○. How good ...
[28]
[PDF] Performance modeling of the HPCG benchmark
In this paper, we present a model capable of predicting the performance of HPCG for a given architecture within 3% of the real value, based on two simple ...
[29]
An Analysis of System Balance and Architectural Trends Based on ...
On average, the bisection bandwidth is 32% of the total injection bandwidth for the 20 systems. However, three supercomputers, i.e., Tsubame-2.0 (ratio of 1.2, ...
[30]
[PDF] Measuring Throughput of Data Center Network Topologies
What are the shortcomings of existing cut-based through- put metrics such as bisection bandwidth and sparsest- cut? Why is a flow-based metric more accurate ...
[31]
[PDF] A Throughput-Centric View of the Performance of Datacenter ...
Aug 28, 2021 · We find that, for these topologies, the bisection-bandwidth based over-subscription ratio overestimates the throughput by up to 50%. Thus, a ...
[32]
Why don't we talk about bisection bandwidth any more? - UT Blogs
Apr 11, 2022 · Bisection bandwidth was important in the late 90s/early 2000s, but after 10 years of effort, the lack of revolutionary results led to a lack of ...Missing: Machine | Show results with:Machine
[33]
[PDF] A polylogarithmic approximation of the minimum bisection∗
Oct 24, 2001 · Minimum bisection is the problem of computing b for an input graph G. This problem is NP-hard, see [GJS76], and we address the problem of ...
[34]
CS267: Notes for Lecture 23, April 9, 1999 - People @EECS
Apr 9, 1999 · Experiments report a 10x speedup over the basic spectral bisection algorithm. Performance Comparison of Different Partitioning Algorithms.<|control11|><|separator|>
[35]
[PDF] Interconnection Networks: Topology
– Channel bisection → min. channel count over all bisections. – Bisection bandwidth → min. bandwidth over all bisections. • With uniform traffic. – ½ of ...
[36]
[PDF] GRAPHS AND INTERCONNECTION NETWORKS - Hal-Inria
The object of this paper is to survey the results concerning diameter and connectivity in graphs and hypergraphs, in particular those of some importance for ...Missing: seminal | Show results with:seminal<|separator|>
[37]
[PDF] Diameters Of Communication Networks - UCSD Math
In the graph model for communication networks, the diameter of the graph corresponds to the maximum number of edges over which a message between two nodes must ...
[38]
[PDF] Isoperimetric problems - Fan Chung Graham
The basic isoperimetric problem for graphs is essentially the same. Namely, remove as little of the graph as possible to separate out a subset of vertices of ...
[39]
[PDF] Allow Me to Introduce Spectral and Isoperimetric Graph Partitioning
The minimum isoperimetric ratio among all cuts of G is called the isoperimetric number of G. Page 8. 4. Jonathan Richard Shewchuk. If every vertex has mass 1, ...
[40]
Netgauge - Effective Bisection Bandwidth Measurement
The relative effective bisection bandwidth can be determined by repeating the experiment with 2 processes. This resulted in 379 MiB/s in our run, thus, the ...
[41]
Multistage switches are not crossbars: Effects of static routing in high ...
Networks with full bisection bandwidth typically provided effective bisection bandwidth in the range of 55-60%. Simulations with application-based patterns ...
[42]
[PDF] Measuring and Understanding Throughput of Network Topologies
This paper develops a framework to benchmark network topology throughput, using synthetic and real traffic matrices, and measuring worst-case throughput.