
BeeGFS

BeeGFS is a parallel cluster file system designed for high-performance computing (HPC), artificial intelligence (AI), and machine learning (ML) environments, delivering scalable, high-throughput access to distributed file storage across multiple servers. Originally developed in 2005 by the Fraunhofer Institute for Industrial Mathematics (ITWM) in Kaiserslautern, Germany, as FhGFS (Fraunhofer File System), it was renamed BeeGFS in 2014 to reflect its broader applicability beyond Fraunhofer's internal use. The system entered productive installations in 2007, saw its first commercial deployment in 2009, and has since evolved through contributions to exascale projects such as DEEP-ER, EXANODE, and EXANEST. Today, BeeGFS is maintained by ThinkParQ, a spin-off from Fraunhofer ITWM, under a source-available model that includes both a self-supported Community Edition and a fully supported Enterprise Edition with high-availability features. At its core, BeeGFS employs a user-space server architecture built over standard local filesystems such as ext4, XFS, or ZFS, utilizing lightweight service daemons to maximize hardware performance and bandwidth without kernel modifications on the server side. Key components include distributed metadata servers that handle namespace operations across multiple nodes to reduce latency, and storage targets that stripe data over numerous disks and servers for parallel access at network wire speeds. This design enables seamless scalability by adding servers and disks, supporting clusters from dozens to thousands of nodes, and is hardware-independent, compatible with x86_64, ARM, OpenPOWER, and other architectures. BeeGFS is widely adopted in sectors such as life sciences, oil and gas, and media, powering supercomputers and earning recognition such as the 2024 HPCwire Readers' Choice Award for Best HPC Storage Product or Technology.

Overview

Definition and Purpose

BeeGFS is a source-available parallel cluster file system designed for high-performance computing (HPC) environments, specifically to manage large-scale data storage and I/O-intensive workloads. It originated as the Fraunhofer Gesellschaft File System (FhGFS) and has evolved into a widely adopted solution for cluster-based storage needs. It is currently maintained by ThinkParQ, a spin-off from the Fraunhofer Institute for Industrial Mathematics. The primary purpose of BeeGFS is to deliver scalable, high-throughput access to shared files across distributed clusters by striping file contents over multiple servers, which facilitates parallel I/O operations from numerous clients simultaneously. This ensures efficient handling of concurrent read and write demands without bottlenecks, making it well suited to resource-intensive applications. BeeGFS maintains POSIX compliance while incorporating HPC-specific extensions that optimize performance for tasks such as scientific simulations, machine learning training, and data analytics, allowing standard applications to leverage its capabilities without modification.

Key Characteristics

The storage and metadata services of BeeGFS operate in user space, utilizing lightweight, high-performance daemons that run atop standard local file systems such as ext4, XFS, and ZFS, while the client is implemented as a Linux kernel module. This design minimizes dependencies for the server components, simplifies deployment across diverse environments, and leverages existing POSIX-compliant storage without requiring specialized kernel modules for the server services. The architecture emphasizes modularity, permitting independent scaling of metadata and storage services; metadata can be distributed across multiple dedicated servers for enhanced reliability and performance, while storage targets can be expanded by adding servers or disks as needed. This separation allows administrators to optimize based on workload demands, supporting seamless growth from small clusters to large-scale systems with thousands of nodes. BeeGFS provides native integration with multi-rail networking and RDMA protocols, including InfiniBand, RoCE, and Omni-Path, via the OpenFabrics Enterprise Distribution (OFED) ibverbs API. These capabilities enable low-latency, high-bandwidth I/O by allowing direct memory transfers between nodes, with configurable multi-rail support for clients equipped with multiple RDMA network interface cards to balance load and maximize throughput. Adopting a hardware-agnostic stance, BeeGFS accommodates diverse storage media ranging from traditional HDDs to high-speed NVMe SSDs, relying on underlying local file systems rather than custom drivers for compatibility. This flexibility extends to various platforms, including x86_64, ARM, and OpenPOWER architectures, without imposing strict hardware constraints. The BeeGFS client module is distributed under the GPLv2, while the remaining components are available under a source-available license, promoting community contributions and broad accessibility. For production environments requiring advanced features and professional support, the BeeGFS Enterprise (Hive) edition offers a licensed version with additional capabilities under a commercial support contract.
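Only the client requires a kernel module; the server daemons are ordinary user-space processes, which can be confirmed on a running system with standard tools. The sketch below assumes the stock systemd unit names and is purely illustrative.

    # On a compute node: the BeeGFS client is a loadable kernel module
    lsmod | grep beegfs

    # On server nodes: metadata and storage services run as user-space daemons
    systemctl status beegfs-meta beegfs-storage
    ps -eo comm | grep beegfs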

Architecture

Core Components

BeeGFS operates through a distributed architecture comprising several key services that enable high-performance parallel file access. These core components include the management service, metadata servers, storage targets, and client modules, which communicate via efficient remote procedure calls (RPC). This allows for scalability and fault tolerance in large-scale environments. The management service is a lightweight daemon that handles service registration, configuration, and monitoring. It maintains a registry of all BeeGFS services, tracks their states, and stores configuration data in a lightweight database, using minimal resources because it does not participate in the file I/O path. Graphical tools, such as those provided by the BeeGFS Administration and Monitoring System (BeeGFS-admon), enable GUI-based oversight of the cluster, including dynamic addition of nodes and targets. Metadata servers handle file and directory operations, such as lookups, permissions, and attribute management, while coordinating data placement and striping across storage resources. Metadata is dynamically distributed across multiple servers, typically on a per-directory basis, to ensure low-latency access and scalability through distribution and load balancing. Each metadata server manages one or more metadata targets, typically backed by filesystems with efficient extended-attribute handling such as ext4, supporting high concurrency for metadata-intensive workloads. Storage targets, hosted on dedicated storage servers, export local storage resources for holding striped file data chunks, enabling parallel I/O from multiple clients. These user-space daemons support buddy mirroring, where targets are paired into groups for redundancy (one primary and one secondary) to protect against failures in drives, servers, or networks, with automatic resynchronization as needed. Targets leverage underlying POSIX-compliant filesystems (e.g., ext4, XFS, or ZFS) and can bypass kernel caching via direct I/O for enhanced speed in large transfers. Multiple targets per server allow fine-grained scalability. The client module, deployed on compute nodes, provides transparent access to the BeeGFS filesystem via a standard mount point, intercepting file system calls without requiring application modifications. Available as a patchless kernel module for optimal performance or in user-space variants for compatibility, it routes requests directly to metadata and storage servers using RPC, supporting simultaneous communication with multiple nodes for parallel data access. This design ensures low-overhead integration in HPC environments. Inter-service communication in BeeGFS relies on efficient RPC mechanisms over TCP/IP (including IPv6 support as of version 8.2), UDP, or native RDMA protocols (e.g., InfiniBand, Omni-Path, RoCE), enabling high-throughput, low-latency interactions between clients, metadata servers, and storage targets. Automatic failover to redundant network paths enhances reliability in dynamic networks.
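As an illustration of how these services map onto hosts, the following sketch shows a hypothetical four-node layout and the client mount table entry; host names, paths, and values are assumptions for the example rather than prescribed settings.

    # Hypothetical service layout
    node01: beegfs-mgmtd                       # management service
    node02: beegfs-meta      (metadata target on ext4)
    node03: beegfs-storage   (storage targets on XFS)
    node04: beegfs-client    (kernel module, mounts the file system)

    # /etc/beegfs/beegfs-mounts.conf on the client node:
    # <mount point> <client config file>
    /mnt/beegfs /etc/beegfs/beegfs-client.conf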

Data and Metadata Handling

BeeGFS manages file data by striping contents across multiple storage targets to enable parallel access and balance load distribution. Files are divided into chunks of configurable size (for example, 1 MiB), allowing clients to read or write concurrently from several targets for improved throughput. Striping patterns include a striped (RAID0-like) layout, which distributes chunks sequentially across targets, and mirrored patterns that duplicate chunks across buddy groups for redundancy, resembling declustered RAID configurations. Metadata operations are handled separately on dedicated servers, where each file or directory is assigned to a specific metadata target based on a distributed assignment scheme. The system employs a hierarchical structure mirroring the directory tree, with one metadata file created per user file to store attributes such as ownership, permissions, and chunk locations. To minimize latency, BeeGFS implements caching mechanisms on both clients and servers, leveraging available RAM to store frequently accessed metadata. Extended attributes are supported for custom metadata storage, implemented as extended file attributes (xattrs) on the underlying filesystem, enabling applications to attach additional file-specific information. BeeGFS ensures data and metadata consistency through synchronous operations, particularly in mirrored setups where writes complete only after replication to both buddy targets. For metadata, high availability is supported via buddy mirroring, pairing metadata servers to provide redundancy and maintain consistency during failures, though it requires an even number of metadata servers and careful initialization to avoid inconsistencies. Data redundancy similarly uses buddy mirroring for replication across storage targets. Storage targets for new files are selected dynamically from lists of available targets and buddy groups, prioritizing those with sufficient free space (categorized into normal, low, or emergency capacity pools) to prevent hotspots and ensure efficient routing. This selection process integrates with striping configurations to direct operations to optimal targets without central coordination.
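Striping can be inspected and adjusted per directory with the command-line tools. The sketch below uses the beegfs-ctl syntax of the 7.x series; the paths and values are illustrative, and newer releases expose equivalent subcommands of the beegfs tool.

    # Show striping details for an existing file or directory (illustrative path)
    beegfs-ctl --getentryinfo /mnt/beegfs/project/data.bin

    # Stripe new files in this directory across 4 targets with 1 MiB chunks
    beegfs-ctl --setpattern --numtargets=4 --chunksize=1m /mnt/beegfs/project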

Features

Performance Optimizations

BeeGFS employs client-side caching mechanisms to enhance I/O efficiency by aggregating small requests into larger transfers, thereby reducing overhead and improving throughput in HPC environments. In the default buffered caching mode, the client utilizes a pool of small static buffers, typically a few hundred kilobytes in size, to implement read-ahead prefetching and write-back buffering. Read-ahead anticipates sequential access patterns by fetching data ahead of the current read position, while write-back defers writes to the server until the buffer is full or flushed, allowing applications to continue without waiting for immediate disk commits. This approach is particularly effective for streaming workloads, where it can achieve higher throughput than non-cached operations by minimizing round-trips and leveraging larger transfers. An alternative native caching mode delegates buffering to the Linux kernel's page cache, which can handle multiple gigabytes of data and dynamically adapts to available memory. This mode benefits re-read patterns or workloads where data fits entirely in cache, potentially reducing latency by avoiding user-space copies, though it may introduce variability based on kernel version and system load. Configuration of these modes occurs via the tuneFileCacheType parameter in the client's configuration file (/etc/beegfs/beegfs-client.conf), enabling administrators to select buffered for predictable streaming or native for memory-intensive scenarios. Both modes aggregate fragmented I/O requests internally, coalescing them into efficient server-side operations to sustain low-latency access in parallel environments. BeeGFS supports asynchronous I/O operations to enable applications to overlap computation with data transfers, minimizing idle time in I/O-bound HPC workloads. Through its POSIX interface, non-blocking reads and writes allow multiple concurrent requests without blocking the calling thread, facilitated by the system's striping across targets. This capability is integrated into the BeeGFS client, supporting asynchronous interfaces for direct overlap of CPU tasks and file operations, which is crucial for scaling performance in multi-threaded or MPI-based applications. Network optimizations in BeeGFS focus on leveraging high-speed interconnects to reduce latency and CPU involvement in data movement. Multi-rail bonding enables clients to utilize multiple network interfaces simultaneously, distributing connections across RDMA-capable NICs within a single IPoIB subnet for load-balanced traffic and resilience. RDMA support, based on the ibverbs API for InfiniBand, RoCE, and Omni-Path, implements zero-copy transfers by directly accessing application memory, bypassing the kernel network stack, eliminating intermediate buffering, and significantly reducing CPU overhead in bandwidth-intensive scenarios. Congestion control is managed through tunable buffer parameters, such as connRDMABufNum and connRDMABufSize, which match buffer counts and sizes to typical transfer sizes (e.g., 1 MiB transfers with 20 buffers) to prevent congestion and ensure consistent low-latency delivery. These features are configured via connRDMAInterfacesFile in the client configuration, allowing dynamic rail selection for high sustained throughput in multi-rail setups, such as exceeding 100 GB/s on modern hardware. Storage backend tuning in BeeGFS optimizes local file systems for direct I/O alignment and reduced overhead, ensuring efficient data placement on the underlying devices. Direct I/O is encouraged by aligning partitions to native offsets (e.g., 4 KB sectors or RAID stripe widths) and using filesystems such as XFS with mount options like noatime and allocsize=131072k to bypass the page cache for large sequential accesses, minimizing double buffering.
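A hypothetical client tuning excerpt tying these options together is sketched below. The parameter names appear in the client configuration discussed above, but the values shown are illustrative, not recommended defaults, and should be validated against the documentation for the installed release.

    # /etc/beegfs/beegfs-client.conf (excerpt; illustrative values)
    tuneFileCacheType       = buffered    # or "native" to use the kernel page cache
    connUseRDMA             = true        # enable RDMA via ibverbs where hardware permits
    connRDMABufNum          = 20          # number of RDMA buffers per connection (example)
    connRDMABufSize         = 65536       # RDMA buffer size in bytes (example)
    connRDMAInterfacesFile  = /etc/beegfs/rdma-nics.conf   # multi-rail NIC list (hypothetical path)

    # /etc/beegfs/rdma-nics.conf: one RDMA interface name per line (hypothetical names)
    ib0
    ib1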
Compression support at the storage backend, such as ZFS's LZ4 algorithm, is available but recommended only for compressible data to avoid CPU penalties; it can be enabled via zfs set compression=lz4 poolname for up to 2x space savings with minimal throughput impact on suitable workloads. Deduplication, such as ZFS's native feature, identifies and eliminates redundant blocks but is generally left disabled (zfs set dedup=off poolname) due to high overhead, unless capacity constraints demand it, in which case it trades performance for efficiency on repetitive datasets. I/O scheduler tuning, such as selecting the deadline scheduler with read_ahead_kb=4096, further aligns local file system operations with BeeGFS's striping for balanced read/write performance. Built-in monitoring tools in BeeGFS provide detailed statistics to diagnose and mitigate bottlenecks, enabling proactive tuning in large-scale deployments. The beegfs-mon service aggregates metrics from clients, metadata, and storage nodes into an InfluxDB time-series database, covering I/O rates, connection counts, and resource utilization. Visualization via Grafana dashboards highlights issues such as network saturation through per-interface plots or metadata contention via operation histograms on metadata servers. Tools such as beegfs-ctl's statistics modes (for example, --serverstats) and beegfs-net offer command-line insights into target-specific throughput and error rates, allowing administrators to identify imbalances, such as overloaded rails or deep disk queues, and adjust configurations accordingly for optimal system-wide efficiency.
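The backend tuning steps above might be applied roughly as follows on a ZFS- or XFS-backed storage server; pool, device, and mount names are hypothetical, and values should be benchmarked rather than copied.

    # ZFS pool used as a BeeGFS storage target (hypothetical pool name)
    zfs set compression=lz4 beegfs_pool    # cheap compression for compressible data
    zfs set dedup=off beegfs_pool          # deduplication usually too costly

    # XFS-backed target: relaxed metadata updates and large preallocation
    mount -o noatime,nodiratime,allocsize=131072k /dev/sdb1 /data/beegfs_storage

    # Block-layer tuning for the underlying device (hypothetical device name)
    echo deadline > /sys/block/sdb/queue/scheduler
    echo 4096     > /sys/block/sdb/queue/read_ahead_kb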

Scalability and Flexibility

BeeGFS achieves horizontal scalability by allowing independent addition of metadata servers and storage targets without downtime, enabling the distribution of metadata across multiple servers to manage petabyte-scale namespaces. Metadata operations, such as directory lookups, are parallelized across these servers, reducing latency and supporting large numbers of files and directories. Similarly, storage targets can be added dynamically and integrated into capacity pools based on available space, facilitating seamless expansion of overall storage capacity and throughput. The system's flexibility in deployment stems from its user-space daemons for server components, which require no kernel modifications and run on commodity hardware or cloud instances such as Amazon EC2 and Microsoft Azure virtual machines. This design supports converged architectures where clients and servers operate on the same nodes, as well as hybrid setups combining local storage targets with remote ones, including S3-compatible cloud storage for tiered data management. Remote storage targets enable synchronization between local and external storage, allowing administrators to integrate existing infrastructure without overhauling core servers. Customization options enhance adaptability through configurable policies for file striping, quotas, and access controls. Striping patterns, such as RAID0-like or mirrored configurations, can be set per directory or file using command-line tools, with parameters like chunk size and number of targets adjustable to optimize for specific workloads. Quotas for disk space and inode counts are enforced on a per-user or per-group basis across storage pools, with configurable tracking and update intervals to balance enforcement overhead. Access controls leverage POSIX ACLs, enabled via configuration files, to manage permissions granularly without reliance on external authentication beyond the built-in connection authentication mechanism. High availability is provided through buddy mirroring, where pairs of targets (buddy groups) replicate data for metadata and storage redundancy, enabling automatic failover without third-party services. Administrators create these groups using tools like beegfs mirror create, ensuring primary and secondary targets are on separate hardware for fault tolerance, with self-healing resynchronization upon target recovery. For management services, integration with high-availability clusters supports virtual IP failover, further bolstering redundancy in production environments. Cross-platform support officially targets Linux clients via a native kernel module for POSIX-compliant access; as of 2024, third-party native clients such as ELEMENTS BLINK provide kernel-level access on Windows and macOS workstations, supporting high-I/O workflows in cross-platform setups, while the official BeeGFS clients remain Linux-focused.
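As a sketch of these administrative controls, the commands below set a per-user quota and define mirror groups using the beegfs-ctl syntax of the 7.x series; the IDs and limits are illustrative, and newer releases expose equivalent subcommands of the beegfs tool.

    # Per-user quota: 1 TiB of space and 1 million chunk files (illustrative limits)
    beegfs-ctl --setquota --uid alice --sizelimit=1T --inodelimit=1M

    # Pair two storage targets (IDs 101 and 201) into a buddy mirror group
    beegfs-ctl --addmirrorgroup --nodetype=storage --primary=101 --secondary=201

    # Activate metadata mirroring (requires a quiesced system with no mounted clients)
    beegfs-ctl --mirrormd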

History and Development

Origins and Early Development

BeeGFS, initially developed under the name FhGFS, originated in 2005 at the Fraunhofer Institute for Industrial Mathematics (ITWM) in Kaiserslautern, Germany. The project was launched by the institute's Competence Center for High Performance Computing to overcome the limitations of contemporary file systems on their emerging HPC cluster, which struggled to deliver sufficient aggregate bandwidth for data-intensive scientific workloads. At the time, standard network file systems like NFS lacked the parallel access capabilities needed for modern clusters, while alternatives such as PVFS offered some parallelism but fell short in flexibility and ease of management for diverse HPC applications. The core motivations centered on creating a POSIX-compliant parallel file system optimized for high-throughput scientific computing, capable of aggregating bandwidth across multiple servers to support concurrent reads and writes from numerous compute nodes. This design aimed to bridge the growing gap between rapid advancements in CPU and network speeds and the comparatively slower I/O subsystems, enabling more efficient data handling in distributed environments. Early efforts focused on building a file system that could scale seamlessly without requiring application modifications, prioritizing simplicity and reliability for research-oriented HPC setups. Among the key innovations in the initial phases were a fully user-space server architecture, which facilitated straightforward installation and maintenance without deep kernel dependencies, and a distributed metadata design that eliminated single points of failure by spreading directory and file information across dedicated servers. These features enhanced scalability and metadata query performance, critical for avoiding bottlenecks in large-scale operations. By 2007, the first prototypes had been rigorously tested on small in-house clusters, such as the 32-node Fraunhofer Seislab setup equipped with SSD storage, validating the system's ability to achieve high I/O rates, up to 700 MB/s writes in early benchmarks, under realistic seismic workloads. The development received support from research funding initiatives, including evaluations conducted within projects such as the Virtual Research Environment initiative to assess performance in collaborative scientific scenarios. Following several years of internal refinement and productive deployments starting in 2007, the file system was renamed BeeGFS in 2014 and made publicly available under a source-available model in 2016, with the client module under GPLv2 and other components under the BeeGFS End User License Agreement (EULA), allowing broader adoption and community contributions while maintaining its research roots.

Commercialization and Evolution

In 2014, the parallel file system previously known as Fraunhofer FS (FhGFS) was renamed BeeGFS to reflect its evolution into a commercially supported product, coinciding with the founding of ThinkParQ as a spin-off from the Fraunhofer Competence Center for High Performance Computing in Kaiserslautern. This company, established by key former Fraunhofer developers, aimed to provide enterprise-level support, professional services, and accelerated development to meet the demands of production environments beyond academic research. Major version milestones marked BeeGFS's technical evolution under ThinkParQ's stewardship. BeeGFS v6, released in 2016, brought improvements to remote direct memory access (RDMA) support for high-speed networking over InfiniBand and RoCE, improving data transfer efficiency in cluster environments. In 2018, v7 added a graphical user interface (GUI) for management, simplifying administration tasks such as monitoring and configuration for large-scale deployments. BeeGFS v8, released in 2025, enhanced support for NVMe-over-Fabrics (NVMe-oF) and optimized handling of AI workloads through features like NVIDIA GPUDirect Storage integration, allowing direct data movement from storage to GPU memory to reduce latency in AI pipelines. Adoption of BeeGFS grew substantially, with integration into numerous top supercomputers listed on the TOP500 by 2020, particularly European systems such as those in the EuroHPC initiative. This expansion was bolstered by strategic partnerships, including collaborations with NVIDIA on GPUDirect Storage to accelerate GPU-direct I/O in AI and simulation workloads, and with Intel to optimize compatibility with Xeon processors and oneAPI tooling for enhanced parallel processing. The project's open development model fostered active community involvement, with its GitHub repository, public since 2024, serving as a hub for regular patches, bug reports, and contributions from users worldwide. In 2023, ThinkParQ shifted to a multi-branch release model, maintaining the 7.x series as a long-term support (LTS) branch with backported fixes for legacy distributions while advancing newer versions. As of 2025, BeeGFS version 8.2 emphasizes exascale readiness through features like background rebalancing, which enables efficient data redistribution across expanding storage pools in massive clusters, and improved resilience via SELinux integration and enhanced access control list (ACL) performance with client-side caching.

Deployment and Usage

Installation and Configuration

BeeGFS installation requires a Linux-based cluster environment with supported distributions such as Red Hat Enterprise Linux (RHEL) 8, 9, and 10 (including derivatives such as Rocky Linux and AlmaLinux), SUSE Linux Enterprise Server (SLES) 15, Debian 11 and 12, and Ubuntu 20.04, 22.04, and 24.04, along with compatible kernel versions, typically 4.18 onward, for automatic client module building. openSUSE Leap 15.6 has been tested but may require manual compilation for installation. Network setup is essential, particularly for high-performance fabrics like InfiniBand, Omni-Path, or RoCE, where RDMA support necessitates OpenFabrics Enterprise Distribution (OFED) or equivalent drivers installed across nodes. Hardware prerequisites include dedicated storage volumes formatted with ext4 or XFS on RAID-configured disks for metadata and storage targets. As of BeeGFS 8.2, SELinux integration allows policy-based configuration rather than full disabling in supported environments. Installation primarily uses package managers for efficiency, though compilation is available for custom kernels. Begin by downloading and installing the BeeGFS repository configuration file from the official download page on all nodes using commands like wget https://www.beegfs.io/release/beegfs_8.2/beegfs-repo-latest.noarch.rpm for RHEL-based systems, followed by yum install ./beegfs-repo-latest.noarch.rpm. Install role-specific packages via yum or apt: for the management daemon, yum install beegfs-mgmtd; for metadata servers, yum install beegfs-meta; for storage targets, yum install beegfs-storage; and for clients, yum install beegfs-client beegfs-tools beegfs-utils. For RDMA-enabled setups, additionally install libbeegfs-ib and verify driver compatibility. The BeeGFS setup scripts, such as beegfs-setup-meta for metadata targets and beegfs-setup-storage for storage targets, automate target preparation by creating data directories and configuring paths, e.g., /opt/beegfs/sbin/beegfs-setup-meta -p /data/beegfs_meta -m <management-node-ip>. Source compilation involves downloading the source tarballs, building the packages, and installing them manually, but is recommended only for unsupported kernels. Basic configuration involves editing the configuration files in /etc/beegfs/ to tune parameters consistently across nodes. For example, in /etc/beegfs/beegfs-client.conf, set connMaxInternodeNum=12 (the default value) to control the number of connections per client-server pair, adjusting it higher for bandwidth-intensive workloads while monitoring memory usage. High availability (HA) for metadata and storage is achieved by defining buddy groups, i.e., pairs of mirrored targets on separate nodes, using commands such as beegfs-ctl --addmirrorgroup --nodetype=meta --primary=<primary-target> --secondary=<secondary-target>, ensuring targets match in size and are rack-diverse for fault tolerance. Clients are configured via beegfs-setup-client -m <management-node-ip>, specifying mount options in /etc/beegfs/beegfs-mounts.conf, such as automatic mounting at boot. For scheduler integration, BeeGFS mounts can be automated in Slurm or Torque environments using prologue/epilogue scripts to ensure job-specific access, with tools like beegfs-ctl for dynamic adjustments. Verification begins after starting services with systemctl start beegfs-mgmtd (and equivalents for meta, storage, and client), confirming initialization via the service logs. Use beegfs-node list to verify node registration and NIC detection, beegfs-health net for network health, and beegfs-health df for capacity.
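A condensed sketch of the package-based setup described above, for a RHEL-like node, is shown below; the repository URL and setup scripts follow the examples in this section, while the management IP, hostnames, and data paths are placeholders to be adapted to the target cluster.

    # Add the BeeGFS package repository (URL as given above)
    wget https://www.beegfs.io/release/beegfs_8.2/beegfs-repo-latest.noarch.rpm
    yum install ./beegfs-repo-latest.noarch.rpm

    # Install role-specific packages (one role per node in a classic layout)
    yum install beegfs-mgmtd                               # management node
    yum install beegfs-meta                                # metadata node
    yum install beegfs-storage                             # storage node
    yum install beegfs-client beegfs-tools beegfs-utils    # client node

    # Prepare targets and point services at the management node (hypothetical IP)
    /opt/beegfs/sbin/beegfs-setup-meta    -p /data/beegfs_meta    -m 192.168.1.10
    /opt/beegfs/sbin/beegfs-setup-storage -p /data/beegfs_storage -m 192.168.1.10
    /opt/beegfs/sbin/beegfs-setup-client  -m 192.168.1.10

    # Start the daemons on their respective nodes, then mount on the client
    systemctl start beegfs-mgmtd beegfs-meta beegfs-storage beegfs-client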
For integrity checks, run beegfs-fsck --checkfs --readOnly to scan for inconsistencies without modifications, storing the metadata database on fast media such as an SSD; if issues are detected, follow up with a repair run using --noFetch on the saved database. RDMA verification involves checking logs with journalctl -u beegfs-client for connection establishment. Common pitfalls include SELinux enforcement causing "access denied" errors on clients, resolvable by configuring policies or disabling it via SELINUX=disabled in /etc/selinux/config followed by a reboot, with policy-based configuration preferred in BeeGFS 8.2 for HPC environments. Firewall rules must allow the BeeGFS ports: TCP/UDP 8005 for metadata, 8003 for storage, 8004 for clients, and 8008 for management, with dynamic port ranges for some tools; failure to open these leads to mount failures. Inconsistent tuning across nodes, such as mismatched connMaxInternodeNum, can degrade performance, so synchronize configurations with parallel-shell or configuration management tools before startup.
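Where firewalld is in use, the ports listed above might be opened roughly as follows; this sketch assumes the default BeeGFS port assignments and should be adjusted if services are configured on non-default ports.

    # Open the default BeeGFS service ports (storage 8003, client 8004,
    # metadata 8005, management 8008) for both TCP and UDP, then reload.
    for port in 8003 8004 8005 8008; do
        firewall-cmd --permanent --add-port=${port}/tcp
        firewall-cmd --permanent --add-port=${port}/udp
    done
    firewall-cmd --reload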

Typical Use Cases

BeeGFS is extensively deployed in scientific simulations, where it manages massive datasets generated by applications in climate modeling and geophysics. For instance, it optimizes I/O performance for large-scale geophysical models by integrating with parallel I/O libraries like PnetCDF, enabling efficient data handling in MPI-based codes on HPC systems. In astronomy, BeeGFS supports data-intensive workflows at observatories, facilitating the storage and analysis of large astronomical datasets through its scalable architecture. This parallel I/O capability ensures seamless access for distributed simulations running on supercomputers. In AI and machine learning workflows, BeeGFS provides high-performance storage for training datasets on GPU clusters, supporting direct data transfer to accelerators via integrations like NVIDIA GPUDirect Storage. This enables low-latency access to large-scale data, accelerating deep learning tasks such as model training for autonomous vehicles without requiring workflow modifications. Its user-space server design optimizes I/O for both small metadata operations and large file transfers, making it suitable for iterative AI pipelines on hybrid CPU-GPU environments. BeeGFS addresses the demands of media and rendering applications by delivering high-throughput storage for visual effects (VFX) pipelines in post-production. It handles the bursty workloads inherent to rendering farms, where multiple nodes simultaneously access and process large asset files, ensuring consistent performance across small previews and full-resolution outputs. Adopted in the film and broadcast sectors, it scales to support collaborative workflows, providing the parallel access needed for next-generation content production without bottlenecks. As a scalable storage backend, BeeGFS also serves as a target for backup and archiving in research institutions, accommodating long-term retention of voluminous simulation outputs and observational data. Its distributed architecture supports high-availability configurations for durable preservation in HPC settings. BeeGFS facilitates hybrid cloud deployments by extending on-premises clusters to public clouds such as AWS and Microsoft Azure, allowing overflow capacity for bursty computational workloads. This integration enables seamless data movement between local and cloud resources, leveraging BeeGFS's compatibility with cloud instances for enhanced flexibility in distributed environments as of 2025.

Performance and Benchmarks

Benchmark Methodologies

BeeGFS performance evaluation relies on a combination of built-in tools and widely adopted external benchmarks to assess file system integrity, throughput, metadata operations, and scalability in high-performance computing (HPC) environments. These methodologies enable administrators to measure key aspects such as data transfer rates and operational efficiency without requiring custom implementations. Built-in tools provide foundational testing capabilities directly integrated into BeeGFS. The beegfs-fsck utility is used for integrity checks, verifying consistency across storage targets and enabling repairs as needed, which is essential for ensuring benchmark reliability before performance tests. The beegfs-ctl command-line tool facilitates statistics gathering, such as target states and utilization during tests. For raw throughput evaluation, the StorageBench tool benchmarks individual storage targets by simulating direct I/O operations, isolating storage hardware performance from network influences. Complementing this, NetBench assesses network bandwidth and latency between clients and servers, helping identify communication bottlenecks. External benchmarks are commonly employed to simulate real-world HPC workloads. The IO-500 suite evaluates comprehensive I/O patterns, incorporating scenarios for both data-intensive and metadata-heavy tasks to reflect production demands in large-scale systems. Mdtest specifically targets metadata operations, measuring the creation, stat, and deletion rates of files and directories to gauge scalability in namespace-intensive applications. IOR, another standard tool, tests parallel file I/O performance across distributed nodes, allowing configuration of parameters like stripe counts to optimize access patterns and assess aggregate throughput. Testing setups typically involve multi-node configurations to replicate HPC environments, with dozens to thousands of client nodes accessing servers over high-speed networks. These setups measure throughput in gigabytes per second (GB/s), input/output operations per second (IOPS), and scalability under increasing loads, such as escalating client counts or concurrent streams, to evaluate system behavior at scale. Key metrics in BeeGFS benchmarking emphasize aggregate read and write speeds to quantify data movement efficiency, metadata performance through operations per second (e.g., files created or accessed), and failure recovery times to assess resilience during target failures or resync processes. These indicators provide a holistic view of system performance without delving into application-specific variances. Best practices for benchmarking include isolating variables to pinpoint bottlenecks, such as using StorageBench for storage-only tests versus NetBench for network-focused evaluations, and employing dummy targets to simulate additional storage capacity without physical hardware changes. This approach ensures targeted optimizations, for instance, tuning RDMA settings to enhance network efficiency as outlined in performance guides.
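For reference, a built-in storage target benchmark and a typical external run might look roughly like the sketch below. The beegfs-ctl --storagebench mode follows the 7.x documentation, while the mpirun, IOR, and mdtest parameters are illustrative assumptions rather than tuned settings.

    # Built-in write benchmark across all storage targets (beegfs-ctl syntax, 7.x series)
    beegfs-ctl --storagebench --alltargets --write --blocksize=512K --size=20G --threads=4
    beegfs-ctl --storagebench --alltargets --status    # poll results
    beegfs-ctl --storagebench --alltargets --cleanup   # remove benchmark files

    # External benchmarks launched from a set of client nodes (illustrative parameters)
    mpirun -np 64 ior    -t 1m -b 4g -F -o /mnt/beegfs/ior_test
    mpirun -np 64 mdtest -n 10000 -d /mnt/beegfs/mdtest_dir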

Comparative Performance

BeeGFS deployments have secured notable positions in IO-500 benchmarks, demonstrating competitive performance in both cloud and on-premises environments. For instance, a BeeGFS configuration on Oracle Cloud Infrastructure achieved an overall IO-500 score of 32.79, with 14.02 GiB/s bandwidth and 76.67 kIOP/s metadata throughput across 10 client nodes (2021 submission). Another submission ranked #30 in the ISC 2023 production list, scoring 16.48 with 4.40 GiB/s bandwidth and 61.76 kIOP/s metadata operations. These results highlight BeeGFS's capability in balanced I/O workloads, though top full-system scores often favor specialized configurations. As of November 2025, BeeGFS has not appeared in the top rankings of recent IO-500 lists such as ISC 2024 or SC 2024. Compared to Lustre, BeeGFS offers advantages in metadata performance due to its distributed metadata model, which spreads namespace operations across multiple servers; BeeGFS's design simplifies deployment for such operations compared to Lustre's distributed namespace extensions. In mdtest benchmarks, BeeGFS achieves high metadata rates, such as 1.72 million file creates per second with 20 metadata servers, scaling effectively for concurrent small-file accesses. For large-file writes, BeeGFS delivers bandwidth comparable to Lustre, often saturating 100 Gbps networks with its native client. Relative to IBM Spectrum Scale (formerly GPFS), BeeGFS offers straightforward server scalability, supporting well over 100 server nodes without complex tuning, and simpler deployment via user-space daemons. Benchmarks indicate BeeGFS provides strong performance in small-file operations, benefiting from threaded request queuing and distributed targets that avoid potential trade-offs in concurrent scans. In mixed workloads, BeeGFS sustains higher throughput than NFS for multi-user large-file access. Real-world deployments underscore BeeGFS's strengths, with sustained rates up to 45 GB/s in systems using SupremeRAID backends, and it outperforms NFS in the mixed read-write patterns common to HPC simulations. A key limitation of BeeGFS is higher CPU utilization on servers compared to fully kernel-based systems like Lustre or GPFS, due to daemon overhead in processing requests outside the kernel. This can impact efficiency in CPU-constrained environments, though optimizations like SSD backends mitigate it for metadata-heavy loads.

Integrations and Advanced Applications

Containerization Support

BeeGFS provides native compatibility with containerized environments, allowing its client mounts to be exposed inside containers through bind mounts on the host filesystem or via volume overlays, which is particularly seamless with Singularity and Apptainer for HPC workflows. This approach enables containerized applications to access the full parallel I/O capabilities of BeeGFS without requiring the client module to be installed inside the container image. For Docker integration, BeeGFS volumes can be incorporated into Compose configurations for development and testing environments, where using host networking mode preserves the original performance characteristics of the file system. Official BeeGFS container images, available via the GitHub Container Registry, support Docker and other OCI runtimes such as Podman, facilitating straightforward deployment of BeeGFS services (management, metadata, and storage) in isolated environments. In HPC contexts, BeeGFS integrates with specialized tools such as Charliecloud and Shifter, leveraging their lightweight designs to minimize resource usage while maintaining compatibility through standard Docker Image Manifest V2 and OCI image formats. Best practices for optimal integration include granting containers the necessary privileges for RDMA access, such as the --privileged flag or specific capabilities like --cap-add=IPC_LOCK alongside device bindings (e.g., --device=/dev/infiniband/*) on RDMA-enabled hardware such as InfiniBand or RoCE NICs. Additionally, BeeGFS helper scripts, including beegfs-setup-<service>, support dynamic mounting and configuration adjustments within containerized deployments via environment variables. The primary advantages of this support lie in enabling reproducible, portable environments for scientific simulations and HPC applications without compromising the high-speed I/O performance inherent to BeeGFS.
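A hypothetical invocation reflecting these practices, bind-mounting an existing host-side BeeGFS mount into a container and passing through RDMA devices, might look as follows; the image name, mount path, and device list are placeholders.

    # Bind-mount the host's BeeGFS mount point into the container read-write,
    # and expose RDMA devices with the capability needed to pin memory.
    docker run --rm -it \
        --net=host \
        --cap-add=IPC_LOCK \
        --device=/dev/infiniband/uverbs0 \
        --device=/dev/infiniband/rdma_cm \
        -v /mnt/beegfs:/mnt/beegfs \
        my-hpc-app:latest /bin/bash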

Exascale Computing Applications

BeeGFS has played a significant role in advancing exascale computing through its participation in the European Union-funded DEEP-ER project, which ran from 2013 to 2017 and focused on developing heterogeneous architectures to address exascale challenges. In this initiative, BeeGFS was integrated as the primary parallel I/O system within a cluster-booster framework, enabling efficient data access across diverse compute partitions, including Xeon Phi-based boosters. The project demonstrated BeeGFS's ability to handle the massive data volumes anticipated in exascale environments while supporting multi-level checkpointing for application resilience. Key exascale adaptations in BeeGFS include enhanced fault-tolerance mechanisms, such as automatic rebalancing, which redistributes data chunks across targets in response to failures or capacity changes without interrupting ongoing operations. This feature ensures continuous availability in fault-prone exascale systems, where hardware failures are expected to occur frequently. Additionally, BeeGFS supports disaggregated configurations, particularly in booster partitions, by leveraging non-volatile memory (NVM) and network-attached memory (NAM) to create a hierarchical I/O infrastructure that optimizes data placement and access for compute-intensive workloads. At these scales, BeeGFS has demonstrated robust performance in simulations and prototypes, with metadata operation rates exceeding 6 million requests per second across 20 metadata servers while maintaining efficiency above 78%, addressing the core challenge of managing millions of files under extreme concurrency. It also integrates with high-speed interconnects such as InfiniBand via native RDMA support, enabling low-latency data transfers across thousands of nodes. BeeGFS's modular architecture further future-proofs it for emerging exascale paradigms, including potential integrations with quantum and other emerging computing elements, by allowing customizable extensions for specialized hardware without altering core components. Deployments in pre-exascale systems, such as the DEEP-ER prototype, and partial integrations in prototypes under initiatives like EXANODE, highlight its readiness. BeeGFS 8, announced ahead of its 2025 release, brings advanced data management tools relevant to exascale and AI workloads.
