GROMACS
GROMACS is a free and open-source software package designed for high-performance molecular dynamics simulations and energy minimization, solving Newton's equations of motion for atomic systems ranging from hundreds to millions of particles.[1] Primarily focused on biochemical molecules such as proteins, lipids, and nucleic acids, it also supports simulations of non-biological systems like polymers and fluid dynamics.[1] Developed as a versatile, community-driven tool, GROMACS emphasizes efficiency, user-friendliness, and extensibility, making it one of the most widely used programs in computational chemistry and molecular modeling.[2] Originating in the early 1990s from Herman Berendsen's group in the Department of Biophysical Chemistry at the University of Groningen, Netherlands, GROMACS was initially created to address the need for efficient parallel simulations of biomolecular systems.[3] It has since evolved through contributions from an international team of developers, now led by Berk Hess and Erik Lindahl from the Royal Institute of Technology (KTH) and Stockholm University, with coordination at the Science for Life Laboratory in Stockholm.[3] Released under the GNU Lesser General Public License (LGPL) version 2.1, the software encourages open collaboration, including bug fixes, feature additions, and documentation improvements from the global community.[1] Key features of GROMACS include optimized performance through SIMD intrinsics, support for GPU acceleration via CUDA, OpenCL, and SYCL, and advanced parallelization with MPI and Thread-MPI for load balancing across CPUs and GPUs.[1] It provides tools for topology generation, trajectory analysis, and enhanced sampling methods, often without requiring scripting, though a Python API is under development.[1] Widely adopted in academia and industry for predicting properties like binding affinities and viscosities, GROMACS continues to advance with regular releases, such as version 2025 as of November 2025.[2]Introduction
Overview
GROMACS is a free, open-source software suite designed for high-performance molecular dynamics (MD) simulations and energy minimization of biomolecular systems, including proteins, lipids, and nucleic acids.[4] It enables researchers to model the behavior of these systems at the atomic level, supporting simulations of systems ranging from hundreds to millions of particles.[4] The core purpose of GROMACS is to simulate the Newtonian equations of motion for atoms and molecules, allowing the study of dynamic processes such as protein folding, ligand binding, and membrane dynamics.[4] Key capabilities include preparation of input structures from Protein Data Bank (PDB) files, generation of simulation trajectories over time, and basic analysis of output data like energies and coordinates.[4] Originating in the 1990s at the University of Groningen in the Netherlands, GROMACS has evolved into a widely used tool in computational biology and chemistry.[3] The latest version, GROMACS 2025.3, released on August 29, 2025, incorporates ongoing enhancements for improved performance, new features, and compatibility with modern hardware.[5]Licensing and Availability
GROMACS has been distributed as free and open-source software since 2000, initially under the GNU General Public License (GPL) version 2 or later, which permits free use, study, modification, and distribution provided that derivative works are also released under the GPL and source code is made available.[6] In 2010, with the release of version 4.6, the licensing transitioned to the GNU Lesser General Public License (LGPL) version 2.1 or later to facilitate broader integration as a library while maintaining open-source principles; this allows linking with proprietary software without requiring the entire application to be open-sourced, as long as modifications to GROMACS itself are shared under the LGPL.[7][8][1] The current licensing remains under LGPL 2.1 or later, emphasizing freedom for academic, research, and commercial use within the license terms.[9] The software is freely downloadable from the official website at gromacs.org, where users can access the latest stable releases, such as version 2025.3, in source code form for compilation on various platforms.[10] Pre-compiled binaries are available for major operating systems including Linux distributions, macOS, and Windows (often via the Windows Subsystem for Linux or Cygwin), simplifying installation for non-experts, while the source code supports custom builds optimized for high-performance computing (HPC) clusters using tools like MPI and CUDA.[11] GROMACS is also designed for easy deployment on supercomputers, with installation guides covering architectures like x86_64, ARM, and GPU-accelerated systems.[11] Development and distribution are community-driven, with the primary source repository hosted on GitLab at gitlab.com/gromacs/gromacs, enabling version control, issue tracking, and contributions through merge requests.[12] Comprehensive documentation, including user manuals, tutorials, and API references, is hosted at manual.gromacs.org, ensuring accessibility for users ranging from beginners to advanced developers.[13] Special licensing considerations exist for certain projects; for instance, a non-LGPL variant of GROMACS has been provided to the Folding@home distributed computing initiative, allowing integration into their proprietary client software for large-scale protein folding simulations without imposing open-source requirements on the entire application.[14]History
Origins and Early Development
GROMACS originated in 1991 at the Department of Biophysical Chemistry, University of Groningen, in the Netherlands, under the leadership of Professor Herman J.C. Berendsen.[3] The project was initiated to address the need for efficient molecular dynamics (MD) simulations of biomolecules, particularly proteins in aqueous environments.[15] Berendsen's group aimed to create a high-performance software package optimized for parallel computing, building on prior work in the field.[16] The name GROMACS stands for GROningen MAchine for Chemical Simulations, reflecting its initial close association with the development of a custom parallel computer featuring a 32-processor ring architecture.[15] This hardware was designed specifically to accelerate MD calculations, with the software rewritten in the portable ANSI C language to leverage its capabilities.[17] Unlike the earlier Fortran 77-based GROMOS package, which Berendsen had co-developed, GROMACS emphasized speed and flexibility for large-scale simulations without relying on proprietary hardware.[15] During its early development from 1991 to 2000, the focus was on constructing a core MD engine capable of handling proteins, solvents, and other biomolecules under periodic boundary conditions.[18] Key milestones included implementing message-passing parallelism for distributed computing and ensuring compatibility with force fields like GROMOS for accurate biomolecular modeling.[15] The software remained an internal academic tool within the Groningen group, prioritizing research efficiency over public distribution. In 2000, GROMACS transitioned to an open-source model, broadening its accessibility.[17]Open-Source Transition and Modern Evolution
In 2000, GROMACS transitioned to an open-source model by releasing its source code under the GNU General Public License (GPL), which facilitated community-driven development and broader adoption by researchers worldwide. This shift coincided with a relocation of primary leadership from the University of Groningen in the Netherlands to the Royal Institute of Technology (KTH) and Uppsala University in Sweden, where developers like Erik Lindahl and David van der Spoel played key roles in expanding its scope. The GPL licensing ensured that modifications and extensions remained freely available, fostering a collaborative ecosystem that has since involved hundreds of contributors globally. In 2010, with version 4.6, GROMACS was relicensed under the GNU Lesser General Public License (LGPL) version 2.1.[3][19] Key evolutions in the following years included its integration into large-scale distributed computing projects, such as Folding@home starting in 2001, where a modified version of GROMACS powered volunteer-based simulations of protein folding on thousands of machines. By around 2005, GROMACS adopted an annual release cycle to incorporate user feedback and performance enhancements rapidly, culminating in major rewrites like version 4.5 in 2010, which introduced initial GPU acceleration via CUDA for faster non-bonded force calculations. These updates significantly boosted simulation throughput, enabling studies of larger biomolecular systems. Institutional support evolved further with the establishment of the Science for Life Laboratory in Stockholm as the current lead development hub, complemented by the EU-funded BioExcel Center of Excellence since 2016, which has driven optimizations in scalability, usability, and integration with high-performance computing resources.[20][21][22] Notable milestones underscore GROMACS' modern trajectory, including version 5.0 in 2015, which implemented multi-level parallelism to distribute workloads across CPU cores, GPUs, and nodes for exascale simulations. In 2023, the project added SYCL support for heterogeneous GPU acceleration, extending compatibility to Intel, AMD, and NVIDIA hardware while improving portability across accelerators. The 2025 series further advanced interoperability with a built-in interface to the PLUMED library for enhanced free-energy calculations, support for amino-acid-specific CMAPs enabling the Amber ff19SB force field, and neural network potential support using PyTorch-trained models. These developments have solidified GROMACS as a cornerstone of open computational biophysics, with ongoing contributions from international teams ensuring its relevance in cutting-edge research.[23][24][25]Core Functionality
Molecular Dynamics Algorithms
GROMACS employs molecular dynamics (MD) simulations to model the time evolution of atomic systems by numerically integrating Newton's equations of motion, m \mathbf{a} = \mathbf{F}, where m is the particle mass, \mathbf{a} is the acceleration, and \mathbf{F} represents the total force on each particle derived from interatomic potentials such as Lennard-Jones for van der Waals interactions and Coulombic terms for electrostatics.[26] These forces are computed from the system's potential energy, enabling the prediction of trajectories over discrete time steps. The software's algorithms prioritize numerical stability, accuracy, and efficiency for biomolecular systems. The core integrator in GROMACS is the leap-frog algorithm, a variant of the Verlet method that staggers position and velocity updates for symplectic integration, preserving energy and phase-space volume over long simulations.[26] In this scheme, velocities are first updated at half-steps t + \Delta t/2, followed by positions at integer time steps t + \Delta t, using the relations: \mathbf{v}(t + \Delta t/2) = \mathbf{v}(t - \Delta t/2) + \mathbf{a}(t) \Delta t \mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t + \Delta t/2) \Delta t where accelerations \mathbf{a}(t) = \mathbf{F}(t)/m are evaluated from forces at time t.[26] An equivalent formulation is the velocity Verlet scheme, which updates both positions and velocities within a full step and produces identical trajectories to leap-frog: \mathbf{r}(t + \Delta t) = \mathbf{r}(t) + \mathbf{v}(t) \Delta t + \frac{1}{2} \mathbf{a}(t) (\Delta t)^2 \mathbf{v}(t + \Delta t) = \mathbf{v}(t) + \frac{1}{2} \left[ \mathbf{a}(t) + \mathbf{a}(t + \Delta t) \right] \Delta t This derivation ensures second-order accuracy and time-reversibility, with typical time steps of 1-2 fs to resolve high-frequency vibrations while maintaining stability.[26] To simulate bulk behavior without surface artifacts, GROMACS applies periodic boundary conditions (PBC), replicating the simulation box infinitely in all directions using a triclinic unit cell defined by vectors \mathbf{a}, \mathbf{b}, and \mathbf{c}.[27] The minimum image convention restricts interactions to the nearest periodic image of each particle, ensuring that distances are computed modulo the box vectors to select the shortest vector \mathbf{r}_{ij} satisfying |\mathbf{r}_{ij}| \leq \frac{1}{2} times the shortest box dimension.[27] This approach, combined with cut-off radii not exceeding half the minimum box edge, accurately models infinite systems like liquids or crystals. Non-bonded interactions, encompassing van der Waals and electrostatics, are handled efficiently to capture long-range effects. For electrostatics, GROMACS implements the Particle Mesh Ewald (PME) method, which decomposes Coulomb interactions into short-range real-space sums (via direct cut-off) and long-range reciprocal-space contributions evaluated on a Fourier grid using fast Fourier transforms.[26] The potential is given by: V_{\text{Coulomb}} = \frac{1}{4\pi\epsilon_0} \sum_{i<j} q_i q_j \frac{\mathrm{erfc}(\kappa r_{ij})}{r_{ij}} + \text{reciprocal terms} where \mathrm{erfc} is the complementary error function that screens the short-range parts, achieving O(N \log N) scaling for N particles. Van der Waals forces follow the Lennard-Jones form, V_{\text{LJ}}(r) = 4\epsilon \left[ \left( \frac{\sigma}{r} \right)^{12} - \left( \frac{\sigma}{r} \right)^6 \right], truncated at a cut-off with optional shifting.[26] Both are optimized using buffered pair lists, which precompute interacting particle pairs within an extended cut-off radius (Verlet buffer) to reduce recalculation frequency, updated every few steps based on a tolerance for energy drift.[26] Prior to dynamics, GROMACS performs energy minimization to relax initial structures from overlaps or strains, using iterative optimization of the potential energy V. The steepest descent method displaces atoms opposite the gradient \mathbf{F} = -\nabla V, with step size \delta scaled by the maximum force component: \mathbf{x}_{\text{new}} = \mathbf{x} + \delta \frac{\mathbf{F}}{|\mathbf{F}|_{\max}} where \delta \approx 0.01 nm, accepting steps only if V decreases and halting when the maximum force falls below a tolerance (e.g., 1-10 kJ mol⁻¹ nm⁻¹).[28] For finer convergence near minima, the conjugate gradient algorithm builds orthogonal search directions from prior gradients, accelerating descent via the Fletcher-Reeves update: \mathbf{p}_{k+1} = -\mathbf{g}_{k+1} + \beta_k \mathbf{p}_k, \quad \beta_k = \frac{|\mathbf{g}_{k+1}|^2}{|\mathbf{g}_k|^2} where \mathbf{g} = -\mathbf{F} is the gradient; it is slower initially but more efficient overall, though incompatible with constraints like rigid water models.[28]Supported Force Fields and Integrators
GROMACS provides native support for several widely used biomolecular force fields, enabling simulations of proteins, lipids, nucleic acids, and small molecules. These include the AMBER family (versions such as AMBER94, AMBER96, AMBER99, AMBER99SB, AMBER99SB-ILDN, AMBER03, and AMBERGS), the CHARMM series (including CHARMM19 united-atom, CHARMM22, CHARMM27, and CHARMM36), the GROMOS force fields (43a1, 43a2, 45a3, 53a5, 53a6, and 54a7 united-atom variants), and the OPLS-AA parameters (united-atom OPLS-UA, all-atom OPLS-AA, and OPLS-AA/M).[29] Each force field parameter set is distributed with GROMACS or available through official ports, ensuring compatibility for standard biomolecular systems.[29] Topology files in GROMACS, typically with the .top extension, define the molecular structure and interaction parameters for these force fields. These files specify bonded terms such as bond stretching, modeled as harmonic potentials U_{\text{bond}} = \frac{1}{2} k (r - r_0)^2, where k is the force constant, r the current bond length, and r_0 the equilibrium length, as well as angle bending and dihedral torsions. Non-bonded interactions, including van der Waals and electrostatic forces, are handled via Lennard-Jones and Coulomb potentials, respectively, with parameters derived from the selected force field. The core integration in GROMACS relies on the leap-frog Verlet algorithm for propagating equations of motion. Supported integrators include stochastic dynamics (SD), which applies a leap-frog scheme with friction and random forces for constant temperature control using parameters like coupling time \tau_t and reference temperature T.[30] Langevin dynamics, implemented for Brownian motion simulations, incorporates velocity updates with friction coefficients and thermal noise to maintain temperature.[30] Temperature coupling options feature the velocity rescaling thermostat, which combines kinetic energy rescaling with stochastic terms to sample the canonical ensemble accurately.[30] For pressure control, the Parrinello-Rahman barostat couples the simulation box to an external pressure via an extended Lagrangian, allowing anisotropic scaling with a coupling constant \tau_p.[30] Specialized simulation methods in GROMACS enhance sampling and thermodynamic calculations. Replica exchange molecular dynamics (REMD) enables parallel tempering across multiple replicas at different temperatures, with periodic attempts to swap configurations to overcome energy barriers and improve conformational sampling. Free energy perturbation (FEP) supports alchemical transformations, such as mutating one molecule into another, through λ-parameter scaling that interpolates between initial (λ=0) and final (λ=1) states, often using soft-core potentials to avoid singularities during the process.[31][30] Key file formats facilitate input and output for these features. The .gro format stores atomic coordinates (in nanometers) and optional velocities (in nm/ps), along with box dimensions, serving as a lightweight structure file or concatenated trajectory.[32] The .tpr file, generated by the grompp preprocessor, combines topology, coordinates, and simulation parameters into a portable binary run input essential for executing dynamics with specified force fields and integrators.[32]Performance and Parallelization
Hardware Acceleration
GROMACS optimizes CPU computations through extensive support for Single Instruction, Multiple Data (SIMD) instructions, enabling vectorized force calculations on modern processors. It leverages instructions such as AVX-512 on x86 architectures and ARM Scalable Vector Extension (SVE) on ARM-based systems to accelerate core molecular dynamics operations like non-bonded interactions.[11][33] Additionally, multi-threading is implemented via OpenMP, allowing efficient parallelization across CPU cores for tasks including bonded interactions and particle mesh Ewald (PME) calculations.[34][35] For GPU acceleration, GROMACS has supported offloading since version 4.5 released in 2010, with native integration from version 4.6 in 2013. It utilizes CUDA for NVIDIA GPUs, OpenCL for AMD and Intel GPUs, and SYCL via Intel oneAPI for Intel and AMD GPUs, enabling up to 90% of non-bonded computations—such as short-range electrostatic and van der Waals interactions—to be performed on the GPU while the CPU handles remaining tasks.[36][37] This approach minimizes data transfer overhead and maximizes hardware utilization in single-node setups. In cluster environments, GROMACS employs hybrid CPU-GPU execution through domain decomposition, which partitions the simulation space into subdomains assigned to MPI processes, balancing load across CPU cores and associated GPUs.[34][38] Each domain's non-bonded work is offloaded to its GPU, with dynamic adjustments to maintain efficiency as simulation conditions evolve. Support for SYCL via AdaptiveCpp enables GPU acceleration on AMD GPUs, while OpenCL backend recognizes Apple-designed GPUs for portability on Apple Silicon hardware (initial support added in 2023).[11][39] The 2025 version (up to 2025.3, released August 2025) introduced AMD HIP as a direct GPU backend for AMD devices and addressed bugs in SYCL builds and GPU handling to enhance performance and stability on diverse accelerators.[40][41]Optimization and Scalability Techniques
GROMACS employs sophisticated software strategies to enhance computational efficiency and scalability in molecular dynamics simulations, focusing on algorithmic optimizations that minimize communication overhead and maximize load distribution across processing units. Central to these techniques is a domain decomposition approach that partitions the simulation space into cells assigned to parallel processes, enabling efficient handling of large systems with millions of atoms. This framework supports weak scaling, where performance remains stable as system size increases proportionally with the number of cores, achieving simulations of up to millions of atoms on over 1000 cores.[34] Parallelization in GROMACS utilizes the Message Passing Interface (MPI) for inter-node communication in multi-node setups, facilitating scalable distribution of computational tasks across clusters. For single-node operations, thread-MPI provides an efficient alternative to full MPI, optimizing intra-node parallelism without the overhead of inter-process messaging, and is compatible with OpenMP for multi-core utilization. Dynamic load balancing is implemented through a grid search algorithm that adjusts the decomposition of particles into spatial domains, ensuring even workload distribution by independently tuning cell volumes in 2D or 3D, which is particularly effective for heterogeneous systems with varying particle densities.[34] To reduce the frequency of expensive neighbor searches, GROMACS uses buffered pair lists that precompute interacting atom pairs within a defined cut-off radius, extended by a buffer to account for atomic displacements over multiple time steps. Typical cut-off schemes range from 1.0 to 1.4 nm for van der Waals interactions, allowing pair lists to be reused for 20-40 steps depending on the buffer size, which significantly lowers computational cost while maintaining accuracy. This buffering strategy integrates seamlessly with the parallel domain decomposition, minimizing data redistribution during updates. Tuning options further optimize performance, such as thegmx tune_pme tool, which systematically benchmarks and selects the optimal number of ranks dedicated to Particle Mesh Ewald (PME) electrostatics calculations by varying FFT grid configurations to balance reciprocal and real-space workloads. For constraint handling, the LINCS algorithm enforces bond lengths non-iteratively after unconstrained updates, projecting atomic positions onto the constraint manifold in two steps to achieve high stability and enable larger time steps, with adjustable parameters like cell size limits to prevent extrapolation errors. These techniques contribute to precise energy conservation, with drift typically below $10^{-5} kT per step in standard simulations.[42][34]1096-987X(199709)18:12%3C1463::AID-JCC4%3E3.0.CO;2-H)
Usage
Installation and Setup
GROMACS source code is freely available under the GNU Lesser General Public License (LGPL) for version 2.1 or later, with some components under the GNU General Public License (GPL) version 2. The latest version can be downloaded as a tarball from the official FTP site or cloned from the GitLab repository. Pre-compiled binaries are accessible via package managers, includingapt on Debian- and Ubuntu-based systems, conda through the conda-forge channel, and as loadable modules on high-performance computing (HPC) clusters, though these may lag behind the current release. For optimal performance and the newest features, compiling from source is recommended, especially on custom hardware configurations.
The compilation process relies on CMake (version 3.28 or later) and requires a C99-compliant C compiler and C++17-compliant C++ compiler, such as GCC 11 or later or Clang 14 or later. Key dependencies include the FFTW library (version 3 or later) for Fourier transforms; optional libraries like BLAS and LAPACK can enhance linear algebra operations. To enable parallel computing, configure with the CMake flag -DGMX_MPI=on for MPI support. For GPU acceleration, use flags such as -DGMX_GPU=CUDA for NVIDIA hardware, -DGMX_GPU=OpenCL for AMD or Intel GPUs, -DGMX_GPU=SYCL for Intel oneAPI, or -DGMX_GPU=HIP for AMD ROCm. The build sequence typically involves creating a separate build directory, invoking cmake with options (e.g., cmake .. -DGMX_BUILD_OWN_FFTW=ON if building FFTW internally), followed by make -j for parallel compilation, make check to run unit tests, and make install to deploy binaries and libraries.
Post-installation, configure the environment by sourcing the GMXRC script (e.g., source /usr/local/gromacs/bin/GMXRC), which appends the GROMACS binary path to $PATH and sets the $GMXDATA environment variable to locate essential data files, including force field parameters. This step ensures tools access topology and residue data correctly. For initial system preparation, the pdb2gmx utility converts standard Protein Data Bank (PDB) files into GROMACS-compatible formats, producing a .gro file for atomic coordinates and a .top file for molecular topology, while selecting appropriate force fields like AMBER or CHARMM.
Installation integrity can be confirmed by executing gmx -version in the terminal, which outputs the GROMACS version, compilation date, and enabled features such as MPI or GPU support. A basic functional test involves running a short energy minimization (em) on a sample system from the lysozyme-in-water tutorial, verifying that the process generates output files without runtime errors and completes in expected time. If issues arise, regression test suites downloadable from the official FTP site provide comprehensive validation.
Running Simulations
Running simulations in GROMACS follows a structured workflow that begins with preparing the necessary input files and culminates in executing the molecular dynamics (MD) trajectory using dedicated command-line tools. The core process involves preprocessing the system with thegmx grompp command to generate a portable binary run input file (.tpr), which encapsulates the topology, coordinates, and simulation parameters, followed by running the simulation with gmx mdrun. This separation allows for flexible setup and efficient execution, particularly on high-performance computing resources.[43]
The .mdp file defines the MD parameters and is a key input to gmx grompp. Essential settings include nsteps to specify the total number of integration steps (e.g., 50000000 for a 100 ns simulation at a 2 fs time step), dt for the time step size (typically 0.002 ps), integrator = md to select the standard MD integrator (which employs the leap-frog Verlet scheme), tcoupl = v-rescale for temperature coupling via velocity rescaling to maintain constant temperature, and pcoupl = parrinello-rahman for pressure coupling using the Parrinello-Rahman barostat to control system volume. These parameters ensure stable and realistic dynamics while adhering to thermodynamic constraints.
Once the .tpr file is prepared via gmx grompp -f simulation.mdp -c coordinates.gro -p topology.top -o run.tpr, the simulation is executed with gmx mdrun -s run.tpr, optionally using flags like -deffnm to set default prefixes for output files (e.g., -deffnm sim generates sim.trr for the trajectory). GROMACS produces several output files for monitoring and analysis: .trr for full-precision trajectories including coordinates, velocities, and forces; .xtc for compressed coordinate-only trajectories to save storage; and .edr for energy data such as potential energy, kinetic energy, and temperature. Real-time monitoring during runs can be enabled through these files, allowing users to assess convergence without interrupting the process.[43]
A typical workflow example for a solvated biomolecular system includes ionization and equilibration phases before production MD. After initial topology and solvation setup, ions are added for charge neutralization using gmx genion -s ions.tpr -p topology.top -neutral -o ionized.gro, replacing solvent molecules (e.g., water) with counterions like Na⁺ or Cl⁻ as needed. Equilibration proceeds in two stages: first, an NVT (constant number, volume, temperature) run with an .mdp file setting tcoupl = v-rescale and nsteps = 50000 to stabilize temperature at 300 K; second, an NPT (constant number, pressure, temperature) run adding pcoupl = parrinello-rahman and ref-p = 1.0 bar to adjust density. The production MD then uses a similar .mdp but with extended nsteps (e.g., 5000000 for 10 ns) and all couplings active, executed via gmx mdrun on the final .tpr file. This stepwise approach minimizes artifacts and ensures reliable trajectories.[44]
Analysis Tools
Built-in Modules
GROMACS provides a suite of built-in command-line modules for processing and analyzing molecular dynamics simulation outputs, enabling users to extract physical quantities, manipulate trajectories, and perform statistical evaluations directly from native file formats such as .xtc. These tools are integrated into the core package and operate on data generated by simulations, facilitating post-processing without external dependencies.[45] Among the core analysis tools,gmx energy extracts components of potential and kinetic energies, as well as virials, from energy files produced during simulations, allowing computation of time averages and export to plotting formats like XVG for further visualization or statistical analysis.[46] The tool supports selection of specific energy terms and time ranges via options such as -f for input files and -o for output, making it essential for evaluating thermodynamic properties like pressure or temperature fluctuations.[46] Similarly, gmx rms computes root-mean-square deviations (RMSD) between atomic positions in a trajectory and a reference structure, fitting the trajectory to minimize deviations and outputting RMSD values over time to assess structural stability or conformational changes. It includes options like -s for the reference structure and group selections for focusing on specific atoms, such as backbone residues. Complementing these, gmx gyrate calculates the radius of gyration, a measure of molecular compactness based on the mass-weighted root-mean-square distance of atoms from their center of mass, tracking changes in shape and size across the trajectory. Key options include -pbc for periodic boundary condition handling and output to XVG files for time-series analysis.
For trajectory manipulation and structural analysis, gmx trjconv enables editing and conversion of trajectory files, such as centering molecules, fitting to a reference, or selecting subsets of atoms and frames to prepare data for downstream processing. It supports input from compressed formats and options like -ur for unit cell representation, ensuring compatibility with various analysis workflows. The gmx cluster tool performs structural clustering using the GROMOS method, which iteratively groups conformers based on RMSD cutoffs to identify representative structures and quantify population dynamics in ensembles. It generates output files detailing cluster sizes and central structures, with tunable parameters like the -cutoff for RMSD thresholds.
In preparation for visualization and advanced profiling, gmx sham constructs free energy landscapes by processing one- or two-dimensional histograms from simulation data, applying the Boltzmann relation to convert probability densities into free energy estimates (in units of kT). Users specify histogram inputs via -f and control binning or normalization options to generate contour plots or XVG outputs. Additionally, gmx do_dssp assigns secondary structure elements to protein residues using the DSSP algorithm, analyzing hydrogen bonds and dihedral angles in trajectories to classify helices, sheets, and turns over time. It outputs per-residue assignments to index or XVG files, with options for trajectory fitting to align structures prior to assignment.
For statistical evaluation, gmx analyze processes time-series data to compute autocorrelation functions and time correlations, incorporating block averaging to estimate errors and assess the reliability of averages from finite simulations.[47] It handles inputs from XVG files with options like -ac for autocorrelations and -bw for block widths, providing normalized correlation coefficients and integrated times for quantities such as energies or coordinates.[47] This module is particularly useful for validating simulation convergence by quantifying statistical uncertainties.[47]
External Integrations
GROMACS provides seamless integration with the PLUMED library for enhanced sampling techniques, enabling methods such as metadynamics and umbrella sampling directly within simulations. This built-in support was introduced in the 2025 release, allowing users to activate PLUMED functionality via the commandgmx mdrun -plumed plumed.dat on non-Windows systems without requiring external patching, by bundling a feature-limited version of the library. The interface relies on the PLUMED_KERNEL environment variable to link the PLUMED kernel, supporting plain-text input files for defining collective variables and biases, though it does not interact with GROMACS energy minimization or replica exchange.[48][49]
For visualization, GROMACS outputs in formats like .gro and .pdb are natively compatible with tools such as VMD and PyMOL, facilitating the display and analysis of molecular structures and trajectories. VMD directly reads GROMACS trajectory files including .trr and .xtc, supporting 3D graphics, animation, and scripting for large biomolecular systems. PyMOL, while not reading GROMACS trajectories by default, imports them through incorporated VMD plugins, enabling rendering of bonds and structures, though topology details from GROMACS .top or .tpr files may require manual adjustment for accurate visualization.[50]
GROMACS interoperates with AMBER tools through format conversion utilities, supporting hybrid simulations by transferring topologies and parameters between the packages for multi-scale or mixed-force-field workflows. Tools like ParmEd allow conversion of AMBER .prmtop files to GROMACS .top formats, enabling the use of AMBER force fields in GROMACS simulations with minimal adjustments to non-bonded interactions. This interoperability facilitates hybrid setups, such as combining AMBER-parameterized regions with GROMACS dynamics for enhanced accuracy in biomolecular modeling.[51]
Additionally, GROMACS integrates with QwikMD, a graphical user interface plugin for VMD, to streamline GUI-based simulation setup and preparation. QwikMD supports the generation of input files compatible with GROMACS workflows, allowing users to configure systems visually before exporting topologies and coordinates for GROMACS execution, particularly useful for novices handling complex biomolecular setups. This integration leverages VMD's visualization capabilities to bridge GUI ease with GROMACS's computational power.[52][53]
Advanced integrations include density-fit simulations with cryo-EM data, supported through dedicated tutorials available since 2025, which guide users in refining atomic models against electron density maps. These simulations apply additional forces to fit protein structures into cryo-EM densities, using GROMACS options like density-guided-simulation for periodic boundary conditions and map alignment, as demonstrated in interactive Jupyter notebooks for systems like the calcitonin receptor-like receptor.[54][55]
GROMACS also couples to quantum mechanics via the CP2K interface for hybrid QM/MM simulations, enabling electrostatic embedding of quantum regions within classical molecular dynamics. This interface, requiring CP2K version 8.1 or later linked as libcp2k, supports DFT methods like PBE and BLYP with DZVP-MOLOPT basis sets, activated through .mdp options such as qmmm-cp2k-active=true and QM group selection, though it excludes topology changes in the QM region and virial contributions from QM/MM forces.[56]
Applications
Biomolecular Research
GROMACS has been extensively applied in biomolecular research to simulate the dynamics of proteins, enabling detailed investigations into folding pathways and conformational changes. For instance, in peptide folding studies, GROMACS integrates NMR-derived restraints into molecular dynamics simulations to refine structures and explore folding trajectories, as demonstrated in simulations of the 1LB0 peptide where restrained runs converged to low RMSD values compared to unrestrained ones over 100 ns production time.[57] These capabilities allow researchers to model protein folding under physiological conditions using force fields like AMBER or CHARMM. In ligand binding scenarios, GROMACS facilitates free energy perturbation (FEP) calculations to quantify binding affinities, crucial for drug discovery. A notable example is the virtual screening of approximately 12,000 compounds against SARS-CoV-2 main protease (Mpro) and TMPRSS2, where FEP simulations identified 50 potent Mpro inhibitors with IC50 values below 100 µM, including dipyridamole (IC50 = 0.6 µM), validated through experimental assays.[58] Similarly, FEP applied to known inhibitors of SARS-CoV-2 Mpro achieved a high correlation (R = 0.94) with experimental binding free energies, outperforming docking methods and highlighting compounds like delamanid with nanomolar affinity.[59] In membrane and nucleic acid studies, GROMACS supports coarse-grained modeling to access longer timescales relevant to biological processes. For lipid bilayers, the MARTINI force field, implemented in GROMACS, enables simulations of self-assembled membranes, accurately reproducing properties such as area per lipid, thickness, and bending modulus in implicit solvent environments.[60] This approach has been used to study multicomponent bilayers, including domain coexistence and membrane fusion events, providing insights into cellular membrane organization without explicit solvent overhead. For nucleic acids, GROMACS simulations reveal flexibility differences between DNA and RNA duplexes; all-atom MD runs on 40-bp dsDNA and dsRNA showed dsRNA exhibiting lower stretch modulus due to enhanced base-pair inclination and slide, with twist-stretch coupling signs differing between B-form DNA and A-form RNA, consistent with experimental data across multiple force fields.[61] Enhanced sampling techniques in GROMACS, such as replica exchange molecular dynamics (REMD), improve exploration of rare events in biomolecular systems, particularly for enzyme mechanisms and allostery. REMD simulations of the oncogenic phosphatase SHP2, using 76 replicas across 300–400 K, demonstrated high mobility in the N-SH2 domain during activation, revealing allosteric regulation via EF and BG loops that modulate binding site accessibility, rather than the central β-sheet, in both wild-type and E76K mutant forms.[62] These runs, spanning 250–300 ns, highlighted conformational instability of the crystallographic active state in solution, informing drug design strategies for allosteric inhibitors. Case studies underscore GROMACS' integration with experimental methods in biomolecular research. In validating AlphaFold-predicted structures, GROMACS-based MD simulations assess dynamic stability; for example, refinements of serpin mutants like antithrombin variants revealed limitations in capturing latent or polymer transitions, with RMSD analyses showing persistent native conformations despite predicted changes, emphasizing the need for extended simulations to probe mutation effects.[63] For cryo-EM refinement, GROMACS employs correlation-driven molecular dynamics (CDMD) to fit atomistic models into density maps via simulated annealing and gradual resolution scaling, achieving superior agreement without manual restraints, as implemented in workflows using options likeDensityFitting = yes. A 2025 BioExcel webinar tutorial details these density-guided simulations for cryo-EM structure reconstruction, demonstrating automated refinement across resolutions from near-atomic to subnanometer.[64][65]