Physics-informed neural networks
Physics-informed neural networks (PINNs) are a class of deep neural networks designed to solve forward and inverse problems governed by physical laws, particularly nonlinear partial differential equations (PDEs), by embedding these laws directly into the network's training process. Building on earlier neural network methods for PDE solving, such as those proposed by Lagaris et al.,[1] the modern PINN framework was first introduced in 2017 by Maziar Raissi, Paris Perdikaris, and George Em Karniadakis,[2] with formal publication in 2019.[3] PINNs leverage automatic differentiation to compute PDE residuals and incorporate them into a composite loss function alongside data-fitting terms, enabling mesh-free approximations that respect the underlying physics without requiring extensive labeled datasets.[3] This approach bridges traditional numerical methods like finite element analysis with machine learning, offering a data-efficient framework for modeling complex physical systems.[4]
At their core, PINNs approximate the solution to a PDE as the output of a neural network, typically a fully connected architecture with hyperbolic tangent or similar activation functions, whose parameters are optimized to minimize a loss that balances empirical data error against the violation of physical constraints evaluated at collocation points in the domain.[2] For forward problems, PINNs predict solutions given known parameters and boundary conditions, while for inverse problems they infer unknown parameters or even discover governing equations from sparse or noisy observations.[3] Key advantages include the ability to handle high-dimensional problems, to incorporate uncertainty quantification through Bayesian variants, and to generalize beyond training data by enforcing conservation laws or symmetries, often outperforming purely data-driven models when measurements are limited.
Since their inception, PINNs have evolved through numerous variants that address limitations such as optimization difficulties and poor convergence on stiff PDEs, including conservative PINNs (cPINNs) that enforce integral constraints for better stability in transport phenomena, extended PINNs (XPINNs) that use domain decomposition for scalability, and fractional PINNs (fPINNs) for non-local operators.[4] These extensions have expanded applications across fields such as fluid dynamics (e.g., simulating blood flow from MRI data), quantum mechanics (solving Schrödinger equations), climate modeling (parameterizing subgrid processes), and materials science (predicting microstructure evolution). Despite these successes, ongoing challenges include balancing loss terms to avoid failure modes like spectral bias and improving theoretical guarantees for convergence in diverse settings.[5] Overall, PINNs represent a cornerstone of physics-informed machine learning, fostering hybrid models that enhance scientific discovery and engineering design.[4]
Overview and Background
Definition and Principles
Physics-informed neural networks (PINNs) are neural networks trained to solve supervised learning tasks while respecting physical laws, such as those governed by nonlinear partial differential equations (PDEs) or ordinary differential equations (ODEs). They function as universal function approximators, representing solutions to physical systems by embedding the governing equations directly into the loss function during training via automatic differentiation, which enables the computation of derivatives without explicit discretization.[6]
The core principles of PINNs revolve around leveraging known physics to regularize the learning process, thereby addressing data scarcity in scientific applications where observations are often limited or expensive to obtain. By incorporating physical constraints as priors, PINNs constrain the solution space to physically plausible outcomes, reducing overfitting and improving generalization. In contrast to conventional numerical methods like finite element analysis, which rely on mesh generation and can be computationally intensive for complex geometries, PINNs employ a mesh-free paradigm, evaluating the equations at arbitrarily chosen collocation points within the domain.[5][6]
The fundamental workflow of PINNs involves parameterizing the solution, such as the function u(x,t) for spatiotemporal problems, with a neural network and enforcing physical consistency through minimization of the residual arising from the governing equations. This residual is computed seamlessly using automatic differentiation and added to the training loss, often alongside data-fitting terms from boundary or initial conditions.[6] PINNs provide distinct advantages, including their capacity to tackle high-dimensional problems that challenge traditional solvers due to exponential scaling in computational cost, while integrating noisy or incomplete data effectively through balanced loss components. They support unified treatment of forward modeling, where solutions are predicted given known parameters, and inverse modeling, where parameters are inferred from measurements, all within a differentiable framework. For example, PINNs approximate solutions to the one-dimensional Burgers' equation, a canonical nonlinear PDE describing shock formation in viscous fluids, by directly embedding the equation's structure to guide learning from sparse data points.[5][6]
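As a minimal worked illustration of this setup (using the standard form of the equation, with \nu denoting the viscosity), the one-dimensional viscous Burgers' equation can be written as
u_t + u\,u_x - \nu\,u_{xx} = 0,
and a PINN replaces u with the network output \tilde{u}(\theta; x, t), driving the residual
r(\theta; x, t) = \tilde{u}_t + \tilde{u}\,\tilde{u}_x - \nu\,\tilde{u}_{xx}
toward zero at collocation points while simultaneously fitting any available initial, boundary, or interior data.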
Historical Development
The origins of physics-informed neural networks (PINNs) trace back to 2017, when Maziar Raissi, Paris Perdikaris, and George Em Karniadakis posted an arXiv preprint introducing the core concept as a means to solve nonlinear partial differential equations (PDEs) by embedding physical laws directly into neural network training.[2] This foundational work built on earlier ideas of data-driven PDE discovery but marked the explicit formulation of PINNs for both forward PDE solving and inverse parameter estimation.[3] The approach leveraged the success of deep learning in scientific computing, particularly the rise of automatic differentiation in frameworks like PyTorch and TensorFlow, which enabled efficient computation of PDE residuals without traditional discretization.
The preprint, updated in 2018 and formally published in 2019, formalized PINNs as a deep learning framework for forward and inverse problems involving nonlinear PDEs, including demonstrations on the Navier-Stokes equations for incompressible flow modeling.[3] The 2019 publication established PINNs' data-driven solution capabilities, emphasizing the integration of sparse data with physics constraints, and rapidly gained influence, accumulating over 10,000 citations by 2023.[7] Early adoption focused on fluid dynamics, such as solving the Navier-Stokes and Burgers' equations, where PINNs outperformed traditional numerical methods in handling noisy or limited data scenarios.[8]
Subsequent developments in 2020 extended PINNs to uncertainty quantification, with Bayesian physics-informed neural networks (B-PINNs) incorporating Bayesian inference to assess prediction reliability in PDE solutions, addressing a key limitation of deterministic neural network outputs.[9] From 2021 to 2023, the field saw rapid growth in variants, including conservative PINNs (cPINNs), introduced in 2020, which enforce flux continuity on discrete domains to improve stability for conservation laws.[10] By 2022, PINNs had expanded beyond fluid dynamics to multiphysics problems, such as coupled flow-mechanics systems and subsurface transport, enabling simulations of interacting phenomena like poroelasticity and multiphase flows.[11] These advancements were fueled by the framework's flexibility in handling complex geometries and the growing availability of open-source implementations, solidifying PINNs as a standard tool in computational science. By 2024–2025, PINNs continued to evolve with advancements in network architectures and theoretical guarantees, as detailed in subsequent sections.[12]
Mathematical Foundations
Network Architecture
Physics-informed neural networks (PINNs) typically employ fully connected feedforward neural networks as their core architecture to approximate solutions to partial differential equations (PDEs). These networks take spatial and temporal coordinates, such as (x, t), as inputs and output the corresponding solution variables, for instance u(x, t), representing the latent solution of the PDE.[6] The architecture leverages the universal approximation theorem, enabling the network to represent complex functions defined over continuous domains without requiring a mesh.[6]
Activation functions play a crucial role in ensuring the smoothness required for accurate derivative computations via automatic differentiation, which is used to enforce the physical constraints. Common choices include the hyperbolic tangent (tanh) function, favored for its bounded and differentiable properties that facilitate smooth approximations, or the swish activation (\text{swish}(z) = z \cdot \sigma(z), where \sigma is the sigmoid function), which has shown improved performance in capturing non-linear behaviors in certain PDEs.[6][13]
Typical hyperparameters for PINN architectures involve 5 to 10 hidden layers with 50 to 100 neurons per layer, selected based on the complexity of the PDE to balance expressiveness and computational efficiency.[6] These configurations are often optimized using gradient-based methods like Adam, allowing the trainable parameters \theta to adjust the network to fit both data and physics. For example, in solving the Burgers' equation, a network with 8 hidden layers and 20 neurons each has been used effectively.[6]
The input-output mapping in PINNs involves sampling collocation points randomly within the computational domain to enforce the governing PDE, while boundary and initial conditions are incorporated as additional training data points. This unsupervised sampling strategy enables mesh-free enforcement of physics across the domain.[6] Mathematically, the neural network provides an approximation \tilde{u}(\theta; x, t) to the true solution u(x, t), where \theta denotes the set of network weights and biases.[6]
For challenging problems involving stiff PDEs, adaptations such as multi-scale architectures have been developed to better resolve disparate length or time scales. These include hierarchical networks or Fourier feature embeddings that enhance the representational capacity for multi-scale phenomena, improving convergence on problems like high-Reynolds-number flows or reaction-diffusion systems.[14]
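The sketch below, written in PyTorch purely for illustration (the class name PINN, the layer counts, and the neuron counts are illustrative assumptions rather than a reference implementation), shows how such a fully connected architecture with tanh activations can be assembled for a problem with inputs (x, t) and a scalar output u(x, t).

```python
import torch
import torch.nn as nn


class PINN(nn.Module):
    """Fully connected network mapping (x, t) to the approximate solution u(x, t)."""

    def __init__(self, hidden_layers=8, neurons=20):
        super().__init__()
        layers = [nn.Linear(2, neurons), nn.Tanh()]             # input layer: (x, t)
        for _ in range(hidden_layers - 1):
            layers += [nn.Linear(neurons, neurons), nn.Tanh()]  # hidden layers with tanh
        layers.append(nn.Linear(neurons, 1))                    # output layer: u(x, t)
        self.net = nn.Sequential(*layers)

    def forward(self, x, t):
        # x and t are passed as separate (N, 1) tensors so that derivatives with
        # respect to each coordinate remain available to automatic differentiation.
        return self.net(torch.cat([x, t], dim=1))
```

Keeping x and t as separate input tensors is a convenience for the residual computation described in the next section, where derivatives of the output with respect to each coordinate are required.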
Loss Function Formulation
The loss function in physics-informed neural networks (PINNs) is designed to enforce both data consistency and adherence to physical laws by combining multiple terms into a composite objective. The neural network approximates the solution to a partial differential equation (PDE) as \tilde{u}(\theta; x, t), where \theta represents the network parameters and x, t denote spatial and temporal coordinates. This approximation is trained by minimizing a loss that balances empirical data fitting with the residual of the governing physics. The composite loss is generally expressed as
\mathcal{L}(\theta) = \mathcal{L}_\text{data}(\theta) + \lambda \mathcal{L}_\text{physics}(\theta) + \mathcal{L}_\text{boundary}(\theta),
where \mathcal{L}_\text{data} quantifies the discrepancy between predicted and observed values, \mathcal{L}_\text{physics} penalizes violations of the PDE, \mathcal{L}_\text{boundary} enforces initial and boundary conditions (ICs/BCs), and \lambda serves as a hyperparameter that balances the physics term. In the original PINN formulation, the data loss \mathcal{L}_\text{data} is the mean squared error (MSE) over N_d observed points:
\mathcal{L}_\text{data}(\theta) = \frac{1}{N_d} \sum_{i=1}^{N_d} \left\| \tilde{u}(\theta; x_d^i, t_d^i) - u_d^i \right\|^2,
while boundary terms are incorporated into this MSE to ensure compliance with ICs/BCs at designated points. The physics loss \mathcal{L}_\text{physics} arises from the PDE residual; for a general PDE written as \mathcal{F}[u] = 0, the residual is r = \mathcal{F}[\tilde{u}(\theta; x, t)], evaluated at N_c collocation points via automatic differentiation to obtain the required derivatives (e.g., partials with respect to x and t). Thus,
\mathcal{L}_\text{physics}(\theta) = \frac{1}{N_c} \sum_{i=1}^{N_c} \left\| \mathcal{F}[\tilde{u}(\theta; x_c^i, t_c^i)] \right\|^2.
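As an illustration of how such a residual can be evaluated in practice, the sketch below builds on the hypothetical PINN class above and uses the viscous Burgers' equation u_t + u u_x - \nu u_{xx} = 0 (with the common benchmark viscosity \nu = 0.01/\pi) as the operator \mathcal{F}; the function name and signature are illustrative assumptions, not part of any standard API.

```python
import math

import torch


def burgers_residual(model, x, t, nu=0.01 / math.pi):
    """PDE residual r = u_t + u * u_x - nu * u_xx at the given collocation points."""
    # Enable differentiation with respect to the input coordinates.
    x = x.clone().requires_grad_(True)
    t = t.clone().requires_grad_(True)
    u = model(x, t)

    # Helper: derivative of a batched output with respect to one input tensor,
    # keeping the graph so higher-order derivatives and backprop remain possible.
    grad = lambda out, inp: torch.autograd.grad(
        out, inp, grad_outputs=torch.ones_like(out), create_graph=True)[0]

    u_t = grad(u, t)
    u_x = grad(u, x)
    u_xx = grad(u_x, x)
    return u_t + u * u_x - nu * u_xx  # r = F[u_tilde](x, t)
```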
The full unweighted loss in the seminal work simplifies to the sum of the data and physics MSEs without an explicit \lambda, treating the terms equally. Boundary enforcement is handled softly by inclusion in the data loss, though a separate \mathcal{L}_\text{boundary} term can be added for clarity in more complex setups.
Weighting strategies are crucial for effective training, as mismatched scales between loss terms can lead to suboptimal convergence. The original 2019 PINN approach used fixed, equal weights, so in practice \lambda often requires manual tuning to prioritize physics over data or vice versa. Subsequent improvements introduced self-adaptive weighting, where \lambda (or equivalent per-term weights) is learned during training via gradient-based updates, such as gradient ascent on a soft attention mechanism applied to the residuals at each point. This adaptive approach dynamically balances the terms without manual hyperparameter intervention, enhancing robustness across diverse PDEs. Soft constraints, in which PDE residuals and boundary terms are added to the loss, predominate in PINNs because they keep the objective fully differentiable and allow seamless backpropagation; hard constraints, by contrast, enforce conditions directly in the network architecture (e.g., via custom layers).
Training proceeds by minimizing the composite loss using gradient-descent optimizers such as Adam, with automatic differentiation facilitating computation of gradients through both the network outputs and the embedded derivatives in the residuals. This end-to-end differentiability allows the optimizer to propagate errors from physics violations back to \theta, ensuring that the learned \tilde{u} satisfies both the data and the equations simultaneously.
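A minimal end-to-end sketch of this training procedure, under the same illustrative assumptions as above (the hypothetical PINN network and burgers_residual function, randomly generated placeholder data, and a fixed weight lam standing in for \lambda), might look as follows.

```python
import torch

model = PINN()                                    # architecture sketched earlier
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1.0                                         # fixed physics weight (lambda)

# Placeholder training sets; in practice these come from measurements, ICs/BCs,
# and random sampling of the space-time domain.
x_d = torch.rand(100, 1) * 2 - 1                  # data/boundary coordinates in [-1, 1]
t_d = torch.rand(100, 1)
u_d = torch.zeros(100, 1)                         # observed values (placeholder)
x_c = torch.rand(2000, 1) * 2 - 1                 # collocation point coordinates
t_c = torch.rand(2000, 1)

for step in range(10000):
    optimizer.zero_grad()
    loss_data = torch.mean((model(x_d, t_d) - u_d) ** 2)   # L_data: MSE on observations
    residual = burgers_residual(model, x_c, t_c)            # F[u_tilde] at collocation points
    loss_physics = torch.mean(residual ** 2)                # L_physics: mean squared residual
    loss = loss_data + lam * loss_physics                   # composite objective
    loss.backward()      # gradients flow through both the network and the embedded derivatives
    optimizer.step()
```

A self-adaptive variant would replace the fixed lam with trainable per-term or per-point weights updated by gradient-based rules alongside the network parameters, as described above.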