Biomolecular structure

Biomolecular structure refers to the three-dimensional arrangement of atoms in biological macromolecules, which dictates their stability, function, and interactions within living organisms. Four major classes of biomolecules are usually distinguished: proteins, nucleic acids, carbohydrates, and lipids, each with distinct architectural features essential for cellular processes. Proteins, the most diverse class, are linear polymers of 20 standard amino acids linked by peptide bonds, folding into complex shapes that enable roles in catalysis, transport, and structural support. Nucleic acids, including DNA and RNA, consist of nucleotide monomers with nitrogenous bases, sugars, and phosphates, forming double helices (in DNA) or single strands (in RNA) that store and transmit genetic information through base pairing. Carbohydrates include polysaccharides built from monosaccharide units like glucose, connected by glycosidic bonds to create linear or branched chains that provide energy storage and structural integrity, as in cellulose. Lipids, including fats, phospholipids, and steroids, feature hydrophobic hydrocarbon chains and are often amphipathic, assembling into membranes and serving as energy reserves or signaling molecules.

These biomolecules are organized across hierarchical levels of structure, most clearly in proteins and nucleic acids. Primary structure is the linear sequence of monomers (e.g., the amino acid order in proteins). Secondary structure involves local folding patterns stabilized by hydrogen bonds, such as alpha helices and beta sheets in proteins or base-paired stems in RNA. Tertiary structure is the overall three-dimensional fold of a single chain, driven by non-covalent interactions such as the hydrophobic effect and electrostatic forces. Quaternary structure, when applicable, describes the assembly of multiple subunits into functional complexes, as in hemoglobin. Understanding these levels is crucial, because disruptions in structure, whether due to mutations or environmental factors, can lead to loss of function and disease.

Overview

Definition and Scope

Biomolecular structure refers to the three-dimensional arrangement of atoms in biological molecules, which determines their shape, stability, and function at atomic, molecular, and hierarchical levels. This organization arises from the precise positioning of atoms connected by chemical bonds and influenced by the surrounding environment, enabling molecules to perform essential roles in cellular processes. The scope of biomolecular structure encompasses the primary classes of biomolecules: proteins, which serve as enzymes and structural components; nucleic acids, including DNA and RNA, for genetic information storage and transfer; carbohydrates, involved in energy storage and cell recognition; and lipids, which form membranes and act as signaling molecules. These macromolecules, along with their smaller constituents, are the building blocks of living organisms.

The field originated in early 20th-century biochemistry, with foundational progress such as Frederick Sanger's determination of the amino acid sequence of insulin between 1945 and 1955, which provided the first complete primary structure of a protein and established sequencing as a key analytical tool. Central to biomolecular architecture are covalent interactions, such as peptide and phosphodiester bonds, which define the primary connectivity. These contrast with non-covalent interactions (hydrogen bonds, van der Waals forces, ionic bonds, and hydrophobic effects) that drive folding and assembly into functional three-dimensional forms.

Biological Importance

The three-dimensional structure of biomolecules fundamentally dictates their biological function, enabling the precise molecular interactions essential for cellular processes. For instance, the geometry of an enzyme's active site determines its catalytic specificity and efficiency, allowing substrates to bind and reactions to proceed. Similarly, the structural features of receptor sites govern ligand recognition, which is critical for processes like signaling and immune responses. This structure-function paradigm underscores how biomolecular conformations enable the diverse activities that sustain life, from catalysis to cellular communication.

Evolutionary pressures have conserved key structural motifs across species, reflecting their indispensable roles in core biological functions. A prominent example is the Rossmann fold, a β-α-β sandwich domain found in nucleotide-binding enzymes like dehydrogenases, which has been preserved throughout evolution because of its efficiency in cofactor binding and catalysis. Such conservation shows how structural stability and functionality are selected for, allowing homologous proteins to perform analogous tasks in distant organisms and providing insight into the origins of metabolic pathways.

Aberrant biomolecular structures contribute significantly to disease pathogenesis, often through misfolding or mutations that disrupt normal function. In Alzheimer's disease, misfolded amyloid-β peptides aggregate into insoluble fibrils, leading to neurotoxic plaques that impair neuronal health and contribute to cognitive decline. Likewise, in sickle cell anemia, a single amino acid substitution in hemoglobin (glutamic acid to valine at position 6 of the β-chain) promotes polymerization of the protein into rigid fibers that deform red blood cells and cause vascular occlusion. These examples illustrate how structural deviations can cascade into systemic disorders, emphasizing the need for structural biology in diagnostics and therapeutics.

Understanding biomolecular structures has transformed applications in medicine and biotechnology by enabling targeted interventions. Structure-based drug design uses atomic-level models to develop inhibitors that bind specific protein pockets, accelerating the discovery of therapies for diseases such as cancer. In biotechnology, advances in protein engineering since the 2000s have used structural insights to create novel enzymes and therapeutics with enhanced stability and function, supporting innovations such as industrial biocatalysis.

Protein Structure

Primary Structure

The primary structure of a protein is the linear sequence of amino acids covalently linked by peptide bonds to form a polypeptide chain, with a free amino group at the N-terminus and a free carboxyl group at the C-terminus. This sequence is typically written using one-letter codes for the amino acids, such as A for alanine, C for cysteine, and G for glycine, as standardized by the International Union of Pure and Applied Chemistry (IUPAC). Proteins are composed of 20 standard amino acids, each distinguished by a unique side chain (R group) that imparts specific chemical properties, including hydrophobicity, polarity, or charge. The average length of a protein is approximately 300 amino acids, though this varies widely across organisms and functions, with eukaryotic proteins often longer than those in bacteria.

Historically, primary structure was determined using Edman degradation, a method developed by Pehr Edman in the 1950s that sequentially removes and identifies the N-terminal amino acid through reaction with phenylisothiocyanate, enabling automated sequencing of up to 50-60 residues. Modern approaches rely primarily on mass spectrometry, such as tandem mass spectrometry (MS/MS) coupled with liquid chromatography, which fragments peptides and analyzes their mass-to-charge ratios to reconstruct the sequence, often by matching against databases.

The primary structure serves as the blueprint for all higher levels of protein organization. Alterations in the sequence, often caused by mutations such as single nucleotide polymorphisms (SNPs) that change a codon, can disrupt protein function and lead to disease, as in sickle cell anemia, which results from a single amino acid substitution in hemoglobin. SNPs in coding regions may introduce missense mutations, replacing one amino acid with another and thereby affecting the protein's overall properties. The sequence also directly influences the propensity for local folding patterns in secondary structure.
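
As a minimal sketch of the one-letter notation and of a missense substitution, the Python below converts three-letter residue names to one-letter codes and replaces a single residue. The function names are hypothetical, and the eight-residue fragment is only an illustration of the glutamate-to-valine change at position 6 of the β-chain described above.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Join a list of three-letter residue names into a one-letter string."""
    return "".join(THREE_TO_ONE[name] for name in residues)


def substitute(sequence, position, new_residue):
    """Return a copy of `sequence` with the residue at a 1-based position replaced."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    if new_residue not in THREE_TO_ONE.values():
        raise ValueError("not a standard one-letter code")
    index = position - 1  # sequences are numbered from the N-terminus, starting at 1
    return sequence[:index] + new_residue + sequence[index + 1:]


# Illustrative N-terminal fragment (hypothetical helper input).
fragment = to_one_letter(["Val", "His", "Leu", "Thr", "Pro", "Glu", "Glu", "Lys"])
variant = substitute(fragment, 6, "V")  # Glu -> Val at position 6
```

Numbering from 1 at the N-terminus matches the convention used when a substitution is described by its position in the chain.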

Secondary Structure

Secondary structure refers to the local spatial arrangement of the polypeptide backbone in proteins, primarily stabilized by hydrogen bonds between the carbonyl oxygen and amide hydrogen atoms of the peptide bonds, excluding those involving side chains. These conformations arise from the inherent flexibility and steric constraints of the backbone, allowing segments of the chain to adopt repeating patterns that contribute to the overall fold without involving distant interactions. The most prevalent secondary structures are the α-helix and β-sheet, first proposed by Linus Pauling and Robert Corey in 1951 based on model-building constrained by known bond lengths and angles.

The α-helix is a right-handed coil in which the polypeptide backbone forms a cylindrical structure with 3.6 residues per turn and a rise of 1.5 Å per residue along the helical axis, giving a pitch of approximately 5.4 Å. In this configuration, hydrogen bonds form between the carbonyl oxygen of residue i and the amide N-H group of residue i+4, creating a stable intra-chain network that aligns the peptide dipoles nearly parallel to the axis. The side chains project outward from the helix, enabling hydrophobic residues to interact with the environment. α-helices are particularly common in transmembrane proteins, where segments of 20-25 residues span the lipid bilayer, often packed into bundles.

The β-sheet consists of two or more β-strands (extended polypeptide segments) aligned laterally to form a pleated, sheet-like structure, with hydrogen bonds between adjacent strands stabilizing the assembly. β-sheets can adopt parallel or antiparallel orientations. In antiparallel sheets, adjacent strands run in opposite directions, so the hydrogen bonds lie nearly perpendicular to the strands, which enhances stability. Parallel sheets have strands running in the same direction, with offset and less direct hydrogen bonds. These sheets often twist because of the chirality of L-amino acids, and in membrane proteins they frequently form closed cylindrical structures known as β-barrels, which are prevalent in porins and other outer membrane proteins that form channels.

The conformational space available to the polypeptide backbone is visualized in the Ramachandran plot, which maps the dihedral angles φ (phi, rotation around the N-Cα bond) and ψ (psi, rotation around the Cα-C bond) for each residue, revealing the regions allowed by steric constraints from van der Waals repulsions. Allowed regions cluster around φ ≈ -60°, ψ ≈ -45° for α-helices and φ ≈ -120°, ψ ≈ 120° for β-sheets, while glycine's lack of a side chain permits broader access and proline's ring restricts φ to about -60°. Disallowed areas correspond to conformations that would cause atomic clashes, and so indicate which secondary structures are feasible.

Beyond helices and sheets, other secondary structure motifs include β-turns and loops, which introduce reversals or irregular segments in the chain to connect regular elements. A β-turn typically spans four residues, with a tight bend stabilized by a hydrogen bond between the carbonyl of residue i and the amide of residue i+3, and is classified into types (I, II, etc.) based on the φ and ψ angles at positions i+1 and i+2. Loops are longer, non-repetitive segments lacking regular hydrogen bonding patterns; they are often solvent-exposed and functionally important for flexibility.

Early methods for predicting secondary structure from primary sequence relied on empirical parameters derived from known protein structures, such as the Chou-Fasman rules from the 1970s. These assign propensity values P_α, P_β, and P_t to each amino acid for helix, sheet, and turn formation, respectively, and identify potential segments where local averages exceed thresholds (e.g., P_α > 1.03 for nucleation). The parameters, calculated from a statistical analysis of 29 proteins, reflect sequence-dependent conformational preferences and were refined over time for better accuracy.
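
The helix geometry and the Ramachandran regions described above can be sketched in a few lines of Python. This is an illustration only: the rectangular region boundaries are rough assumptions chosen to enclose the typical values quoted in the text, not a published definition, and the function names are hypothetical.

```python
RISE_PER_RESIDUE = 1.5   # angstroms along the helix axis, per residue
RESIDUES_PER_TURN = 3.6


def helix_length(n_residues):
    """Axial length in angstroms of an ideal alpha-helix of n residues."""
    return n_residues * RISE_PER_RESIDUE


def helix_pitch():
    """Axial advance per full turn in angstroms (3.6 x 1.5 = 5.4)."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE


def classify_phi_psi(phi, psi):
    """Coarse label for a (phi, psi) pair in degrees.

    The boxes below are assumed, simplified regions around the typical
    alpha-helix (-60, -45) and beta-sheet (-120, 120) values.
    """
    if -100 <= phi <= -30 and -80 <= psi <= -5:
        return "alpha"
    if -180 <= phi <= -45 and 90 <= psi <= 180:
        return "beta"
    return "other"


labels = [classify_phi_psi(phi, psi) for phi, psi in [(-60, -45), (-120, 120), (60, 45)]]
```

With these numbers a 20-residue helix is about 30 Å long, which is consistent with the 20-25 residue segments that span a lipid bilayer. A real analysis would use contoured regions derived from structural data rather than boxes.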

Tertiary Structure

The tertiary structure of a protein describes the overall three-dimensional arrangement of a single polypeptide chain, including its side chains, resulting in a compact, globular fold that brings distant residues into close proximity to create a functional conformation. This fold is primarily stabilized by the formation of a hydrophobic core, where nonpolar side chains cluster in the interior away from the aqueous environment. The driving force is the hydrophobic effect, an entropy-dominated process in which water molecules gain disorder when released from ordered solvation shells around hydrophobic residues. Additional stabilization arises from covalent disulfide bonds between cysteine residues, which lock specific regions in place, and from noncovalent salt bridges (ionic interactions between oppositely charged side chains such as lysine and aspartate), which contribute to overall stability, particularly in thermophilic proteins. Pi-stacking interactions between aromatic side chains further reinforce the core through attractive forces between their electron clouds.

Protein tertiary structures often consist of modular domains and motifs, which are recurrent folding patterns that confer specific functions. For example, the immunoglobulin fold is a beta-sandwich domain composed of two antiparallel beta-sheets stabilized by a conserved disulfide bond, commonly found in antibody variable regions and cell adhesion molecules. Another prominent motif is the zinc finger, a compact structure in which a zinc ion coordinates cysteine and histidine residues to stabilize a beta-beta-alpha fold, enabling DNA binding in transcription factors. These elements show how tertiary folding integrates secondary structural features, such as alpha-helices and beta-strands, into functional units.

The principle that a protein's tertiary structure is dictated by its primary amino acid sequence, known as Anfinsen's dogma, was established through experiments showing that denatured ribonuclease A could spontaneously renature to its native fold upon removal of denaturants, regaining full enzymatic activity as the thermodynamically most stable conformation under physiological conditions. This renaturation demonstrates the reversibility of tertiary folding and shows that no information beyond the sequence itself is required. Intermediates in this process, such as the molten globule state, are partially compact forms with native-like secondary structure but fluctuating side-chain packing and a less defined hydrophobic core, and they serve as kinetic waypoints during folding. Typical globular proteins have dimensions of roughly 20 to 50 Å, reflecting the scale of these compact folds for chains of 100–500 residues.

Quaternary Structure

Quaternary structure describes the spatial arrangement of, and non-covalent interactions between, multiple polypeptide subunits that form a functional protein complex. These interactions occur at specific interfaces, often with symmetry that maximizes stability and efficiency, as in homodimers composed of two identical subunits or heterotetramers consisting of four subunits of more than one type. A prominent example is hemoglobin, a heterotetramer comprising two α and two β subunits, in which oxygen binding induces allosteric conformational changes that shift the complex from a low-affinity tense (T) state to a high-affinity relaxed (R) state, enhancing cooperative oxygen transport. In viral capsids, icosahedral symmetry organizes numerous identical protein subunits into a geometrically efficient shell, as seen in many viruses where quasi-equivalent positions allow stable assembly without genetic redundancy.

The stability of quaternary structures arises from hydrophobic and electrostatic interactions at subunit interfaces, which typically bury 1000–2000 Å² of surface area per interface, with dissociation constants (K_d) ranging from micromolar to nanomolar, reflecting affinities sufficient for physiological function. Approximately 30–50% of proteins function as oligomers, which enables regulatory control and metabolic efficiency. Evolutionarily, many such assemblies arise from gene duplication events, in which paralogous subunits diverge to form heteromeric complexes, diversifying function while retaining core interactions. The tertiary folds of individual subunits provide the scaffolds for these inter-subunit associations.
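
To give the quoted dissociation constants a thermodynamic scale, they can be converted to standard binding free energies with ΔG° = RT ln(K_d / c°), where c° = 1 M. The sketch below assumes a temperature of 298.15 K and ideal solution behaviour; the function name is hypothetical.

```python
import math

R = 8.314462618      # gas constant, J mol^-1 K^-1
STANDARD_CONC = 1.0  # standard-state concentration c°, in mol/L


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard free energy of association in kJ/mol for a given K_d (mol/L).

    Uses dG = R * T * ln(K_d / c°); tighter binding (smaller K_d)
    gives a more negative value.
    """
    if kd_molar <= 0:
        raise ValueError("K_d must be positive")
    return R * temperature_k * math.log(kd_molar / STANDARD_CONC) / 1000.0


micromolar = binding_free_energy_kj(1e-6)  # roughly -34 kJ/mol
nanomolar = binding_free_energy_kj(1e-9)   # roughly -51 kJ/mol
```

Under these assumptions, each tenfold decrease in K_d corresponds to about 5.7 kJ/mol of additional binding free energy at room temperature.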

Nucleic Acid Structure

DNA Structure

The structure of deoxyribonucleic acid (DNA), the primary genetic material in most organisms, was determined in 1953 by James D. Watson and Francis H. C. Crick, who proposed a double-helical model based on X-ray diffraction data from Rosalind Franklin and Maurice H. F. Wilkins. This model revealed DNA as two antiparallel polynucleotide strands wound around a common axis and stabilized by hydrogen bonds between complementary bases. The human diploid genome comprises approximately 6.4 billion base pairs, extending to about 2 meters in length if uncoiled.

Under physiological conditions, DNA predominantly adopts the B-form, a right-handed double helix with about 10.5 base pairs per helical turn, an axial rise of 0.34 nm per base pair, and a pitch of roughly 3.4 to 3.6 nm. The strands are held together by Watson-Crick base pairing, in which adenine (A) pairs with thymine (T) through two hydrogen bonds and guanine (G) pairs with cytosine (C) through three, ensuring specific and stable complementarity. This configuration allows the molecule to store genetic information compactly while permitting access for replication and transcription.

DNA can assume alternative conformations depending on environmental conditions and sequence. The A-form, observed under dehydrating conditions, is a shorter, wider right-handed helix with 11 base pairs per turn and a pitch of about 2.8 nm, resembling the double-helical structure of RNA. In contrast, Z-DNA is a left-handed helix formed preferentially in sequences with alternating purines and pyrimidines, such as poly(dG-dC), featuring 12 base pairs per turn and a zigzag backbone that gives it its name. These non-B forms can influence local DNA flexibility and interactions, though B-DNA remains the predominant physiological structure.

To achieve further compaction in cells, DNA undergoes supercoiling, in which the double helix twists upon itself beyond its relaxed state. The topology is described by the linking number (Lk), defined as Lk = Tw + Wr, where Tw is the twist (helical turns) and Wr is the writhe (superhelical turns). In eukaryotes, negative supercoiling facilitates chromatin packaging; for instance, each nucleosome wraps about 147 base pairs of DNA in 1.65 left-handed turns, introducing negative supercoils that aid chromatin folding. This topological constraint is essential for fitting the genome into the nucleus while regulating access to genetic information.
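
The length and topology figures above follow from simple arithmetic, sketched here in Python. The constants are the B-form values quoted in the text; the function names and the 5,250-base-pair circular plasmid are hypothetical examples.

```python
RISE_NM = 0.34       # axial rise per base pair in B-DNA, nanometres
BP_PER_TURN = 10.5   # base pairs per helical turn in relaxed B-DNA


def contour_length_m(n_bp):
    """End-to-end length in metres of n_bp base pairs of uncoiled B-DNA."""
    return n_bp * RISE_NM * 1e-9


def relaxed_linking_number(n_bp):
    """Linking number Lk0 of a relaxed closed circle (one link per helical turn)."""
    return n_bp / BP_PER_TURN


def writhe(linking_number, twist):
    """Rearranged Lk = Tw + Wr."""
    return linking_number - twist


def superhelical_density(linking_number, relaxed_lk):
    """sigma = (Lk - Lk0) / Lk0; negative values mean negative supercoiling."""
    return (linking_number - relaxed_lk) / relaxed_lk


genome_length = contour_length_m(6.4e9)            # about 2.2 m
lk0 = relaxed_linking_number(5250)                 # 500 turns
wr = writhe(linking_number=470, twist=lk0)         # -30 if the twist stays relaxed
sigma = superhelical_density(470, lk0)             # -0.06
```

The example assumes that the whole linking deficit is absorbed as writhe; in a real molecule the deficit is partitioned between twist and writhe.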

RNA Structure

RNA, unlike DNA, is typically single-stranded and folds into complex three-dimensional structures that enable diverse functions beyond genetic information storage. A key distinguishing feature is the hydroxyl group (-OH) at the 2' position of the ribose sugar, which imparts chemical reactivity absent in DNA and supports RNA's catalytic capabilities by participating in nucleophilic attacks during reactions such as self-cleavage. RNA bases pair via Watson-Crick rules (A-U, G-C) but also form non-canonical pairs such as the G-U wobble, in which guanine and uracil are joined by two hydrogen bonds (guanine N1-H to uracil O2, and uracil N3-H to guanine O6), allowing structural flexibility and stability in folded regions. This wobble pairing is ubiquitous in RNA motifs and contributes to functional diversity across RNA classes. Messenger RNA (mRNA) molecules, which carry protein-coding information, typically range from 1000 to 5000 nucleotides in length, allowing polypeptides of varying sizes to be encoded.

At the secondary structure level, RNA forms double-helical stems through intramolecular base pairing, often terminated by unpaired loops that create motifs such as stem-loops (hairpins), in which a short double-stranded region connects to a single-stranded loop. Stem-loops are critical for RNA stability, protein recognition, and regulatory functions, and appear in precursor microRNAs and ribozymes. More complex secondary elements include pseudoknots, formed when bases in a loop pair with a distant single-stranded region, creating intertwined helices that enhance stability and are common in viral RNAs, where they promote frameshifting during translation. A classic example is transfer RNA (tRNA), whose secondary structure adopts a cloverleaf model with four stems (acceptor, D-arm, anticodon, and T-arm) connected by loops, and which folds into a compact L-shaped tertiary conformation essential for amino acid delivery during protein synthesis.

Tertiary RNA structures arise from long-range interactions that stabilize secondary motifs into functional folds, often involving metal ions and non-canonical base pairs. Ribozymes, catalytic RNAs that perform self-cleavage or ligation, exemplify this. The first were discovered in the early 1980s, when Thomas Cech identified self-splicing introns in Tetrahymena pre-rRNA and Sidney Altman discovered the catalytic activity of RNase P; in both cases the RNA component performs the reaction without protein assistance, revealing RNA's enzymatic potential. These discoveries earned Cech and Altman the 1989 Nobel Prize in Chemistry. Self-splicing introns fold into intricate tertiary structures with active sites that coordinate Mg²⁺ ions for catalysis. Ribosomal RNA (rRNA) domains further illustrate tertiary complexity: the 23S rRNA in the large subunit comprises seven domains radiating from a central Domain 0 core and forms the peptidyl transferase center, while the 16S rRNA in the small subunit has four domains that assemble into the decoding site, enabling peptide bond formation and mRNA reading, respectively.

Functional RNA motifs often rely on precise folding and pairing for regulation, as seen in microRNAs (miRNAs), small non-coding RNAs (~22 nucleotides) that post-transcriptionally repress gene expression by binding target mRNAs. MiRNA regulation primarily occurs through seed pairing, in which nucleotides 2–8 at the miRNA 5' end form complementary base pairs with the mRNA 3' UTR, leading to translational inhibition or mRNA degradation. This mechanism underscores RNA's role in fine-tuning cellular processes through structural specificity.
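
Seed pairing as described above can be illustrated with a short search for perfect Watson-Crick matches to miRNA nucleotides 2–8. This is a deliberately simplified sketch: it ignores G-U wobble pairs, site context, and target accessibility, and the function names, the example miRNA string, and the short UTR are illustrative assumptions.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed(mirna):
    """Nucleotides 2-8 of the miRNA, counting from 1 at the 5' end."""
    return mirna[1:8]


def seed_sites(mirna, utr):
    """0-based start positions in the UTR that pair perfectly with the seed."""
    target = reverse_complement(seed(mirna))
    width = len(target)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == target]


example_mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # 22-nt example sequence
example_utr = "AAGCUACCUCAUU"             # made-up 3' UTR fragment
sites = seed_sites(example_mirna, example_utr)
```

Because the two strands pair antiparallel, the site in the mRNA is the reverse complement of the seed, which is why the search string is built with `reverse_complement` rather than a plain complement.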

Structures of Other Biomolecules

Carbohydrates

Carbohydrates are essential biomolecules composed primarily of carbon, hydrogen, and oxygen, often in a 1:2:1 ratio, forming polyhydroxy aldehydes or ketones known as sugars. Their structural diversity arises from monomeric units called monosaccharides, which polymerize into oligosaccharides (2–10 units) and polysaccharides (more than 10 units) through glycosidic linkages. These structures enable carbohydrates to serve as energy stores and structural components in cells, with their configurations influencing digestibility and other physical and biological properties.

Monosaccharides, the simplest carbohydrates, are classified as aldoses, which have an aldehyde group at the carbonyl carbon (C1), or ketoses, which have a ketone group, typically at C2. Glucose, an aldohexose with six carbons, exists predominantly in cyclic ring forms rather than the open-chain structure. In its pyranose form, glucose cyclizes through an intramolecular reaction between the aldehyde at C1 and the hydroxyl at C5, forming a six-membered hemiacetal ring. This cyclization creates a new chiral center at C1, termed the anomeric carbon, resulting in two anomers: α-D-glucopyranose, in which the hydroxyl at C1 is axial, and β-D-glucopyranose, in which it is equatorial.

Oligosaccharides and polysaccharides form through condensation reactions that create glycosidic bonds between the anomeric carbon of one monosaccharide and a hydroxyl group of another. These bonds can be α or β, depending on the anomeric configuration, and are named by linkage position, such as α-1,4 or β-1,4. In glycogen, a branched polysaccharide, glucose units are linked by α-1,4-glycosidic bonds in linear chains, with branches introduced every 8–12 residues through α-1,6-glycosidic bonds at branch points, enhancing solubility and giving enzymes rapid access for mobilization.

The three-dimensional conformations of carbohydrates significantly affect their properties. Pyranose rings, common in hexoses like glucose, adopt a chair conformation as the most stable form, with substituents positioned either equatorially (preferred for bulkier groups) or axially; less stable boat conformations can occur but are rare under physiological conditions. Stereoisomerism further diversifies structures: D and L forms are mirror images distinguished by the configuration at the penultimate carbon (C5 in hexoses), with D-sugars predominant in nature. Epimers are diastereomers that differ at a single chiral center, such as glucose and mannose, which differ at C2.

A key example of structural variation is seen in cellulose and starch, both glucose polymers but with distinct linkages. Cellulose consists of linear chains of β-D-glucose linked by β-1,4-glycosidic bonds, promoting an extended, rigid conformation stabilized by hydrogen bonds between chains, which form microfibrils that give tensile strength to plant cell walls. In contrast, starch has α-1,4-glycosidic bonds in its linear component, amylose, and branching through α-1,6 linkages every 24–30 residues in amylopectin, yielding a helical, compact structure suited to energy storage in plants. These differences make cellulose indigestible by most animals, while starch is readily hydrolyzed.
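
Because each glycosidic bond forms by condensation with loss of one water molecule, the molar mass of a glucose polymer can be estimated from the number of residues. The sketch below assumes an acyclic (linear or branched) glucan of n residues, which therefore contains n - 1 glycosidic bonds, and uses rounded molar masses; the function name is hypothetical.

```python
GLUCOSE_MASS = 180.156  # g/mol, C6H12O6
WATER_MASS = 18.015     # g/mol, H2O


def glucan_molar_mass(n_residues):
    """Approximate molar mass (g/mol) of an acyclic glucan with n glucose units.

    Linking n residues into a linear or branched chain (no rings between
    residues) requires n - 1 glycosidic bonds, and each bond releases one water.
    """
    if n_residues < 1:
        raise ValueError("need at least one residue")
    bonds = n_residues - 1
    return n_residues * GLUCOSE_MASS - bonds * WATER_MASS


disaccharide = glucan_molar_mass(2)   # about 342.3 g/mol
```

The result depends only on the number of residues, not on whether the bonds are α-1,4, α-1,6, or β-1,4, which is why cellulose and starch chains of equal length have the same mass despite their different shapes.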

Lipids

Lipids constitute a diverse group of hydrophobic or amphipathic biomolecules essential for cellular architecture, primarily forming the structural basis of biological membranes and energy depots. Unlike proteins or nucleic acids, lipids do not form linear polymers but instead self-assemble into dynamic supramolecular structures driven by hydrophobic interactions between their nonpolar tails and hydrophilic interactions of their polar heads. This amphipathicity enables lipids to create barriers that compartmentalize cellular processes while allowing selective permeability. In biomolecular structure, lipids are classified by their core scaffolds, with fatty acids serving as the fundamental hydrophobic components.

Fatty acids are long-chain carboxylic acids typically containing 12 to 24 carbon atoms, with a polar carboxyl group at one end and a hydrocarbon chain that can be saturated or unsaturated. Saturated fatty acids, such as palmitic acid (16:0), have fully hydrogenated chains with no carbon-carbon double bonds, resulting in straight structures that pack tightly through van der Waals interactions. In contrast, unsaturated fatty acids incorporate one or more cis double bonds, which introduce kinks in the chain; for instance, oleic acid (18:1 Δ9 cis) has a single cis double bond between carbons 9 and 10, disrupting alignment and reducing packing density. These variations in chain saturation and configuration profoundly influence the physical properties of lipid assemblies.

Major lipid classes in membranes include phospholipids, steroids, and sphingolipids, each contributing distinct structural motifs. Phospholipids, the predominant membrane lipids, consist of a glycerol backbone esterified to two fatty acid tails and a phosphorylated polar head group, such as choline in phosphatidylcholine, creating a classic head-tail architecture. This amphipathic design drives spontaneous formation of bilayers, in which hydrophilic heads face aqueous environments and hydrophobic tails are sequestered inward, as observed in cell membranes. Steroids, exemplified by cholesterol, feature a rigid, planar tetracyclic ring system with a hydroxyl group at C3 and a nonpolar isooctyl tail, allowing intercalation between phospholipid tails to enhance membrane stability without disrupting the bilayer core. Cholesterol's fused rings confer rigidity, counteracting excessive fluidity in high-temperature or unsaturated environments. Sphingolipids share a ceramide backbone, formed by sphingosine (an 18-carbon amino alcohol) amide-linked to a fatty acid chain, and bear diverse head groups such as phosphocholine in sphingomyelin, enabling roles in signaling domains.

Lipid assemblies vary with molecular geometry and environmental conditions, yielding structures such as micelles, vesicles, and phase-separated domains. Micelles form from single-tailed amphiphiles, such as lysophospholipids or detergents, which arrange into spherical monolayers with tails inward to minimize water contact, as often seen in solubilization processes. Vesicles, or liposomes, arise from bilayer-forming lipids such as phospholipids, enclosing an aqueous core in closed spherical structures that mimic cellular compartments and serve as model systems. In native membranes, lipid rafts emerge as ordered microdomains through lateral phase separation; they are enriched in cholesterol, sphingomyelin, and glycosphingolipids, which adopt a liquid-ordered phase distinct from the surrounding liquid-disordered phase, facilitating protein clustering and signaling. These rafts show how lipid composition dictates heterogeneous membrane organization.

Membrane fluidity, critical for protein mobility and permeability, is finely tuned by fatty acid properties. Longer acyl chains increase van der Waals interactions, promoting tighter packing and reduced fluidity, whereas shorter chains enhance disorder and mobility. Unsaturation further modulates this: each cis double bond introduces a bend that hinders crystallization and raises fluidity, as seen in polyunsaturated fatty acids like linoleic acid (18:2 Δ9,12 cis,cis), which maintain membrane flexibility at physiological temperatures. This modulation allows adaptive responses to environmental stresses such as temperature changes.

The foundational lipid bilayer model, proposing a bimolecular leaflet arrangement, was established in 1925 by Gorter and Grendel through monolayer experiments on lipids extracted from red blood cells. The extracted lipids, spread as a monolayer, covered about twice the surface area of the cells, indicating a two-layer arrangement. Glycolipids, hybrids of lipids and carbohydrates, extend this diversity by attaching sugar moieties to glycerol or ceramide backbones, influencing surface recognition.
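
The shorthand used above for fatty acids (for example 16:0 or 18:2 Δ9,12) encodes chain length, number of double bonds, and double-bond positions. A minimal parser for that notation is sketched below; the accepted format, the class, and the function names are assumptions made for illustration, and any trailing text such as "cis,cis" is ignored.

```python
import re
from dataclasses import dataclass

# "<carbons>:<double bonds>" optionally followed by "Δ<pos>,<pos>,..."
_SHORTHAND = re.compile(r"^(\d+):(\d+)(?:\s*Δ(\d+(?:,\d+)*))?")


@dataclass(frozen=True)
class FattyAcid:
    carbons: int
    double_bonds: int
    positions: tuple  # carbon numbers where each double bond starts

    @property
    def saturated(self):
        return self.double_bonds == 0


def parse_shorthand(text):
    """Parse strings such as '16:0', '18:1 Δ9' or '18:2 Δ9,12 cis,cis'."""
    match = _SHORTHAND.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised fatty acid shorthand: {text!r}")
    carbons, double_bonds = int(match.group(1)), int(match.group(2))
    positions = tuple(int(p) for p in match.group(3).split(",")) if match.group(3) else ()
    if positions and len(positions) != double_bonds:
        raise ValueError("number of positions does not match the double-bond count")
    return FattyAcid(carbons, double_bonds, positions)


palmitic = parse_shorthand("16:0")
linoleic = parse_shorthand("18:2 Δ9,12 cis,cis")
```

Keeping the double-bond count and positions as separate fields makes it easy to compare chains by the two properties the text links to fluidity: length and degree of unsaturation.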

Experimental Structure Determination

X-ray Crystallography

X-ray crystallography is a cornerstone technique for elucidating the atomic-level three-dimensional structures of biomolecules, such as proteins and nucleic acids, by exploiting the diffraction of X-rays by ordered molecular crystals. The method measures the interference patterns generated when X-rays scatter off the electrons in the atoms, yielding data that can be transformed into electron density maps for model building. This approach has been instrumental in understanding biomolecular function, because structures reveal key features such as active sites and folding motifs.

The process begins with purification and crystallization, the most labor-intensive step, in which biomolecules are screened against thousands of conditions involving salts, polymers, or ligands to nucleate and grow diffraction-quality crystals, often using vapor diffusion or microbatch methods. Suitable crystals, typically on the micrometer scale, are then mounted and irradiated with monochromatic X-rays, producing a diffraction pattern of discrete spots whose positions and intensities correspond to the Fourier components of the electron density. The intensities provide the magnitudes of the structure factors, but reconstructing the full electron density requires solving for the missing phases.

The phase problem arises because X-ray detectors record only intensities (proportional to the square of the amplitudes), so phases must be inferred indirectly. Multiple isomorphous replacement (MIR) derives phases from differences in diffraction between the native crystal and isomorphous heavy-atom derivatives, such as mercury-containing compounds, which perturb the diffraction intensities without disrupting the lattice. Complementing MIR, multiwavelength anomalous diffraction (MAD) uses tunable synchrotron X-rays near the absorption edge of atoms such as selenium (incorporated via selenomethionine substitution), collecting data at multiple wavelengths to exploit anomalous scattering for phase determination, which offers higher accuracy and avoids non-isomorphism issues.

With phases in hand, an electron density map is calculated by inverse Fourier transform and contoured to display regions of high electron density, into which atoms are placed manually or automatically, followed by refinement to minimize discrepancies with the observed data. High-quality structures achieve resolutions of 1-2 Å, sufficient to distinguish individual atoms, bond lengths, and side-chain orientations, with resolutions better than 1.5 Å being ideal for unambiguous interpretation.

Historically, the technique's application to proteins culminated around 1960, when John Kendrew reported the 2 Å structure of myoglobin, the first visualization of a protein's polypeptide chain folded into α-helices, which also revealed its oxygen-binding pocket. This seminal work earned Kendrew the 1962 Nobel Prize in Chemistry, shared with Max Perutz for his work on hemoglobin, and established X-ray crystallography as viable for complex biomolecules. The advent of synchrotron sources in the 1980s dramatically accelerated progress by delivering collimated, high-flux X-rays orders of magnitude brighter than rotating anodes, enabling rapid data collection from tiny or weakly diffracting crystals and facilitating time-resolved studies. Facilities like the UK's Daresbury Laboratory, operational since 1980, broadened access and boosted output. As of 2025, the Protein Data Bank holds 199,418 entries from X-ray crystallography, representing the majority of deposited biomolecular structures and enabling comparative analyses across diverse systems.

Despite these advances, the method's reliance on crystals introduces limitations, because packing forces can induce conformational artifacts not present in solution, potentially misrepresenting dynamic or flexible regions. X-ray crystallography excels at static, high-resolution snapshots of compact biomolecules but is often paired with cryo-electron microscopy for large, heterogeneous complexes.
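
The phase problem can be demonstrated with a one-dimensional toy model, sketched below using only the Python standard library. The eight-point "unit cell" with two atoms is an invented example, and the function names are hypothetical; real crystallographic software works in three dimensions with measured intensities.

```python
import cmath
import math


def structure_factors(density):
    """Discrete Fourier transform of a sampled 1D electron density."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def density_from(factors):
    """Inverse transform: rebuild the density from complex structure factors."""
    n = len(factors)
    return [
        (sum(f * cmath.exp(-2j * math.pi * h * x / n) for h, f in enumerate(factors)) / n).real
        for x in range(n)
    ]


# Toy unit cell: a light atom at x=1 and a heavier atom at x=4.
true_density = [0.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]

factors = structure_factors(true_density)
with_phases = density_from(factors)                       # amplitudes and phases
without_phases = density_from([abs(f) for f in factors])  # amplitudes only, phases set to zero
```

With the phases retained, `with_phases` reproduces the starting density to within rounding error. With every phase set to zero, `without_phases` puts its largest peak at the origin, where the true density is zero, which is why methods such as MIR and MAD are needed to recover phase information.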

Nuclear Magnetic Resonance (NMR) Spectroscopy

Nuclear magnetic resonance (NMR) spectroscopy provides atomic-level insights into biomolecular structures in solution, complementing techniques like X-ray crystallography by capturing dynamic ensembles rather than static crystals. It relies on the magnetic properties of atomic nuclei, such as ¹H, ¹³C, and ¹⁵N, to probe interatomic interactions and conformations under near-physiological conditions. This method has been instrumental in determining structures of proteins, nucleic acids, and their complexes, with over 14,600 entries in the Protein Data Bank derived from NMR data as of 2025.

The core principles of NMR for structure determination involve spectral parameters that report on local environments and spatial relationships. Chemical shifts indicate the electronic surroundings of nuclei and correlate with secondary structure elements like α-helices and β-sheets through deviations from random coil values, often quantified via the chemical shift index. The nuclear Overhauser effect (NOE) yields through-space distance restraints up to approximately 5 Å, with intensity scaling as 1/r⁶ where r is the interproton distance, enabling mapping of tertiary contacts such as cross-strand contacts in β-sheets. J-couplings, mediated through bonds, provide dihedral angle information via the Karplus relation; for instance, the three-bond ³J_{HN-Hα} coupling (typically 3–9 Hz) distinguishes backbone φ angles in helices (<4 Hz) from those in sheets (>8 Hz).

Structural elucidation proceeds through multidimensional NMR experiments on isotope-labeled samples. In 2D spectroscopy, COSY detects J-coupled protons within spin systems for initial residue identification, while HSQC correlates ¹H with ¹⁵N or ¹³C, producing a "fingerprint" spectrum with one peak per amide group. Higher-dimensional (3D/4D) spectra, such as HNCA and HN(CA)CO, facilitate sequential resonance assignment by linking intra- and inter-residue correlations through backbone nuclei, often via "sequential walks" that trace the polypeptide chain by matching shared Cα and carbonyl chemical shifts between neighboring residues.
These assignment strategies, pioneered in the early 1980s with the bovine pancreatic trypsin inhibitor (BPTI), underpinned the first complete protein structure determinations by NMR, which yielded bundles of conformers with root-mean-square deviations below 1 Å in rigid regions. The resulting distance and angle restraints are input into molecular modeling software to generate ensembles of structures.

Despite its strengths, solution NMR faces limitations, including a practical size threshold of about 50 kDa for comprehensive studies, because slower tumbling broadens linewidths and reduces sensitivity and resolution. Uniform isotopic labeling with ¹³C and ¹⁵N is essential to access heteronuclear experiments and reduce spectral overlap, and is typically achieved by expressing proteins in media enriched with ¹⁵N-NH₄Cl and ¹³C-glucose.

NMR excels at probing dynamics, such as conformational exchange measured by CPMG relaxation dispersion, which quantifies exchange rates (k_{ex} ≈ 100–3,000 s⁻¹) and populations of excited states in enzymes. These dynamic insights, often validated by comparison to crystal structures, highlight functional flexibility invisible in crystal lattices.
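The two restraint types discussed in this section, NOE-derived distances and J-coupling-derived dihedral angles, can be sketched numerically. The example below is a minimal illustration, not the procedure of any particular NMR package. The calibration pair (`ref_intensity`, `ref_distance`) is hypothetical, and the Karplus coefficients A, B, and C are one commonly quoted parameterization for ³J_{HN-Hα}, included here as an assumption; other published sets differ slightly.

```python
import math

def noe_distance(intensity, ref_intensity, ref_distance=2.5):
    """Estimate an interproton distance (Å) from an NOE intensity.

    Assumes intensity ∝ 1/r^6 and a calibration proton pair of known
    separation (ref_distance, hypothetical default of 2.5 Å)."""
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

def karplus_hn_ha(phi_deg, A=6.51, B=-1.76, C=1.60):
    """Predict ³J(HN-Hα) in Hz from the backbone dihedral φ (degrees).

    Karplus relation: J = A cos²θ + B cos θ + C, with θ = φ - 60°.
    A, B, C are assumed literature-style coefficients, not fitted here."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C

# An NOE 64 times weaker than the calibration peak implies twice the distance.
print(f"NOE distance: {noe_distance(1.0, 64.0):.2f} Å")

# Typical helical (φ ≈ -57°) and extended (φ ≈ -120°) backbone angles.
print(f"helix  ³J: {karplus_hn_ha(-57.0):.2f} Hz")
print(f"strand ³J: {karplus_hn_ha(-120.0):.2f} Hz")
```

Because the NOE depends on the sixth power of distance, large intensity differences translate into modest distance differences, which is why NOEs are usually converted to loose upper-bound restraints rather than exact distances.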

Cryo-Electron Microscopy (Cryo-EM)

Cryo-electron microscopy (cryo-EM) is a pivotal technique for determining the three-dimensional structures of biomolecular complexes at near-atomic resolution, particularly those that are large, dynamic, or resistant to crystallization. Developed over decades, it involves imaging biological samples preserved in a frozen-hydrated state to minimize structural perturbations, enabling visualization of proteins, nucleic acids, and assemblies in near-native conditions. Unlike methods requiring ordered crystals, cryo-EM accommodates heterogeneous and flexible biomolecules, making it ideal for studying macromolecular machines such as viruses and ribosomes.

The core process begins with sample vitrification, in which purified biomolecules are applied to a holey carbon grid and rapidly frozen by plunging into liquid ethane, forming a thin layer of vitreous ice that embeds the particles without ice crystal formation. This vitrification method, pioneered by Jacques Dubochet in the 1980s, preserves the native hydration and conformation of the samples. The grid is then transferred to a cryo-electron microscope, where low-dose electron beams (typically <20 e⁻/Ų) are used to capture 2D projection images at cryogenic temperatures, often as dose-fractionated movies recorded with direct detectors to mitigate beam-induced motion.

Particle picking follows, involving automated or semi-automated identification and extraction of individual macromolecular projections from thousands of micrographs, followed by 2D classification to remove junk particles and generate class averages. These are then used for 3D reconstruction via iterative alignment and refinement algorithms, such as projection matching, to build a density map that can be interpreted with atomic models.

A major advancement, termed the "resolution revolution," occurred in the 2010s with the introduction of direct electron detectors, which improved signal-to-noise ratios and enabled movie-mode imaging to correct for specimen drift, routinely achieving resolutions better than 4 Å.
These detectors, such as the Gatan K2 Summit and Thermo Fisher Falcon, register individual electron events, dramatically enhancing data quality compared to earlier cameras. By 2025, resolutions of 2-4 Å have become standard for well-behaved samples, allowing model building and visualization of side-chain densities in many cases. This breakthrough was recognized with the 2017 Nobel Prize in Chemistry awarded to Jacques Dubochet, Joachim Frank, and Richard Henderson for their foundational contributions: Dubochet's vitrification method, Frank's development of single-particle algorithms in the 1970s-1980s, and Henderson's demonstration of atomic-resolution potential in the 1990s.

Early milestones included the first near-atomic resolution structures of icosahedral viruses achieved between 2008 and 2010, such as the 3.8 Å reconstruction of rotavirus double-layer particles, which resolved secondary structures and interfaces previously inaccessible. Applications have since expanded to complex assemblies like ribosomes, where structures at 2.5-3 Å have elucidated translation mechanisms, and to viruses, revealing entry and assembly pathways for pathogens like Zika and SARS-CoV-2.

To address sample heterogeneity (variations in conformation, composition, or occupancy), modern methods employ focused classification, 3D variability analysis, or Gaussian mixture models during refinement, allowing distinct states to be separated rather than averaged together. As of November 2025, the Electron Microscopy Data Bank (EMDB) holds 51,509 entries, predominantly cryo-EM maps, underscoring the method's prominence. Cryo-EM data can also be combined with X-ray crystallography to build hybrid models that place high-resolution subdomain structures into lower-resolution maps.
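The movie-mode drift correction mentioned in this section can be sketched as a whole-frame alignment by cross-correlation. This is a simplified illustration under stated assumptions, not the algorithm of any specific motion-correction program: shifts are restricted to whole pixels, the drift is assumed to be a rigid translation of the entire frame, and the "particle", noise level, and shifts are synthetic values invented for the example.

```python
import numpy as np

def estimate_shift(reference, frame):
    """Return the integer (dy, dx) translation that maps `reference` onto `frame`,
    taken from the peak of their FFT-based circular cross-correlation."""
    cc = np.fft.ifft2(np.fft.fft2(frame) * np.conj(np.fft.fft2(reference))).real
    dy, dx = np.unravel_index(np.argmax(cc), cc.shape)
    # Peaks past the midpoint correspond to negative shifts (periodic wrap-around).
    if dy > cc.shape[0] // 2:
        dy -= cc.shape[0]
    if dx > cc.shape[1] // 2:
        dx -= cc.shape[1]
    return int(dy), int(dx)

def align_and_average(frames):
    """Shift every frame back onto the first one and average the stack."""
    reference = frames[0]
    aligned = []
    for frame in frames:
        dy, dx = estimate_shift(reference, frame)
        aligned.append(np.roll(frame, shift=(-dy, -dx), axis=(0, 1)))
    return np.mean(aligned, axis=0)

# Synthetic "movie": a textured 16x16 patch drifting across a 64x64 field, plus noise.
rng = np.random.default_rng(0)
particle = np.zeros((64, 64))
particle[24:40, 24:40] = rng.normal(size=(16, 16))
true_shifts = [(0, 0), (2, -1), (4, -3)]
frames = [np.roll(particle, shift=s, axis=(0, 1))
          + rng.normal(scale=0.3, size=particle.shape)
          for s in true_shifts]

recovered = [estimate_shift(frames[0], f) for f in frames]
average = align_and_average(frames)
print("recovered shifts:", recovered)
print("noise before:", float(np.std(frames[0] - particle)))
print("noise after: ", float(np.std(average - particle)))
```

Averaging aligned frames lowers the noise roughly with the square root of the number of frames, which is the same principle that single-particle averaging exploits across many particle images.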

Computational Structure Analysis

Structure Prediction

Structure prediction in biomolecular science involves computational approaches to infer three-dimensional (3D) conformations from primary sequences, such as amino acid or nucleotide chains, without an experimentally determined structure of the target. Traditional methods include homology modeling, which constructs models by aligning a target sequence to structurally similar templates in databases like the Protein Data Bank (PDB), and ab initio prediction, which uses physics-based energy minimization to explore conformational space from first principles. Homology modeling relies on evolutionary conservation and gives reliable results when sequence identity to known structures exceeds about 30%, as implemented in tools like SWISS-MODEL. Ab initio methods, exemplified by the Rosetta protocol, employ fragment assembly and Monte Carlo sampling to generate low-energy decoys, and proved effective for small proteins lacking close homologs during early Critical Assessment of Structure Prediction (CASP) experiments.

The advent of artificial intelligence (AI) has revolutionized prediction, particularly through deep learning models that leverage multiple sequence alignments (MSAs) to capture coevolutionary signals indicating residue proximities. DeepMind's AlphaFold, first entering CASP13 in 2018, outperformed competitors by integrating convolutional neural networks with MSAs and structural templates. Its successor, AlphaFold2, dominated CASP14 in 2020 with a global distance test (GDT) score of 92.4, achieving backbone root-mean-square deviation (RMSD) accuracies below 1 Å for many targets. The architecture uses an Evoformer to process MSAs and pairwise representations, followed by iterative refinement via invariant point attention, enabling atomic-level predictions even for novel folds. In July 2021, DeepMind released an initial AlphaFold database containing over 365,000 high-accuracy models for 20 proteomes, later expanded to more than 200 million models covering nearly all known proteins.
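The backbone RMSD used above to compare predicted and experimental structures is computed after optimal superposition of the two coordinate sets. The sketch below shows one standard way to do this, the Kabsch algorithm, using NumPy. It is not CASP's evaluation code, and the coordinates are random stand-ins for Cα positions rather than real structures.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid-body
    superposition of P onto Q (Kabsch algorithm). Units follow the input."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)            # remove translation
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                      # 3x3 covariance matrix
    U, S, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])         # guard against an improper rotation
    R = Vt.T @ D @ U.T                 # optimal rotation taking P onto Q
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))

# Hypothetical "model" and "reference": same shape, different pose.
rng = np.random.default_rng(0)
model = rng.normal(size=(50, 3))
angle = np.radians(30.0)
Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
               [np.sin(angle),  np.cos(angle), 0.0],
               [0.0,            0.0,           1.0]])
reference = model @ Rz.T + np.array([5.0, -2.0, 1.0])

rmsd_same = kabsch_rmsd(model, reference)
rmsd_noisy = kabsch_rmsd(model + rng.normal(scale=0.5, size=model.shape), reference)
print(f"identical shapes: {rmsd_same:.2e}")
print(f"perturbed model:  {rmsd_noisy:.2f}")
```

RMSD is sensitive to a few badly placed residues, which is one reason CASP also reports GDT, a score based on the fraction of residues within several distance cutoffs.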
Subsequent AI developments, such as Meta AI's ESMFold released in 2023, further accelerated predictions by using large language models trained on evolutionary-scale data to infer structures directly from single sequences, bypassing MSA computation and achieving near-AlphaFold accuracy in seconds rather than hours. These methods typically yield RMSD values under 2 Å for ordered regions of globular proteins, a precision comparable to that of experimental techniques.

However, limitations persist. AlphaFold2 struggles with intrinsically disordered regions (IDRs), where low-confidence predictions (pLDDT <50) indicate poor MSA signals due to rapid sequence evolution, and with protein complexes, particularly those dominated by heterotypic interactions lacking strong intra-chain contacts. ESMFold shares similar challenges for IDRs and multi-chain assemblies. Predictions also remain static snapshots and often require experimental validation before functional conclusions can be drawn.

Building on these methods, AlphaFold 3, released by DeepMind in May 2024, extends predictions to complexes involving proteins with DNA, RNA, ligands, and ions using a diffusion-based architecture, achieving improved accuracy for biomolecular interactions. Additionally, ESM3, developed by EvolutionaryScale (founded by former Meta AI researchers) and released in June 2024, is a generative multimodal model that jointly reasons over protein sequence, structure, and function, simulating evolutionary processes to design novel proteins.
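The pLDDT threshold mentioned above is often used in practice to flag regions that may be disordered. The following sketch assumes the per-residue pLDDT scores have already been extracted into a list (how they are read from a model file is left out), and the function name, the `min_length` parameter, and the example scores are all hypothetical.

```python
def low_confidence_segments(plddt, threshold=50.0, min_length=1):
    """Return (start, end) residue ranges, 1-based and inclusive,
    where every residue has pLDDT below `threshold`."""
    segments = []
    start = None
    for i, score in enumerate(plddt, start=1):
        if score < threshold:
            if start is None:
                start = i                      # open a new low-confidence run
        elif start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None                       # close the run
    # Handle a run that extends to the C-terminus.
    if start is not None and len(plddt) - start + 1 >= min_length:
        segments.append((start, len(plddt)))
    return segments

# Invented scores: low-confidence termini flanking a confident core.
scores = [35, 42, 48, 71, 88, 93, 95, 90, 62, 45, 38, 30]
flagged = low_confidence_segments(scores)
print(flagged)
```

A low pLDDT is a hint of disorder rather than proof of it, so flagged segments are best treated as candidates for experimental follow-up.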

Molecular Modeling and Simulation

Molecular modeling and simulation play a crucial role in elucidating the dynamic aspects of biomolecular structures, complementing static experimental structures by capturing conformational changes, interactions, and energetic landscapes over time. These techniques primarily employ molecular dynamics (MD) simulations, which compute the time evolution of atomic positions and velocities in a biomolecular system based on classical mechanics. By solving the equations of motion for thousands to millions of atoms, MD reveals how structures fluctuate, fold, and interact at the atomic level, providing insights into processes that occur on timescales inaccessible to many experiments.

The core of MD simulations involves empirical force fields that approximate the potential energy of the system, such as AMBER and CHARMM, which parameterize bonded (bonds, angles, dihedrals) and non-bonded (van der Waals, electrostatic) interactions. These fields enable the calculation of forces on each atom, derived from the negative gradient of the potential energy. The dynamics are governed by Newton's second law of motion, $\mathbf{F}_i = m_i \mathbf{a}_i$, where $\mathbf{F}_i$ is the force on atom $i$, $m_i$ its mass, and $\mathbf{a}_i$ its acceleration. To propagate the system in time, these equations are discretized and numerically integrated using algorithms like the Verlet or velocity Verlet methods, typically with timesteps of 1-2 femtoseconds to maintain energy conservation. The first biomolecular MD simulation, performed on the bovine pancreatic trypsin inhibitor (BPTI) protein in 1977, covered just 10 picoseconds and demonstrated atomic fluctuations consistent with experimental observations.

Standard all-atom MD simulations, which treat every atom explicitly, are limited to timescales of picoseconds to microseconds due to computational demands, restricting their ability to observe slower processes like large-scale conformational changes. Coarse-grained models, which represent groups of atoms as single beads, extend the accessible timescales by reducing the number of degrees of freedom, though at the cost of atomic detail.
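The velocity Verlet integration scheme named above can be sketched for a single particle. This is a toy example in reduced units, not a force-field simulation: the harmonic "bond", its force constant, the mass, and the timestep are hypothetical values chosen so the result is easy to check against the analytical solution.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle with the velocity Verlet integrator.

    `force` is a callable returning F(x); returns the final (x, v)."""
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

# Toy system in reduced units (hypothetical values): a harmonic "bond".
k, mass, x_eq = 1.0, 1.0, 0.0

def force(x):
    """F = -dU/dx for U = k (x - x_eq)^2 / 2."""
    return -k * (x - x_eq)

def energy(x, v):
    """Total energy: kinetic plus harmonic potential."""
    return 0.5 * mass * v ** 2 + 0.5 * k * (x - x_eq) ** 2

x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, force, mass, dt=0.01, n_steps=10_000)
energy_drift = abs(energy(x1, v1) - energy(x0, v0))
print(f"final position: {x1:.4f}")
print(f"energy drift:   {energy_drift:.2e}")
```

The total energy stays close to its initial value over many steps, which is the property that makes this family of integrators the usual choice for MD. Production codes apply the same update to every atom, with forces taken from the full force field.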
To overcome sampling limitations for rare events, enhanced sampling techniques such as umbrella sampling and metadynamics are employed. Umbrella sampling applies biasing potentials along a reaction coordinate to sample multiple windows, while metadynamics deposits Gaussian hills in collective variable space to flatten the free-energy landscape and accelerate exploration.

Applications of MD simulations in biomolecular structure include probing protein folding pathways, where trajectories reveal intermediate states and transition mechanisms, and studying ligand binding, including association events and induced-fit adaptations. Free energy calculations, often using thermodynamic integration or free energy perturbation within MD frameworks, quantify binding affinities via the relation $\Delta G = -RT \ln K$, where $\Delta G$ is the standard free energy change, $R$ the gas constant, $T$ the absolute temperature, and $K$ the equilibrium constant; these enable relative ranking of ligands for drug discovery.

Advances in specialized hardware, such as the Anton supercomputer developed by D.E. Shaw Research, enabled millisecond-scale simulations of proteins like BPTI by the early 2010s, unveiling rare events such as domain motions and folding transitions that were previously inaccessible.
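The relation $\Delta G = -RT \ln K$ quoted above can be turned into a short numerical check. The sketch assumes that $K$ is an association constant expressed relative to a 1 M standard state and that energies are reported in kcal/mol; the function names and the example constants are illustrative, not taken from any simulation package.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol·K)

def binding_free_energy(K_assoc, temperature=298.15):
    """Standard binding free energy ΔG = -RT ln K, in kcal/mol.

    K_assoc is the association constant relative to a 1 M standard state."""
    return -R_KCAL * temperature * math.log(K_assoc)

def relative_binding_free_energy(K_a, K_b, temperature=298.15):
    """ΔΔG = ΔG(b) - ΔG(a); negative values mean ligand b binds more tightly."""
    return (binding_free_energy(K_b, temperature)
            - binding_free_energy(K_a, temperature))

# A 1 nM binder (K = 1e9 per molar) at room temperature.
dG = binding_free_energy(1e9)
# A tenfold gain in affinity relative to a 10 nM binder.
ddG = relative_binding_free_energy(1e8, 1e9)
print(f"dG  = {dG:.2f} kcal/mol")
print(f"ddG = {ddG:.2f} kcal/mol")
```

Because of the logarithm, each tenfold change in affinity corresponds to only about 1.4 kcal/mol at room temperature, so free energy methods need errors well below that to rank ligands reliably.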

Biomolecular Design

Biomolecular design involves the rational and computational creation of novel biomolecules, primarily proteins, with predefined structures and functions not found in nature. This field leverages physics-based modeling and machine learning to engineer sequences and folds for applications such as therapeutics. De novo design starts from scratch, generating entirely new backbones and sequences, while inverse folding designs sequences compatible with target structures. These approaches have enabled the development of stable, functional proteins, with over 1,500 structurally characterized designs reported by 2025.

Key approaches in de novo protein design include blueprint-based methods that assemble secondary structure elements into novel topologies, as exemplified by the Rosetta software suite. RosettaDesign, introduced in 2000, optimizes amino acid sequences for given backbones by minimizing free energy using a physics-based potential, allowing the creation of proteins with atomic-level accuracy. For instance, it has been used to redesign nine natural protein folds with sequences that fold correctly and maintain stability comparable to wild-type proteins. Inverse folding complements this by solving the inverse problem: generating sequences likely to adopt a specified 3D structure. The ProteinMPNN model, a deep learning-based inverse folder from 2022, achieves high success rates in designing functional sequences for diverse motifs, outperforming traditional methods in both in silico and experimental validation.

Recent AI-assisted tools have accelerated de novo design, particularly diffusion models that generate protein backbones from random noise. RFdiffusion, released in 2023, fine-tunes a RoseTTAFold-derived network to produce diverse, high-fidelity structures conditioned on design specifications such as target binding sites, enabling the design of monomers, oligomers, and binders with experimental success rates exceeding 20% for novel folds.
Generative models like ESM3 (2024) further advance this by simulating evolutionary trajectories to create proteins with integrated sequence, structure, and function, facilitating the design of entirely novel entities such as fluorescent proteins.

Hybrid methods combine computational design with directed evolution, in which initial designs are iteratively improved through random mutagenesis and selection. Frances Arnold's pioneering work on directed evolution, awarded the 2018 Nobel Prize in Chemistry, demonstrated the creation of enzymes with new specificities, such as variants active in organic solvents; integrating this with computational tools like Rosetta has yielded enzymes with catalytic efficiencies rivaling natural ones.

Notable examples include computationally designed enzymes for the Kemp elimination, a model reaction for proton abstraction that is not catalyzed by natural proteins. In 2008, eight de novo enzymes were created using Rosetta, achieving rate accelerations up to 10⁵-fold over uncatalyzed reactions through theozyme-based active-site placement. For therapeutics, de novo miniproteins (compact scaffolds of 40-60 residues) have been designed as high-affinity binders. RFdiffusion-generated miniproteins inhibit viral proteins like the MERS-CoV spike with picomolar affinity, offering advantages over antibodies in stability and manufacturability, and have advanced to preclinical testing for infectious diseases and cancer targets. These designs are often validated using molecular simulations to confirm folding and dynamics.
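The idea behind fixed-backbone sequence design, searching sequence space for a low-energy sequence on a given backbone, can be illustrated with a deliberately simplified Metropolis Monte Carlo search. Everything in this sketch is a stand-in: the burial pattern, the two-class hydrophobic/polar scoring rule, and the function names are hypothetical, and none of it reproduces Rosetta's energy function, its rotamer sampling, or its API.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HYDROPHOBIC = set("AVILMFWC")  # crude two-class alphabet, an assumption

def toy_energy(sequence, buried):
    """Hypothetical score: -1 when a residue's class matches its environment
    (hydrophobic if buried, polar if exposed), +1 otherwise."""
    return sum(-1.0 if (aa in HYDROPHOBIC) == is_buried else 1.0
               for aa, is_buried in zip(sequence, buried))

def design_sequence(buried, n_steps=2000, temperature=0.5, seed=0):
    """Metropolis Monte Carlo over single-residue substitutions.

    Returns the lowest-energy sequence seen and its toy energy."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in buried]
    energy = toy_energy(seq, buried)
    best_seq, best_energy = list(seq), energy
    for _ in range(n_steps):
        pos = rng.randrange(len(seq))
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)      # propose a substitution
        new_energy = toy_energy(seq, buried)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy                 # accept the move
            if energy < best_energy:
                best_seq, best_energy = list(seq), energy
        else:
            seq[pos] = old                      # reject and restore
    return "".join(best_seq), best_energy

# Hypothetical burial pattern for a 12-residue stretch (True = buried).
burial_pattern = [True, False, False, True, True, False,
                  False, True, False, True, True, False]
designed, designed_energy = design_sequence(burial_pattern)
print(designed, designed_energy)
```

Real design software replaces the toy score with a detailed energy function over side-chain conformations, and learned inverse-folding models replace the search itself with a network that proposes sequences directly.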

References

  1. [1]
    Molecular Structure and Function - Opportunities in Biology - NCBI
    The central focus in structural biology at present is the three-dimensional arrangement of the atoms that constitute a large biological molecule. Two decades ...
  2. [2]
    The Molecular Composition of Cells - The Cell - NCBI Bookshelf - NIH
    Most of these organic compounds belong to one of four classes of molecules: carbohydrates, lipids, proteins, and nucleic acids. Proteins, nucleic acids, and ...
  3. [3]
    [PDF] Biomolecular structure (including protein structure)
    Oct 1, 2024 · Chemical (two-dimensional) structure shows covalent bonds between atoms. Essentially a graph. • Three-dimensional structure shows relative ...
  4. [4]
    Frederick Sanger – Facts - NobelPrize.org
    Beginning in the 1940s, Frederick Sanger studied the composition of the insulin molecule. He used acids to break the molecule into smaller parts, which were ...
  5. [5]
    Molecular Interactions (Noncovalent Interactions) - Loren Williams
    Jun 10, 2024 · These interactions can be cohesive (attraction between like substances), adhesive (attraction between different substances), or repulsive.
  6. [6]
    From structure to function | Nature Methods
    By docking potential substrates into the active site of an enzyme of known structure, researchers accurately predict the catalytic activity of the enzyme.
  7. [7]
    G protein-coupled receptors: structure- and function-based drug ...
    Jan 8, 2021 · The flexibility of receptor-binding pocket endows the complex pharmacological mechanisms of ligand recognition and signal transduction.
  8. [8]
    Structural biology in motion | Nature Structural & Molecular Biology
    Biological macromolecules such as proteins and nucleic acids perform crucial tasks that sustain life. The specific task, such as an enzymatic reaction or a ...
  9. [9]
    Michael G. Rossmann (1930–2019) | Nature Structural & Molecular ...
    Jul 8, 2019 · Evolutionary structural conservation was a common ... Rossmann fold' and represents one of the most common protein folds across evolution.
  10. [10]
    Protein Misfolding and Degenerative Diseases - Nature
    Misfolded proteins (also called toxic conformations) are typically insoluble, and they tend to form long linear or fibrillar aggregates known as amyloid ...
  11. [11]
    First molecular explanation of disease - Nature
    The mutation makes sickle cell hemoglobin less soluble and more prone to form the distinct fibrous precipitates that cause the erythrocytes to adopt the ...
  12. [12]
    The Process of Structure-Based Drug Design - Cell Press
    Structure-based drug design is a powerful method, especially when used as a tool within an armamentarium, for discovering new drug leads against important ...
  13. [13]
    Protein engineering via sequence-performance mapping - Cell Press
    Jul 25, 2023 · Summary. Discovery and evolution of new and improved proteins has empowered molecular therapeutics, diagnostics, and industrial biotechnology.
  14. [14]
    Biochemistry, Primary Protein Structure - StatPearls - NCBI Bookshelf
    Oct 31, 2022 · To reiterate, the primary structure of a protein is defined as the sequence of amino acids linked together to form a polypeptide chain.
  15. [15]
    Style Points and Conventions - The NCBI Style Guide
    Note: Single-letter abbreviations should be used sparingly. Example: MFVNQH (instead of Met-Phe-Val-Asn-Gln-His). Amino acid abbreviations are determined by ...
  16. [16]
    Biochemistry, Essential Amino Acids - StatPearls - NCBI Bookshelf
    Apr 30, 2024 · Proteins are made up of 20 amino acids. Each amino acid has an α-carboxyl group, a primary α-amino group, and a side chain called the R group ( ...
  17. [17]
    Mathematical modeling and comparison of protein size distribution ...
    Eukaryotic proteins have an average size of 472 aa, whereas bacterial (320 aa) and archaeal (283 aa) proteins are significantly smaller (33-40% on average).
  18. [18]
    Protein Sequencing, One Molecule at a Time - PMC - NIH
    This early work was quickly superseded by a method presented by Pehr Edman that used phenylisothiocyanate as a reagent for the stepwise degradation of a protein ...
  19. [19]
    Exploring the Impact of Single-Nucleotide Polymorphisms on ... - NIH
    Mutations have the potential to alter all steps of gene expression depending on their genomic location. When present within transcriptional regulatory elements, ...
  20. [20]
    The Shape and Structure of Proteins - Molecular Biology of the Cell
    The amino acid sequence is known as the primary structure of the protein. Stretches of polypeptide chain that form α helices and β sheets constitute the ...
  21. [21]
    The structure of proteins: Two hydrogen-bonded helical ... - PNAS
    Two hydrogen-bonded helical structures for a polypeptide chain have been found in which the residues are stereochemically equivalent.
  22. [22]
    Transmembrane α helices - PMC - PubMed Central - NIH
    This chapter discusses effects of intrinsic membrane proteins on lipid bilayers and model transmembrane α helices.
  23. [23]
    How does a β-barrel integral membrane protein insert into the ...
    May 28, 2016 · β-Barrel membrane proteins are usually located in the outer membranes of Gram-negative bacteria, as well as mitochondria and chloroplasts of ...
  24. [24]
    A Perspective on the (Rise and Fall of) Protein β-Turns - PMC
    Oct 14, 2022 · The β-turn is the third defined secondary structure after the α-helix and the β-sheet. The β-turns were described more than 50 years ago and account for more ...
  25. [25]
    AF2Complex predicts direct physical interactions in multimeric ...
    Apr 1, 2022 · Accurate descriptions of protein-protein interactions are essential for understanding biological systems.
  26. [26]
  27. [27]
    Functional implications of protein-protein interactions in icosahedral ...
    Jan 9, 1996 · The mechanism of assembly-dependent cleavage is conserved in noda- and tetraviruses, although the quaternary structures of the capsids are ...
  28. [28]
    General trends in the relationship between binding affinity and ... - NIH
    Although most of the protein–protein complexes have interfaces areas more than 2000 Å2, there are also many of them with interface areas less than 2000 Å2.
  29. [29]
    Protein oligomerization as a metabolic control mechanism
    It has been estimated that 30%–50% of proteins self‐assemble to form complexes consisting of multiple copies of themselves. If there is a functional difference ...
  30. [30]
    How gene duplication diversifies the landscape of protein oligomeric ...
    Aug 22, 2022 · Oligomeric proteins are central to cellular life and the duplication and divergence of their genes is a key driver of evolutionary innovations.
  31. [31]
    A Structure for Deoxyribose Nucleic Acid - Nature
    The determination in 1953 of the structure of deoxyribonucleic acid (DNA), with its two entwined helices and paired organic bases, was a tour de force in ...
  32. [32]
    On the length, weight and GC content of the human genome - PMC
    Feb 27, 2019 · The male nuclear diploid genome extends for 6.27 Gigabase pairs (Gbp), is 205.00 cm (cm) long and weighs 6.41 picograms (pg).
  33. [33]
    DNA: Alternative Conformations and Biology - NCBI - NIH
    Left-handed Z-DNA has been mostly found in alternating purine-pyrimidine sequences (CG)n and (TG)n. Z-DNA is thinner (18 Å) than B-DNA (20 Å), the bases are ...
  34. [34]
    From DNA to RNA - Molecular Biology of the Cell - NCBI Bookshelf
    The chemical structure of RNA. (A) RNA contains the sugar ribose, which differs from deoxyribose, the sugar used in DNA, by the presence of an additional -OH ...
  35. [35]
    Mechanisms of catalytic RNA molecules - PMC - PubMed Central
    All structures show that the active site includes two Mg2+ coordinated by adjacent non-bridging oxygens in phosphate groups of the backbone of the RNA subunit, ...
  36. [36]
    The Application of mRNA Technology for Vaccine Production ...
    Apr 4, 2025 · Messenger RNA (mRNA) is a relatively small single-stranded structure with a length of approximately 1000–5000 nucleotides. ... mRNA from DNA in ...
  37. [37]
    mRNA secondary structure optimization using a correlated stem ...
    Jan 15, 2013 · Secondary structure of messenger RNA plays an important role in the bio-synthesis of proteins. Its negative impact on translation can reduce the ...
  38. [38]
    RNA pseudoknots - PMC - PubMed Central - NIH
    Pseudoknot structures appear to play a pivotal role in small subunit ribosomal RNA and in the noncoding regions of viral RNAs.
  39. [39]
    Naturally Occurring tRNAs With Non-canonical Structures - PMC
    The cloverleaf secondary structure is formed from Watson–Crick base pairs (bp) which create helical stems typically ending in unpaired bases to form loops.
  40. [40]
    Thomas R. Cech – Article - NobelPrize.org
    Dec 3, 2004 · The three-dimensional structure of the original ribozyme, the self-splicing intron of Tetrahymena (13). Green and blue ribbons indicate the path ...
  41. [41]
    Secondary structure and domain architecture of the 23S and 5S rRNAs
    Jun 14, 2013 · The best domain model for the 23S rRNA contains seven domains, not six as previously ascribed. Domain 0 forms the core of the 23S rRNA, to which ...
  42. [42]
    A guide to microRNA‐mediated gene silencing - FEBS Press - Wiley
    Sep 29, 2018 · MiRNAs preferentially bind their target mRNAs with perfect Watson-Crick pairing between nucleotides 2–7 from the 5′ end of the miRNA referred to ...
  43. [43]
    Monosaccharide Diversity - Essentials of Glycobiology - NCBI - NIH
    The new asymmetric center is termed the “anomeric carbon” (i.e., C-1 in the ring form of glucose). Two stereoisomers are formed by the cyclization reaction ...
  44. [44]
    [PDF] Chapter 7 Carbohydrates: Nomenclature Monosaccharides
    Cyclic structures and anomeric forms. • Cyclization to form hemiacetal or hemiketal introduces an additional chiral carbon, called the anomeric carbon.
  45. [45]
    6.8: Polysaccharides - Chemistry LibreTexts
    Mar 18, 2025 · It is a branched polysaccharide composed of alpha-D-glucose units with α-1,4 and α-1,6-glycosidic bonds. It is more highly branched than ...
  46. [46]
    Structure and Function of Carbohydrates | Biology for Majors I
    Glycosidic bonds (also called glycosidic linkages) can be of the alpha or the beta type. An alpha bond is formed when the OH group on the carbon-1 of the first ...
  47. [47]
    2.1: Carbohydrates- structure and diversity in biology
    Jun 2, 2019 · For pyranose rings, the two conformations are the chair and the boat. Substituents are either equatorial or axial. If drawn to the left in a ...
  48. [48]
    D- and L- Notation For Sugars - Master Organic Chemistry
    May 24, 2017 · D- and L- notation describes a sugar's absolute configuration. In a Fischer projection, if the bottom-most OH is on the right, it's D; if on ...
  49. [49]
    [PDF] Carbohydrate-Structure-1.pdf
    Monosaccharides contain several chiral carbons and therefore exist in a variety of stereochemical forms. Epimers are sugars that differ in configuration at only ...
  50. [50]
    BCH//PLS/PPA 609 | Lecture Eleven B Web Notes
    Jan 23, 2018 · Cellulose is by far the earth's most abundant biological polymer. It is a major structural component of the plant cell wall or extracellular ...
  51. [51]
    [PDF] Lecture 31 (12/07/20)
    Aug 5, 2020 · Starch is a mixture of two homopolysaccharides of glucose. – Amylopectin is like glycogen, but the branch points (α(1 → 6) linkages) occur every ...
  52. [52]
    The Lipid Bilayer - Molecular Biology of the Cell - NCBI Bookshelf
    The most abundant membrane lipids are the phospholipids. These have a polar head group and two hydrophobic hydrocarbon tails. The tails are usually fatty acids, ...
  53. [53]
    Biochemistry, Lipids - StatPearls - NCBI Bookshelf - NIH
    May 1, 2023 · Fatty acids are made of 12 carbons or less and are absorbed through the intestinal mucosal villi. They enter the bloodstream through capillaries ...
  54. [54]
    Lipids - MSU chemistry
    Natural fatty acids may be saturated or unsaturated, and as the following data indicate, the saturated acids have higher melting points than unsaturated acids ...
  55. [55]
    Biochemistry, Cholesterol - StatPearls - NCBI Bookshelf - NIH
    Cholesterol is a structural component of cell membranes and serves as a building block for synthesizing various steroid hormones, vitamin D, and bile acids.
  56. [56]
    A Comprehensive Review: Sphingolipid Metabolism and ...
    May 28, 2021 · The basic composition of complex sphingolipids includes a ceramide backbone and often a polar head group at position 1. Typically, sphingolipids ...
  57. [57]
    x Ray crystallography - PMC - PubMed Central - NIH
    The resulting electron density map will form the three dimensional contours into which the protein structure will be built. Each of the unit cell edges is ...
  58. [58]
    Protein Crystallization for X-ray Crystallography - PMC - NIH
    Jan 16, 2011 · In this article we describe and demonstrate general current protocols for protein crystallization. Since it a multi-step procedure there are few considerations ...
  59. [59]
    Isomorphous Replacement - an overview | ScienceDirect Topics
    Isomorphous replacement phasing methods that are routinely used in X-ray crystallography are technically difficult to apply to large macromolecular complexes.
  60. [60]
    Multiwavelength anomalous diffraction analysis at the M absorption ...
    Abstract. The multiwavelength anomalous diffraction (MAD) method for phase evaluation is now widely used in macromolecular crystallography.
  61. [61]
    Learn: Guide to Understanding PDB Data: Crystallographic Data
    Two pieces of information are needed to create an electron density map: the amplitude of X-rays in each reflection and the phase of X-rays in each reflection.
  62. [62]
    Resolution - Proteopedia, life in 3D
    May 16, 2022 · Structure determination by X-ray crystallography or cryo-electron microscopy produces an electron density map (shown in green).
  63. [63]
    A Three-Dimensional Fourier Synthesis at 2 Å. Resolution - Nature
    Structure of Myoglobin: A Three-Dimensional Fourier Synthesis at 2 Å. Resolution. Nature 185, 422–427 (1960).
  64. [64]
    John Kendrew and myoglobin: Protein structure determination ... - NIH
    Abstract. The essay reviews John Kendrew's pioneering work on the structure of myoglobin for which he shared the Nobel Prize for Chemistry in 1962.
  65. [65]
    About Synchrotrons - Diamond Light Source
    This changed in 1980, when the UK built the world's first synchrotron dedicated to producing synchrotron light for experiments at Daresbury in Cheshire. Now ...
  66. [66]
    PDB Statistics: Growth of Structures from X-ray Crystallography ...
    PDB Statistics: Growth of Structures from X-ray Crystallography Experiments Released per Year ; 2025, 198,931, 8,473 ; 2024, 190,458, 9,206 ; 2023, 181,252, 9,584.
  67. [67]
    Discrimination between biological interfaces and crystal-packing ...
    Nov 2, 2008 · However, the structures determined by X-ray crystallography could contain nonbiological interactions due to the nature of crystals.
  68. [68]
  69. [69]
  70. [70]
  71. [71]
  72. [72]
  73. [73]
    The Nobel Prize in Chemistry 2017 - Popular information
    Jacques Dubochet, Joachim Frank and Richard Henderson have made ground-breaking discoveries that have enabled the development of cryo-EM. The method has ...
  74. [74]
    A Primer to Single-Particle Cryo-Electron Microscopy - PMC
    This primer explains the different steps and considerations involved in structure determination by single-particle cryo-EM to provide an overview for ...
  75. [75]
    The Nobel Prize in Chemistry 2017 - NobelPrize.org
    The Nobel Prize in Chemistry 2017 was awarded jointly to Jacques Dubochet, Joachim Frank and Richard Henderson for developing cryo-electron microscopy.
  76. [76]
    Near-atomic resolution reconstructions of icosahedral viruses ... - NIH
    Nine different near-atomic resolution structures of icosahedral viruses, determined by electron cryo-microscopy and published between early 2008 and late 2010, ...
  77. [77]
    Determination of the ribosome structure to a resolution of 2.5 Å by ...
    The ribosome has served as a benchmark sample for decades in the development of single‐particle cryo‐EM.10 Sub‐3 Å resolution of ribosomes, in the range of 2.5– ...
  78. [78]
    Virus structures revealed by advanced cryoelectron microscopy ...
    Nov 2, 2023 · This review outlines current advanced cryo-EM methods for high-resolution structure determination of viruses and summarizes accomplishments ...
  79. [79]
    Integrating molecular models into CryoEM heterogeneity analysis ...
    We introduce an improved computational method that uses Gaussian mixture models for protein structure representation and deep neural networks for conformation ...
  80. [80]
    EMDB < Home - EMBL-EBI
    As of 12 November 2025, EMDB contains 51217 entries (latest entries, trends). EMDB News. Q-score as a reliability measure for protein, nucleic acid and small- ...
  81. [81]
    SWISS-MODEL: homology modelling of protein structures and ...
    May 21, 2018 · Homology modelling has matured into an important technique in structural biology, significantly contributing to narrowing the gap between known ...
  82. [82]
    Ab initio protein structure prediction of CASP III targets using ...
    These results suggest that ab initio methods may soon become useful for low-resolution structure prediction for proteins that lack a close homologue of known ...
  83. [83]
    AlphaFold: a solution to a 50-year-old grand challenge in biology
    Nov 30, 2020 · In the results from the 14th CASP assessment, released today, our latest AlphaFold system achieves a median score of 92.4 GDT overall across all ...
  84. [84]
    Highly accurate protein structure prediction with AlphaFold - Nature
    Jul 15, 2021 · Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure ...
  85. [85]
  86. [86]
    Evolutionary-scale prediction of atomic-level protein structure with a ...
    Mar 16, 2023 · We demonstrate direct inference of full atomic-level protein structure from primary sequence using a large language model.
  87. [87]
    Before and after AlphaFold2: An overview of protein structure ...
    Feb 27, 2023 · In this mini-review, we provide an overview of the breakthroughs in protein structure prediction before and after AlphaFold2 emergence.
  88. [88]
    Molecular Dynamics Simulations of Biomolecules - ACS Publications
    The simulation concerned the bovine pancreatic trypsin inhibitor (BPTI), which has served as the “hydrogen molecule” of protein dynamics because of its small ...
  89. [89]
    Coarse-Grained Molecular Dynamics
    Coarse-Grained Molecular Dynamics. Problem of Scale. One of the main unresolved problems in biological science is the time-scale and length-scale gap ...
  90. [90]
    On Free Energy Calculations in Drug Discovery - ACS Publications
    Oct 10, 2025 · ... the unbound state (i.e., the free ligand and protein forms). This relationship is expressed by the following equation: ΔG°b=−RTln(KaC°).
  91. [91]
    Millisecond-scale molecular dynamics simulations on Anton
    Anton is a recently completed special-purpose supercomputer designed for molecular dynamics (MD) simulations of biomolecular systems.
  92. [92]
    Code to complex: AI-driven de novo binder design: Structure
    Sep 1, 2025 · The recently published Protein Design Archive is an open-access, curated database of structurally characterized de novo designed with over 1,500 ...
  93. [93]
    Folding and stability of nine completely redesigned globular proteins
    Sep 12, 2003 · A previously developed computer program for protein design, RosettaDesign, was used to predict low free energy sequences for nine naturally ...
  94. [94]
    Robust deep learning–based protein sequence design ... - Science
    Sep 15, 2022 · We describe a deep learning–based protein sequence design method, ProteinMPNN, that has outstanding performance in both in silico and experimental tests.
  95. [95]
    De novo design of protein structure and function with RFdiffusion
    Jul 11, 2023 · A general method for de novo binder design from target structure information alone using the physically based Rosetta method was recently ...
  96. [96]
    Frances H. Arnold – Facts – 2018 - NobelPrize.org
    In 1993, Arnold conducted the first directed evolution of enzymes, which are proteins that catalyze chemical reactions. The uses of her results include more ...
  97. [97]
    Kemp elimination catalysts by computational enzyme design - Nature
    Mar 19, 2008 · Here we describe the computational design of eight enzymes that use two different catalytic motifs to catalyse the Kemp elimination—a model ...
  98. [98]
    Designed miniproteins potently inhibit and protect against MERS-CoV
    Jun 24, 2025 · We computationally designed monomeric and homo-oligomeric miniproteins that bind with high affinity to the MERS-CoV spike (S) glycoprotein.