
Cryptographic hash function

A cryptographic hash function is a special type of hash function that maps data of arbitrary length to a fixed-length bit string, serving as a digital fingerprint for verifying the integrity and authenticity of information in cryptographic systems. These functions are designed to be computationally efficient while providing strong security guarantees, making them fundamental building blocks in modern cryptography. Approved cryptographic hash functions must satisfy key security properties, including preimage resistance (it is computationally infeasible to find an input that produces a given output), second preimage resistance (it is computationally infeasible to find a different input that produces the same output as a given input), and collision resistance (it is computationally infeasible to find two distinct inputs that produce the same output). These properties ensure that even minor changes to the input data result in a significantly different hash value, often called the avalanche effect. The National Institute of Standards and Technology (NIST) specifies approved algorithms in standards such as FIPS 180-4 for the Secure Hash Algorithm 2 (SHA-2) family and FIPS 202 for the SHA-3 family. Cryptographic hash functions are widely applied in security protocols, including digital signatures (to condense messages before signing), keyed-hash message authentication codes (HMACs) for data integrity and authenticity, and key derivation functions (KDFs) to generate cryptographic keys from shared secrets. They also support random number generation, password storage (via salted hashing to prevent rainbow table attacks), and blockchain technologies for ensuring tamper-evident records. Common examples include SHA-256 (producing a 256-bit hash, part of SHA-2) and SHA3-256 (part of SHA-3, selected through a 2007–2012 NIST competition to address potential weaknesses in prior designs). Older functions like MD5 and SHA-1, while historically significant—MD5 was specified in RFC 1321 in 1992—are no longer recommended due to vulnerabilities such as practical collision attacks.

Definition and Properties

Definition

A cryptographic hash function is a mathematical function that maps data of arbitrary length to a fixed-length bit string, known as the hash value or message digest, through a one-way process that is computationally infeasible to reverse. This mapping is designed to produce a unique, compact representation of the input data, often used for verification in cryptographic protocols. Key operational characteristics include determinism, where the same input always yields the identical output; computational efficiency, enabling rapid processing even for large inputs; and a consistent fixed output size, independent of the input length—for instance, SHA-256 always produces a 256-bit (32-byte) digest. Unlike non-cryptographic hash functions, which prioritize speed for tasks like data indexing and retrieval and may tolerate accidental collisions, cryptographic variants emphasize resistance to intentional attacks, such as finding collisions or preimages through adversarial computation. The concept of cryptographic hash functions originated in the late 1970s with early proposals for one-way functions in cryptographic systems, though it gained formal standardization in the 1990s. A landmark formalization occurred in 1993 with the publication of Federal Information Processing Standard (FIPS) 180, which specified the Secure Hash Algorithm (SHA) as a standardized cryptographic hash function. For example, applying SHA-256 to the input string "hello" produces the fixed-length digest 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 in hexadecimal format, demonstrating the deterministic mapping from variable input to a uniform 64-character output.
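As a quick illustration of determinism and fixed output length, the following minimal Python sketch uses the standard hashlib module to hash "hello" and an empty input with SHA-256; both digests are 256 bits long regardless of input size.

```python
import hashlib

# SHA-256 of "hello": always the same 256-bit (64 hex character) digest
print(hashlib.sha256(b"hello").hexdigest())
# 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

# The output length stays fixed at 256 bits even for an empty input
empty_digest = hashlib.sha256(b"").hexdigest()
print(empty_digest, len(empty_digest) * 4, "bits")
```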

Security Properties

Cryptographic hash functions must satisfy several key security properties to ensure their suitability for applications such as digital signatures and integrity verification. These properties formalize the computational difficulty of inverting or finding collisions in the function, assuming an adversary with bounded computational resources. The primary properties are preimage resistance, second preimage resistance, and collision resistance, each defined in terms of negligible success probability against polynomial-time attackers. Preimage resistance, also known as one-wayness, requires that given a hash value h in the output space, it is computationally infeasible to find any input x such that H(x) = h, where H is the hash function producing n-bit outputs. Formally, for a random h, an adversary's probability of success with any single guess is about 2^{-n}, so the only generic attack is exhaustive search requiring on the order of 2^n hash evaluations, which is prohibitively expensive for large n. This property ensures that hash values cannot be reversed to recover inputs, protecting against forgery in scenarios like password storage. Second preimage resistance stipulates that, given an input x, it is computationally infeasible to find a distinct y \neq x such that H(y) = H(x). Unlike preimage resistance, the target input is fixed here, and a generic attack still requires roughly 2^n work for an n-bit output, as the adversary must search essentially the entire input space. However, practical attacks on some iterated constructions exploit structural shortcuts, reducing effort toward approximately 2^{n/2} in certain cases, though ideal functions resist this. Collision resistance is the strongest of these properties, demanding that it is computationally infeasible to find any two distinct inputs x \neq y with H(x) = H(y). This implies second preimage resistance (and, under mild assumptions, preimage resistance), as a collision-finding algorithm could be adapted for the weaker attacks. For an n-bit output, the best generic attack is a birthday attack requiring about 2^{n/2} evaluations, exploiting the probabilistic structure of hash outputs. The collision probability under uniform random mapping to a range of size N = 2^n for k inputs derives from the birthday problem: the probability of no collision is the product \prod_{i=1}^{k-1} \left(1 - \frac{i}{N}\right). Approximating via the first-order expansion of the logarithm, \ln\left(1 - \frac{i}{N}\right) \approx -\frac{i}{N} for small \frac{i}{N}, yields \sum_{i=1}^{k-1} \ln\left(1 - \frac{i}{N}\right) \approx -\frac{1}{N} \sum_{i=1}^{k-1} i = -\frac{k(k-1)}{2N} \approx -\frac{k^2}{2N}. Thus, the no-collision probability is approximately e^{-k^2 / 2N}, and the collision probability is P(\text{collision}) \approx 1 - e^{-k^2 / 2N}, reaching 50% when k \approx 1.18 \sqrt{N}. Beyond collision-based properties, the avalanche effect measures output diffusion, where a single-bit change in the input should flip approximately half the output bits on average. Quantitatively, the strict avalanche criterion requires that for any input pair differing in one bit, each output bit flips with probability exactly 1/2, independent of the position. This criterion, combining completeness (every output bit depends on every input bit) and the avalanche property, ensures good local nonlinearity and resistance to differential attacks. Finally, pseudo-randomness ensures that the hash function's output, when viewed as a family indexed by keys or inputs, is computationally indistinguishable from a truly random function. This property underpins security proofs in the random oracle model, where hash functions are idealized as random oracles to simplify protocol analysis while preserving provable security.
For instance, protocols like digital signatures rely on this indistinguishability to prevent adaptive adversaries from exploiting patterns.
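The birthday bound above is easy to check numerically; the short Python sketch below (an illustrative calculation, not a security tool) evaluates 1 - e^{-k^2/2N} and confirms the roughly 1.18·√N threshold for a 50% collision probability on a small example output size.

```python
import math

def collision_probability(k: int, n_bits: int) -> float:
    """Approximate probability of at least one collision among k uniformly random n-bit hashes."""
    N = 2 ** n_bits
    return 1.0 - math.exp(-k * (k - 1) / (2 * N))

# For a 32-bit hash, ~50% collision probability is expected near k = 1.18 * sqrt(2^32) ~ 77,000
k = round(1.18 * math.sqrt(2 ** 32))
print(k, round(collision_probability(k, 32), 3))
```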

Non-Security Properties

Cryptographic hash functions produce a fixed-size output, known as the digest or hash value, regardless of the input length. This fixed output typically ranges from 128 to 512 bits, enabling efficient storage and comparison in applications. For instance, MD5 outputs 128 bits, while SHA-256 produces 256 bits, allowing digests to be represented compactly as hexadecimal strings of 32 or 64 characters, respectively. This design facilitates quick equality checks between hashes without needing to process the original data, reducing computational overhead in verification processes. These functions are engineered for high computational efficiency, prioritizing rapid execution on standard hardware to support real-time or large-scale use. On modern CPUs, such as Intel Core i9 processors from the early 2020s, SHA-256 achieves throughputs exceeding 1 GB/s in software implementations optimized for single-threaded processing. This performance stems from simple bitwise operations, rotations, and modular additions that leverage processor instruction sets like SSE and AVX, ensuring minimal latency for hashing even gigabyte-scale inputs. Efficiency varies by algorithm; older functions like MD5 can process data at over 2 GB/s on similar hardware due to their lighter design, though they are deprecated for security reasons. In non-adversarial contexts, cryptographic hash functions approximate uniqueness through a low probability of collisions for random inputs, providing practical reliability for identification tasks. The birthday bound implies that for an n-bit hash, the expected number of inputs needed for a 50% collision chance is about 2^{n/2}, yielding negligible risk for typical data volumes when n ≥ 128; for example, SHA-1's 160-bit output makes accidental collisions improbable in non-malicious file hashing. Unlike security properties, this relies on statistical behavior rather than provable resistance to deliberate attacks, making it suitable for benign applications like duplicate detection. Hash functions exhibit high sensitivity to input changes, where even a single-bit alteration in the input produces a substantially different output, often termed the avalanche effect in design goals. This property ensures that small modifications, such as appending a byte, result in a completely unpredictable digest, supporting uses like incremental updates in hash chains—sequences where each element's hash incorporates the previous one for verifying ordered data streams without recomputing everything. The one-way nature means outputs cannot be reversed to recover inputs without infeasible computation in honest scenarios, enhancing integrity guarantees in versioning systems. Standardization has been crucial for interoperability, with cryptographic hash functions integrated into protocols and formats since the 1990s. The HMAC construction, standardized in RFC 2104, combines hashes with secret keys for message authentication in TLS, ensuring consistent implementation across systems. NIST has evolved guidelines from FIPS 180 (1993) for the original SHA to FIPS 202 (2015) for SHA-3, with recommendations as of 2025 to transition from SHA-1 to SHA-2 or SHA-3 by December 31, 2030, while maintaining backward compatibility in file formats like ZIP and PDF. These standards define output lengths and efficiency targets, promoting adoption in diverse ecosystems. A practical illustration of output length's impact appears in probabilistic data structures like Bloom filters, where longer hashes reduce false positive rates.
For a filter with m bits and k hash functions holding n inserted elements, the false positive probability is approximately (1 - e^{-kn/m})^k; using high-quality cryptographic hashes like SHA-256 or SHA-3 provides the near-uniform, independent index values this estimate assumes, and longer digests can be split into multiple index values, reducing the risk of hash collisions affecting the filter in very large-scale applications, though the theoretical rate remains the same for ideal hashes. This non-security benefit underscores how fixed lengths balance storage and accuracy in approximate computing.
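The Bloom filter estimate above can be evaluated directly; the Python sketch below computes the standard approximation (1 - e^{-kn/m})^k for arbitrary, purely illustrative parameter values.

```python
import math

def bloom_false_positive_rate(m_bits: int, k_hashes: int, n_items: int) -> float:
    """Standard approximation of a Bloom filter's false positive rate,
    assuming the k hash functions behave as independent uniform mappings."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes

# Example: 1 MiB of filter bits, 7 hash functions, 1 million inserted items
print(bloom_false_positive_rate(m_bits=8 * 1024 * 1024, k_hashes=7, n_items=1_000_000))
```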

Design Principles

Core Constructions

Cryptographic hash functions typically employ block-based processing to handle inputs of arbitrary length. The input message is divided into fixed-size blocks, commonly 512 bits or 1024 bits, with padding applied to ensure the total length is a multiple of the block size. This padding often includes a single '1' bit followed by zeros and the message length encoded in 64 bits, allowing the function to process variable-length data while maintaining security properties. At the heart of these constructions is the compression function, a one-way primitive that takes the current state (chaining variable) and a message block as input, producing a new state of fixed length, typically smaller than the combined input to achieve compression. This function must resist preimage and collision attacks to ensure the overall hash's security, as formalized in early design principles where collision resistance of the iterated hash follows from that of the compression function. The process begins with an initialization vector (IV), a fixed, publicly known constant that serves as the initial chaining value, ensuring deterministic output for a given input while promoting thorough mixing of data across iterations. This IV, often derived from constants chosen to avoid weak starting states, is crucial for security, as it fixes well-defined initial conditions without relying on secret values. Iterated hashing applies the compression function sequentially to each padded block, chaining the output state to the next input, resulting in a final state truncated or processed to yield the hash digest. This iterative structure, pioneered in constructions like those based on block ciphers, extends the compression function to arbitrary lengths while preserving one-wayness. A representative padding scheme, such as the Merkle–Damgård strengthening used in MD5 and SHA-2, appends a '1' bit, sufficient zeros to align the length, and the 64-bit message length to prevent ambiguities in block processing. Key design goals include resistance to length-extension attacks, where an adversary appends data to a known hash without knowing the original message; careful finalization and state handling mitigate this, though trade-offs exist between larger state sizes (e.g., 256 bits or more for enhanced security) and computational efficiency. Historically, early hash functions of the late 1980s and early 1990s, such as MD5 introduced in 1991, relied on ad-hoc designs optimized for speed on 32-bit processors without formal security proofs. The discovery of practical collisions for MD5 in 2004 shifted the field toward provable security models, emphasizing constructions where security reductions to underlying primitives like collision-resistant compression functions provide guarantees against attacks.
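To make the length-padding rule concrete, here is a minimal Python sketch of Merkle–Damgård-style strengthening as described above (a '1' bit, zero fill, then the 64-bit big-endian message length), assuming a 512-bit (64-byte) block size; real standards such as FIPS 180-4 define the exact encoding.

```python
def md_style_pad(message: bytes, block_size: int = 64) -> bytes:
    """Append 0x80 (a '1' bit), zero bytes, and the 64-bit message length in bits,
    so the padded length is an exact multiple of the block size."""
    bit_length = len(message) * 8
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % block_size)   # zero fill
    padded += bit_length.to_bytes(8, "big")            # 64-bit length field
    return padded

assert len(md_style_pad(b"abc")) == 64        # fits in one 512-bit block
assert len(md_style_pad(b"a" * 56)) == 128    # length field forces a second block
```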

Merkle-Damgård Construction

The Merkle–Damgård construction is a foundational method for building a hash function from a fixed-input-length compression function, enabling the hashing of arbitrarily long messages. Independently proposed by Ralph Merkle and Ivan Damgård in 1989, it iterates the compression function over message blocks to produce a fixed-size output, preserving collision resistance under certain conditions. This approach underpins many widely adopted hash functions, including MD5, SHA-1, and the SHA-2 family, by transforming a compression function into a full hash function suitable for practical applications. The construction begins with a message M of arbitrary length, which is first padded to ensure proper block alignment and to encode the original length, preventing ambiguities in message recovery. A standard Merkle–Damgård padding scheme appends a '1' bit, followed by zeros to reach a length congruent to b - 64 modulo the block size b (where b is the input block length of the compression function and 64 is the fixed bit length of the length encoding), and finally the 64-bit representation of the original message length in bits. This padded message is then divided into blocks x_1, x_2, \dots, x_n, each of size b. The hash computation starts with an initial value (IV), often a fixed constant, denoted as H_0, and proceeds iteratively: H_{i} = f(H_{i-1} \| x_i), \quad i = 1, 2, \dots, n where f: \{0,1\}^{c + b} \to \{0,1\}^c is the compression function (c is the output size, typically equal to the internal state size), and \| denotes concatenation. The final hash value is H_n, truncated or directly output as needed. Merkle demonstrated this using DES as the underlying primitive, constructing a compression function f from multiple DES encryptions to achieve one-wayness assuming DES behaves as a pseudorandom permutation. Damgård formalized it more generally, showing how to extend a collision-free function from short inputs to arbitrary lengths. The security of the construction relies on the collision resistance of the compression function f. Damgård proved that if f is computationally collision-resistant—meaning no efficient algorithm can find distinct inputs yielding the same output—then the resulting H is also collision-resistant for arbitrary-length messages. The proof proceeds by contradiction: suppose an adversary finds a collision H(M) = H(M') for M \neq M'; by examining the iteration chains backward from the final blocks, the differing blocks lead to a collision in f after at most as many steps as there are blocks, contradicting the assumption on f. Merkle extended this to one-wayness, arguing that inverting H requires inverting the iterated compression, which is as hard as inverting f under the idealized model of the underlying primitive. This preservation holds for preimage resistance as well, though second-preimage resistance requires additional care in padding. Despite its elegance and provable security in the ideal model, the construction has known limitations that have motivated alternatives. It is vulnerable to length-extension attacks, where an attacker knowing H(M) and |M| can compute H(M \| \text{pad}(|M|) \| X) without knowing M, due to the iterative chaining structure. Generic attacks, such as Joux's multi-collision attack (which finds large sets of colliding messages for little more than 2^{c/2} work, far less than expected of an ideal hash) or the Kelsey–Kohno herding attack, also exploit the structure, though these do not break collision resistance outright if f is ideal. Enhancements like the HAIFA framework or wide-pipe variants address some issues by incorporating block counters and salts or larger internal states, but the core Merkle–Damgård design remains influential for its simplicity and efficiency.
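The iteration H_i = f(H_{i-1} \| x_i) fits in a few lines of Python. The toy sketch below is illustrative only: it reuses SHA-256 itself as a stand-in compression function f and an all-zero IV, whereas real Merkle–Damgård designs such as SHA-256 specify a dedicated compression function and fixed IV constants.

```python
import hashlib

IV = bytes(32)  # toy initial chaining value H_0 (real designs fix specific constants)

def f(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function f: {0,1}^(256+512) -> {0,1}^256."""
    return hashlib.sha256(state + block).digest()

def toy_md_hash(message: bytes, block_size: int = 64) -> bytes:
    # Merkle-Damgard strengthening: '1' bit, zero padding, 64-bit length field.
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % block_size)
    padded += (len(message) * 8).to_bytes(8, "big")
    # Iterate the compression function over the padded blocks.
    state = IV
    for i in range(0, len(padded), block_size):
        state = f(state, padded[i:i + block_size])
    return state  # the final chaining value is the digest

print(toy_md_hash(b"hello, merkle-damgard").hex())
```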

Sponge Construction

The sponge construction is a versatile cryptographic design paradigm introduced by Bertoni, Daemen, Peeters, and Van Assche in 2007, which transforms a fixed-width permutation into a function that processes arbitrary-length inputs and produces arbitrary-length outputs. It operates on a fixed-width state of b bits, divided into a publicly modifiable rate portion of r bits and a hidden capacity portion of c = b - r bits, where c serves as the primary security parameter. The construction proceeds in two alternating phases: absorbing, where input data is incorporated into the state, and squeezing, where output data is extracted from the state. This stateful approach enables flexible handling of variable input and output lengths without relying on Merkle-Damgård-style chaining, providing inherent resistance to length-extension attacks due to the non-revelation of the internal capacity bits. In the absorption phase, the input message M is first padded using a multi-rate padding rule to ensure its length is a multiple of r, such as the "10*1" rule employed in SHA-3, which appends a '1' bit, zero or more '0' bits, and another '1' bit to reach the required alignment. The padded message is then divided into r-bit blocks m_i, which are successively XORed into the first r bits of the state S, followed by applying a full-state permutation \pi (a b-bit transformation, such as the Keccak-f[1600] permutation consisting of 24 rounds). This process can be expressed as: S \leftarrow \pi(S \oplus (m_i \| 0^c)) where \| denotes concatenation and 0^c are c zero bits. Once absorption is complete, the squeezing phase begins by reading the first r bits of the current state to produce output blocks z_i, applying the permutation \pi between blocks: z_i \leftarrow S[0 \dots r-1], \quad S \leftarrow \pi(S). The output length is user-specified and flexible, with security governed by the capacity rather than the output length, and the construction naturally supports extendable-output functions (XOFs) by continuing the squeezing phase indefinitely. The security of the sponge construction is fundamentally bounded by the capacity c, with generic attacks against the construction costing on the order of 2^{c/2} operations under the assumption of an ideal underlying permutation, as the capacity bits remain hidden from the attacker. Unlike chain-based designs, the sponge avoids length-extension vulnerabilities because appending data would require knowledge of the hidden inner state, making such attacks computationally infeasible. This construction was standardized by NIST in FIPS 202 (2015) as the basis for the SHA-3 family, utilizing a 1600-bit state with varying r and c (e.g., c = 512 for SHA3-256). Its permutation-centric design offers advantages in hardware efficiency, particularly for parallel implementations and resource-constrained devices, and recent analyses as of 2025 indicate potential resilience against quantum attacks like Grover's algorithm due to the sponge's indifferentiability from a quantum random oracle.
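The absorb/squeeze data flow can be illustrated with a toy sponge in Python. This is purely a sketch under stated assumptions: the permutation is faked with SHA-256 over a 32-byte state (rate 8 bytes, capacity 24 bytes), and the padding is a simplified byte-level version of the 10*1 rule; Keccak/SHA-3 instead use the 1600-bit Keccak-f permutation and bit-level padding.

```python
import hashlib

RATE, CAPACITY = 8, 24          # toy parameters: b = 32-byte state
STATE_BYTES = RATE + CAPACITY

def fake_permutation(state: bytes) -> bytes:
    # Stand-in for a real b-bit permutation such as Keccak-f[1600] (NOT secure).
    return hashlib.sha256(state).digest()   # 32 bytes, matching STATE_BYTES

def toy_sponge(message: bytes, out_len: int = 32) -> bytes:
    # Simplified multi-rate padding: 0x01, zero bytes, 0x80 (byte-level "10*1").
    padded = message + b"\x01" + bytes((-len(message) - 2) % RATE) + b"\x80"
    state = bytes(STATE_BYTES)
    # Absorbing phase: XOR each rate-sized block into the outer state, then permute.
    for i in range(0, len(padded), RATE):
        block = padded[i:i + RATE].ljust(STATE_BYTES, b"\x00")
        state = fake_permutation(bytes(s ^ m for s, m in zip(state, block)))
    # Squeezing phase: read r bytes of output, permute, repeat (XOF-style).
    out = b""
    while len(out) < out_len:
        out += state[:RATE]
        state = fake_permutation(state)
    return out[:out_len]

print(toy_sponge(b"sponge demo", out_len=16).hex())
```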

Specific Algorithms

Early Algorithms

One of the earliest widely adopted cryptographic hash functions was MD5, designed by Ronald Rivest in 1991 as an improvement over the vulnerable MD4. It produces a 128-bit (16-byte) hash value and employs the Merkle-Damgård construction with four rounds of 16 operations each, processing input in 512-bit blocks padded to a multiple of that size. MD5 was standardized in RFC 1321 in 1992 and quickly integrated into early protocols such as SSL precursors and file integrity checks, where its compact output facilitated efficient digital signatures. For example, the MD5 hash of an empty string is d41d8cd98f00b204e9800998ecf8427e. In 1995, the National Institute of Standards and Technology (NIST) published SHA-1 as part of the Secure Hash Standard in FIPS 180-1, aiming to provide a more secure alternative with a longer 160-bit (20-byte) output. SHA-1 follows a design similar to MD5, using the Merkle-Damgård construction but expanded to 80 processing steps organized as four rounds of 20 operations each on 512-bit blocks, incorporating additional bitwise functions for enhanced diffusion. It was initially specified for use with the Digital Signature Algorithm (DSA) in FIPS 186 from 1994 and adopted in protocols like SSL/TLS for certificate validation and message authentication. Developed by a European consortium under the RIPE project, RIPEMD-160 emerged in 1996 as a 160-bit hash function intended for resource-constrained environments such as smart cards. Like its predecessors, it relies on the Merkle-Damgård construction with a 512-bit block size but introduces a double-branch parallel structure—two independent 160-bit chains combined at the end—to improve security margins and computational efficiency through parallelism. These early algorithms shared core design principles rooted in the Merkle-Damgård paradigm (detailed in the Merkle-Damgård Construction section), emphasizing iterative compression of message blocks into a fixed-length digest using 512-bit inputs. On typical 1990s hardware, such as late-era Intel Pentium processors, their software implementations achieved throughputs around 50 MB/s for MD5, with SHA-1 and RIPEMD-160 slightly lower due to increased round counts. Standardization progressed rapidly: MD5 via RFC 1321 in 1992, SHA-1 integrated into the Digital Signature Standard (FIPS 186) in 1994 and formalized in FIPS 180-1 in 1995, and RIPEMD-160 proposed in academic literature in 1996 before inclusion in ISO/IEC 10118-3. By the mid-2000s, however, theoretical attacks prompted initial deprecation efforts, starting with NIST's 2005 comments on SHA-1 vulnerabilities, leading to phased withdrawals from standards.

SHA Family

The Secure Hash Algorithm (SHA) family, standardized by the National Institute of Standards and Technology (NIST), encompasses a series of cryptographic hash functions designed for high security and broad applicability in federal systems. Introduced to succeed earlier vulnerable algorithms, the family progressed from the SHA-2 series in 2002 to the SHA-3 series in 2015, providing robust collision and preimage security while supporting various output lengths. These functions are integral to NIST's cryptographic standards, with SHA-2 relying on the Merkle-Damgård construction and SHA-3 employing a sponge-based approach for enhanced flexibility. SHA-2, first published by NIST in 2002 and currently specified in Federal Information Processing Standard (FIPS) 180-4, includes variants SHA-224, SHA-256, SHA-384, and SHA-512, producing digest sizes of 224, 256, 384, and 512 bits, respectively. These algorithms operate on 512-bit blocks for the 224- and 256-bit variants and 1024-bit blocks for the 384- and 512-bit variants, processing messages through a 64-round (SHA-224/256) or 80-round (SHA-384/512) compression function based on the Merkle-Damgård construction. The round constants for SHA-256, for example, derive from the first 32 bits of the fractional parts of the cube roots of the first 64 prime numbers (starting with 2, 3, 5, ..., 311), providing nothing-up-my-sleeve constants across rounds. Additional truncated variants, SHA-512/224 and SHA-512/256, were added in FIPS 180-4 updates to provide shorter digest lengths compatible with legacy systems while maintaining security margins equivalent to their namesake digest lengths. SHA-3, formalized in FIPS 202 and released by NIST in 2015, standardizes the KECCAK algorithm selected from the SHA-3 competition, utilizing a sponge construction for variable input and output lengths. The family includes fixed-output hash functions SHA3-224, SHA3-256, SHA3-384, and SHA3-512, alongside extendable-output functions (XOFs) SHAKE128 and SHAKE256, which allow arbitrary-length outputs while providing at least 128 or 256 bits of security, respectively. For SHA3-256, the sponge operates with a state size of 1600 bits, a rate r = 1088 bits for absorbing input, and a capacity c = 512 bits dedicated to security; similar parameters scale for other variants, with c = 448 for SHA3-224, c = 512 for SHA3-256, c = 768 for SHA3-384, and c = 1024 for SHA3-512. This design enables resistance to length-extension attacks inherent in Merkle-Damgård structures, as briefly referenced in the sponge construction principles. SHA-3's permutation-based absorption and squeezing phases support diverse security levels without fixed block sizes. NIST has fully deprecated SHA-1 due to practical collision attacks, mandating its disallowance for all cryptographic protections by December 31, 2030, in favor of SHA-2 or SHA-3. SHA-2 remains recommended for use through at least 2030, with NIST outlining migration strategies to address potential quantum threats via Grover's algorithm, which could reduce preimage search complexity to roughly the square root; longer digests or SHA-3 adoption are advised for extended security. As of 2025, SHA-3 sees increased integration in standards from NIST's post-quantum cryptography (PQC) selections, including use in signature schemes like those leveraging SHAKE functions for hash-based security. On modern CPUs, SHA-256 implementations achieve throughputs of approximately 1 GB/s in software without hardware acceleration, scaling higher with Intel's SHA New Instructions (SHA-NI), announced in 2013 and available in later processor generations, which optimize the round processing.
The core round function in SHA-256 incorporates bitwise choice and majority operations for non-linearity, defined as: \text{Ch}(x, y, z) = (x \land y) \oplus (\lnot x \land z), \qquad \text{Maj}(x, y, z) = (x \land y) \oplus (x \land z) \oplus (y \land z). These functions, applied bitwise across 32- or 64-bit words in each round of the compression function, contribute to the algorithm's diffusion and resistance to differential cryptanalysis.
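These Boolean functions translate directly into word-level code. The Python sketch below shows Ch and Maj on 32-bit words as used in SHA-256 rounds; it is illustrative only, since a full SHA-256 round also involves the Σ/σ rotations, round constants, and message schedule.

```python
MASK32 = 0xFFFFFFFF  # operate on 32-bit words as in SHA-256

def ch(x: int, y: int, z: int) -> int:
    """Choose: each output bit is taken from y where x is 1, from z where x is 0."""
    return ((x & y) ^ (~x & z)) & MASK32

def maj(x: int, y: int, z: int) -> int:
    """Majority: each output bit is the majority vote of the three input bits."""
    return ((x & y) ^ (x & z) ^ (y & z)) & MASK32

assert ch(0xFFFFFFFF, 0x12345678, 0x9ABCDEF0) == 0x12345678  # x all-ones selects y
assert maj(0b1100, 0b1010, 0b1001) == 0b1000                 # only bit 3 has a majority of 1s
```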

Other Notable Algorithms

Whirlpool is a cryptographic hash function designed by Paulo S. L. M. Barreto and Vincent Rijmen in 2000 as a submission to the New European Schemes for Signatures, Integrity, and Encryption (NESSIE) project. It produces a 512-bit output and employs a wide-pipe Merkle-Damgård construction, where the internal state matches the output size to enhance security against certain attacks. The function operates on 512-bit blocks using a block cipher-like primitive called W, which features an 8×8 byte state matrix and AES-inspired S-boxes for non-linear substitution, followed by diffusion layers. The W cipher performs 10 rounds, providing robust mixing through additions, rotations, and XOR operations in a Miyaguchi-Preneel mode. Whirlpool was selected by NESSIE and later standardized in ISO/IEC 10118-3:2004 for dedicated hash functions up to 512 bits. BLAKE2, introduced in 2012 by Jean-Philippe Aumasson and Samuel Neves, builds on the ChaCha stream cipher as an evolution of the SHA-3 finalist BLAKE, prioritizing software performance. The variant BLAKE2b targets 64-bit platforms and generates digests from 1 to 64 bytes, with a standard 512-bit output, while BLAKE2s suits 8- to 32-bit systems with up to 32-byte outputs. It supports parallelism through BLAKE2bp (using four BLAKE2b instances) and BLAKE2sp (eight BLAKE2s instances), enabling multi-core acceleration up to 8× faster via SIMD instructions. BLAKE2b achieves higher throughput than SHA-2 and SHA-3, with benchmarks showing approximately 3.32 cycles per byte on 64-bit Intel processors, compared to roughly 20 cycles per byte reported for SHA-3 in the same measurements. Keyed modes are integrated natively, functioning as a message authentication code (MAC) or pseudorandom function (PRF) without needing constructs like HMAC, and outperforming them in speed. BLAKE2 has gained adoption in cryptographic libraries such as libsodium since its 2013 release, as well as other mainstream libraries and protocols, due to its balance of security and efficiency, though it lacks formal NIST standardization. BLAKE3, announced in 2020 by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O'Hearn, extends BLAKE2 into a tree-based mode for extreme parallelism and versatility as an extendable-output function (XOF). It processes inputs in 1-KiB chunks, building a Merkle tree in which chunks form the leaves and parent nodes are computed by compressing their children, allowing non-sequential processing and incremental updates ideal for large-scale verification tasks. The compression function, derived from BLAKE2 but reduced to fewer rounds and optimized for better hardware utilization, achieves speeds around 10 GB/s on modern CPUs for multi-gigabyte inputs through massive parallelism across cores and SIMD lanes. BLAKE3 serves as a general-purpose hash, keyed hash, key derivation function, and XOF in many applications, with its tree structure enabling efficient proofs of inclusion or verified streaming. By 2025, it has seen widespread integration in the Rust ecosystem via its official crate, powering tools for file integrity and related components, alongside an official C implementation, without pursuing NIST approval but thriving in open-source environments. For example, hashing an empty input in BLAKE3 reduces to compressing a single empty chunk as the root node of the tree, with no additional parent nodes required.
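BLAKE2's keyed mode is exposed directly in Python's standard hashlib, which makes a simple usage example possible; the snippet below shows BLAKE2b with a chosen digest size and a secret key acting as a MAC (the key and digest size here are arbitrary illustrations).

```python
import hashlib

# Plain BLAKE2b with a 32-byte (256-bit) digest
print(hashlib.blake2b(b"hello", digest_size=32).hexdigest())

# Native keyed mode: BLAKE2b acts as a MAC/PRF without an HMAC wrapper
mac = hashlib.blake2b(b"message to authenticate",
                      digest_size=32,
                      key=b"0123456789abcdef")   # illustrative secret key
print(mac.hexdigest())
```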

Applications

Data Integrity Verification

Cryptographic hash functions enable data integrity verification by producing a fixed-length digest, or hash value, from input data such that even a single-bit alteration in the input results in a substantially different output. The process involves computing the hash of the original data once and storing or transmitting the digest alongside or separately from the data; upon receipt or later inspection, the hash is recomputed from the current data and compared to the original digest to detect any tampering or corruption. This mechanism relies on the collision resistance property of cryptographic hashes, which makes it computationally infeasible for an adversary to find two different inputs yielding the same hash, thereby ensuring reliable detection of unauthorized modifications. A common use case is verifying software downloads, where providers publish SHA-256 digests for ISO images; users recompute the hash of the downloaded file and match it against the provided value to confirm no alterations occurred during transfer. In version control systems like Git, commits have historically used SHA-1 hashes for integrity, but as of 2025, repositories are migrating to SHA-256 to enhance security against known SHA-1 weaknesses. For accidental errors such as bit flips during storage or transmission, cryptographic hashes detect changes with an extremely low probability of failure, approximately 2^{-n}, where n is the hash output in bits (e.g., 2^{-256} for SHA-256), far surpassing non-cryptographic methods like CRC-32, which are optimized for error detection but vulnerable to intentional forgery. Tools like hashdeep facilitate recursive hashing of directory structures to generate and audit digests for multiple files, aiding in bulk integrity checks for forensic or archival purposes. Additionally, hash functions integrate into web standards via Subresource Integrity (SRI), where browsers verify fetched resources against declared SHA-256 hashes to prevent loading tampered content. Despite their effectiveness, hash-based verification has limitations: it only confirms whether data has changed but does not reveal the nature or location of modifications, nor does it provide authenticity guarantees for the data itself. Furthermore, the digest must be transmitted or stored via a trusted channel, as an attacker could substitute both the data and a matching forged digest if access to the channel is compromised. The concept traces back to early integrity-checking tools, with Tripwire introduced in 1992 as a Unix-based system that used cryptographic hashes to baseline and detect changes in critical files, alerting administrators to potential intrusions.
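A typical download-verification workflow can be scripted in a few lines; this sketch streams a file through SHA-256 in chunks and compares the result against a published digest using a constant-time comparison (the file name and expected digest below are placeholders).

```python
import hashlib
import hmac

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so arbitrarily large files can be hashed."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"  # placeholder digest
actual = sha256_of_file("downloaded.iso")  # placeholder file name
# hmac.compare_digest avoids leaking how many leading characters matched
print("OK" if hmac.compare_digest(actual, expected) else "MISMATCH: file corrupted or tampered")
```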

Digital Signatures and Message Authentication

Cryptographic hash functions play a crucial role in digital signatures by enabling the "hash-then-sign" paradigm, where the message is first hashed to produce a fixed-size digest, which is then signed using an asymmetric algorithm such as RSA. This approach, exemplified by the RSASSA-PKCS1-v1_5 scheme with SHA-256 in PKCS #1 v2.2, significantly reduces computational overhead compared to signing the entire message, as the signature operation processes only the digest rather than the potentially large original data. For message authentication, the Hash-based Message Authentication Code (HMAC), specified in RFC 2104, provides keyed integrity by combining a secret key with the message through nested hashing. The construction is defined as: \text{HMAC}(K, m) = H\left( (K \oplus \text{opad}) \parallel H\left( (K \oplus \text{ipad}) \parallel m \right) \right) where H is the underlying hash function, K is the key (padded if necessary), \parallel denotes concatenation, and \text{opad} and \text{ipad} are fixed-length outer and inner padding constants (typically 0x5c repeated and 0x36 repeated, respectively, for block length B). This nested structure ensures authenticity against forgery attempts by an adversary without the key. In practice, hash functions underpin digital signatures in public key infrastructure (PKI) certificates, where X.509 standards employ SHA-256 for hashing the certificate body before signing with algorithms like RSA or ECDSA to bind identities to public keys. Similarly, JSON Web Tokens (JWTs) in RFC 7519 use hash-then-sign mechanisms, such as RS256 (RSA with SHA-256), to secure claims in authentication and authorization contexts. The Transport Layer Security (TLS) protocol version 1.3 mandates HMAC-SHA256 as part of its HMAC-based key derivation function (HKDF) for authenticating handshake messages and protecting record integrity in secure communications. The security of these schemes relies on the collision resistance of the underlying hash; for instance, existential unforgeability under chosen-message attacks (EUF-CMA) for hash-then-sign signatures holds if finding hash collisions is computationally infeasible, as proven in the random oracle model for constructions like RSA-FDH. HMAC's nested application of the hash prevents length extension attacks, where an adversary could otherwise append data to a known output without knowing the key, a weakness inherent to Merkle-Damgård hashes like SHA-256 when used directly for authentication. Over time, adoption has shifted from deprecated combinations like HMAC-MD5—vulnerable due to MD5's collision weaknesses—to more robust options such as HMAC-SHA256, with NIST's FIPS 202 standardizing SHA-3 in 2015 and its integration accelerating by 2025 amid transitions to quantum-resistant cryptography, including hash-based primitives. For example, Bitcoin transactions have employed ECDSA with SHA-256 hashing for signing transaction data since its inception in 2009, while the 2021 Taproot upgrade introduced Schnorr signatures tagged with SHA-256 to enhance privacy and efficiency in multi-signature scenarios.
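The nested HMAC formula above maps directly to code; the sketch below builds HMAC-SHA256 by hand from the ipad/opad construction and checks it against Python's standard hmac module (key and message are arbitrary example values).

```python
import hashlib
import hmac

def manual_hmac_sha256(key: bytes, message: bytes) -> bytes:
    block_size = 64                                 # B = 64 bytes for SHA-256
    if len(key) > block_size:                       # long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")            # then zero-padded to B bytes
    ipad = bytes(b ^ 0x36 for b in key)             # inner padding constant
    opad = bytes(b ^ 0x5C for b in key)             # outer padding constant
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()

key, msg = b"secret key", b"message to authenticate"
assert manual_hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest()
print(manual_hmac_sha256(key, msg).hex())
```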

Password Storage and Verification

Cryptographic hash functions are essential for secure password storage, as they transform user passwords into fixed-length values that cannot be reversed, thereby protecting against unauthorized recovery even if the storage is compromised. To store passwords securely, systems typically compute a hash of the password combined with additional measures, ensuring that the original password remains unknown. This approach relies on the one-way nature of hash functions, making direct recovery infeasible while allowing efficient comparison during authentication. A key technique is salting, where a unique random value, known as a salt, is generated for each user and prepended to the password before hashing. For example, a 16-byte salt is commonly used to ensure uniqueness, resulting in hash(password + salt) that thwarts precomputed attacks like rainbow tables, which exploit identical hashes for common passwords across users. Salts are stored alongside the hash in the database, enabling recomputation without storing the plaintext password. This method significantly increases the computational effort required for offline attacks on stolen databases. To further resist brute-force and dictionary attacks, slow hashing or key derivation functions (KDFs) are employed, which deliberately increase computation time through iterations or resource-intensive operations. One widely adopted standard is PBKDF2, defined in RFC 2898 (2000), which applies a pseudorandom function (PRF) iteratively, typically 1000 or more times, to derive the final output. For each output block, the process computes U_1 = \text{PRF}(P, S \| \text{INT}(i)) for block index i, then U_j = \text{PRF}(P, U_{j-1}) for j = 2, \dots, c (where c is the iteration count, P is the password, and S is the salt), and the derived block is the XOR of all U_j, with blocks concatenated if needed for longer outputs. This design slows down attackers attempting exhaustive searches on offline data. More advanced functions address evolving hardware threats like GPUs and ASICs. Bcrypt, introduced in 1999, incorporates an adaptive cost factor that allows tunable work levels based on the Blowfish cipher, making it resistant to acceleration by specialized hardware. Scrypt, proposed in 2009, is memory-hard, requiring significant RAM during computation to deter parallelization on low-memory devices like GPUs. Argon2, the winner of the 2013–2015 Password Hashing Competition, enhances these by balancing resistance to time-cost trade-offs, memory usage, and parallel processing; its variants, such as Argon2id (hybrid data-independent and data-dependent), provide robust protection against side-channel attacks. Best practices emphasize using these modern KDFs with appropriate parameters to balance security and usability. NIST Special Publication 800-63B (updated 2025) requires approved password storage mechanisms resistant to offline attacks, recommending salting together with work factors providing at least 112 bits of effective security, while industry guidance such as OWASP advocates Argon2id with a unique salt and computation time of roughly 10–15 milliseconds on target hardware to mitigate dictionary and offline brute-force attacks. In 2025, Argon2id is recommended by such guidance and increasingly adopted in new systems and frameworks, with fast unsalted hashes like MD5 fully deprecated for password storage due to their vulnerability to collisions and rapid cracking. For verification, upon login, the system retrieves the stored salt and hash, recomputes the hash of the provided password concatenated with the salt using the same parameters, and compares it to the stored value; a match confirms authenticity without exposing the password. This process ensures efficient online checks while maintaining high security against offline threats.
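The salt-then-slow-hash pattern looks roughly like the following sketch, which uses PBKDF2-HMAC-SHA256 from Python's standard library purely as an illustration; the iteration count and salt size are example values, and memory-hard functions such as Argon2id from a dedicated library are generally preferred for new systems.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # example work factor; tune to available hardware

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = secrets.token_bytes(16)                      # unique random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest                                 # store both alongside the account

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored)       # constant-time comparison

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```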

Blockchain and Proof-of-Work

In proof-of-work (PoW) consensus mechanisms, cryptographic hash functions serve as the core computational puzzle for validating transactions and securing networks. Miners compete to find a nonce value such that the hash of the block header concatenated with the nonce produces an output below a predefined target, ensuring that solving the puzzle requires significant computational effort while verification is inexpensive. For instance, Bitcoin employs a double application of SHA-256, computing H = \text{SHA-256}(\text{SHA-256}(\text{block header} \Vert \text{nonce})), where the resulting 256-bit hash must be less than the current target to form a valid block. The difficulty of this puzzle adjusts dynamically to maintain consistent block production rates, typically targeting one block every 10 minutes in Bitcoin. The target value scales inversely with the network's total hash rate, recalculated every 2016 blocks based on the actual time taken to mine them; if blocks are mined faster, the target decreases (increasing difficulty), and vice versa. The expected number of hash trials required to solve a block is approximately 2^{d}, where d represents the effective bits of difficulty (the number of leading zero bits required in the 256-bit SHA-256 output). Bitcoin, launched in January 2009, pioneered the use of SHA-256 in PoW for transaction validation and network security. Ethereum, prior to its transition to proof-of-stake in September 2022, utilized the Ethash algorithm—a memory-hard PoW function designed to resist specialized hardware by requiring significant memory bandwidth during computation, thereby promoting mining decentralization. The global Bitcoin network hash rate reached approximately 1,100 EH/s (exahashes per second) by November 2025, reflecting the immense energy consumption of PoW, estimated to rival that of small countries; this has driven a broader industry shift toward proof-of-stake systems, which eliminate the need for continuous hashing to reduce environmental impact. Within each block, Merkle trees—binary hash trees—efficiently summarize transactions by pairwise hashing their identifiers until reaching a single root hash, which is included in the block header for PoW computation. This structure, \text{Merkle root} = H(H(T_1 \Vert T_2) \Vert H(T_3 \Vert T_4)) for four transactions T_i, allows light clients to verify transaction inclusion with logarithmic proofs without downloading the full block, leveraging the collision resistance of the underlying hash function. Variants of PoW seek to address centralization risks from application-specific integrated circuits (ASICs). Litecoin, introduced in 2011 as a Bitcoin fork, adopted the scrypt function for its PoW, which incorporates a memory-intensive mixing step (parameters N=1024, r=1, p=1) to initially favor CPU and GPU mining over ASICs, enhancing accessibility. By 2025, trends in layer-2 scaling solutions increasingly incorporate hybrid PoW elements, combining on-chain PoW validation with off-chain computation to balance security and efficiency in high-throughput networks.
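The hash-below-target puzzle can be demonstrated with a toy miner; the sketch below uses Bitcoin-style double SHA-256 but a tiny difficulty (16 leading zero bits, roughly 65,000 expected trials) so it finishes quickly, and the "block header" is just a placeholder byte string.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header: bytes, difficulty_bits: int = 16) -> tuple[int, bytes]:
    """Find a nonce so that double-SHA-256(header || nonce) has `difficulty_bits`
    leading zero bits; expected work is about 2**difficulty_bits trials."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = double_sha256(header + nonce.to_bytes(8, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
        nonce += 1

nonce, digest = mine(b"toy block header")   # placeholder header contents
print(nonce, digest.hex())
```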

Attacks and Vulnerabilities

Attacks on Hash Functions

Cryptographic hash functions are susceptible to various attacks that target their security properties, including preimage resistance, second-preimage resistance, and collision resistance. These attacks can be theoretical, requiring immense computational resources, or practical, enabling real-world exploitation. Historical breakthroughs have demonstrated vulnerabilities in widely used algorithms, leading to their deprecation, while modern functions like SHA-2 and SHA-3 remain unbroken as of 2025. Collision attacks aim to find two distinct inputs that produce the same hash output, violating collision resistance. A seminal example is the differential collision attack on MD5 by Wang et al. in 2004, which required approximately 2^{39} computations to generate a collision, with a full practical demonstration published in 2005. Similarly, in 2017, researchers from Google and CWI developed a practical collision attack on full SHA-1, requiring about 2^{63} operations and enabling the creation of colliding files. Preimage attacks seek an input that hashes to a given output, challenging preimage resistance; however, no practical attacks exist for full rounds of mature functions. For MD5, the best known theoretical preimage attack has a complexity of roughly 2^{123}, described in analyses around 2009, with no significant improvements rendering it practical by 2025. Second-preimage attacks, finding a different input for a given input's hash, follow similar complexities but remain infeasible for full algorithms. Birthday attacks exploit the birthday paradox to find collisions with effort roughly 2^{n/2} for an n-bit hash, a generic bound that applies to any hash function. The SHAttered attack in 2017 applied this framework to SHA-1, producing two different PDF files with identical hashes after 2^{63} operations, demonstrating practical forgery in formats like PDFs. Such attacks underscore the reduced security margin for collisions compared to preimages. Side-channel attacks target implementation vulnerabilities rather than the algorithm's mathematics, such as timing analysis or power analysis on hardware implementations. These can leak information about intermediate states during hashing, enabling key recovery or partial breaks; for instance, power analysis has been applied to keyed hash (HMAC) implementations in smart cards. Countermeasures include constant-time coding to eliminate timing variations and masking to randomize power consumption. Quantum threats pose emerging risks, with Grover's algorithm reducing preimage search effort from 2^n to 2^{n/2} using quantum parallelism. For collisions, quantum algorithms like Brassard–Høyer–Tapp reduce the complexity to O(2^{n/3}). As of 2025, NIST's post-quantum guidance recommends doubling hash output lengths (e.g., using SHA-512) to preserve preimage security against Grover's algorithm, without altering hash designs themselves. The impact of these attacks has been profound: MD5 and SHA-1 are deprecated for security-critical uses, with NIST mandating phase-out of SHA-1 by 2030 due to practical collisions. No significant breaks have been found in SHA-2 or SHA-3 as of 2025, preserving their status as recommended standards.
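The generic birthday attack is easy to demonstrate on a deliberately truncated hash; the sketch below finds two distinct inputs that collide on the first 4 bytes (32 bits) of SHA-256, which by the birthday bound takes on the order of 2^16 trials, while doing the same against the full 256-bit output remains infeasible.

```python
import hashlib
import secrets

def truncated_sha256(data: bytes, n_bytes: int = 4) -> bytes:
    return hashlib.sha256(data).digest()[:n_bytes]

def birthday_collision(n_bytes: int = 4) -> tuple[bytes, bytes]:
    """Find two distinct inputs whose SHA-256 digests agree on the first n_bytes.
    Expected work is roughly 2**(4 * n_bytes) trials by the birthday bound."""
    seen: dict[bytes, bytes] = {}
    while True:
        candidate = secrets.token_bytes(16)
        tag = truncated_sha256(candidate, n_bytes)
        if tag in seen and seen[tag] != candidate:
            return seen[tag], candidate
        seen[tag] = candidate

a, b = birthday_collision()
print(a.hex(), b.hex())
print(truncated_sha256(a).hex(), truncated_sha256(b).hex())  # identical 4-byte prefixes
```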

Attacks on Hashed Data

Attacks on hashed data primarily target the storage or transmission of hash values themselves, exploiting weaknesses in how systems handle outputs from cryptographic hash functions, such as in password databases or credential-verification contexts. These attacks succeed when hashes are leaked through breaches, often due to inadequate protection like the absence of salts or peppers, allowing adversaries to reverse-engineer original inputs offline without interacting with the target system. Unlike attacks on the hash function's properties, these focus on practical exploitation of stored values, emphasizing the importance of secure implementation practices. Rainbow tables represent a classic time-memory trade-off attack on unsalted hashed passwords, precomputing chains of hash values to reduce storage needs while enabling rapid lookups. Developed as an improvement over earlier precomputation methods, they generate long chains by repeatedly hashing a starting password candidate, applying a reduction function to produce a new candidate, and continuing the process; only the start and end of each chain are stored, allowing reconstruction during an attack if an endpoint matches the target hash. For unsalted hashes, the same input always yields the same digest, making precomputation feasible across all possible inputs. This is defeated by salting, which requires unique tables per salt value. For example, covering 99.9% of 8-character lowercase passwords (26^8 possibilities) can be achieved with rainbow tables requiring approximately 2^{32} bytes of storage through optimized chain parameters, far less than a full lookup table. Offline brute-force attacks accelerate cracking of leaked hashes using high-performance hardware, systematically trying all possible inputs until a match is found. Modern GPUs enable massive parallelism, with a single NVIDIA GeForce RTX 4090 capable of computing around 82 billion hashes per second, allowing exhaustive searches of weak password spaces in minutes. In contrast, slow functions like bcrypt, designed for password storage, limit rates to about 3,200 hashes per second on the same GPU due to their adjustable work factor, making brute-force economically prohibitive for strong inputs. These attacks thrive on breaches exposing large hash datasets, underscoring the need for computationally expensive hashing to deter offline cracking. Reuse of salts or their complete absence leads to catastrophic compromises, as identical passwords produce identical hashes, enabling attackers to crack multiple accounts simultaneously with a single computation. Unique per-user salts prevent this by ensuring each hash requires individual effort, but poor implementation like global or reused salts amplifies damage. A prominent example is the 2012 LinkedIn breach, where approximately 117 million unsalted password hashes were stolen and later posted online, allowing rapid cracking of common passwords like "123456" for over 1 million users via rainbow tables and brute force. This unsalted approach exposed the platform to widespread credential reuse attacks, with hundreds of thousands of hashes cracked shortly after the leak. Dictionary attacks target hashed passwords by focusing on probable inputs from lists of common words, phrases, or patterns derived from prior breaches, rather than exhaustive searches. Attackers hash entries and compare them to leaked values, often succeeding against users with predictable choices like "password123". To counter this, especially in offline scenarios, a pepper—a secret value stored separately from the database, such as in a hardware security module—can be concatenated with the salted password before hashing, rendering dictionary matches useless without the pepper.
Even if the hash database is compromised, the attacker must also obtain the pepper to verify guesses, adding a layer of protection beyond salting alone. In the 2025 landscape, emerging threats to stored hashes include quantum search via Grover's algorithm, which offers a quadratic speedup for brute-force searches, potentially halving the effective security of hash outputs against exhaustive attacks on symmetric primitives like password hashes. While current guidance holds that standard hash lengths like SHA-256 remain viable with doubled key sizes, quantum hardware limitations may delay practical impacts, though harvest-now-crack-later strategies pose risks for long-lived data. Complementing this, hybrid attacks integrate AI-driven guessing—using machine learning models to generate context-aware password candidates from user data or breach patterns—with GPU-accelerated hash cracking, boosting efficiency against predictable passwords by up to 100 times in targeted scenarios. These AI-enhanced hybrids refine dictionary lists in real time, prioritizing likely variations and reducing search spaces significantly. An illustrative case is the 2009 RockYou breach, where 32 million plaintext passwords were exposed due to poor storage practices, providing a foundational dataset for analyzing weak password prevalence; subsequent studies showed that over 90% of short or common passwords in similar unsalted contexts could be recovered via dictionary and hybrid methods, highlighting the perils of inadequate hashing. This incident fueled the creation of massive wordlists used in modern attacks, emphasizing how legacy unsalted implementations enable near-total compromise of predictable inputs.
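The effect of salting on precomputation is easy to see in code: the sketch below shows that two users with the same password get identical unsalted SHA-256 hashes (one rainbow-table lookup breaks both), while unique salts force per-user work. Fast SHA-256 is used here only for illustration; real systems should combine salting with a slow KDF as discussed above.

```python
import hashlib
import secrets

password = b"123456"  # a common, weak password shared by two users

# Unsalted: identical passwords -> identical hashes, ideal for rainbow tables
print(hashlib.sha256(password).hexdigest())
print(hashlib.sha256(password).hexdigest())          # same value again

# Salted: a unique random salt per user makes each stored hash distinct
for user in ("alice", "bob"):
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + password).hexdigest()
    print(user, salt.hex(), digest)                  # different despite the same password
```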

Mitigation Strategies

To mitigate vulnerabilities in cryptographic hash functions, organizations should prioritize the selection of robust algorithms. The National Institute of Standards and Technology (NIST) recommends migrating to at least SHA-256 from the SHA-2 family or SHA3-256 as the minimum standard for new applications, with SHA-1 fully deprecated and disallowed after December 31, 2030, due to its demonstrated collision weaknesses. This transition aligns with NIST's broader policy on hash functions, emphasizing algorithms that maintain preimage and collision resistance against current computational threats. For authentication and message integrity, keying mechanisms and operational modes are essential to prevent attacks on raw hash outputs. The use of Hash-based Message Authentication Codes (HMAC) is mandated for all MAC constructions involving hashes, as it incorporates a secret key to thwart length-extension and other exploits that affect unkeyed hashes. Raw hashes should never be used directly for authentication purposes, as they lack key-derived security and are susceptible to offline attacks; instead, protocols must enforce HMAC or similar keyed constructions. In password storage and verification, parameter tuning of memory-hard hash functions is critical to resist brute-force and parallelized attacks using specialized hardware. Argon2id, the recommended variant, should employ high memory costs—such as 19 MiB of memory with 2 iterations and parallelism of 1, as recommended by OWASP—for secure storage. This configuration balances security with usability, scaling memory usage over time to counter advancing hardware capabilities. To address emerging quantum threats, hash functions must be adapted for post-quantum security. Extending hash output lengths to at least 512 bits doubles the effective security margin against Grover's algorithm, preserving roughly 256-bit preimage strength in a quantum setting. For signature schemes, hash-based constructions like the eXtended Merkle Signature Scheme (XMSS) provide quantum-resistant alternatives, standardized by NIST for stateful hash-based signatures. Ongoing monitoring and flexibility are vital for long-term resilience. Systems should undergo regular audits to detect flaws or deprecated usage, while incorporating hash agility—allowing seamless switching between supported hashes without redesign. For instance, TLS 1.3 enables negotiation of multiple hash algorithms, including SHA-256 and SHA-384, to facilitate upgrades. Careful implementation further bolsters defenses by minimizing side-channel exposure. Developers must employ constant-time algorithms that avoid data-dependent branches and memory accesses, preventing timing attacks that could leak hash states or keys. Hardware accelerations, such as Intel's SHA Extensions introduced in 2013, can optimize performance for SHA-1 and SHA-256 while maintaining these properties when used in validated libraries.
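Two of the recommendations above—keyed MACs instead of raw hashes, and constant-time comparison to avoid timing side channels—look roughly like the following brief sketch using Python's standard library (the key material is a placeholder).

```python
import hashlib
import hmac

key = b"replace-with-a-real-secret-key"   # placeholder key material

def authenticate(message: bytes) -> bytes:
    # Use HMAC (keyed) rather than hashing key || message directly,
    # which would be exposed to length-extension with Merkle-Damgard hashes.
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    expected = authenticate(message)
    # compare_digest runs in time independent of where the first mismatch occurs
    return hmac.compare_digest(expected, tag)

tag = authenticate(b"important message")
print(verify(b"important message", tag))   # True
print(verify(b"tampered message", tag))    # False
```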

Advanced Uses

Building Other Cryptographic Primitives

Cryptographic hash functions provide foundational properties such as collision resistance and one-wayness, enabling their use as components in constructing more sophisticated primitives like key derivation functions and random number generators. These constructions leverage the hash's ability to transform inputs into fixed-length, pseudorandom outputs while preserving security guarantees through provable reductions. Key derivation functions (KDFs) employ hash functions to produce cryptographic keys from weaker or lower-entropy sources, such as passwords or shared secrets. The HMAC-based key derivation function (HKDF), defined in RFC 5869, follows an extract-then-expand paradigm to ensure the output is uniformly random and suitable for cryptographic use; it first computes a pseudorandom key (PRK) via HKDF-Extract using a salt and input keying material (IKM) as \text{PRK} = \text{HKDF-Extract}(\text{salt}, \text{IKM}), then derives the output keying material (OKM) via HKDF-Expand with optional context information as \text{OKM} = \text{HKDF-Expand}(\text{PRK}, \text{info}, L). This design mitigates issues like insufficient entropy in the IKM by incorporating salt and iteration mechanisms. Another prominent KDF is PBKDF2, which applies a pseudorandom function (typically HMAC) iteratively with a salt to derive keys from passwords, enhancing resistance to brute-force attacks; it is specified in RFC 2898 and used in Wi-Fi Protected Access 2 (WPA2) to generate the pairwise master key from a passphrase and service set identifier through 4096 iterations of HMAC-SHA-1. Hash-based deterministic random bit generators (DRBGs) utilize iterative hashing to produce sequences of pseudorandom bits from an initial seed, providing a controlled source of pseudorandomness for cryptographic applications. As standardized in NIST SP 800-90A, these DRBGs require periodic reseeding with fresh entropy to maintain security, after which output is generated by chaining hash invocations on the internal state, such as hashing the state together with additional inputs and deriving bits from the function's output. This approach ensures forward and backward security against seed compromise, with hash-based instances relying on approved functions like SHA-256 for both generation and reseeding processes. Beyond KDFs and DRBGs, hash functions enable constructions of pseudorandom functions (PRFs) and commitment schemes. HMAC, built by nesting a hash with a secret key, serves as a PRF whose security is proven under the assumption that the underlying compression function behaves as a PRF, without requiring full collision resistance; idealized analyses additionally model the hash as a random function to simplify the argument. Commitment schemes can employ a hash to bind a value x by publishing h(x \parallel r) with a random r, providing computational binding via collision resistance and hiding in the random oracle model, as demonstrated in early provably secure constructions from collision-free hashes. Security reductions often link these properties: for instance, collision resistance of the hash implies second-preimage resistance, which in turn supports the soundness of certain keyed constructions. Practical examples illustrate these applications' impact. PBKDF2's integration in WPA2 has secured billions of connections since 2004 by slowing offline dictionary attacks. In post-quantum cryptography, as of 2024, the NIST-standardized Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM) in FIPS 203 incorporates SHAKE-256, a hash-based extendable-output function, for deriving shared secrets and encapsulating keys within lattice-based operations, ensuring compatibility with quantum-resistant protocols.
Despite their versatility, hash functions have limitations as building blocks; their non-invertibility and fixed output length make them unsuitable for directly generating long or symmetric keys without expansion mechanisms like those in HKDF, which prevent leakage of input structure and ensure output uniformity.
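Building on the HKDF description above, the extract-then-expand flow of RFC 5869 can be sketched with HMAC-SHA256; the snippet below is an illustrative implementation (the salt, IKM, and info values are arbitrary examples), and production code should prefer a vetted library routine.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: PRK = HMAC-Hash(salt, IKM); an all-zero salt is used if none is given."""
    return hmac.new(salt or bytes(HASH_LEN), ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: chain HMAC blocks T(1), T(2), ... until `length` bytes are produced."""
    okm, block = b"", b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

prk = hkdf_extract(salt=b"example salt", ikm=b"low-entropy shared secret")
key = hkdf_expand(prk, info=b"application context", length=32)
print(key.hex())
```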

Hash-Based Signatures and Structures

Hash-based signatures and structures leverage cryptographic hash functions to construct secure authentication mechanisms, particularly valued for their resistance to quantum threats. These schemes rely solely on the collision-resistance and one-way properties of hash functions, avoiding reliance on number-theoretic assumptions vulnerable to algorithms like Shor's. Key structures include hash trees for efficient verification and signature schemes that enable one-time or limited-use digital signing without exposing long-term secrets. Merkle trees, also known as hash trees, provide an efficient method for committing to a large set of data blocks and verifying their integrity with compact proofs. In a Merkle tree, each leaf node contains the hash of a data block, while non-leaf nodes hold the hash of their children's values, culminating in a root hash that commits to the entire dataset. To verify a specific leaf, an O(log n) sized proof path from the leaf to the root suffices, allowing efficient authentication without revealing the full tree. This structure, introduced by Ralph Merkle in 1979, is foundational for scalable verification in systems requiring frequent integrity checks. One-time signatures (OTS) form the building blocks for more advanced hash-based schemes, designed for signing a single message per key pair to prevent reuse attacks. The Lamport signature scheme, proposed by Leslie Lamport in 1979, exemplifies this approach. For an n-bit security level, the private key consists of 2n random n-bit strings, denoted as sk = (s_0, s_1, \dots, s_{2n-1}), where s_i are chosen uniformly at random. The public key is computed as pk = (h(s_0), h(s_1), \dots, h(s_{2n-1})), with h being a cryptographic hash function. To sign an n-bit message m = (m_1, \dots, m_n), where each m_i \in \{0,1\}, the signer reveals the private key strings corresponding to the bits: for each bit j, reveal s_{2j + m_j}. Verification checks that h(s_{2j + m_j}) = pk_{2j + m_j} for all j, and that the revealed bits match the message digest. This achieves a 128-bit post-quantum security level with 256-bit hashes but produces signatures of roughly n \times n bits (one n-bit string per message bit), making it simple yet inefficient for multiple uses. The Winternitz OTS (W-OTS), introduced by Robert Winternitz in 1982, improves efficiency by allowing each key component to encode multiple bits of the message using iterated hashing. The parameter w (the number of message bits encoded per component, typically a power of 2 like 4 or 16) trades off signature size against computation and security: higher w reduces the number of components but increases hash iterations per component. The scheme hashes private seeds up to a maximum value based on w, enabling compact signatures while maintaining one-time security under one-wayness assumptions on the hash. Variants like W-OTS+ further optimize parameters for practical use. To enable multiple signatures without key reuse, stateful schemes chain one-time signatures using tree structures, tracking used keys to maintain security. The eXtended Merkle Signature Scheme (XMSS), proposed in 2011 and standardized in RFC 8391 (2018), builds a Merkle tree of W-OTS public keys, with the root serving as the XMSS public key. Each signature uses a fresh OTS key, and the signer updates state by computing authentication paths in the tree; after 2^h signatures (h tree height), a new key pair is generated. XMSS supports up to 2^60 signatures per key pair and achieves EUF-CMA security. Similarly, the Leighton-Micali Signature (LMS) scheme, specified in RFC 8554 (2019), uses a similar tree-based chaining but with LMS-OTS primitives for potentially smaller signatures. Both are stateful, requiring careful state management to avoid key reuse.
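The Lamport scheme described above fits in a short sketch; this illustrative Python version signs the SHA-256 digest of a message by revealing one of each pair of secret strings per digest bit (a key pair must never sign two different messages).

```python
import hashlib
import secrets

N_BITS = 256  # sign the 256-bit SHA-256 digest of the message

def keygen():
    # Private key: 2 * N_BITS random 32-byte strings; public key: their hashes.
    sk = [[secrets.token_bytes(32), secrets.token_bytes(32)] for _ in range(N_BITS)]
    pk = [[hashlib.sha256(s).digest() for s in pair] for pair in sk]
    return sk, pk

def message_bits(message: bytes):
    digest = hashlib.sha256(message).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(N_BITS)]

def sign(message: bytes, sk):
    # Reveal, for each digest bit, the secret string matching that bit value.
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]

def verify(message: bytes, signature, pk) -> bool:
    return all(hashlib.sha256(sig).digest() == pk[i][bit]
               for i, (sig, bit) in enumerate(zip(signature, message_bits(message))))

sk, pk = keygen()
sig = sign(b"one-time message", sk)
print(verify(b"one-time message", sig, pk))   # True
print(verify(b"forged message", sig, pk))     # False
```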
For stateless operation, avoiding state-management overhead, SPHINCS+ (a refinement of the original SPHINCS) generates signatures by pseudorandomly selecting few-time signature key pairs from a very large set, using a hypertree structure to authenticate them without tracking usage. Submitted to NIST's post-quantum standardization process and selected for standardization in 2022, SPHINCS+ combines a few-time signature scheme (FORS) with W-OTS+ chains and randomized message hashing to bound the forgery probability, supporting up to 2^64 signatures per key pair, and achieves strong unforgeability under chosen-message attacks. As of 2025, NIST has standardized stateful hash-based signatures through Special Publication 800-208 (2020), which approves LMS and XMSS as supplements to FIPS 186-5 for applications requiring long-term security; the stateless SPHINCS+, renamed SLH-DSA, is specified in FIPS 205 (2024).

These schemes have seen practical deployment: early implementations of IOTA's Tangle, for example, employed Winternitz-based one-time signatures for quantum-resistant transaction signing in its ledger. Hash-based signatures offer provable quantum resistance, as Grover's algorithm provides only a quadratic speedup for preimage attacks on hash functions, reducing an n-bit security level to effectively n/2 bits, which can be mitigated by doubling the hash output size. Drawbacks include large signature sizes (from a few kilobytes for XMSS to tens of kilobytes for SPHINCS+ at the 128-bit security level) and, for stateful schemes, the need for secure state tracking to prevent the key reuse that could enable existential forgery.
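As a back-of-the-envelope illustration of this quadratic-speedup arithmetic (a sketch of the security accounting, not a formal bound):

```latex
% Grover search finds a preimage of an n-bit hash in roughly 2^{n/2} quantum
% evaluations instead of about 2^{n} classical trials, halving the effective
% preimage-security level.
\[
  \underbrace{2^{256}}_{\substack{\text{classical preimage work,}\\ \text{256-bit hash}}}
  \;\xrightarrow{\ \text{Grover}\ }\;
  \approx 2^{256/2} = 2^{128}\ \text{quantum evaluations},
\]
\[
  \text{so doubling the output length to 512 bits restores}
  \approx 2^{256}\ \text{quantum preimage work, recovering the original margin.}
\]
```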
