bcrypt
Bcrypt is a password hashing function designed by Niels Provos and David Mazières, based on the Blowfish block cipher, and first presented at the 1999 USENIX Annual Technical Conference.[1] It employs an expensive key setup process derived from the exponentially keyed Blowfish variant (eksblowfish), making the computation deliberately slow to thwart brute-force and dictionary attacks on password databases.[1] The function includes a configurable cost parameter, or work factor, which determines the number of iterations—for example, 2^6 in the original implementation, and adjustable upward—to ensure adaptability as hardware performance advances, while automatically incorporating a 128-bit salt to prevent precomputed table attacks like rainbow tables.[1][2]
Developed under The OpenBSD Project to address limitations in earlier hashing methods like DES-based crypt, bcrypt formalizes desirable properties for secure password schemes, including resistance to hardware acceleration and key search optimizations.[1] Its output format encodes the algorithm identifier ("2a", "2b", or "2y" variants), cost factor, salt, and hash in a modular, human-readable string of 60 characters, facilitating easy storage and verification.[2] Bcrypt gained prominence through open-source implementations and has been integrated into numerous systems, including as the default hashing mechanism in OpenBSD's crypt library since version 2.1.[3]
Although newer memory-hard functions like scrypt and Argon2 have emerged to counter GPU-accelerated attacks, bcrypt remains a cornerstone of password security due to its balance of security, simplicity, and widespread availability in libraries across languages such as C, Java, and Node.js.[2] It is recommended by security standards for legacy applications where modern alternatives are unavailable, with ongoing use in production environments emphasizing its enduring role in defending against offline password cracking.[2]
History and Development
Origins and Creation
Bcrypt was developed by Niels Provos and David Mazières in the late 1990s as a secure password hashing mechanism for the OpenBSD operating system.[1] The algorithm was first implemented by the pair and imported into the OpenBSD codebase on February 13, 1997, with its initial release as part of OpenBSD 2.1 in June 1997.[4] This early integration marked bcrypt's debut in a production environment, serving as the default password hashing function for OpenBSD's authentication system.[5]
The creation of bcrypt stemmed from concerns over the growing vulnerability of traditional password hashing schemes to advances in computational hardware. Existing methods, such as the DES-based crypt function in UNIX systems, were computationally inexpensive and thus susceptible to offline brute-force attacks as processor speeds increased.[1] Provos and Mazières sought to design a "future-adaptable" scheme that could scale its computational cost to match evolving threats, emphasizing deliberate slowness to deter attackers while remaining feasible for legitimate authentication.[1] To achieve this, they adapted the Blowfish block cipher's key schedule, transforming it into an expensive setup phase that dominates the hashing process.[1]
Bcrypt's formal introduction occurred in the 1999 USENIX Annual Technical Conference paper titled "A Future-Adaptable Password Scheme," where Provos and Mazières detailed the algorithm's design and provided pseudocode for its core operations.[1] By this time, the scheme had undergone refinement within OpenBSD, solidifying its role as a robust defense against password cracking and influencing subsequent security practices.[5]
Initial Adoption and Implementations
Bcrypt was integrated into the OpenBSD operating system's login capabilities as the default password hashing mechanism starting with OpenBSD 2.1 in 1997, and was established as a core security feature by 2000. It subsequently spread to other Unix-like systems, including FreeBSD and NetBSD, where it became a supported option for password storage in /etc/shadow files.[3][6][7]
In 2009, PHP version 5.3 introduced native support for bcrypt through an extension to the crypt() function, specifically via the CRYPT_BLOWFISH algorithm, which facilitated its integration into web development and spurred adoption in popular frameworks such as Ruby on Rails—via the bcrypt-ruby gem, first released in 2007—and Python, through libraries like py-bcrypt.[8][9][10]
Key milestones in bcrypt's dissemination included its availability in Linux distributions around 2006, notably through packages like libcrypt-blowfish-perl in Debian, which provided Blowfish-based hashing compatible with bcrypt for system-level password management, and ongoing efforts to reference it in IETF drafts on password storage best practices, such as draft-ietf-kitten-password-storage, although it has not achieved full standardization.[11]
Early security evaluations, including the foundational 1999 analysis by creator Niels Provos, affirmed bcrypt's design resistance to hardware-accelerated brute-force attacks by emphasizing its adaptable cost factor to counter increasing computational power.
Overview and Design Principles
Purpose and Core Objectives
Bcrypt was designed specifically as a password hashing function to resist offline brute-force and dictionary attacks by incorporating a deliberately computationally expensive process, thereby making it impractical for attackers to crack large numbers of passwords even with significant resources.[1] This approach leverages the expensive key schedule of the Blowfish block cipher to create a hash that scales with hardware capabilities.[1]
The core objectives of bcrypt include providing an adaptive work factor, which allows system administrators to increase the computational cost over time to counteract hardware improvements following Moore's Law, ensuring long-term resistance to faster processors.[5] It also mandates the inclusion of a unique salt for each password to prevent rainbow table precomputation attacks, and its output format embeds the salt, work factor, and hash value in a modular, verifiable string for straightforward rehashing and validation during authentication.[1]
In contrast to general-purpose cryptographic hash functions like MD5 or SHA-1, which prioritize speed for tasks such as data integrity verification, bcrypt emphasizes deliberate slowness—typically tuned to require 100 milliseconds or more per hash on target hardware—to impose a significant burden on attackers while remaining feasible for legitimate login processes.[1]
Bcrypt emerged to replace insecure password storage practices common in early web applications, where fast, unsalted hashes like MD5 or SHA-1 enabled rapid offline cracking; a notable example is the 2012 LinkedIn breach, in which over 6.5 million unsalted SHA-1 password hashes were stolen and subsequently cracked en masse, exposing user accounts to compromise.[12][13]
Key Features and Parameters
Bcrypt incorporates a configurable cost factor that determines the computational expense of the hashing process, expressed as the base-2 logarithm of the number of iterations performed during the expensive key schedule. This parameter, typically ranging from 4 to 31, is tuned so that a value of n results in 2^n rounds, and each increment of one doubles the work, enabling administrators to balance security against performance as hardware advances.[1][5][2]
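The relationship between the cost factor and the iteration count can be illustrated with a few lines of Python. This is a minimal sketch; the function name bcrypt_iterations is a hypothetical helper, and the 4 to 31 range is the one stated above.

```python
def bcrypt_iterations(cost: int) -> int:
    """Return the number of key-schedule iterations, 2**cost, for a given cost."""
    if not 4 <= cost <= 31:
        raise ValueError("bcrypt cost must be between 4 and 31")
    return 1 << cost


# Each step of one in the cost doubles the number of iterations.
for cost in (4, 10, 12, 13):
    print(cost, bcrypt_iterations(cost))
```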
A core feature is the generation of a 128-bit salt using cryptographically secure pseudorandom number generators, such as those seeded by kernel entropy sources in implementations like OpenBSD's. This salt, which is unique per password, prevents precomputation attacks like rainbow tables by ensuring that identical passwords produce distinct hashes.[1][14]
The output of bcrypt is encoded in the Modular Crypt Format: a string beginning with $2a$ (indicating the algorithm version), followed by the two-digit cost factor, a $ delimiter, 22 characters representing the base64-encoded 128-bit salt, and 31 characters representing the base64-encoded 184-bit (23-byte) hash. The string embeds all data needed for verification, so nothing has to be stored separately.[1][14]
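As an illustration of this layout, the following sketch splits a 60-character bcrypt string into its fields using only the Python standard library. The names BcryptHash and parse_bcrypt_hash are hypothetical, and the sample string is used purely as a well-formed example of the format.

```python
import re
from typing import NamedTuple


class BcryptHash(NamedTuple):
    variant: str  # e.g. "2a", "2b", "2y"
    cost: int     # base-2 logarithm of the iteration count
    salt: str     # 22 characters of bcrypt-style base64
    digest: str   # 31 characters of bcrypt-style base64


# prefix, two-digit cost, 22-character salt, 31-character hash
_BCRYPT_RE = re.compile(
    r"\$(2[abxy]?)\$(\d{2})\$([./A-Za-z0-9]{22})([./A-Za-z0-9]{31})"
)


def parse_bcrypt_hash(encoded: str) -> BcryptHash:
    """Split a bcrypt hash string into variant, cost, salt, and digest."""
    match = _BCRYPT_RE.fullmatch(encoded)
    if match is None:
        raise ValueError("not a well-formed bcrypt hash string")
    variant, cost, salt, digest = match.groups()
    return BcryptHash(variant, int(cost), salt, digest)


example = "$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW"
parsed = parse_bcrypt_hash(example)
print(parsed.variant, parsed.cost, len(parsed.salt), len(parsed.digest))
```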
Many bcrypt libraries support automatic rehashing, where upon successful password verification, if the stored hash uses an outdated cost factor, the password is rehashed with the current higher cost and updated in storage, facilitating gradual security improvements without user intervention.[2]
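The rehash decision itself only needs the cost field of the stored string. The sketch below shows that check in isolation; stored_cost and needs_rehash are hypothetical helper names rather than the API of any particular library, and the actual rehash would be performed by whatever bcrypt library the application uses, after the password has been verified.

```python
def stored_cost(encoded: str) -> int:
    """Read the two-digit cost field from a '$2b$NN$...' style bcrypt string."""
    fields = encoded.split("$")
    # A well-formed string splits into ['', variant, cost, salt_and_hash].
    if len(fields) != 4 or not fields[2].isdigit():
        raise ValueError("not a bcrypt hash string")
    return int(fields[2])


def needs_rehash(encoded: str, target_cost: int) -> bool:
    """True when the stored hash was produced with a lower cost than the target."""
    return stored_cost(encoded) < target_cost


# Placeholder salt and digest characters; only the cost field matters here.
sample = "$2b$10$" + "." * 53
print(needs_rehash(sample, 12))
```

The check is only useful at login time, because that is the one moment the application holds the plaintext password needed to compute a new hash.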
Bcrypt's design, rooted in the Blowfish block cipher's key derivation without reliance on Merkle-Damgård constructions, inherently resists length-extension attacks that plague certain hash functions like SHA-256.[1]
Version History
Original and 2a Release
The original bcrypt implementation shipped in the OpenBSD operating system and was formally described in 1999, using the $2$ prefix in its output format. Developed by Niels Provos and David Mazières, it adapted the Blowfish block cipher, with its standard 4 KB of S-boxes, for password hashing through an expensive key schedule followed by 64 encryptions of a fixed string to resist brute-force attacks.[1]
In 2011, the OpenWall project released the 2a variant to resolve a critical flaw in certain implementations, notably PHP's crypt() function, where embedded null bytes (or other 8-bit characters) within the password could cause premature termination and therefore incorrect hashing. The 2a variant fixes this by properly handling 8-bit characters, including null bytes, treating the password as a binary string truncated to exactly 72 bytes without termination artifacts.[15][16]
The 2a specification retains the foundational Blowfish structure, including the 4 KB S-boxes and 64-round modified setup from the original, but adjusts the key expansion process to mitigate the 8-bit character vulnerability without altering the overall security model.[17]
This 2a release quickly established itself as the de facto standard for bcrypt deployments, with libraries such as bcrypt-ruby integrating support for it by late 2011 to ensure interoperability and enhanced security in production environments. The original $2$ prefix is now considered obsolete.
Subsequent Variants (2b, 2x, 2y)
In February 2014, the $2b$ variant was introduced in the OpenBSD bcrypt implementation to resolve a critical bug affecting password lengths. The original code stored the password length in an unsigned 8-bit integer (unsigned char), which caused lengths exceeding 255 bytes to overflow and wrap around to a small value, resulting in only the first few bytes of the password being hashed instead of the intended first 72 bytes, thereby weakening security for very long passwords. The fix expanded support for passwords up to the full 72-byte effective limit, with the new $2b$ prefix signaling the corrected behavior while preserving compatibility for existing $2a$ hashes generated under the buggy regime.[18]
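The wraparound can be modelled in a few lines. This is a simplified sketch of the arithmetic only, assuming the length is reduced modulo 256 as described above; it is not a reproduction of the OpenBSD source, and both function names are hypothetical.

```python
def buggy_key_length(password: bytes) -> int:
    """Length as an 8-bit unsigned counter would record it (wraps modulo 256)."""
    return len(password) % 256


def fixed_key_length(password: bytes) -> int:
    """Length capped at the 72-byte limit, modelling the corrected behavior."""
    return min(len(password), 72)


# Below 256 bytes the two agree once the 72-byte cap is applied;
# at 256 bytes and above the 8-bit counter wraps to a small value.
for n in (72, 255, 256, 260):
    pw = b"a" * n
    print(n, buggy_key_length(pw), fixed_key_length(pw))
```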
The $2x and $2y variants originated in June 2011 within the crypt_blowfish library, a PHP-specific bcrypt implementation maintained by Solar Designer. An earlier version of this library incorrectly processed passwords containing 8-bit characters (values 128–255, often non-ASCII like accented letters in international text), leading to hashing errors or unintended collisions in edge cases such as truncated or malformed outputs. To distinguish these problematic hashes without invalidating them, the $2x prefix was assigned to outputs from the buggy code, while $2y marked the patched version that properly handles 8-bit characters; this allowed verification libraries to detect and apply the appropriate legacy logic. Unlike the OpenBSD changes, these prefixes were not adopted in the canonical implementation but remain supported in many libraries to ensure interoperability with legacy PHP-generated hashes.[15]
All bcrypt variants maintain functional equivalence in their core algorithm, producing identical hash outputs for the same password, salt, and cost factor when using corrected implementations. Modern libraries, such as those in Python's Passlib or Java's BCrypt, recognize prefixes from $2a, $2b, $2x, and $2y, automatically applying any necessary fixes (e.g., simulating truncation for $2a verification or 8-bit mishandling for $2x) to validate legacy hashes without recomputation issues. For new applications, the $2b or $2y prefix is recommended to guarantee robust handling of long passwords and diverse character sets, minimizing risks from historical bugs.[19]
Algorithm Details
High-Level Operation
Bcrypt operates as a password-hashing function that derives a fixed-size output from an input password, a random salt, and a configurable cost factor, leveraging the Blowfish block cipher to enforce computational expense. The algorithm begins by parsing the inputs: a plaintext password string, a 128-bit salt, and a cost parameter (typically ranging from 4 to 31), where the cost logarithmically controls the number of iterations performed, specifically 2^cost rounds. This setup ensures the hashing process is deliberately slow to resist brute-force attacks by increasing the time required per attempt.[1]
The process proceeds in several key steps. First, the password is preprocessed by truncating it to 72 bytes if it is longer; shorter passwords are not padded but are reused cyclically as key material during key setup.[2][20] Next, an expensive key schedule (EKS), based on the Blowfish cipher's key setup, modifies the cipher's subkeys and S-boxes using the preprocessed password and salt; this phase, known as eksblowfish-setup, performs 2^cost iterations of subkey mixing to amplify computational cost early in the process. Following setup, the Blowfish cipher, now keyed with the derived state, is used in electronic codebook (ECB) mode to encrypt a fixed 24-byte magic string, "OrpheanBeholderScryDoubt", 64 times in succession, although most of the work has already been done in the EKS phase. The resulting 192-bit ciphertext is then truncated to 184 bits, encoded in a modified base-64 alphabet (using characters from "./0-9A-Za-z" to fit traditional crypt output constraints), and formatted as a string consisting of the prefix "$2a$", the two-digit decimal cost value, a "$" delimiter, the 22-character salt encoding, and the 31-character hash.[1]
This high-level flow can be outlined in pseudocode as follows:
function bcrypt(password, salt, cost):
    key = truncate(password, 72 bytes) // Only the first 72 bytes are used; shorter keys are cycled during setup
    state = eksblowfish_setup(cost, key, salt) // Expensive key schedule with 2^cost iterations
    ciphertext = "OrpheanBeholderScryDoubt" // 24-byte fixed plaintext
    for i = 1 to 64:
        ciphertext = blowfish_encrypt(state, ciphertext) // ECB mode encryption of all three blocks
    hash = encode_base64(truncate(ciphertext, 184 bits)) // Modified base-64, 31 characters
    output = "$2a$" + two_digit_decimal(cost) + "$" + encode_base64(salt, 22 chars) + hash // 60-character result
    return output
The Blowfish cipher serves here as an iterated block cipher, with its 64-bit block size and variable key length enabling the adaptive slowdown without relying on the full encryption for primary security. This design prioritizes the EKS phase for most of the computational burden, making bcrypt suitable for password storage where verification involves recomputing the hash with the stored salt and cost.[1]
Expensive Key Schedule (EKS)
The Expensive Key Schedule (EKS), also known as Eksblowfish, forms the foundational phase of bcrypt's operation by transforming the Blowfish cipher's key schedule into a deliberately time-consuming process that integrates the password and salt to derive a secure set of subkeys. This modification ensures that the entire cipher state, including the 4 KB of S-boxes, must be fully recomputed for each unique password-salt combination, preventing efficient precomputation or rainbow table attacks. By extending the standard Blowfish setup, EKS achieves its security through computational expense rather than secrecy, allowing the cost to scale with advancing hardware capabilities.[1]
The EKS process begins with the standard Blowfish key schedule initialization, where the 18 P-array entries (P1 to P18) and four 256-entry S-box arrays are loaded with fixed constants derived from the fractional digits of π. These 1042 32-bit words (totaling 4168 bytes) form the initial subkey array. The preprocessed password and the 128-bit salt are then used to modify this array via the ExpandKey function. In the initial call, the P-array is XORed with cycling chunks of the password, followed by a series of 64-bit block encryptions. The input to each encryption is the previous ciphertext XORed with one of the two 64-bit halves of the salt, used alternately, and the output of each encryption replaces the next pair of subkeys, propagating through all 18 P-entries and then the 1024 S-box entries (two 32-bit words are replaced per encryption, for a total of 521 encryptions per ExpandKey call). This ensures dependence on both inputs across all subkeys, including the S-boxes, which remain fixed in unmodified Blowfish.[1][20]
To enforce the cost, after the initial ExpandKey call with the salt and password, a loop runs for exactly 2^cost iterations, and each iteration calls ExpandKey twice: once with the password as the key and once with the salt as the key, in both cases with a zero salt argument. In these repeated calls, the XOR step mixes the supplied key (password or salt) into the P-array, and because the salt argument is zero, the encryption chain that follows behaves like the ordinary Blowfish key schedule while still rewriting the entire state. This chained process propagates the influences of the password and salt throughout the array across multiple iterations.[1]
The primary purpose of EKS is to inflate the key setup time, rendering it approximately 2^cost times slower than a conventional Blowfish key schedule while enforcing complete 4 KB S-box reinitialization. This tunable expense, achieved through the iterative nature of the ExpandKey updates, directly contributes to bcrypt's resistance against brute-force and parallelized attacks without compromising the underlying cipher's strength.[1]
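The figures in this section can be combined into a rough count of Blowfish block encryptions performed during setup. The sketch below assumes one initial ExpandKey call plus two calls per cost iteration, with 521 block encryptions per call as stated above; the function names are hypothetical.

```python
P_ENTRIES = 18
SBOX_ENTRIES = 4 * 256
SUBKEY_WORDS = P_ENTRIES + SBOX_ENTRIES         # 1042 32-bit words
ENCRYPTIONS_PER_EXPANDKEY = SUBKEY_WORDS // 2   # two words replaced per encryption


def expandkey_calls(cost: int) -> int:
    """One initial call, then two calls in each of the 2**cost iterations."""
    return 1 + 2 * (1 << cost)


def setup_block_encryptions(cost: int) -> int:
    """Total 64-bit block encryptions spent in the expensive key schedule."""
    return expandkey_calls(cost) * ENCRYPTIONS_PER_EXPANDKEY


for cost in (6, 10, 12):
    print(cost, expandkey_calls(cost), setup_block_encryptions(cost))
```

Compared with the single ExpandKey-sized pass of an ordinary Blowfish key schedule, the count grows by roughly a factor of 2^(cost+1), which is where the tunable expense comes from.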
Key Expansion and Cipher Setup
Following the expensive key schedule, the modified Blowfish cipher state is utilized to produce the final hash through a series of encryptions on a fixed input. The 192-bit constant "OrpheanBeholderScryDoubt"—equivalent to 24 bytes—is encrypted 64 times in Electronic Codebook (ECB) mode using the established cipher state. Each iteration takes the output ciphertext from the previous encryption as input, creating a chained effect that further mixes the derived keys without altering the subkey array itself.[1]
This process can be expressed in pseudocode as follows:
ctext ← "OrpheanBeholderScryDoubt" // 192-bit (24-byte) constant
for i ← 1 to 64:
    ctext ← Encrypt_ECB(state, ctext) // Encrypt entire 192 bits (3 × 64-bit blocks)
output ← ctext // 24 bytes of final ciphertext (later truncated to 23 bytes for encoding)
The resulting 24-byte value serves as the core of the bcrypt hash before encoding and concatenation with the salt and cost factor. Since Blowfish operates on 64-bit blocks, each full encryption of the 192-bit input requires three block encryptions, totaling 192 block operations across the 64 iterations.[1]
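The structure of this loop can be sketched independently of Blowfish by passing the block cipher in as a function. In the example below, encrypt_magic is a hypothetical name, and toy_block is a deliberately insecure stand-in used only to make the sketch runnable; a real implementation would supply Blowfish block encryption under the state produced by the expensive key schedule.

```python
from typing import Callable

MAGIC = b"OrpheanBeholderScryDoubt"  # 24 bytes, i.e. three 64-bit blocks
BLOCK_SIZE = 8                       # Blowfish block size in bytes


def encrypt_magic(encrypt_block: Callable[[bytes], bytes], rounds: int = 64) -> bytes:
    """Apply ECB encryption to the 24-byte constant `rounds` times in a chain."""
    ctext = MAGIC
    for _ in range(rounds):
        # ECB mode: each 8-byte block is encrypted independently.
        ctext = b"".join(
            encrypt_block(ctext[i:i + BLOCK_SIZE])
            for i in range(0, len(ctext), BLOCK_SIZE)
        )
    return ctext


def toy_block(block: bytes) -> bytes:
    """Stand-in for a keyed Blowfish block encryption. NOT a real cipher."""
    return bytes(b ^ 0x5A for b in block)


out = encrypt_magic(toy_block)
print(len(out))
```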
Password and Salt Handling
Bcrypt processes the input password as a sequence of bytes, typically encoded in UTF-8 to handle international characters, converting non-ASCII characters into their corresponding multi-byte representations before further processing.[19] The password is then treated as a null-terminated C-style string and truncated to a maximum of 72 bytes, including the null terminator; any additional bytes are discarded.[2][19]
The salt in bcrypt is a randomly generated 128-bit value, selected uniformly to ensure uniqueness for each password hashing operation and to thwart precomputation attacks like rainbow tables.[1] This salt is incorporated directly into the hashing process and, for output purposes, encoded into 22 characters using a modified base64 alphabet consisting of the characters ./0-9A-Za-z.[1][21]
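The salt encoding can be sketched with the standard library by translating between the standard base64 alphabet and the bcrypt one. This sketch assumes the alphabet ordering "./A-Za-z0-9" and the same bit packing as standard base64 without padding, which is how common implementations are understood to behave; the helper names are hypothetical.

```python
import base64
import os

STD_ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Assumed ordering of the bcrypt alphabet: '.', '/', A-Z, a-z, 0-9.
BCRYPT_ALPHABET = b"./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

_TO_BCRYPT = bytes.maketrans(STD_ALPHABET, BCRYPT_ALPHABET)
_FROM_BCRYPT = bytes.maketrans(BCRYPT_ALPHABET, STD_ALPHABET)


def bcrypt_b64encode(data: bytes) -> str:
    """Encode bytes with the bcrypt alphabet and no '=' padding."""
    return base64.b64encode(data).rstrip(b"=").translate(_TO_BCRYPT).decode("ascii")


def bcrypt_b64decode(text: str) -> bytes:
    """Decode a bcrypt-alphabet string back to bytes."""
    raw = text.encode("ascii").translate(_FROM_BCRYPT)
    raw += b"=" * (-len(raw) % 4)  # restore the padding that was stripped
    return base64.b64decode(raw)


salt = os.urandom(16)              # 128-bit random salt
encoded = bcrypt_b64encode(salt)
print(len(encoded))                # 16 bytes always encode to 22 characters
```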
Edge cases in password handling include empty inputs, which are processed as a single null byte (the terminator for a zero-length string), resulting in a consistent 1-byte input that produces a reproducible hash for a given salt but varies with different salts.[3] Non-ASCII and special characters are preserved through UTF-8 byte encoding without alteration beyond the truncation limit.
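The preprocessing rules described in this section (UTF-8 encoding, a trailing null terminator, and a 72-byte cap) can be expressed as a short function. This is a minimal sketch of those rules as stated here, with bcrypt_key_bytes as a hypothetical name; individual libraries may differ in detail.

```python
def bcrypt_key_bytes(password: str) -> bytes:
    """Return the bytes bcrypt would consume for a password.

    The password is UTF-8 encoded, a null terminator is appended,
    and everything past the first 72 bytes is discarded.
    """
    raw = password.encode("utf-8") + b"\x00"
    return raw[:72]


print(bcrypt_key_bytes(""))            # the empty password becomes one null byte
print(bcrypt_key_bytes("é"))           # non-ASCII characters become multi-byte
print(len(bcrypt_key_bytes("a" * 100)))
```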
The resulting bcrypt hash is formatted as a fixed-length string: $2y$<cost>$<22-character salt><31-character hash>, where $2y$ denotes the algorithm version (with $2a$, $2b$, or $2x$ appearing in other variants), <cost> is a two-digit logarithmic cost factor (e.g., 10 for 2¹⁰ iterations), the salt occupies the next 22 base64 characters, and the final 31 characters represent the base64-encoded 184-bit (23-byte) hash output.[21][1] This structure allows efficient storage and verification while embedding all necessary parameters.
Cost Factor and Iteration Mechanism
The cost factor in bcrypt, also known as the work factor, is a tunable parameter that determines the computational expense of the hashing process by specifying the number of iterations as 2^cost. For instance, a cost of 10 results in 2^10 = 1024 iterations of the key expansion setup.[1] This exponential scaling allows bcrypt to adapt to advancing hardware capabilities, ensuring that hashing remains deliberately slow to thwart brute-force attacks.[1]
The iteration mechanism applies specifically to the key expansion loop in the eksblowfish setup, where the process alternates between the password-derived key and the salt for each of the 2^cost rounds, without changing the work done in any single round. This design choice enables future-proofing by permitting incremental increases in the cost factor over time; for example, defaults of 6 to 8 in the late 1990s and early 2000s have evolved to 12 or higher in the 2020s to maintain security against faster processors.[1][22] The sequential nature of these iterations inherently resists parallelization, making it difficult for attackers to leverage GPUs or ASICs for acceleration, unlike more parallel-friendly algorithms.[1]
In practice, modern implementations recommend a cost of 12, which typically yields a hashing time of around 250 milliseconds on contemporary hardware, balancing security with acceptable login delays.[22][23] During verification, the hash is recomputed from the provided password, the embedded salt, and the stored cost factor, and a mismatch results in authentication failure. When verification succeeds and the stored cost is lower than the current target, many systems rehash the password with the higher cost and store the new hash to enhance long-term protection.[1][24]
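Because the work doubles with each cost increment, one timing measurement is enough to extrapolate to other costs. The sketch below uses the 250 ms at cost 12 figure from above as an example input; the function names are hypothetical, and real timings should be measured on the target hardware.

```python
def estimated_ms(measured_cost: int, measured_ms: float, target_cost: int) -> float:
    """Extrapolate hashing time, assuming time doubles per cost increment."""
    return measured_ms * 2.0 ** (target_cost - measured_cost)


def highest_cost_within(measured_cost: int, measured_ms: float, budget_ms: float) -> int:
    """Largest cost in the 4..31 range whose estimated time fits the budget."""
    candidates = [
        cost for cost in range(4, 32)
        if estimated_ms(measured_cost, measured_ms, cost) <= budget_ms
    ]
    if not candidates:
        raise ValueError("no cost in the 4..31 range fits the time budget")
    return max(candidates)


print(estimated_ms(12, 250.0, 14))
print(highest_cost_within(12, 250.0, 500.0))
```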
Security Analysis
Resistance to Brute-Force Attacks
Bcrypt's primary defense against brute-force attacks lies in its deliberate computational intensity, achieved through an adaptive cost factor that controls the number of iterations in the Blowfish-based key setup. A standard cost factor of 12, recommended for balancing security and usability, typically yields 2 to 5 hashes per second on modern CPUs, a stark contrast to general-purpose hashes like SHA-256, which can exceed 10^9 hashes per second on high-end GPUs. This slowness forces attackers to expend significant resources on each guess, with the cost factor tunable upward to counteract advances in hardware performance, such as those from Moore's Law.[25][26][2]
The inclusion of a unique 128-bit salt per password further bolsters resistance by thwarting rainbow table attacks, as precomputing tables for the full salt space (2^128 possibilities) is computationally infeasible. This per-user salting ensures that even identical passwords produce distinct hashes, eliminating efficiencies from offline dictionary or preimage assaults.[27]
Bcrypt's design also exhibits strong resistance to parallelization on GPUs and ASICs, owing to its irregular memory access patterns during the 4 KB key setup phase, which hinder efficient vectorization and cache exploitation on specialized hardware. Benchmarks from the 2015 Password Hashing Competition confirm no practical side-channel vulnerabilities, such as timing or power analysis attacks, in standard implementations, validating its robustness under scrutiny.[5]
In real-world deployments, bcrypt has demonstrated resilience; for instance, during the 2016 disclosure of the 2012 Dropbox breach affecting 68 million accounts, the subset of passwords hashed with bcrypt resisted cracking despite exposure, unlike weaker SHA-1 hashes in the same dump. The Open Web Application Security Project (OWASP) has endorsed bcrypt for password storage since its 2009 guidelines, citing its adaptive security as a cornerstone for protecting against offline brute-force attempts.[28]
Comparison with Other Hashing Algorithms
Bcrypt represents a significant advancement over earlier cryptographic hash functions like MD5 and SHA-1 for password storage, primarily due to its incorporation of a per-password salt to thwart rainbow table attacks and an adjustable cost factor that enforces computational slowness to resist brute-force attempts.[1] In contrast, MD5 and SHA-1 are fast, general-purpose hashes lacking built-in salting or iteration mechanisms, making them highly vulnerable to offline dictionary and brute-force attacks even with added salts, as attackers can parallelize computations efficiently on modern hardware.[1]
Compared to PBKDF2, defined in RFC 2898, bcrypt employs a modified Blowfish cipher for its key schedule to achieve slowness through a simpler, single-purpose design rather than PBKDF2's reliance on iterated HMAC computations with underlying hashes like SHA-256.[29][1] Both algorithms provide comparable resistance to brute-force attacks when tuned to similar computational costs, but bcrypt's integrated salting and fixed output format reduce the risk of implementation errors, such as improper salt handling or iteration counts, making it arguably easier to deploy securely in practice.[2]
Unlike scrypt, introduced by Colin Percival in 2009, bcrypt does not incorporate memory-hardness, relying instead on CPU-intensive operations that can be more readily accelerated on GPUs and ASICs, thereby exposing it to higher parallelization in attacks.[30] Scrypt addresses this by requiring substantial sequential memory access, enhancing resistance to hardware-optimized brute-force efforts and aiming for better ASIC deterrence through tunable memory parameters.[30]
Argon2, the winner of the 2015 Password Hashing Competition, builds on these concepts with fully tunable parameters for time, memory, and parallelism, offering superior protection against GPU and ASIC attacks compared to bcrypt's fixed 4 KB memory usage.[31] Modern benchmarks demonstrate Argon2's effectiveness, where it can impose roughly twice the computational burden on GPUs relative to equivalently tuned bcrypt instances, due to its data-dependent memory access patterns that hinder efficient parallelization. The NIST SP 800-63B guidelines (2017, revised 2020) endorse PBKDF2 as a baseline for password-based authenticators but accept bcrypt, scrypt, and Argon2 as viable alternatives when configured to require at least 0.5 seconds of processing on verifier hardware.[32]
Limitations and Criticisms
Password Length Restrictions
Bcrypt imposes a maximum password length of 72 bytes, including a null terminator, as defined in its original design. This limit stems from the algorithm's use of a modified Blowfish key schedule, where the password serves as the key material and only the first 72 bytes are processed, matching the 18 32-bit entries of the P-array (18 × 4 = 72 bytes) into which the key is mixed. The restriction traces its roots to the Unix crypt function's 8-character limit, derived from the 64-bit DES key (56 data bits plus 8 parity bits), which bcrypt extends for greater capacity while preserving compatibility with legacy systems.[1][3]
This length cap has notable security implications, especially for long passphrases. Passwords exceeding 72 bytes are silently truncated, causing the loss of entropy from any trailing characters and resulting in identical hashes for inputs sharing the same prefix. Consequently, attackers can concentrate brute-force attacks on potential short prefixes, undermining the added security intended from extended passphrases, such as multi-word combinations that surpass the byte threshold. For instance, a passphrase like "correct horse battery staple" followed by additional words may offer no extra protection beyond its initial 72 bytes.[33] This truncation risk extends to non-password uses; for example, in October 2024, Okta disclosed a vulnerability where bcrypt's limit in generating authentication cache keys from long usernames enabled unauthorized access.[34]
Bcrypt provides no built-in mitigations for this truncation. To address it, applications should ideally restrict passwords to 72 bytes or fewer, or incorporate a server-side pepper to bolster protection against targeted attacks on truncated inputs. Another approach involves pre-hashing longer passwords with a fast, non-salted function like SHA-256 to fold the full length into a fixed-size input for bcrypt, though this requires careful implementation to avoid introducing vulnerabilities. Modern libraries, including PHP versions from 7.4 (with enhanced documentation around 2020), issue warnings about truncation during hashing to alert developers and promote awareness.[35]
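The pre-hashing approach can be sketched as follows. The digest is base64-encoded before being handed to bcrypt so that the input contains no null bytes, which would otherwise terminate the string early in C-style implementations. This is a minimal sketch under those assumptions: prehash_for_bcrypt is a hypothetical name, and the result would still need to be passed to an actual bcrypt library.

```python
import base64
import hashlib


def prehash_for_bcrypt(password: str) -> bytes:
    """Fold a password of any length into 44 ASCII bytes suitable for bcrypt."""
    digest = hashlib.sha256(password.encode("utf-8")).digest()  # 32 raw bytes
    # Base64 keeps the value printable and free of null bytes.
    return base64.b64encode(digest)


long_a = "x" * 72 + " first ending"
long_b = "x" * 72 + " second ending"
print(len(prehash_for_bcrypt(long_a)))
print(prehash_for_bcrypt(long_a) != prehash_for_bcrypt(long_b))
```

With this step, two passphrases that share their first 72 bytes no longer collapse to the same bcrypt input, at the cost of making the stored hashes incompatible with plain bcrypt verification.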
Post-2020 analyses have increasingly emphasized the entropy reduction for multi-word passphrases, critiquing the limit as an outdated constraint that encourages migration to unbounded algorithms like Argon2 for handling diverse password lengths without compromise.
Truncation and Encoding Issues
Bcrypt derives its final hash value from the 192-bit ciphertext produced by encrypting a fixed magic string ("OrpheanBeholderScryDoubt") using the Blowfish block cipher, where the key is derived from the password and salt via the expensive key schedule (EKS). Although the Blowfish key schedule processes up to 448 bits, only the 192-bit ciphertext is retained for the hash output, effectively truncating the result relative to the full internal state generated during encryption. This design choice has been critiqued for potentially narrowing the security margin against certain attacks, such as theoretical collisions, though the probability remains negligible at 2^{-192} for random inputs, and Blowfish's strong diffusion properties ensure that the key schedule's computational cost dominates security.[1][36]
The hash, along with the salt and cost parameter, is then encoded into a string using a custom base-64 alphabet consisting of the 64 characters ./0-9A-Za-z. This variant replaces the standard base-64's + with . to avoid common encoding conflicts, but retains /, which can introduce issues when bcrypt hashes are embedded in URLs, file paths, or configuration strings without proper percent-encoding, as / may be interpreted as a delimiter by parsers. In variant $2a$, this encoding has been noted to enable potential injection vulnerabilities in legacy systems or misconfigured parsers that fail to escape the output adequately, such as when hashes are directly concatenated into SQL queries or HTTP paths. Later variants like $2b$ and $2y$ maintain the same alphabet but address unrelated implementation flaws, preserving compatibility while mitigating other risks.[19][37]
Recent analyses since 2023 have highlighted encoding challenges in web and JSON contexts, where unescaped bcrypt outputs can disrupt API responses or database serialization if integrated into JSON payloads without quoting, particularly in environments enforcing strict URL safety or JSON schema validation. These concerns, while not unique to bcrypt, amplify the importance of context-aware encoding in modern web applications.[38]
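Context-aware encoding of a stored hash can be shown with the Python standard library. The sketch below percent-encodes a bcrypt string for use in a URL component and serializes it as a JSON string value; the sample hash is only a well-formed example of the format.

```python
import json
from urllib.parse import quote, unquote

stored = "$2a$12$R9h/cIPz0gi.URNNX3kh2OPST9/PgBkqquzi.Ss7KIUgO2t0jWMUW"

# safe="" makes quote() escape '/' as well, so it cannot act as a path delimiter.
url_component = quote(stored, safe="")

# json.dumps wraps the value in quotes, so '$', '.', and '/' need no special care.
payload = json.dumps({"hash": stored})

print(url_component)
print(payload)
print(unquote(url_component) == stored)
```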