Snake case
Snake case is a naming convention in computer programming used to format identifiers, such as variable and function names, by writing words in lowercase letters separated by underscores (_), creating a visual effect reminiscent of a snake slithering along the text.[1][2] This style replaces spaces or other separators with underscores to ensure compatibility with programming languages that restrict identifier characters, while maintaining readability by clearly delineating word boundaries.[3]
Widely adopted in languages like Python, Ruby, and Rust, snake case is the preferred convention for non-class identifiers in these ecosystems to promote consistent and legible code.[1] In Python specifically, the official style guide (PEP 8) mandates lowercase names with underscores for functions, variables, and modules, emphasizing that this approach enhances code readability over alternatives like camel case.[4] For example, a function to compute a user's age might be named calculate_user_age, and a constant for pi could use the uppercase variant, known as screaming snake case, as PI_VALUE.[3][1]
Unlike camel case, which capitalizes the first letter of each subsequent word without separators (e.g., calculateUserAge), or Pascal case, which does the same but starts with a capital (e.g., CalculateUserAge), snake case prioritizes all-lowercase uniformity and explicit word separation, making it particularly suitable for scripting and data-oriented languages where underscore support is robust.[5] This convention also extends to configuration files and database column names in various frameworks, underscoring its versatility across software development practices.[2]
Fundamentals
Definition
Snake case is a human-readable naming convention used for compound names or phrases in programming and data contexts, where each word is written in lowercase letters and separated by a single underscore (_), without spaces or other punctuation.[1] This style ensures that multi-word identifiers remain clear and adhere to the syntactic rules of many programming languages, which often prohibit spaces in names.[4]
The primary purpose of snake case is to enhance readability by simulating natural language spacing through underscores, making code and data structures easier to parse at a glance while complying with identifier constraints in languages like Python and Ruby.[4] It promotes consistency in naming, reducing ambiguity in collaborative development environments.[1]
Core rules of snake case include using all lowercase letters for every word, separating words with exactly one underscore, and avoiding leading or trailing underscores unless explicitly required by a specific style guide.[4] This convention applies to various elements such as variables, functions, and filenames, where multi-word names benefit from explicit separation.[4] In contrast, alternatives like camelCase achieve similar readability goals by capitalizing subsequent words without separators.[1]
Characteristics
Snake case employs a strict formatting convention where identifiers consist exclusively of lowercase English letters and digits, with words separated by a single underscore character. No uppercase letters, hyphens, spaces, or other punctuation are permitted in the core structure, ensuring uniformity and readability across implementations. This approach aligns with established style guides in languages like Python, where variable and function names follow lowercase with underscores as needed.[4][6]
Digits are permitted within words but must adhere to word boundaries, meaning they integrate into the preceding or following word rather than serving as separators. For instance, user_id_2 is valid as the digit attaches to the word "id," whereas placements like user_2_id treat "2" as part of a numeric word if contextually appropriate, though separators remain single underscores only. This maintains the convention's clarity without introducing ambiguity in identifier parsing.[4][6]
Edge cases in snake case include the handling of acronyms, which are conventionally rendered in lowercase to preserve the all-lowercase rule, such as http_status rather than HTTP_status. Consecutive underscores are avoided to prevent parsing issues and maintain semantic separation, as double underscores can trigger special behaviors like name mangling in some languages. For readability, names are recommended to be concise yet descriptive.[4][6]
A minor variation allows an initial single underscore to denote private or internal variables in certain conventions, as in _private_var, while strict snake case prohibits leading or trailing underscores in public identifiers. This prefix is not part of the core separation mechanism but serves as a scoping indicator without violating the lowercase and single-underscore rules.[4][6]
History
Origins
The practice of using underscores to separate words in multi-word identifiers, forming what is now called snake case, emerged in the late 1960s as a means to enhance readability in early computer programming. This convention allowed programmers to create descriptive names without relying on spaces or other separators that were not permitted in most language syntaxes at the time.
Key early influences trace to the Unix shell scripting and C programming communities in the 1970s, where underscores were employed for long identifiers to improve code clarity. In Unix development, underscores appeared in variable names within system code to denote compound concepts, such as in routines handling user data or file operations. The C language, developed at Bell Labs during this period, further entrenched the approach; the influential 1978 book The C Programming Language by Brian Kernighan and Dennis Ritchie recommended lowercase letters for variables and the use of underscores to join words, as seen in examples like compound names for functions and data structures.[7]
Although the convention predates formal documentation, one of the earliest style guides to explicitly endorse it was Python's PEP 8, published in 2001, which prescribed lowercase words separated by single underscores for variables, functions, and methods—reflecting practices already common in the Python community and inherited from C-influenced traditions. Influences from earlier languages like Lisp also contributed, where hyphens were favored for similar purposes of readable compound identifiers but shared the goal of clear word separation.[4]
The term "snake case" itself arose informally in online programming discussions, drawing from the visual analogy of underscores linking lowercase words like a snake's undulating body; the earliest documented use was in a 2004 Usenet post in the Ruby community, though one claim traces its invention to 2002 at Intel. It gained traction in the early 2000s among Ruby developers to distinguish the style from uppercase variants like "screaming snake case" used for constants. This naming helped differentiate it from other conventions in growing internet forums and documentation.[2][1][8]
Adoption and Evolution
Snake case gained widespread adoption in programming communities through key milestones in the early 2000s and beyond. In Python, it was officially standardized in PEP 8, the style guide for Python code released on July 5, 2001, which mandates lowercase names separated by underscores for functions and variables to enhance readability.[4] Similarly, in Ruby, snake case emerged as a community standard by the mid-2000s, particularly with the rise of Ruby on Rails in 2004, where it became the preferred convention for methods, variables, and file names as documented in community style guides.[9] In Rust, snake case has been the preferred style for variables and functions since the language's 1.0 stable release in 2015, as outlined in the Rust API Guidelines and earlier RFC 430, promoting consistency in codebases.[10]
The evolution of snake case standards reflects a transition from ad-hoc usage in early C and Unix environments to formalized guidelines in modern languages and projects. In the Unix tradition, underscores were used informally for multi-word identifiers, but the Linux kernel's coding style document, originally posted by Linus Torvalds in 2002, recommends using underscores to separate words in global variables, functions, and struct members while frowning upon mixed case, influencing open-source practices.[11] This shifted toward explicit adoption in newer ecosystems, where style guides enforced snake case to improve code maintainability, as seen in the influence of Python and Ruby communities on broader open-source norms.
Over time, snake case rose in prominence during the early 2000s due to the needs of web development for readable, human-friendly APIs, particularly in dynamic languages like Python and Ruby that powered early web frameworks. By the 2010s, it integrated into data handling standards, becoming common for JSON object keys in APIs influenced by backend languages like Python and Ruby, though no universal JSON specification mandates it, and for SQL column names in relational databases via ORMs such as SQLAlchemy, which default to snake case for portability across case-insensitive systems.[12]
Its global spread extended to non-English contexts through internationalization efforts in software, where ASCII-based snake case facilitates word separation in multilingual codebases, even when comments or strings use non-Latin scripts. For instance, in databases supporting Unicode, underscores enable clear delineation of terms in Cyrillic or other scripts, as adopted in PostgreSQL configurations for internationalized applications, aiding developers from diverse linguistic backgrounds without relying on case sensitivity.
Usage
In Programming Languages
In Python, snake case is the mandated convention for naming variables, functions, and methods, as specified in PEP 8, the official style guide for Python code, which recommends lowercase letters separated by underscores to enhance readability.[4] This standard is enforced through linters such as Pylint, which flags deviations from snake case via the invalid-name check, ensuring compliance during development.[13]
Ruby adopts snake case as the standard for methods, variables, and symbols, aligning with the Ruby Style Guide to maintain consistency across codebases.[9] In the Ruby on Rails framework, these conventions extend to database-related elements, where table and column names use snake case to facilitate clear mapping between models and schemas, while class names follow PascalCase.
Rust prefers snake case for variables, functions, modules, and other identifiers, as outlined in the Rust API Guidelines, which emphasize its use for non-type-level constructs to promote readability in systems programming contexts.[10] The Cargo package manager reinforces this by converting kebab-case package names (used in repository contexts) to snake case for crate identifiers in code, ensuring seamless integration without naming conflicts.
In other languages, snake case is optional but commonly applied in specific ecosystems; for instance, it appears in JavaScript for Node.js module filenames and export names in some projects to match backend conventions, though camelCase remains prevalent overall.[14] Similarly, SQL databases frequently use snake case for table and column names to preserve word separation in case-insensitive environments, as recommended in SQL style guides.[15] In C++, snake case is standard for variables and functions in the C++ Standard Library and adopted in libraries like Google's C++ codebase for consistency with legacy C influences.[16]
Enforcement of snake case occurs through integrated development environment (IDE) features and build tools; for example, Visual Studio Code supports auto-formatting and linting extensions like Pylint for Python or rust-analyzer for Rust, which highlight or correct non-compliant names in real-time.[17] Build systems such as Cargo in Rust or ESLint in JavaScript can flag violations during compilation or testing, often converting or rejecting code that deviates from project conventions.
In Unix-like file systems, such as those used in Linux and other POSIX-compliant environments, snake_case is a prevalent convention for naming files and directories to separate multiple words without introducing spaces, which can complicate command-line operations and scripting.[18] Underscores are explicitly permitted as part of valid filename characters alongside lowercase letters, numbers, dots, and hyphens, promoting consistency and readability in multi-word names like my_script.py.[19] This approach contrasts with common Windows practices that favor camelCase or embedded spaces, but snake_case enhances cross-platform interoperability by minimizing case-sensitivity conflicts—since Unix systems treat uppercase and lowercase as distinct—while avoiding the need for URL encoding or quoting in shared environments.[18]
In data serialization formats, snake_case is widely adopted for keys and headers to maintain human-readable structures without relying on case variations. For JSON objects, although the RFC 8259 standard imposes no specific naming requirements beyond keys being unique strings, snake_case is a common choice in practice for attributes like {"user_name": "value"}, as it aligns with readability preferences in ecosystems like Python and Ruby.[20] Similarly, CSV files often employ snake_case for column headers, such as customer_order_id, to ensure compatibility with tools that parse delimited text and to prevent parsing errors from spaces or special characters.[12] In XML, while lowerCamelCase predominates for element names per community guidelines, snake_case appears in attribute values or custom schemas where underscore separation aids clarity in data exchange.[21]
Databases, particularly relational systems like PostgreSQL, endorse snake_case for table and column identifiers to leverage the system's default lowercase folding of unquoted names, thereby simplifying queries without mandatory quoting.[22] For instance, conventions recommend formats like customer_order_id for columns, which enhances query legibility and portability across SQL dialects that handle underscores uniformly.[23] Configuration files in formats such as YAML and TOML further utilize snake_case extensively for key-value pairs, as seen in examples like database_url: "..." in .env or YAML setups, where it supports nested structures without ambiguity.[24] The TOML specification illustrates this with dotted keys like temp_targets.cpu, reinforcing snake_case's role in declarative configs.[24]
Overall, snake_case's use in these domains promotes interoperability by standardizing on safe, URL-friendly characters that bypass encoding challenges in web transfers or cross-system migrations, unlike space-separated or dashed alternatives that may require additional processing.[25]
Comparisons
With PascalCase
Snake case employs all lowercase letters with underscores as separators between words, forming identifiers like user_name or process_data. In contrast, PascalCase—also known as UpperCamelCase—capitalizes the initial letter of each word while omitting any separators, resulting in forms such as UserName or ProcessData. This structural divergence arises from differing conventions in programming ecosystems, where snake case aligns with delimiter-based traditions and PascalCase with capitalization-only approaches.[4][26]
Readability between the two conventions involves trade-offs influenced by visual parsing and context. Empirical studies indicate that PascalCase enhances recognition accuracy in source code comprehension tasks, with participants achieving higher correctness rates when trained on it compared to underscore-separated styles, though it may require slightly more time for processing. Snake case, however, offers advantages in scenarios demanding quick word boundary detection, such as reviewing long identifiers under constrained viewing conditions, due to the explicit underscore delimiter mimicking natural spacing. PascalCase's compactness reduces visual clutter in dense codebases, particularly for object-oriented paradigms where it signals type hierarchies.[27][28]
Use cases for snake case and PascalCase diverge along language-specific guidelines and paradigms. Snake case predominates for variables and functions in scripting languages like Python and Ruby, promoting consistency and readability in procedural or functional code. Conversely, PascalCase is standard for classes, interfaces, and public methods in object-oriented languages such as Java and C#, where it visually distinguishes types from instances and emphasizes modularity in large-scale applications.[4][26]
Converting between snake case and PascalCase requires parsing tools to handle delimiters and capitalization. In shell environments, utilities like sed with regular expressions can transform snake case by uppercasing letters following underscores and removing the separators, as in sed -r 's/(^|_)(.)/\U\2/g'. Python's re module facilitates similar conversions through substitution patterns that capitalize post-underscore characters, enabling programmatic refactoring in polyglot projects. These tools underscore the non-trivial mapping, as splitting on capitals in PascalCase demands different logic than underscore division.[29][30]
With Kebab-case
Snake case and kebab-case are both lowercase naming conventions that separate words with punctuation, but they differ fundamentally in their choice of separator: snake case employs underscores (_), as in user_name, while kebab-case utilizes hyphens (-), as in user-name.[31][5]
This structural distinction leads to distinct contextual preferences in usage. Snake case is favored for code identifiers in programming languages, where hyphens are typically invalid characters or interpreted as operators like subtraction, making underscores a safer and more compatible choice for variable, function, and constant names.[31][32] In contrast, kebab-case predominates in web development contexts such as URLs, CSS class names, and HTML attributes, where hyphens align with standards for readability and functionality; for example, CSS classes like my-class-name and HTML like <div class="user-profile"> rely on hyphens to denote multi-word selectors without issues.[31][33] In frontend frameworks like React, kebab-case is standard for CSS class strings passed to the className attribute, enhancing consistency with stylesheet conventions.[34][35]
Regarding readability and compatibility, both conventions promote clear word separation, but their behaviors vary across environments. Underscores in snake case maintain integrity in URLs without requiring special encoding and are less prone to misinterpretation in command-line shells, where hyphens can be confused with options (e.g., ./my-script.sh works seamlessly, but ./my-script.sh may need quoting).[36][37] Hyphens in kebab-case, however, are more web-friendly, as search engines like Google treat them as word boundaries for better SEO and semantic parsing, whereas underscores are often viewed as connectors within a single term.[36][38]
Adoption patterns reflect these preferences: snake case is prevalent in backend languages such as Python and Ruby for internal code elements, while kebab-case is widely adopted in frontend and web protocols, including CSS specifications and URL structures, to optimize for browser and search engine compatibility.[5]
Converting between the two is straightforward via string replacement—substituting underscores with hyphens while ensuring all characters remain lowercase—but it can introduce semantic shifts, such as hyphens enabling better word boundary recognition in search engines during URL processing.[39][36]
Examples
Basic Illustrations
Snake case, a naming convention that joins words with underscores and uses lowercase letters, is illustrated through simple identifiers that separate multiple terms for readability.[1] For variables, common examples include user_name, total_count, and http_response_code, where each word is distinctly separated without capitalization.[3][40]
Function names in snake case follow a similar pattern, such as calculate_average, fetch_user_data, and validate_email_address, emphasizing descriptive, multi-word phrasing.[1] Constants, however, often employ a variant known as screaming snake case, rendered in all uppercase letters with underscores, like MAX_CONNECTIONS and PI_VALUE, to denote immutability.[4]
To highlight validity, snake case requires underscores for separation, making my_variable a correct form, whereas my-variable is invalid due to its use of hyphens, which aligns with a different convention.[1] For more complex multi-word identifiers, an example is inventory_management_system, demonstrating how longer phrases maintain clarity through consistent underscoring without excessive length.[5]
Practical Applications
In programming, snake case is widely applied in Python for defining functions and variables to enhance readability. For instance, a typical function might be structured as follows:
python
def get_user_profile(user_id: int) -> dict:
# Implementation to fetch and return user data
pass
def get_user_profile(user_id: int) -> dict:
# Implementation to fetch and return user data
pass
This convention aligns with the Python Enhancement Proposal 8 (PEP 8), which specifies lowercase names separated by underscores for functions and variables.[4]
In file systems, particularly Unix-like environments, snake case facilitates consistent and searchable naming for directories and files, avoiding issues with case sensitivity and special characters. Common examples include config/database_settings.py for configuration files and assets/images/product_photo.jpg for media resources. This practice is endorsed in style guides for lowercase filenames with underscores to simplify searches and maintain portability across systems.[41][42]
For data interchange formats like JSON, snake case is frequently used in field names to match backend conventions in languages such as Python and Ruby. An example payload might appear as:
json
{
"first_name": "John",
"last_name": "Doe",
"order_total": 99.99
}
{
"first_name": "John",
"last_name": "Doe",
"order_total": 99.99
}
This approach ensures compatibility with database schemas and serialization tools that default to underscore-separated keys.[43]
In RESTful APIs, snake case is common for query parameters to promote clarity and consistency, especially in services built with frameworks like Django or those following Ruby on Rails patterns. For example, an endpoint such as /api/v1/user-profile?user_id=123 uses user_id as the parameter, mirroring the snake case in underlying models. Stripe's API exemplifies this by employing snake case for parameters like email and starting_after in customer queries.[43][44]
At a project scale, snake case is often adopted for backend components in web applications, such as Django models with fields like user_profile, while frontend JavaScript may use camelCase for variables to align with language idioms. This hybrid usage requires serialization tools to convert between conventions, ensuring seamless data flow without altering core logic.[43]