nroff

nroff is a text-formatting program developed for the Unix operating system and designed to produce output for typewriter-like devices such as terminals and line printers. It processes input files containing markup commands to generate formatted ASCII text, enabling the creation of structured documents such as manuals and reports without graphical capabilities. As part of the broader roff typesetting system, nroff is the companion to troff, which handles output for phototypesetters, so the same source files can be adapted to different output media.

The program originated at Bell Labs in the early 1970s, when Joseph F. Ossanna implemented it in PDP-11 assembly language as a successor to the earlier roff formatter, itself descended from 1960s formatters on the CTSS and Multics systems. Ossanna rewrote nroff in C in 1975, improving its portability and integration into Unix. Named "new roff," it built upon the RUNOFF program by Jerome Saltzer, incorporating commands inspired by traditional typesetting practices to automate document preparation. Over time, nroff has been complemented by preprocessors such as tbl for tables and eqn for equations, and by macro packages such as the man macros for formatting Unix manual pages.

nroff remains a foundational tool in Unix-like systems, particularly for generating terminal-viewable documentation, and continues to influence modern implementations such as GNU groff, which supports additional output formats including PostScript and PDF. Its lightweight design makes it well suited to resource-constrained environments, and it retains a role in command-line text processing despite the rise of graphical alternatives such as word processors.

Overview

Purpose and Functionality

nroff, short for "new roff," is a text-formatting program that processes ASCII text files containing markup instructions to generate output for fixed-width, typewriter-like devices such as terminals and line printers. Its core purpose is to produce readable, paginated documents that resemble typewritten output, focusing on simple formatting rather than advanced typesetting features.

The primary functionality of nroff involves collecting words from input lines, filling output lines, and justifying them by adjusting inter-word spaces to align with the right margin; it also supports left, right, and centered alignment. It handles pagination through automatic page breaks based on the specified page length and vertical spacing, and it provides essential typographic elements such as hyphenation and font switching between roman, bold, and italic styles, all adapted to non-proportional spacing.

Designed in the 1970s for the low-resolution output devices, such as teletypes and line printers, that dominated early Unix environments, nroff keeps formatted text legible on hardware with limited graphical capabilities. As a complement to troff, its counterpart for phototypesetters, nroff lets the same input files yield terminal-friendly ASCII output without modification.
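The fill-and-adjust behavior described above can be illustrated with a minimal Python sketch. It is not nroff's actual algorithm: the function name `fill_and_adjust` and the left-to-right distribution of extra spaces are assumptions made for illustration, and hyphenation is ignored.

```python
def fill_and_adjust(text: str, width: int = 40) -> list[str]:
    """Fill words into lines of at most `width` columns, then widen
    inter-word gaps so every line except the last reaches the margin."""
    # Filling: collect words until the next one would overflow the line.
    filled, current = [], []
    for word in text.split():
        if current and len(" ".join(current + [word])) > width:
            filled.append(current)
            current = []
        current.append(word)
    if current:
        filled.append(current)

    # Adjusting: distribute the leftover columns over the gaps.
    out = []
    for i, words in enumerate(filled):
        is_last = i == len(filled) - 1
        if is_last or len(words) == 1:
            out.append(" ".join(words))  # last line stays ragged
            continue
        gaps = len(words) - 1
        extra = width - sum(len(w) for w in words) - gaps
        base, rem = divmod(extra, gaps)
        parts = [
            w + " " * (1 + base + (1 if j < rem else 0))
            for j, w in enumerate(words[:-1])
        ]
        parts.append(words[-1])
        out.append("".join(parts))
    return out


if __name__ == "__main__":
    for line in fill_and_adjust("the quick brown fox jumps over the lazy dog", 20):
        print(repr(line))
```

Because every character occupies one fixed-width cell, adjustment reduces to inserting whole extra spaces, which is why justified nroff output can look uneven compared with typeset text.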

Role in Unix Ecosystem

nroff serves as a foundational component in the Unix ecosystem, particularly in the formatting and display of manual pages (man pages) for terminal-based viewing. The man command, a standard utility for accessing documentation, relies on nroff (or its modern implementation via groff) to process the roff markup used in man pages into formatted text suitable for typewriter-like devices such as terminals. This integration keeps essential system documentation accessible in text-only environments without requiring graphical interfaces.

In line with the Unix philosophy of designing simple, modular tools that handle one task well and integrate via pipes and redirection, nroff works as a filter: its input and output are streams that can be chained with other utilities. It is frequently invoked indirectly through scripts, the man command, or pipelines, which lets it fit into broader workflows for document preparation and text manipulation.

nroff maintains a persistent role in contemporary Unix-like systems, including Linux distributions and BSD variants, where it endures alongside graphical alternatives owing to its computational efficiency, minimal resource footprint, and wide availability. Its plain ASCII output makes it well suited to non-graphical contexts, such as email, system logs, remote terminals, and resource-constrained setups like embedded or legacy environments, sustaining its utility as of 2025. Preprocessors such as tbl extend nroff's basic capabilities with table formatting within these workflows.

History and Development

Origins and Early Influences

The origins of nroff trace back to the early text-formatting programs developed in the 1960s for time-sharing systems. The foundational influence was RUNOFF, created by J. H. Saltzer in 1964 for the Compatible Time-Sharing System (CTSS) on the IBM 7094 at MIT. Written in assembler, RUNOFF served as a simple formatter for documents, enabling basic operations such as right-justification, line-length control, and page headers through control words prefixed by a period (e.g., .adjust for justified text). It was designed for producing manuscript-style output on line printers and was among the earliest computerized approaches to structured text processing.

This concept evolved through ports and adaptations at Bell Labs. In 1969, Douglas McIlroy reimplemented RUNOFF in BCPL for the GECOS system on the GE 645, introducing enhancements such as no-fill mode (.nf) and improved hyphenation. By 1971, McIlroy's version was ported to Multics (itself a descendant of CTSS) and renamed "roff" by Robert Morris to distinguish it from the original CTSS program; this roff was then transliterated into PDP-7 assembly for the nascent Unix operating system. The early roff for Unix, integrated into the First Edition in 1971, remained rudimentary, supporting only fixed-width output for devices such as the IBM 2741 or Teletype Model 37, with no capabilities for tables, equations, or proportional spacing.

nroff emerged as an advancement of this lineage in the early 1970s at Bell Labs, primarily through the efforts of Joe Ossanna. As "new roff," nroff addressed some of roff's limitations by providing a more programmable formatter optimized for typewriter-like terminals and line printers, while retaining the core request-based markup. Ossanna's key contribution was formalizing the roff processing pipeline, which decoupled generic input markup (requests and macros) from device-specific output generation, allowing intermediate representations to be routed to diverse renderers. This architecture laid the groundwork for extensible text processing in Unix.

Key Milestones and Implementations

nroff was first released on June 12, 1972, as part of Version 2 Unix at Bell Labs, where Ossanna implemented it in PDP-11 assembly language to provide terminal-formatted text processing. In 1975, nroff was enhanced for Version 6 Unix, which brought improved pagination features and expanded support for output devices. In the same year, Ossanna rewrote nroff in C, improving its portability.

After Joe Ossanna's death in 1977, Brian Kernighan refined and completed the C implementation to accommodate phototypesetter hardware, work that eventually led to the creation of ditroff, the device-independent troff. In the 1980s, nroff became a standard component of prominent Unix distributions, including Berkeley Software Distribution (BSD) releases and AT&T's System V, facilitating widespread adoption in academic and commercial settings. Additionally, a simplified nroff-like formatter written in Ratfor appeared in the 1976 book Software Tools by Brian W. Kernighan and P. J. Plauger, serving as an educational example for building portable text-processing utilities.

Technical Specifications

Syntax and Requests

nroff employs a line-based markup language known as roff, which consists of plain text interspersed with control directives that manage formatting. The core elements are requests, which are commands beginning with a dot (.) at the start of a line, and escape sequences, which start with a backslash (\) and are embedded within text lines. Requests control structural and layout aspects, while escape sequences handle inline adjustments such as font changes. The syntax relies on sequential line processing without nested tags, allowing simple embedding of directives in document input.

Document structure is usually expressed through macros supplied by macro packages rather than through primitive requests: .PP initiates a new paragraph with a line break and optional indentation, and .SH formats a section heading followed by a break. For text flow control, primitive requests manage filling, adjusting, and hyphenation: .fi enables filling, where words are collected and fitted into lines up to the right margin; .na disables adjusting, leaving text left-aligned without widening inter-word spaces; and .hy enables automatic hyphenation (.nh disables it). Text lines are processed in fill mode by default unless this is changed. Hyphenation can also be influenced by escape sequences: \% at the start of a word inhibits its hyphenation and within a word marks a permissible hyphenation point, while groff's \: permits a line break without printing a hyphen.

Escape sequences provide finer control, such as \fR to switch to the roman font or \fB for bold, though nroff's capabilities are constrained compared to troff. In nroff, which targets terminal output, font changes are simplified: bold is typically rendered by overstriking (printing a character more than once in the same position), and italics are rendered by underlining. For instance, \fBtext\fP produces bold-like output through overstriking on monospaced terminals, while \fItext\fP results in underlined text.

This interpretation of the shared roff language preserves compatibility while adapting output to line-printer and terminal limitations, ignoring proportional spacing and multiple font families. Macro packages such as -ms and -man extend the basic requests and escapes by defining higher-level commands for common structures such as paragraphs and lists, building upon the primitive syntax without altering the underlying parsing.
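The overstrike and underline conventions described above can be sketched in Python. This is a toy renderer, not nroff's implementation: it assumes only the single-letter font escapes \fB, \fI, \fR, and \fP, and the names `render` and `plain` are hypothetical.

```python
import re

BS = "\b"  # backspace: moves the print position back one column


def render(markup: str) -> str:
    """Render \\fB as overstruck bold and \\fI as underlining."""
    out = []
    font, previous = "R", "R"
    i = 0
    while i < len(markup):
        # A font escape is a backslash, 'f', and a one-letter font name.
        if markup.startswith("\\f", i) and i + 2 < len(markup):
            name = markup[i + 2]
            if name == "P":               # \fP: return to the previous font
                font, previous = previous, font
            else:
                font, previous = name, font
            i += 3
            continue
        ch = markup[i]
        if font == "B" and not ch.isspace():
            out.append(ch + BS + ch)      # bold: print the character twice
        elif font == "I" and not ch.isspace():
            out.append("_" + BS + ch)     # italic: underscore, then the character
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def plain(rendered: str) -> str:
    """Strip overstrikes, roughly what `col -b` does for such output."""
    return re.sub(".\b", "", rendered)


if __name__ == "__main__":
    print(repr(render("\\fBab\\fP c")))
```

Pagers and printers interpret the character-backspace-character pattern as emphasis, and filtering the pairs out recovers the unadorned text.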

Input Processing and Output Generation

nroff processes its input by reading the document line by line, where each line is either a control line beginning with a period (.), known as a request, which modifies formatting parameters, or ordinary text that may contain escape sequences prefixed by a backslash (\) for inline adjustments. As it parses the input, nroff interprets these elements to construct an internal vertical list representing the document's structure, comprising text blocks, vertical spacing (via requests such as .sp), and other vertical elements such as page breaks. This vertical assembly allows content to be organized logically before final rendering, so that pagination and layout decisions occur in top-to-bottom order.

Once the vertical list is built, nroff fills lines with text according to the current line length and adjustment settings (e.g., via .ad), breaking words as needed and applying horizontal motions from escapes such as \h. The resulting lines are then output as an ASCII stream tailored to the target device, such as a terminal or line printer, specified via the -T option (e.g., -Tascii for generic output). Pagination is managed through requests such as .bp, which forces an immediate page break by advancing to the next page while respecting the page length set by .pl. The output stream incorporates device-specific commands to emulate formatting, but nroff operates at a fixed resolution of 10 characters per inch horizontally and 6 lines per inch vertically, which limits its precision compared to more advanced formatters.

To adapt output for diverse devices, nroff relies on terminal tables, ASCII files in directories such as /usr/lib/troff/term or /usr/share/lib/nterm, that define device characteristics, including resolution units (in 1/240 inch) and control sequences for motions. These tables specify strings for upward (\u, half an em in troff, half a line in nroff), downward (\d), rightward, and leftward movements, enabling features such as superscripts and subscripts on terminals such as the VT100, which use ANSI codes for cursor control (e.g., ESC [A for up). For example, a terminal table might set the "up" motion to the sequence that moves the cursor up one line, matching the device's capabilities. nroff simulates bold using overstriking (reprinting characters) and italics using underlining (e.g., via the .ul request), adapting output to terminal limitations; troff supports these simulations but primarily uses actual font changes for typesetter output.
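The resolution figures above imply a simple piece of arithmetic: at 240 basic units per inch, one character cell is 24 units wide and one line is 40 units tall. The Python sketch below works this through. The 6.5-inch line length and 11-inch page length are classic roff defaults used here as assumed example values, and the helper names are hypothetical.

```python
UNITS_PER_INCH = 240   # basic units per inch, as used by the terminal tables
CHARS_PER_INCH = 10    # horizontal resolution of typewriter-like output
LINES_PER_INCH = 6     # vertical resolution

H_STEP = UNITS_PER_INCH // CHARS_PER_INCH   # 24 units per character cell
V_STEP = UNITS_PER_INCH // LINES_PER_INCH   # 40 units per output line


def to_units(inches: float) -> int:
    """Convert a length in inches to basic units."""
    return round(inches * UNITS_PER_INCH)


def columns(inches: float) -> int:
    """Whole character cells that fit in a horizontal length."""
    return to_units(inches) // H_STEP


def lines(inches: float) -> int:
    """Whole output lines that fit in a vertical length."""
    return to_units(inches) // V_STEP


if __name__ == "__main__":
    print(columns(6.5), lines(11))   # line length and page length in cells
```

A motion smaller than one cell, such as 0.05 inch horizontally, rounds down to zero columns in this model, which illustrates why fine positioning requested for troff is lost on terminal devices.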

Relationship to troff

Shared Architecture

nroff and troff share a unified roff language as their input format, employing identical syntax for requests (commands prefixed with a dot, such as .sp for vertical space) and escape sequences (backslash-prefixed directives such as \f for font changes). This commonality ensures high portability: a single roff source file can be processed by either formatter without modification, producing output adapted to terminal or typesetter devices.

The architectural pipeline for both formatters follows a standard sequence. Input text is first routed through optional preprocessors (such as tbl for tables or eqn for equations) that expand specialized markup into standard roff requests. It is then fed into the core formatter (nroff or troff) for interpretation and layout, and finally directed to postprocessors tailored to output devices (e.g., terminal drivers for nroff or typesetter drivers for troff). This modular pipeline relies on Unix pipes, enabling flexible composition of tools while maintaining consistent handling of the roff input stream across both systems.

Core mechanisms for output control, including diversions and traps, are implemented identically in nroff and troff to support dynamic document assembly. The .di request starts a diversion, capturing formatted output into a buffer for later reuse, for example in generating headers or footers. Trap requests such as .wh (for vertical page positions) and .it (for input-line counts) automatically invoke macros at predefined page locations or intervals, enabling automated pagination and conditional formatting.

nroff and troff were developed by Joseph Ossanna at Bell Labs in the early 1970s, with troff building directly on nroff's simpler base to extend its capabilities to phototypesetting. This shared foundation persisted in early portable implementations, such as Brian Kernighan's 1979 ditroff, where substantial code for language interpretation and processing logic was common to both.

Key Differences in Capabilities

nroff and troff share a common input syntax based on markup requests, but diverge significantly in their output capabilities and supported features. While nroff targets low-resolution ASCII output suitable for terminals and line printers, troff is designed for high-resolution typesetting on devices such as the Graphic Systems CAT phototypesetter. This distinction limits nroff to fixed-width, monospaced text rendering, whereas troff enables precise control over character placement and visual elements for professional printing.

In terms of typographic features, nroff lacks support for elements such as ligatures, kerning, and proportional fonts, which troff uses to achieve refined spacing and readability in variable-width typefaces. For mathematical equations and diagrams, nroff falls back on basic ASCII approximations, since it cannot perform the complex positioning and scaling that troff handles via preprocessors such as eqn. Additionally, nroff ignores requests that have no meaning on its devices, such as .ps for point-size changes, and uses uniform character sizes to suit its output constraints.

In terms of performance, nroff operates with lower resource demands, as it bypasses the computational overhead of high-resolution rendering, making it well suited to quick previews. Following Kernighan's 1979 rewrite of troff into a device-independent form (ditroff), troff continued to evolve for emerging hardware such as Imagen laser printers introduced around 1982, adding support for laser-based output and enhanced graphics. In contrast, nroff maintained its focus on ASCII-centric processing, which lets it accept troff input by degrading advanced elements, such as fonts or fine motions, into terminal equivalents, at the cost of visual sophistication.

Implementations and Variants

Original AT&T Versions

An early implementation of nroff shipped with Version 3 Unix, released in February 1973, as a PDP-11 assembly program designed for basic text formatting on line printers and terminals, including support for simple macros and output limited to 132 columns. This version built on the earlier roff formatter from 1971, adapting it for typewriter-like output while maintaining compatibility with early Unix text-processing needs. Ossanna rewrote nroff in C in 1975, improving its portability.

Following Ossanna's death in 1977, further modifications were made. In Version 7 Unix, released in January 1979, these included improved error handling through better diagnostic output and enhanced terminal motion controls for more efficient cursor positioning. These changes improved portability across hardware and laid the groundwork for device-independent processing, allowing nroff to generate formatted text for a wider range of output devices without major recompilation.

During the 1980s, nroff became a standardized component of AT&T's System III (introduced in 1981) and System V Unix distributions, where it was bundled with the man macro package (originally designed by McIlroy for Version 7 and later formalized for consistent manual-page formatting) and licensed commercially to third-party vendors for inclusion in their Unix systems. As a proprietary tool under AT&T's control, nroff remained exclusive to licensed Unix implementations until BSD derivatives began replacing AT&T code in the late 1980s. Its last major update occurred around 1989 in System V Release 4, which prioritized POSIX.1 compliance, portability, and standards adherence over new features.

Modern Free Software Implementations

The GNU roff (groff) project, initiated in 1989 by James Clark, is a major open-source reimplementation of the original nroff and troff systems, with the first release (version 0.3.1) occurring in June 1990. Development continued under contributors including Bernd Warken, who focused on enhancements such as improved compatibility and documentation. Groff provides a high degree of compatibility with AT&T troff while extending its functionality, including support for UTF-8 encoding to handle international characters and postprocessors such as gropdf for generating PDF output from groff's intermediate output. As the de facto standard for roff processing on Linux distributions in 2025, groff version 1.23.0, released in July 2023, includes X11 drivers for graphical output and optimizations for handling large documents efficiently, and it is commonly installed via package managers such as apt (e.g., apt install groff).

The Heirloom Documentation Tools, developed in the mid-2000s, derive directly from the System V Release 4 troff and nroff source code released by Sun Microsystems in 2005 as part of the OpenSolaris open-source initiative. This suite emphasizes legacy compatibility, particularly for formatting manual pages and other documents on systems such as Solaris and its derivatives, while providing utilities such as eqn, neqn, and tbl for consistent output to terminals and printers. It adheres closely to historical behavior, avoiding many of the modern extensions found in groff, to ensure seamless operation with older software environments.

Other notable free software implementations include the Plan 9 troff/nroff system, where nroff operates as a mode of the unified troff binary, invoked via a shell script with the -N flag. Additionally, cawf, a lightweight C-based clone of nroff developed by Vic Abell in the 1990s at Purdue University, serves as the primary formatter in the MINIX operating system, approximating nroff's man and ms macros for resource-constrained embedded and educational environments while rejecting unknown requests for stricter error handling.

Another implementation is mandoc, initiated in 2008 by Kristaps Dzonsons, which provides a fast, standards-compliant parser for man and mdoc macros with support for a subset of roff for terminal output. It serves as the default manual-page formatter in several BSD distributions as of 2025.

Usage and Applications

The nroff command is invoked from the command line with the basic syntax nroff [options] [files]. It processes the specified input files in sequence, or reads from standard input if no files are given, and writes formatted output to standard output by default. This design allows nroff to function as a filter in pipelines, handling multiple files by concatenating their processed results into a single continuous output stream.

Key options control formatting and device-specific behavior. The -T option specifies the target output device or terminal type, such as -Tascii for standard ASCII text suitable for terminals or line printers, ensuring appropriate character mappings and spacing. The -ms option loads the ms macro package, which applies predefined formatting rules for manuscripts such as articles and reports. The -e option produces equally spaced words in adjusted lines, using the full resolution of the terminal. The -i option directs nroff to continue reading from standard input after exhausting the listed files, which helps when combining files with piped input. For interactive or paginated control, the -sN option causes nroff to pause every N pages (default N=1), halting output to allow actions such as loading or changing paper before resuming upon receipt of a newline from the terminal.

In practice, nroff is frequently piped to other utilities for display or printing, such as nroff file.n | less to view formatted text page by page or nroff file.n | lpr to send it directly to a printer. The -man option invokes the man macro package for rendering manual pages.
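The invocation described above can be wrapped in a small Python helper. This is a minimal sketch: the function names are hypothetical, only the options named in the text are modeled, and individual nroff implementations may ignore or reject some of them, so the installed manual page remains the authority.

```python
import shutil
import subprocess
from typing import Optional, Sequence


def nroff_argv(
    files: Sequence[str],
    device: Optional[str] = None,       # e.g. "ascii" for -Tascii
    macros: Optional[str] = None,       # e.g. "s" for -ms, "an" for -man
    equal_spacing: bool = False,        # -e
    stdin_after_files: bool = False,    # -i
    stop_every: Optional[int] = None,   # -sN
) -> list[str]:
    """Build an nroff argument vector from the options described in the text."""
    argv = ["nroff"]
    if device:
        argv.append(f"-T{device}")
    if macros:
        argv.append(f"-m{macros}")
    if equal_spacing:
        argv.append("-e")
    if stdin_after_files:
        argv.append("-i")
    if stop_every is not None:
        argv.append(f"-s{stop_every}")
    argv.extend(files)
    return argv


def run_nroff(argv: Sequence[str]) -> Optional[str]:
    """Run the command and return its output, or None if nroff is not installed."""
    if shutil.which(argv[0]) is None:
        return None
    done = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return done.stdout


if __name__ == "__main__":
    print(nroff_argv(["file.n"], device="ascii", macros="s"))
```

Keeping argument construction separate from execution makes the command line easy to inspect or log before anything is run.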

Integration with Macro Packages

nroff integrates with predefined macro packages that extend its formatting capabilities for specific document types, providing libraries of higher-level commands that build upon core nroff requests to simplify document preparation and ensure consistent output. These macro packages, such as -man, -ms, and -me, are loaded at invocation and handle common structural elements such as headings, paragraphs, and emphasis, reducing the need for repetitive low-level requests in input files.

The -man macro package is designed for formatting Unix manual pages, standardizing the structure of command and system documentation. It defines key macros such as .TH for the title and header, which sets the document identifier, section number, date, and source information used in page headers and footers; .SH for section headings, which produces bold titles such as "NAME" or "SYNOPSIS"; and .TP for tagged paragraphs, which creates indented blocks with unindented tags for options or descriptions. Invoked via the -man option, this package ensures uniform presentation across Unix systems, making it essential for technical reference materials.

The -ms macro package targets articles, books, and general manuscripts, offering tools for professional document layout including paragraphs, lists, and emphasis. Notable macros include .PP for starting a new paragraph with indentation and spacing, .IP for indented paragraphs, often with a label such as a bullet or number for lists, and .B for rendering text in bold (underlined on terminals). This package streamlines the creation of multi-section documents by encapsulating formatting logic, allowing authors to focus on content rather than precise spacing adjustments.

Similar to -ms but optimized for memos and technical papers, the -me macro package extends formatting for shorter, quoted, or displayed content. It includes macros that start display blocks for quoted text and similar material, typically indented and spaced distinctly from body text, alongside paragraph and heading macros adapted to those styles. Both -ms and -me are invoked using the -m option followed by the package name, as in nroff -ms file or nroff -me file, which loads the respective definitions before the input is processed.

The -man package continues to be the most widely used macro package in Unix documentation as of 2025, both for maintaining legacy pages and for writing new manual pages with modern free software implementations, because it preserves standardized formatting across diverse environments.
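To show how the -man macros described above fit together, the following Python sketch generates the source of a minimal manual page. The `man_page` helper and the `hello` command it documents are hypothetical, and real manual pages normally contain more sections.

```python
def man_page(name: str, section: int, date: str, summary: str,
             options: list[tuple[str, str]]) -> str:
    """Return roff source for a small manual page using -man macros."""
    lines = [
        f'.TH {name.upper()} {section} "{date}"',   # title, section, date
        ".SH NAME",
        f"{name} \\- {summary}",                    # \- prints a minus sign
        ".SH OPTIONS",
    ]
    for flag, description in options:
        escaped = flag.replace("-", "\\-")          # keep option dashes literal
        lines.append(".TP")                         # tagged paragraph
        lines.append(f"\\fB{escaped}\\fP")          # the tag, in bold
        lines.append(description)                   # the indented body
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(man_page("hello", 1, "2025-01-01", "print a greeting",
                   [("-v", "Print version information.")]), end="")
```

Saving the result to a file and running nroff -man on it should yield a formatted page with a header line, a NAME section, and one tagged option entry.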

Preprocessors and Extensions

Supported Preprocessors

nroff supports a set of preprocessors that transform input files containing specialized markup into standard nroff/troff requests, enabling the creation of structured content such as tables, equations, diagrams, and bibliographies before final formatting. These tools were developed at Bell Labs and are typically invoked in a pipeline, such as refer input | pic | tbl | eqn | nroff, allowing sequential processing to build complex documents.

The tbl preprocessor, introduced in 1975 with Unix Version 6, formats tables by reading input delimited by .TS (table start) and .TE (table end), with column specifications followed by data rows. For nroff output, tbl aligns table elements using spaces to produce readable ASCII tables suitable for terminal display, handling features such as horizontal and vertical spanning without requiring manual spacing.

Equation support in nroff is provided through neqn, a variant of the eqn preprocessor also released in 1975, which processes mathematical expressions enclosed in .EQ and .EN blocks. It accepts basic notation, such as x + y = z, and renders output as simple text approximations, supporting basic superscripts and subscripts via keywords such as sup and sub, but reducing advanced layout and symbols to linear representations suitable for typewriter-style devices.

The pic preprocessor, developed in the early 1980s, enables the description of simple diagrams using textual commands for elements such as boxes, lines, and arrows, which it converts into troff requests for positioning. In nroff, these are reduced to text-based approximations, using ASCII characters for lines and labels, to approximate visual structures in terminal output.

Additionally, the refer preprocessor, introduced around 1978, processes bibliographic references marked with .[ and .] delimiters, generating citations and bibliographies that integrate into nroff pipelines for document referencing.

Other supported preprocessors include soelim, which resolves .so requests by inlining the external files they name so that later stages in a pipeline see a single stream, and grn, which converts Gremlin graphics files into troff input for simple diagrams.
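A wrapper can decide which preprocessors a document needs by looking for the delimiters described above. The Python sketch below is a simplified illustration of that idea. The function name is hypothetical, the stage order (refer, pic, tbl, neqn) is the conventional one and is an assumption of this sketch, and the .PS marker for pic is the customary delimiter rather than something stated in the text.

```python
# (preprocessor command, request that marks the start of its input)
MARKERS = [
    ("refer", ".["),
    ("pic", ".PS"),
    ("tbl", ".TS"),
    ("neqn", ".EQ"),   # neqn is the nroff counterpart of eqn
]


def build_pipeline(source: str, formatter: str = "nroff") -> str:
    """Return a shell pipeline naming only the stages the source needs."""
    # Collect the first token of every control line, e.g. ".TS" or ".EQ".
    requests = set()
    for line in source.splitlines():
        if line.startswith("."):
            tokens = line.split()
            if tokens:
                requests.add(tokens[0])
    stages = [cmd for cmd, marker in MARKERS if marker in requests]
    stages.append(formatter)
    return " | ".join(stages)


if __name__ == "__main__":
    doc = ".TS\nl l.\na\tb\n.TE\n.EQ\nx sup 2\n.EN\n"
    print(build_pipeline(doc))
```

Running only the stages that are needed avoids spawning processes that would pass the text through unchanged.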

Limitations and Workarounds

nroff produces output limited to 7-bit ASCII characters suitable for typewriter-like devices, lacking support for colors and rendering special characters through approximations like <hy> for soft hyphens. This restriction stems from its design for terminal output, where advanced formatting such as font changes or color sequences is ignored or simplified. Older implementations offer no full Unicode support, confining documents to basic Latin characters and excluding accents and non-Latin scripts.

To address these output constraints, modern variants such as groff enable Unicode handling via the -Tutf8 option, which produces UTF-8 output for terminals that support it. By default, groff emits terminal escape sequences for text attributes and colors when using devices such as utf8; the -c flag disables them for legacy compatibility, though the result depends on the display device's capabilities and remains incompatible with plain ASCII printers. The preconv preprocessor further helps by converting input encodings into a form groff can process, ensuring broader character compatibility without altering the core nroff pipeline.

Graphics and equations, handled via preprocessors such as pic for diagrams and eqn for mathematical expressions, lose fidelity in nroff mode, degrading to crude text-based approximations with simplified symbols. This occurs because nroff prioritizes linear text output over spatial positioning, rendering complex layouts as sequential blocks. A common workaround employs conditional requests, such as .if t to execute troff-specific code only when not in nroff mode, allowing documents to include high-fidelity elements such as full eqn output for phototypesetter rendering while falling back to text in nroff. Similarly, .ie and .el constructs can branch on the t (troff) or n (nroff) conditions, embedding alternative content to maintain portability.

In terms of performance, nroff's single-pass processing can become inefficient for very large documents, as traditional implementations process the entire file sequentially without optimizations for memory or speed. Workarounds include splitting input into smaller files that are included with .so requests or assembled by external scripts, which reduces load times during formatting. Variants such as the Heirloom Documentation Tools incorporate output optimizations, such as tab-based horizontal spacing, to speed up rendering and reduce the number of characters in the final stream.

As of 2025, nroff remains a niche tool, used primarily on legacy Unix systems, in embedded environments, and for generating manual pages where simplicity and portability are paramount. For broader applications, alternatives such as Markdown-based workflows have emerged, with tools such as pandoc converting Markdown documents to man format via pandoc -t man input.md | nroff -man, producing nroff output from a more modern syntax. This pipeline allows integration with contemporary editors and tooling, working around nroff's constraints without replacing it.
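The conditional requests mentioned earlier in this section (.if n and .if t) can be modeled with a few lines of Python. This toy function only selects which one-line branches survive for a given formatter and is not a roff interpreter; the name `select_branch` is hypothetical, and .ie/.el pairs and nested conditions are deliberately left out.

```python
def select_branch(lines: list[str], mode: str) -> list[str]:
    """Keep `.if n` bodies when mode is "n" (nroff) and `.if t` bodies
    when mode is "t" (troff); pass every other line through unchanged."""
    if mode not in ("n", "t"):
        raise ValueError("mode must be 'n' or 't'")
    out = []
    for line in lines:
        for cond in ("n", "t"):
            prefix = f".if {cond} "
            if line.startswith(prefix):
                if cond == mode:
                    out.append(line[len(prefix):])  # condition holds: keep the body
                break                               # conditional line handled
        else:
            out.append(line)                        # ordinary line
    return out


if __name__ == "__main__":
    source = [".if t \\(*a", ".if n alpha", "decay is measured"]
    print(select_branch(source, "n"))
    print(select_branch(source, "t"))
```

In the example, the troff branch uses the special-character escape for a Greek alpha while the nroff branch spells the word out, which is the kind of graceful fallback the conditions are meant to provide.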