nroff

nroff is a text formatting program developed for the Unix operating system, designed to produce output suitable for typewriter-like devices such as terminals and line printers.^[1] It processes input files containing markup commands to generate formatted ASCII text, enabling the creation of structured documents like manuals and reports without graphical capabilities.^[2] As part of the broader roff typesetting system, nroff serves as the companion to troff, which handles output for graphical phototypesetters, allowing the same source files to be adapted for different display mediums.^[1] The program originated at Bell Labs in 1973, when Joseph F. Ossanna implemented it in PDP-11 assembly language as an update to the earlier roff formatter from the 1960s Multics and CTSS systems.^[3] Brian Kernighan rewrote nroff in the C programming language in 1975, enhancing its portability and integration into Unix.^[1] Named "new roff," it built upon the RUNOFF program by Jerome Saltzer, incorporating commands inspired by traditional typesetting practices to automate document preparation.^[3] Throughout its evolution, nroff has been extended with preprocessors like tbl for tables and eqn for equations, as well as macro packages such as the man macros specifically for formatting Unix manual pages.^[1] nroff remains a foundational tool in Unix-like systems, particularly for generating terminal-viewable documentation, and continues to influence modern implementations like GNU groff, which support additional output formats including UTF-8 and PostScript.^[2] Its lightweight design and resource efficiency make it ideal for resource-constrained environments, underscoring its enduring role in command-line text processing despite the rise of graphical alternatives like TeX and desktop publishing software.^[3]

Overview

Purpose and Functionality

nroff, short for "new roff," is a text-formatting program that processes ASCII text files containing markup instructions to generate output suitable for fixed-width, typewriter-like devices such as terminals and line printers.^[4]^[5] Its core purpose is to produce readable, paginated documents by emulating the output of mechanical typewriters, focusing on simple formatting rather than advanced typesetting features.^[6] The primary functionality of nroff involves collecting words from input lines, filling and justifying output lines by adjusting inter-word spaces to align with the right margin, and supporting basic alignments such as left, right, or centered.^[4] It handles pagination through automatic page breaks based on specified page lengths and vertical spacing, while incorporating essential typographic elements like line filling, hyphenation, and font switching between styles such as roman, bold, and italic, all optimized for non-proportional spacing.^[4] Designed in the 1970s for low-resolution output devices like teletypes and line printers that dominated early computing environments, nroff ensures formatted text remains legible on hardware with limited graphical capabilities.^[7]^[4] As a complement to troff, its typesetting counterpart for phototypesetters, nroff enables the same input files to yield terminal-friendly ASCII output without requiring modifications.^[4]
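
The fill-and-adjust behaviour described above can be sketched in a few lines of Python. This is a simplified illustration, not nroff's actual algorithm: the function name `fill_and_adjust` and the 15-column line length are assumptions made for the example, hyphenation is ignored, and extra spaces are always distributed from the left.

```python
def fill_and_adjust(text: str, line_length: int = 40) -> list[str]:
    """Greedy fill, then widen inter-word gaps so full lines reach line_length."""
    lines: list[list[str]] = []
    current: list[str] = []
    for word in text.split():
        # Width of the line if this word is added with single spaces.
        needed = sum(len(w) for w in current) + len(current) + len(word)
        if current and needed > line_length:
            lines.append(current)
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(current)

    out: list[str] = []
    for i, words in enumerate(lines):
        is_last = i == len(lines) - 1
        if is_last or len(words) == 1:
            # The final line (and one-word lines) are left unadjusted.
            out.append(" ".join(words))
            continue
        gaps = len(words) - 1
        spare = line_length - sum(len(w) for w in words)
        base, extra = divmod(spare, gaps)
        pieces = [
            w + " " * (base + (1 if j < extra else 0))
            for j, w in enumerate(words[:-1])
        ]
        pieces.append(words[-1])
        out.append("".join(pieces))
    return out


if __name__ == "__main__":
    for line in fill_and_adjust("the quick brown fox jumps over the lazy dog", 15):
        print(f"|{line}|")
```

Every line except the last comes out exactly `line_length` characters wide, which is the effect the right-margin alignment in the text refers to.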

Role in Unix Ecosystem

nroff serves as a foundational component in the Unix ecosystem, particularly in the formatting and display of manual pages (man pages) for terminal-based viewing. The man command, a standard utility for accessing documentation, relies on nroff—or its modern implementation via groff—to process the roff markup language used in man pages into formatted plain text suitable for typewriter-like devices such as terminals. This integration ensures that essential system documentation remains accessible in text-only environments without requiring graphical interfaces.^[8]^[6] Aligned with the Unix philosophy of designing simple, modular tools that handle one task efficiently and integrate via pipes and redirection, nroff exemplifies the principle of text stream processing, where input and output are plain text streams that can be chained with other utilities. It is frequently invoked indirectly through scripts, the man command, or pipelines, enabling seamless incorporation into broader workflows for document preparation and text manipulation without unnecessary complexity.^[9]^[10] nroff maintains a persistent role in contemporary Unix-like systems, including Linux distributions and BSD variants, where it endures alongside graphical alternatives owing to its computational efficiency, minimal resource footprint, and adherence to POSIX standards for portability. Its generation of unadorned ASCII text output makes it particularly well-suited for non-graphical contexts, such as email transmission, system logs, remote terminals, and resource-constrained setups like embedded or legacy environments, sustaining its utility as of 2025.^[11]^[12]^[6] Preprocessors such as tbl can extend nroff's basic capabilities for table formatting within these workflows.^[13]

History and Development

Origins and Early Influences

The origins of nroff trace back to the early text-formatting programs developed in the 1960s for time-sharing systems. The foundational influence was RUNOFF, created by J. H. Saltzer in 1964 for the Compatible Time-Sharing System (CTSS) on the IBM 7094 at MIT. Written in the MAD language, RUNOFF served as a simple formatting program, enabling basic operations like right-justification, line length control, and page headers through control words prefixed by a period (e.g., .adjust for justified text). It was designed for producing manuscript-style output on line printers, marking the first computerized approach to structured text processing beyond basic echoing.^[14] This concept evolved through ports and adaptations at Bell Labs. In 1969, Douglas McIlroy reimplemented RUNOFF in BCPL for the GECOS system on the GE 645, introducing enhancements such as no-fill mode (.nf) and improved hyphenation. By 1971, McIlroy's version was ported to Multics (itself a descendant of CTSS) and renamed "roff" by Robert Morris to distinguish it from the original CTSS program; this roff was then transliterated into PDP-7 assembly for the nascent Unix operating system on the PDP-7 minicomputer. The early roff for Unix, integrated into the First Edition in 1971, remained rudimentary, supporting only fixed-width output for devices like the IBM 2741 or Teletype Model 37, with no capabilities for tables, equations, or proportional spacing.^[15] nroff emerged as an advancement of this lineage in the early 1970s at Bell Labs, primarily through the efforts of Joe Ossanna. As "new roff," nroff addressed some of roff's limitations by providing a more programmable formatter optimized for typewriter-like terminals and line printers, while retaining the core request-based markup.
Ossanna's key contribution was formalizing the roff processing pipeline, which decoupled generic input markup (requests and macros) from device-specific output generation, allowing intermediate representations to be routed to diverse renderers—this architecture laid the groundwork for extensible text processing in Unix.^[16]^[17]

Key Milestones and Implementations

nroff was first released on June 12, 1972, as part of Version 2 Unix at Bell Labs, where it was implemented in PDP-11 assembly language by Joe Ossanna to provide terminal-formatted text processing capabilities.^[18]^[19] In 1975, Brian Kernighan rewrote nroff in the C programming language, enhancing its portability, and the Version 6 Unix release of that year introduced improved pagination features and expanded support for output devices, making nroff more versatile for the evolving Unix environment. After Joe Ossanna's death in 1977, Kernighan continued to refine the C implementation through the late 1970s, work that led to the device-independent ditroff of 1979.^[19]^[17] In the 1980s, nroff became a standard component of prominent Unix distributions, including Berkeley Software Distribution (BSD) releases and AT&T's System V, facilitating widespread adoption in academic and commercial settings. Additionally, a simplified nroff-like formatter written in Ratfor appeared in the 1976 book Software Tools by Brian Kernighan and P. J. Plauger, serving as an educational example for building portable text processing utilities.

Technical Specifications

Markup Language and Requests

nroff employs a line-based markup language known as roff, which consists of plain text interspersed with control directives to manage formatting. The core elements are requests, which are commands beginning with a dot (.) at the start of a line, and escape sequences, which start with a backslash (\) and are embedded within text lines. Requests control structural and layout aspects, such as line breaks, spacing, and indentation, while escape sequences handle inline adjustments like font changes. This syntax relies on sequential line parsing without nested tags, allowing simple embedding of directives in document input.^[15]^[20] Commonly used structural commands include .PP, which initiates a new paragraph with a line break and optional indentation, and .SH, which formats a section heading followed by a break; both are macros supplied by packages such as -ms and -man rather than built-in requests. For text flow control, built-in requests manage filling, adjusting, and hyphenation: .fi enables filling, where words are collected and fitted into lines up to the right margin; .na disables adjusting, leaving text left-aligned without widening inter-word spaces; and .hy controls automatic hyphenation, which allows words to be broken at syllable boundaries. Each request takes effect from the point where it appears in the input, and text lines are processed in fill mode by default unless altered. Hyphenation can also be influenced by escape sequences such as \% to inhibit breaks or \: to suggest them without printing.^[15]^[21]^[20] Escape sequences provide finer control, such as \fR to switch to roman font or \fB for bold, though nroff's capabilities are constrained compared to troff. In nroff, which targets terminal output, advanced formatting like true bold or italic fonts is simplified: bold is typically rendered via overstriking (printing characters multiple times in the same position), and emphasis such as italics uses underlining. For instance, \fBtext\fP in nroff produces bold-like output through overstrike on monospaced terminals, while \fItext\fP results in underlined text.
This interpretation of the shared roff language ensures compatibility but adapts output to line-printer or terminal limitations, ignoring proportional spacing or multiple font families.^[15]^[21]^[20] Macro packages, such as -ms or -man, extend these basic requests and escapes by defining higher-level commands for common structures like paragraphs or lists, building upon the primitive syntax without altering the underlying parsing.^[21]^[20]
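
To make the overstrike and underline conventions concrete, the sketch below renders the font escapes mentioned above into the traditional typewriter encoding, where bold is a character, a backspace, and the same character again, and underline is an underscore, a backspace, and the character. The function names are hypothetical, only \fR, \fB, \fI, and \fP are recognized, and real nroff handles many more cases.

```python
import re

# Matches the four font escapes used in this example: \fR, \fB, \fI, \fP.
FONT_ESCAPE = re.compile(r"\\f([RBIP])")


def _style(chunk: str, font: str) -> str:
    """Apply typewriter-style emphasis to one run of text."""
    if font == "B":  # bold: char, backspace, same char
        return "".join(c if c == " " else f"{c}\b{c}" for c in chunk)
    if font == "I":  # italic: underscore, backspace, char
        return "".join(c if c == " " else f"_\b{c}" for c in chunk)
    return chunk


def render_fonts(text: str) -> str:
    """Replace font escapes with overstrike/underline sequences."""
    out = []
    font, previous = "R", "R"
    pos = 0
    for match in FONT_ESCAPE.finditer(text):
        out.append(_style(text[pos:match.start()], font))
        target = match.group(1)
        if target == "P":  # \fP switches back to the previous font
            font, previous = previous, font
        else:
            font, previous = target, font
        pos = match.end()
    out.append(_style(text[pos:], font))
    return "".join(out)


if __name__ == "__main__":
    print(repr(render_fonts(r"\fBbold\fP and \fIit\fR")))
```

Pagers such as less interpret these backspace sequences and display them as bold or underlined text, which is why the encoding survives on modern terminals.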

Input Processing and Output Generation

nroff processes its input by reading the document line by line, where each line is either a control line beginning with a period (.), known as a request, that modifies formatting parameters, or ordinary text potentially containing escape sequences prefixed by a backslash (\) for inline adjustments. As it parses the input, nroff interprets these elements to construct an internal vertical list representing the document's structure, comprising text blocks, vertical spacing (via requests like .sp for space), and other vertical elements such as page breaks. This vertical assembly allows for logical organization of content before final rendering, ensuring proper pagination and layout decisions occur in a top-to-bottom sequence.^[22] Once the vertical list is built, nroff fills each output line with text according to the current line length and justification settings (e.g., via .ad for adjust), breaking words as needed and applying horizontal motions from escapes like \h for specific spacing. The resulting lines are then output as an ASCII stream tailored to the target device, such as a terminal or line printer, specified via the -T option (e.g., -Tascii for generic typewriter output). Pagination is managed through requests like .bp, which forces an immediate page break by advancing to the next page while respecting the page length set by .pl. This output stream incorporates device-specific commands to emulate formatting, but nroff operates at a fixed resolution of 10 characters per inch horizontally and 6 lines per inch vertically, limiting its precision compared to more advanced formatters.^[22] To adapt output for diverse devices, nroff relies on terminal tables (ASCII files in directories like /usr/lib/troff/term or /usr/share/lib/nterm) that define device characteristics, including resolution units (in 1/240 inch) and escape sequences for motions.
These tables specify strings for upward (\u escape, moving up half an em in troff or half a line in nroff), downward (\d escape), rightward, and leftward movements, enabling features like superscripts and subscripts on terminals such as the VT100, which use ANSI escape codes for cursor control (e.g., ESC [A for up). For example, a terminal table might set the "up" motion to the sequence for moving the cursor up one line, ensuring compatibility with the device's capabilities. nroff simulates bold using overstriking (reprinting characters) and italics using underlining (e.g., via the .ul request), adapting output to terminal limitations; troff accepts the same requests but normally uses actual font changes for typesetter output.^[22]
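
The resolution figures quoted above imply some simple arithmetic, worked through below. The 1/240 inch unit, 10 characters per inch, and 6 lines per inch come from the text; the 11-inch page length and 6.5-inch line length are commonly documented defaults and are treated here as assumptions, as are the function names.

```python
UNITS_PER_INCH = 240   # basic unit is 1/240 inch
CHARS_PER_INCH = 10    # horizontal resolution of nroff output
LINES_PER_INCH = 6     # vertical resolution of nroff output

# One character cell and one line of text, expressed in basic units.
units_per_char = UNITS_PER_INCH // CHARS_PER_INCH
units_per_line = UNITS_PER_INCH // LINES_PER_INCH


def page_lines(page_length_inches: float = 11) -> int:
    """Number of text lines on a page of the given length (assumed default: 11 in)."""
    return int(page_length_inches * LINES_PER_INCH)


def columns(line_length_inches: float = 6.5) -> int:
    """Number of character columns in a line of the given length (assumed default: 6.5 in)."""
    return int(line_length_inches * CHARS_PER_INCH)


if __name__ == "__main__":
    print(units_per_char, units_per_line, page_lines(), columns())
```

Under these assumptions a character cell is 24 units wide, a line is 40 units tall, a page holds 66 lines, and a text line holds 65 columns.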

Relationship to troff

Shared Architecture

nroff and troff share a unified roff language as their foundational input format, employing identical syntax for requests (commands prefixed with a dot, such as .ll for setting the line length) and escape sequences (backslash-prefixed directives like \f for font changes). This commonality ensures high portability, as a single roff source file containing these elements can be processed interchangeably by either formatter without modification, producing appropriately adapted output for terminal or typesetter devices.^[15]^[23] The architectural pipeline for both formatters follows a standardized sequence: input text is first routed through optional preprocessors (such as tbl for tables or eqn for equations) that expand specialized markup into standard roff requests, then fed into the core formatter (nroff or troff) for layout and pagination, and finally directed to postprocessors tailored to output devices (e.g., terminal drivers for nroff or typesetter drivers for troff). This modular pipeline leverages Unix interprocess communication, enabling flexible composition of tools while maintaining consistent handling of the roff input stream across both systems.^[24] Core mechanisms for output control, including diversions and traps, are identically implemented in both nroff and troff to support dynamic document assembly.
The .di request initiates text diversion, capturing formatted output into a buffer for later reuse, such as in generating headers or footers, while trap requests like .wh (for vertical positioning) or .it (for input-line traps) automatically invoke macros at predefined page locations or intervals, facilitating automated pagination and conditional formatting.^[25]^[26] nroff and troff were developed concurrently by Joseph Ossanna at Bell Labs in the early 1970s, with troff building directly on nroff's simplified base to extend capabilities for phototypesetting; this shared foundation persisted in early portable C implementations, such as Brian Kernighan's 1979 ditroff, where substantial code for language interpretation and processing logic was common to both.^[27]

Key Differences in Capabilities

nroff and troff share a common input syntax based on markup requests, but diverge significantly in their output capabilities and supported features.^[27] While nroff targets low-resolution ASCII output suitable for terminals and line printers, troff is designed for high-resolution typesetting on devices such as the Graphic Systems CAT phototypesetter.^[28] This fundamental distinction limits nroff to fixed-width, monospaced text rendering, whereas troff enables precise control over character placement and visual elements for professional printing.^[21] In terms of typographic features, nroff lacks support for advanced elements like ligatures, kerning, and proportional fonts, which troff incorporates to achieve refined spacing and readability in variable-width typefaces.^[29] For mathematical equations and graphics, nroff simplifies representations to basic ASCII approximations without full fidelity, as it does not accommodate the complex positioning and scaling that troff handles via preprocessors like eqn.^[26] Additionally, nroff ignores troff-specific requests such as .ps for point size adjustments and .vs for vertical spacing, defaulting to uniform character sizes and line heights to suit its output constraints.^[23] Performance-wise, nroff operates more efficiently and with lower resource demands, as it bypasses the computational overhead of high-resolution rendering, making it ideal for quick terminal previews.^[30] Following Brian Kernighan's 1979 rewrite of troff into a device-independent format (ditroff), troff continued to evolve for emerging hardware like Imagen laser printers introduced around 1982, incorporating support for laser-based output and enhanced graphics.^[31]^[32] In contrast, nroff maintained its focus on ASCII-centric processing, allowing it to emulate troff input by degrading advanced elements—such as fonts or motions—into plain text equivalents, though this results in loss of visual sophistication.^[27]

Implementations and Variants

Original AT&T Versions

The original implementation of nroff appeared in Version 3 Unix, released in February 1973, as a PDP-11 assembly language program designed for basic text formatting on line printers and terminals, including support for simple macros and output limited to 132 columns.^[26] This version built on the earlier roff formatter from 1972, adapting it for typewriter-like output while maintaining compatibility with early Unix text processing needs.^[26] nroff was rewritten in the C programming language in 1975 by Brian Kernighan, enhancing its portability. Following Ossanna's death in 1977, further modifications were made. In Version 7 Unix, released in January 1979, these included improved error handling through better diagnostic output and enhanced terminal motion controls for more efficient pagination and cursor positioning.^[26] These changes enhanced portability across hardware and laid the groundwork for device-independent processing, allowing nroff to generate formatted text suitable for a wider range of output devices without major recompilation.^[26] During the 1980s, nroff became a standardized component of AT&T's System III (introduced in 1981) and System V Unix distributions, where it was bundled with the man macro package—originally designed by Doug McIlroy in Version 7 but formalized for consistent manual page formatting—and licensed commercially to third-party vendors for inclusion in proprietary Unix systems.^[15]^[26] As a proprietary tool under AT&T's control, nroff remained exclusive to licensed Unix implementations until BSD derivatives began replacing AT&T code in the late 1980s; its last major update occurred around 1989 in System V Release 4, prioritizing POSIX.1 compliance for improved portability and standards adherence over new feature additions.^[26]^[33]

Modern Free Software Implementations

The GNU roff (groff) project, initiated in 1989 by James Clark, represents a major open-source reimplementation of the original AT&T nroff and troff systems, with the first release (version 0.3.1) occurring in June 1990. Development continued under contributors including Bernd Warken, who focused on enhancements such as improved compatibility and documentation.^[34] Groff provides full backward compatibility with troff while extending functionality, including native support for UTF-8 encoding to handle international characters and postprocessors such as grops and gropdf for generating PostScript and PDF output.^[35] As the de facto standard for roff processing on Linux distributions in 2025, groff version 1.23.0, released in July 2023 and still current, includes X11 drivers for graphical output and optimizations for handling large documents efficiently, and it is commonly installed via package managers like apt (e.g., apt install groff).^[35]^[36] The Heirloom Documentation Tools, developed in the mid-2000s, derive directly from the AT&T System V Release 4 troff and nroff source code released by Sun Microsystems in 2005 as part of the Solaris open-source initiative.^[37] This suite emphasizes legacy compatibility, particularly for formatting manual pages and other documents on Unix-like systems such as Solaris and OpenSolaris derivatives, while providing utilities like eqn, neqn, tbl, and pic for consistent output to terminals and printers.^[38] It maintains close adherence to historical AT&T behavior, avoiding many modern extensions found in groff, to ensure seamless operation with older software environments.^[37]
Other notable free software implementations include the Plan 9 troff/nroff system, where nroff operates as a mode of the unified troff binary invoked via a shell script with the -N flag, enabling hybrid text and typeset output tailored to Plan 9's distributed computing model.^[22] Additionally, cawf, a lightweight C-based clone of nroff developed by Vic Abell in the 1990s at Purdue University, serves as the primary formatter in the Minix operating system, approximating nroff's man and ms macros for resource-constrained embedded and educational environments while rejecting unknown requests for stricter error handling.^[39]^[40] Another implementation is mandoc, initiated in 2008 by Kristaps Dzonsons for OpenBSD, providing a fast, standards-compliant parser for man and mdoc macros with roff subset support for terminal output. It emphasizes security and efficiency, serving as the default in several BSD distributions as of 2025.^[41]

Usage and Applications

Command-Line Interface

The nroff command is invoked from the command line with the basic syntax nroff [options] [files], where it processes the specified input files in sequence or reads from standard input if no files are provided, producing formatted output to standard output by default.^[42]^[43] This design allows nroff to function as a filter in pipelines, handling multiple files by concatenating their processed results into a single continuous output stream.^[42]^[43] Key options control formatting and device-specific behavior. The -Tterm option specifies the target output device or terminal type, such as -Tascii for standard ASCII text suitable for terminals or line printers, ensuring appropriate character mappings and spacing.^[42]^[44] The -ms option loads the ms macro package to apply predefined formatting rules for articles and manuscripts.^[43] The -e option enables even spacing between words on adjusted lines, which is particularly useful for technical documents like chemical formulas where symbols must remain adjacent.^[42]^[43] Additionally, the -i option directs nroff to continue reading from standard input after exhausting the listed files, facilitating integration with other commands via pipes.^[42]^[43] For interactive or paginated control, the -sN option causes nroff to pause every N pages (default N=1), halting output to allow actions like paper loading or inspection before resuming upon receipt of a newline from the keyboard.^[42]^[44] In practice, nroff is frequently piped to utilities for display or printing, such as nroff file.n | less to view formatted text page-by-page or nroff file.n | lpr to send it directly to a line printer.^[23] Macro options combine with these pipelines in the same way; for example, nroff -man file.1 | less loads the man macro package and renders a manual page for viewing.^[43]
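
As an illustration of how the options above combine, the following sketch assembles an nroff argument list and runs it only when an nroff binary is actually installed. The helper names `nroff_argv` and `format_files` are hypothetical. The `macros` argument is the text that follows -m, so "an" yields -man and "s" yields -ms.

```python
import shutil
import subprocess
from typing import Optional, Sequence


def nroff_argv(files: Sequence[str], device: str = "ascii",
               macros: Optional[str] = None) -> list[str]:
    """Build an argument list of the form: nroff -Tdevice [-mname] files..."""
    argv = ["nroff", f"-T{device}"]
    if macros:
        argv.append(f"-m{macros}")  # "an" -> -man, "s" -> -ms
    argv.extend(files)
    return argv


def format_files(files: Sequence[str], **options) -> Optional[str]:
    """Run nroff if it is on PATH and return its output, else None."""
    if shutil.which("nroff") is None:
        return None
    result = subprocess.run(nroff_argv(files, **options),
                            capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    print(" ".join(nroff_argv(["page.1"], macros="an")))
```

Building the argument list separately from running it keeps the example usable on systems without nroff and avoids shell quoting issues with file names.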

Integration with Macro Packages

nroff integrates with predefined macro packages to extend its formatting capabilities for specific document types, providing libraries of higher-level requests that build upon core nroff commands to simplify document preparation and ensure consistent output. These macro packages, such as -man, -ms, and -me, are loaded at runtime and handle common structural elements like headings, paragraphs, and emphasis, thereby reducing the need for repetitive low-level requests in input files.^[45] The -man macro package is specifically designed for formatting Unix manual pages, standardizing the structure of command and system documentation. It defines key requests such as .TH for the title and header, which sets the document identifier, section number, date, and source information to configure page headers and footers; .SH for section headings, which produces bold titles like "NAME" or "SYNOPSIS"; and .TP for tagged paragraphs, which creates indented blocks with unindented tags for options or descriptions. Invoked via the -man option, this package ensures uniform presentation across Unix systems, making it essential for technical reference materials.^[46] The -ms macro package targets articles, books, and general manuscripts, offering tools for professional document layout including paragraphs, lists, and emphasis. Notable requests include .PP for initiating a new paragraph with indentation and spacing, .IP for indented paragraphs often with a label like a bullet or number for lists, and .B for rendering text in bold (underlined on terminals). This package streamlines the creation of multi-section documents by encapsulating formatting logic, allowing authors to focus on content rather than precise spacing adjustments.^[47] Similar to -ms but optimized for memos and technical papers, the -me macro package extends formatting for shorter, quoted, or displayed content. 
It incorporates requests like .DS to start a display block for quoted text or code, typically indented and spaced distinctly from body text, alongside paragraph and heading macros adapted for memo styles. Both -ms and -me are invoked using the -m option followed by the package name, such as nroff -ms file or nroff -me file, prepending the respective macro definitions to process the input efficiently.^[48]^[45] Macro packages like -man continue to be the most widely used in Unix documentation as of 2025, particularly for maintaining legacy and new manual pages in modern free software implementations, due to their role in preserving standardized, boilerplate-free formatting across diverse environments.^[8]
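
A small generator shows how the man macros named above fit together in a source file. This is a minimal sketch: the function `man_page` and its parameters are invented for the example, and a real manual page would normally carry a date, a source string, and further sections.

```python
def man_page(name: str, section: int, summary: str,
             options: list[tuple[str, str]]) -> str:
    """Emit a minimal man-macro source using .TH, .SH, and .TP."""
    lines = [
        f".TH {name.upper()} {section}",   # title header: name and section
        ".SH NAME",
        f"{name} \\- {summary}",           # \- is a literal minus sign in roff
        ".SH OPTIONS",
    ]
    for flag, description in options:
        escaped = flag.replace("-", "\\-")
        # .TP: the next line is the tag, the following text is the indented body.
        lines += [".TP", f"\\fB{escaped}\\fR", description]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(man_page("demo", 1, "print a greeting", [("-v", "Be verbose.")]), end="")
```

The output can be saved to a file such as demo.1 and formatted with nroff -man where a roff implementation is available.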

Preprocessors and Extensions

Supported Preprocessors

nroff supports a set of preprocessors that transform input files containing specialized markup into standard nroff/troff requests, enabling the creation of structured content like tables, equations, diagrams, and bibliographies before final formatting. These tools were developed at Bell Labs and are typically invoked in a pipeline, such as refer input | tbl | eqn | pic | nroff, allowing sequential processing to build complex documents.^[49]^[26] The tbl preprocessor, introduced in 1975 with Unix Version 6, formats tables by reading input delimited by .TS (table start) and .TE (table end) directives, followed by column specifications and data rows. For nroff output, tbl aligns table elements using spaces to produce readable ASCII representations suitable for terminal display, handling features like horizontal and vertical spanning without requiring manual spacing.^[50]^[51] Equation support in nroff is provided through neqn, a variant of the eqn preprocessor also released in 1975, which processes mathematical expressions enclosed in .EQ and .EN blocks. It supports basic inline notation, such as x + y = z, and renders output as simple text approximations, supporting basic superscripts and subscripts via keywords like sup and sub, but limiting advanced layout and symbols to linear representations suitable for typewriter-style devices.^[52]^[53] The pic preprocessor, developed in the early 1980s, enables the description of simple diagrams using textual commands for elements like boxes, lines, and arrows, which it converts into troff requests for positioning. In nroff, pic simplifies these to text-based approximations, such as using ASCII characters for lines and labels, to approximate visual structures in plain text output.^[54]
Additionally, the refer preprocessor, introduced around 1978, processes bibliographic references marked with .[ and .] delimiters, generating citations and bibliographies that integrate seamlessly into nroff pipelines for document referencing.^[26] Other supported preprocessors include soelim, which handles .so requests to include external files, eliminating them from the output to avoid processing issues in pipelines, and grn, which converts Gremlin graphics files into pic-compatible input for simple diagrams.^[21]
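
The .TS/.TE structure described for tbl can be illustrated by generating a small table block. This is a sketch under stated assumptions: the function `tbl_block` is hypothetical, every column is left-aligned, and a semicolon is used as the cell separator through tbl's tab() option.

```python
def tbl_block(rows: list[tuple[str, ...]], sep: str = ";") -> str:
    """Wrap rows in a .TS/.TE block with left-aligned columns."""
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of cells")
    lines = [
        ".TS",
        f"tab({sep});",                    # global option: cell separator
        " ".join(["l"] * width) + ".",     # format line: one 'l' per column
    ]
    # Note: a cell at the start of a row must not begin with a dot,
    # because tbl passes such lines through as roff requests.
    lines += [sep.join(row) for row in rows]
    lines.append(".TE")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(tbl_block([("Tool", "Purpose"), ("tbl", "tables"), ("neqn", "equations")]), end="")
```

Passing the generated block through tbl and then nroff, where both are installed, would produce the space-aligned ASCII table the text describes.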

Limitations and Workarounds

nroff produces output limited to 7-bit ASCII characters suitable for typewriter-like devices, lacking support for colors and rendering special characters through approximations like <hy> for soft hyphens.^[21] This restriction stems from its design for terminal output, where advanced formatting such as font changes or color sequences is ignored or simplified.^[8] Older implementations offer no full Unicode support, confining documents to basic Latin characters and excluding accents or non-Latin scripts.^[21] To address these output constraints, modern variants like GNU groff enable Unicode handling via the -Tutf8 option, which processes input in UTF-8 encoding and outputs accordingly for terminals supporting it.^[21] By default, groff emits terminal escape sequences for text attributes and colors when using devices like utf8; the -c flag disables them for legacy compatibility, though this depends on the display device's capabilities and remains incompatible with plain ASCII printers.^[55] The preconv preprocessor further aids by converting input encodings to UTF-8 before processing, ensuring broader character compatibility without altering the core nroff pipeline.^[21] Graphics and equations, handled via preprocessors like pic for diagrams and eqn for mathematical expressions, lose fidelity in nroff mode, degrading to crude text-based approximations such as inline ASCII art or simplified symbols.^[21] This occurs because nroff prioritizes linear text output over spatial positioning, rendering complex layouts as sequential blocks.^[8] A common workaround employs conditional requests, such as .if t to execute troff-specific code only when not in nroff mode, allowing documents to include high-fidelity elements like full eqn output for phototypesetter rendering while falling back to text in nroff.^[30] Similarly, .ie and .el constructs can branch logic based on the t (troff) or n (nroff) conditions, embedding alternative content to maintain portability.^[56]
In terms of performance, nroff's single-pass parsing can become inefficient for very large documents, as it processes the entire file sequentially without optimizations for memory or speed in traditional implementations.^[21] Workarounds include splitting input files into smaller sections, pulled together with .so requests for inclusion or with external scripting, which reduces load times during formatting.^[21] Variants such as the Heirloom Documentation Tools incorporate output optimizations, like tab-based horizontal spacing to accelerate rendering and minimize character counts in the final stream.^[37] As of 2025, nroff remains a niche tool primarily for legacy Unix systems, embedded environments, and generating manual pages where simplicity and portability are paramount.^[6] For broader applications, alternatives like Markdown-based workflows have emerged, with tools such as Pandoc converting documents to man page format via pandoc -t man input.md | nroff -man, emulating nroff output while leveraging modern syntax.^[57] This pipeline allows integration with contemporary editors and version control, bridging nroff's constraints without full replacement.^[57]
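
The conditional workaround described above (.if, .ie, and .el with the n and t conditions) can be illustrated with a deliberately tiny evaluator. This is not how any roff implementation works internally: the function `resolve_conditionals` is hypothetical, and it handles only single-line conditionals on the n and t conditions, with no nesting, negation, or brace blocks.

```python
def resolve_conditionals(source: str, mode: str) -> list[str]:
    """Keep the lines a formatter in the given mode ('n' or 't') would act on."""
    out: list[str] = []
    last_ie = None  # result of the most recent .ie, consumed by .el
    for line in source.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] in (".if", ".ie") and parts[1] in ("n", "t"):
            matched = parts[1] == mode
            if parts[0] == ".ie":
                last_ie = matched
            if matched:
                out.append(parts[2])
        elif len(parts) >= 2 and parts[0] == ".el":
            if last_ie is False:  # run the else branch only after a failed .ie
                out.append(line.split(None, 1)[1])
            last_ie = None
        else:
            out.append(line)
    return out


SAMPLE = """\
.if t .ps 12
.ie n Plain text fallback.
.el Typeset version.
Shared line."""

if __name__ == "__main__":
    print(resolve_conditionals(SAMPLE, "n"))
    print(resolve_conditionals(SAMPLE, "t"))
```

The same source thus yields the plain-text fallback in nroff mode and the typeset variant in troff mode, which is the portability pattern the text refers to.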