troff
troff is a document processing system and typesetting program originally developed at Bell Labs in the early 1970s for producing formatted output on phototypesetters such as the Graphic Systems CAT.[1] It interprets plain text input embedded with control sequences and requests to handle tasks like line filling, hyphenation, and precise character positioning, enabling the creation of typographically sophisticated documents including footnotes, tables, and equations.[2] As part of the broader roff family of formatters, troff serves as the graphical counterpart to nroff, which generates simpler text output for line printers and terminals, allowing a single source file to produce multiple output formats.[3]
The origins of troff trace back to the runoff program, written by J. E. Saltzer in the mid-1960s for MIT's CTSS operating system, which evolved through ports like Bob Morris's roff for the GE 635 and Doug McIlroy's BCPL rewrite in 1969.[4] Joseph F. Ossanna, often called the father of modern roff systems, created the initial version of troff around 1973 in PDP-11 assembly language to drive the CAT typesetter for AT&T's patent documentation needs, and rewrote it in C around 1975.[3] Following Ossanna's death in 1977, Brian Kernighan took over the program and by 1979 had reworked it into the device-independent ditroff, which supports multiple typesetters and adds features like drawing functions.[1] This evolution made troff a cornerstone of Unix typesetting, powering tools for technical writing and documentation.[5]
Key features of troff include its fully programmable input language using escape sequences (e.g., \.) and requests (e.g., .ta for tabs), which allow arbitrary page layout, overlapping characters, and custom macros defined with the .de command for reusable formatting blocks.[2] It supports macro packages like -ms (by Mike Lesk) for multi-column layouts and tables of contents, and -mm (designed by John Mashey, Dale Smith, and Ted Dolotta) as an early style sheet system for large manuals.[1] Preprocessors extend its capabilities: tbl for tables, eqn for mathematical equations, pic for diagrams, and refer for bibliographies, all integrated into the Unix toolchain for generating PostScript, PDF, or HTML output.[5] Troff's single-pass processing and hyphenation algorithms ensure efficient handling of complex documents with ligatures, kerning, and adjustable inter-sentence spacing.[2]
In the modern era, troff's legacy endures through GNU groff, initiated by James Clark in 1989 as a free reimplementation compatible with AT&T troff, featuring enhancements like longer macro names, UTF-8 support, and additional preprocessors such as grn for graphics.[4] It remains essential for formatting Unix man pages, O'Reilly technical books, and other documentation, demonstrating troff's enduring influence on document preparation despite the rise of systems like TeX and LaTeX.[5] As of 2024, GNU groff is maintained by G. Branden Robinson, with emeritus status for contributors like Werner Lemberg; it continues to evolve while preserving troff's core principles of simplicity and power.[6][1]
History
Origins in RUNOFF and Early Roff
The origins of troff trace back to early text formatting tools developed in the 1960s for time-sharing systems, beginning with RUNOFF, a command-line program created by Jerome H. Saltzer for the Compatible Time-Sharing System (CTSS) at MIT.[7] Developed between 1963 and 1964, RUNOFF processed plain text input augmented with simple markup commands—prefixed by a period followed by control words, such as .ll for setting line length—to produce formatted output suitable for line printers.[8] This approach allowed users to prepare documents like theses or reports by embedding formatting directives directly in the text, marking a shift from manual typing to automated batch processing in academic computing environments.[7] Saltzer's implementation, written in MAD and FAP assembly languages, was documented in a 1965 manual (revised 1966) and proved influential, inspiring ports to other systems including Multics.[8]
RUNOFF's concepts reached Bell Laboratories as roff: Bob Morris ported the program to the GE 635, and around 1969 Doug McIlroy reimplemented it in the BCPL programming language.[9] McIlroy's version introduced key enhancements like pagination control, basic indentation (e.g., the .ti request for temporary indent), and rudimentary hyphenation developed by Molly Wagner, tailoring it for producing paginated documents on line printers.[9] This roff served as a foundational tool in early Unix development, simplifying RUNOFF's syntax while adding registers for storing values and conditional processing, though it retained the core model of inline directives for batch formatting.[1] Deployed at Bell Labs' Murray Hill site, roff facilitated technical documentation and bridged academic prototypes to industrial computing needs.[9]
A significant step toward troff came with the introduction of nroff, developed by Joseph F. Ossanna as the "new roff" for Unix Version 3 in February 1973.[9] Written in PDP-11 assembly language, nroff was designed as a device-independent formatter optimized for terminals and fixed-width output devices like the Teletype Model 37 or IBM 2471 printer, enabling real-time previewing of documents in interactive environments.[9] It expanded roff's capabilities with features such as macro definitions (via the .de request), diversions for capturing output (.di), and traps for dynamic page elements (.wh), while maintaining compatibility with earlier roff input.[9] As the immediate precursor to troff, nroff emphasized portability across output media but was constrained by the era's hardware, producing only fixed-width text without support for proportional fonts or variable spacing.[1]
Early roff systems, including nroff, faced inherent technical limitations rooted in their target environments, such as reliance on monospaced character sets for line printers and terminals, which prevented advanced typographic features like kerning or varying font widths.[9] These tools assumed uniform character widths (typically 10 characters per inch horizontally and 6 lines per inch vertically), restricting output to simple, grid-based layouts unsuitable for high-fidelity printing.[1] Such constraints highlighted the need for evolution toward more sophisticated typesetting, a transition later pursued at Bell Labs.[9]
Development at Bell Labs and Key Contributors
The development of troff at AT&T Bell Laboratories began in earnest in the early 1970s, driven by the need for high-quality typesetting within the Unix environment. Joe Ossanna, a member of the technical staff at Bell Labs, implemented the initial version of troff between 1972 and 1973, written in PDP-11 assembly language specifically to drive the Graphic Systems Corporation CAT phototypesetter. This pioneering effort introduced advanced features such as automatic hyphenation, kerning for improved letter spacing, and support for proportional fonts, marking a significant advancement over earlier roff implementations by enabling professional-grade output for technical documents.[10][1]
Ossanna continued refining troff, rewriting it in the C programming language around 1975 to enhance maintainability and integration with Unix, with ongoing evolution until his untimely death on November 28, 1977. Following Ossanna's passing, Brian W. Kernighan took primary responsibility for troff's maintenance and enhancement at Bell Labs, collaborating with colleagues including Mike E. Lesk, who had contributed related tools like the ms macro package and preprocessors such as tbl for tabular formatting. In 1979, Kernighan led a major revision to make troff compatible with multiple output devices, preserving backward compatibility with existing input formats and macro packages while addressing the limitations of the original CAT-specific design. This work culminated in the inclusion of the revised troff in Unix Version 7, released in 1979, which solidified its role as a core Unix utility for document preparation.[10][1][11]
A pivotal advancement was Kernighan's ditroff, a device-independent variant of troff, first implemented in spring 1979 and documented in a 1982 technical report. Ditroff separated the formatting logic from device-specific output by generating an intermediate ASCII-based description file, which could then be processed by postprocessors tailored to various typesetters, such as the Mergenthaler Linotron 202 or APS-5. This architecture improved portability across hardware, facilitating troff's adoption beyond Bell Labs and enabling extensions for graphics and diverse fonts during the Unix portability initiatives of the 1980s.[10]
Other notable contributions at Bell Labs included work by Eric Grosse on font handling, particularly in developing glyph libraries for non-Latin scripts like Cyrillic, which expanded troff's capabilities for international technical documentation in the 1980s. These efforts, building on the foundational work of Ossanna and Kernighan, ensured troff's evolution as a robust, extensible tool within the Unix ecosystem.[12][1]
Core Functionality
Troff employs a stream-oriented markup language where input consists of ordinary text lines interspersed with control structures to guide formatting. Requests, which are commands for primitive operations, appear on lines beginning with a period (.) followed by the request name and optional arguments; for instance, .sp N inserts N lines of vertical space (defaulting to 1 line if N is omitted), while .ce N centers the subsequent N input lines.[13] Escape sequences, embedded within text and prefixed by a backslash (\), enable fine-grained inline adjustments, such as \fR to switch to the Roman font or \s+2 to increase the type size by 2 points relative to the current setting.[13]
Core to troff's text handling are fill mode and adjustment mode, which are active by default to produce justified paragraphs. In fill mode, troff collects words from multiple input lines to fill each output line to the right margin; the .nf request disables this, entering nofill mode to output lines as they appear in the input, preserving breaks for elements like addresses or poetry.[13] Adjustment expands or contracts interword spaces to align text flush left, right, or both; the .ad l request sets left-justified adjustment, .ad r for right, and .ad b (the default) for both, with .na turning adjustment off entirely.[13]
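The difference between these modes is easiest to see on a narrow measure. The following is a minimal sketch; the 3-inch line length and the sample text are assumptions chosen only to make the effect visible:

```roff
.\" Sketch: the same kind of text set in three modes.
.\" A short line length makes filling and adjustment visible.
.ll 3i
.\" Fill mode with both margins adjusted (the default).
.ad b
These words are collected from several
short input lines and set as one justified paragraph.
.br
.\" Still filling, but with a ragged right margin.
.ad l
These words are collected from several
short input lines and set flush left.
.br
.\" No-fill mode: input line breaks are kept as typed.
.nf
Jane Doe
12 Example Street
.\" Return to fill mode.
.fi
```

The .br requests force a break so that each paragraph is flushed before the mode changes.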
Diversions provide a mechanism for capturing portions of the output stream into a named internal buffer for later reuse, such as in footers or tables of contents. The .di xx request begins diverting output to the buffer named xx, and a subsequent .di (with no arguments) ends the diversion and resumes normal output; the diverted content can then be reinserted by invoking the diversion by name, as .xx, in the same way as a macro.[13] Traps automate macro invocation at specific vertical positions, like page tops or bottoms, to insert headers or process page ejections; .wh N xx installs a trap to call macro xx when the current vertical position reaches N (measured from the top of the page, or from the bottom if N is negative), and .wh N with no macro name removes the trap at that position.[13]
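As an illustration, the sketch below captures a note in a diversion, replays it by invoking the diversion's name as a request, and plants a simple page-bottom trap. The names NT and FO are hypothetical; the dn register, which holds the height of the most recent diversion, is standard:

```roff
.\" Capture text in a diversion named NT (a hypothetical name).
.di NT
This note is formatted now but held back.
.br
.di
.\" Register dn holds the height of the finished diversion;
.\" make sure the whole note fits on the current page.
.ne \n(dnu
.\" Replay the already formatted lines without refilling them.
.nf
.NT
.fi
.\" Define a footer macro FO that ejects the page without a break,
.\" and plant it as a trap one inch above the page bottom.
.de FO
'bp
..
.wh -1i FO
```

Switching to no-fill mode before replaying the diversion keeps troff from re-breaking lines that were already formatted when they were captured.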
Fundamental typography commands include .ft F to select font F by name or position (e.g., .ft 2 for the second mounted font) and .ps N to set the point size to N (10 is the default).[13] Horizontal and vertical motions adjust positioning precisely: the escape \h'N' moves horizontally by distance N, which may carry a scale indicator (e.g., \h'1i' for 1 inch to the right; a negative value moves left), while \v'N' moves vertically (positive for down, negative for up); in both, a leading | makes the distance absolute rather than relative. These motions are essential for superscripts, rules, or custom spacing.[13]
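A hand-built superscript shows how size changes and local motions combine. This is a sketch only; the 0.4m offset and the two-point reduction are illustrative values, not settings taken from any macro package:

```roff
.\" Sketch: a superscript built from size changes and local motions.
.ft R
.ps 10
.\" Move up 0.4m, shrink by 2 points, print the digit, then undo both.
.\" The \& separates the size change from the digit that follows it.
The area is r\v'-0.4m'\s-2\&2\s+2\v'0.4m' times pi.
.br
.\" \h moves horizontally; a negative distance moves back to the left.
A\h'0.5i'B\h'-0.2i'C
```

Every upward motion is paired with an equal downward one so that the baseline of the rest of the line is unchanged.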
For debugging and control, troff supports error handling via .ab "message", which immediately aborts processing and outputs the specified error message to diagnostics.[13] Input stacking facilitates modular documents by reading external files inline, as with .so filename to include the contents of filename at that point.[13] These primitives form the foundation, which can be extended through macro packages for higher-level document structuring.[13]
Example of Basic Requests and Escapes
The following input demonstrates simple spacing, centering, font changes, and motions:
.ft R
.ps 12
.sp 1
.ce 2
This line is centered.
\fIThis line is italicized\fR and moved right by half an inch: \h'0.5i'moved text.
.ps
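This produces Roman text at 12 points: after one blank line, the two input lines that follow .ce 2 are each centered, the second containing an italicized phrase and a half-inch horizontal motion before "moved text". The final .ps, given without an argument, restores the previous point size.[13]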
Document Processing Model
Troff operates through a device-independent processing pipeline that interprets input directives and text to produce formatted output adaptable to various rendering devices. The core model involves reading input files containing text and control commands, applying formatting rules, and generating an intermediate representation that postprocessors convert to device-specific instructions. This design, introduced with ditroff to replace the original troff's CAT-specific output, ensures portability across output media like phototypesetters, terminals, and modern printers.[10]
Troff reads its input in a single pass, but each output line is handled in two stages. In the first, troff collects words and decides where the line breaks, hyphenating where necessary by consulting a list of exception words and then applying rule-based break-point selection (suffix and letter-pair tables in AT&T troff; GNU groff uses TeX-style hyphenation patterns). In the second, the completed line is adjusted and emitted, so inter-word spacing is finalized only after line composition. As lines accumulate, troff tracks the vertical position on the page, springing traps and managing page breaks and spacing requirements. This separation allows precise control over document flow while handling complex layouts in a single read of the input.[14]
In ditroff, the output from this process is an intermediate format (IF), a stream of low-level ASCII commands that describe precise positioning and glyph rendering without tying to any specific device. Examples include h 10 to move right by 10 basic units (where units are defined by device resolution), c A to output the character 'A', v -5 for upward vertical motion by 5 units, and f 2 to select font number 2. These commands form a compact, resolution-aware description of the page, enabling optimizations like vertical character sorting in postprocessors. The IF supports drawing primitives prefixed with D, such as Dl dx dy for a line drawn from the current position to the point offset by dx and dy.[10]
Device adaptation is facilitated by DESC files, plain-text configuration files located in directories like /usr/lib/font/devname/DESC, which specify parameters such as resolution (e.g., res 300 for 300 units per inch), available fonts (e.g., fonts 4 R I B S), point sizes (e.g., sizes 10 12 0), and minimum motion quanta (e.g., hor 1 vert 1). Troff reads the appropriate DESC file based on the -Tdevname option at invocation, parameterizing the IF accordingly—for instance, scaling motions to match the device's unit system. This setup decouples input processing from hardware specifics, allowing the same input to target diverse devices like the CAT phototypesetter or PostScript printers.[10][14]
The postprocessor chain completes the pipeline: ditroff generates the IF, which a device driver then translates to final output. For example, the dpost driver converts IF to PostScript code for laser printers, handling glyph mapping and positioning based on the DESC specifications. Other drivers, such as those for DVI or terminal output, perform similar conversions, often optimizing for the target's capabilities like color or high-resolution rendering. This modular chain supports extensions like graphics embedding without altering the core formatter.[10]
Internally, troff maintains state through environments, registers, and conditional logic to manage dynamic formatting. Up to three switchable environments store parameters like current font, point size, and line length, selected via requests such as .ev 1 to activate environment 1 for elements like footnotes. Number registers, defined with .nr reg value (e.g., .nr pg 1 for page numbering), hold integers for computations and are interpolated with \n; predefined read-only registers such as \n(.s (the current point size) expose formatter state. Related requests act on that state: .ne N, for example, demands N lines of vertical space on the current page and forces a page break if they are not available, which helps avoid widows. Conditional execution, via .if condition or .ie/.el pairs, evaluates numerical comparisons (e.g., .if \n(pg>1), string matches, or page parity to selectively process input, enabling adaptive layouts like alternating headers. These mechanisms integrate seamlessly into the processing model, allowing documents to respond to content metrics during formatting.[14]
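These pieces can be combined in a few lines. The sketch below is illustrative only: the register name pg follows the example in the text, and the sizes chosen for environment 1 are assumptions:

```roff
.\" Sketch: a number register, a conditional, and an environment switch.
.\" Set the register pg to 1, then increment it to 2.
.nr pg 1
.nr pg +1
.\" The .ie/.el pair chooses one of two lines of text.
.ie \n(pg>1 Page \n(pg follows the first.
.el This is the first page.
.\" Format a note in environment 1 so that the settings of the
.\" main text in environment 0 are left untouched.
.ev 1
.ps 8
.vs 10p
.ll 4i
A note set in 8-point type on a 4-inch measure.
.br
.\" Return to the previous environment.
.ev
.\" Require room for 3 more lines, or else start a new page.
.ne 3
```

Because the size, spacing, and line length were changed inside environment 1, nothing needs to be restored by hand after the closing .ev.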
Macro Packages
Standard Macro Sets
The standard macro sets for troff provide predefined collections of macros that simplify document formatting for specific purposes, such as technical documentation, memos, and academic papers, by building on troff's core requests to handle layout, headings, and structural elements. These packages are typically distributed with troff implementations and invoked via command-line options, allowing users to process input files without defining custom macros from scratch. They originated primarily at Bell Labs and BSD Unix environments during the 1970s and 1980s, reflecting the needs of early Unix users for consistent output on phototypesetters and terminals.[1]
The -man macro package is designed for formatting Unix manual pages, organizing content into sections like NAME, SYNOPSIS, and DESCRIPTION to ensure standardized documentation across systems. Key macros include .SH for section headings, which sets bold text and adjusts indentation, and .TP for tagged paragraphs, where the first line serves as a non-indented tag followed by indented body text. It assumes one-inch margins and supports basic paragraphing with .PP. This package was developed at Bell Labs as part of early Unix distributions.[15][1]
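A skeleton page gives a sense of how these macros fit together. The command name frob, its option, and the date and version strings are invented for illustration:

```roff
.\" Hypothetical manual page for an invented command named frob.
.TH FROB 1 "2024-01-01" "Example 1.0" "User Commands"
.SH NAME
frob \- frobnicate files
.SH SYNOPSIS
.B frob
.RB [ \-v ]
.I file
.SH DESCRIPTION
.PP
.B frob
reads
.I file
and writes a frobnicated copy to standard output.
.SH OPTIONS
.TP
.B \-v
Print progress messages.
```

The line after .TP is the tag and the lines after that form the indented body, which is the usual way to lay out an option list.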
The -ms macros, originating from Bell Labs and designed by Mike Lesk, target papers, memos, and general correspondence, offering simple yet flexible structuring for professional documents. Notable macros are .TL for titles, which centers and enlarges the text, and .PP for paragraphs, initiating filled and justified text with appropriate spacing. Additional features include .NH for numbered headings and .IP for indented paragraphs with optional tags, making it suitable for reports without excessive complexity.[1][16]
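A short -ms document might look like the following sketch, in which the title, author, and text are placeholders:

```roff
.TL
A Sample Report
.AU
A. N. Author
.AB
A one-sentence abstract.
.AE
.NH 1
Introduction
.PP
The first paragraph is filled and justified.
.IP \(bu 3
An indented paragraph with a bullet tag.
.NH 2
Details
.LP
A flush-left paragraph.
```

The numeric argument to .NH gives the heading level, so the second heading here would be numbered as a subsection of the first.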
The -mm (memorandum) macros are a separate and more elaborate package for business and technical memoranda, reports, and books, rather than a variant of -ms. Developed at Bell Labs by Dale Smith and John R. Mashey with input from Ted Dolotta, the package emphasizes hierarchical organization through numbered headings (.H), from which .TC can generate a table of contents automatically. Other elements support cover pages via .COVER and letters with .LT.[17][1]
The -me macros cater to academic and technical writing, particularly research papers, with built-in support for complex layouts like displays and footnotes to facilitate scholarly composition. Created by Eric Allman for BSD Unix, they provide .sh for numbered sections, .uh for unnumbered ones, and paragraph starters such as .pp (indented) or .ip (with tags). Displays are handled by delimiters like .(b for blocks, .(c for centered blocks, and .(q for quotations, ended by .)b, .)c, or .)q respectively; footnotes use .(f and .)f, with the $f register tracking numbers. This package promotes readability in dense, referenced content.[18][1]
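The fragment below sketches these -me constructs in use; the section titles and text are placeholders, and \** is the string the package provides for the current footnote mark:

```roff
.sh 1 "Introduction"
.pp
An indented paragraph with a footnote.\**
.(f
\** The footnote text.
.)f
.(q
A quotation set off from the text.
.)q
.uh "Acknowledgments"
.ip (a) 5
A tagged paragraph.
```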
For BSD-style manual pages, the -mdoc package emphasizes semantic markup to describe content meaning rather than just presentation, enabling better portability and automated processing. Introduced in 4.3BSD-Reno by Cynthia Livingston, it uses domain-specific macros like .Nm for command names, .Fl for flags, and .Ar for arguments, grouped into categories for text styling, layout, and semantics. Unlike -man, it prioritizes structural intent, such as .Sh for sections and .Ss for subsections, to generate consistent output across tools. The mdoc.samples tutorial page illustrates usage.[19][20]
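For comparison with -man, the sketch below marks up a page for an invented command named frob using -mdoc; the name, option, and date are assumptions made for illustration:

```roff
.Dd January 1, 2024
.Dt FROB 1
.Os
.Sh NAME
.Nm frob
.Nd frobnicate files
.Sh SYNOPSIS
.Nm
.Op Fl v
.Ar file
.Sh DESCRIPTION
The
.Nm
utility reads
.Ar file
and writes a frobnicated copy to standard output.
.Bl -tag -width Ds
.It Fl v
Print progress messages.
.El
```

Each macro records what a word is (a flag, an argument, the utility's name) and leaves the choice of font and brackets to the package.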
Standard macro sets are invoked by specifying the package name after the -m option in troff commands, such as troff -ms file or troff -man file, which loads the corresponding tmac file (e.g., tmac.s for -ms) before processing the input. This method ensures the macros are available from the start of the document. Across all sets, the underlying troff state remains accessible: the point size, for example, is set with the .ps request and can be read from the read-only register \n(.s. Packages usually layer their own registers on top, such as the PS register of -ms, set via .nr PS 10 to select 10-point body text, so that typography can be adjusted without rewriting the macros. Preprocessors like tbl or eqn integrate seamlessly with these macros for tables or equations.[21][22]
Customization and Examples
Troff allows users to extend its functionality through user-defined macros, which are sequences of requests and text that can be invoked by name to automate repetitive formatting tasks. A macro is defined with the .de request followed by the macro name; the body occupies the following lines and is terminated by a line containing only .. (or by a call to a custom end macro named as a second argument). For instance, a macro EX that opens an example block might contain the requests .sp 0.5, .RS, and .nf, each on its own line, with a matching EE macro containing .fi, .RE, and .sp 0.5 to close the block.[23] Macros can accept arguments, accessed via \$1, \$2, etc.; inside a definition the backslash is doubled (\\$1) so that the argument is substituted when the macro is called rather than when it is defined. For example, a macro SM with the body \s-2\\$1\s+2 sets its argument two points smaller and then restores the size.[2] Macros, strings, and diversions share a single global namespace, so a definition remains in effect for the rest of the input until it is redefined or removed.[13]
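Written out one request per line, the EX, EE, and SM macros described above might look like the following sketch. It assumes a macro package that supplies .RS and .RE (such as -man or -ms) is loaded; the sample text is invented:

```roff
.\" EX and EE bracket an unfilled example block.
.\" .RS and .RE are assumed to come from a macro package.
.de EX
.sp 0.5
.RS
.nf
..
.de EE
.fi
.RE
.sp 0.5
..
.\" SM sets its first argument two points smaller.
.\" The doubled backslash defers \$1 until the macro is called.
.de SM
\s-2\\$1\s+2
..
.EX
cc -o hello hello.c
.EE
The
.SM UNIX
system.
```

Some packages already define macros with these names, in which case the definitions here replace them for the rest of the document.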
Macro management includes renaming with .rn old new, which moves the definition to the new name and replaces anything previously defined under it, and removal with .rm name to free storage and prevent conflicts.[13] Troff has no local macro scope: a macro defined anywhere, including inside a diversion or another macro, is visible globally, so temporary helpers should be removed with .rm when they are no longer needed.[2] For instance, .rn PP PARA preserves the standard paragraph macro under the name PARA so that a custom PP can be defined without losing the original (names longer than two characters require an implementation such as GNU groff).[24]
Traps enable automation by invoking macros at specific vertical positions, such as headers and footers, using .wh position macro. A common setup plants traps at the page top (.wh 0 hd) and one inch above the bottom (.wh -1i fo), where hd and fo are macros outputting page content; positions use scale indicators such as i for inches.[13] The .ch macro position request moves an existing trap dynamically.[2]
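A bare-troff page layout along these lines can be sketched as follows. The macro names hd and fo follow the text, while the title text and the half-inch spacings are assumptions:

```roff
.\" Sketch of a page layout without any macro package.
.\" Header macro, called at the top of every page.
.de hd
'sp 0.5i
.\" Three-part title; % prints the current page number.
.tl 'Draft''%'
'sp 0.5i
..
.\" Footer macro, called one inch above the page bottom.
.de fo
'bp
..
.wh 0 hd
.wh -1i fo
```

The requests inside the macros use the no-break control character (') so that a partially filled line of body text is carried over to the next page instead of being flushed early.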
Diversions facilitate advanced layouts by capturing formatted output into a macro for later reuse or manipulation.[2]
Best practices for portability emphasize using standard requests like .sp and .ft over device-specific escapes, testing output across implementations (e.g., AT&T troff and GNU groff) with options like -Tps, and choosing distinctive custom macro names (for example, a common prefix such as MY in implementations that allow long names) to avoid conflicts with standard sets like ms or mm.[2] Remove unused standard macros via .rm if overriding, and favor explicit scale indicators (e.g., p for points) for consistent rendering.[25]
Historically, at Bell Labs, customizations extended standard packages like mm for book publishing, incorporating features such as multi-level numbered headings (.H 1), floating displays for illustrations (.DF), and automatic table of contents (.TC) to handle complex technical volumes like user guides and proposals.[26] These adaptations supported sequential section-page numbering (e.g., via -rN3) and proprietary markings (.PM CR) for professional output.[27]
Preprocessors
Layout and Graphics Preprocessors
Troff's layout and graphics preprocessors enhance the system's text-based formatting by handling complex visual structures such as tables, line drawings, and diagrams, converting specialized input syntax into equivalent troff commands for seamless integration into the document processing pipeline.[28] These tools, developed primarily at Bell Labs in the late 1970s and 1980s, allow users to specify spatial arrangements without directly manipulating troff's low-level positioning primitives, focusing on declarative descriptions of layout elements like alignments, shapes, and connections.[29]
The tbl preprocessor, introduced by M. E. Lesk in 1977, simplifies the creation of formatted tables within troff documents by parsing a concise syntax that defines column structures, alignments, and content.[30] Tables are delimited by .TS (table start) and .TE (table end). An optional first line lists global options, such as center for horizontal centering or expand to fill the available width, and ends with a semicolon; the format lines that follow end with a period and give one key letter per column, such as l for left-aligned, c for centered, r for right-aligned, or n for numeric alignment, where numbers like 13 and 4.2 align on their units digit.[30] Spanning is supported via s for horizontal column spans (e.g., a title across multiple columns) or ^ for vertical row spans, while text blocks use T{ and T} for multi-line entries, and options like delim(xy) allow embedding equations.[30] For instance, a basic table might begin with .TS, the options line center; and the format line l c n. to create left, centered, and numeric columns, followed by data lines and ending with .TE; with macro packages that support it, writing .TS H and closing the heading rows with .TH makes the heading repeat on each page of a multi-page table.[30] This approach made even complex tables, including those with ruled lines via allbox, straightforward to author compared to manual troff spacing.[30]
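A complete small table, as a sketch with invented data, looks like this. The tab(:) option is used here so that fields are separated by colons instead of literal tab characters:

```roff
.TS
center box tab(:);
l c n.
Item:Status:Cost
_
Paper:ok:13
Toner:low:4.2
.TE
```

The line containing a single underscore draws a horizontal rule, and the n column aligns 13 and 4.2 on the units digit.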
Developed by Brian W. Kernighan in 1979, the pic preprocessor enables the creation of line drawings and simple diagrams by describing geometric objects and their relationships in a high-level, English-like syntax that pic translates into troff drawing commands.[31] Pictures are enclosed between .PS and .PE lines, with elements like boxes (box "label" for a default 0.75-inch by 0.5-inch rectangle), circles (circle radius 0.1 for a 0.2-inch diameter circle), lines (line from A to B connecting labeled objects), and arrows, modified by attributes such as with .nw at 1,2 for positioning (coordinates are in inches) or dashed for style.[31] Connections can be relative (e.g., arrow from last box.se to C) or absolute, supporting arcs, ellipses, splines, and text placement, with variables like boxwid adjustable for scaling.[31] An example drawing might include A: box; B: circle at A + (1.5, 0); line from A.e to B.w, producing troff-compatible output that troff renders on the typesetter.[31] Pic's design emphasized simplicity for technical illustrations, outputting device-independent troff primitives for arbitrary positioning.[31]
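The following sketch draws three labeled shapes joined by arrows, relying on pic's default left-to-right placement; the labels and the caption offset are illustrative:

```roff
.PS
A: box "source"
arrow
B: ellipse "troff"
arrow
C: circle radius 0.3 "out"
"a three-stage pipeline" at B.s - (0, 0.3)
.PE
```

Each object is placed where the previous one ended, so no coordinates are needed except for the caption, which is positioned relative to the bottom of the ellipse.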
The ideal preprocessor, created by C. J. Van Wyk in the early 1980s, generates pictures from a constraint-based language in which the positions of points are determined by solving equations rather than by explicit coordinates.[32] Pictures are specified between .IS and .IE lines; the user declares boxes whose variables, representing points as complex numbers, are related by equations, describes the lines and curves to draw between those points, and places instances of boxes with put statements, producing drawing commands that troff processes for final rendering.[33] Because ideal's solver derives every position from the stated relationships, a diagram can be described declaratively and adjusted by changing a constraint rather than by retuning coordinates by hand.[33]
Grap, developed by Brian W. Kernighan and Jon L. Bentley in the mid-1980s, serves as a specialized pic preprocessor for typesetting graphs and charts by accepting data specifications and axis definitions, outputting pic commands for troff integration.[34] Graphs are bounded by .G1 and .G2 lines. Data points are given as lines of numbers, and commands such as label bot "Title", coord x 0,5, ticks bot out from 0 to 5 by 1, and draw solid control labels, axis ranges, tick marks, and line style, enabling scatter plots, line plots, and bar charts through simple declarative input.[35] For example, copy "file.dat" reads data points from a file for automated plotting with customizable ticks and grids, emphasizing ease for quantitative visualizations over pic's general drawing primitives.[35]
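A minimal line plot might be written as in the sketch below. The data values and labels are invented, and because grap emits pic, the file would be run through grap and then pic before reaching troff:

```roff
.G1
label left "requests"
label bot "release"
ticks bot out from 1 to 4 by 1
draw solid
1 10
2 14
3 19
4 25
.G2
```

Each data line gives an x value and a y value; draw solid asks for successive points to be joined by a solid line.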
These preprocessors integrate into the troff workflow via Unix pipelines, where multiple tools chain sequentially. For instance, pic input | tbl | troff processes drawings first, then tables, before troff formats the combined output, allowing flexible invocation without altering core troff syntax.[14] In practice, users embed preprocessor directives directly in source files, with the pipeline handling translation to ensure graphical elements align precisely with surrounding text.[14]
Mathematical and Reference Preprocessors
The eqn preprocessor, developed in 1974 by Brian W. Kernighan and Lorinda L. Cherry at Bell Laboratories, enables the inclusion of complex mathematical expressions in troff documents by translating a simple descriptive language into troff formatting commands. It processes input delimited by .EQ and .EN lines, supporting constructs for fractions, superscripts, integrals, summations, matrices, and font changes such as roman or italic. For instance, the expression {x sup 2} over y produces a fraction with x squared in the numerator and y in the denominator, while int from {a} to {b} f(x) dx renders the integral of f(x) dx from a to b. Matrices are built column by column, as in matrix { ccol { x above y } rcol { 1 above 2 } }, where lcol, ccol, and rcol give left-adjusted, centered, and right-adjusted columns. eqn outputs troff-compatible code that handles variable spacing and scaling for high-quality typesetting on phototypesetters.[36]
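Three small displayed equations, offered as a sketch, illustrate the constructs just described; the expressions themselves are arbitrary:

```roff
.EQ
{x sup 2} over y ~=~ int from a to b f(x) dx
.EN
.EQ
sum from i=1 to n x sub i
.EN
.EQ
matrix {
  ccol { a above c }
  ccol { b above d }
}
.EN
```

The tildes request explicit spaces around the equals sign, since eqn otherwise discards the spaces in its input.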
A companion tool, neqn, serves as a variant of eqn optimized for nroff output on terminal devices, adapting the same syntax to produce readable ASCII approximations of equations for typewriter-like displays.[2] It maintains compatibility with eqn's input format but simplifies rendering, such as approximating fractions with slashed lines or using carets for superscripts, to suit fixed-width fonts without sacrificing essential structure.
The refer preprocessor, introduced in 1978 by Michael E. Lesk at Bell Laboratories, facilitates bibliographic management by scanning documents for citation keywords and inserting formatted references from external databases.[37] References are stored in plain text files with fields prefixed by percent signs, such as %A for author and %T for title, separated by blank lines; an inverted index is built using indxbib for efficient lookup. In the input file, a citation is marked by a line containing .[, one or more lines of keywords, and a closing line containing .], triggering refer to retrieve and insert matching entries as footnotes or endnotes, while commands like .Ah define author headers for sorting. Options such as -sA enable sorting by author, and -p names a private bibliography file to be searched before the default database; tools like lookbib support interactive searches of the index.[38]
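A citation in running text might be marked up as in the sketch below. The database entry is shown only as comments because it lives in a separate file, and the keywords are assumed to match that entry uniquely:

```roff
.\" Assumed database entry, stored in a separate bibliography file:
.\"   %A B. W. Kernighan
.\"   %A L. L. Cherry
.\"   %T A System for Typesetting Mathematics
.\"   %J Communications of the ACM
.\"   %D 1975
.PP
Equations are described in a small language
.[
kernighan cherry typesetting
.]
and translated into troff requests.
```

refer replaces the bracketed keywords with a citation mark and hands the reference text to the macro package, here assumed to be -ms, for placement.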
These preprocessors integrate into a pipeline where refer processes citations first, followed by eqn for equations, before feeding the result to troff; a typical command is refer file | eqn | troff -ms, ensuring inline math and references are resolved sequentially without conflicts.[37] This order allows refer's insertions to be treated as regular text by subsequent stages, enabling seamless embedding in macro packages like ms.
Despite their capabilities, eqn and refer have limitations, including eqn's lack of built-in automatic equation numbering, which requires custom troff macros (e.g., incrementing a register within .EQ blocks) for labeling and cross-referencing.[39] refer relies on exact keyword matches without support for fuzzy or boolean searches, and database changes necessitate index rebuilding, potentially complicating maintenance for large bibliographies.[37]
Implementations
Original AT&T Troff
The original AT&T troff implementation, developed at Bell Laboratories, is a monolithic program written in the C programming language, designed to format text for high-quality typesetting output. Initially created by Joe Ossanna in PDP-11 assembly language in 1973 to drive the Graphic Systems CAT phototypesetter, it was rewritten in C around 1975 to improve portability and maintainability, with further modifications by Brian Kernighan in 1979 to support device-independent intermediate code. This core formatter processes roff input language, handling layout, fonts, and pagination in a single executable that embeds logic for generating output tailored to specific devices via postprocessors, including those for the CAT phototypesetter, Autologic APS-5, and Teletype Model 37 terminal. The -T option selects the target device, such as -Tcat for the CAT, -Taps for the APS-5, or -T37 for the Teletype 37, enabling troff to produce paginated, printable documents optimized for phototypesetting hardware.[14][40]
Troff's architecture emphasized efficiency on contemporary hardware like the DEC VAX, where it was commonly deployed in Unix environments; on such systems, it could drive phototypesetters at rates supporting up to 3000 lines per minute for the APS-5, though typical throughput varied by device and content complexity, with the CAT limited to around 50 lines per minute. Font management relied on structured directories under /usr/lib/font, where subdirectories like devcat or devaps contained device-specific description files (DESC) and width/character metric files for each font, allowing dynamic access to multiple fonts without physical mounting. For terminal emulation in nroff mode (a troff variant for line printers and displays), termcap entries or equivalent TERM specifications in /usr/lib/term provided motion and character capabilities, such as reverse-line support for the Teletype 37. This setup ensured compatibility across output media while maintaining a compact, self-contained binary.[14][40]
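The directory convention described above maps a device name directly to a location on disk. As a rough illustration, the following Python sketch derives the paths from the names given in the text; the helper names are hypothetical, and nothing here checks that the files exist on a given system.

```python
from pathlib import PurePosixPath

# Font root as described in the text above.
FONT_ROOT = PurePosixPath("/usr/lib/font")


def device_dir(device: str) -> PurePosixPath:
    """Directory holding the description and width files for a device."""
    return FONT_ROOT / f"dev{device}"


def desc_file(device: str) -> PurePosixPath:
    """Path of the device description (DESC) file."""
    return device_dir(device) / "DESC"


def device_flag(device: str) -> str:
    """The -T option that selects this device on the command line."""
    return f"-T{device}"


if __name__ == "__main__":
    for dev in ("cat", "aps"):
        print(device_flag(dev), desc_file(dev))
```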
Distribution of the original AT&T troff occurred primarily through licensed Unix releases. During the 1980s it was bundled with System V Release 3 (SVR3) as part of the Documenter's Workbench software package, which included enhanced macro sets and preprocessors for professional document production. It also reached the Berkeley Software Distribution (BSD), whose troff derived from Bell Labs' research versions such as the 7th Edition (1979), allowing academic and research institutions to use it under AT&T's source licensing agreements. Source code availability remained restricted under AT&T's proprietary model, which required expensive licenses and prohibited redistribution, until the early 2000s, when Caldera, a successor rights holder, released older Unix variants, including their troff components, under more open terms.[41][42]
Early versions of troff, through the 7th Edition Unix (1979), underwent iterative fixes for stability and functionality, addressing issues in areas like footnote processing, diversion traps, and multi-column output identified during development. Contributors including Ossanna, Kernighan, Dennis Ritchie, Ken Thompson, and others resolved bugs such as improper handling of size changes in electronic fonts and inconsistencies in terminal motion for nroff, with changes documented in internal Bell Labs memos and reflected in the evolving C codebase up to that edition. These refinements ensured reliable operation for document preparation in research and technical writing at AT&T, though some legacy behaviors, like limited error reporting, persisted in later proprietary releases.[14]
Modern Reimplementations
GNU groff, initiated by James Clark in early 1989 with its first release (version 0.3.1) in June 1990, serves as a complete reimplementation of the original AT&T troff written in C++. It extends the baseline functionality with modern features, including drivers for X11 terminals, output generation in HTML format, and native support for UTF-8 encoding to handle international text. The project remains actively maintained, with contributors such as Bernd Warken involved in documentation and enhancements, ensuring compatibility across Unix-like systems while addressing portability issues from proprietary origins.[43][6][44]
The Heirloom Documentation Tools, developed by Gunnar Ritter starting in 2005 as an open-source package, combines elements from AT&T Unix and BSD variants to provide a hybrid troff implementation focused on preserving legacy behaviors. It emphasizes compatibility with historical macro packages and includes utilities like dpost for generating PostScript output convertible to PDF, facilitating use on contemporary printers without requiring extensive reconfiguration. This approach bridges gaps in older systems by supporting both traditional terminal output and modern document formats.[45][46]
Plan 9 troff, originating in the late 1980s as part of the Plan 9 operating system from Bell Labs and continuing through subsequent ports, offers a streamlined and highly portable reimplementation compared to the original AT&T version. It produces device-independent intermediate output processed by built-in drivers, such as the UTF driver for PostScript and Unicode-enabled displays, enabling seamless adaptation to modern screens and printers. Enhancements like Unicode support, added in 1992–1993, further improve its utility for global text processing.[13][47]
mandoc, initiated in 2008 by Kristaps Dzonsons primarily for OpenBSD and other BSD derivatives, functions as a lightweight alternative formatter with a strong emphasis on the mdoc macro language for manual pages. It provides robust output options including HTML5 and PDF via integrated preprocessing, while offering partial compatibility with troff's man macros through built-in support for soelim, tbl, and eqn. This design prioritizes speed and standards adherence over full troff emulation, making it suitable for documentation in resource-constrained environments.[48][49]
Recent developments since 2020 have centered on groff, with extensions providing Lua scripting integration via the .eval macro (introduced around 2022 and compatible with version 1.23.0, released July 2023) allowing dynamic extensions within documents. Color support has also seen refinements, including better handling of RGB and CMYK spaces up to 16 bits per channel for terminal and PostScript devices, enhancing visual fidelity in outputs. No significant new troff implementations have appeared in this period, as of November 2025, underscoring the maturity of these established projects.[2][50][43]
Applications and Legacy
Role in Unix Systems
Troff has served as the primary formatter for manual pages (man pages) in Unix systems since Version 7, released by AT&T in 1979, where it replaced earlier simple macro sets used in Version 6.[51] These man pages, essential for documenting system commands, libraries, and utilities, rely on the -man macro package to structure content into sections such as NAME, SYNOPSIS, DESCRIPTION, and OPTIONS, enabling consistent output for both terminal display via nroff and high-quality printing via troff.[51] This integration made troff a cornerstone of Unix documentation, allowing developers to produce formatted pages directly from source files processed by the man command.
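To make the section structure concrete, the following Python sketch emits the source for a minimal manual page using the .TH (title) and .SH (section heading) macros of the -man package. The function name and the sample command frob are hypothetical, and the output is deliberately bare: it omits dates, font changes, and the other conventions a real page would follow.

```python
from typing import Iterable, Tuple


def _protect(line: str) -> str:
    """Keep a text line from being read as a request.

    Lines starting with '.' or "'" are control lines in roff input, so
    the zero-width escape \\& is placed in front of them.
    """
    return "\\&" + line if line[:1] in (".", "'") else line


def render_man_source(name: str, section: int, summary: str,
                      sections: Iterable[Tuple[str, str]]) -> str:
    """Return -man macro source with a NAME section plus the given sections."""
    lines = [
        f".TH {name.upper()} {section}",  # title line: page name and section
        ".SH NAME",
        f"{name} \\- {summary}",          # conventional "name \- summary" form
    ]
    for heading, body in sections:
        lines.append(f".SH {heading.upper()}")
        lines.extend(_protect(text) for text in body.splitlines())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_man_source(
        "frob", 1, "frobnicate files",
        [("SYNOPSIS", "frob [file ...]"),
         ("DESCRIPTION", "A hypothetical tool used only for this example.")],
    ), end="")
```

The same source can then be formatted for a terminal with nroff -man or for print with troff -man, which is the single-source property the paragraph above describes.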
In traditional Unix build processes, troff integrates seamlessly with tools like makefiles, where rules invoke troff or its variants to generate man pages during compilation. For instance, Perl documentation employs pod2man to convert POD (Plain Old Documentation) markup into troff-compatible input, which is then formatted for inclusion in system manuals, often as part of Makefile targets for automated distribution.[52] This workflow ensures that utility manuals, such as those for core commands like ls or grep, are generated on-the-fly, maintaining uniformity across Unix environments.
Historically, troff played a pivotal role in producing AT&T research papers and articles for the Bell System Technical Journal, where its capabilities for handling complex layouts, mathematics via eqn, and tables via tbl facilitated the typesetting of technical content at Bell Labs.[53] POSIX standards outline requirements for man page content and structure, including conventions for sections and wording, which are implemented using troff formatting to ensure portability and readability in Unix-like systems.[54] Examples include the troff-formatted pages for system calls like open(2) and utilities like mount(8), which document interfaces and usage on Linux and across the broader Unix ecosystem.[55]
Current Usage and Alternatives
In contemporary Unix-like operating systems, including major Linux distributions and BSD variants, groff—a free implementation of troff—persists as the standard tool for generating manual pages (man pages), which provide essential documentation for system commands and utilities.[15] This usage is embedded in core system packages, ensuring that troff-derived formatting remains integral to software development and administration workflows on platforms like Debian, Fedora, and FreeBSD.[56][57] Additionally, groff supports legacy printing tasks, such as producing PostScript or PDF output for archival or specialized typesetting needs.[58]
Beyond system documentation, troff via groff finds niche applications in technical book formatting, where its precise control over layout suits complex, text-heavy publications like programming guides.[59] For instance, groff's device drivers enable web conversion through commands like groff -Thtml, facilitating the generation of HTML from troff source for online technical resources.[60] These uses highlight troff's enduring role in environments prioritizing markup simplicity over graphical interfaces.
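A conversion of this kind can be scripted. The Python sketch below builds a groff command line for a chosen output device and runs it only if groff is installed; the function names and the file name guide.ms are hypothetical, and error handling is kept to a minimum.

```python
import shutil
import subprocess
from typing import List, Optional


def groff_argv(source: str, device: str = "html",
               macros: Optional[str] = None) -> List[str]:
    """Build the argument list for one groff run."""
    argv = ["groff", f"-T{device}"]   # -T selects the output device
    if macros:
        argv.append(f"-{macros}")     # e.g. "ms" becomes -ms
    argv.append(source)
    return argv


def convert(source: str, device: str = "html",
            macros: Optional[str] = None) -> Optional[bytes]:
    """Run groff and return its output, or None if groff is not installed."""
    if shutil.which("groff") is None:
        return None
    result = subprocess.run(groff_argv(source, device, macros),
                            check=True, capture_output=True)
    return result.stdout


if __name__ == "__main__":
    print(" ".join(groff_argv("guide.ms", macros="ms")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.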
The adoption of troff has declined markedly since the rise of what-you-see-is-what-you-get (WYSIWYG) editors and web-based authoring tools, which offer more intuitive experiences for general document creation. This shift has confined troff primarily to legacy and specialized contexts within open-source ecosystems.
Prominent alternatives to troff include LaTeX, a declarative markup system favored for documents involving intricate mathematics and scientific notation due to its robust macro packages and automated layout features. For simpler technical writing, Markdown—often converted via Pandoc—provides lightweight syntax that prioritizes readability in plain text while supporting export to multiple formats like PDF or HTML. In structured documentation for software projects, DocBook serves as an XML-based standard, enabling modular content management and transformation into various outputs through tools like XSLT.
Ongoing maintenance of the GNU groff project sustains troff's viability for current and legacy applications, with regular updates ensuring compatibility with modern systems.[2]