# Uuencoding
Uuencoding is a binary-to-text encoding method that converts binary data into a format consisting of printable ASCII characters, enabling the safe transmission of non-text files over text-only channels such as early email systems and Usenet.[1] Developed as part of the Unix utilities uuencode and uudecode, it originated in the Berkeley Software Distribution (BSD) 4.0, released in 1980, to facilitate the exchange of binary files like executables and images across networks that could not reliably handle 8-bit data.[2][3]
The encoding algorithm processes input in groups of three bytes (24 bits), dividing them into four 6-bit values ranging from 0 to 63, which are then mapped to printable characters by adding 32 (resulting in ASCII codes 32 to 95, from space to underscore) and adjusted for the local codeset.[1] Encoded output includes a header line specifying file permissions and a decode path, followed by data lines prefixed with a character indicating the number of input bytes encoded on that line (typically up to 45 input bytes per line, producing 60 encoded characters), and ends with an "end" marker.[1] This results in approximately a 35% increase in file size due to the expansion from 8-bit to 6-bit encoding.[1]
Uuencoding became a de facto standard in the early 1980s for Unix-to-Unix Copy Protocol (UUCP) communications and early Internet file transfers, predating more efficient standards like MIME and Base64, which later supplanted it for its inefficiency and lack of formal specification.[4][2] Although largely obsolete today, it remains supported in POSIX-compliant systems and some programming libraries for legacy compatibility.[1]
## Introduction
### History and Development
Uuencoding was invented in 1980 by Mary Ann Horton, a graduate student at the University of California, Berkeley, as a method to transmit binary files over the Unix-to-Unix Copy Protocol (UUCP) email systems, which were limited to 7-bit ASCII text channels.[5][6] This innovation addressed the need to share executable programs and other non-text data across early networked Unix machines without corruption during transmission.[7]
The encoding scheme was first implemented in Berkeley Software Distribution (BSD) Unix, debuting in 4.0BSD, released in 1980, and quickly gained widespread adoption through the standard uuencode and uudecode command-line tools.[2] It became essential for distributing binaries over UUCP-connected bulletin board systems (BBS), Usenet newsgroups, and the nascent Internet, enabling users to exchange files in environments that stripped or altered 8-bit data.[7] Key milestones included its integration into BSD Unix distributions, which facilitated its proliferation in academic and research computing, and the emergence of related encodings adapted for specific systems, such as BinHex, developed in the mid-1980s for Macintosh computers to preserve resource and data forks during transfers.[8][9]
Uuencoding's prominence began to decline in the 1990s with the standardization of the Multipurpose Internet Mail Extensions (MIME) in RFC 1341 (1992), which promoted more efficient encodings like Base64 for binary attachments in email and web protocols.[10] By the mid-1990s, MIME's adoption in major email clients and browsers rendered uuencoding largely obsolete for new applications, though it persisted in legacy contexts. As of 2025, uuencoding is considered a historical artifact but remains implemented in open-source tools such as GNU sharutils for compatibility with older systems.[11]
### Purpose and Applications
Uuencoding was developed to convert binary data into a 7-bit ASCII-safe text format, enabling reliable transmission over network protocols that could corrupt or alter 8-bit binary data, such as early implementations of SMTP for email.[5][12] This encoding ensures that arbitrary binary files could be embedded in text-based messages without loss, addressing the limitations of systems that only supported 7-bit channels.[2]
Its primary applications emerged in the pre-MIME era of the 1980s and 1990s, including attaching binary files to email messages, posting to Usenet newsgroups, and transferring data via modems over text-only connections like UUCP networks.[5][2][13] By mapping data to printable ASCII characters in the range 32 (space) to 95 (underscore), uuencoding promoted portability across diverse systems with varying character encodings and ensured compatibility with text processors that might strip or modify non-printable bytes.[14][12]
As of 2025, uuencoding persists in rare niche contexts, such as legacy UNIX scripts, certain embedded systems, and tools for preserving historical archives, including its use in FreeBSD source code to embed binary data as text.[15][16] However, it sees no significant new applications, having been supplanted by more efficient standards like MIME Base64 for modern data transfer needs.[12][17]
## Encoding Process
The output of uuencoding follows a standardized structure designed for reliable transmission of binary data over text-based channels, consisting of a header, body, and footer.[18] The header begins with a line in the format begin <mode> <filename>, where <mode> specifies the octal representation of the file's permissions (such as 644 for typical user-readable files) and <filename> indicates the original name of the encoded file.[19] This line provides essential metadata for decoding and reconstruction, ensuring the recipient can restore the file with appropriate attributes.[20]
The body comprises multiple lines, each starting with a single ASCII character that denotes the number of input bytes encoded in that line (ranging from 0 to 45). Per the POSIX specification, this length character is the count plus 32 (space for 0, up to capital M for 45).[18] However, common implementations use the same character mapping as the data (grave accent ` for 0 to avoid space stripping, up to M).[21] Following this count character, the line contains up to 60 encoded characters derived from the binary data using a fixed character mapping, resulting in full lines of exactly 61 characters plus a newline (totaling 62 characters including the terminator).[19] These lines collectively represent the entire binary content, with shorter lines possible at the end if the input length is not a multiple of 45 bytes.[20]
The structure concludes with a footer signaling the end of the encoded data: per the specification, a final line containing solely `end`.[18] For compatibility with many decoders, common implementations precede it with a zero-length line (a space per the specification, or a grave accent in practice), followed by the `end` line.[21][19] While the core format is rigid, some implementations may include optional comment lines or additional metadata outside the standard header-body-footer, though these are ignored by compliant decoders and not part of the original specification.[20]
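The header, body, and footer layout described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `uuencode_text` and its parameters are hypothetical, the body lines come from the standard library's `binascii.b2a_uu`, and the zero-length line is assumed to use a space (the specification's form) rather than a grave accent.

```python
import binascii


def uuencode_text(data: bytes, filename: str, mode: int = 0o644) -> str:
    """Wrap data in a begin/body/end envelope (hypothetical helper)."""
    lines = [f"begin {mode:o} {filename}"]           # header: octal mode and file name
    for start in range(0, len(data), 45):            # at most 45 input bytes per line
        chunk = data[start:start + 45]
        # b2a_uu returns the length character, the data characters and "\n"
        lines.append(binascii.b2a_uu(chunk).decode("ascii").rstrip("\n"))
    lines.append(" ")                                # zero-length line: chr(0 + 32)
    lines.append("end")                              # footer
    return "\n".join(lines) + "\n"


text = uuencode_text(bytes(100), "zeros.bin")
print(text.splitlines()[0])                          # begin 644 zeros.bin
```

A 100-byte input gives two full lines starting with `M` (45 bytes each) and one short line starting with `*` (10 bytes).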
### Step-by-Step Mechanism
The uuencoding mechanism processes binary input data by grouping it into sets of three bytes, equivalent to 24 bits, to facilitate conversion into printable ASCII characters suitable for text-based transmission protocols. This grouping allows each set to be divided into four 6-bit segments, where each segment's value (ranging from 0 to 63) is offset by adding 32 (0x20 in hexadecimal) to map it to an ASCII character between space (ASCII 32) and underscore (ASCII 95).[1] In common implementations, value 0 is mapped to grave accent ` (96) instead of space to prevent alteration during transmission.[21]
To extract the 6-bit groups from three input bytes denoted as A (most significant), B, and C (least significant), the following bit manipulations are applied:
group1 = (A >> 2) & 0x3F
group2 = ((A << 4) | (B >> 4)) & 0x3F
group3 = ((B << 2) | (C >> 6)) & 0x3F
group4 = C & 0x3F
Each group value is then incremented by 0x20 (with the common substitution for 0 as noted) to yield the corresponding output character. Lines of encoded data are prefixed with a length octet, which represents the number of input bytes in that line (up to 45 for full lines) and is encoded using the same offset method (space or ` for 0).[1]
When the number of input bytes on a line is not a multiple of three, the final group is padded with zero bits to complete the 24-bit block, so four output characters are still generated. The line's length character records the true number of input bytes, which tells the decoder that only 1 or 2 bytes of the final group are valid and that the bytes produced by the padding must be discarded. This length handling provides only a basic consistency check; the standard format carries no additional checksums.[1]
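The grouping, offset, and padding rules above can be sketched in pure Python. The helper names `encode_group` and `encode_line` are illustrative only, and the `zero_char` parameter is an assumption that lets the caller choose between the grave-accent convention and a plain space for value 0.

```python
def encode_group(chunk: bytes, zero_char: str = "`") -> str:
    """Encode 1 to 3 input bytes as four uuencode characters."""
    a, b, c = (chunk + b"\x00\x00")[:3]              # pad a short final group with zero bits
    groups = (
        (a >> 2) & 0x3F,
        ((a << 4) | (b >> 4)) & 0x3F,
        ((b << 2) | (c >> 6)) & 0x3F,
        c & 0x3F,
    )
    # Add 0x20 to each 6-bit value, substituting zero_char for value 0
    return "".join(zero_char if g == 0 else chr(g + 0x20) for g in groups)


def encode_line(chunk: bytes, zero_char: str = "`") -> str:
    """Encode up to 45 input bytes as one body line (without the newline)."""
    if len(chunk) > 45:
        raise ValueError("a uuencoded line holds at most 45 input bytes")
    length_char = zero_char if len(chunk) == 0 else chr(len(chunk) + 0x20)
    body = "".join(encode_group(chunk[i:i + 3], zero_char)
                   for i in range(0, len(chunk), 3))
    return length_char + body


print(encode_line(b"Man"))                           # #36%N
```

The length character always reflects the real byte count, so a decoder can drop the bytes that come from the zero padding.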
Decoding reverses this process: 32 is subtracted from the length character to determine the number of valid output bytes for the line (up to 45); when that count is not a multiple of three, only the first N bytes (1 or 2) of the final group are retained. Each subsequent data character has 32 subtracted from it, and the result is typically masked to six bits, which also maps the grave accent (96) back to 0, to recover the 6-bit group value. These groups, denoted g1 to g4, are recombined into bytes using inverse bit shifts, with each result truncated to 8 bits:
byte1 = ((g1 << 2) | (g2 >> 4)) & 0xFF
byte2 = ((g2 << 4) | (g3 >> 2)) & 0xFF
byte3 = ((g3 << 6) | g4) & 0xFF
Only the first N bytes are retained, ignoring any padded portions. Some variants of uuencoding append a 6-bit checksum (encoded similarly) after the data characters on each line for enhanced error detection, though standard implementations rely solely on the length octet for validation.[1][21]
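The decoding steps can be sketched the same way. The function name `decode_line` is hypothetical, and the lenient handling of lines whose trailing spaces were stripped is an assumption about what a tolerant decoder might do, not something the format requires.

```python
def decode_line(line: str) -> bytes:
    """Decode one uuencoded body line (without its newline) to bytes."""
    if not line:
        return b""
    # Subtract 32 and mask to 6 bits; the mask maps "`" (96) back to 0.
    values = [(ord(ch) - 0x20) & 0x3F for ch in line]
    count, groups = values[0], values[1:]
    needed = -(-count // 3) * 4                      # data characters implied by the count
    groups = (groups + [0] * needed)[:needed]        # assumption: restore stripped spaces
    out = bytearray()
    for i in range(0, needed, 4):
        g1, g2, g3, g4 = groups[i:i + 4]
        out.append(((g1 << 2) | (g2 >> 4)) & 0xFF)
        out.append(((g2 << 4) | (g3 >> 2)) & 0xFF)
        out.append(((g3 << 6) | g4) & 0xFF)
    return bytes(out[:count])                        # keep only the declared byte count


print(decode_line("#36%N"))                          # b'Man'
```

Slicing the output to `count` bytes is what discards the bytes produced by zero padding in the last group.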
### Character Mapping Table
Uuencoding maps each 6-bit value from 0 to 63 to one of 64 printable ASCII characters, ensuring compatibility with 7-bit text channels without requiring escape sequences.[12] The core mapping adds 32 (the ASCII value of space) to the 6-bit value, producing characters from space (0x20) to underscore (0x5F); however, to mitigate risks of space characters being stripped or altered during transmission over email or news systems, the value 0 is commonly mapped to the grave accent (0x60) instead.[12][22]
Early implementations of uuencoding, such as those in original Unix systems, used space for the value 0 without substitution.[22] While some modern tools permit customization of the alphabet, the conventional mapping adheres to printable ASCII characters in the range 0x20 to 0x7E, excluding the delete character (DEL, 0x7F), to maintain portability across 7-bit clean environments.[12] The table below presents the common mapping (with ` for 0):
| Value | Character |
|---|---|
| 0 | ` |
| 1 | ! |
| 2 | " |
| 3 | # |
| 4 | $ |
| 5 | % |
| 6 | & |
| 7 | ' |
| 8 | ( |
| 9 | ) |
| 10 | * |
| 11 | + |
| 12 | , |
| 13 | - |
| 14 | . |
| 15 | / |
| 16 | 0 |
| 17 | 1 |
| 18 | 2 |
| 19 | 3 |
| 20 | 4 |
| 21 | 5 |
| 22 | 6 |
| 23 | 7 |
| 24 | 8 |
| 25 | 9 |
| 26 | : |
| 27 | ; |
| 28 | < |
| 29 | = |
| 30 | > |
| 31 | ? |
| 32 | @ |
| 33 | A |
| 34 | B |
| 35 | C |
| 36 | D |
| 37 | E |
| 38 | F |
| 39 | G |
| 40 | H |
| 41 | I |
| 42 | J |
| 43 | K |
| 44 | L |
| 45 | M |
| 46 | N |
| 47 | O |
| 48 | P |
| 49 | Q |
| 50 | R |
| 51 | S |
| 52 | T |
| 53 | U |
| 54 | V |
| 55 | W |
| 56 | X |
| 57 | Y |
| 58 | Z |
| 59 | [ |
| 60 | \ |
| 61 | ] |
| 62 | ^ |
| 63 | _ |
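The table can also be generated rather than memorized. The sketch below assumes the common convention shown above (grave accent for 0); the helper name `uu_char` and its `grave_for_zero` flag are illustrative.

```python
def uu_char(value: int, grave_for_zero: bool = True) -> str:
    """Map a 6-bit value (0-63) to its uuencode character."""
    if not 0 <= value <= 63:
        raise ValueError("value must fit in 6 bits")
    if value == 0 and grave_for_zero:
        return "`"                                   # common substitute for space
    return chr(value + 32)                           # core rule: add the ASCII code of space


table = {value: uu_char(value) for value in range(64)}
print(table[0], table[33], table[45], table[63])     # ` A M _
```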
## Examples
### Simple Encoding Illustration
To illustrate the basic uuencoding mechanism, consider a simple 3-byte input consisting of the ASCII characters "Man", with byte values 77 (M), 97 (a), and 110 (n).[1]
The encoding process begins by treating these 3 bytes as a 24-bit stream: 01001101 01100001 01101110 (in binary, MSB first). This stream is divided into four 6-bit groups: 010011 (decimal 19), 010110 (22), 000101 (5), and 101110 (46). Each 6-bit value is then mapped to a printable ASCII character by adding 32 (0x20), yielding 51 ('3'), 54 ('6'), 37 ('%'), and 78 ('N'). The output line is prefixed with an encoded byte count indicating the number of input bytes in this group (3 + 32 = 35, or '#'), resulting in the encoded line "#36%N".[1]
This mapping uses the formula for three input bytes A, B, C: first character = 32 + ((A >> 2) & 63), second = 32 + ((((A << 4) | ((B >> 4) & 15)) & 63)), third = 32 + ((((B << 2) | ((C >> 6) & 3)) & 63)), fourth = 32 + (C & 63). The resulting characters are from the printable ASCII range 32 (space) to 95 (underscore).[1]
To verify, decoding reverses the process: subtract 32 from each of the four characters to recover the 6-bit values (19, 22, 5, 46), then recombine the bits into a 24-bit stream—shifting and masking to form the original bytes 77, 97, and 110—yielding "Man" again. This demonstrates the lossless nature of the encoding for complete 3-byte groups.[1]
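The arithmetic in this illustration can be reproduced in a few lines of Python and cross-checked against the standard library's `binascii` module. The variable names are illustrative.

```python
import binascii

data = b"Man"
bits = int.from_bytes(data, "big")                   # the 24-bit stream as one integer
groups = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
chars = "".join(chr(g + 32) for g in groups)         # add 32 to each 6-bit value
line = chr(len(data) + 32) + chars                   # prefix the encoded byte count
print(groups, line)                                  # [19, 22, 5, 46] #36%N

# Cross-check against the standard library's encoder and decoder
assert binascii.b2a_uu(data) == (line + "\n").encode("ascii")
assert binascii.a2b_uu(line) == data
```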
### Complete File Encoding
To illustrate complete file encoding with uuencoding, consider a simple text file named "hello.txt" containing the 11-byte string "Hello World", saved with Unix file permissions mode 644 (readable and writable by owner, readable by group and others).
The full uuencoded output, generated using standard uuencoding conventions, appears as follows (note: trailing spaces in lines are significant):[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)
```
begin 644 hello.txt
+2&5L;&\@5V]R;&0 
 
end
```
This includes the header line specifying the mode and [filename](/page/Filename), the encoded [body](/page/Body) (with the first body line starting with '+' indicating 11 bytes in that line, followed by the 6-bit encoded data ending with a space for [padding](/page/Padding)), a zero-length line containing a single space to signal the end of data, and the closing "end" marker.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)
The decoding process begins by identifying the "begin" and "end" markers to extract the body content, reading the mode and filename from the header and discarding the footer. For each body line, subtract 32 from the first character's ASCII value to obtain the byte length for that line (e.g., '+' yields 11). Then, take the subsequent characters (up to 60 per line, excluding the length indicator), interpret them as 6-bit values by subtracting 32 from each ASCII value, and reassemble them into 8-bit bytes by combining four 6-bit groups into three 8-bit bytes (discarding padding zeros from incomplete groups). Sum the lengths across lines to verify the total output size (11 bytes here), and write the resulting bytes to a file with the specified mode. The character mapping table can be referenced for verification of individual 6-bit to ASCII conversions during this step.
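The decoding procedure above might look as follows in Python. This is a sketch under stated assumptions: `uudecode_text` is a hypothetical helper, each body line is decoded with the standard library's `binascii.a2b_uu`, and the result is returned instead of being written to disk.

```python
import binascii


def uudecode_text(text: str) -> tuple[int, str, bytes]:
    """Return (mode, filename, data) from a complete uuencoded text."""
    lines = text.splitlines()
    start = next((i for i, line in enumerate(lines)
                  if line.startswith("begin ")), None)
    if start is None:
        raise ValueError("missing 'begin' line")
    _, mode, filename = lines[start].split(" ", 2)   # header: begin <mode> <filename>
    data = bytearray()
    for line in lines[start + 1:]:
        if line == "end":                            # footer reached
            break
        if line:
            data += binascii.a2b_uu(line)            # honours the length character
    else:
        raise ValueError("missing 'end' line")
    return int(mode, 8), filename, bytes(data)


sample = "begin 644 hello.txt\n" + r"+2&5L;&\@5V]R;&0 " + "\n \nend\n"
mode, name, payload = uudecode_text(sample)
print(oct(mode), name, payload)                      # 0o644 hello.txt b'Hello World'
```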
On [Unix-like](/page/Unix-like) systems, command-line tools facilitate end-to-end encoding and decoding. To encode, use `uuencode hello.txt hello.txt > output.uue`: the first argument is the input file, the second is the name recorded in the `begin` line for the decoder, and the encoded text is written to standard output, which the shell redirects to "output.uue" (overwriting it if it exists). To decode, run `uudecode output.uue`, which parses the file, reconstructs the original "hello.txt" with mode 644, and places it in the current directory (overwriting it if it exists). These tools originated in early Unix systems and remain available in modern implementations like [GNU](/page/GNU) sharutils.
## Variants
### Handling File Forks
In file systems supporting multi-part structures, such as the Macintosh Hierarchical File System (HFS) and [Microsoft](/page/Microsoft) Windows [NTFS](/page/NTFS), files can include a primary data fork alongside additional components like resource forks or alternate data streams that store [metadata](/page/Metadata), application-specific resources, or extended attributes. Uuencoding, originating from Unix environments, processes only the primary data fork by default, ignoring these secondary elements unless explicitly handled otherwise.[](https://tidbits.com/1998/08/31/macintosh-internet-file-format-primer/)[](https://www.oreilly.com/library/view/mac-os-x/0596003706/re397.html)
For instance, in HFS, the [resource fork](/page/Resource_fork)—containing elements like icons, menus, and fonts—is not encoded, potentially rendering decoded files incomplete on Macintosh systems, while [NTFS](/page/NTFS) alternate data streams, used for similar [metadata](/page/Metadata) purposes, face analogous exclusion during encoding.[](https://tidbits.com/1998/08/31/macintosh-internet-file-format-primer/)[](https://www.oreilly.com/library/view/mac-os-x/0596003706/re397.html) Secondary forks thus require manual separation and individual encoding, complicating cross-platform transfers.
A common [workaround](/page/Workaround) involves tools like MacBinary, which first combines the data fork, [resource fork](/page/Resource_fork), and file [metadata](/page/Metadata) (such as type and creator codes) into a single [binary file](/page/Binary_file), adding a 128-byte header for integrity and compatibility; this wrapped file can then be uuencoded for safe transmission over ASCII-limited channels like early [email](/page/Email) systems.[](https://www.savagetaylor.com/TIL/KB007328.html)
This oversight in uuencoding contributed to frequent [data loss](/page/Data_loss) in cross-platform file exchanges during the [1980s](/page/1980s) and [1990s](/page/1990s), especially for Macintosh applications shared via Unix-based networks, where [resource fork](/page/Resource_fork) information was discarded en route.[](https://tidbits.com/1998/08/31/macintosh-internet-file-format-primer/) Such problems were mitigated in later protocols, including [MIME](/page/MIME) multipart formats, which support sending multiple file components as distinct parts within a single message, preserving both primary data and [metadata](/page/Metadata) streams.
### Handling Resource Forks
In the Macintosh [Hierarchical File System](/page/Hierarchical_File_System) (HFS), the [resource fork](/page/Resource_fork) of a file serves as a structured repository for non-data elements, including icons, menu definitions, window layouts, executable code segments, and other application-specific resources.[](https://www.computerlanguage.com/results.php?definition=resource%2Bfork) Standard uuencoding processes only the data fork of Macintosh files, completely omitting the [resource fork](/page/Resource_fork) and thereby failing to preserve these essential components during transmission.[](http://macintoshsisters.freeservers.com/faqs/faq8.html)
To address this limitation, Yves Lempereur developed BinHex 4.0, a specialized encoding tool that integrates both the data and resource forks into a single binary stream, prepends a header with metadata such as file type, creator, and fork lengths, applies [run-length encoding](/page/Run-length_encoding) for compression, and finally converts the result to ASCII-safe text using its own 64-character, 6-bit encoding, which is similar in principle to uuencoding but uses a different alphabet.[](http://macintoshsisters.freeservers.com/faqs/faq8.html)[](https://www.rfc-editor.org/rfc/rfc1741.html) This wrapper format ensures full fidelity of Macintosh file structures when transferred over text-only protocols like [email](/page/Email).[](https://www.rfc-editor.org/rfc/rfc1741.html)
A typical [use case](/page/Use_case) involved creating a BinHex-encoded file from a complete Macintosh application or document, allowing recipients to decode and reconstruct both forks intact on another Mac system.[](https://www.savagetaylor.com/TIL/KB018758.html) By 2025, however, BinHex and resource fork handling have become obsolete, as modern macOS has phased out native resource forks in favor of [extended file attributes](/page/Extended_file_attributes) since the transition to Mac OS X in 2001, rendering such legacy solutions unnecessary for contemporary file transfers.
## Comparisons
### To xxencode
xxencoding represents a refinement of uuencoding, specifically tailored for safer transmission over [Usenet](/page/Usenet) and [email](/page/Email) systems in the 1990s. Developed by Nelson H. F. Beebe at the [University of Utah](/page/University_of_Utah) in 1990, xxencoding addressed limitations in uuencoding's character set, which predates it by about a decade and originated in the early 1980s as part of [Berkeley Software Distribution](/page/Berkeley_Software_Distribution) (BSD) utilities for Unix systems.[](https://www.math.utah.edu/~beebe/support/myman2html/testdata/okay/xxencode.html)[](https://www.techtarget.com/searchnetworking/definition/Uuencode)
A primary distinction lies in the character sets employed. xxencoding utilizes a 64-character set limited to alphanumeric characters and two symbols: A–Z, a–z, 0–9, +, and -. This selection deliberately avoids non-alphanumeric symbols, enhancing text safety by reducing the risk of alteration or filtering by mail gateways or [Usenet](/page/Usenet) moderators, which often stripped or modified special characters like backticks, quotes, or other [punctuation](/page/Punctuation) present in uuencoding's broader 64-character printable ASCII range (from [space](/page/Space) to [underscore](/page/Underscore)).[](https://www.math.utah.edu/~beebe/support/myman2html/testdata/okay/xxencode.html)
In terms of efficiency, both formats adhere to a 3:4 [byte-to-character encoding ratio](/page/Ratio), processing three binary bytes into four encoded characters, and feature comparable line structures with data lines typically comprising around 60 characters. Like uuencoding, xxencoding begins each line with a length character, drawn from its own alphabet ('h' for a full 45-byte line), so the per-line overhead is essentially the same, and both schemes generally increase [file size](/page/File_size) by about 33–35% overall.[](https://www.math.utah.edu/~beebe/support/myman2html/testdata/okay/xxencode.html)
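The difference in alphabets can be illustrated with a short Python sketch. The 64-character string below is the xxencode table as commonly documented (`+`, `-`, digits, uppercase, lowercase); the function name `xx_encode_line` and the zero-bit padding of short groups are assumptions made for illustration.

```python
XX_ALPHABET = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def xx_encode_line(chunk: bytes) -> str:
    """Encode up to 45 bytes as one xxencode-style line (illustrative sketch)."""
    if len(chunk) > 45:
        raise ValueError("at most 45 input bytes per line")
    out = [XX_ALPHABET[len(chunk)]]                  # length character from the same table
    for i in range(0, len(chunk), 3):
        a, b, c = (chunk[i:i + 3] + b"\x00\x00")[:3] # zero-pad a short final group
        for value in ((a >> 2) & 0x3F,
                      ((a << 4) | (b >> 4)) & 0x3F,
                      ((b << 2) | (c >> 6)) & 0x3F,
                      c & 0x3F):
            out.append(XX_ALPHABET[value])           # table lookup instead of adding 32
    return "".join(out)


print(xx_encode_line(b"Man"))                        # 1HK3i
```

The bit grouping is identical to uuencoding; only the lookup from 6-bit value to character changes.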
### To Base64
Base64 is a [binary-to-text encoding](/page/Binary-to-text_encoding) scheme that represents arbitrary sequences of octets using a 64-character [alphabet](/page/Alphabet) consisting of the uppercase letters A–Z, lowercase letters a–z, decimal digits 0–9, and the symbols + and /, with padding provided by the = character to ensure proper decoding of incomplete groups.[](https://www.ietf.org/rfc/rfc2045.txt) Defined in RFC 2045 as part of the [Multipurpose Internet Mail Extensions (MIME)](/page/MIME) standard in 1996, [Base64](/page/Base64) was specifically designed to encode [binary data](/page/Binary_data) for safe transmission over text-based [email](/page/Email) protocols without corruption.[](https://www.ietf.org/rfc/rfc2045.txt)
While uuencoding also employs a 64-character set drawn from printable ASCII characters, Base64's alphabet is more universally standardized and avoids certain punctuation that could interfere with modern protocols. Both schemes incur approximately 33% overhead by encoding 3 bytes of [binary data](/page/Binary_data) into 4 characters, reflecting the conversion from 8-bit to 6-bit groups.[](https://pubs.opengroup.org/onlinepubs/9699969699/utilities/uuencode.html)[](https://www.ietf.org/rfc/rfc2045.txt) However, Base64 omits the per-line length prefix that uuencoding requires, as well as the `begin`/`end` framing, yielding a simpler, more compact stream without embedded metadata.[](https://www.ietf.org/rfc/rfc2045.txt)
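Because both schemes split the input into the same 6-bit groups, they differ only in alphabet, padding, and framing. The sketch below uses Python's standard `base64` and `binascii` modules to show this for one short input; the variable names are illustrative.

```python
import base64
import binascii

B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

data = b"Hello World"
b64 = base64.b64encode(data).decode("ascii")         # 'SGVsbG8gV29ybGQ='
uu = binascii.b2a_uu(data).decode("ascii").rstrip("\n")

# Recover the underlying 6-bit values from each encoding
uu_values = [(ord(ch) - 32) & 0x3F for ch in uu[1:]] # skip the length character
b64_values = [B64_ALPHABET.index(ch) for ch in b64.rstrip("=")]

print(len(b64), len(uu))                             # 16 17
print(uu_values[:len(b64_values)] == b64_values)     # True
```

The uuencoded line is one character longer because of its length prefix, while Base64 marks the short final group with `=` instead.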
The structured line format of uuencoding, in which every line declares its own length, allows decoders to identify and skip lines whose character count does not match the declared length, although the standard format carries no checksum; Base64's continuous stream places less emphasis on such per-line checks. In contrast, Base64's streamable nature suits seamless integration in protocols, contributing to its dominance in 2025 for web and email applications, such as embedding binary resources via data URIs in [HTML](/page/HTML) and CSS.[](https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Schemes/data)[](https://shiftasia.com/column/base64-encoding-a-comprehensive-overview-for-modern-data-transmission/)
### To Ascii85
Ascii85, also known as Base85, is a [binary-to-text encoding](/page/Binary-to-text_encoding) scheme developed by [Adobe](/page/Adobe) Systems that represents four bytes of binary data using five printable ASCII characters in the range from '!' (ASCII 33) to 'u' (ASCII 117).[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf) This approach yields an overhead of approximately 25%, derived from the ratio of five output characters to four input bytes, in contrast to uuencoding's expansion of about 33% from its core 3-to-4 byte-to-character mapping, plus an additional ~2% from per-line length characters and framing, resulting in a total of roughly 35%.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf)
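The overhead figures can be checked empirically with Python's standard library. The 180-byte sample below is an arbitrary choice that divides evenly by 3, 4, and 45 and contains no all-zero 4-byte group, which `base64.a85encode` would otherwise abbreviate to a single `z`.

```python
import base64
import binascii

data = bytes(range(1, 181))                          # 180 bytes, no all-zero groups

a85 = base64.a85encode(data)                         # 5 characters per 4 bytes
b64 = base64.b64encode(data)                         # 4 characters per 3 bytes
uu_lines = [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]
uu_body = b"".join(uu_lines)                         # includes length chars and newlines

for name, encoded in (("Ascii85", a85), ("Base64", b64), ("uuencode", uu_body)):
    print(f"{name:9s} {len(encoded):4d} bytes  x{len(encoded) / len(data):.3f}")
```

The uuencode figure here counts each newline as one byte and excludes the `begin` and `end` lines.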
Ascii85 finds primary application in Adobe's [PostScript](/page/PostScript), PDF, and related formats for embedding arbitrary binary data, such as compressed images, fonts, and streams, directly within documents without introducing fixed line breaks or group delimiters beyond optional PDF-wide 255-character line constraints.[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf) In these contexts, it enables efficient inclusion of non-textual elements while maintaining 7-bit ASCII compatibility, often paired with [compression](/page/Compression) filters like LZWDecode for further optimization.[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf)
Uuencoding, by contrast, processes input in rigid 45-byte groups per line, producing 60 encoded characters plus a length prefix, which imposes throughput limitations and requires explicit line termination for transport compatibility.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html) This structure suits early Unix [email](/page/Email) and [file transfer](/page/File_transfer) needs but adds parsing overhead for large files, where Ascii85's continuous streaming proves more efficient, though its denser, punctuation-heavy output lacks uuencoding's semi-structured readability from prefixed lines.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf)
## Limitations
### Inefficiencies and Overhead
Uuencoding introduces a significant size expansion when converting [binary data](/page/Binary_data) to text, primarily because it encodes every three input bytes into four output characters using a 64-character printable ASCII set.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html) This basic transformation results in approximately a 33% overhead relative to the original data size, because each 8-bit output character carries only 6 bits of input data.[](https://compile7.org/compare-binary-encoding/what-is-difference-between-uuencoding-vs-z85-zeromq-spec32z85/) Additional control information, including per-line length indicators and headers, increases the overall expansion to about 35% for typical files.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)
The format uses lines of up to 61 characters (excluding the trailing [newline](/page/Newline)), each beginning with a length indicator (the number of input bytes for that line plus 32, translated to the local codeset) followed by up to 60 encoded characters that represent a maximum of 45 input bytes.[](https://www.dcode.fr/uu-encoding) When the input length is not a multiple of 45 bytes, the last data line is shorter, and when it is not a multiple of 3, the final group is padded with zero bits. The length character adds one character per 60 data characters (about 1.7%), the newline adds about as much again, and the begin/end lines are a fixed cost that is negligible for large files.[](https://superuser.com/questions/568506/how-much-larger-does-uuencode-make-binary-files) Unlike modern formats, uuencoding lacks built-in [compression](/page/Compression), so the output remains uncompressed and vulnerable to redundancy in the source data.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)
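The size arithmetic can be made concrete with a small calculator. The function name `uuencoded_size` and its defaults are illustrative; it assumes one-byte newlines and a one-character zero-length line before `end`. Under those assumptions a full line stores 45 input bytes in 62 output bytes (about 37.8% expansion), or 61 bytes if the newline is not counted (about 35.6%).

```python
def uuencoded_size(n_bytes: int, filename: str = "file.bin", mode: int = 0o644) -> int:
    """Predict the total size in bytes of a uuencoded file (illustrative sketch)."""
    header = len(f"begin {mode:o} {filename}\n")
    full_lines, rest = divmod(n_bytes, 45)
    body = full_lines * 62                           # length char + 60 data chars + newline
    if rest:
        body += 1 + 4 * -(-rest // 3) + 1            # short last line, padded to 4-char groups
    footer = len(" \nend\n")                         # zero-length line plus "end"
    return header + body + footer


for n in (11, 45_000, 1_000_000):
    size = uuencoded_size(n)
    print(f"{n:>9} bytes -> {size:>9} bytes  (x{size / n:.4f})")
```

For small inputs the fixed header and footer dominate, while for large inputs the ratio approaches 62/45.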
Processing uuencoded data demands buffering input into fixed 3-byte groups for encoding or decoding, as the algorithm splits octets into 6-bit segments before mapping them to characters.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html) This batch-oriented approach incurs computational overhead compared to more streamable encodings like [Base64](/page/Base64), particularly on contemporary hardware where sequential processing is optimized but line-based formatting adds parsing steps.[](https://news.ycombinator.com/item?id=38343748) To mitigate these inefficiencies, files are often pre-compressed using tools like [gzip](/page/Gzip) before uuencoding, which can reduce the net size increase but introduces extra encoding and decoding steps in the workflow.[](https://labex.io/tutorials/linux-linux-uuencode-command-with-practical-examples-422991)
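The compress-then-encode workflow can be sketched with Python's `gzip` module and the `uu` codec. The sample data is deliberately repetitive so that the effect is visible; real files will compress less.

```python
import codecs
import gzip

data = b"uuencode " * 2000                           # 18,000 highly repetitive bytes

plain = codecs.encode(data, "uu")                    # uuencode the raw data
packed = codecs.encode(gzip.compress(data), "uu")    # gzip first, then uuencode

print(len(data), len(plain), len(packed))

# The receiver reverses both steps in the opposite order
restored = gzip.decompress(codecs.decode(packed, "uu"))
assert restored == data
```

The extra step pays off only when the input is compressible; already-compressed data simply expands by the usual uuencoding ratio.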
### Compatibility Issues
```
begin 644 hello.txt
+2&5L;&\@5V]R;&0 
 
end
```
This includes the header line specifying the mode and [filename](/page/Filename), the encoded [body](/page/Body) (with the first body line starting with '+' indicating 11 bytes in that line, followed by the 6-bit encoded data ending with a space for [padding](/page/Padding)), a zero-length line containing a single space to signal the end of data, and the closing "end" marker.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)
The decoding process begins by identifying the "begin" and "end" markers to extract the [body](/page/Body) content, discarding the header and footer. For each body line, subtract [32](/page/32) from the first character's ASCII value to obtain the byte length for that line (e.g., '+' yields 11). Then, take the subsequent characters (up to 60 per line, excluding the length indicator), interpret them as 6-bit values by subtracting [32](/page/32) from each ASCII value, and reassemble into 8-bit bytes by combining four 6-bit groups into three 8-bit bytes (discarding [padding](/page/Padding) zeros from incomplete groups). Sum the lengths across lines to verify the total output size (11 bytes here), and write the resulting bytes to a [file](/page/File) with the specified [mode](/page/Mode). The [character](/page/Character) mapping [table](/page/Table) can be referenced for verification of individual 6-bit to ASCII conversions during this step.
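The per-line steps described above can be sketched in Python as follows. This is a minimal illustration rather than a reference decoder: the function name `uudecode_line` is hypothetical, and the sketch assumes that characters lost to trailing-space stripping should be treated as spaces (value 0).

```python
def uudecode_line(line: str) -> bytes:
    """Decode one uuencoded body line (given without its trailing newline)."""
    # The first character carries the byte count; "& 0x3F" also maps a backtick to 0.
    nbytes = (ord(line[0]) - 32) & 0x3F
    # Number of encoded characters needed for that many bytes (4 per 3-byte group).
    nchars = (nbytes + 2) // 3 * 4
    # Assumption: missing trailing characters are restored as spaces (value 0).
    body = line[1:1 + nchars].ljust(nchars)
    out = bytearray()
    for i in range(0, nchars, 4):
        # Subtract 32 from each character to recover four 6-bit values.
        a, b, c, d = ((ord(ch) - 32) & 0x3F for ch in body[i:i + 4])
        # Reassemble the four 6-bit values into three 8-bit bytes.
        out += bytes([(a << 2) | (b >> 4),
                      ((b & 0x0F) << 4) | (c >> 2),
                      ((c & 0x03) << 6) | d])
    # Drop the padding bytes produced by an incomplete final group.
    return bytes(out[:nbytes])


print(uudecode_line("+2&5L;&\\@5V]R;&0 "))  # b'Hello World'
```

The final slice is what discards padding: the last group of the example line decodes to three bytes, but the length character says only 11 bytes in total are real.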
On [Unix-like](/page/Unix-like) systems, command-line tools facilitate end-to-end encoding and decoding. The `uuencode` utility writes its result to standard output, and its final operand is the name recorded in the "begin" header rather than an output file. A typical invocation is therefore `uuencode hello.txt hello.txt > output.uue`, which reads "hello.txt", applies the uuencoding algorithm, and relies on shell redirection to store the complete formatted output in "output.uue" (overwriting it if it exists). To decode, run `uudecode output.uue`, which parses the file, reconstructs "hello.txt" with the mode given in the header (644 here), and places it in the current directory (overwriting it if it exists). These tools originated in early Unix systems and remain available in modern implementations like [GNU](/page/GNU) sharutils.
## Variants
### Handling File Forks
In file systems supporting multi-part structures, such as the Macintosh Hierarchical File System (HFS) and [Microsoft](/page/Microsoft) Windows [NTFS](/page/NTFS), files can include a primary data fork alongside additional components like resource forks or alternate data streams that store [metadata](/page/Metadata), application-specific resources, or extended attributes. Uuencoding, originating from Unix environments, processes only the primary data fork by default, ignoring these secondary elements unless explicitly handled otherwise.[](https://tidbits.com/1998/08/31/macintosh-internet-file-format-primer/)[](https://www.oreilly.com/library/view/mac-os-x/0596003706/re397.html)
For instance, in HFS, the [resource fork](/page/Resource_fork)—containing elements like icons, menus, and fonts—is not encoded, potentially rendering decoded files incomplete on Macintosh systems, while [NTFS](/page/NTFS) alternate data streams, used for similar [metadata](/page/Metadata) purposes, face analogous exclusion during encoding.[](https://tidbits.com/1998/08/31/macintosh-internet-file-format-primer/)[](https://www.oreilly.com/library/view/mac-os-x/0596003706/re397.html) Secondary forks thus require manual separation and individual encoding, complicating cross-platform transfers.
A common [workaround](/page/Workaround) involves tools like MacBinary, which first combines the data fork, [resource fork](/page/Resource_fork), and file [metadata](/page/Metadata) (such as type and creator codes) into a single [binary file](/page/Binary_file), adding a 128-byte header for integrity and compatibility; this wrapped file can then be uuencoded for safe transmission over ASCII-limited channels like early [email](/page/Email) systems.[](https://www.savagetaylor.com/TIL/KB007328.html)
This oversight in uuencoding contributed to frequent [data loss](/page/Data_loss) in cross-platform file exchanges during the [1980s](/page/1980s) and [1990s](/page/1990s), especially for Macintosh applications shared via Unix-based networks, where [resource fork](/page/Resource_fork) information was discarded en route.[](https://tidbits.com/1998/08/31/macintosh-internet-file-format-primer/) Such problems were mitigated in later protocols, including [MIME](/page/MIME) multipart formats, which support sending multiple file components as distinct parts within a single message, preserving both primary data and [metadata](/page/Metadata) streams.
### Handling Resource Forks
In the Macintosh [Hierarchical File System](/page/Hierarchical_File_System) (HFS), the [resource fork](/page/Resource_fork) of a file serves as a structured repository for non-data elements, including icons, menu definitions, window layouts, executable code segments, and other application-specific resources.[](https://www.computerlanguage.com/results.php?definition=resource%2Bfork) Standard uuencoding processes only the data fork of Macintosh files, completely omitting the [resource fork](/page/Resource_fork) and thereby failing to preserve these essential components during transmission.[](http://macintoshsisters.freeservers.com/faqs/faq8.html)
To address this limitation, Yves Lempereur developed BinHex 4.0, a specialized encoding tool that integrates both the data and [resource](/page/Resource) forks into a single [binary](/page/Binary) [stream](/page/Stream), prepends a header with metadata such as file type, creator, and fork lengths, applies [run-length encoding](/page/Run-length_encoding) for compression, and finally converts the result to ASCII using a 6-bit encoding that resembles uuencoding but has its own 64-character alphabet.[](http://macintoshsisters.freeservers.com/faqs/faq8.html)[](https://www.rfc-editor.org/rfc/rfc1741.html) This wrapper format ensures full fidelity of Macintosh file structures when transferred over text-only protocols like [email](/page/Email).[](https://www.rfc-editor.org/rfc/rfc1741.html)
A typical [use case](/page/Use_case) involved creating a BinHex-encoded file from a complete Macintosh application or [document](/page/Document), allowing recipients to decode and reconstruct both forks intact on another [Mac](/page/Mac) system.[](https://www.savagetaylor.com/TIL/KB018758.html) By 2025, however, BinHex and resource fork handling have become largely obsolete, as macOS has moved away from resource forks in favor of [extended file attributes](/page/Extended_file_attributes) since the transition to Mac OS X in 2001, rendering such legacy solutions unnecessary for most contemporary file transfers.
## Comparisons
### To xxencode
xxencoding represents a refinement of uuencoding, specifically tailored for safer transmission over [Usenet](/page/Usenet) and [email](/page/Email) systems in the 1990s. Developed by Nelson H. F. Beebe at the [University of Utah](/page/University_of_Utah) in 1990, xxencoding addressed limitations in uuencoding's character set, which predates it by about a decade and originated in the early 1980s as part of [Berkeley Software Distribution](/page/Berkeley_Software_Distribution) (BSD) utilities for Unix systems.[](https://www.math.utah.edu/~beebe/support/myman2html/testdata/okay/xxencode.html)[](https://www.techtarget.com/searchnetworking/definition/Uuencode)
A primary distinction lies in the character sets employed. xxencoding utilizes a 64-character set limited to alphanumeric characters and two symbols: A–Z, a–z, 0–9, +, and -. This selection deliberately avoids non-alphanumeric symbols, enhancing text safety by reducing the risk of alteration or filtering by mail gateways or [Usenet](/page/Usenet) moderators, which often stripped or modified special characters like backticks, quotes, or other [punctuation](/page/Punctuation) present in uuencoding's broader 64-character printable ASCII range (from [space](/page/Space) to [underscore](/page/Underscore)).[](https://www.math.utah.edu/~beebe/support/myman2html/testdata/okay/xxencode.html)
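As a rough illustration of how closely the two formats are related, the sketch below re-maps a uuencoded line into the xxencode alphabet one character at a time. It assumes the commonly documented xxencode table (`+`, `-`, digits, uppercase, lowercase) and that the length character is translated through the same table; the function name `uu_line_to_xx` is hypothetical.

```python
# uuencode maps the 6-bit value v to chr(32 + v): space through underscore.
UU_ALPHABET = "".join(chr(32 + v) for v in range(64))

# Assumed xxencode table: index 0 is "+", index 63 is "z".
XX_ALPHABET = ("+-0123456789"
               "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
               "abcdefghijklmnopqrstuvwxyz")


def uu_line_to_xx(line: str) -> str:
    """Translate one uuencoded line (no newline) into the xxencode alphabet."""
    # "& 0x3F" lets a backtick stand in for the zero value, as some encoders emit.
    return "".join(XX_ALPHABET[(ord(ch) - 32) & 0x3F] for ch in line)


# The length character for a full 45-byte line is "M" in uuencode, "h" in xxencode.
print(UU_ALPHABET[45], XX_ALPHABET[45])
print(uu_line_to_xx("+2&5L;&\\@5V]R;&0 "))
```

Because the translation is purely character-for-character, the line lengths, and therefore the size overhead, are the same in both formats.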
In terms of efficiency, both formats adhere to a 3:4 [byte-to-character encoding ratio](/page/Ratio), processing three [binary](/page/Binary) bytes into four encoded characters, and share the same line structure, with full [data](/page/Data) lines comprising 60 encoded characters. Each line in both formats also begins with a single length character drawn from the format's own alphabet: a full 45-byte line starts with 'M' in uuencoding and with 'h' in xxencoding. Because the framing is identical, xxencoding offers no meaningful size advantage; its benefit lies in the more robust character set, and both schemes increase [file size](/page/File_size) by about 33–35% overall.[](https://www.math.utah.edu/~beebe/support/myman2html/testdata/okay/xxencode.html)
### To Base64
Base64 is a [binary-to-text encoding](/page/Binary-to-text_encoding) scheme that represents arbitrary sequences of octets using a 64-character [alphabet](/page/Alphabet) consisting of the uppercase letters A–Z, lowercase letters a–z, decimal digits 0–9, and the symbols + and /, with padding provided by the = character to ensure proper decoding of incomplete groups.[](https://www.ietf.org/rfc/rfc2045.txt) Defined in RFC 2045 as part of the [Multipurpose Internet Mail Extensions (MIME)](/page/MIME) standard in 1996, [Base64](/page/Base64) was specifically designed to encode [binary data](/page/Binary_data) for safe transmission over text-based [email](/page/Email) protocols without corruption.[](https://www.ietf.org/rfc/rfc2045.txt)
While uuencoding also employs a 64-character set drawn from printable ASCII characters, Base64's [alphabet](/page/Alphabet) is more universally standardized and avoids certain [punctuation](/page/Punctuation) that could interfere with modern protocols. Both schemes incur approximately 33% overhead by encoding 3 bytes of [binary data](/page/Binary_data) into 4 characters, reflecting the conversion from 8-bit to 6-bit groups.[](https://pubs.opengroup.org/onlinepubs/9699969699/utilities/uuencode.html)[](https://www.ietf.org/rfc/rfc2045.txt) However, Base64 omits the per-line [length](/page/Length) prefix and the "begin"/"end" framing that uuencoding requires, yielding a simpler, slightly more compact stream; MIME limits encoded lines to 76 characters, but those line breaks carry no meaning and are ignored by decoders.[](https://www.ietf.org/rfc/rfc2045.txt)
The structured line format of uuencoding gives decoders only a limited consistency check: because each line states its own byte count, a decoder can notice a line whose length does not match its prefix, but the format carries no checksums and cannot detect altered characters within a well-formed line. In contrast, Base64's streamable nature suits seamless integration in protocols, contributing to its dominance in [2025](/page/2025) for web and email applications, such as embedding binary resources via data URIs in [HTML](/page/HTML) and CSS.[](https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Schemes/data)[](https://shiftasia.com/column/base64-encoding-a-comprehensive-overview-for-modern-data-transmission/)
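The relationship between the two encodings can be checked with Python's standard `binascii` and `base64` modules. The sketch below encodes the same 45 bytes both ways and then re-maps the uuencoded characters into the Base64 alphabet; the variable names are illustrative, and the input length is chosen as a multiple of three so that no padding is involved.

```python
import base64
import binascii
import string

# Standard Base64 alphabet (RFC 2045): A-Z, a-z, 0-9, "+", "/".
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

data = bytes(range(45))          # exactly one full uuencoded line of input

uu_line = binascii.b2a_uu(data)  # length character + 60 characters + newline
b64 = base64.b64encode(data)     # 60 characters, no framing

print(len(uu_line), len(b64))    # 62 60

# Strip the length character and newline, recover each 6-bit value,
# and look it up in the Base64 alphabet instead.
remapped = "".join(B64_ALPHABET[(c - 32) & 0x3F] for c in uu_line[1:-1])
print(remapped.encode("ascii") == b64)  # True
```

For this input the two outputs carry the same sequence of 6-bit values; only the alphabet and the surrounding framing differ.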
### To Ascii85
Ascii85, also known as Base85, is a [binary-to-text encoding](/page/Binary-to-text_encoding) scheme developed by [Adobe](/page/Adobe) Systems that represents four bytes of binary data using five printable ASCII characters in the range from '!' (ASCII 33) to 'u' (ASCII 117).[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf) This approach yields an overhead of approximately 25%, derived from the ratio of five output characters to four input bytes. Uuencoding, by comparison, expands data by about 33% through its core 3-to-4 byte-to-character mapping, plus roughly 2% for the per-line length characters, for a total of roughly 35% before line terminators are counted.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf)
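The difference in expansion can be measured directly with Python's standard library, which provides `base64.a85encode` for Ascii85 and `binascii.b2a_uu` for individual uuencoded lines. The following is a small sketch with an arbitrary 180-byte input (a multiple of both 4 and 45); the uuencoded size here includes one newline per line but no "begin"/"end" framing.

```python
import base64
import binascii

# Arbitrary test input without all-zero groups, which Ascii85 would shorten to "z".
data = bytes(range(1, 181))

a85 = base64.a85encode(data)

# binascii.b2a_uu handles at most 45 bytes per call, so encode line by line.
uu = b"".join(binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45))

print(len(a85), len(a85) / len(data))          # 225 1.25
print(len(uu), round(len(uu) / len(data), 3))  # 248 1.378
```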
Ascii85 finds primary application in Adobe's [PostScript](/page/PostScript), PDF, and related formats for embedding arbitrary binary data, such as compressed images, fonts, and streams, directly within documents without introducing fixed line breaks or group delimiters beyond optional PDF-wide 255-character line constraints.[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf) In these contexts, it enables efficient inclusion of non-textual elements while maintaining 7-bit ASCII compatibility, often paired with [compression](/page/Compression) filters like LZWDecode for further optimization.[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf)
Uuencoding, by contrast, processes input in rigid groups of up to 45 bytes per line, producing up to 60 encoded characters plus a length prefix, which imposes throughput limitations and requires explicit line termination for transport compatibility.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html) This structure suits early Unix [email](/page/Email) and [file transfer](/page/File_transfer) needs but adds parsing overhead for large files, where Ascii85's continuous streaming proves more efficient, though its denser, punctuation-heavy output lacks uuencoding's semi-structured readability from prefixed lines.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)[](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.0.pdf)
## Limitations
### Inefficiencies and Overhead
Uuencoding introduces a significant size expansion when converting [binary data](/page/Binary_data) to text, primarily because it encodes every three input bytes into four output characters using a 64-character printable ASCII set.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html) This basic transformation results in approximately a 33% overhead relative to the original data size, as each 8-bit output character carries only 6 bits of payload.[](https://compile7.org/compare-binary-encoding/what-is-difference-between-uuencoding-vs-z85-zeromq-spec32z85/) Additional control information, including per-line length indicators and headers, increases the overall expansion to about 35% for typical files, or closer to 38% once a one-byte line terminator per line is counted.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)
The format uses lines of up to 61 characters (excluding the trailing [newline](/page/Newline)), each beginning with a length indicator (the number of input bytes for that line plus 32, translated to the local codeset) followed by up to 60 encoded characters that represent a maximum of 45 input bytes.[](https://www.dcode.fr/uu-encoding) When the number of input bytes on the final line is not a multiple of three, the last group is padded with zero bits to fill four output characters, and the length indicator tells the decoder how many bytes are real. The framing accounts for the remaining inefficiency: the length character adds about 2% (one character per 45 bytes), each line terminator adds about as much again, and the begin/end lines add a small fixed cost per [file](/page/File).[](https://superuser.com/questions/568506/how-much-larger-does-uuencode-make-binary-files) Uuencoding performs no [compression](/page/Compression), so any redundancy in the source data is carried through to the output.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html)
Processing uuencoded data requires grouping input into 3-byte units for encoding or decoding, as the algorithm splits octets into 6-bit segments before [mapping](/page/Mapping) them to characters.[](https://pubs.opengroup.org/onlinepubs/7908799/xcu/uuencode.html) [Base64](/page/Base64) shares this grouping, so the extra processing cost of uuencoding comes mainly from its line-based framing: a decoder must locate the "begin" line, interpret each line's length character, and stop at the "end" marker.[](https://news.ycombinator.com/item?id=38343748) To mitigate the size increase, files are often pre-compressed using tools like [gzip](/page/Gzip) before uuencoding, which can reduce the net size increase but introduces extra encoding and decoding steps in the workflow.[](https://labex.io/tutorials/linux-linux-uuencode-command-with-practical-examples-422991)
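The overhead figures in this section follow from simple arithmetic on the line structure, which can be sketched as below. The helper `uu_body_size` is a hypothetical name; it counts only the data lines (length character, encoded characters, and an assumed one-byte newline) and ignores the "begin" and "end" framing.

```python
import math


def uu_body_size(n: int, newline_len: int = 1) -> int:
    """Bytes occupied by the uuencoded data lines for n input bytes."""
    full_lines, rest = divmod(n, 45)
    # A full line: 1 length character + 60 encoded characters + line terminator.
    size = full_lines * (1 + 60 + newline_len)
    if rest:
        # A partial line is padded up to whole 4-character groups.
        size += 1 + 4 * math.ceil(rest / 3) + newline_len
    return size


for n in (11, 45, 1_000_000):
    print(n, uu_body_size(n), round(uu_body_size(n) / n, 3))
```

Passing `newline_len=0` reproduces the roughly 35% figure (61 characters per 45 bytes), while `newline_len=2` models transports that use CRLF line endings.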
### Compatibility Issues
Uuencoding was designed to produce output using only 7-bit printable ASCII characters (from space to underscore, ASCII 32–95), making it inherently safe for transmission over 7-bit clean channels such as early [email](/page/Email) systems.[](https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/uuencode.html) However, despite this 7-bit safety, the format remains vulnerable to corruption by certain [email](/page/Email) gateways and transport agents that perform unintended modifications, such as line wrapping at arbitrary lengths (e.g., 80 characters in some [UUCP](/page/UUCP) gateways) or stripping trailing spaces from lines—a common behavior in [1980s](/page/1980s) [Internet](/page/Internet) protocols that disrupts the precise character sequence required for accurate decoding.[](https://retrocomputing.stackexchange.com/questions/3019/why-did-base64-win-against-uuencode) Additionally, while uuencoded [data](/page/Data) avoids non-ASCII characters, surrounding [email](/page/Email) [content](/page/Content) with non-ASCII elements or passage through gateways enforcing strict ASCII can lead to mangling if the entire message is altered.[](https://stackoverflow.com/questions/20862213/sending-email-attachment-using-uuencode-and-mailx) Unlike modern encodings, uuencoding includes no built-in error detection or correction mechanisms, such as checksums, leaving any transmission errors (e.g., bit flips or lost parts) undetectable and resulting in silent [data corruption](/page/Data_corruption) upon decoding.[](https://www.literateprograms.org/uuencode__c_.html)
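Some encoders sidestep the trailing-space problem by writing a backtick (ASCII 96) wherever the 6-bit value is zero, since decoders that mask each value to six bits treat it the same as a space. Python's `binascii.b2a_uu` exposes this through its `backtick` keyword argument (available since Python 3.7); the sketch below simply compares the two forms for one line of input.

```python
import binascii

data = b"Hello World"

with_space = binascii.b2a_uu(data)                    # zero values become " "
with_backtick = binascii.b2a_uu(data, backtick=True)  # zero values become "`"

print(with_space)     # b'+2&5L;&\\@5V]R;&0 \n'
print(with_backtick)  # b'+2&5L;&\\@5V]R;&0`\n'

# Both forms decode to the same bytes.
print(binascii.a2b_uu(with_space) == binascii.a2b_uu(with_backtick) == data)  # True
```

The backtick form leaves no trailing whitespace for a gateway to strip, which is why it is the more robust choice for transport.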
In handling files with multiple components, such as Macintosh resource forks, uuencoding only processes the primary data fork, leading to complete loss of resource forks, [metadata](/page/Metadata), creator codes, and type information essential for Mac compatibility.[](https://www.oreilly.com/library/view/mac-os-x/0596003706/re397.html) This limitation becomes particularly problematic in multi-part transmissions, where incomplete delivery of split uuencoded sections—common in early [email](/page/Email) splits for large files—causes irreversible corruption, as there is no mechanism to verify part integrity or reconstruct missing segments.[](https://68kmla.org/bb/index.php?threads/do-stuffit-5-5-archives-encode-the-resource-fork-safely-for-transfer-to-a-non-mac.42405/) Furthermore, uuencoding's reliance on ASCII environments renders it outdated for modern [Unicode](/page/Unicode)/[UTF-8](/page/UTF-8) workflows; while it can technically encode UTF-8 binary data, it fails to preserve Unicode-aware filenames, paths, or text [metadata](/page/Metadata), often resulting in garbled or inaccessible files on systems expecting native UTF-8 support.[](https://www.savagetaylor.com/TIL/KB018758.html)
Security concerns persist in legacy uuencoding decoders, many of which contain [buffer overflow](/page/Buffer_overflow) vulnerabilities that can be exploited for [arbitrary code execution](/page/Arbitrary_code_execution) or denial-of-service attacks when processing maliciously crafted input. For instance, the uuencoded decoder in [Mutt](/page/Mutt) versions before 2.2.3 suffers from a buffer overread (CVE-2022-1328), while GMime prior to 2.4.15 has a buffer overflow in the uuencode length calculation macro (CVE-2010-0409).[](https://nvd.nist.gov/vuln/detail/CVE-2022-1328)[](https://nvd.nist.gov/vuln/detail/CVE-2010-0409) These risks remain relevant in legacy tools like [GNU](/page/GNU) sharutils (which includes uuencode/uudecode), where unpatched installations in [enterprise](/page/Enterprise) environments expose systems to [exploitation](/page/Exploitation), especially since updates for such obsolete utilities are infrequent. As recently as 2024, uuencoded files have been used in [phishing](/page/Phishing) campaigns to deliver remote access trojans like Remcos [RAT](/page/RAT).[](https://thecyberexpress.com/remcos-rat-malicious-uuencoding-uue-shipping/)
Due to these interoperability challenges and security exposures, uuencoding is widely deprecated for new applications, with standards bodies and tools recommending [MIME](/page/MIME) with [Base64](/page/Base64) encoding instead to ensure robust compatibility across diverse systems and avoid breakage in modern email infrastructures.[](https://www.gnu.org/software/sharutils/manual/sharutils.html)[](https://mimekit.net/docs/html/T_MimeKit_Encodings_UUEncoder.htm)
## Implementations
### Python Support
Python's [standard library](/page/Standard_library) includes built-in support for uuencoding via the `codecs` [module](/page/Module), which registers the `uu` [codec](/page/Codec) as a binary transform for encoding and decoding arbitrary [binary data](/page/Binary_data) into the uuencode format.[](https://docs.python.org/3/library/codecs.html) This functionality converts bytes to ASCII-safe bytes, complete with the standard "begin" header (including mode and filename placeholders) and "end" footer, facilitating legacy data transfer over text-only channels.[](https://docs.python.org/3/library/codecs.html#binary-transforms) In [Python](/page/Python) 3 the `uu` [codec](/page/Codec) is available as a bytes-to-bytes transform from version 3.2 onward, when the binary transforms were restored after being dropped from Python 3.0; [Python](/page/Python) 2 also provided it, but that language line reached end-of-life on January 1, 2020.
To encode a bytes object, import the `codecs` module and apply the `encode` function with the `'uu'` specifier, which handles the transformation automatically without requiring additional libraries.[](https://docs.python.org/3/library/codecs.html)
```python
from codecs import encode, decode

# Encoding example
data = b'Hello'
encoded = encode(data, 'uu')
print(encoded)  # Outputs the uuencoded bytes with header, body, length-0 line, and footer

# Decoding example
decoded = decode(encoded, 'uu')
print(decoded)  # Outputs: b'Hello'
```
This approach ensures the output is a valid uuencoded byte string, ready for transmission.[23]
For file-based operations, read the input file in binary mode, encode the contents using `codecs.encode`, and write the result to an output file (the codec returns bytes, so the file is opened in binary mode); decoding follows the reverse process.[23]
```python
from codecs import encode, decode

# Encoding a file
with open('input.bin', 'rb') as infile:
    data = infile.read()
encoded_data = encode(data, 'uu')
with open('output.uu', 'wb') as outfile:
    outfile.write(encoded_data)

# Decoding a file
with open('output.uu', 'rb') as infile:
    encoded_data = infile.read()
decoded_data = decode(encoded_data, 'uu')
with open('decoded.bin', 'wb') as outfile:
    outfile.write(decoded_data)
```
The `uu` codec accepts the optional `errors` parameter, but only the default `'strict'` mode is meaningful for this transform: malformed input generally raises an error rather than being skipped or replaced.[24]
As of 2025, with Python 3.13 and later versions, uuencoding remains integrated in the standard library through the codecs module, despite the removal of the dedicated uu module in Python 3.13 for maintenance reasons; it continues to see use in legacy system migration scripts and compatibility layers for older protocols.[25]
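Because `codecs.encode(data, 'uu')` takes no filename or mode argument, code that previously relied on the removed `uu` module to write a specific header may need to assemble the framing itself. The following is a minimal sketch built on `binascii.b2a_uu`; the function name `uuencode_bytes` and its default arguments are hypothetical, and the backtick form is chosen here to avoid trailing spaces.

```python
import binascii
import codecs


def uuencode_bytes(data: bytes, name: str = "data.bin", mode: int = 0o644) -> bytes:
    """Return data as a complete uuencoded document with a custom header."""
    # Header line: "begin", octal permission bits, and the decode filename.
    parts = [f"begin {mode:o} {name}\n".encode("ascii")]
    # binascii.b2a_uu encodes at most 45 bytes per call, i.e. one output line.
    for i in range(0, len(data), 45):
        parts.append(binascii.b2a_uu(data[i:i + 45], backtick=True))
    # Zero-length line followed by the closing marker.
    parts.append(b"`\nend\n")
    return b"".join(parts)


encoded = uuencode_bytes(b"Hello World", name="hello.txt")
print(encoded.decode("ascii"))

# The standard-library codec can read the result back.
print(codecs.decode(encoded, "uu"))  # b'Hello World'
```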
### Perl Support
Perl provides native support for uuencoding through its built-in `pack` and `unpack` functions, using the `"u"` template specifier. The `pack("u", $data)` function encodes binary data into a uuencoded string, automatically splitting the output into lines that each encode up to 45 input bytes, with each line prefixed by a character indicating its length (such as 'M' for a full 45-byte line). Conversely, `unpack("u", $encoded)` decodes a uuencoded string back to its original binary form, handling the line groupings and length prefixes transparently.[26][27]
This core functionality enables concise one-liners for encoding files directly from the command line. For instance, the following command reads a file, adds standard uuencode headers and footers, and outputs the encoded result:
```sh
perl -ple 'BEGIN{use File::Basename; $/=undef; $sn=basename($ARGV[0]);} $_="begin 600 $sn\n".(pack "u", $_)." \nend" if $_' /path/to/file
```
This approach requires no external modules and produces output compatible with traditional uudecode tools.[28]
For more structured encoding, a full Perl script can read a file, apply the encoding, and include manual headers. The following example script demonstrates this process:
```perl
#!/usr/bin/perl
use strict;
use warnings;

my $filename = $ARGV[0] || die "Usage: $0 <file>\n";

# Read the whole file in binary mode
open my $fh, '<', $filename or die "Cannot open $filename: $!\n";
binmode $fh;
my $data;
{
    local $/;    # Slurp mode: read everything at once
    $data = <$fh>;
}
close $fh;

my $mode = 644;    # Default file mode
my $encoded = pack("u", $data);

# Add the "begin" header, the length-0 line, and the "end" marker
$encoded = "begin $mode $filename\n" . $encoded . " \nend\n";
print $encoded;
```
This script uses `pack("u", $data)` to generate the encoded body without built-in headers, then prepends a "begin" line with file permissions and name, and appends the standard "end" marker (preceded by a length-0 line) for full uuencode compatibility.[26]
As of 2025, uuencoding support via the "u" template remains a core feature of Perl 5, introduced with the language's initial stable release in 1994, and continues to be employed in system administration scripts for interoperability with legacy Unix environments lacking modern encoding tools.[26][28]