Complex text layout

Complex text layout (CTL), also referred to as complex script rendering, is the specialized process of typesetting and rendering text in writing systems where the visual form, positioning, or sequence of characters (graphemes) varies based on contextual relationships with neighboring characters, rather than following a simple left-to-right linear progression.^[1]^[2] This includes handling bidirectional text directions, glyph shaping, ligature formation, and diacritic placement to ensure accurate and aesthetically appropriate display.^[1]^[3] CTL is essential for supporting a wide array of scripts, including right-to-left languages like Arabic and Hebrew, which mix with left-to-right elements such as numbers, as well as Southeast Asian scripts like Thai that form character clusters with implicit vowels and tone marks.^[2]^[1] Indic scripts, such as Devanagari and Bengali, require complex reordering and matra (vowel sign) positioning around base consonants to form syllables.^[3] Unlike simple scripts (e.g., Latin or Cyrillic), which map characters directly to glyphs in storage order, CTL languages store text in logical order but demand transformation for visual presentation, involving steps like script analysis, character reordering, and font-specific glyph substitution.^[2]^[1] In computing, CTL is implemented through technologies like Microsoft's Uniscribe API, which performs script-specific processing including bidirectional resolution via the Unicode Bidirectional Algorithm and OpenType font features for shaping.^[1] Open-source libraries such as HarfBuzz provide similar capabilities, while web standards in CSS and SVG leverage these for international typography, ensuring support for diverse languages in browsers and applications.^[3] Early efforts, such as The Open Group's CTL project in the 1990s, standardized integration of these features into desktop environments for languages like Arabic and Thai.^[4] The complexity arises from rules for justification, line 
breaking, and font fallback, which prevent invalid combinations and maintain readability across mixed-script documents.^[1]^[2]

Introduction

Definition and Scope

Complex text layout (CTL) refers to the typesetting and rendering of writing systems in which the shape, position, or order of a grapheme depends on its context, such as adjacent characters or the surrounding text direction. This process involves transformations between the logical storage of text in Unicode and its visual display, distinguishing it from simple linear rendering where characters are presented without modification.^[1]^[2] The scope of CTL includes bidirectional (BiDi) text that mixes right-to-left and left-to-right directions, cursive joining behaviors, ligature formation for combined glyphs, and vertical or multidirectional layouts, but generally excludes straightforward left-to-right scripts like basic Latin unless they require contextual features such as combining marks. These elements ensure that text is legible and culturally appropriate across diverse scripts, with brief handling of BiDi reordering to maintain logical flow in mixed-language documents.^[1]^[2] For example, the simple Latin string "abc" displays as isolated characters in fixed positions, while the Arabic phrase "العربية" demands contextual shaping: letters connect cursively and alter forms (initial, medial, final, or isolated) based on their neighbors, resulting in a fluid, joined appearance. CTL's importance lies in its role for software internationalization (i18n), allowing applications to support global languages accurately and reducing localization costs for vendors entering international markets.^[5]

Historical Development

In the 1980s and early 1990s, digital typesetting technologies like Adobe's PostScript, introduced in 1984, were optimized for Latin-based scripts, creating substantial hurdles for non-Latin writing systems that demanded bidirectional rendering, variable glyph widths, or contextual shaping.^[6] These systems often relied on fixed-width encodings or ad hoc extensions, complicating the handling of scripts such as Arabic, Hebrew, or CJK ideographs, where mixed byte lengths in standards like Shift-JIS further exacerbated access and unification issues.^[7] Early Unicode releases, including version 1.0 in 1991 and 1.1 in 1993, provided a universal encoding foundation but omitted full bidirectional support, restricting effective digital representation of right-to-left and mixed-direction texts.^[7]

Key advancements in the mid-1990s addressed these deficiencies through standardized algorithms and font formats. Unicode 2.0, released in 1996, incorporated the Bidirectional Algorithm, enabling logical-to-visual text reordering for scripts with opposing directionalities.^[8] Complementing this, OpenType 1.0, jointly developed by Microsoft and Adobe and published in April 1997, introduced glyph substitution and positioning tables via GSUB and GPOS, facilitating complex shaping for cursive and conjunct-dependent scripts.^[9] As proprietary solutions proved insufficient for diverse linguistic needs, open-source initiatives gained traction: SIL International launched Graphite in 2004 as a programmable system for TrueType fonts targeting lesser-known languages, while HarfBuzz emerged in 2006 from collaborations between Pango and Qt developers to provide a robust, unified OpenType shaping engine.^[10]

The post-2000 era marked a transition to open, web-centric standards, driven by the internet's expansion into non-Western markets and the demand for global content accessibility. 
This evolution culminated in specifications like the CSS Writing Modes Module Level 3, issued as a W3C Working Draft in February 2011, which defined properties for horizontal, vertical, and bidirectional layouts to support international scripts in browsers.^[11] Despite these strides, pre-2020 implementations revealed persistent gaps in minority script support, where many endangered or low-resource writing systems lacked encoding, shaping rules, or font resources for complex layouts. Unicode expansions, including version 3.0 in 2001 and subsequent releases up to 13.0 in 2020, systematically incorporated new characters, bidirectional properties, and script-specific behaviors to bridge these deficiencies and preserve linguistic diversity, continuing in later versions up to 16.0 in September 2024.^[12]^[13]

Writing Systems Requiring CTL

Bidirectional Scripts

Bidirectional scripts are writing systems that incorporate text flowing primarily from right to left (RTL), often intermixed with left-to-right (LTR) elements such as numbers, punctuation, or embedded phrases in other languages, necessitating algorithmic reordering to achieve correct visual presentation.^[14] These scripts arise in languages where the base direction is RTL, but neutral or weak directional characters require resolution based on surrounding context to prevent visual distortion.^[15] Primary examples include Arabic, Hebrew, and Syriac, which are Semitic languages using abjads where letters connect and change form contextually, but whose layout demands bidirectional handling for coherent display.^[16] Numbers, typically classified as European numbers (EN) or Arabic numbers (AN), and punctuation marks like parentheses or quotes are treated as neutral (ON) or weak elements, adopting the direction of adjacent strong directional text or the paragraph's embedding level.^[17] For instance, in an Arabic sentence containing a European numeral, the number flows LTR within the RTL context, ensuring readability without manual adjustment.^[18] The Unicode Bidirectional Algorithm, specified in Unicode Standard Annex #9 (UAX #9), governs this reordering through a multi-pass process that assigns directional levels to characters.^[14] Embedding levels allow nesting of opposite-direction text using control characters like left-to-right embedding (LRE, U+202A) or right-to-left embedding (RLE, U+202B), with levels ranging from even (LTR) to odd (RTL) up to a maximum depth of 125 to avoid overflow.^[19] Overrides, via left-to-right override (LRO, U+202D) or right-to-left override (RLO, U+202E), force uniform direction but are discouraged due to accessibility and security concerns.^[20] Resolution occurs in phases: first, splitting into paragraphs (P1) and applying explicit embeddings (X1–X9); then resolving weak types like numbers (W1–W7); followed by neutral resolution 
(N1–N2), where neutrals inherit direction from neighbors; and finally implicit levels (I1–I2) for unresolved cases, culminating in visual reordering by level parity (L1–L4).^[15] These scripts affect hundreds of millions of users worldwide, with Arabic alone spoken by over 450 million people across 25 countries, underscoring the global scale of bidirectional layout needs.^[21] Historical precedents trace to ancient systems like the Phoenician script, an RTL abjad from the 11th century BCE that influenced modern Semitic writing directions.^[22] Challenges emerge prominently in mixed-content scenarios, such as RTL documents embedding LTR quotes, URLs, or code snippets, where unhandled neutrals can lead to reversed or mirrored appearances— for example, a URL in Arabic text might display with slashes and dots in inverted order, confusing readers.^[23] Modern solutions recommend directional isolates (LRI, RLI, PDI; U+2066–U+2069) to encapsulate segments without affecting surroundings, mitigating these issues in digital interfaces.^[24]
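
The final reordering step can be illustrated with a minimal sketch of rule L2 from UAX #9. It assumes the embedding levels have already been resolved by the earlier phases and are passed in as a plain list; `reorder_line` is a hypothetical helper name used only for this illustration, and the uppercase letters stand in for RTL characters, as in the annex's own examples.

```python
def reorder_line(chars, levels):
    """Sketch of UAX #9 rule L2: from the highest level down to the lowest
    odd level, reverse every contiguous run of characters whose level is
    at least the current level. Levels are assumed to be resolved already."""
    chars = list(chars)
    odd_levels = [lvl for lvl in levels if lvl % 2 == 1]
    if not odd_levels:
        return chars  # purely LTR line: nothing to reverse
    for level in range(max(levels), min(odd_levels) - 1, -1):
        i = 0
        while i < len(chars):
            if levels[i] >= level:
                j = i
                while j < len(chars) and levels[j] >= level:
                    j += 1
                chars[i:j] = reversed(chars[i:j])  # reverse the whole run
                i = j
            else:
                i += 1
    return chars


# LTR paragraph with an embedded RTL word (uppercase = RTL stand-ins).
print("".join(reorder_line("car means CAR.", [0] * 10 + [1] * 3 + [0])))
# RTL paragraph with an embedded LTR word at level 2.
print("".join(reorder_line("car MEANS CAR.", [2] * 3 + [1] * 11)))
```

The sketch covers only the reversal itself; a real implementation would also apply rules L1, L3, and L4 (level resets, combining-mark handling, and mirroring).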

Complex Shaping Scripts

Complex shaping scripts involve writing systems where individual characters or glyphs change form, combine into ligatures, or reposition relative to one another within a word or syllable to achieve proper rendering. These scripts require sophisticated layout engines to handle intra-word transformations, such as vowel signs attaching to consonants or letters adopting contextual shapes based on their position. Unlike simple scripts, shaping here ensures legibility and aesthetic harmony by applying rules for clustering and substitution.^[25]

The Indic or Brahmic family of scripts, including Devanagari, Bengali, and Tamil, exemplifies complex shaping through abugida structures where consonants carry an inherent vowel that can be modified or suppressed. In Devanagari, dependent vowel signs known as matras attach above, below, to the left, or right of a base consonant; for instance, the matra U+093F ◌ि is stored after the consonant क (U+0915) but rendered to its left to form the syllable कि (ki). Bengali follows similar rules, with three vowel signs (i, e, and ai) rendered to the left of the base consonant, while Tamil uses the puḷḷi (U+0BCD) to suppress inherent vowels and positions vowel signs accordingly. These scripts rely on glyph substitution (GSUB) and positioning (GPOS) tables in OpenType fonts to handle reordering and attachment of matras and consonant conjuncts.^[25]^[26]

Southeast Asian scripts like Thai, Khmer, and Lao also demand intricate shaping due to their stacked diacritics and lack of inter-word spacing. In Thai, tone marks (e.g., U+0E48 ◌่ mai ek) and vowel signs (e.g., U+0E39 ◌ู sara uu) appear above or below the base consonant, while left-side vowels such as U+0E40 เ are stored in visual order, before the consonant after which they are pronounced. Khmer employs a coeng (U+17D2 ◌្) for subjoined consonants and vowel signs that wrap around the base, such as composites like U+17B6 U+17C6 for certain vowels, while avoiding spaces between words. 
Lao mirrors Thai in tone mark and vowel placement, using diacritics that stack outward from the consonant. These features necessitate precise vertical positioning to prevent overlaps in rendering.^[27] Cursive scripts such as Arabic and Mongolian further complicate shaping by requiring glyphs to adopt position-dependent forms for fluid connection. Arabic letters typically have up to four contextual forms: isolated (standalone), initial (word-start), medial (mid-word, joining both sides), and final (word-end), applied to dual-joining characters like م (U+0645); right-joining letters like ر (U+0631) use only isolated and final forms. This cursive joining is managed through OpenType features like init, medi, and fina. Mongolian, written vertically, exhibits similar cursive behavior where letters join on both sides within words, with context-sensitive forms ensuring continuous flow from top to bottom.^[28]^[29]^[30]^[31]
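
The selection of Arabic contextual forms described above can be sketched in a few lines. This is a simplified illustration, not a shaping engine: the `JOINING_TYPE` table is a small hand-written excerpt covering only the letters used here (D for dual-joining, R for right-joining), the function name `contextual_forms` is hypothetical, and transparent marks such as vowel diacritics are ignored.

```python
# Hand-written excerpt of joining types for a few Arabic letters
# (an assumption for this sketch; a real engine reads Unicode data).
JOINING_TYPE = {
    "\u0627": "R",  # alef
    "\u0644": "D",  # lam
    "\u0639": "D",  # ain
    "\u0631": "R",  # reh
    "\u0628": "D",  # beh
    "\u064A": "D",  # yeh
    "\u0629": "R",  # teh marbuta
    "\u0645": "D",  # meem
}


def contextual_forms(word):
    """Return the OpenType feature tag (isol/init/medi/fina) per letter."""
    forms = []
    for i, ch in enumerate(word):
        jt = JOINING_TYPE.get(ch, "U")  # U = non-joining
        prev_jt = JOINING_TYPE.get(word[i - 1], "U") if i > 0 else "U"
        next_jt = JOINING_TYPE.get(word[i + 1], "U") if i + 1 < len(word) else "U"
        # A letter connects backward only if the previous letter is
        # dual-joining, and forward only if it is dual-joining itself.
        joins_prev = jt in ("D", "R") and prev_jt == "D"
        joins_next = jt == "D" and next_jt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms


# The word "العربية" written with escapes: alef lam ain reh beh yeh teh-marbuta
print(contextual_forms("\u0627\u0644\u0639\u0631\u0628\u064A\u0629"))
```

Because right-joining letters such as alef and reh never connect forward, the letter after them starts a new joined group, which is why the word splits into three connected segments.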

Vertical and Multidirectional Layouts

Vertical text layout involves arranging characters in lines that flow from top to bottom, often with columns progressing from right to left, a convention prevalent in certain writing systems to accommodate their visual and cultural traditions.^[32] This approach contrasts with the predominant horizontal left-to-right flow in many scripts and requires specific handling for character orientation, such as keeping ideographs upright while rotating punctuation or Latin letters.^[33] In East Asian languages, vertical presentation has historical roots in scroll-based writing, where text advances downward along the spine, enhancing readability for dense ideographic content.^[34]

East Asian scripts exemplify vertical layout through their handling of Hanzi (Chinese characters), Kanji (Japanese characters borrowed from Chinese), Hiragana and Katakana (Japanese syllabaries), and Hangul (Korean syllables). Hanzi and Kanji remain upright in vertical text, with lines flowing top to bottom and succeeding columns from right to left, preserving the square aspect of each glyph for optimal legibility.^[32] Hiragana and Katakana characters also stay upright, integrating seamlessly with ideographs in mixed-script documents common in Japanese publications.^[33] For Korean, Hangul syllables are composed of stacked jamo (consonants and vowels) that appear upright in vertical flow, though the overall syllable block does not rotate; this allows natural progression down the line without disrupting phonetic clustering.^[32]

The Mongolian script represents a distinct vertical system where text is written in columns from top to bottom, with columns advancing from left to right across the page, the opposite of the East Asian convention. 
Individual letters are oriented along the vertical baseline (they appear rotated 90 degrees counterclockwise when the script is set horizontally) and connect fluidly within each column, forming a cursive chain that reflects the script's traditional calligraphic style.^[35] This orientation and connection ensure that vowels and consonants interlock properly, maintaining the script's aesthetic continuity in vertical presentation.^[36]

Multidirectional layouts extend vertical flow by incorporating non-linear progressions, as seen in Tibetan script, which primarily runs horizontally left to right but can adopt vertical arrangements top to bottom with successive columns progressing from left to right in certain manuscript traditions.^[37] This column progression, combined with the script's inherent stacking of subjoined consonants below main glyphs, creates a dynamic flow suited to religious texts or artistic layouts.^[38] Ancient scripts like Linear B, used for Mycenaean Greek around 1450–1200 BCE, occasionally employed boustrophedon writing, alternating direction per line (left to right, then right to left), on clay tablets.^[39]

Unicode Standard Annex #50 (UAX #50) addresses these needs by defining the Vertical_Orientation property, which assigns each character a default behavior in vertical text, such as upright positioning or 90-degree rotation, enabling consistent rendering in vertical contexts without relying solely on font-specific adjustments.^[32] The property complements the bidirectional handling described under text directionality, so that mixed vertical and horizontal flows remain coherent.^[32]

Key Characteristics

Text Directionality

Text directionality in complex text layout (CTL) refers to the foundational rules governing how text flows, either from left to right (LTR) or right to left (RTL), particularly in mixed-direction content. For languages like English, the default base direction is LTR, while scripts such as Hebrew and Arabic use RTL as the base direction.^[40] The base direction of a paragraph is typically determined by the first strong directional character encountered, which could be L (left-to-right, e.g., Latin letters), R (right-to-left, e.g., Hebrew letters), or AL (Arabic letters with right-to-left direction).^[40] If no strong character is present, the paragraph level defaults to LTR unless a higher-level protocol sets the direction explicitly.^[40]

The Unicode Bidirectional Algorithm (UBA), defined in Unicode Standard Annex #9 (UAX #9), provides a standardized method to resolve directionality through a sequence of rules grouped into phases: separating paragraphs, resolving embedding levels, handling weak and neutral characters, and final reordering.^[40] For instance, Rule P1 splits the text into paragraphs, Rule P2 finds the first strong character in each paragraph, and Rule P3 sets the paragraph level to 0 (LTR) or 1 (RTL) based on that character.^[40] Explicit directional embeddings are managed by formatting codes, such as Rule X2 for RLE (Right-to-Left Embedding), which raises the embedding level to the next odd number to force RTL direction within a segment, later terminated by PDF (Pop Directional Formatting).^[40] Rule L1 then resets the levels of paragraph separators, trailing whitespace, and isolate terminators to match the paragraph's base level.^[40]

Directionality operates at both paragraph and inline levels within CTL. 
Paragraphs are processed independently, split by B-type (paragraph separator) characters, with each establishing its own base direction before line-by-line reordering.^[40] Inline elements, such as embedded text or objects, inherit or adapt to the surrounding context, treating inline objects as the neutral U+FFFC character for direction resolution.^[40] Weak directional characters, including numbers, are resolved in the algorithm's third phase using Rules W1 through W7; for example, European numbers (EN) adapt by changing to Arabic numbers (AN) if preceded by right-to-left characters like AL (Rule W2), or to left-to-right if preceded by L (Rule W7), ensuring numbers align appropriately in RTL contexts without disrupting the overall flow.^[40] In web and document technologies, directionality can be overridden using standards like CSS. The CSS direction property specifies the base inline direction as ltr or rtl for an element, influencing the UBA's paragraph level and affecting text ordering, table layouts, and overflow behavior.^[41] Complementing this, the unicode-bidi property controls bidirectional embedding and isolation, with values like embed (inserting LRE or RLE codes), isolate (using directional isolates for scoped direction), or bidi-override (forcing direction regardless of character types), allowing precise control over mixed-direction rendering while integrating with the UBA.^[41] These properties enable authors to handle CTL in bidirectional scripts, such as embedding LTR quotes in RTL text.^[41]
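
The first-strong heuristic for the paragraph level can be sketched with Python's standard `unicodedata` module, which exposes each character's Bidi_Class. This is a minimal illustration under stated assumptions: `paragraph_level` is a hypothetical helper name, characters inside directional isolates are skipped as rule P2 requires, and a default of 0 (LTR) is assumed when no strong character is found.

```python
import unicodedata

ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}


def paragraph_level(text, default=0):
    """Sketch of UAX #9 rules P2 and P3: return 0 (LTR) or 1 (RTL) from the
    first strong character that is not inside a directional isolate."""
    isolate_depth = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ISOLATE_INITIATORS:
            isolate_depth += 1
        elif bidi_class == "PDI":
            if isolate_depth > 0:
                isolate_depth -= 1
        elif isolate_depth == 0:
            if bidi_class == "L":
                return 0
            if bidi_class in ("R", "AL"):
                return 1
    return default  # no strong character found


print(paragraph_level("hello"))                        # Latin text
print(paragraph_level("\u05E9\u05DC\u05D5\u05DD"))      # Hebrew word
print(paragraph_level("123 \u05E9\u05DC\u05D5\u05DD"))  # digits are weak, Hebrew decides
```

Digits and spaces do not decide the direction because their classes (EN, WS) are weak or neutral, so the third call is governed by the Hebrew letters that follow.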

Glyph Shaping and Ligatures

Glyph shaping transforms sequences of Unicode code points into positioned glyphs for accurate rendering in complex scripts, primarily through substitutions defined in the OpenType GSUB (Glyph Substitution) table. The process begins with mapping Unicode characters to initial glyph indices via the font's cmap (character-to-glyph mapping) table, followed by application of script- and language-specific OpenType features by a shaping engine, such as HarfBuzz or Microsoft's Uniscribe. These features apply contextual substitutions, resulting in an output glyph string that accounts for script requirements like cursive joining or syllabic clustering. For instance, the 'rlig' feature enforces required ligatures, while others handle positional variants.^[42]^[43]

Ligatures represent a key substitution mechanism, replacing multiple input glyphs with a single composite glyph to enhance readability or aesthetics. In Latin scripts, standard ligatures such as the "fi" combination, where the dot of 'i' would otherwise collide with the hook of 'f', are activated via the 'liga' feature, while purely decorative discretionary ligatures are optional and activated via the 'dlig' feature. In contrast, required ligatures are mandatory in cursive scripts like Arabic, where the 'rlig' feature substitutes specific sequences; a prominent example is the Lam-Alef ligature (لام + الف → لا), which joins the letters lam and alef into a unified form, in either isolated or final position, that is essential for orthographic correctness. These substitutions ensure fluid cursive connections without gaps or overlaps.^[44]^[30]^[45]

In abugida scripts like those of Indic languages, position-specific forms further refine glyph substitution to reflect syllabic structure. The 'rphf' feature substitutes a special reph form for the 'ra' consonant (र) when followed by a virama (halant) in a conjunct, repositioning it visually after the subsequent base consonant, often in an above-base position. 
Similarly, the 'vatu' feature substitutes vattu variants, ligating a below-base consonant form (such as the rakar form of 'ra' in Devanagari) with its base glyph. Khmer script employs analogous mechanisms, where the 'pres' (pre-base substitutions) and 'abvs' (above-base) features split certain vowel signs; for example, the OE vowel (អើ) decomposes into a pre-base part and an above-base component, ensuring proper attachment around the consonant without overlap.^[46]^[47]^[48]

The GSUB table organizes these substitutions into lookups, which can number in the hundreds or even thousands for complex scripts due to the combinatorial possibilities of contextual rules. In Arabic fonts, such as those supporting Naskh styles, GSUB lookups handle joining behaviors and ligatures across hundreds of glyph variants, demonstrating the table's capacity for intricate rule sets. This substitution framework, integral to OpenType font technologies, enables consistent rendering across diverse writing systems.^[42]^[30]
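
The idea behind a ligature substitution can be shown with a toy model. The sketch below is not how GSUB is encoded or how any shaping engine is implemented: it simply replaces sequences of glyph names with a single ligature glyph, trying the longest candidate first. The glyph names and the `apply_ligatures` helper are hypothetical.

```python
def apply_ligatures(glyphs, ligatures):
    """Toy many-to-one substitution: scan left to right and replace the
    longest matching glyph sequence with its ligature glyph."""
    max_len = max((len(seq) for seq in ligatures), default=1)
    out = []
    i = 0
    while i < len(glyphs):
        # Try the longest possible sequence first, down to length 2.
        for n in range(min(max_len, len(glyphs) - i), 1, -1):
            seq = tuple(glyphs[i:i + n])
            if seq in ligatures:
                out.append(ligatures[seq])
                i += n
                break
        else:
            out.append(glyphs[i])  # no ligature starts here
            i += 1
    return out


# Hypothetical glyph names for illustration only.
LIGATURES = {
    ("f", "i"): "f_i",
    ("f", "f", "i"): "f_f_i",
    ("lam", "alef"): "lam_alef",
}

print(apply_ligatures(["o", "f", "f", "i", "c", "e"], LIGATURES))
print(apply_ligatures(["lam", "alef"], LIGATURES))
```

In an actual font the same effect comes from ligature lookups referenced by features such as 'liga' or 'rlig', and the shaping engine decides which features are active for the script and language.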

Reordering and Positioning

Reordering in complex text layout involves transforming the logical sequence of characters, as entered or stored, into a visual order suitable for display, particularly in bidirectional and complex scripts. In bidirectional scripts like Hebrew and Arabic, the Unicode Bidirectional Algorithm (UBA) performs this logical-to-visual reordering by assigning embedding levels to characters based on their directional properties. For example, Hebrew text is stored in logical order, the order in which it is typed and read, and the UBA reverses it for right-to-left visual presentation; thus, the logical sequence "AB" (where A and B are Hebrew characters) appears as "BA" when the display positions are read from left to right.^[40] This process resolves mixed directional runs, ensuring that left-to-right (LTR) segments, such as embedded numbers or Latin text, are correctly nested within right-to-left (RTL) contexts. The UBA supports up to 125 explicit embedding levels (61 in versions of the algorithm before Unicode 6.3), with even levels indicating LTR direction and odd levels RTL, allowing for deeply nested bidirectional structures without exceeding practical limits.^[49]

In Indic scripts, reordering also repositions dependent vowels known as matras relative to their base consonants to achieve proper syllabic structure. Pre-base matras, which appear after the base consonant in logical order, are repositioned to precede the consonant glyph during rendering. For instance, in Devanagari, the short 'i' matra (ि) follows the base 'ka' (क) logically as कि, but is rendered with the matra before the 'ka'. In clusters, such as 'ka' + virama + 'ta' + 'i-matra' for "क्ति", the matra is reordered before the entire conjunct, so that it is rendered to the left of the 'ka'.^[26] This reordering occurs after initial glyph decomposition and before applying features like half-forms, relying on script-specific rules to maintain phonetic and aesthetic integrity.

Positioning adjustments fine-tune glyph metrics post-reordering to handle spacing and attachments. 
The OpenType GPOS (Glyph Positioning) table enables precise control, including kerning to adjust inter-glyph spacing (such as reducing the advance width between a lowercase "f" and "i" by a specified value) and mark attachment for anchoring diacritics to base glyphs. In mark-to-base positioning, a diacritic like a kasra (below a base letter in Arabic) is aligned using anchor points, offsetting its x and y coordinates relative to the base glyph's attachment point for accurate vertical and horizontal placement. Mark-to-mark attachments further position stacked diacritics, such as a tone mark above a vowel mark, ensuring layered readability.^[50]

Line breaking in complex scripts requires tailored rules to identify permissible breaks, often beyond simple spaces. Unicode Standard Annex #14 (UAX #14) defines these via character classes and rules, but for scripts like Thai, which lack spaces between words, breaks are restricted to syllable or word boundaries determined by dictionary-based analysis. Thai characters fall into the SA (South East Asian) class, where a morphological dictionary reclassifies runs (e.g., assigning BB for word beginnings and AL for continuations) to enable breaks only at valid points, preventing disruptions in tonal or conjunct forms.^[51]

For vertical layouts common in East Asian writing systems, metrics ensure proper ideograph positioning and line progression. Under UAX #50, Han ideographs and similar characters remain upright (Vertical_Orientation property "U") in vertical text, with baselines aligned centrally within the em-box for consistent column flow. Vertical metrics, such as those in OpenType's vhea, vmtx, and VORG tables, define line gaps and advance heights tailored to ideographs, accommodating mixed orientations where Latin insertions rotate sideways while ideographs stay upright to preserve readability in traditional formats.^[32]
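
The pre-base matra reordering discussed in this section can be illustrated with a small sketch for the Devanagari short 'i' matra. It is a deliberately narrow toy, assuming only plain consonants, virama, and U+093F; real shaping engines apply much fuller syllable analysis, and the helper names below are hypothetical.

```python
VIRAMA = "\u094D"   # Devanagari sign virama (halant)
I_MATRA = "\u093F"  # Devanagari vowel sign I, a pre-base matra


def is_consonant(ch):
    """True for the basic Devanagari consonants KA..HA (U+0915..U+0939)."""
    return "\u0915" <= ch <= "\u0939"


def reorder_prebase_matra(logical):
    """Move each short-i matra in front of the consonant cluster it follows.
    Returns the characters in visual (display) order as a list."""
    glyphs = list(logical)
    for i, ch in enumerate(logical):
        if ch != I_MATRA or i == 0 or not is_consonant(glyphs[i - 1]):
            continue
        start = i - 1  # the consonant immediately before the matra
        # Walk back over "consonant + virama" pairs that form a conjunct.
        while (start >= 2 and glyphs[start - 1] == VIRAMA
               and is_consonant(glyphs[start - 2])):
            start -= 2
        glyphs.insert(start, glyphs.pop(i))
    return glyphs


# "कि" = KA + I-matra; "क्ति" = KA + virama + TA + I-matra
print(reorder_prebase_matra("\u0915\u093F"))
print(reorder_prebase_matra("\u0915\u094D\u0924\u093F"))
```

The stored text is never modified: only the glyph order used for display changes, which is why searching and editing continue to operate on the logical sequence.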

Standards and Specifications

The Unicode Standard, maintained by the Unicode Consortium, serves as the primary character encoding framework for complex text layout (CTL) by assigning unique code points to characters from diverse writing systems and defining properties that enable algorithms for directionality, shaping, and positioning. First released as version 1.0 in 1991, the standard has evolved through annual updates, reaching version 17.0 in September 2025, with each iteration expanding support for CTL through refined character properties such as Bidi_Class (which categorizes characters for bidirectional resolution) and Script (which identifies the writing system for appropriate rendering rules).^[52]^[53] These properties are documented in the Unicode Character Database (UCD), part of Unicode Standard Annex #44, and form the basis for CTL processing in software implementations.^[53] Key supporting specifications include Unicode Standard Annex #9, which outlines the Bidirectional Algorithm for handling mixed directional text in scripts like Arabic and Hebrew.^[14] Annex #14 details the Line Breaking Algorithm, specifying rules for identifying break opportunities in complex scripts to prevent improper word or syllable division.^[54] For shaping in scripts requiring glyph reordering and contextual forms, such as Indic and Southeast Asian languages, the standard relies on properties like Indic_Syllabic_Category and Joining_Type defined in the UCD.^[53] Annex #50 addresses vertical text layout, providing orientation properties for scripts like Mongolian and traditional Chinese that flow top-to-bottom.^[32] Additionally, Unicode Technical Report #17 describes the character encoding model that accommodates complex representations, such as composite sequences for scripts with inherent variability. 
The Unicode Standard maintains synchronization with ISO/IEC 10646, the International Standard for the Universal Coded Character Set (UCS), ensuring identical character repertoires and encoding forms like UTF-8, UTF-16, and UTF-32 for global interoperability. Recent versions have incorporated new complex scripts to preserve endangered writing systems; for instance, Unicode 10.0 (2017) added the Masaram Gondi block (U+11D00–U+11D5F), a Brahmi-derived script for the Gondi language requiring vowel signs and reph positioning. Unicode 12.0 (2019) introduced the Nandinagari block (U+119A0–U+119FF), a historical Devanagari variant used for Sanskrit with matra attachments and conjunct forms. Unicode 17.0, released on September 9, 2025, adds 4,803 characters to reach a total of 159,801, including the new Tolong Siki block (U+11DB0–U+11DBF) for the Kurukh language, a Dravidian script requiring vowel sign positioning around consonants to form syllables.^[55]^[56]
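
As a small illustration of how these character properties are consumed in practice, Python's standard `unicodedata` module exposes the Bidi_Class value of each character. The sketch below assumes nothing beyond the standard library; note that the results reflect the Unicode version bundled with the interpreter (available as `unicodedata.unidata_version`), which may lag behind the latest release of the standard.

```python
import unicodedata

# Sample characters written as escapes, with a human-readable label.
samples = {
    "a": "Latin small letter a",
    "\u05D0": "Hebrew letter alef",
    "\u0639": "Arabic letter ain",
    "1": "European digit one",
    "\u0661": "Arabic-Indic digit one",
    "(": "left parenthesis",
    " ": "space",
}

print("Unicode data version:", unicodedata.unidata_version)
for ch, label in samples.items():
    # bidirectional() returns the Bidi_Class short name, e.g. L, R, AL, EN.
    print(f"U+{ord(ch):04X} {label}: {unicodedata.bidirectional(ch)}")
```

Other properties mentioned here, such as Script, Joining_Type, and Indic_Syllabic_Category, are not exposed by this module and would have to be read from the Unicode Character Database files or a dedicated library.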

OpenType and Font Technologies

OpenType, developed jointly by Microsoft and Adobe, serves as the predominant font format for enabling complex text layout through its layout tables, which allow for script-specific glyph substitutions, positioning, and classifications. The core of OpenType's CTL capabilities lies in three key tables: the Glyph Substitution table (GSUB), which handles glyph replacements such as ligatures and contextual forms; the Glyph Positioning table (GPOS), which manages precise adjustments for kerning, mark placement, and cursive connections; and the Glyph Definition table (GDEF), which defines glyph classes like base glyphs, ligatures, and marks to facilitate efficient processing by GSUB and GPOS. These tables, introduced in OpenType 1.0 and refined in subsequent versions, enable fonts to implement Unicode-based script requirements without altering the underlying text encoding.^[57]^[42]^[50]

Prior to widespread OpenType adoption, alternative technologies existed for CTL. Apple Advanced Typography (AAT), originally exposed through the Apple Type Services for Unicode Imaging (ATSUI) API, provided similar functionality through tables like 'mort' and 'morx' for substitutions and 'kerx' or 'trak' for spacing adjustments, but it was largely proprietary and tied to Apple's ecosystem. ATSUI was superseded by Core Text, introduced with Mac OS X 10.5 Leopard in 2007, as Apple broadened its OpenType support for cross-platform compatibility and wider script coverage. Another alternative is Graphite, developed by SIL International as an open-source, rule-based system embedded in TrueType or OpenType-compatible fonts using custom tables like 'Silf' for layout rules and 'Sill' for language-specific feature settings. 
Graphite excels in flexibility for non-Latin scripts not fully covered by OpenType standards, allowing programmers to define complex behaviors directly in the font without relying on external engines.^[58]^[59]^[60] OpenType version 1.8, released in 2016, introduced variable fonts, which extend CTL efficiency by packaging multiple stylistic variations—such as weight, width, or optical size—into a single font file using axes defined in the 'fvar' table and interpolated via 'gvar' for glyphs. This reduces file sizes and loading times for CTL scenarios involving diverse typographic needs across scripts, as a single variable font can adapt to localization or emphasis requirements without multiple static files. OpenType's feature system further supports over 30 scripts, including Arabic, Devanagari, and Thai, through tags like 'locl' for localized glyph forms and 'mark' for attaching diacritics and combining marks to base characters, ensuring proper rendering in bidirectional or shaping contexts as per Unicode properties.^[61]^[62]^[63]

Implementations

Software Libraries

HarfBuzz is an open-source text shaping library that emerged around 2006 as a freedesktop.org project, when Pango and Qt developers cooperated to reunify their layout engine code; it is maintained by Behdad Esfahbod.^[10]^[64]^[65] It provides comprehensive support for OpenType features, Apple Advanced Typography (AAT), and Graphite shaping models, enabling accurate glyph selection, positioning, and ligature formation for complex scripts across various writing systems.^[64] Widely adopted in web browsers such as Google Chrome and Mozilla Firefox, HarfBuzz provides consistent rendering of bidirectional and cursive text in these environments.^[64] As of November 2025, version 12.2.0 includes optimizations for font subsetting and integration with modern graphics APIs, while adding full support for Unicode 17.0 characters released in September 2025.^[66] Recent releases have also targeted performance: version 11.3 (July 2025) reported drawing up to 40% faster, glyph extent calculations up to 15% faster, and horizontal glyph advance lookups up to 45% faster.^[67]

The International Components for Unicode (ICU) library, developed by IBM and now maintained under the Unicode Consortium, historically incorporated a LayoutEngine module for handling complex text layout in cross-platform applications.^[68] This engine applied glyph shaping to text that had already been split into runs, supporting OpenType features for scripts such as Arabic, Devanagari, and other Indic scripts through its C and C++ APIs.^[68] Designed for embedding in software such as web engines and document processors, the LayoutEngine processes a run of text in a single font and a single direction (left-to-right or right-to-left), performing reordering and positioning without relying on platform-specific rendering.^[68] The LayoutEngine was removed in ICU 58, and the ICU documentation directs its former users to HarfBuzz.^[68]

Other notable libraries include Microsoft's Uniscribe, a legacy Windows API introduced in the early 2000s for Unicode text rendering and complex script support, which handles paragraph-level layout using OpenType tables but is increasingly supplemented by the newer DirectWrite APIs.^[69] Apple's Core Text framework provides low-level text shaping and layout capabilities optimized for macOS and iOS, leveraging AAT and OpenType for high-performance glyph positioning in applications like Safari.^[70]

In the Rust ecosystem, the harfbuzz_rs wrapper offers safe bindings to HarfBuzz, so that applications can shape text without writing unsafe foreign-function calls themselves, while rustybuzz provides a pure-Rust port of the core shaping algorithm that passes nearly all of HarfBuzz's shaping tests.^[71]^[72]
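The requirement that a shaping engine receive runs of text in a single direction can be illustrated with a minimal Python sketch. It uses only the standard `unicodedata` module and splits a string wherever the strong directionality changes. The function name `split_direction_runs` and the rule of attaching neutral and weak characters to the preceding run are assumptions made for illustration; the Unicode Bidirectional Algorithm (UAX #9) resolves those characters with far more detailed rules, and real engines also split runs by script and font.

```python
import unicodedata

# Bidi classes that carry a strong direction.
RTL_CLASSES = {"R", "AL"}   # Hebrew-type and Arabic-type letters
LTR_CLASSES = {"L"}         # Latin-type letters

def split_direction_runs(text):
    """Split text into (direction, substring) runs using strong classes only.

    Simplification (assumption): neutral and weak characters such as spaces,
    digits, and punctuation stay with the run that precedes them.
    """
    runs = []
    current_dir = None
    start = 0
    for i, ch in enumerate(text):
        cls = unicodedata.bidirectional(ch)
        if cls in RTL_CLASSES:
            ch_dir = "rtl"
        elif cls in LTR_CLASSES:
            ch_dir = "ltr"
        else:
            continue  # neutral or weak: does not start a new run
        if current_dir is None:
            current_dir = ch_dir
        elif ch_dir != current_dir:
            runs.append((current_dir, text[start:i]))
            start = i
            current_dir = ch_dir
    if text:
        # Text without any strong character defaults to left-to-right here.
        runs.append((current_dir or "ltr", text[start:]))
    return runs
```

For a string made of Latin letters, three Hebrew letters, and more Latin letters, the sketch yields three runs (ltr, rtl, ltr), each of which could then be handed to a shaper separately.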

Operating System and Application Support

On Microsoft Windows, DirectWrite, introduced in 2009 with Windows 7, serves as the primary API for high-quality text rendering, with full Unicode text and layout support that covers complex scripts.^[73] Uniscribe, a longstanding component of the Windows text processing stack, handles bidirectional text, glyph shaping, and reordering for a wide array of scripts, enabling applications to support numerous languages including Arabic, Hebrew, Indic, and Southeast Asian writing systems.^[74] DWriteCore, the Windows App SDK implementation of DirectWrite, decouples this functionality from the operating system release while maintaining compatibility with Windows' native complex text layout capabilities.^[73]

Apple's macOS and iOS platforms rely on Core Text as the core framework for text layout and rendering, providing robust support for complex scripts through features like glyph positioning, bidirectional algorithms, and font fallback mechanisms.^[75] Core Text processes Unicode text streams to generate positioned glyph runs, accommodating right-to-left and vertical writing modes essential for languages such as Arabic, Hebrew, and East Asian scripts. HarfBuzz additionally offers an integration API for Core Text and the underlying Core Graphics framework, which lets developers combine HarfBuzz shaping with Apple's font objects in custom applications.^[76]

Linux operating systems typically employ HarfBuzz for text shaping in conjunction with FreeType for glyph rasterization, forming a lightweight yet powerful stack for complex text layout in desktop environments like GNOME and KDE.^[77] This combination handles script-specific features, such as ligature formation in Arabic or reordering in Indic scripts, across graphical toolkits and applications.
On Android, HarfBuzz similarly powers the system's text engine, integrated into the Android framework to deliver consistent complex script support for diverse languages in user interfaces and apps.^[64]

Web browsers expose complex text layout to authors through the CSS Writing Modes Level 3 specification, which defines properties for controlling text direction, inline progression, and glyph orientation to support bidirectional and vertical flows.^[41] Modern engines in Chrome and Firefox use HarfBuzz as the underlying shaper, while Safari relies on Core Text, so that web content can render scripts such as vertical Mongolian or cursively joined Arabic without page authors writing platform-specific code.

Major applications have incorporated dedicated complex text layout engines to meet professional typesetting needs. Adobe InDesign has provided comprehensive CTL support since the CS3 release in 2007, with the World-Ready paragraph composer enabling advanced features like contextual glyph substitution and bidirectional paragraph composition for scripts including Arabic, Hebrew, and Indic languages. Microsoft Office applications, particularly versions released after 2010, feature enhanced rendering for Arabic and Hebrew through improved Uniscribe integration, offering better kerning, ligature application, and right-to-left text alignment in tools like Word and PowerPoint.^[78] Recent mobile OS updates have further refined CTL for specific scripts.
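The Arabic cursive joining that these platform shapers perform can be sketched in a few lines of Python. The example below decides, for each letter of a word, which of the four positional forms applies, using the registered OpenType feature tags `isol`, `init`, `medi`, and `fina` as labels. The `JOINING_TYPE` table is a hand-picked subset of the Unicode Joining_Type property and the function name `joining_forms` is hypothetical; a real shaper reads the full property data, skips transparent marks, and then applies the font's GSUB lookups, none of which is modeled here.

```python
# Hand-picked subset of Unicode Joining_Type values (assumption: these
# eight letters are enough for the demonstration).
# "D" = dual-joining; "R" = right-joining (joins only to the letter before it).
JOINING_TYPE = {
    "\u0627": "R",  # ALEF
    "\u0628": "D",  # BEH
    "\u062A": "D",  # TEH
    "\u062F": "R",  # DAL
    "\u0631": "R",  # REH
    "\u0643": "D",  # KAF
    "\u0644": "D",  # LAM
    "\u0645": "D",  # MEEM
}

def joining_forms(word):
    """Return one OpenType form tag per character, in logical order."""
    forms = []
    for i, ch in enumerate(word):
        this = JOINING_TYPE.get(ch)  # None means non-joining in this sketch
        prev = JOINING_TYPE.get(word[i - 1]) if i > 0 else None
        nxt = JOINING_TYPE.get(word[i + 1]) if i + 1 < len(word) else None
        # A letter connects backwards only if the previous letter is dual-joining.
        joins_prev = this in ("D", "R") and prev == "D"
        # A letter connects forwards only if it is dual-joining itself.
        joins_next = this == "D" and nxt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms

# KAF, TEH, ALEF, BEH: the final BEH stays isolated because ALEF
# never joins to the letter that follows it.
print(joining_forms("\u0643\u062A\u0627\u0628"))  # ['init', 'medi', 'fina', 'isol']
```

The text itself stays in logical order with one code point per letter; only the glyph chosen for each code point changes, which is why the same stored string can render correctly under any conforming shaper.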

Challenges and Advances

Persistent Issues

One persistent challenge in complex text layout (CTL) is the incomplete coverage of fonts for minority and low-resource scripts. Although Unicode encodes over 150 scripts, many minority ones lack the OpenType features necessary for proper glyph shaping, ligature formation, and positioning. For instance, one analysis found that over 40% of the scripts added in Unicode versions 6.0 to 9.0 (2010–2016) had no fonts available.^[79] Projects like Google's Noto font family have addressed some gaps by providing open-licensed coverage for most Unicode scripts, yet full OpenType support remains absent for numerous endangered and minority writing systems, limiting accurate digital representation.^[80]

Performance bottlenecks continue to affect CTL, particularly in handling scripts with high glyph counts or intricate shaping rules. Rendering complex pages, such as Arabic PDFs with more than 1,000 glyphs, demands significant CPU resources because bidirectional analysis, contextual substitution, and positioning are all computationally intensive.^[81] Text shaping engines like HarfBuzz, while optimized, incur overhead from frequent glyph lookups and feature applications, leading to delays in resource-constrained environments like mobile devices or legacy systems.^[82]

Interoperability variations between shaping libraries pose another ongoing issue, resulting in inconsistent text rendering across applications and platforms.
For example, HarfBuzz and ICU (International Components for Unicode) differ in their handling of Thai text stacking, where the positioning of diacritics and vowel marks may vary because the two engines implement OpenType tables and script-specific rules differently.^[83] These discrepancies can lead to visual artifacts, such as misaligned clusters or incorrect ligatures, complicating cross-platform development and document exchange.^[68]

Accessibility challenges are particularly acute for users relying on screen readers with reordered or bidirectional text. Screen readers often struggle to convey logical reading order in scripts like Arabic or Hebrew, presenting content in visual rather than semantic sequence, which confuses navigation and comprehension.^[84] Before CSS Writing Modes Level 3 was widely implemented, browsers also varied in their support for inline progression and baseline alignment in mixed-script layouts, which compounded web rendering inconsistencies. A 2024 survey highlighted these issues, reporting a 15% error rate in text rendering for low-resource languages like Shan across common assistive technologies.^[85]
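The Thai stacking problem mentioned above comes from several marks sharing one base consonant. A minimal Python sketch, using only the standard `unicodedata` module, shows how such clusters can be identified before any positioning takes place. The function name `mark_clusters` is hypothetical, and grouping by the general category `Mn` (nonspacing mark) is a deliberate simplification: it is neither the Unicode grapheme cluster algorithm nor the cluster model of any particular shaper.

```python
import unicodedata

def mark_clusters(text):
    """Group each base character with the nonspacing marks that follow it.

    Simplification (assumption): only general category "Mn" is treated as
    attaching; spacing vowels and other categories start a new cluster.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) == "Mn":
            clusters[-1] += ch  # attach the mark to the current base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters

# KO KAI + SARA I (vowel above) + MAI EK (tone mark) + NGO NGU:
# the first consonant carries two stacked marks, the second carries none.
word = "\u0e01\u0e34\u0e48\u0e07"
print([len(cluster) for cluster in mark_clusters(word)])  # [3, 1]
```

Two engines can agree on these clusters and still disagree on where the tone mark sits relative to the vowel, because that placement comes from each engine's reading of the font's positioning data.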

Recent Developments

In recent years, the Unicode Consortium has continued to expand support for complex text layout through major version releases. Unicode 16.0, released on September 10, 2024, introduced seven new scripts, including Tulu-Tigalari, which requires complex glyph shaping and positioning for proper rendering.^[86] Unicode 17.0, released on September 9, 2025, added four more scripts, among them Tai Yo, whose layout involves reordering and ligature formation.^[52]

The open-source HarfBuzz shaping library has seen significant enhancements for complex text processing. Version 10.3.0, released on February 11, 2025, delivered substantial performance improvements to Apple Advanced Typography (AAT) shaping, benefiting scripts with complex contextual rules. Version 12.0.0, released on September 27, 2025, enabled support for the VARC (Variable Composites) table by default, which lets variable fonts compose glyphs dynamically from components. Version 12.2.0, released on November 5, 2025, aligned HarfBuzz's syllable-based ChainContext rules with Windows implementations, enhancing consistency for Indic and other complex scripts.

Web standards have advanced to better accommodate bidirectional text and ruby annotations. The CSS Text Module Level 4 was published as a Working Draft on May 29, 2024, introducing refined controls for text wrapping, justification, and white space processing that interact with bidirectional algorithms.^[87] These controls operate alongside the unicode-bidi property, whose isolation values keep mixed-directionality content from disturbing the ordering of surrounding text, reducing embedding errors in layouts that combine right-to-left and left-to-right scripts.^[88] For ruby text, often used in East Asian layouts, the CSS Ruby Module Level 1 works together with Text Level 4 to position annotations without disrupting baseline alignment.^[89]

Microsoft's Universal Shaping Engine (USE) has been updated to support emerging Unicode scripts.
As of 2024, it accommodates complex scripts from Unicode 16.0, including those requiring multi-stage glyph reordering, extending prior coverage of scripts such as Adlam, which was encoded in Unicode 9.0.^[46]

Open-source efforts for underrepresented scripts have progressed notably; for instance, full shaping support for the Adlam script, used for Fulani languages in West Africa, was integrated into HarfBuzz and related font tools in 2019, with W3C layout requirements documented in May 2024 to guide browser and e-book implementations.^[90]

Browser vendors have also shipped rendering improvements that benefit all scripts. In 2025, Chromium-based browsers, including Microsoft Edge, rolled out enhanced text rendering on Windows, improving subpixel antialiasing and contrast, which has reduced visual artifacts in various contexts.^[91]
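The isolation of mixed-directionality content discussed above has a plain-text counterpart in the Unicode isolate controls, which the following Python sketch illustrates. It wraps an inserted fragment in FIRST STRONG ISOLATE (U+2068) and POP DIRECTIONAL ISOLATE (U+2069) so that the fragment's direction cannot reorder the text around it. The helper names `isolate` and `build_message` are hypothetical, and the sketch only builds the string; how it is displayed still depends on the renderer implementing the Unicode Bidirectional Algorithm.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction taken from the first strong character
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes the isolate

def isolate(fragment):
    """Wrap a fragment of unknown direction in a bidi isolate."""
    return f"{FSI}{fragment}{PDI}"

def build_message(username, count):
    """Insert a user-supplied name into a left-to-right template.

    Without the isolate, a right-to-left name followed by digits could be
    displayed with the number on the wrong side of the name.
    """
    return f"{isolate(username)} posted {count} items"
```

In HTML the same effect is usually obtained with markup or CSS rather than control characters, so a helper like this is mainly useful for plain-text channels such as notifications or log lines.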

References

[1]
About Complex Scripts - Win32 apps | Microsoft Learn
Jan 7, 2021 · Processing a complex script must account for the difference between the logical (keystroke) order and the visual order of the glyphs.
[2]
Chapter 1 Complex Text Layout Languages
A Complex Text Layout (CTL) language is any language which stores text differently from how it is displayed.
[3]
Text — SVG 2
complex text layout where: there is not always a one-to-one correspondence between characters and glyphs, characters may change shape depending on location (e. ...
[4]
Desktop Technologies -- CTL - The Open Group
The Open Group's Complex Text Layout (CTL) pre-structured technology (PST) project integrates the display and editing of complex text languages.
[5]
TOG Press Release - CTL 1.0 - The Open Group
Complex Text Layout enables open system and software vendors to penetrate new, international markets while reducing the costs associated with localizing ...
[6]
[PDF] The Non-Latin scripts & typography Kamal Mansour 1 Introduction
It is quite common in typographic terminology to divide the world's scripts into Latin and non-Latins. At first glance that might seem to be a reasonable.
[7]
Early Years of Unicode
Mar 26, 2015 · Unicode's groundwork began in late 1987 with discussions by Joe Becker, Lee Collins, and Mark Davis. The term "Unicode" was coined in December ...
[8]
ReadMe-2.0.14.txt - Unicode
These are the categories required by the Bidirectional Behavior Algorithm in the Unicode Standard. These categories are summarized in Chapter 4 of the Unicode ...
[9]
Archive of OpenType versions - Typography - Microsoft Learn
released April 1997. For a detailed change history, see the change log for the ...
[10]
State of Text Rendering - Behdad Esfahbod
Jul 5, 2009 · Around 2006 Pango and Qt developers cooperated to reunify the layout engine again, and HarfBuzz was born as a freedesktop.org project. Initially ...
[11]
CSS Writing Modes Module Level 3
W3C Working Draft, 01 February 2011.
[12]
About Versions
[13]
Progress Updates - Script Encoding Initiative
This page compiles past reports and annual summaries of achievement. Detailed updates on SEI's work are published quarterly in the Unicode Document Registry.
[14]
UAX #9: Unicode Bidirectional Algorithm
[15]
https://www.unicode.org/reports/tr9/#Basic_Display_Algorithm
[16]
https://www.unicode.org/reports/tr9/#Introduction
[17]
https://www.unicode.org/reports/tr9/#Resolving_Weak_Types
[18]
https://www.unicode.org/reports/tr9/#Examples
[19]
https://www.unicode.org/reports/tr9/#Explicit_Levels
[20]
https://www.unicode.org/reports/tr9/#Explicit_Directional_Overrides
[21]
World Arabic Language Day - the United Nations
Dec 18, 2024 · Arabic, spoken by over 450 million people and holding official status in nearly 25 countries, is a global language with immense cultural ...
[22]
https://www.britannica.com/topic/alphabet-writing/Greek-alphabet
[23]
https://www.unicode.org/reports/tr9/#URLs
[24]
https://www.unicode.org/reports/tr9/#Explicit_Directional_Isolates
[25]
Chapter 12 – Unicode 17.0.0
The Unicode Standard encodes Devanagari characters in the same relative positions as those coded in positions A0–F416 in the ISCII-1988 standard. The same ...
[26]
Developing OpenType Fonts for Devanagari Script - Typography
The new Indic shaping engine allows for variations in typographic conventions, giving a font developer control over shaping by the choice of designation of ...
[27]
Southeast Asia-I - Unicode
Some of the vowel signs and all of the tone marks are rendered in the script as diacritics attached above or below the base consonant. These combining signs and ...
[28]
https://www.unicode.org/versions/latest/core-spec/chapter-9/
[29]
Arabic & Persian Layout Requirements - W3C
Oct 2, 2025 · In addition to the four joining forms (isolated, initial, medial, and final), each Arabic letter can come with different shapes while preserving ...
[30]
Developing OpenType Fonts for Arabic Script - Microsoft Learn
Jun 9, 2022 · Glyph - A glyph represents a form of one or more characters. For example, the final, initial and medial 'lam' glyphs (U+FEDE, U+FEDF & U+ ...
[31]
Mongolian Script Resources - W3C
Jan 16, 2025 · Modern Mongolian can be written using a subset of the letters available in the Mongolian Unicode block. ... The script is cursive, ie.
[32]
UAX #50: Unicode Vertical Text Layout
This report describes a Unicode character property which can serve as a stable default orientation of characters for reliable document interchange.
[33]
UTN #22: Robust Vertical Text Layout - Unicode
Apr 25, 2005 · Vertical text is the traditional mode of text layout for many East Asian writing systems. It is also used for effects such as vertical headers ...
[34]
UTR #50: Unicode Vertical Text Layout
Some languages, however, have publishing traditions that provide for long-format vertical text presentation, notably East Asian languages such as Japanese. In ...
[35]
Mongolian Layout Requirements - W3C
Jul 10, 2025 · This document describes the basic requirements for Mongolian script layout and text support on the Web and in eBooks.
[36]
[PDF] Mongolian Script Rendering Issues - Unicode
Abstract. This paper discusses the rendering issues of complex text layouts, particularly traditional Mongolian script. Solving the rendering issues of.
[37]
Requirements for Tibetan Text Layout and Typography - W3C
Apr 2, 2024 · Text direction Tibetan is normally written horizontally and read from left to right. Occasionally, Tibetan text may occur in vertically-set ...
[38]
https://r12a.github.io/scripts/tibt/bo.html
[39]
Linear B Script - World History Encyclopedia
Aug 4, 2023 · Linear B script was the writing system of the Mycenaean civilization of the Bronze Age Mediterranean. The syllabic script was used to write Mycenaean Greek ...
[40]
UAX #9: Unicode Bidirectional Algorithm
This annex describes the algorithm used to determine the directionality for bidirectional Unicode text.
[41]
CSS Writing Modes Level 3 - W3C
Dec 10, 2019 · CSS Writing Modes Level 3 defines CSS support for various writing modes and their combinations, including left-to-right and right-to-left text ordering.
[42]
GSUB — Glyph Substitution Table (OpenType 1.9.1) - Microsoft Learn
May 29, 2024 · The Glyph Substitution (GSUB) table provides data for substitution of glyphs for appropriate rendering of scripts, such as cursively-connecting forms in Arabic ...
[43]
Windows glyph processing for OpenType fonts, part 1 - Microsoft Learn
Nov 17, 2020 · When applications have needed to provide more complicated text processing for complex scripts [1] or sophisticated typography, they have ...
[44]
Registered features, a-e (OpenType 1.9.1) - Typography
Jul 6, 2024 · Recommended implementation: This feature is used to map sequences that form Akhands to the corresponding ligature glyph (GSUB lookup type 4).
[45]
Chapter 9 – Unicode 17.0.0
[46]
Creating and supporting OpenType fonts for the Universal Shaping ...
This document presents information that will help font developers in creating OpenType fonts for complex scripts included in the Unicode Standard 16.0
[47]
Registered features, p-t (OpenType 1.9.1) - Typography
May 31, 2024 · For Indic scripts, the following features should be applied in order: 'nukt', 'akhn', 'rphf', 'rkrf', 'pref', 'blwf', 'half', 'pstf', 'cjct'.
[48]
Developing OpenType Fonts for Khmer Script - Microsoft Learn
Jun 24, 2022 · This document helps font developers create OpenType fonts for Khmer, covering encoding, character sets, and using tools to produce Khmer fonts.
[49]
UAX #9: The Bidirectional Algorithm - Unicode
Summary. This document describes specifications for the positioning of characters flowing from right to left, such as Arabic or Hebrew.
[50]
GPOS — Glyph Positioning Table (OpenType 1.9.1) - Typography
May 29, 2024 · A mark-to-mark attachment positions one mark relative to another, as when positioning tone marks with respect to vowel diacritical marks in ...
[51]
UAX #14: Unicode Line Breaking Algorithm
The third style is used for scripts such as Thai, which allow line breaks only at word boundaries, but do not mark word boundaries in any way, so that the ...
[52]
Unicode 17.0.0
Sep 9, 2025 · Unicode 17.0 adds 4803 characters, for a total of 159,801 characters. The new additions include 4 new scripts: Sidetic; Tolong Siki; Beria ...
[53]
UAX #44: Unicode Character Database
Aug 27, 2025 · This annex provides the core documentation for the Unicode Character Database (UCD). It describes the layout and organization of the Unicode Character Database.
[54]
https://www.unicode.org/reports/tr14/
[55]
https://blog.unicode.org/2025/09/unicode-170-release-announcement.html
[56]
[PDF] Hanifi Rohingya - The Unicode Standard, Version 17.0
These charts are provided as the online reference to the character contents of the Unicode Standard, Version 17.0 but do not provide all the information needed ...
[57]
OpenType layout common table formats (OpenType 1.9.1)
Jul 6, 2024 · OpenType Layout makes use of five tables: the Glyph Substitution table (GSUB), the Glyph Positioning table (GPOS), the Baseline table (BASE), ...
[58]
About Apple Advanced Typography Fonts
This document explains the font tables you include in the 'sfnt' resource in order for your font to offer special features and effects.
[59]
Postscript is gone, long live TrueType and OpenType - AppleInsider
Nov 8, 2023 · Apple deprecated all support for ATSUI and WorldScript in Mac OS X 10.5 Leopard - and support for Mac OS X's ATS.framework (Apple Type Services) ...
[60]
Graphite technical overview
The Graphite engine uses the font, particularly the Graphite-specific tables, to perform text layout. The dotted arrow between the engine and the output device ...
[61]
OpenType Font Variations Overview - Microsoft Learn
May 30, 2024 · OpenType Font Variations allow a font designer to incorporate multiple font faces within a font family into a single font resource.
[62]
Introducing OpenType Variable Fonts | by John Hudson - Medium
Sep 14, 2016 · An OpenType variable font is one in which the equivalent of multiple individual fonts can be compactly packaged within a single font file.
[63]
Registered features, k-o (OpenType 1.9.1) - Typography
May 31, 2024 · Application interface: In recommended usage, this feature triggers positioning of mark glyphs required for correct layout. It should always be ...
[64]
HarfBuzz text shaping engine - GitHub
HarfBuzz is a text shaping engine. It primarily supports OpenType, but also Apple Advanced Typography. HarfBuzz is used in Android, Chrome, ChromeOS, Firefox, ...
[65]
Why is it called HarfBuzz?
This project is maintained by Behdad Esfahbod, who named it HarfBuzz. Originally, it was a shaping engine for OpenType fonts; "HarfBuzz" is the Persian for "open ...
[66]
Releases · harfbuzz/harfbuzz - GitHub
The Fontra font editor already supports this technology. Note that this new format involves just the HarfBuzz draw API and does not affect shaping.
[67]
HarfBuzz 11.3 Delivers Significant Performance Improvements
Jul 21, 2025 · Drawing can be up to 40% faster, calculating glyph extents up to 15% faster, and getting horizontal glyph advances up to 45% faster. HarfBuzz ...
[68]
Layout Engine | ICU Documentation
The ICU LayoutEngine is designed to process a run of text which is in a single font. It is written in a single direction (left-to-right or right-to-left), and ...
[69]
Displaying Text with Uniscribe - Win32 apps | Microsoft Learn
Jan 7, 2021 · An application that uses complex scripts has the following problems with a simple approach to layout and display. The width of a complex script ...
[70]
Core Text | Apple Developer Documentation
Core Text provides a low-level programming interface for laying out text and handling fonts. The Core Text layout engine is designed for high performance.
[71]
harfbuzz/harfbuzz_rs: A fully safe Rust wrapper for the ... - GitHub
harfbuzz_rs is a high-level interface to HarfBuzz, exposing its most important functionality in a safe manner using Rust.
[72]
harfbuzz/rustybuzz - GitHub
rustybuzz passes nearly all of harfbuzz shaping tests (2221 out of 2252 to be more precise). So it's mostly identical, but there are still some tiny ...
[73]
DirectWrite (DWrite) - Win32 apps - Microsoft Learn
Oct 4, 2021 · Today's applications must support high-quality text rendering, resolution-independent outline fonts, and full Unicode text and layout support.
[74]
Script and font support in Windows - Globalization - Microsoft Learn
Mar 11, 2025 · The following complex scripts in Unicode 7.0 are supported in the Universal Shaping Engine. Balinese, Batak, Brahmi, Buginese, Buhid, Chakma, ...
[75]
Core Text | Apple Developer Documentation
Core Text provides a low-level programming interface for laying out text and handling fonts. The Core Text layout engine is designed for high performance, ease ...
[76]
Core Text integration: HarfBuzz Manual
HarfBuzz offers an additional API that can help integrate with Apple's Core Text engine and the underlying Core Graphics framework.
[77]
FreeType integration: HarfBuzz Manual
HarfBuzz provides integration points with FreeType at the face-object and font-object level and for the font-functions virtual-method structure of a font object ...
[78]
Using right-to-left languages in Office - Microsoft Support
Open an Microsoft 365 program file, such as a Word document. On the File tab, choose Options > Language. In the Set the Office Language Preferences dialog box, ...
[79]
Bridging the Divide: Supporting Minority and Historic Scripts in Fonts
Especially true for scripts in Unicode versions 6.0 to 9.0 (2010 – 2016), where over 40% of the scripts have no fonts. (Unicode version 10.0 was released in ...
[80]
Technical Affordances of Multilingual Publication from Manuscripts ...
Sep 20, 2024 · In this article, we move between these definitions as needed to illustrate different challenges across the history of text technologies. The ...
[81]
Layout and Complex Text Processing - Simon Cozens
The task of the bidi algorithm is to swap around the characters in a text to convert it from its logical order into its visual order - the visual order being ...
[82]
Text rendering and fonts | Qt for MCUs 2.11.1
This default behavior comes with a performance overhead caused by the frequent calls to drawing engine. On platforms where this behavior leads to slower ...
[83]
difference between icu4c opentype harfbuzz - Stack Overflow
Mar 13, 2013 · HarfBuzz is a text shaping library, in short it takes a font, a string of text and some properties (script, language, optional OpenType ...
[84]
Best Practices for Authoring HTML: Handling Right-to-left Scripts
Jul 14, 2009 · This document provides advice on practical techniques related to the creation of content in languages that use right-to-left scripts, such as ...
[85]
[PDF] A Concise Survey of OCR for Low-Resource Languages
Jun 21, 2024 · Post-OCR processing aims to rectify mistakes made by OCR systems in text extraction, and can be extremely valuable for low-resource languages.
[86]
Unicode 16.0.0
Sep 10, 2024 · Script-related Changes. There are seven new scripts encoded in Unicode 16.0. Some of these scripts, such as Tulu-Tigalari, have complex layout.
[87]
CSS Text Module Level 4 - W3C
May 29, 2024 · The CSS Text Module Level 4 defines properties for text manipulation, including line breaking, justification, alignment, white space, and text ...
[88]
unicode-bidi - CSS - MDN Web Docs
Oct 30, 2025 · The element does not offer an additional level of embedding with respect to the bidirectional algorithm. For inline elements, implicit ...
[89]
CSS Text Module Level 4
Oct 29, 2025 · This CSS module defines properties for text manipulation and specifies their processing model. It covers line breaking, justification and alignment, white ...
[90]
Adlam Script Resources - W3C
Nov 14, 2024 · This document points to resources for the layout and presentation of text in languages that use the Adlam script.
[91]
Better text contrast for all Chromium-based browsers on Windows
Jan 30, 2025 · We're happy to announce that our enhanced text rendering is now available for users across all Chromium-based browsers on Windows.