
Zero-width space

The zero-width space (ZWSP), Unicode character U+200B, is a non-printing format character that occupies no visual width and marks invisible word boundaries and line break opportunities in text. Introduced in Unicode 1.0 in 1991 as part of the General Punctuation block, it was initially classified as a space separator (Zs) and later reclassified as a format character (Cf) to emphasize its role in layout control rather than spacing. The character supports text rendering in scripts that do not put visible spaces between words, such as Thai.

In the Unicode Line Breaking Algorithm (UAX #14), the ZWSP functions as a break opportunity: a line break is prohibited before it (as before a regular space) but allowed immediately after it, so it can delimit words without changing how the text looks. During justification, its presence between two characters does not prevent added inter-letter spacing, which distinguishes it from fixed-width spaces such as the en space (U+2002). The ZWSP also differs from related zero-width characters such as the zero-width non-joiner (U+200C) and zero-width joiner (U+200D), which control joining and ligature formation rather than providing break points. Because it is invisible, it needs careful handling in editing software to avoid unintended insertions or rendering problems.

Overview

Definition

The zero-width space (ZWSP) is a non-printing character, U+200B, that occupies no horizontal space in rendered text but indicates a potential word boundary or line break opportunity. Although invisible, it is detectable by text processing systems, which can apply breaking rules at its position without altering the visual layout.

Introduced in Unicode 1.0 in October 1991 as part of the General Punctuation block (U+2000 to U+206F), the ZWSP was developed to support international text handling in digital environments. It addresses the needs of scripts that lack visible spaces between words, such as Thai, by providing an invisible separator for word breaks in segmentation and layout algorithms.

In rendering, the ZWSP is completely invisible to users, with no glyph or advance width, yet it influences line breaking in applications such as word processors and web browsers. This makes it useful for keeping complex scripts readable while preserving the intended structure of the text. It is intended for invisible word separation and line break control: it has no width of its own, though the gap at its position may expand slightly in justified text.

In contrast, the zero-width non-joiner (U+200C, ZWNJ) separates characters that would otherwise join or form a ligature, as in Indic scripts, but it does not create a line break opportunity or mark a word boundary. Both characters are invisible, but the ZWSP offers a potential break for formatting, whereas the ZWNJ prevents unwanted joining without changing text flow.

The zero-width joiner (U+200D, ZWJ) does the opposite of the ZWNJ: it requests joining between adjacent characters that would not normally connect, for example combining base emoji into sequences (such as family emoji) or linking elements in scripts like Devanagari. Unlike the ZWSP, which permits separation and breaks, the ZWJ enforces visual or semantic unity and has no role in line breaking, so it is unsuitable for marking word boundaries. The distinction matters in complex text rendering, where using the wrong character can disrupt glyph formation or emoji display.

Another related character is the zero-width no-break space (U+FEFF, ZWNBSP), which prohibits a line break at its position and is widely used as a byte order mark (BOM) to signal the encoding of files in UTF-8 or UTF-16. Where the ZWSP allows a break, the ZWNBSP forbids one, and Unicode deprecates its use as an invisible non-breaking character in favor of the word joiner (U+2060).

In short, the ZWSP is a breakable, zero-width separator for general text processing, while the ZWNJ, ZWJ, and ZWNBSP control joining or prohibit breaks in contexts involving complex scripts or encoding signatures.
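The characters compared in this section can be looked up in the Unicode Character Database. The following is a minimal sketch in Python using only the standard library; the `RELATED` dictionary and its role labels are illustrative names chosen for this example, not part of any standard.

```python
import codecs
import unicodedata

# Illustrative role labels (an assumption of this sketch) for the
# zero-width characters discussed in this section.
RELATED = {
    "\u200b": "line break opportunity",
    "\u200c": "prevents joining",
    "\u200d": "requests joining",
    "\ufeff": "byte order mark, legacy no-break",
    "\u2060": "no-break (word joiner)",
}

for ch, role in RELATED.items():
    # unicodedata.name() returns the formal Unicode character name.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch):<26} {role}")

# U+FEFF encoded as UTF-8 is exactly the UTF-8 byte order mark.
is_bom = "\ufeff".encode("utf-8") == codecs.BOM_UTF8
print(is_bom)
```

All five characters are invisible, so a lookup of this kind is often the quickest way to tell which one a string actually contains.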

Encoding and Standards

Unicode Specification

The zero-width space is encoded in the Unicode Standard at U+200B, named ZERO WIDTH SPACE, in the General Punctuation block (U+2000 through U+206F). It was first assigned in Version 1.0.0, released in October 1991, and its code point has remained stable since, with no reallocation or deprecation.

In the Unicode Character Database, its General_Category is Cf (Other, Format), reflecting its role as a formatting control rather than a visible space; the value was changed from the original Zs (Separator, Space) in Unicode Version 4.0.1 to better match its invisible, non-spacing behavior. Other key properties are Bidi_Class=BN (Boundary Neutral), which means it does not affect embedding levels, and Line_Break=ZW, a class dedicated to invisible line break opportunities that contribute no width. The character has no default glyph or advance width and is a Default_Ignorable_Code_Point, so a renderer that does not act on it should display nothing for it rather than a visible fallback glyph.

Apart from the General_Category change, its encoding and primary properties have not changed since Unicode 1.0.0. It is covered by Unicode Standard Annex #9 (Unicode Bidirectional Algorithm) as a boundary neutral format character with no effect on text direction, and by Unicode Standard Annex #14 (Unicode Line Breaking Algorithm) through its own class for zero-width break opportunities, distinct from other spaces and joiners.
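The properties listed above can be inspected programmatically. This is a small sketch using Python's standard `unicodedata` module, which exposes the name, General_Category, and Bidi_Class (but not Line_Break); the variable names are chosen for this example only.

```python
import unicodedata

ZWSP = "\u200b"

name = unicodedata.name(ZWSP)                # formal character name
category = unicodedata.category(ZWSP)        # General_Category
bidi_class = unicodedata.bidirectional(ZWSP) # Bidi_Class

print(name, category, bidi_class)

# One practical consequence of the Cf (rather than Zs) classification:
# Python does not treat the character as whitespace, so str.split()
# with no arguments will not split on it.
is_whitespace = ZWSP.isspace()
pieces = "a\u200bb".split()
print(is_whitespace, pieces)
```

Code that needs to treat U+200B as a separator therefore has to name it explicitly rather than rely on generic whitespace handling.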

Representations in Markup and Protocols

In HTML, the zero-width space (U+200B) can be written with the numeric character references &#8203; (decimal) or &#x200B; (hexadecimal). These references allow the character to be inserted even where the source encoding or tooling cannot represent it directly.

In XML and related markup languages, such as those used in syndication feeds, the zero-width space is typically inserted directly in UTF-8 (bytes E2 80 8B) or UTF-16, or via numeric references like &#8203; or &#x200B; to ensure consistent handling across parsers. This approach is common for invisible separators in structured feeds, where the character maintains document integrity without visual impact.

Programming languages provide standard ways to produce the zero-width space in string literals, often for testing or text manipulation. In JavaScript, it can be created with String.fromCharCode(8203). In Python, chr(0x200B) or chr(8203) returns the character. In C#, the escape sequence \u200B embeds it directly in a string.

In network protocols, the zero-width space needs a specific encoding for transmission. In URLs, it is percent-encoded from its UTF-8 bytes as %E2%80%8B. In email sent via MIME (RFC 2045), it is carried in quoted-printable or base64-encoded bodies (=E2=80=8B in quoted-printable with a UTF-8 charset), which preserves it across transports. In JSON (RFC 8259), U+200B may appear literally in a string or be escaped as \u200b; many serializers emit the escaped form so the character stays visible in the serialized text.
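The representations described in this section can be reproduced with a few standard-library calls. The sketch below uses Python; the variable names are illustrative, and the JSON result reflects `json.dumps` with its default `ensure_ascii=True` setting.

```python
import html
import json
from urllib.parse import quote

# Two equivalent ways to build the character from its code point.
ZWSP = chr(0x200B)
same_char = ZWSP == chr(8203) == "\u200b"

# UTF-8 byte sequence, shown as uppercase hex.
utf8_hex = ZWSP.encode("utf-8").hex().upper()

# Percent-encoding for use in a URL (quote() encodes as UTF-8 by default).
percent_encoded = quote(ZWSP)

# JSON string literal; with ensure_ascii=True (the default) the
# character is written as an escape sequence.
json_literal = json.dumps(ZWSP)

# Decoding the HTML numeric character references yields the same character.
from_decimal = html.unescape("&#8203;")
from_hex = html.unescape("&#x200B;")

print(same_char, utf8_hex, percent_encoded, json_literal)
print(from_decimal == ZWSP, from_hex == ZWSP)
```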

Core Purposes

Word Boundary Marking

The zero-width space (ZWSP, U+200B) functions as an invisible delimiter that marks word boundaries in languages written without visible inter-word spacing, such as Thai. With a ZWSP between words, text processors can segment continuous script accurately for tasks such as dictionary lookup, while the text looks unchanged.

In text analysis, a ZWSP lets a parser identify word boundaries without disturbing layout, which is particularly useful in non-spaced scripts or for annotating compounds. For instance, in Thai text such as "สวัสดี" (hello), a ZWSP placed after "สวัส" marks a boundary inside the word that a segmentation algorithm can use. Tools that rely on explicit boundaries for tokenization benefit in the same way.

The ZWSP also improves text search accuracy by providing reliable word-level granularity, especially when combined with Unicode text segmentation rules, and it supports line breaking in systems like LaTeX by allowing a break at a designated point in a compound without a visible gap. For example, where the engine and packages in use honor the character, inserting a ZWSP after the slash in a term like "input/output" lets the line break there while the compound stays visually intact.
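As a rough illustration of boundary marking, the sketch below splits a Thai string on U+200B in Python. The helper name `split_on_zwsp` and the sample phrase (ภาษา "language" plus ไทย "Thai") are assumptions made for this example; real segmentation of unmarked text needs a dictionary-based or statistical segmenter.

```python
ZWSP = "\u200b"

def split_on_zwsp(text):
    """Return the segments of text delimited by ZWSP, skipping empty ones."""
    return [segment for segment in text.split(ZWSP) if segment]

# Two Thai words with an explicit, invisible boundary between them.
marked = "ภาษา" + ZWSP + "ไทย"

tokens = split_on_zwsp(marked)

# Removing the marker gives back the text exactly as a reader sees it.
visible = marked.replace(ZWSP, "")

print(tokens)
print(visible, len(marked) - len(visible))
```

The marked and unmarked strings render identically, which is why a tokenizer can use the marker without any cost to the displayed text.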

Line Break Facilitation

The zero-width space (ZWSP, U+200B) serves as an invisible break opportunity in text layout systems, permitting line wrapping at designated points without altering the visual appearance of the content. In the Unicode Line Breaking Algorithm (UAX #14), ZWSP is assigned the line breaking class ZW, which carries two rules: a break is prohibited before a ZWSP (rule LB7: × ZW), and a break is allowed after it, even if spaces intervene (rule LB8: ZW SP* ÷). This makes ZWSP a non-hyphenating alternative to the soft hyphen (U+00AD): it allows an otherwise unbreakable sequence to be split at a chosen point without inserting a hyphen mark.

In practice, ZWSP enables line breaks in long constructs such as URLs, where inserting it within a path like "example.com/very/long/path" prevents horizontal overflow in a narrow viewport without affecting readability. It can similarly support wrapping in long chemical formulas (e.g., C₆₀H₁₂₂) and in Chinese, Japanese, or Korean (CJK) text, where words are commonly written without spaces between them; UAX #14 notes its use for indicating potential breaks in scripts without visible word spacing. These uses let text flow adaptively across devices and formats.

In web styling, ZWSP works with CSS line-breaking behavior to allow discretionary wraps where an ordinary space would add unwanted width; browsers treat it equivalently to the <wbr> element for suggesting breaks in inline content. The break is an opportunity, not a mandatory break: a layout engine takes it only when the line actually needs to wrap at that point, and other styling that forbids wrapping still applies. ZWSP is therefore a suggestion to the line wrapping algorithm rather than a command.
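The two ideas in this section, marking a URL with break opportunities and locating the breaks that rule LB8 allows, can be sketched in Python as follows. The function names are hypothetical, and `zw_break_offsets` covers only the ZW rule, not the full UAX #14 algorithm.

```python
from typing import List

ZWSP = "\u200b"

def add_break_opportunities(url: str) -> str:
    """Insert a ZWSP after every '/' so a renderer may wrap there."""
    return url.replace("/", "/" + ZWSP)

def zw_break_offsets(text: str) -> List[int]:
    """Offsets where LB8 (ZW SP* ÷) permits a break.

    A break is allowed before the first character that follows a ZWSP
    and any run of spaces after it. The end of the text is excluded
    because it is handled by a separate rule.
    """
    offsets = []
    i = 0
    while i < len(text):
        if text[i] == ZWSP:
            j = i + 1
            while j < len(text) and text[j] == " ":
                j += 1
            if j < len(text):
                offsets.append(j)
            i = j
        else:
            i += 1
    return offsets

marked = add_break_opportunities("example.com/very/long/path")
offsets = zw_break_offsets(marked)
print(offsets)
```

Inserting after the slash rather than before it keeps each slash at the end of a line when the text wraps, which is a common convention for broken URLs.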

Practical Applications

In Multilingual Text Processing

The zero-width space (U+200B) plays a role in East Asian typography, particularly for Chinese, Japanese, and Korean (CJK) text, where it gives fine control over spacing and line breaks without introducing visible gaps. In CJK text processing, layout algorithms may insert spacing between characters automatically; placing a zero-width space where that spacing would go can suppress it, giving precise layout in horizontal or vertical arrangements. In East Asian typesetting, the zero-width space also marks break opportunities, for example alongside CSS features for phrase-based line breaking, helping lines stay balanced on a monospaced character grid. This is especially useful in typesetting software, where CJK justification distributes space between characters rather than between words, and the zero-width space acts as an invisible delimiter for fine-tuning that distribution.

In bidirectional text, the zero-width space supports layouts involving right-to-left scripts such as Hebrew and Arabic because it is a neutral formatting character that does not alter directional flow. In the Unicode Bidirectional Algorithm (UAX #9), U+200B has the Boundary Neutral (BN) class, so it can sit between directional runs, for example next to left-to-right insertions such as numbers or English terms, without forcing reordering or changing embedding levels. It can therefore mark logical separations in mixed-direction text, such as an Arabic sentence containing a Hebrew quotation, while the right-to-left rendering stays intact.

For search and indexing in natural language processing (NLP), the zero-width space improves tokenization accuracy in mixed-script text by explicitly indicating word boundaries where visible spaces are absent or ambiguous. Under the word boundary rules of Unicode Standard Annex #29, U+200B acts as a deliberate word separator, so tools can distinguish tokens in languages without inter-word spacing, for instance when English words are embedded in Arabic or Thai sentences; inserting it between "hello" and an adjacent Arabic term prevents the two from being merged during indexing. Libraries such as the International Components for Unicode (ICU) implement these rules in their BreakIterator, supporting segmentation for search engines and NLP pipelines that handle diverse scripts.

In multilingual input methods, the zero-width space can be entered as an invisible separator across language locales where the keyboard configuration maps non-printing characters to modifier keys. For example, certain layouts let users produce U+200B through compose sequences or AltGr combinations, so typists can add boundaries while composing mixed-script documents without visible artifacts.

In Web Development and HTML

In web development, the zero-width space (U+200B) is inserted into HTML using the numeric character reference &#8203;, which enables line breaks within inline elements without introducing visible spacing. The technique is useful where an ordinary space would disturb the layout, such as in navigation menus whose items need to wrap responsively on smaller screens without awkward gaps. For instance, placing &#8203; inside a menu label lets the browser break the line at that point if the viewport narrows, preserving readability without adding width. Similarly, in code snippets displayed inline, &#8203; allows long identifiers or URLs to wrap so that they do not overflow their containers.

When combined with CSS, the zero-width space improves text wrapping, especially in responsive designs. It works with white-space: pre-wrap, which preserves whitespace while still allowing lines to wrap, so developers can place ZWSP characters to control where breaks occur. (With white-space: pre, lines wrap only at explicit newlines, so a ZWSP has no effect on wrapping.) Alongside overflow-wrap: break-word, or the legacy word-break: break-word, ZWSP provides preferred points for hyphenless breaks in long, space-less strings such as URLs or compound words, preventing overflow in fluid layouts. This approach is common in mobile-first designs, where control over text reflow is needed to avoid horizontal scrolling.

In JavaScript, handling zero-width spaces is important when sanitizing input, to mitigate risks such as hidden payloads or filter evasion in user-submitted data. Developers often detect and remove ZWSP with a regular expression, such as string.replace(/\u200B/g, ''), which matches the character by its Unicode escape and strips every instance. This helps keep form data clean in web applications where an attacker might embed invisible characters to slip past validation filters.

The character gained prominence in web development around 2010, as mobile browsers improved their support and responsive techniques became standard, making invisible formatting more reliable across devices.
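The same clean-up can be done on the server side. The following is a minimal Python sketch of the removal step described above; the function name `strip_zwsp` is hypothetical, and it deliberately targets only U+200B, since removing other zero-width characters such as U+200C or U+200D can damage legitimate text in some scripts and emoji sequences.

```python
import re

# Matches every occurrence of U+200B ZERO WIDTH SPACE.
ZWSP_PATTERN = re.compile("\u200b")

def strip_zwsp(text):
    """Return (cleaned_text, number_of_zwsp_removed)."""
    cleaned, removed = ZWSP_PATTERN.subn("", text)
    return cleaned, removed

# A username that looks like "admin" on screen but is a different string.
submitted = "ad\u200bmin"
cleaned, removed = strip_zwsp(submitted)
print(cleaned, removed, submitted == "admin", cleaned == "admin")
```

Returning the count as well as the cleaned text lets the caller log or reject input that contained hidden characters instead of silently accepting it.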

In Typography and Document Formatting

In desktop publishing applications, the zero-width space (ZWSP, U+200B) serves as an invisible delimiter that helps with spacing adjustments in non-Latin scripts, such as Thai, where no visible spaces appear between words. By marking word boundaries without adding width, it lets the software apply inter-character spacing and optical metrics suited to the font's design, preventing awkward gaps or overlaps in complex layouts. This is particularly useful in documents that mix scripts, because the ZWSP tells the layout engine where the logical breaks for justification are without altering visual appearance.

As an alternative to soft hyphens, the ZWSP allows line breaks in justified text without visible hyphenation marks, giving cleaner typography in professional output. For instance, in legal documents that require precise and unobtrusive formatting, inserting a ZWSP at potential break points helps produce even line endings across paragraphs while avoiding hyphens, which can look fragmented in formal prose. This supports full justification by permitting controlled letter-spacing expansion instead of erratic word gaps.

During PDF and EPUB generation, the ZWSP aids consistent rendering across devices by explicitly signaling allowable line breaks, independent of embedded fonts that may vary in glyph metrics or justification behavior. This helps prevent overflow or reflow problems in digital publications, especially for long compounds or non-spaced scripts, regardless of the reader's font substitution or screen size.

In XeLaTeX environments, the ZWSP can be used with packages such as polyglossia to manage script-specific spacing in multilingual PDFs, placed at language transitions or word boundaries to obtain the line breaking appropriate to each script. This improves output quality for documents that blend Latin and non-Latin content, such as academic texts, in combination with XeLaTeX's fontspec integration.

Restrictions and Challenges

Prohibitions in Identifiers

The zero-width space (U+200B) is prohibited in internationalized domain names (IDNs) to prevent homograph attacks and invisible spoofing that could enable phishing or visual deception. The briefing paper on IDN permissible code points discusses U+200B as a non-displayed character in the context of potential user confusion. Similarly, RFC 5892, which defines the code points eligible for IDNA labels, classifies U+200B as DISALLOWED, excluding it from the protocol-valid (PVALID) category and so barring it from registered domain labels.

Programming languages handle the zero-width space in identifiers in different ways. In Java, U+200B is a format (Cf) character and is therefore an ignorable identifier character: Character.isJavaIdentifierPart accepts it, so it can sit invisibly inside an identifier, which raises concerns about hidden code and security. C++ has also historically permitted U+200B in identifiers under its Unicode support rules, an allowance criticized for enabling invisible variations in code that can lead to subtle bugs or malicious insertions.

Security problems arise from malicious use of the zero-width space in phishing, where it is inserted into URLs to create lookalike domains such as "examp[U+200B]le.com" that evade detection while appearing identical to the legitimate name. This technique, dubbed Z-WASP, has been used to bypass protections in systems such as Microsoft Office 365 by obfuscating malicious links without changing how they work. As of January 2025, variants like "shy z-wasp" continue to exploit zero-width characters in phishing campaigns.

Policy on the zero-width space in identifiers reflects growing awareness of Unicode security risks. Unicode Technical Report #36 (updated in 2010) discourages invisible characters like U+200B in user-facing identifiers to reduce confusability and spoofing. This guidance encourages standards bodies and implementers to prefer visible, unambiguous characters in contexts like domain names and code naming.
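A simple defensive check follows from these restrictions: flag any identifier or host name that contains an invisible format character. The Python sketch below is one conservative way to do that; both function names are hypothetical, and rejecting every Cf character is an assumption of this example that is stricter than IDNA, which allows U+200C and U+200D in limited contexts.

```python
import unicodedata

def invisible_format_chars(label):
    """Return (index, code point) pairs for every Cf character in label."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(label)
        if unicodedata.category(ch) == "Cf"
    ]

def is_visibly_unambiguous(label):
    """True when the label contains no invisible format characters."""
    return not invisible_format_chars(label)

spoofed = "examp\u200ble.com"
findings = invisible_format_chars(spoofed)
print(findings, is_visibly_unambiguous(spoofed), is_visibly_unambiguous("example.com"))
```

Reporting the position and code point, rather than just a yes or no, makes it easier to show a user or auditor exactly where the hidden character sits.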

Compatibility and Rendering Issues

The zero-width space (U+200B) presents several compatibility challenges in web browsers, particularly older ones. Early versions of Internet Explorer, such as version 6, did not fully support the character in certain fonts, so it was ignored or rendered incorrectly, which defeated the intended line break opportunities. In one case reported around 2006, inserting zero-width spaces for URL wrapping could trigger crashes during PDF generation. Some modern browsers have also been observed to insert U+200B into snippets copied from developer tools, complicating debugging.

Rendering varies across fonts and operating systems. Fonts such as Arial Unicode MS support U+200B, but without an appropriate fallback, WebKit-based browsers may display it as a square or a tiny visible gap if the primary font lacks the glyph, producing inconsistent output. Input methods on macOS and Windows can insert the character by accident; for example, selecting text (e.g., via Cmd+A) in web applications like Outlook Web App on macOS has been reported to add stray U+200B instances. In legacy ASCII-only environments, the character cannot be represented and is typically replaced with a substitute such as "?" or stripped entirely, which removes its formatting role.

To mitigate these problems, developers commonly use a regular expression such as /\u200B/g in JavaScript to detect and remove zero-width spaces, a practice that has become routine in code audits for eliminating artifacts from copy-pasted web content.
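The legacy-encoding behavior and the audit step described above can be illustrated in Python. This is a sketch under simple assumptions: the sample string and the helper name `zwsp_positions` are invented for the example, and the two `errors` modes stand in for the "substitute" and "strip" outcomes mentioned in the text.

```python
ZWSP = "\u200b"

# Text as it might arrive after a copy-paste from a web page.
text = "copy" + ZWSP + "paste"

# Converting to ASCII cannot keep the character: it is either
# replaced with "?" or dropped, depending on the error handler.
replaced = text.encode("ascii", errors="replace")
dropped = text.encode("ascii", errors="ignore")

def zwsp_positions(s):
    """Return the index of every ZWSP in s, for use in an audit report."""
    return [index for index, ch in enumerate(s) if ch == ZWSP]

print(replaced, dropped, zwsp_positions(text))
```

Locating the characters before removing them is useful in audits, since the positions often reveal which tool or paste operation introduced them.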
