Zero-width space
The zero-width space (HTML entity: &#8203; or &ZeroWidthSpace;), abbreviated ZWSP, is a non-printing character used in computerized typesetting to indicate word boundaries without displaying a visible space in the rendered text. This lets text-processing systems recognize word boundaries in scripts that do not use explicit spacing, so that lines can be broken appropriately.
The zero-width space is Unicode character U+200B, and is located in the Unicode General Punctuation block. In HTML, it can be represented by the character entity reference &ZeroWidthSpace;.
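The properties stated above can be inspected with a short Python sketch using the standard `unicodedata` module. The helper name `describe` is illustrative only and is not part of any library.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def describe(ch: str) -> dict:
    """Collect a few basic Unicode properties of a single character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "is_whitespace": ch.isspace(),
    }


for key, value in describe(ZWSP).items():
    print(f"{key}: {value}")
```

In Python's Unicode database the character is reported with general category Cf (format), and `str.isspace()` does not treat it as whitespace, so routines such as `str.split()` or `str.strip()` leave it in place.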
Purpose
The zero-width space marks a potential line break without hyphenation. Its semantics and HTML implementation are similar to those of the soft hyphen, except that a soft hyphen displays a hyphen character at the point where the line is broken.
The zero-width space can be used to mark word breaks in languages without visible space between words, such as Thai, Myanmar, Khmer, and Japanese.[1]
In justified text, the rendering engine may add inter-character spacing, also known as letter spacing, between letters separated by a zero-width space, unlike around fixed-width spaces.[1]
Example
To show the effect of the zero-width space in text, the following words have been separated with zero-width spaces:
LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum
By contrast, the following words have not been separated:
LoremIpsumDolorSitAmetConsecteturAdipiscingElitSedDoEiusmodTemporIncididuntUtLaboreEtDoloreMagnaAliquaUtEnimAdMinimVeniamQuisNostrudExercitationUllamcoLaborisNisiUtAliquipExEaCommodoConsequatDuisAuteIrureDolorInReprehenderitInVoluptateVelitEsseCillumDoloreEuFugiatNullaPariaturExcepteurSintOccaecatCupidatatNonProidentSuntInCulpaQuiOfficiaDeseruntMollitAnimIdEstLaborum
The first text is broken into lines but only at word boundaries, and resizing the browser window will re-break the text accordingly, while the second text is not broken at all.
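The behaviour described above can be approximated with a minimal sketch of a greedy line breaker that is allowed to break only at zero-width spaces. This is a simplification under stated assumptions, not how any particular browser is implemented: the function name `wrap_at_zwsp` is hypothetical, and width is measured in characters rather than in rendered glyph widths.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def wrap_at_zwsp(text: str, width: int) -> list[str]:
    """Greedily break text into lines, breaking only where a ZWSP occurs.

    The ZWSP itself is dropped from the output, since it has no visible width.
    A segment longer than `width` stays on a line of its own and overflows.
    """
    lines: list[str] = []
    current = ""
    for segment in text.split(ZWSP):
        if current and len(current) + len(segment) > width:
            # Adding this segment would overflow, so start a new line.
            lines.append(current)
            current = segment
        else:
            current += segment
    if current:
        lines.append(current)
    return lines


words = ["Lorem", "Ipsum", "Dolor", "Sit", "Amet"]
with_breaks = ZWSP.join(words)     # break opportunities between words
without_breaks = "".join(words)    # no break opportunities at all

print(wrap_at_zwsp(with_breaks, 12))     # ['LoremIpsum', 'DolorSitAmet']
print(wrap_at_zwsp(without_breaks, 12))  # ['LoremIpsumDolorSitAmet']
```

The second string has no break opportunities, so it comes back as a single overlong line, mirroring the unbroken sample text.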
Usage
HTML
In HTML pages, the HTML element <wbr> functions as a zero-width space. In Internet Explorer 6, the zero-width space was not supported in some fonts.[2]
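Assuming the equivalence described above, text containing zero-width spaces could be converted to markup that uses <wbr> instead. The following is only a sketch; the function name `zwsp_to_wbr` is hypothetical.

```python
import html

ZWSP = "\u200b"  # ZERO WIDTH SPACE


def zwsp_to_wbr(text: str) -> str:
    """Escape text for HTML, then replace each ZWSP with a <wbr> element."""
    # Escaping first ensures that only the inserted <wbr> tags are markup.
    return html.escape(text).replace(ZWSP, "<wbr>")


print(zwsp_to_wbr("Lorem" + ZWSP + "Ipsum & Dolor"))
# Lorem<wbr>Ipsum &amp; Dolor
```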
Prohibition in domain names
ICANN rules prohibit domain names from containing non-displayed characters, including the zero-width space, and most browsers prohibit their use within domain names because they can be used to create a homograph attack, where a malicious URL is visually indistinguishable from a legitimate one.[3][4]
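The risk can be illustrated with a small Python sketch that flags non-displayed characters in a host name. It does not implement the actual ICANN or browser rules: treating Unicode general category Cf (format characters) as "non-displayed" is an assumption made for illustration, and the function name and the spoofed name are hypothetical.

```python
import unicodedata


def invisible_characters(hostname: str) -> list[tuple[int, str]]:
    """Return (index, code point) pairs for format characters (category Cf)."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(hostname)
        if unicodedata.category(ch) == "Cf"
    ]


legitimate = "example.com"
spoofed = "exam\u200bple.com"  # hypothetical spoof with a ZWSP after "exam"

print(legitimate == spoofed)              # False, although both look the same
print(invisible_characters(legitimate))   # []
print(invisible_characters(spoofed))      # [(4, 'U+200B')]
```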
Encoding
| Preview | | |
|---|---|---|
| Unicode name | ZERO WIDTH SPACE | |
| Encodings | decimal | hex |
| Unicode | 8203 | U+200B |
| UTF-8 | 226 128 139 | E2 80 8B |
| Numeric character reference | &#8203; | &#x200B; |
| Named character reference | &ZeroWidthSpace;, &NegativeVeryThinSpace;, &NegativeThinSpace;, &NegativeMediumSpace;, &NegativeThickSpace; | |
The zero-width space character is encoded in Unicode as U+200B ZERO WIDTH SPACE.[5]
In HTML, it can be referenced as &ZeroWidthSpace;, &#8203; or &#x200B;. Additionally, the character entities &NegativeVeryThinSpace;, &NegativeThinSpace;, &NegativeMediumSpace;, and &NegativeThickSpace; also refer to the zero-width space, contrary to what their names suggest.[6]
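The encodings listed in this section can be checked with Python's standard library. The sketch below assumes that `html.unescape` follows the HTML named character reference list, in which `&ZeroWidthSpace;` and the four "Negative…Space" names all map to U+200B.

```python
import html

ZWSP = "\u200b"  # ZERO WIDTH SPACE

# Code point and UTF-8 byte sequence, in decimal and hexadecimal.
code_point = ord(ZWSP)
utf8_bytes = ZWSP.encode("utf-8")
print(code_point, f"U+{code_point:04X}")
print(list(utf8_bytes), utf8_bytes.hex(" ").upper())

# Numeric and named character references that decode to the ZWSP.
references = [
    "&#8203;",
    "&#x200B;",
    "&ZeroWidthSpace;",
    "&NegativeVeryThinSpace;",
    "&NegativeThinSpace;",
    "&NegativeMediumSpace;",
    "&NegativeThickSpace;",
]
for ref in references:
    print(ref, html.unescape(ref) == ZWSP)
```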
The TeX representation is \hskip0pt; the LaTeX representation is \hspace{0pt};[7] and the groff representation is \:.[8]
See also
- Hair space
- Whitespace character – including a table comparing various space-like characters
- Word divider
- Word wrapping
- Word joiner (U+2060)
- U+FEFF, which is named "Zero Width No-Break Space" in Unicode
- Zero-width joiner (U+200D)
- Zero-width non-joiner (U+200C)
References
Citations
- [6] Entities/ZeroWidthSpace in MathML Version 2.0