Script (Unicode)
In Unicode, a script is a collection of letters and other written signs used to represent textual information in one or more writing systems.[1] Some scripts support only one writing system and language, for example, Armenian. Other scripts support many different writing systems; for example, the Latin script supports English, French, German, Italian, Vietnamese, Latin itself, and many other languages. Some languages have been written in more than one script; for example, Turkish was written in the Arabic script until the early 20th century, when it switched to the Latin script. More or less complementary to scripts are symbols and Unicode control characters.
The unified diacritical characters and unified punctuation characters frequently have the "common" or "inherited" script property. However, the individual scripts often have their own punctuation and diacritics, so that many scripts include not only letters but also diacritic and other marks, punctuation, numerals and even their own idiosyncratic symbols and space characters.
Unicode 17.0 defines 172 separate scripts, including 102 modern scripts and 70 ancient or historic scripts.[2][3] More scripts are in the process of being encoded or have been tentatively allocated for encoding in roadmaps.[4]
Definition and classification
When multiple languages make use of the same script, there are frequently some differences, particularly in diacritics and other marks. For example, Swedish and English both use the Latin script. However, Swedish includes the character å (sometimes called a Swedish O), while English has no such character. Nor does English make use of the diacritic combining ring above for any character. In general, the languages sharing the same scripts share many of the same characters. Despite these peripheral differences in the Swedish and English writing systems, they are said to use the same Latin script. Thus, the Unicode abstraction of scripts is a basic organizing technique. The differences among different alphabets or writing systems remain and are supported through Unicode’s flexible scripts, combining marks and collation algorithms.
Script versus writing system
The term "writing system" is sometimes treated as a synonym for "script". However, it can also refer to a specific concrete writing system supported by a script. For example, the Vietnamese writing system is supported by the Latin script. A writing system may also cover more than one script; for example, the Japanese writing system makes use of the kanji, hiragana and katakana scripts.
Most writing systems can be broadly divided into several categories: logographic, syllabic, alphabetic (or segmental), abugida, abjad and featural; however, all features of any of these may be found in any given writing system in varying proportions, often making it difficult to purely categorize a system. The term complex system is sometimes used to describe those where the admixture makes classification problematic.
Unicode supports all of these types of writing systems through its numerous scripts. Unicode also adds further properties to characters to help differentiate the various characters and the ways they behave within Unicode text-processing algorithms.
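The point that one writing system can draw on several scripts can be sketched in a few lines of Python. This is only an illustration: the names `SAMPLE_RANGES`, `sample_script` and `scripts_used` are hypothetical, and the range table is a small hand-copied sample rather than the full script data that Unicode publishes in its character database, so most characters fall outside it.

```python
from bisect import bisect_right

# Abridged, hand-copied (first, last, script) ranges, sorted by first code point.
# Illustrative only: a real implementation would load the full Unicode script data.
SAMPLE_RANGES = [
    (0x0041, 0x005A, "Latin"),     # A..Z
    (0x0061, 0x007A, "Latin"),     # a..z
    (0x3041, 0x3096, "Hiragana"),
    (0x30A1, 0x30FA, "Katakana"),
    (0x4E00, 0x9FFF, "Han"),       # CJK Unified Ideographs block only
]
STARTS = [first for first, _, _ in SAMPLE_RANGES]


def sample_script(ch):
    """Return the script of ch according to the sample table, or None if not covered."""
    cp = ord(ch)
    i = bisect_right(STARTS, cp) - 1   # last range starting at or before cp
    if i >= 0:
        first, last, script = SAMPLE_RANGES[i]
        if first <= cp <= last:
            return script
    return None


def scripts_used(text):
    """Collect the distinct scripts (from the sample table) that occur in text."""
    return {s for s in map(sample_script, text) if s is not None}


print(sorted(scripts_used("日本語のテキスト")))  # ['Han', 'Hiragana', 'Katakana']
print(sorted(scripts_used("Hello")))             # ['Latin']
```

The Japanese string mixes kanji, hiragana and katakana, so three scripts are reported for a single writing system, while the English word uses only Latin.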
Special script property values
In addition to explicit or specific script properties, Unicode uses three special values:[5]
- Common
- Unicode can assign a character in the UCS to a single script only. However, many characters—those that are not part of a formal natural-language writing system or are unified across many writing systems—may be used in more than one script (for example, currency signs, symbols, numerals and punctuation marks). In these cases Unicode defines them as belonging to the "common" script (ISO 15924 code "Zyyy").
- Inherited
- Many diacritics and non-spacing combining characters may be applied to characters from more than one script. In these cases Unicode assigns them to the "inherited" script (ISO 15924 code Zinh), which means that they have the same script class as the base character with which they combine, and so in different contexts they may be treated as belonging to different scripts. For example, U+0308 ◌̈ COMBINING DIAERESIS may combine either with U+0065 e LATIN SMALL LETTER E to create a Latin ë or with U+0435 е CYRILLIC SMALL LETTER IE for the Cyrillic ё. In the former case, it inherits the Latin script of the base character, whereas in the latter case, it inherits the Cyrillic script of the base character.
- Unknown
- The value of "unknown" script (ISO 15924 code Zzzz) is given to unassigned, private-use, noncharacter, and surrogate code points.
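The behaviour of the "inherited" value can be illustrated with a minimal sketch. It assumes a tiny hand-written table, `SAMPLE_SCRIPTS`, holding only the four characters needed for the example; the table and the function `resolve_scripts` are hypothetical names, and any code point missing from the table is simply reported as "Unknown" here, which is a simplification of the real property.

```python
# Tiny, hand-written sample of script property values (illustrative only).
SAMPLE_SCRIPTS = {
    0x0065: "Latin",      # e, LATIN SMALL LETTER E
    0x0435: "Cyrillic",   # е, CYRILLIC SMALL LETTER IE
    0x0308: "Inherited",  # COMBINING DIAERESIS
    0x0020: "Common",     # SPACE
}


def resolve_scripts(text):
    """Return one script per character, replacing "Inherited" with the
    script of the most recent non-inherited character (its base)."""
    resolved = []
    last_base = None
    for ch in text:
        script = SAMPLE_SCRIPTS.get(ord(ch), "Unknown")
        if script == "Inherited":
            # A mark with no preceding base keeps the value "Inherited".
            if last_base is not None:
                script = last_base
        else:
            last_base = script
        resolved.append(script)
    return resolved


# "ë" and "ё" written as base letter + U+0308, separated by a space.
print(resolve_scripts("e\u0308 \u0435\u0308"))
# ['Latin', 'Latin', 'Common', 'Cyrillic', 'Cyrillic']
```

The same code point U+0308 is reported as Latin in the first pair and as Cyrillic in the second, which is the context dependence described in the list above.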
Character categories within scripts
Unicode provides a general category property for each character, so in addition to belonging to a script, every character also has a general category. Typically, scripts include letter characters: uppercase letters, lowercase letters and modifier letters. A few precomposed ligatures, such as ǲ (U+01F2), are classified as titlecase letters. Such titlecase ligatures are all in the Latin and Greek scripts and are all compatibility characters, and therefore Unicode discourages their use by authors. It is unlikely that new titlecase letters will be added in the future.
Most writing systems do not differentiate between uppercase and lowercase letters. For those scripts all letters are categorized as "other letter" or "modifier letter". Ideographs such as Unihan ideographs are also categorized as "other letters". A few scripts do differentiate between uppercase and lowercase, however, including Latin, Cyrillic, Greek, Armenian, Georgian, and Deseret. Even for these scripts there are some letters that are neither uppercase nor lowercase.
Scripts can also contain any other general category character such as marks (diacritic and otherwise), numbers (numerals), punctuation, separators (word separators such as spaces), symbols and non-graphical format characters. These are included in a particular script when they are unique to that script. Other such characters are generally unified and included in the punctuation or diacritic blocks. However, the bulk of characters in any script (other than the common and inherited scripts) are letters.
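The general categories described in this section can be inspected with Python's standard `unicodedata` module, whose `category()` function returns the two-letter general category code of a character. The sample characters and the helper `is_cased_letter` below are chosen for illustration only; the helper looks at the general category alone and is not a full implementation of Unicode case properties.

```python
import unicodedata

# (character, short description) pairs chosen to cover several categories.
SAMPLES = [
    ("A", "uppercase letter"),
    ("a", "lowercase letter"),
    ("\u01F2", "titlecase ligature"),
    ("\u02B0", "modifier letter"),
    ("文", "ideograph"),
    ("\u0308", "combining diaeresis"),
    ("5", "decimal digit"),
    (" ", "space"),
]

for ch, description in SAMPLES:
    print(f"U+{ord(ch):04X} {unicodedata.category(ch)} {description}")
# Categories printed, in order: Lu, Ll, Lt, Lm, Lo, Mn, Nd, Zs


def is_cased_letter(ch):
    """True if ch is an uppercase, lowercase or titlecase letter by general category."""
    return unicodedata.category(ch) in {"Lu", "Ll", "Lt"}


print(is_cased_letter("Д"))   # True: Cyrillic distinguishes case
print(is_cased_letter("あ"))  # False: hiragana letters are "other letter" (Lo)
```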
List of encoded scripts
As of version 17.0, Unicode defines 175 script aliases ("property value aliases") for codes in the ISO 15924 list. Unicode assigns the alias "Common" to ISO 15924's Zyyy code for undetermined scripts, "Inherited" to ISO 15924's Zinh code for inherited scripts, and "Unknown" to ISO 15924's Zzzz code for uncoded scripts. Some script codes defined by ISO 15924 are not used in Unicode, including Zsym (Symbols) and Zmth (Mathematical notation).
| ISO 15924 code | ISO number | ISO formal name | Directionality | Unicode alias[f] | Unicode version | Characters | Notes | Description |
|---|---|---|---|---|---|---|---|---|
| Adlm | 166 | Adlam | | Adlam | 9.0 | 88 | | Ch 19.9 |
| Afak | 439 | Afaka | | | | | Not in Unicode, proposal is explored[i] | |
| Aghb | 239 | Caucasian Albanian | | Caucasian Albanian | 7.0 | 53 | Ancient/historic | Ch 8.11 |
| Ahom | 338 | Ahom, Tai Ahom | | Ahom | 8.0 | 65 | Ancient/historic | Ch 15.16 |
| Arab | 160 | Arabic | | Arabic | 1.0 | 1,413 | | Ch 9.2 |
| Aran | 161 | Arabic (Nastaliq variant) | | | | | Typographic variant of Arabic (see § Arab) | |
| Armi | 124 | Imperial Aramaic | | Imperial Aramaic | 5.2 | 31 | Ancient/historic | Ch 10.4 |
| Armn | 230 | Armenian | | Armenian | 1.0 | 96 | | Ch 7.6 |
| Avst | 134 | Avestan | | Avestan | 5.2 | 61 | Ancient/historic | Ch 10.7 |
| Bali | 360 | Balinese | | Balinese | 5.0 | 127 | | Ch 17.3 |
| Bamu | 435 | Bamum | | Bamum | 5.2 | 657 | | Ch 19.6 |
| Bass | 259 | Bassa Vah | | Bassa Vah | 7.0 | 36 | Ancient/historic | Ch 19.7 |
| Batk | 365 | Batak | | Batak | 6.0 | 56 | | Ch 17.6 |
| Beng | 325 | Bengali (Bangla) | | Bengali | 1.0 | 96 | | Ch 12.2 |
| Berf | 258 | Beria Erfe | | Beria Erfe | 17.0 | 50 | | |
| Bhks | 334 | Bhaiksuki | | Bhaiksuki | 9.0 | 97 | Ancient/historic | Ch 14.3 |
| Blis | 550 | Blissymbols | varies | | | | Not in Unicode, proposal is explored[i] | |
| Bopo | 285 | Bopomofo | | Bopomofo | 1.0 | 77 | | Ch 18.3 |
| Brah | 300 | Brahmi | | Brahmi | 6.0 | 115 | Ancient/historic | Ch 14.1 |
| Brai | 570 | Braille | | Braille | 3.0 | 256 | | Ch 21.1 |
| Bugi | 367 | Buginese | | Buginese | 4.1 | 30 | | Ch 17.2 |
| Buhd | 372 | Buhid | | Buhid | 3.2 | 20 | | Ch 17.1 |
| Cakm | 349 | Chakma | | Chakma | 6.1 | 71 | | Ch 13.11 |
| Cans | 440 | Unified Canadian Aboriginal Syllabics | | Canadian Aboriginal | 3.0 | 726 | | Ch 20.2 |
| Cari | 201 | Carian | | Carian | 5.1 | 49 | Ancient/historic | Ch 8.5 |
| Cham | 358 | Cham | | Cham | 5.1 | 83 | | Ch 16.10 |
| Cher | 445 | Cherokee | | Cherokee | 3.0 | 172 | | Ch 20.1 |
| Chis | 298 | Chisoi | | | | | Not in Unicode, proposal is mature[ii] | |
| Chrs | 109 | Chorasmian | | Chorasmian | 13.0 | 28 | Ancient/historic | Ch 10.8 |
| Cirt | 291 | Cirth | varies | | | | Not in Unicode | |
| Copt | 204 | Coptic | | Coptic | 1.0 | 137 | Ancient/historic, disunified from Greek in 4.1 | Ch 7.3 |
| Cpmn | 402 | Cypro-Minoan | | Cypro Minoan | 14.0 | 99 | Ancient/historic | Ch 8.4 |
| Cprt | 403 | Cypriot syllabary | | Cypriot | 4.0 | 55 | Ancient/historic | Ch 8.3 |
| Cyrl | 220 | Cyrillic | | Cyrillic | 1.0 | 508 | Includes typographic variant Old Church Slavonic (see § Cyrs) | Ch 7.4 |
| Cyrs | 221 | Cyrillic (Old Church Slavonic variant) | | | | | Typographic variant of Cyrillic (see § Cyrl); Ancient/historic | |
| Deva | 315 | Devanagari (Nagari) | | Devanagari | 1.0 | 164 | | Ch 12.1 |
| Diak | 342 | Dives Akuru | | Dives Akuru | 13.0 | 72 | Ancient/historic | Ch 15.15 |
| Dogr | 328 | Dogra | | Dogra | 11.0 | 60 | Ancient/historic | Ch 15.18 |
| Dsrt | 250 | Deseret (Mormon) | | Deseret | 3.1 | 80 | | Ch 20.4 |
| Dupl | 755 | Duployan shorthand, Duployan stenography | | Duployan | 7.0 | 143 | | Ch 21.6 |
| Egyd | 070 | Egyptian demotic | mixed | | | | Not in Unicode | |
| Egyh | 060 | Egyptian hieratic | mixed | | | | Not in Unicode | |
| Egyp | 050 | Egyptian hieroglyphs | | Egyptian Hieroglyphs | 5.2 | 5,105 | Ancient/historic | Ch 11.4 |
| Elba | 226 | Elbasan | | Elbasan | 7.0 | 40 | Ancient/historic | Ch 8.10 |
| Elym | 128 | Elymaic | | Elymaic | 12.0 | 23 | Ancient/historic | Ch 10.9 |
| Ethi | 430 | Ethiopic (Geʻez) | | Ethiopic | 3.0 | 523 | | Ch 19.1 |
| Gara | 164 | Garay | | Garay | 16.0 | 69 | | |
| Geok | 241 | Khutsuri (Asomtavruli and Nuskhuri) | | Georgian | | | Unicode groups Khutsori, Asomtavruli and Nuskhuri into 'Georgian' (see § Geok). Similarly, Mkhedruli and Mtavruli are 'Georgian' (see § Geor) | Ch 7.7 |
| Geor | 240 | Georgian (Mkhedruli and Mtavruli) | | Georgian | 1.0 | 173 | In Unicode this also includes Nuskhuri (Geok) | Ch 7.7 |
| Glag | 225 | Glagolitic | | Glagolitic | 4.1 | 134 | Ancient/historic | Ch 7.5 |
| Gong | 312 | Gunjala Gondi | | Gunjala Gondi | 11.0 | 63 | | Ch 13.15 |
| Gonm | 313 | Masaram Gondi | | Masaram Gondi | 10.0 | 75 | | Ch 13.14 |
| Goth | 206 | Gothic | | Gothic | 3.1 | 27 | Ancient/historic | Ch 8.9 |
| Gran | 343 | Grantha | | Grantha | 7.0 | 85 | Ancient/historic | Ch 15.14 |
| Grek | 200 | Greek | | Greek | 1.0 | 518 | Directionality sometimes as boustrophedon | Ch 7.2 |
| Gujr | 320 | Gujarati | | Gujarati | 1.0 | 91 | | Ch 12.4 |
| Gukh | 397 | Gurung Khema | | Gurung Khema | 16.0 | 58 | | |
| Guru | 310 | Gurmukhi | | Gurmukhi | 1.0 | 80 | | Ch 12.3 |
| Hanb | 503 | Han with Bopomofo (alias for Han + Bopomofo) | mixed | | | | See § Hani, § Bopo | |
| Hang | 286 | Hangul (Hangŭl, Hangeul) | | Hangul | 1.0 | 11,739 | Hangul syllables relocated in 2.0 | Ch 18.6 |
| Hani | 500 | Han (Hanzi, Kanji, Hanja) | top-to-bottom, columns right-to-left (historically) | Han | 1.0 | 103,351 | | Ch 18.1 |
| Hano | 371 | Hanunoo (Hanunóo) | | Hanunoo | 3.2 | 21 | | Ch 17.1 |
| Hans | 501 | Han (Simplified variant) | varies | | | | Subset of Han (Hanzi, Kanji, Hanja) (see § Hani) | |
| Hant | 502 | Han (Traditional variant) | varies | | | | Subset of § Hani | |
| Hatr | 127 | Hatran | | Hatran | 8.0 | 26 | Ancient/historic | Ch 10.12 |
| Hebr | 125 | Hebrew | | Hebrew | 1.0 | 134 | | Ch 9.1 |
| Hira | 410 | Hiragana | | Hiragana | 1.0 | 381 | | Ch 18.4 |
| Hluw | 080 | Anatolian Hieroglyphs (Luwian Hieroglyphs, Hittite Hieroglyphs) | | Anatolian Hieroglyphs | 8.0 | 583 | Ancient/historic | Ch 11.6 |
| Hmng | 450 | Pahawh Hmong | | Pahawh Hmong | 7.0 | 127 | | Ch 16.11 |
| Hmnp | 451 | Nyiakeng Puachue Hmong | | Nyiakeng Puachue Hmong | 12.0 | 71 | | Ch 16.12 |
| Hntl | 504 | Han (Traditional variant) with Latin (alias for Hant + Latn) | | | | | See § Hant and § Latn | |
| Hrkt | 412 | Japanese syllabaries (alias for Hiragana + Katakana) | | Katakana or Hiragana | | | See § Hira, § Kana | Ch 18.4 |
| Hung | 176 | Old Hungarian (Hungarian Runic) | | Old Hungarian | 8.0 | 108 | Ancient/historic | Ch 8.8 |
| Inds | 610 | Indus (Harappan) | | | | | Not in Unicode, proposal is explored[i] | |
| Ital | 210 | Old Italic (Etruscan, Oscan, etc.) | | Old Italic | 3.1 | 39 | Ancient/historic | Ch 8.6 |
| Jamo | 284 | Jamo (alias for Jamo subset of Hangul) | varies | | | | Subset of § Hang | |
| Java | 361 | Javanese | | Javanese | 5.2 | 90 | | Ch 17.4 |
| Jpan | 413 | Japanese (alias for Han + Hiragana + Katakana) | varies | | | | See § Hani, § Hira and § Kana | |
| Jurc | 510 | Jurchen | | | | | Not in Unicode | |
| Kali | 357 | Kayah Li | | Kayah Li | 5.1 | 47 | | Ch 16.9 |
| Kana | 411 | Katakana | | Katakana | 1.0 | 321 | | Ch 18.4 |
| Kawi | 368 | Kawi | | Kawi | 15.0 | 87 | Ancient/historic | Ch 17.9 |
| Khar | 305 | Kharoshthi | | Kharoshthi | 4.1 | 68 | Ancient/historic | Ch 14.2 |
| Khmr | 355 | Khmer | | Khmer | 3.0 | 146 | | Ch 16.4 |
| Khoj | 322 | Khojki | | Khojki | 7.0 | 65 | Ancient/historic | Ch 15.7 |
| Kitl | 505 | Khitan large script | | | | | Not in Unicode | |
| Kits | 288 | Khitan small script | | Khitan Small Script | 13.0 | 472 | Ancient/historic | Ch 18.12 |
| Knda | 345 | Kannada | | Kannada | 1.0 | 92 | | Ch 12.8 |
| Kore | 287 | Korean (alias for Hangul + Han) | left-to-right | | | | See § Hani, § Hang | |
| Kpel | 436 | Kpelle | | | | | Not in Unicode, proposal is explored[i] | |
| Krai | 396 | Kirat Rai | | Kirat Rai | 16.0 | 58 | | |
| Kthi | 317 | Kaithi | | Kaithi | 5.2 | 68 | Ancient/historic | Ch 15.2 |
| Lana | 351 | Tai Tham (Lanna) | | Tai Tham | 5.2 | 127 | | Ch 16.7 |
| Laoo | 356 | Lao | | Lao | 1.0 | 83 | | Ch 16.2 |
| Latf | 217 | Latin (Fraktur variant) | | | | | Typographic variant of Latin (see § Latn) | |
| Latg | 216 | Latin (Gaelic variant) | | | | | Typographic variant of Latin (see § Latn) | |
| Latn | 215 | Latin | | Latin | 1.0 | 1,492 | See also: Latin script in Unicode | Ch 7.1 |
| Leke | 364 | Leke | | | | | Not in Unicode | |
| Lepc | 335 | Lepcha (Róng) | | Lepcha | 5.1 | 74 | | Ch 13.12 |
| Limb | 336 | Limbu | | Limbu | 4.0 | 68 | | Ch 13.6 |
| Lina | 400 | Linear A | | Linear A | 7.0 | 341 | Ancient/historic | Ch 8.1 |
| Linb | 401 | Linear B | | Linear B | 4.0 | 211 | Ancient/historic | Ch 8.2 |
| Lisu | 399 | Lisu (Fraser) | | Lisu | 5.2 | 49 | | Ch 18.9 |
| Loma | 437 | Loma | | | | | Not in Unicode, proposal is explored[i] | |
| Lyci | 202 | Lycian | | Lycian | 5.1 | 29 | Ancient/historic | Ch 8.5 |
| Lydi | 116 | Lydian | | Lydian | 5.1 | 27 | Ancient/historic | Ch 8.5 |
| Mahj | 314 | Mahajani | | Mahajani | 7.0 | 39 | Ancient/historic | Ch 15.6 |
| Maka | 366 | Makasar | | Makasar | 11.0 | 25 | Ancient/historic | Ch 17.8 |
| Mand | 140 | Mandaic, Mandaean | | Mandaic | 6.0 | 29 | | Ch 9.5 |
| Mani | 139 | Manichaean | | Manichaean | 7.0 | 51 | Ancient/historic | Ch 10.5 |
| Marc | 332 | Marchen | | Marchen | 9.0 | 68 | Ancient/historic | Ch 14.5 |
| Maya | 090 | Mayan hieroglyphs | mixed | | | | Not in Unicode | |
| Medf | 265 | Medefaidrin (Oberi Okaime, Oberi Ɔkaimɛ) | | Medefaidrin | 11.0 | 91 | | Ch 19.10 |
| Mend | 438 | Mende Kikakui | | Mende Kikakui | 7.0 | 213 | | Ch 19.8 |
| Merc | 101 | Meroitic Cursive | | Meroitic Cursive | 6.1 | 90 | Ancient/historic | Ch 11.5 |
| Mero | 100 | Meroitic Hieroglyphs | | Meroitic Hieroglyphs | 6.1 | 32 | Ancient/historic | Ch 11.5 |
| Mlym | 347 | Malayalam | | Malayalam | 1.0 | 118 | | Ch 12.9 |
| Modi | 324 | Modi, Moḍī | | Modi | 7.0 | 79 | Ancient/historic | Ch 15.12 |
| Mong | 145 | Mongolian | | Mongolian | 3.0 | 168 | Mong includes Clear and Manchu scripts | Ch 13.5 |
| Moon | 218 | Moon (Moon code, Moon script, Moon type) | mixed | | | | Not in Unicode, proposal is explored[i] | |
| Mroo | 264 | Mro, Mru | | Mro | 7.0 | 43 | | Ch 13.8 |
| Mtei | 337 | Meitei Mayek (Meithei, Meetei) | | Meetei Mayek | 5.2 | 79 | | Ch 13.7 |
| Mult | 323 | Multani | | Multani | 8.0 | 38 | Ancient/historic | Ch 15.10 |
| Mymr | 350 | Myanmar (Burmese) | | Myanmar | 3.0 | 243 | | Ch 16.3 |
| Nagm | 295 | Nag Mundari | | Nag Mundari | 15.0 | 42 | | |
| Nand | 311 | Nandinagari | | Nandinagari | 12.0 | 65 | Ancient/historic | Ch 15.13 |
| Narb | 106 | Old North Arabian (Ancient North Arabian) | | Old North Arabian | 7.0 | 32 | Ancient/historic | Ch 10.1 |
| Nbat | 159 | Nabataean | | Nabataean | 7.0 | 40 | Ancient/historic | Ch 10.10 |
| Newa | 333 | Newa, Newar, Newari, Nepāla lipi | | Newa | 9.0 | 97 | | Ch 13.3 |
| Nkdb | 085 | Naxi Dongba (na²¹ɕi³³ to³³ba²¹, Nakhi Tomba) | | | | | Not in Unicode | |
| Nkgb | 420 | Naxi Geba (na²¹ɕi³³ gʌ²¹ba²¹, 'Na-'Khi ²Ggŏ-¹baw, Nakhi Geba) | | | | | Not in Unicode, proposal is explored[i] | |
| Nkoo | 165 | N’Ko | | NKo | 5.0 | 62 | | Ch 19.4 |
| Nshu | 499 | Nüshu | | Nushu | 10.0 | 397 | | Ch 18.8 |
| Ogam | 212 | Ogham | | Ogham | 3.0 | 29 | Ancient/historic | Ch 8.14 |
| Olck | 261 | Ol Chiki (Ol Cemet’, Ol, Santali) | | Ol Chiki | 5.1 | 48 | | Ch 13.10 |
| Onao | 296 | Ol Onal | | Ol Onal | 16.0 | 44 | | |
| Orkh | 175 | Old Turkic, Orkhon Runic | | Old Turkic | 5.2 | 73 | Ancient/historic | Ch 14.8 |
| Orya | 327 | Oriya (Odia) | | Oriya | 1.0 | 91 | | Ch 12.5 |
| Osge | 219 | Osage | | Osage | 9.0 | 72 | | Ch 20.3 |
| Osma | 260 | Osmanya | | Osmanya | 4.0 | 40 | | Ch 19.2 |
| Ougr | 143 | Old Uyghur | mixed | Old Uyghur | 14.0 | 26 | Ancient/historic | Ch 14.11 |
| Palm | 126 | Palmyrene | | Palmyrene | 7.0 | 32 | Ancient/historic | Ch 10.11 |
| Pauc | 263 | Pau Cin Hau | | Pau Cin Hau | 7.0 | 57 | | Ch 16.13 |
| Pcun | 015 | Proto-Cuneiform | | | | | Not in Unicode | |
| Pelm | 016 | Proto-Elamite | | | | | Not in Unicode | |
| Perm | 227 | Old Permic | | Old Permic | 7.0 | 43 | Ancient/historic | Ch 8.13 |
| Phag | 331 | Phags-pa | | Phags-pa | 5.0 | 56 | Ancient/historic | Ch 14.4 |
| Phli | 131 | Inscriptional Pahlavi | | Inscriptional Pahlavi | 5.2 | 27 | Ancient/historic | Ch 10.6 |
| Phlp | 132 | Psalter Pahlavi | | Psalter Pahlavi | 7.0 | 29 | Ancient/historic | Ch 10.6 |
| Phlv | 133 | Book Pahlavi | mixed | | | | Not in Unicode | |
| Phnx | 115 | Phoenician | | Phoenician | 5.0 | 29 | Ancient/historic[g] | Ch 10.3 |
| Piqd | 293 | Klingon (KLI pIqaD) | | | | | Rejected for inclusion in Unicode[iii][iv] | |
| Plrd | 282 | Miao (Pollard) | | Miao | 6.1 | 149 | | Ch 18.10 |
| Prti | 130 | Inscriptional Parthian | | Inscriptional Parthian | 5.2 | 30 | Ancient/historic | Ch 10.6 |
| Psin | 103 | Proto-Sinaitic | mixed | | | | Not in Unicode | |
| Qaaa-Qabx | 900-949 | Reserved for private use (range) | | | | | Not in Unicode | |
| Ranj | 303 | Ranjana | | | | | Not in Unicode | |
| Rjng | 363 | Rejang (Redjang, Kaganga) | | Rejang | 5.1 | 37 | | Ch 17.5 |
| Rohg | 167 | Hanifi Rohingya | | Hanifi Rohingya | 11.0 | 50 | | Ch 16.14 |
| Roro | 620 | Rongorongo | mixed | | | | Not in Unicode, proposal is explored[i] | |
| Runr | 211 | Runic | | Runic | 3.0 | 86 | Ancient/historic | Ch 8.7 |
| Samr | 123 | Samaritan | | Samaritan | 5.2 | 61 | | Ch 9.4 |
| Sara | 292 | Sarati | mixed | | | | Not in Unicode | |
| Sarb | 105 | Old South Arabian | | Old South Arabian | 5.2 | 32 | Ancient/historic | Ch 10.2 |
| Saur | 344 | Saurashtra | | Saurashtra | 5.1 | 82 | | Ch 13.13 |
| Seal | 590 | (Small) Seal | varies | | | | Not in Unicode, proposal is explored[i] | |
| Sgnw | 095 | SignWriting | | SignWriting | 8.0 | 672 | | Ch 21.7 |
| Shaw | 281 | Shavian (Shaw) | | Shavian | 4.0 | 48 | | Ch 8.15 |
| Shrd | 319 | Sharada, Śāradā | | Sharada | 6.1 | 104 | | Ch 15.3 |
| Shui | 530 | Shuishu | left-to-right | | | | Not in Unicode | |
| Sidd | 302 | Siddham, Siddhaṃ, Siddhamātṛkā | | Siddham | 7.0 | 92 | Ancient/historic | Ch 15.5 |
| Sidt | 180 | Sidetic | | Sidetic | 17.0 | 26 | Ancient/historic | |
| Sind | 318 | Khudawadi, Sindhi | | Khudawadi | 7.0 | 69 | | Ch 15.9 |
| Sinh | 348 | Sinhala | | Sinhala | 3.0 | 111 | | Ch 13.2 |
| Sogd | 141 | Sogdian | | Sogdian | 11.0 | 42 | Ancient/historic | Ch 14.10 |
| Sogo | 142 | Old Sogdian | | Old Sogdian | 11.0 | 40 | Ancient/historic | Ch 14.9 |
| Sora | 398 | Sora Sompeng | | Sora Sompeng | 6.1 | 35 | | Ch 15.17 |
| Soyo | 329 | Soyombo | | Soyombo | 10.0 | 83 | Ancient/historic | Ch 14.7 |
| Sund | 362 | Sundanese | | Sundanese | 5.1 | 72 | | Ch 17.7 |
| Sunu | 274 | Sunuwar | | Sunuwar | 16.0 | 44 | | |
| Sylo | 316 | Syloti Nagri | | Syloti Nagri | 4.1 | 45 | Ancient/historic | Ch 15.1 |
| Syrc | 135 | Syriac | | Syriac | 3.0 | 88 | Includes typographic variants Estrangelo (see § Syre), Western (§ Syrj), and Eastern (§ Syrn) | Ch 9.3 |
| Syre | 138 | Syriac (Estrangelo variant) | | | | | Typographic variant of Syriac (see § Syrc) | |
| Syrj | 137 | Syriac (Western variant) | | | | | Typographic variant of Syriac (see § Syrc) | |
| Syrn | 136 | Syriac (Eastern variant) | | | | | Typographic variant of Syriac (see § Syrc) | |
| Tagb | 373 | Tagbanwa | | Tagbanwa | 3.2 | 18 | | Ch 17.1 |
| Takr | 321 | Takri, Ṭākrī, Ṭāṅkrī | | Takri | 6.1 | 68 | | Ch 15.4 |
| Tale | 353 | Tai Le | | Tai Le | 4.0 | 35 | | Ch 16.5 |
| Talu | 354 | New Tai Lue | | New Tai Lue | 4.1 | 83 | | Ch 16.6 |
| Taml | 346 | Tamil | | Tamil | 1.0 | 123 | | Ch 12.6 |
| Tang | 520 | Tangut | | Tangut | 9.0 | 7,059 | Ancient/historic | Ch 18.11 |
| Tavt | 359 | Tai Viet | | Tai Viet | 5.2 | 72 | | Ch 16.8 |
| Tayo | 380 | Tai Yo | | Tai Yo | 17.0 | 55 | | |
| Telu | 340 | Telugu | | Telugu | 1.0 | 101 | | Ch 12.7 |
| Teng | 290 | Tengwar | | | | | Not in Unicode | |
| Tfng | 120 | Tifinagh (Berber) | | Tifinagh | 4.1 | 59 | | Ch 19.3 |
| Tglg | 370 | Tagalog (Baybayin, Alibata) | | Tagalog | 3.2 | 23 | | Ch 17.1 |
| Thaa | 170 | Thaana | | Thaana | 3.0 | 50 | | Ch 13.1 |
| Thai | 352 | Thai | | Thai | 1.0 | 86 | | Ch 16.1 |
| Tibt | 330 | Tibetan | | Tibetan | 2.0 | 207 | Added in 1.0, removed in 1.1 and reintroduced in 2.0 | Ch 13.4 |
| Tirh | 326 | Tirhuta | | Tirhuta | 7.0 | 82 | | Ch 15.11 |
| Tnsa | 275 | Tangsa | | Tangsa | 14.0 | 89 | | Ch 13.18 |
| Todr | 229 | Todhri | | Todhri | 16.0 | 52 | Ancient/historic | |
| Tols | 299 | Tolong Siki | | Tolong Siki | 17.0 | 54 | | |
| Toto | 294 | Toto | | Toto | 14.0 | 31 | | Ch 13.17 |
| Tutg | 341 | Tulu-Tigalari | | Tulu Tigalari | 16.0 | 80 | Ancient/historic | |
| Ugar | 040 | Ugaritic | | Ugaritic | 4.0 | 31 | Ancient/historic | Ch 11.2 |
| Vaii | 470 | Vai | | Vai | 5.1 | 300 | | Ch 19.5 |
| Visp | 280 | Visible Speech | | | | | Not in Unicode | |
| Vith | 228 | Vithkuqi | | Vithkuqi | 14.0 | 70 | Ancient/historic | Ch 8.12 |
| Wara | 262 | Warang Citi (Varang Kshiti) | | Warang Citi | 7.0 | 84 | | Ch 13.9 |
| Wcho | 283 | Wancho | | Wancho | 12.0 | 59 | | Ch 13.16 |
| Wole | 480 | Woleai | | | | | Not in Unicode, proposal is explored[i] | |
| Xpeo | 030 | Old Persian | | Old Persian | 4.1 | 50 | Ancient/historic | Ch 11.3 |
| Xsux | 020 | Cuneiform, Sumero-Akkadian | | Cuneiform | 5.0 | 1,234 | Ancient/historic | Ch 11.1 |
| Yezi | 192 | Yezidi | | Yezidi | 13.0 | 47 | Ancient/historic | Ch 9.6 |
| Yiii | 460 | Yi | | Yi | 3.0 | 1,220 | | Ch 18.7 |
| Zanb | 339 | Zanabazar Square (Zanabazarin Dörböljin Useg, Xewtee Dörböljin Bicig, Horizontal Square Script) | | Zanabazar Square | 10.0 | 72 | Ancient/historic | Ch 14.6 |
| Zinh | 994 | Code for inherited script | | Inherited | | 684 | | |
| Zmth | 995 | Mathematical notation | | | | | Not a 'script' in Unicode | |
| Zsye | 993 | Symbols (emoji variant) | | | | | Not a 'script' in Unicode | |
| Zsym | 996 | Symbols | | | | | Not a 'script' in Unicode | |
| Zxxx | 997 | Code for unwritten documents | | | | | Not a 'script' in Unicode | |
| Zyyy | 998 | Code for undetermined script | | Common | | 9,123 | | |
| Zzzz | 999 | Code for uncoded script | | Unknown | | 954,246 | In Unicode: All other code points | |
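The relationship between ISO 15924 codes and Unicode script aliases can be expressed as a simple lookup. The sketch below copies a handful of rows from the table above by hand; the dictionary `ISO_TO_UNICODE_ALIAS` and the function `unicode_alias` are hypothetical names for illustration, and a real application would load the complete list rather than this subset.

```python
# Hand-copied subset of the table above: ISO 15924 code -> Unicode script alias.
# None marks codes that ISO 15924 defines but Unicode does not use as a script.
ISO_TO_UNICODE_ALIAS = {
    "Latn": "Latin",
    "Cyrl": "Cyrillic",
    "Arab": "Arabic",
    "Zyyy": "Common",
    "Zinh": "Inherited",
    "Zzzz": "Unknown",
    "Zsym": None,
    "Zmth": None,
}


def unicode_alias(iso_code):
    """Look up the Unicode alias for an ISO 15924 code in the sample table.

    The code is normalized to the usual four-letter form with an initial
    capital. Returns None for codes Unicode does not treat as a script and
    raises KeyError for codes missing from this small sample.
    """
    return ISO_TO_UNICODE_ALIAS[iso_code.strip().title()]


print(unicode_alias("zyyy"))  # Common
print(unicode_alias("Zinh"))  # Inherited
print(unicode_alias("Zsym"))  # None
```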
Missing scripts in Unicode
The Missing Scripts project, with contributors from the Mainz University of Applied Sciences, the Atelier national de recherche typographique (ANRT) in Nancy, and the University of California, Berkeley, has compiled a list of 131 scripts that have not yet been encoded in the Unicode Standard, out of a total of 294 recognized scripts according to the current state of research.[6]
References
- [4] Roadmaps to Unicode. https://www.unicode.org/roadmaps/
External links
- Script Encoding Initiative, a project at UC Berkeley, USA, working to get more scripts included in the Unicode Standard.
- The World’s Writing Systems, an overview of all 294 known writing systems, each with a typographic reference glyph and its Unicode status.