Tags (Unicode block)

Tags
Range: U+E0000..U+E007F (128 code points)
Plane: SSP
Scripts: Common
Assigned: 97 code points
Unused: 31 reserved code points, 1 deprecated
Unicode version history: 3.1 (2001): 97 (+97)
Unicode documentation: Code chart ∣ Web page
Note: [1][2]

Tags is a Unicode block containing formatting tag characters. The block is designed to mirror ASCII: each tag character in U+E0020–U+E007E sits exactly 0xE0000 above the ASCII character it corresponds to. It was originally intended for language tags, but has since been repurposed for emoji modifiers, specifically for region flags.
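The ASCII mirroring can be illustrated with a minimal Python sketch. The names `TAG_BASE`, `to_tag`, and `from_tag` are hypothetical helpers introduced here for illustration; they are not part of any standard library.

```python
TAG_BASE = 0xE0000  # first code point of the Tags block


def to_tag(ch: str) -> str:
    """Map one printable ASCII character (U+0020..U+007E) to its tag counterpart."""
    cp = ord(ch)
    if not 0x20 <= cp <= 0x7E:
        raise ValueError(f"not a printable ASCII character: {ch!r}")
    return chr(TAG_BASE + cp)


def from_tag(ch: str) -> str:
    """Map one tag character (U+E0020..U+E007E) back to the ASCII character it mirrors."""
    cp = ord(ch)
    if not 0xE0020 <= cp <= 0xE007E:
        raise ValueError(f"not a mirrored tag character: U+{cp:04X}")
    return chr(cp - TAG_BASE)
```

For instance, `to_tag("a")` yields U+E0061, and `from_tag` reverses the mapping.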

Legacy use


U+E0001 and U+E0020–U+E007F were originally intended for invisibly tagging text by language,[3] but that use is no longer recommended.[4] All of those characters were deprecated in Unicode 5.1.

With the release of Unicode 8.0, U+E0020–U+E007E are no longer deprecated characters. The change was made "to clear the way for the potential future use of tag characters for a purpose other than to represent language tags".[5] Unicode states that "the use of tag characters to represent language tags in a plain text stream is still a deprecated mechanism for conveying language information about text".[5]

Current use


With the release of Unicode 9.0, U+E007F is no longer a deprecated character. (U+E0001 LANGUAGE TAG remains deprecated.) Emoji 5.0, released in May 2017,[6] treats these characters as emoji for use as modifiers in special sequences.

The only usage specified is for representing the flags of regions, alongside the use of Regional Indicator Symbols for national flags.[7] These sequences consist of U+1F3F4 🏴 WAVING BLACK FLAG followed by a sequence of tags corresponding to the region as coded in the CLDR, then U+E007F CANCEL TAG. For example, using the tags for "gbeng" (🏴󠁧󠁢󠁥󠁮󠁧󠁿) will cause some systems to display the flag of England, those for "gbsct" (🏴󠁧󠁢󠁳󠁣󠁴󠁿) the flag of Scotland, and those for "gbwls" (🏴󠁧󠁢󠁷󠁬󠁳󠁿) the flag of Wales.[7]
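The construction described above can be sketched in Python as follows. The function name `subdivision_flag` is a hypothetical helper, and the input check (lowercase ASCII letters and digits only) is an assumption made for this sketch rather than a rule quoted from the specification.

```python
WAVING_BLACK_FLAG = "\U0001F3F4"  # base character of the sequence
CANCEL_TAG = "\U000E007F"         # terminates the tag sequence
TAG_BASE = 0xE0000


def subdivision_flag(code: str) -> str:
    """Build a flag tag sequence from a region code such as "gbeng"."""
    code = code.lower()
    # Assumption for this sketch: only ASCII letters and digits are accepted.
    if not (code.isascii() and code.isalnum()):
        raise ValueError(f"unexpected region code: {code!r}")
    tags = "".join(chr(TAG_BASE + ord(c)) for c in code)
    return WAVING_BLACK_FLAG + tags + CANCEL_TAG


england = subdivision_flag("gbeng")
# List the code points that make up the sequence.
print(" ".join(f"U+{ord(c):04X}" for c in england))
```

Whether the resulting string is drawn as a flag depends entirely on the font and platform; a system without support will typically show only the black flag.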

The tag sequences are derived from ISO 3166-2, but sequences representing other subnational flags (for example US states) are also possible using this mechanism. However, as of Unicode version 12.0 only the three flag sequences listed above are "Recommended for General Interchange" by the Unicode Consortium, meaning they are "most likely to be widely supported across multiple platforms".[8]

Because tag characters are normally invisible when rendered, they have been used to embed hidden prompt injections in text sent to large language models (LLMs).[9]
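One way to look for such hidden text is to scan a string for runs of code points from the Tags block and decode them back to ASCII. The sketch below is illustrative only: `reveal_tags` and `strip_tags` are hypothetical names, and it is not taken from any particular security tool.

```python
import re

# Any run of code points from the Tags block (U+E0000..U+E007F).
TAG_RUN = re.compile("[\U000E0000-\U000E007F]+")


def reveal_tags(text: str) -> list[str]:
    """Return the ASCII text mirrored by each run of tag characters in `text`."""
    payloads = []
    for run in TAG_RUN.findall(text):
        # Only U+E0020..U+E007E mirror printable ASCII; other tags are skipped.
        payloads.append(
            "".join(chr(ord(c) - 0xE0000) for c in run if 0xE0020 <= ord(c) <= 0xE007E)
        )
    return payloads


def strip_tags(text: str) -> str:
    """Remove every character from the Tags block."""
    return TAG_RUN.sub("", text)
```

Note that `strip_tags` also removes the tags inside legitimate flag sequences, leaving a bare U+1F3F4, so a real filter would probably need to exempt those sequences.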

Unicode block

Tags[1][2][3]
Official Unicode Consortium code chart (PDF)
         0     1     2     3     4     5     6     7     8     9     A     B     C     D     E     F
U+E000x        BEGIN
U+E001x
U+E002x  SP    !     "     #     $     %     &     '     (     )     *     +     ,     -     .     /
U+E003x  0     1     2     3     4     5     6     7     8     9     :     ;     <     =     >     ?
U+E004x  @     A     B     C     D     E     F     G     H     I     J     K     L     M     N     O
U+E005x  P     Q     R     S     T     U     V     W     X     Y     Z     [     \     ]     ^     _
U+E006x  `     a     b     c     d     e     f     g     h     i     j     k     l     m     n     o
U+E007x  p     q     r     s     t     u     v     w     x     y     z     {     |     }     ~     END
1.^ As of Unicode version 17.0
2.^ Blank cells indicate non-assigned code points
3.^ Unicode code points U+E0001 and U+E0020 through U+E007F were deprecated with Unicode version 5.1; however, as of Unicode version 9.0, only U+E0001 remains deprecated
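The layout of the block can be cross-checked against the counts given for it (97 assigned code points, one of them deprecated, and 31 reserved) with a short Python sketch. The function `tag_status` and its three status labels are hypothetical, chosen only for this illustration.

```python
from collections import Counter

BLOCK_START, BLOCK_END = 0xE0000, 0xE007F


def tag_status(cp: int) -> str:
    """Classify a code point of the Tags block using the ranges given in the chart."""
    if not BLOCK_START <= cp <= BLOCK_END:
        raise ValueError(f"U+{cp:04X} is outside the Tags block")
    if cp == 0xE0001:
        return "deprecated"  # LANGUAGE TAG, shown as BEGIN in the chart
    if 0xE0020 <= cp <= 0xE007F:
        return "assigned"    # ASCII mirrors plus CANCEL TAG (END)
    return "reserved"        # U+E0000 and U+E0002..U+E001F


counts = Counter(tag_status(cp) for cp in range(BLOCK_START, BLOCK_END + 1))
print(counts)
```

Here the 96 code points labelled "assigned" plus the single deprecated one give the 97 assigned code points of the block.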

History


The following Unicode-related documents record the purpose and process of defining specific characters in the Tags block:
