Unicode collation algorithm

From Wikipedia, the free encyclopedia
(Redirected from Unicode Collation Algorithm)
Jump to navigation Jump to search

The Unicode collation algorithm (UCA) is an algorithm defined in Unicode Technical Report #10, which is a customizable method to produce binary keys from strings representing text in any writing system and language that can be represented with Unicode. These keys can then be efficiently compared byte by byte in order to collate or sort them according to the rules of the language, with options for ignoring case, accents, etc.[1]

Unicode Technical Report #10 also specifies the Default Unicode Collation Element Table (DUCET). This data file specifies a default collation ordering. The DUCET is customizable for different languages,[1][2] and some such customizations can be found in the Unicode Common Locale Data Repository (CLDR).[3]

An open source implementation of UCA is included with the International Components for Unicode, ICU.[4][5] ICU supports tailoring, and the collation tailorings from CLDR are included in ICU.[6][2]

See also

[edit | edit source]

References

[edit | edit source]
  1. ^ a b Lua error in Module:Citation/CS1/Configuration at line 2172: attempt to index field '?' (a nil value).
  2. ^ a b Lua error in Module:Citation/CS1/Configuration at line 2172: attempt to index field '?' (a nil value).
  3. ^ Lua error in Module:Citation/CS1/Configuration at line 2172: attempt to index field '?' (a nil value).
  4. ^ Lua error in Module:Citation/CS1/Configuration at line 2172: attempt to index field '?' (a nil value).
  5. ^ Lua error in Module:Citation/CS1/Configuration at line 2172: attempt to index field '?' (a nil value).
  6. ^ Lua error in Module:Citation/CS1/Configuration at line 2172: attempt to index field '?' (a nil value).
[edit | edit source]

Tools

[edit | edit source]