This article is cited in 1 scientific paper (total in 1 paper)
Adjustable variable-length character encoding scheme — ACE
I. M. Adamovich, D. V. Zemskov
Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Abstract:
The article describes ACE (Adjustable Character Encoding), a variable-length character encoding scheme capable of encoding the full range of UCS (Universal Coded Character Set, ISO/IEC 10646) code points as sequences of one to four octets (8-bit code units). The encoding was created for two reasons. The first is to increase, in comparison with UTF-8 (Unicode Transformation Format, 8-bit), the number of code points encoded as a single octet, which allows a more compact representation of texts containing characters of a chosen national alphabet. The second is to let the encoded characters of that alphabet keep, as far as possible, the same binary values they have in a single-byte code table. The encoding retains the following properties of UTF-8: statelessness (the representation of an encoded character does not depend on the values of previous characters), self-synchronization (no valid code sequence can occur inside another one or span the boundary between adjacent sequences), and the ability to locate the beginning or the end of a code sequence from any position in the encoded text.
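The abstract does not give the ACE byte layout, so the following minimal Python sketch illustrates only the UTF-8 behaviour that ACE is compared against. It shows the size penalty UTF-8 imposes on a national alphabet relative to a single-byte code table, and how the start of a code sequence can be found from an arbitrary byte position. The function names `sequence_start` and `sequence_length`, the sample strings, and the choice of Windows-1251 as the single-byte table are assumptions made for illustration; none of them come from the article.

```python
def sequence_start(data: bytes, pos: int) -> int:
    """Return the index of the first byte of the UTF-8 sequence containing data[pos]."""
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx; step back over them.
    while pos > 0 and (data[pos] & 0xC0) == 0x80:
        pos -= 1
    return pos


def sequence_length(lead: int) -> int:
    """Return the length in bytes of a UTF-8 sequence, given its lead byte."""
    if lead < 0x80:            # 0xxxxxxx: one-octet sequence (ASCII range only)
        return 1
    if lead >> 5 == 0b110:     # 110xxxxx: two-octet sequence
        return 2
    if lead >> 4 == 0b1110:    # 1110xxxx: three-octet sequence
        return 3
    if lead >> 3 == 0b11110:   # 11110xxx: four-octet sequence
        return 4
    raise ValueError("not a UTF-8 lead byte")


# Sample text with a 1-octet, a 2-octet and a 3-octet character (illustrative).
encoded = "aЖ€".encode("utf-8")

# Land in the middle of the last character and recover its boundaries.
start = sequence_start(encoded, 5)
end = start + sequence_length(encoded[start])
last_char = encoded[start:end].decode("utf-8")

# Size comparison for a Cyrillic word: UTF-8 versus a single-byte code table.
word = "Привет"
utf8_size = len(word.encode("utf-8"))
single_byte_size = len(word.encode("cp1251"))
```

In UTF-8 only the ASCII range fits in one octet, so every Cyrillic letter in `word` costs two octets, while the single-byte table stores each in one. This is the gap that, according to the abstract, ACE aims to close for a chosen alphabet while keeping the boundary-finding property shown by `sequence_start`.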
Keywords:
character encoding scheme; UCS; program localization; UTF-8.
Received: 05.05.2014
Citation:
I. M. Adamovich, D. V. Zemskov, “Adjustable variable-length character encoding scheme — ACE”, Sistemy i Sredstva Inform., 24:4 (2014), 124–134
Linking options:
https://www.mathnet.ru/eng/ssi379 https://www.mathnet.ru/eng/ssi/v24/i4/p124