About: UTF-16 is a research topic. Over its lifetime, 14 publications have appeared on this topic, receiving 130 citations. The topic is also known as: 16-bit Unicode Transformation Format and Unicode Transformation Format – 16-bit.
TL;DR: The UTF-16 encoding of Unicode/ISO-10646 is described, the issues of serializing UTF-16 as an octet stream for transmission over the Internet are addressed, and MIME charset naming is discussed as described in [CHARSET-REG].
Abstract: This document describes the UTF-16 encoding of Unicode/ISO-10646,
addresses the issues of serializing UTF-16 as an octet stream for
transmission over the Internet, discusses MIME charset naming as
described in [CHARSET-REG], and contains the registration for three
MIME charset parameter values: UTF-16BE (big-endian), UTF-16LE
(little-endian), and UTF-16. This memo provides information for the
Internet community.
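The serialization issue this abstract refers to can be illustrated with a short sketch. It assumes Python's built-in `utf-16-be`, `utf-16-le`, and `utf-16` codecs as stand-ins for the three charset labels; the helper `surrogate_pair` is a hypothetical name introduced here, not something defined in the memo.

```python
import codecs


def surrogate_pair(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into two 16-bit code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in the supplementary range")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


text = "A\U00010400"  # 'A' followed by U+10400, which needs a surrogate pair

# The same code units serialized in the two possible octet orders.
be = text.encode("utf-16-be")
le = text.encode("utf-16-le")
print(be.hex(" "))  # 00 41 d8 01 dc 00
print(le.hex(" "))  # 41 00 01 d8 00 dc

# With a byte order mark in front, a decoder labelled plain "utf-16"
# can work out the octet order from the data itself.
assert (codecs.BOM_UTF16_BE + be).decode("utf-16") == text
assert (codecs.BOM_UTF16_LE + le).decode("utf-16") == text
```

The sketch deliberately avoids decoding unmarked data as plain `utf-16`: how a decoder treats a stream with no byte order mark depends on the implementation, and Python's default may not match what a given protocol expects.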
TL;DR: A method is described for converting the encoding format of a character code unit to Unicode in a Java Input Method Editor (IME), producing a Unicode code point whose glyph is displayed and which is then transferred to an application.
Abstract: A method for converting to Unicode, in a Java Input Method Editor (“IME”), the encoding formats of a character code unit, including selecting an encoding format, receiving, through a computer user interface, in an IME, at least one character code unit having the encoding format and an encoding base, and displaying the character code unit through the computer user interface. Embodiments also include converting the encoding format of the character code unit to Unicode, thereby creating a Unicode code point, displaying, through the computer user interface, a glyph corresponding to the Unicode code point, and transferring the Unicode code point to an application.
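The conversion step described above (a code unit entered in some encoding format and numeric base, turned into a Unicode code point) can be sketched roughly as follows. This is an illustration only, written in Python rather than Java, and not the patented method: the function name `code_unit_to_code_points` and its parameters are assumptions made for this example, and Python's codec names stand in for the "encoding format".

```python
from typing import List


def code_unit_to_code_points(digits: str, base: int, encoding: str) -> List[int]:
    """Interpret `digits` (written in `base`) as bytes in `encoding`; return Unicode code points.

    Hypothetical helper: leading zero bytes are dropped by the integer
    conversion, so this sketch only suits encodings whose code units do not
    begin with a zero byte.
    """
    value = int(digits, base)
    length = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(length, "big")
    return [ord(ch) for ch in raw.decode(encoding)]


# Shift_JIS code unit 82A0 entered in hexadecimal.
points = code_unit_to_code_points("82A0", 16, "shift_jis")
print([f"U+{cp:04X}" for cp in points])  # ['U+3042']
print(chr(points[0]))                    # the glyph an IME would display

# The same idea with an octal encoding base and ISO-8859-1.
print(code_unit_to_code_points("351", 8, "latin-1"))  # [233]
```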
TL;DR: The Unicode Standard, version 2.0, and ISO/IEC 10646-1:1993(E) (as amended) jointly define a character set (hereafter referred to as Unicode) which encompasses most of the world's writing systems, but Internet mail currently supports only 7-bit US ASCII as a character set.
Abstract: The Unicode Standard, version 2.0, and ISO/IEC 10646-1:1993(E) (as amended) jointly define a character set (hereafter referred to as Unicode) which encompasses most of the world's writing systems. However, Internet mail (STD 11, RFC 822) currently supports only 7-bit US ASCII as a character set. MIME (RFC 2045 through 2049) extends Internet mail to support different media types and character sets, and thus could support Unicode in mail messages. MIME neither defines Unicode as a permitted character set nor specifies how it would be encoded, although it does provide for the registration of additional character sets over time.
TL;DR: The background of the development of this standard among vendors and by the International Organization for Standardization (ISO) is presented and the character encoding’s design goals and principles are described.
Abstract: A universal character encoding—the Unicode standard—has been developed to produce international software and to process and render data in most of the world’s languages. In this paper, we present the background of the development of this standard among vendors and by the International Organization for Standardization (ISO). We describe the character encoding’s design goals and principles. We also discuss the issues an application handles when processing Unicode text. We conclude with a description of some approaches that can be taken to support Unicode and a discussion of Microsoft’s implementation. Microsoft’s decision to use Unicode as the native text encoding in its Windows NT (New Technology) operating system is of particular significance for the success of Unicode.
TL;DR: The Unicode Standard, version 1.1, and ISO/IEC 10646-1:1993(E) jointly define a 16-bit character set (hereafter referred to as Unicode) which encompasses most of the world's writing systems, but Internet mail currently supports only 7-bit US ASCII as a character set.
Abstract: The Unicode Standard, version 1.1, and ISO/IEC 10646-1:1993(E) jointly define a 16-bit character set (hereafter referred to as Unicode) which encompasses most of the world's writing systems. However, Internet mail (STD 11, RFC 822) currently supports only 7-bit US ASCII as a character set. MIME (RFC 1521 and RFC 1522) extends Internet mail to support different media types and character sets, and thus could support Unicode in mail messages. MIME neither defines Unicode as a permitted character set nor specifies how it would be encoded, although it does provide for the registration of additional character sets over time.