jiskanji(5) — Macro Packages and Conventions
NAME
jiskanji − A character encoding system (codeset) for Japanese
DESCRIPTION
JIS Kanji is a codeset that uses the JIS X0202 code extension method for encoding the JIS X0208 and JIS X0201 character sets. There are two types of JIS Kanji encoding: 7-bit JIS Kanji code and 8-bit JIS Kanji code.
7-bit JIS Kanji Code
In 7-bit JIS Kanji encoding, all character values are 7-bit bytes. Characters are interpreted according to preceding in and out sequences as follows:
•Kanji in sequence (ESC $ B)
The code values following the Kanji in sequence (ESC $ B) are treated as characters in the JIS X0208 Kanji character set.
•Kanji out sequence (ESC ( B)
The code values following the Kanji out sequence (ESC ( B) are treated as ASCII characters.
•Supplementary Kanji in sequence (ESC $ ( D)
The code values following the supplementary Kanji in sequence (ESC $ ( D) are treated as characters in the JIS X0212 supplementary Kanji character set.
•User-Defined Character (UDC) in sequence (ESC $ ( 0)
The code values following the UDC in sequence (ESC $ ( 0) are treated as characters in the vendor-defined or user-defined character set.
•Kana in (SO) and Kana out (SI) sequences
The code values following SO (0x0e) and preceding SI (0x0f) are treated as characters in the JIS X0201 Katakana character set.
•Katakana in sequence (ESC ( I)
Code values following the Katakana in sequence (ESC ( I) are treated as characters in the JIS X0201 Katakana character set. In this case, the Kanji out sequence is used to switch back to ASCII code.
The Katakana in and Kanji out sequences are an alternative to using the Kana in and out sequences (SO/SI).
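The state switching described above can be sketched in Python as follows. This is only an illustration, not the conversion code used by any utility: the names split_runs and decode_run and the run labels are hypothetical, the assumption that SI restores whatever state was active before SO is not stated in this page, and the Kanji decoding relies on the fact that JIS X0208 bytes with the high bit set form EUC-JP. Supplementary Kanji and UDC runs are split but not decoded.

```python
SO, SI = 0x0E, 0x0F  # Kana in and Kana out

# Sequences from the list above; the labels are illustrative names only.
SEQUENCES = {
    b"\x1b$B": "kanji",            # Kanji in: JIS X0208
    b"\x1b(B": "ascii",            # Kanji out: ASCII
    b"\x1b$(D": "supplementary",   # Supplementary Kanji in: JIS X0212
    b"\x1b$(0": "udc",             # UDC in
    b"\x1b(I": "katakana",         # Katakana in: JIS X0201 Katakana
}


def split_runs(data: bytes):
    """Split 7-bit JIS Kanji bytes into (label, payload) runs."""
    runs, buf = [], bytearray()
    state = before_so = "ascii"
    i = 0
    while i < len(data):
        byte, new_state, step = data[i], None, 1
        if byte == 0x1B:
            for seq, label in SEQUENCES.items():
                if data.startswith(seq, i):
                    new_state, step = label, len(seq)
                    break
            else:
                raise ValueError(f"unrecognized escape sequence at offset {i}")
        elif byte == SO:
            before_so, new_state = state, "katakana"
        elif byte == SI:
            new_state = before_so  # assumption: SI restores the pre-SO state
        if new_state is None:
            buf.append(byte)       # ordinary code value: keep it in the run
        elif new_state != state:
            if buf:
                runs.append((state, bytes(buf)))
                buf.clear()
            state = new_state
        i += step
    if buf:
        runs.append((state, bytes(buf)))
    return runs


def decode_run(label: str, payload: bytes) -> str:
    """Decode one run; only ASCII, Kanji, and Katakana are sketched."""
    if label == "ascii":
        return payload.decode("ascii")
    if label == "kanji":
        # JIS X0208 bytes with the high bit set are valid EUC-JP.
        return bytes(b | 0x80 for b in payload).decode("euc_jp")
    if label == "katakana":
        # Assumes payload values 0x21-0x5F; they map onto U+FF61-U+FF9F.
        return "".join(chr(0xFF61 + b - 0x21) for b in payload)
    raise ValueError(f"no decoder sketched for {label!r}")


sample = b"A\x1b$B0!\x1b(B \x0e1\x0f."
print("".join(decode_run(*run) for run in split_runs(sample)))
```

The sample mixes an ASCII run, one JIS X0208 character between Kanji in and Kanji out, and one Katakana character between SO and SI.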
8-bit JIS Kanji Code
In 8-bit JIS Kanji encoding, the JIS X0201 Katakana characters are represented as 8-bit bytes. Using this form of encoding, in and out sequences have the following effect:
•Kanji in sequence (ESC $ B)
Code values following the Kanji in sequence (ESC $ B) are treated as characters in the JIS X0208 Kanji character set.
•Supplementary Kanji in sequence (ESC $ ( D)
Code values following the supplementary Kanji in sequence (ESC $ ( D) are treated as characters in the JIS X0212 supplementary Kanji character set.
•User-Defined Character (UDC) in sequence (ESC $ ( 0)
Code values following the UDC in sequence (ESC $ ( 0) are treated as vendor-defined or user-defined characters.
•Kanji out sequence (ESC ( B)
Code values following the Kanji out sequence (ESC ( B) are treated as ASCII characters.
•Kana in and out sequences (SO/SI)
These sequences are ignored.
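As a rough illustration of the 8-bit form, the following Python sketch decodes bytes that lie outside any Kanji, supplementary Kanji, or UDC run. The function name decode_unshifted_8bit is hypothetical, and the byte range 0xA1 to 0xDF for Katakana is taken from JIS X0201 itself rather than from this page.

```python
SO, SI = 0x0E, 0x0F  # Kana in and Kana out


def decode_unshifted_8bit(data: bytes) -> str:
    """Decode 8-bit JIS Kanji bytes that lie outside any Kanji run."""
    out = []
    for byte in data:
        if byte in (SO, SI):
            continue  # Kana in and out are ignored in the 8-bit form
        if byte < 0x80:
            out.append(chr(byte))  # ASCII
        elif 0xA1 <= byte <= 0xDF:
            # JIS X0201 Katakana maps onto U+FF61-U+FF9F (half-width forms).
            out.append(chr(0xFF61 + byte - 0xA1))
        else:
            raise ValueError(f"byte 0x{byte:02x} is not ASCII or Katakana")
    return "".join(out)


print(decode_unshifted_8bit(bytes([0x41, 0x0E, 0xB1, 0x0F, 0xB2])))
```

Because each Katakana character carries its own high bit, no shift state needs to be tracked for it, which is why SO and SI can be dropped.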
RESTRICTIONS
The JIS Kanji codeset is not supported directly by a locale. It is supported only through code conversion (for example, through the iconv utility or Japanese terminal (tty) code conversion).
In the codeset naming conventions used by the iconv utility, the string JIS7 indicates 7-bit JIS Kanji code in which Katakana follows a Katakana in sequence, and the string jiskanji7 indicates 7-bit JIS Kanji code in which Katakana is entered between Kana in and out sequences. The following sequences are valid for input to the iconv utility but are not generated when code is converted to JIS Kanji:
•Kanji in (ESC $ @)
•Kanji in (ESC & @ ESC $ B)
•Kanji in (ESC $ ( B)
•Kanji in (ESC $ ( @)
•Supplementary Kanji in (ESC $ D)
•Kana in (ESC ( J)
•Kana in (ESC ( H)
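A program that inspects JIS Kanji input could recognize these input-only sequences with a lookup table. The Python sketch below is one possible way to do that; the names INPUT_ONLY and input_only_label are hypothetical, and the labels are copied from the list above without interpreting what each sequence designates.

```python
# Input-only sequences from the list above, as raw bytes (ESC is 0x1b).
INPUT_ONLY = {
    b"\x1b$@": "Kanji in",
    b"\x1b&@\x1b$B": "Kanji in",
    b"\x1b$(B": "Kanji in",
    b"\x1b$(@": "Kanji in",
    b"\x1b$D": "Supplementary Kanji in",
    b"\x1b(J": "Kana in",
    b"\x1b(H": "Kana in",
}


def input_only_label(data: bytes, offset: int = 0):
    """Return (label, length) if an input-only sequence starts at offset."""
    # Try longer sequences first so a short match never hides a long one.
    for seq in sorted(INPUT_ONLY, key=len, reverse=True):
        if data.startswith(seq, offset):
            return INPUT_ONLY[seq], len(seq)
    return None


print(input_only_label(b"\x1b$@0!"))   # an input-only Kanji in sequence
print(input_only_label(b"\x1b$B0!"))   # the ordinary Kanji in sequence
```

The returned length tells the caller how many bytes to skip before reading the code values that follow.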
In the code naming conventions of the Japanese terminal, the string jis7 indicates 7-bit JIS Kanji code and the string jis8 indicates 8-bit JIS Kanji code. When the terminal code is set to jis7, the Kana in and out sequences (SO/SI) are used for JIS X0201 Katakana character representation.
SEE ALSO
Commands: locale(1)
Others: ascii(5), deckanji(5), eucJP(5), i18n_intro(5), i18n_printing(5), iconv_intro(5), iso2022jp(5), Japanese(5), l10n_intro(5), sdeckanji(5), shiftjis(5)