Since: May/31st/2004
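Computers represent everything as 0s and 1s, so text is stored by assigning a number to each character under a scheme called a character encoding (Character Encodings). The most basic of these is 7-bit ASCII.
Beyond ASCII, each character set assigns its own characters to code positions, so the same code position can stand for different characters depending on the character set in use.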
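ASCII (American Standard Code for Information Interchange) is a 7-bit code, defined in RFC20 in 1969. When it is carried in 8 bits, the eighth bit is either used as a parity bit for error checking or set to 0.
The set contains 128 characters: 94 printable characters (0x21 to 0x7e), 33 control characters, and the space (0x20). Together they fill the whole 7-bit code space from 0x00 to 0x7f. The last code position, 0x7F = 0111 1111, is delete (del).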
Code area | Meaning |
---|---|
0x00 to 0x1f | Control characters |
0x20 | Space |
0x21 to 0x7e | Graphic characters (printable characters) |
0x7f | Delete (DEL) |
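The code areas in the table can be checked with a few lines of Python. This is only a minimal sketch; the function name `classify_ascii` and the area labels are made up for illustration and are not part of any standard.

```python
def classify_ascii(code: int) -> str:
    """Return the area of the 7-bit ASCII code space that `code` falls in."""
    if not 0x00 <= code <= 0x7F:
        raise ValueError("not a 7-bit ASCII code")
    if code <= 0x1F:
        return "control"   # 0x00 to 0x1f
    if code == 0x20:
        return "space"     # 0x20
    if code <= 0x7E:
        return "graphic"   # 0x21 to 0x7e
    return "delete"        # 0x7f


# Count how many code positions fall in each area.
counts = {}
for code in range(0x80):
    area = classify_ascii(code)
    counts[area] = counts.get(area, 0) + 1

print(counts)
```

The 33 control characters mentioned in the text are the 32 codes in 0x00 to 0x1f plus DEL at 0x7f.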
ASCII is also described in RFC1345. In this document, to distinguish it from the various extended ASCII sets, plain ASCII is referred to as 7-bit ASCII (RFC20).
It is registered with IANA as follows:
Name: ANSI_X3.4-1968 [RFC1345,KXS2]
MIBenum: 3
Source: ECMA registry
Alias: iso-ir-6
Alias: ANSI_X3.4-1986
Alias: ISO_646.irv:1991
Alias: ASCII
Alias: ISO646-US
Alias: US-ASCII (preferred MIME name)
Alias: us
Alias: IBM367
Alias: cp367
Alias: csASCII
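Character sets other than ASCII are handled by the ISO-2022 framework, and its Japanese profile is ISO-2022-JP, also known as JIS code. ISO-2022-JP uses escape sequences such as the following to switch between ASCII code and the Japanese character sets.
1b+28+42 = "[ESC] ( B" : ASCII
1b+28+4a = "[ESC] ( J" : JIS Roman
1b+24+40 = "[ESC] $ @" : Old JIS (78JIS)
1b+24+42 = "[ESC] $ B" : New JIS (83JIS)
After an escape sequence, the same code points refer to code positions in the newly selected character set. Half-width katakana are handled the same way and are assigned code positions of their own.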
Escape sequence | ASCII representation | 1st byte | 2nd byte | Character set |
---|---|---|---|---|
 | | 00-1f, 7f | | Control codes |
1b+28+42 | [ESC] ( B | 20-7e | | ASCII |
1b+28+4a | [ESC] ( J | 20-7e | | JIS Roman |
1b+24+40 | [ESC] $ @ | 21-7e | 21-7e | Old JIS (78JIS) |
1b+24+42 | [ESC] $ B | 21-7e | 21-7e | New JIS (83JIS) |
1b+28+49 | [ESC] ( I | 21-5f | | Half-width katakana |
1b+24+28+44 | [ESC] $ ( D | 21-7e | 21-7e | JIS supplementary kanji |
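[ESC] is the escape character, "0x1b" in hexadecimal. The characters that follow it in an escape sequence select the character set.
For example, 0x1b+(+B means that the bytes that follow are ASCII, and 0x1b+$+B means that the bytes that follow are JIS code in its 1983 edition.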
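In ISO-2022-JP, Japanese characters are represented in 2 bytes and, as with ASCII, only 7 of the 8 bits in each byte are used. For example, 0x3021 is the kanji 「亜」 after "0x1B + $ + B", but the two characters 「0!」 after "0x1B + ( + B". Likewise the hiragana 「あ」, JIS 0x2422, reads as 「$"」 when interpreted as ASCII code.
In general, hiragana have 0x24 as their first byte and so look like 「$+x」 as ASCII code, while katakana have 0x25 as their first byte and look like 「%+x」. Half-width katakana occupy a1-df (63 characters) in the 8-bit representation, JIS8. Clearing the top bit to 0 gives the 7-bit representation, JIS7, at 21-5f.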
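The effect of the escape sequences can be observed with Python's built-in `iso2022_jp` codec. The sketch below is illustrative only; it assumes a standard CPython build, whose codec switches to New JIS with [ESC] $ B and back to ASCII with [ESC] ( B.

```python
# Escape sequences that select the character set for the bytes that follow.
ESC_ASCII = b"\x1b(B"   # 1b+28+42
ESC_JIS83 = b"\x1b$B"   # 1b+24+42

kanji_a = "\u4e9c"      # the kanji at JIS 0x3021
hira_a = "\u3042"       # the hiragana at JIS 0x2422

# Encoding wraps the 2-byte JIS code in the two escape sequences.
encoded = kanji_a.encode("iso2022_jp")
print(encoded)

# The same two bytes 0x30 0x21 decode differently depending on
# which escape sequence precedes them.
as_jis = (ESC_JIS83 + b"\x30\x21" + ESC_ASCII).decode("iso2022_jp")
as_ascii = (ESC_ASCII + b"\x30\x21").decode("iso2022_jp")
print(as_jis, as_ascii)

# Hiragana start with 0x24, which is "$" when read as ASCII.
hira_bytes = hira_a.encode("iso2022_jp")
print(hira_bytes)
```

Every byte in the output is below 0x80, which is why this encoding passes through 7-bit channels.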
In short, ISO-2022-JP is a code of at most 2 bytes per character that uses escape sequences to switch between single-byte and double-byte characters.
It is registered with IANA as follows:
Name: ISO-2022-JP (preferred MIME name) [RFC1468,Murai]
MIBenum: 39
Source: RFC-1468 (see also RFC-2237)
Alias: csISO2022JP
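ISO-2022 relies on escape sequences to reuse the same code points for different character sets, so every escape sequence has to be tracked, and each one adds extra control bytes to the text.
Shift JIS avoids this by placing the single-byte code points, including half-width katakana, and the double-byte code points in separate ranges.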
0x5c
1st byte | 2nd byte | Character set |
---|---|---|
00-1f, 7f | | Control codes |
21-7e | | ASCII |
81-9f, e0-ef | 40-7e, 80-fc | New JIS (83JIS) |
a1-df | | Half-width katakana |
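Because the first byte alone tells whether a character is one or two bytes long, a Shift JIS byte string can be split into characters without any state. The following is a minimal sketch based on the ranges in the table; `split_shift_jis` is a hypothetical helper name, and it does not validate the second byte.

```python
from typing import List


def split_shift_jis(data: bytes) -> List[bytes]:
    """Split Shift JIS bytes into per-character byte strings."""
    chars = []
    i = 0
    while i < len(data):
        first = data[i]
        # 81-9f and e0-ef start a double-byte character.
        if 0x81 <= first <= 0x9F or 0xE0 <= first <= 0xEF:
            if i + 1 >= len(data):
                raise ValueError("truncated double-byte character")
            chars.append(data[i:i + 2])
            i += 2
        else:
            # Control codes, ASCII, and half-width katakana (a1-df).
            chars.append(data[i:i + 1])
            i += 1
    return chars


# "A", the kanji U+4E9C, and half-width katakana U+FF71.
sample = "A\u4e9c\uff71".encode("shift_jis")
print(sample, split_shift_jis(sample))
```

Note that the second byte of a double-byte character may fall in 40-7e, which overlaps ASCII, so scanning must always start from a known character boundary.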
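Because no escape sequences are needed and the length of a character can be determined from its first byte, the code is easy to process.
Because it uses all 8 bits, however, it may not pass through SMTP (Simple Mail Transfer Protocol), which handles only 7 bits. The same applies to EUC-JP and UTF-8.
In short, Shift JIS separates single-byte and double-byte characters by their code points, so the type of a character can be determined from its byte value, and it is a code of at most 2 bytes per character.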
It is registered with IANA as follows:
Name: Shift_JIS (preferred MIME name)
MIBenum: 17
Source: This charset is an extension of csHalfWidthKatakana by adding graphic characters in JIS X 0208. The CCS's are JIS X0201:1997 and JIS X0208:1997. The complete definition is shown in Appendix 1 of JIS X0208:1997. This charset can be used for the top-level media type "text".
Alias: MS_Kanji
Alias: csShiftJIS
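JIS shares its code points with the 7-bit ASCII code and switches between them with escape sequences. Shift_JIS uses no escape sequences, but single-byte characters, double-byte characters, and their second bytes end up in overlapping code areas. EUC distinguishes ASCII from everything else by whether the top bit is 0 or 1.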
1st byte | 2nd byte | 3rd byte | Character set |
---|---|---|---|
00-1f, 7f | | | Control codes |
21-7e | | | ASCII |
a1-fe | a1-fe | | New JIS (83JIS) |
8f | a1-fe | a1-fe | Supplementary kanji |
8e | a1-df | | Half-width katakana |
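The relation between JIS and EUC-JP in the table can be illustrated in Python: setting the top bit of each byte of a 2-byte JIS code gives the EUC-JP bytes. This is a small sketch using the built-in `euc_jp` codec; the variable names are only for illustration.

```python
# JIS code 0x3021 (the kanji U+4E9C) as two 7-bit bytes.
jis_code = b"\x30\x21"

# Setting the top bit of each byte moves it into the a1-fe range.
euc_from_jis = bytes(b | 0x80 for b in jis_code)
euc_kanji = "\u4e9c".encode("euc_jp")
print(euc_from_jis, euc_kanji)

# Half-width katakana (U+FF71) is prefixed with 8e.
euc_kana = "\uff71".encode("euc_jp")
print(euc_kana)

# ASCII bytes keep the top bit at 0, everything else has it set to 1.
mixed = "A\u4e9c".encode("euc_jp")
top_bits = [b >> 7 for b in mixed]
print(top_bits)
```

Because of this property, any byte below 0x80 in EUC-JP text is always an ASCII character, never part of a multi-byte character.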
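In short, unlike JIS and Shift JIS, EUC-JP is a code of up to 3 bytes per character. EUC-JP supports the supplementary kanji and also half-width katakana, although the use of half-width katakana on the Internet is generally discouraged.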
It is registered with IANA as follows:
Name: Extended_UNIX_Code_Packed_Format_for_Japanese
MIBenum: 18
Source: Standardized by OSF, UNIX International, and UNIX Systems Laboratories Pacific. Uses ISO 2022 rules to select
code set 0: US-ASCII (a single 7-bit byte set)
code set 1: JIS X0208-1990 (a double 8-bit byte set) restricted to A0-FF in both bytes
code set 2: Half Width Katakana (a single 7-bit byte set) requiring SS2 as the character prefix
code set 3: JIS X0212-1990 (a double 7-bit byte set) restricted to A0-FF in both bytes requiring SS3 as the character prefix
Alias: csEUCPkdFmtJapanese
Alias: EUC-JP (preferred MIME name)
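EBCDIC is used on mainframes and can carry multi-byte codes. EBCDIC (Extended BCD Interchange Code) is an 8-bit code that extends BCD (Binary Coded Decimal).
When double-byte characters are used, the single-byte characters are called SBCS (Single Byte Character Set) and the double-byte characters are called DBCS (Double Byte Character Set). Switching between single-byte and double-byte is done with shift-out (SO, 0x0e) and shift-in (SI, 0x0f). In other words, double-byte characters are enclosed between SO and SI.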
It is registered with IANA as follows:
Name: IBM037 [RFC1345,KXS2]
MIBenum: 2028
Source: IBM NLS RM Vol2 SE09-8002-01, March 1990
Alias: cp037
Alias: ebcdic-cp-us
Alias: ebcdic-cp-ca
Alias: ebcdic-cp-wt
Alias: ebcdic-cp-nl
Alias: csIBM037
In addition to this, many other variants are registered with IANA, including IBM037, IBM038, IBM274, IBM275, IBM277, IBM278, IBM280, IBM281, IBM284, IBM285, IBM290, IBM297, IBM420, IBM423, IBM424, IBM500, IBM870, IBM871, IBM880, IBM905, IBM918, IBM00924, IBM01140, IBM01141, IBM01142, IBM01143, IBM01144, IBM01145, IBM01146, IBM01147, IBM01148, IBM01149, and IBM1047, as well as EBCDIC-AT-DE, EBCDIC-AT-DE-A, EBCDIC-CA-FR, EBCDIC-DK-NO, and EBCDIC-DK-NO-A.
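To see how different the EBCDIC layout is from ASCII, the IBM037 entry above can be tried with Python's `cp037` codec, which covers only the single-byte (SBCS) part. This is a rough sketch; the SO/SI switching for double-byte characters is not handled by this codec and is not shown.

```python
# The same characters have different byte values in ASCII and IBM037.
text = "A a 0 9"
ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")
print(ascii_bytes.hex(" "))
print(ebcdic_bytes.hex(" "))

# Digits are f0-f9, so the low 4 bits hold the decimal digit,
# a trace of the BCD origin of the code.
digits = "0123456789".encode("cp037")
low_nibbles = [b & 0x0F for b in digits]
print(low_nibbles)

# Decoding goes back through the same table.
round_trip = ebcdic_bytes.decode("cp037")
```

Unlike ASCII, the letters and digits here all have the top bit set, so EBCDIC text cannot be mistaken for 7-bit ASCII.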
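Unicode is a character encoding scheme, created by the Unicode Consortium, that represents characters in 2 bytes. It aims to cover the characters of the whole world in a single scheme. It is also establishing itself as the default standard, serving as the native code of Windows and Java. It shares its concept with the international standard ISO/IEC 10646, the two specifications have been kept compatible since 1992, and they continue to be developed in step with each other.
The latest version of Unicode, Unicode V4.0.1, corresponds to ISO/IEC 10646-1:2000 (Universal Multiple-Octet Coded Character Set (UCS)--Part 1: Architecture and Basic Multilingual Plane).
The character encoding defined by ISO/IEC 10646 is UCS (Universal Multiple-Octet Coded Character Set), which has a 32-bit form, UCS-4 (Universal Character Set coded in 4 octets), and a 16-bit form, UCS-2 (Universal Character Set coded in 2 octets).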
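UCS-4 has 2^31 code points, and its code space is divided into groups, planes, rows, and cells. The group is 7 bits (128 values), and the plane, row, and cell are 8 bits each (256 values).
UCS-2 corresponds to plane 00 of group 00 in UCS-4, which is called the BMP (Basic Multilingual Plane). Up to Unicode V3.0 and the corresponding ISO/IEC 10646-1, nothing was defined outside the BMP. Unicode V3.1 began assigning code positions outside the BMP, as did ISO/IEC 10646, so UCS-2, which covers only the BMP, can no longer represent every character.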
They are registered with IANA as follows:
Name: ISO-10646-UCS-4
MIBenum: 1001
Source: the full code space. (same comment about byte order, these are 31-bit numbers.
Alias: csUCS4

Name: ISO-10646-UCS-2
MIBenum: 1000
Source: the 2-octet Basic Multilingual Plane, aka Unicode this needs to specify network byte order: the standard does not specify (it is a 16-bit integer space)
Alias: csUnicode
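Used as is, UCS does not fit well with existing systems, so formats that transform UCS code values into other byte sequences have been defined. These are the UTFs (UCS Transformation Format).
Representative UTFs are the 8-bit form, UTF-8, and the 16-bit form, UTF-16.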
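UTF-8 can represent every character of ISO/IEC 10646. The characters that it represents in 1 octet are 0x00 to 0x7f, which makes it fully compatible with ASCII. Beyond that, European languages mostly use 2 octets, and Japanese and similar scripts use 3 octets. Because Unicode limits code points to 0x10ffff, UTF-8 needs at most 4 octets for Unicode, but covering all of UCS-4 takes up to 6 octets.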
Char. number range (hexadecimal) | UTF-8 octet sequence (binary) | Note |
---|---|---|
0x00000000 - 0x0000007F | 0xxxxxxx | 7-bit ASCII |
0x00000080 - 0x000007FF | 110xxxxx 10xxxxxx | |
0x00000800 - 0x0000FFFF | 1110xxxx 10xxxxxx 10xxxxxx | BMP |
0x00010000 - 0x001FFFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | Unicode |
0x00200000 - 0x03FFFFFF | 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | |
0x04000000 - 0x7FFFFFFF | 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx | |
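The bit patterns in the table translate directly into shifts and masks. The following is a minimal sketch of an encoder for the Unicode range only (up to 4 octets); `utf8_encode` is a hypothetical name, the 5-octet and 6-octet forms are left out, and surrogate code points are not rejected as a real encoder would.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point (0 to 0x10FFFF) following the table above."""
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("outside the Unicode range")
    if cp <= 0x7F:
        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


# Compare with Python's own encoder for 1, 2, 3, and 4 octet cases.
for cp in (0x41, 0xE9, 0x3042, 0x1F600):
    print(hex(cp), utf8_encode(cp).hex(" "), chr(cp).encode("utf-8").hex(" "))
```

The leading bits of the first octet give the sequence length, and every following octet starts with 10, so a decoder can resynchronize at any octet.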
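Unlike ASCII code, UTF-8 can have the MSB (most significant bit) set to 1, so problems can arise with SMTP (Simple Mail Transfer Protocol), which passes only 7 bits. To deal with this, there is also UTF-7, a 7-bit format.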
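UTF-16 can represent the BMP and the 16 planes that follow it in group 00 of ISO/IEC 10646. Code points in plane zero, the BMP, use the same values as UCS-2, and characters outside the BMP are represented with surrogate pairs.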
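The surrogate pair calculation can be sketched as follows. The constants 0xD800 and 0xDC00 and the 10-bit split come from the UTF-16 definition (RFC 2781); the function name `to_surrogates` is only illustrative.

```python
def to_surrogates(cp: int) -> tuple:
    """Split a code point outside the BMP into a (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points outside the BMP use surrogates")
    v = cp - 0x10000              # 20-bit value
    high = 0xD800 + (v >> 10)     # upper 10 bits
    low = 0xDC00 + (v & 0x3FF)    # lower 10 bits
    return high, low


high, low = to_surrogates(0x1F600)
print(hex(high), hex(low))

# Python's big-endian UTF-16 encoder produces the same two 16-bit units.
encoded = chr(0x1F600).encode("utf-16-be")
print(encoded.hex(" "))
```

Two 10-bit halves give 2^20 values, which is exactly the 16 planes of 65,536 code points each beyond the BMP.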
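UTF-16 comes in two byte orders, UTF-16BE (Big Endian) and UTF-16LE (Little Endian). For example, UCS-2 U+0041 is represented as 00 41 in UTF-16BE and as 41 00 in UTF-16LE.
To tell UTF-16 BE and LE apart, a 2-byte code called the BOM (Byte Order Mark, U+FEFF) can be placed at the very beginning of the data. If the data starts with FE FF it is BE, and if it starts with FF FE it is LE. Unicode as the native code of Windows is UTF-16LE.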
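The byte orders and the BOM check can be tried out with Python's standard codecs. This is a minimal sketch; `detect_utf16_byte_order` is a hypothetical helper, and real data without a BOM needs some other way to decide the byte order.

```python
import codecs


def detect_utf16_byte_order(data: bytes) -> str:
    """Report the byte order indicated by a leading BOM, if any."""
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF
        return "BE"
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE
        return "LE"
    return "unknown"


# U+0041 in the two byte orders.
be = "A".encode("utf-16-be")
le = "A".encode("utf-16-le")
print(be.hex(" "), le.hex(" "))

# With a BOM in front, the generic "utf-16" decoder picks the right order.
with_bom_be = codecs.BOM_UTF16_BE + be
with_bom_le = codecs.BOM_UTF16_LE + le
print(detect_utf16_byte_order(with_bom_be), with_bom_be.decode("utf-16"))
print(detect_utf16_byte_order(with_bom_le), with_bom_le.decode("utf-16"))
```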
They are registered with IANA as follows:
Name: UTF-8 [RFC3629]
MIBenum: 106
Source: RFC 3629
Alias: None

Name: UTF-16BE [RFC2781]
MIBenum: 1013
Source: RFC 2781
Alias: None

Name: UTF-16LE [RFC2781]
MIBenum: 1014
Source: RFC 2781
Alias: None

Name: UTF-16 [RFC2781]
MIBenum: 1015
Source: RFC 2781
Alias: None