Unicode and UTF-8, and Java...



Posted by Deogtae Kim (김덕태) on July 02, 1997 at 10:01:40:

Gil wrote:
>
> >: "Unicode has F900-FFDF as CJK Unified Ideographs, but there is already
> >: a hanja area from 4E00 to 9FA5 (20902 in all, if I counted right..).
> >: So I don't see why yet another set called "compatibility characters" was put there."
> >
> >: That is what I asked before, and I still don't understand, so once more here :)
> >
> >Compatibility means exactly what it says.
> >It is the part of Unicode that takes in most of the existing encodings.
> >
> >For example, in the Korean Wansung (precomposed) code the hanja are ordered by their readings.
> > So some hanja appear more than once. This area exists to accommodate them.
>
> I see. Taking 惡 as an example,
>
> 惡 read `ak' (evil)   : E4C2 in Wansung : 60E1 in Unicode
> 惡 read `o' (to hate) : E7F7 in Wansung : F9B9 in Unicode
>
> Perhaps because 惡 read `ak' comes first, the one that comes after it goes into
> the compatibility character area.
>
> From what I have read, there are 268 such duplicated cases, 2 in the case of Big5, and
> I am not sure about the rest.
> GIL
> http://soback.kornet.nm.kr/~chlang



To understand what the question means and where the problem lies,
it is important first to be clear about the basic goals of Unicode.


==== Introduction to Unicode ==========
The goals of Unicode described below differ from what the attached
reference says. They are only the goals that I personally consider really
important, summarized in my own way, so please keep that in mind.

First, a unified character set.
Different countries use different encodings for characters that are the
same or similar in meaning, which causes problems for the compatibility
and extensibility of data and programs. Unicode solves this by unifying
them into a single character set.


Second, the hub of character set conversion.
If there are N existing character set standards (KS C 5601, Japan's
JIS X 0208, and so on), converting between every pair of them requires
N * (N-1) conversion tables, which is a considerable implementation
burden. If conversion between existing standards instead goes to Unicode
first and then to the target character set, only 2 * N conversion tables
are needed, so programming character set conversion becomes much
simpler. For Unicode to serve as this hub, a character converted from an
existing character set to Unicode and then back to the original
character set must be restored exactly, with the original data
unchanged. This is called round-trip conversion compatibility, or the
source separation rule.
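
The table counts above can be checked with a small Java sketch. This is only an illustration: the class and method names (HubConversion, pairwiseTables, hubTables, transcode) are my own, and the transcoding call uses ISO-8859-1 and UTF-8 simply because every Java runtime ships them. Where the runtime provides them, charsets such as "EUC-KR" or "EUC-JP" could be passed in the same way.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class HubConversion {
    // Direct conversion: one table per ordered pair of distinct character sets.
    public static long pairwiseTables(int n) {
        return (long) n * (n - 1);
    }

    // Conversion via Unicode: one table to Unicode and one back, per character set.
    public static long hubTables(int n) {
        return 2L * n;
    }

    // Transcode through the hub: decode to a String (Unicode), then re-encode.
    public static byte[] transcode(byte[] input, Charset from, Charset to) {
        String unicode = new String(input, from);
        return unicode.getBytes(to);
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 10; n += 4) {
            System.out.println(n + " character sets: " + pairwiseTables(n)
                + " pairwise tables, " + hubTables(n) + " via Unicode");
        }
        byte[] latin1 = { (byte) 0xE9 }; // e-acute in ISO-8859-1
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("1 byte in, " + utf8.length + " bytes out");
    }
}
```

With 10 character sets this gives 90 pairwise tables against 20 through Unicode. For 2 sets the hub needs more tables (4 against 2), so the saving only appears once N exceeds 3.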



======= Compatibility with existing standard character sets ==========
To quote from pages 2-8 and 2-9 of the reference,
``The Unicode Standard avoids duplicate encoding of characters by
unifying them within scripts across languages; characters that are
equivalent in form are given a single code.
... (omitted) ...
The Unicode Standard avoids duplication of characters due to specific
usage in different languages, duplicating characters to support
compatibility with base standards.
... (omitted) ...
In determining whether or not to unify variant ideograph forms across
standards, the Unicode Standard follows the principles described in
Section 6.4, CJK Ideographs Area. Where these principles determine
that two forms constitute a trivial (wazukana) difference, the Unicode
Standard assigns a single code. Otherwise, separate codes are assigned.
... (omitted) ...
Identifying a character A as a compatibility variant of another
character B implies that generally A can be remapped to B without loss
of information other than formatting. Such remapping cannot always
take place because many of the compatibility characters are in place
just to allow systems to maintain one-to-one mappings to existing code
sets. In such cases, a remapping would lose information that is felt
to be important in the original set. Compatibility remappings are
called out in Section 7.1, Character Names List. Because replacing a
character by its compatibly equivalent character or character sequence
may change the information in the text, implementation has to proceed
with due caution. A good use of these mappings may not be in
transcoding, but in providing the correct equivalence for searching
and sorting.''


Relating the question to the passage quoted above, I read it as follows.


In KS C 5601 the hanja are arranged in order of their Korean readings.
The hanja 車 ("cart"), usually read `cha', can also be read `geo'. So
although it is really one and the same hanja, KS C 5601 artificially
splits it into `車 (geo)' and `車 (cha)', encoded as 0xCBE7 and 0xF3B3
respectively.


The reason for this artificial split seems to be that it removes
ambiguity when sorting hanja and when converting a hanja to the hangul
for its reading.


However, `車 (cha)' and `車 (geo)' cannot both go into the CJK unified
ideographs area. The distinction is meaningful only in Korea, so only
one representative hanja, which does not distinguish the `cha' sense
from the `geo' sense, should go in.


Looking at the mapping table between KS C 5601 and Unicode, `車 (cha)' of
KS C 5601 maps to 0x8ECA in the CJK unified ideographs area of Unicode,
and `車 (geo)' maps to 0xF902 in the area of compatibility ideographs
for the KS C 5601 standard. That is, `車 (cha)' is taken to represent
the hanja (probably chosen because `cha' is the more common reading),
while `車 (geo)' exists so that converting from Unicode back to
KS C 5601 restores the original code, as round-trip conversion
compatibility requires.
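
The two mappings and their round trip can be checked directly. The following is a minimal sketch, assuming the Java runtime provides a charset named "EUC-KR" (the original JDK 1.1 program in the attachment uses the name "KSC5601"); the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class RoundTripCheck {
    // Decode one two-byte KS C 5601 (EUC-KR) code and return its Unicode value.
    public static int toUnicode(int high, int low, Charset ksc) {
        String s = new String(new byte[] { (byte) high, (byte) low }, ksc);
        return s.charAt(0);
    }

    // True if bytes -> Unicode -> bytes gives back the original bytes.
    public static boolean roundTrips(int high, int low, Charset ksc) {
        byte[] original = { (byte) high, (byte) low };
        byte[] back = new String(original, ksc).getBytes(ksc);
        return Arrays.equals(original, back);
    }

    public static void main(String[] args) {
        // Assumption: this charset is installed in the runtime.
        Charset ksc = Charset.forName("EUC-KR");
        int[][] codes = { { 0xCB, 0xE7 }, { 0xF3, 0xB3 } }; // geo, cha
        for (int[] c : codes) {
            System.out.println(Integer.toHexString(c[0]) + Integer.toHexString(c[1])
                + " ==> " + Integer.toHexString(toUnicode(c[0], c[1], ksc))
                + ", round trip: " + roundTrips(c[0], c[1], ksc));
        }
    }
}
```

If the compatibility code point did not exist, both KS C 5601 codes would decode to 0x8ECA and the encoder could only pick one of them on the way back, so one of the two round trips would fail.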


However, it is sometimes necessary to map both to a single Unicode
character (0x8ECA) without distinguishing `車 (geo)' from `車 (cha)'
(compatibility remapping). This is the case when searching for the
hanja regardless of the distinction, when sorting in some order (say,
by radical), or when a more suitable conversion to another character
set standard is needed.
For example, in the conversion table between JIS X 0208 and Unicode, the
Unicode value for `車 (cha)' is mapped to the corresponding hanja in
JIS X 0208, but the Unicode value for `車 (geo)' is not mapped to it.
(The reason seems to be that if two Unicode values were mapped to one
JIS X 0208 code, the 1:1 correspondence would break and round-trip
conversion compatibility, in a slightly different sense, could not be
maintained.)
In other words, in hanja conversion between KS C 5601 and JIS X 0208,
`車 (geo)' is converted only if compatibility remapping is applied.
As for the remapped value, the table on page 7-470 of the reference
gives 0xF902 as the Unicode value for `車 (geo)' and lists with it the
compatibility-remapped Unicode value 0x8ECA (the Unicode value
corresponding to `車 (cha)' of KS C 5601).
(It will differ from system to system, but a function that performs this
remapping may well be provided.)
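
As one illustration of such a remapping function, later Java versions (not the JDK 1.1 used in the attachment) provide java.text.Normalizer. This is a sketch under the assumption that Unicode normalization is an acceptable stand-in for the remapping described here: the CJK compatibility ideographs have canonical decompositions to their unified counterparts, so NFC folds 0xF902 to 0x8ECA. NFC also normalizes other sequences, so it is broader than a hanja-only remapping. The class and method names are hypothetical.

```java
import java.text.Normalizer;

class CompatFold {
    // Fold compatibility ideographs onto their unified counterparts (via NFC).
    public static String fold(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    // Substring search that ignores the compatibility distinction.
    public static boolean containsFolded(String haystack, String needle) {
        return fold(haystack).contains(fold(needle));
    }

    public static void main(String[] args) {
        String geo = "\uF902"; // compatibility ideograph (KS C 5601 0xCBE7)
        String cha = "\u8ECA"; // unified ideograph (KS C 5601 0xF3B3)
        System.out.println(geo.equals(cha));                   // false
        System.out.println(fold(geo).equals(cha));             // true
        System.out.println(containsFolded("abc" + geo, cha));  // true
    }
}
```

Folding only inside the search routine, rather than when the text is stored, keeps the original distinction available for conversion back to KS C 5601.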


However, compatibility remapping loses the original meaning of the
distinction between `車 (geo)' and `車 (cha)', so it cannot be applied
unconditionally. It is appropriate only when a particular application
is in a situation that calls for it, and ordinary conversion functions
will presumably not perform the remapping.


There is one point to be careful about here.
Given only the Unicode value 0x8ECA,
it can be read as meaning `車 (cha)' as distinct from `車 (geo)',
or it can be read as representing both `車 (geo)' and `車 (cha)' without
distinguishing them. As I interpret it, the value carries these two
somewhat contradictory semantics at the same time.
This is exactly the part I find hard to understand.
I cannot tell whether Unicode really has such ambiguous semantics,
or whether I have misunderstood the reference, or whether a more
rigorous document exists somewhere.


I also do not know whether KS C 5700, established by the government, is
completely identical to the Unicode specification or adds more specific
provisions, and hence whether there is any guideline recommending that
the KS C 5601 compatibility ideographs area of Unicode be avoided where
possible and not be mapped to.
If anyone knows, please post.


In my personal opinion, though, the number of hanja covered by Unicode
(more than 20,000) is far larger than in KS C 5601 (4888), and among
the hanja not included in KS C 5601 there must be many with two or more
Korean readings. Unless there is a plan to assign new compatibility
ideograph code points to all of them (probably not?), then by the time
Unicode comes into wide use it would be better to discourage use of the
compatibility ideographs area (that is, to make compatibility remapping
the default conversion), and accordingly to fix the semantics of the
Unicode value 0x8ECA as representing both `車 (cha)' and `車 (geo)'.



============== Unicode and the UTF-8 encoding ====================
One more opinion I would like to put forward: all things considered,
Unicode (as a raw encoding) is suitable only for use inside a program,
or for files and exchanged data that only particular programs
understand. It is unsuitable as the encoding of ordinary text
documents or of text data passed between a variety of programs. It
would therefore be hard for it to replace KS C 5601 (more precisely,
EUC-KR), the encoding standard now in general use in Korea. The
replacement for KS C 5601 should instead be the UTF-8 encoding, which
not only corresponds 1:1 with Unicode but also has a number of good
properties. So I think it is desirable for Korea to move to UTF-8 as
soon as possible.
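
To make the comparison concrete, here is a small sketch that prints the bytes of the same two-character string in UTF-8 and in a raw 16-bit form (UTF-16BE is used here as a stand-in for "Unicode as is", which is my assumption about what is meant). The class and helper names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class Utf8Demo {
    // Render a byte array as lowercase hex, two digits per byte.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "A\u8ECA"; // ASCII 'A' followed by the hanja 0x8ECA
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16BE);
        System.out.println(hex(utf8));  // 41e8bb8a
        System.out.println(hex(utf16)); // 00418eca
        // Decoding the UTF-8 bytes restores the same Unicode string.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```

In UTF-8 the ASCII letter stays the single byte 0x41 and no zero bytes appear, whereas the 16-bit form puts a 0x00 byte in front of every ASCII character, which byte-oriented text tools tend to handle badly. That is presumably one of the properties that makes UTF-8 the better fit for plain text files.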



============== Reference ========================


[1] ``The Unicode Standard, Version 2.0,'' The Unicode
Consortium, Addison Wesley, 1996
(Review) Gives a detailed, comprehensive explanation of Unicode, and
presents the general properties and printed shapes of all
defined Unicode characters.



p.s. I quoted the above without permission from Unicode, Inc. and
Addison Wesley, in order to minimize room for misunderstanding about
Unicode and to stimulate discussion. I offer my apologies to
Unicode, Inc. and Addison Wesley.



================== Attachment ============================================
The attached program is a Java program that converts to and from
Unicode. (JDK1.1 version)
Since the char and String types are both Unicode, programming in line
with the first and second goals of Unicode is natural and easy.


-------- HanjaTest.java ----------
// Hanja range within KSC5601: 0xCAA1 ~ 0xFDFE (4888 characters = 52 * 94)
public class HanjaTest
{   public static void main(String args[])
        throws java.io.UnsupportedEncodingException
    {
        for( int high =  0xCA; high <= 0xFD; ++high )
            for( int low =  0xA1; low <= 0xFE; ++low )
            {
                // Decode the two KSC5601 bytes into a (Unicode) String.
                String unicode = new String(
                    new byte[] {(byte) high, (byte) low}, "KSC5601");
                // Re-encode the same character as EUCJIS.
                byte[] eucjis = unicode.getBytes("EUCJIS");

                System.out.print( Integer.toHexString(high)
                    + Integer.toHexString(low) + " (" + unicode + ") ==> "
                    + Integer.toHexString((int) unicode.charAt(0)));

                // A character with no EUCJIS mapping comes back as a single '?'.
                if ( eucjis.length == 1 && (eucjis[0] & 0xFF) == '?' )
                    System.out.println( ", none" );
                else
                    System.out.println( ", "
                        + Integer.toHexString(eucjis[0] & 0xFF)
                        + Integer.toHexString(eucjis[1] & 0xFF) ) ;
            }
    }
}
-------------- end of HanjaTest.java ----------------

C:> java HanjaTest | more
(prints the code values in the order KSC5601, Unicode, EUCJIS)

... (earlier output omitted) ...
cbe7 (車) ==> f902, none   // 車 (geo)
... (omitted) ...
f3b3 (車) ==> 8eca, bcd6   // 車 (cha)
... (later output omitted) ...

--
Deogtae Kim (김덕태)
CA Lab. CS Dept. KAIST
E-Mail : dtkim@camars.kaist.ac.kr
Phone : +82-42-869-3569
Fax : +82-42-869-3510


