hu9920 posted on 2010-5-6 14:34:38

[Help] The difference between UTF-8 and UTF-16

If anyone knows, please explain. Thanks~

wgx083 posted on 2010-5-11 15:45:32

UTF-8 is the byte-oriented encoding form of Unicode. For details of its definition, see Section 2.5 "Encoding Forms" and Section 3.9 "Unicode Encoding Forms" in the Unicode Standard. See, in particular, Table 3-6 UTF-8 Bit Distribution and Table 3-7 Well-formed UTF-8 Byte Sequences, which give succinct summaries of the encoding form. Make sure you refer to the latest version of the Unicode Standard, as the Unicode Technical Committee has tightened the definition of UTF-8 over time to more strictly enforce unique sequences and to prohibit encoding of certain invalid characters. There is an Internet RFC 3629 about UTF-8. UTF-8 is also defined in Annex D of ISO/IEC 10646.

UTF-16 uses a single 16-bit code unit to encode the most common 63K characters, and a pair of 16-bit code units, called surrogates, to encode the 1M less commonly used characters in Unicode. Originally, Unicode was designed as a pure 16-bit encoding, aimed at representing all modern scripts. (Ancient scripts were to be represented with private-use characters.) Over time, and especially after the addition of over 14,500 composite characters for compatibility with legacy sets, it became clear that 16 bits were not sufficient for the user community. Out of this arose UTF-16.

From "http://unicode.org/faq/utf_bom.html#UTF16"
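
To make the two encoding forms quoted above more concrete, here is a rough Python sketch that encodes a single code point by hand, once as UTF-8 bytes and once as UTF-16 code units. The function names `utf8_encode` and `utf16_code_units` are made up for this illustration (they are not from the Unicode Standard or any library), and the sketch only covers encoding, not validation of existing byte sequences.

```python
def _check_scalar(cp: int) -> None:
    # Surrogate code points (U+D800..U+DFFF) are reserved for UTF-16
    # and are not valid characters on their own.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %r" % cp)


def utf8_encode(cp: int) -> bytes:
    """Hand-rolled UTF-8 encoder for one code point (illustrative helper)."""
    _check_scalar(cp)
    if cp < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def utf16_code_units(cp: int) -> list:
    """Hand-rolled UTF-16 encoder returning 16-bit code units (illustrative helper)."""
    _check_scalar(cp)
    if cp < 0x10000:     # one code unit for the Basic Multilingual Plane
        return [cp]
    v = cp - 0x10000     # 20 bits left, split 10 + 10 across the pair
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x4E2D, 0x1F600):
        units = " ".join("%04X" % u for u in utf16_code_units(cp))
        print("U+%04X  UTF-8: %s  UTF-16: %s" % (cp, utf8_encode(cp).hex(), units))
```

Comparing the results with Python's built-in `str.encode("utf-8")` and `str.encode("utf-16-be")` is a quick way to check that the bit layout is right.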

peterjulun posted on 2010-5-12 18:03:30

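UTF-8: an encoding built on 8-bit units. ASCII is left unchanged, and all other characters use a variable-length encoding, 1 to 4 bytes per character (1 to 3 bytes for characters in the Basic Multilingual Plane). It is usually used as an external (interchange) encoding. It has these advantages:
* It does not depend on CPU byte order, so data can be exchanged between different platforms.
* It tolerates errors well: if any single byte is damaged, at most one encoded code point is lost and the error does not cascade (with GB encodings, one bad byte can garble the whole line).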
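
A small Python experiment can illustrate the error-tolerance point. This is only a sketch: the sample string, the `drop_byte` helper, and the choice of which byte to drop are all arbitrary assumptions made for the demo. It drops one byte from the middle of the same text in UTF-8 and in GBK, then decodes leniently.

```python
def drop_byte(data: bytes, index: int) -> bytes:
    """Simulate transmission damage by removing one byte (illustrative helper)."""
    return data[:index] + data[index + 1:]


text = "编码测试"  # arbitrary four-character sample

# UTF-8: each of these characters takes 3 bytes; drop a byte inside the 2nd one.
utf8_damaged = drop_byte(text.encode("utf-8"), 4).decode("utf-8", errors="replace")

# GBK: each of these characters takes 2 bytes; drop the first byte of the 2nd one.
gbk_damaged = drop_byte(text.encode("gbk"), 2).decode("gbk", errors="replace")

if __name__ == "__main__":
    # In UTF-8 the decoder resynchronizes at the next lead byte,
    # so the characters after the damaged one still come out intact.
    print("UTF-8:", utf8_damaged)
    # In GBK the remaining bytes get paired up off by one,
    # so the characters after the damage decode as something else.
    print("GBK:  ", gbk_damaged)
```

The reason UTF-8 recovers is that lead bytes and continuation bytes use disjoint value ranges, so a decoder can always tell where the next character starts.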

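UTF-16: an encoding built on 16-bit units. It is also variable-length: a character takes one or two 16-bit code units. A two-unit (surrogate) pair carries 20 bits, and the values covered run from 0 to 0x10FFFF, so it is basically a direct implementation of Unicode code points. It depends on CPU byte order, which is why UTF-16LE, UTF-16BE, and the byte order mark exist. It is more compact than UTF-8 for text that is mostly CJK (2 bytes per character instead of 3) but twice the size for ASCII, so it is not always the most space-efficient choice. In practice it is common as an in-memory representation (Windows, Java), while UTF-8 is the more usual external encoding for network transmission.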
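
The byte-order and size points can be checked with Python's built-in codecs. The helper names `utf16_both_orders` and `encoded_sizes` below are invented for this sketch, and the sample strings are arbitrary.

```python
import codecs


def utf16_both_orders(text: str) -> dict:
    """Encode text as UTF-16 in both byte orders, without a BOM (illustrative helper)."""
    return {
        "little-endian": text.encode("utf-16-le"),
        "big-endian": text.encode("utf-16-be"),
    }


def encoded_sizes(text: str) -> dict:
    """Byte counts of the same text in UTF-8 and UTF-16 (illustrative helper)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    # The same code point U+4E2D comes out as different byte sequences.
    for order, data in utf16_both_orders("中").items():
        print(order, data.hex())

    # A byte order mark at the start tells the decoder which order was used.
    marked = codecs.BOM_UTF16_BE + "中".encode("utf-16-be")
    print(marked.decode("utf-16"))

    # Size depends on the text: ASCII favors UTF-8, CJK favors UTF-16.
    print("abc ", encoded_sizes("abc"))
    print("中文", encoded_sizes("中文"))
```

UTF-8 needs none of this because its unit is a single byte, so there is no order to get wrong.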

peterjulun posted on 2010-5-12 18:04:18

There is a write-up on character encodings at http://blog.csdn.net/qinysong/archive/2006/09/05/1179480.aspx

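andy_wangyt posted on 2010-10-15 23:53:47

Learned something. I see these all the time but never thought about why.

jiaruiqiang posted on 2010-11-2 15:12:01

Here to learn.

wyfyan posted on 2010-11-19 20:41:10

Got it.

ydqjlf posted on 2011-1-27 10:47:08

Learned something.

wangjf8711 posted on 2011-2-10 17:12:58

Learning.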

ceshi521 posted on 2012-2-9 17:10:34

Learned something.