UTF-8


UTF-8 is "An encoding form of Unicode that supports ASCII for backward compatibility and covers the characters for most languages in the world." (MultiLingual)

UTF stands for "Unicode Transformation Format," and 8 refers to the size in bits of the code units the encoding is built from. "UTF-8 uses one to four bytes (strictly, octets) per character, depending on the Unicode symbol." (Wikipedia) (1 octet = 8 bits). There are also UTF-16 and UTF-32 formats, which use 16-bit and 32-bit code units respectively.
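
The one-to-four-byte range quoted above can be checked with a short Python sketch. The four sample characters and the helper name utf8_length are chosen here purely for illustration and are not taken from the quoted sources.

```python
# Illustrative sample characters (an assumption of this sketch):
# "A" (U+0041), "é" (U+00E9), "€" (U+20AC), and an emoji (U+1F600).
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_length(ch: str) -> int:
    """Return how many bytes (octets) UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s): {hex_bytes}")
```

With these samples, the four characters take one, two, three, and four bytes respectively, which is the variable-length behaviour the quotation describes.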

Wikipedia describes UTF-8 as "a variable-length character encoding for Unicode. It is able to represent any universal character in the Unicode standard, yet the initial encoding of byte codes and character assignments for UTF-8 is consistent with ASCII (requiring little or no change for software that handles ASCII but preserves other values). For these reasons, it is steadily becoming the preferred encoding for e-mail, web pages, and other places where characters are stored or streamed.

"The Internet Mail Consortium (IMC) recommends that all email programs be able to display and create mail using UTF-8." (Wikipedia)

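The ASCII compatibility mentioned in the Wikipedia description can be illustrated with a minimal Python sketch. The sample string and variable names are assumptions made for this example only.

```python
# A pure-ASCII sample string (illustrative only).
text = "Hello, UTF-8"

# Encoding ASCII-only text as ASCII or as UTF-8 yields the same bytes.
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# Every byte of a multi-byte UTF-8 sequence has its high bit set,
# so it cannot be mistaken for an ASCII character (0x00 to 0x7F).
euro_bytes = "\u20ac".encode("utf-8")
high_bit_set = all(b >= 0x80 for b in euro_bytes)

print(same_bytes, high_bit_set)
```

Both values print as True, which is why software that handles plain ASCII can usually read UTF-8 text with little or no change.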

