"The main feature of PHP V6 is improved support for
Unicode. Currently, PHP is essentially a binary preprocessor. PHP
V5 does not provide native Unicode support; it assumes that all
characters are 1 byte long, which is a problem when handling
non-Latin characters. You can convert text to Unicode, but doing so
requires either the mbstring extension, which is not enabled by
default, or an external tool like iconv.
"PHP V5 does not always render text correctly or readably
according to its associated character encoding. Unicode is an
industry standard developed by the Unicode Consortium to represent
every character no matter what the language, no matter the program,
no matter the platform. Unicode, widely supported by standards and
industry, makes international multilingual applications feasible.
Unicode can be represented by different character encodings, the
most common being UTF-8, UTF-16, and UTF-32.
"PHP V6, which supports Unicode (UTF-8) natively, has Unicode
support in the engine, in the extensions and in the API. PHP V6
handles all strings as Unicode, though support has been added to
handle binary strings as a different type. In PHP V6, string
literals are Unicode, Unicode identifiers are allowed, and
functions understand Unicode text. PHP V6 has built-in support for
converting between Unicode strings and other encodings when
required, and supports reading from and writing to a UTF-8
file."
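
The 1-byte assumption described in the quotation can be seen in a short sketch. It uses only functions from released PHP versions (strlen and the iconv functions), not any PHP V6 API, and it assumes the iconv extension is available in your build. The variable names are illustrative.

```php
<?php
// "café" with the "é" written as its two explicit UTF-8 bytes,
// so the example does not depend on the encoding of this source file.
$text = "caf\xC3\xA9";

// strlen() counts bytes, not characters: the "é" counts twice.
$byteLength = strlen($text);

// iconv_strlen() counts characters once it is told the encoding.
$charLength = iconv_strlen($text, 'UTF-8');

// Converting between encodings needs an explicit call.
// UTF-16BE uses 2 bytes for each of these 4 characters and adds no byte order mark.
$utf16 = iconv('UTF-8', 'UTF-16BE', $text);
$utf16ByteLength = strlen($utf16);

printf("bytes: %d, characters: %d, UTF-16BE bytes: %d\n",
    $byteLength, $charLength, $utf16ByteLength);
```

If the mbstring extension is enabled, mb_strlen($text, 'UTF-8') returns the same character count as iconv_strlen here.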