📅 2010-Jan-13 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ multi-byte, unicode, wide character, windows ⬩ 📚 Archive
You run into Multi-Byte a lot when developing on Windows. For example, Visual Studio 2008 supports two character sets: Multi-Byte and Unicode. Notice that it does not list English or ASCII or some old comfortable 8-bit character set. Just so that we are not confused, Multi-Byte is not the same as the Wide Character types and functions. Those use wchar_t and std::wstring, and their functions and streams have a w in their name, wprintf() or std::wcout for example.
On Windows, Multi-Byte is the old character set. It is not Unicode, which is the new (and recommended) character set. Multi-Byte code looks like old C code written to deal with English characters and strings. It uses the old C char types (char and char *), literal strings ("Hello World") and std::string. It only differs in behavior: if Windows notices that it is running a Multi-Byte application on a non-English locale, the chars are interpreted and displayed according to that locale. For example, a run of 2 (or more) chars in a string could be combined to display just one glyph in the foreign language. Hence the name Multi-Byte for this character set, its code, libraries and applications.