UTF-8 and Unicode Standards

Character encoding is a nasty beast when you write code, but UTF-8 will save us.

Believe me: you will love Big Brother and UTF-8.


UTF-8 (Unicode Transformation Format, 8-bit) is a variable-width encoding that can represent every character in the Unicode character set, using one to four bytes per code point. It was designed for backward compatibility with ASCII and to avoid the complications of endianness and byte order marks in UTF-16 and UTF-32.
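
To make the definition above concrete, here is a minimal Python sketch. The helper names (utf8_length, describe) and the sample characters are only illustrative assumptions, not part of any standard API; the sketch relies solely on the built-in str.encode method.

```python
# Illustrative sketch: helper names and sample characters are assumptions.

def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs to encode the given text."""
    return len(ch.encode("utf-8"))


def describe(ch: str) -> str:
    """Format one character as code point, UTF-8 bytes (hex), and byte count."""
    encoded = ch.encode("utf-8")
    return f"U+{ord(ch):04X} -> {encoded.hex()} ({len(encoded)} bytes)"


# Variable width: characters further from ASCII need more bytes.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:  # A, e-acute, euro sign, emoji
    print(describe(ch))

# ASCII compatibility: pure ASCII text has identical bytes in both encodings.
assert "hello".encode("utf-8") == "hello".encode("ascii")

# Endianness: UTF-16 has two byte orders for the same text, UTF-8 has only one.
assert "A".encode("utf-16-le") == b"A\x00"
assert "A".encode("utf-16-be") == b"\x00A"
assert "A".encode("utf-8") == b"A"
```

The last three assertions show why byte order matters for UTF-16 but not for UTF-8: a UTF-8 byte sequence is the same on every machine, so no byte order mark is needed to decode it.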

Source: UTF-8 and Unicode Standards
Also read this:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)