4. Understanding Unicode
i) In the last section we explained the three codes that can be used in HTML to display an upside-down question mark: &iquest; or &#191; or &#xbf;. As we explained, the latter two codes are numeric character references. The two numbers (#191 and #xbf) are actually the same number: 191 is written in decimal, and bf (the x marks it as hexadecimal) is 191 written in hexadecimal.
ii) Let's jump to another topic for a minute - Unicode! We'll come back to these HTML codes in the next section after you understand what Unicode is.
iii) Unicode is a character set. That means that Unicode defines a set of possible characters. These characters include the Latin alphabet (a-z and A-Z), digits (0-9), and punctuation. But (and here's the important point about Unicode), Unicode also includes characters for almost every writing system in use today. Unicode includes the accented letters used in French, Chinese characters, the Hebrew alphabet, and much, much more. It even includes emojis.
iv) Each character in Unicode is assigned a number, and this number is referred to as the Unicode code point. The Unicode code point for the uppercase A is 65, the lowercase a is 97, and the lowercase e with a grave accent (è) is 232. The upside-down question mark has a code point of 191.
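
If you'd like to check these numbers yourself, here is a minimal sketch in Python. Python is our choice for illustration only (nothing in this tutorial depends on it), and the variable names are made up for the example. It uses the built-in ord() and chr() functions to go between a character and its code point, and the standard html module to decode the three HTML codes from item i).

```python
import html

# Code points quoted in the text, written as decimal numbers.
expected = {"A": 65, "a": 97, "è": 232, "¿": 191}

for char, code_point in expected.items():
    # ord() returns the code point of a single character;
    # chr() goes the other way, from code point to character.
    assert ord(char) == code_point
    assert chr(code_point) == char
    print(f"{char!r}: decimal {code_point}, hexadecimal {code_point:x}")

# 191 (decimal) and bf (hexadecimal) are the same number.
assert int("191", 10) == int("bf", 16)

# The three HTML codes for the upside-down question mark
# all decode to the same character.
codes = ["&iquest;", "&#191;", "&#xbf;"]
decoded = [html.unescape(code) for code in codes]
assert all(char == "¿" for char in decoded)
```

The script prints nothing surprising if every assertion holds: each line shows a character alongside its code point in decimal and in hexadecimal.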