ASCII and Unicode
Understanding ASCII
-
The American Standard Code for Information Interchange (ASCII) is a character encoding standard used to represent text in computers and other devices that handle text.
-
ASCII uses 7 bits to represent each character, so it can represent 128 (2^7) different characters, including digits, English letters, punctuation and other special characters, and control codes.
-
Having been devised in the USA, ASCII is biased towards the English language and does not support the other alphabets, special symbols, and accented letters used in many other languages.
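The 7-bit range described above can be checked directly. The following is a minimal Python sketch, not part of the original notes; the helper name is_ascii is made up for illustration. It prints the code value and 7-bit pattern of a few characters and tests whether a string stays within the 128 ASCII values.

```python
# Print a few characters with their ASCII code value and 7-bit pattern.
for ch in ("A", "a", "0", " "):
    code = ord(ch)
    print(repr(ch), code, format(code, "07b"))


def is_ascii(text: str) -> bool:
    """Return True if every character fits in the 7-bit ASCII range (0-127).

    Hypothetical helper, named here only for illustration.
    """
    return all(ord(ch) < 128 for ch in text)


print(is_ascii("Hello"))      # plain English text stays within ASCII
print(is_ascii("caf\u00e9"))  # the accented e lies outside ASCII
```

The second call returns False because the accented letter has no ASCII code, which is the limitation the last point above describes.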
Advanced Understanding of ASCII
-
ASCII itself defines only 7 bits. On systems that stored each character in an 8-bit byte, the eighth bit was often used as a parity bit for error detection; later it was used to encode additional characters, giving rise to Extended ASCII.
-
Extended ASCII can represent 256 (2^8) different characters, doubling the original set.
-
The extra 128 values are not fixed by a single standard. Depending on the extended set in use, they represent accented letters, additional graphical symbols, characters from other alphabets, and sometimes further control characters.
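Both uses of the eighth bit can be illustrated in a few lines of Python. This is a sketch under stated assumptions: the function name with_even_parity is hypothetical, even parity is only one possible convention, and the two extended sets used (Python's "latin-1" and "cp437" codecs) are chosen as examples rather than taken from the notes.

```python
import unicodedata


def with_even_parity(code: int) -> int:
    """Set the eighth bit so that the byte holds an even number of 1 bits.

    Hypothetical helper; assumes an even-parity convention.
    """
    if not 0 <= code < 128:
        raise ValueError("expected a 7-bit ASCII code")
    ones = bin(code).count("1")
    return code | 0x80 if ones % 2 else code


# 'A' is 1000001 (two 1 bits), so the parity bit stays 0.
print(format(with_even_parity(ord("A")), "08b"))
# 'C' is 1000011 (three 1 bits), so the parity bit is set to 1.
print(format(with_even_parity(ord("C")), "08b"))

# When the eighth bit carries data instead, the same byte value
# means different things in different extended sets.
byte = bytes([0xE9])
print(unicodedata.name(byte.decode("latin-1")))
print(unicodedata.name(byte.decode("cp437")))
```

The last two lines print different character names for the same byte, which is why text in an extended set is only readable when the reader knows which set was used.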
Understanding Unicode
-
Unicode is a character encoding standard that aims to replace all existing character encoding schemes by providing a unique number, called a code point, for every character, irrespective of platform, program, or language.
-
Unlike ASCII, Unicode has room for over a million code points (1,114,112) and can accommodate characters and symbols from all languages around the world.
-
Unicode is backward compatible with ASCII: the first 128 Unicode code points are identical to the ASCII characters.
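The points above can be explored with Python's built-in ord and chr and the standard unicodedata module. This is a small illustrative sketch; the sample characters are arbitrary choices and not part of the original notes.

```python
import sys
import unicodedata

# A few sample characters, written as escapes, with their code points and names.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# The first 128 code points encode to the same byte values as ASCII.
ascii_compatible = all(chr(i).encode("ascii") == bytes([i]) for i in range(128))
print(ascii_compatible)

# The highest code point is U+10FFFF, so the code space holds
# sys.maxunicode + 1 values.
print(hex(sys.maxunicode), sys.maxunicode + 1)
```

The final line shows where the "over a million" figure comes from: code points run from 0 up to 0x10FFFF.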
Unicode Encoding Schemes
-
UTF-8, UTF-16, and UTF-32 are the three Unicode encoding schemes; each defines how a character's numerical value (its code point) is represented as a sequence of bytes.
-
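UTF-8 is a variable-length encoding that uses one to four 8-bit bytes per character. It is backward compatible with ASCII: each ASCII character takes a single byte with the same value, which makes UTF-8 the most byte-efficient of the three for mostly-ASCII text.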
-
UTF-16 is a variable-length encoding that uses either 16 or 32 bits (one or two 16-bit units) per character. It can be more compact than UTF-8 for text in scripts whose characters need three bytes in UTF-8, such as many East Asian scripts.
-
UTF-32 is a fixed-length encoding that uses 32 bits for every character. This makes alignment and indexing simple, but it is the least memory efficient of the three.
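The size trade-offs between the three schemes can be measured with Python's str.encode. In this sketch the helper name encoded_sizes is hypothetical, the sample characters are arbitrary, and the little-endian codec variants are used so that no byte order mark is added to the counts.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes ch occupies in each encoding.

    Hypothetical helper; the "-le" codecs avoid counting a byte order mark.
    """
    codecs = ("utf-8", "utf-16-le", "utf-32-le")
    return {name: len(ch.encode(name)) for name in codecs}


# An ASCII letter, an accented letter, a CJK character, and an emoji.
for ch in ("A", "\u00e9", "\u4e2d", "\U0001F600"):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The ASCII letter needs 1 byte in UTF-8 but 2 and 4 in the other two, the CJK character needs 3 bytes in UTF-8 but only 2 in UTF-16, and the emoji needs 4 bytes in all three.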
Importance of ASCII and Unicode
-
ASCII and Unicode are the foundations of text processing in computer systems, including input, display and storage.
-
The invention of ASCII and its standardisation led to efficient and consistent data exchange and communication across different systems.
-
Unicode resolves the internationalization issue, allowing the representation and interchange of a vast array of world languages and symbols. This has greatly influenced global digital communication and the internet.