Storage of Characters
Purpose of Character Storage
- The storage of characters is a critical concept in computer science.
- Characters must be converted into binary codes so that computers can store and process them; this conversion is fundamental to how data is handled.
ASCII (American Standard Code for Information Interchange)
- ASCII is a commonly used method for character storage.
- It uses 7-bit binary codes to represent characters.
- ASCII can represent 128 characters in total, including letters (both uppercase and lowercase), numbers, and symbols.
- For example, the ASCII value for the uppercase letter “A” is 65, which is 1000001 in binary (see the sketch after this list).
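- As a rough illustration (using Python, which is not specified in these notes), the built-in ord() and chr() functions show how characters map to and from their ASCII codes:

```python
# Illustrative sketch: ord() gives a character's numeric code, chr() converts back.
for ch in ["A", "a", "0", "?"]:
    code = ord(ch)                        # numeric ASCII code of the character
    print(ch, code, format(code, "07b"))  # character, decimal code, 7-bit binary

print(chr(65))  # 65 converted back to "A"
```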
Extended ASCII
- Extended ASCII is an extension of the ASCII system that uses an 8-bit binary code.
- This allows 256 characters to be represented, as opposed to the original 128 (illustrated after this list).
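- “Extended ASCII” is not a single standard; Latin-1 (ISO 8859-1) is one common 8-bit extension and is used below purely as an illustration (Python assumed):

```python
# Latin-1 (ISO 8859-1) is one common 8-bit "extended ASCII" code page.
text = "café"
encoded = text.encode("latin-1")  # each character fits in a single byte
print(list(encoded))              # [99, 97, 102, 233]; "é" is 233, above the 7-bit ASCII range
print(format(233, "08b"))         # 11101001, an 8-bit pattern
```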
Unicode
- As the range of ASCII was found to be insufficient for representing all characters and symbols used around the globe, the Unicode standard was developed.
- Unicode is capable of representing over a million unique characters, making it more appropriate for modern computing where globalisation is prominent (see the code-point sketch after this list).
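- A minimal sketch (again assuming Python as the example language) showing that Unicode code points extend far beyond ASCII's 0 to 127 range:

```python
# ord() returns a character's Unicode code point.
for ch in ["A", "é", "€", "你"]:
    print(ch, ord(ch), hex(ord(ch)))  # e.g. "€" is 8364 (U+20AC), well outside 0-127
```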
UTF-8
- UTF-8 is an encoding scheme used to store and transmit Unicode text.
- It is built on 8-bit units, matching the byte, the base unit of data in computers; each character takes between one and four bytes.
- It is backward compatible with ASCII, as the first 128 characters are encoded with exactly the same single-byte values (see the sketch after this list).
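- A short sketch (Python assumed) showing UTF-8's variable-length encoding and its ASCII compatibility:

```python
# UTF-8 encodes each code point as 1-4 bytes; ASCII characters keep their
# original single-byte values, which is what gives backward compatibility.
for ch in ["A", "é", "€", "你"]:
    data = ch.encode("utf-8")
    print(ch, len(data), [format(b, "08b") for b in data])
# "A" -> 1 byte (01000001, identical to its ASCII code)
# "é" -> 2 bytes; "€" and "你" -> 3 bytes each
```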
Importance of Character Storage
- Understanding how characters are stored is essential to understanding how data is processed by computers.
- It can also highlight potential limitations in systems depending on the chosen character storage technique; the sketch after this list shows one such limitation.
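- For instance (sketched in Python), text containing a character outside an encoding's range simply cannot be stored in that encoding:

```python
# ASCII cannot represent "€", so encoding fails; UTF-8 handles it fine.
try:
    "price: €5".encode("ascii")
except UnicodeEncodeError as err:
    print("ASCII cannot store this text:", err)

print("price: €5".encode("utf-8"))  # works: the euro sign becomes 3 bytes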
In Summary
- Characters are stored using various encoding standards, each with its own range and purpose: ASCII, Extended ASCII, Unicode and UTF-8.
- ASCII and Extended ASCII have a limited number of characters, which can be a limitation for global applications.
- Unicode has a much wider range of possible characters, with UTF-8 being a common encoding due to its backward compatibility with ASCII and its byte-oriented design.