Storage of Characters

Purpose of Character Storage

Computers store and process data as binary numbers, so every character must be mapped to a numeric code before it can be stored.
The scheme that defines this mapping is called a character encoding, and it determines which characters a system can represent.

ASCII (American Standard Code for Information Interchange)

ASCII is one of the earliest and most widely used character encodings.
It uses a 7-bit binary code for each character.
Seven bits give 128 possible codes (0 to 127), which cover uppercase and lowercase letters, digits, punctuation symbols and a set of non-printing control characters.
For example, the ASCII value for the uppercase letter “A” is 65, which is 1000001 in binary.
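
The mapping between a character and its 7-bit code can be checked with a short sketch. The helper names ascii_code and to_7bit are made up for this illustration; ord and format are Python built-ins.

```python
def ascii_code(ch: str) -> int:
    """Return the ASCII code of a single character, rejecting non-ASCII input."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return code


def to_7bit(code: int) -> str:
    """Format a code as a 7-digit binary string, padded with leading zeros."""
    return format(code, "07b")


for ch in "Aa0":
    code = ascii_code(ch)
    print(ch, code, to_7bit(code))
```

Running this lists the decimal and binary code for an uppercase letter, a lowercase letter and a digit, which shows that each fits in seven bits.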

Extended ASCII

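Extended ASCII is an informal name for encodings that extend ASCII to an 8-bit binary code.
The extra bit allows 256 characters to be represented instead of the original 128: codes 0 to 127 match ASCII, while codes 128 to 255 hold additional characters such as accented letters.
There is no single Extended ASCII standard, so different code pages (for example ISO 8859-1 and code page 437) assign codes 128 to 255 to different characters.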
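
As a rough illustration of how 8-bit extensions can disagree with each other, the sketch below encodes the same character with two 8-bit codecs that ship with Python, latin-1 (ISO 8859-1) and cp437. The choice of these two code pages is an assumption made for the example; the notes above do not name a specific one.

```python
ch = "é"

# Each codec stores the character in exactly one byte, but not the same byte.
latin1_byte = ch.encode("latin-1")
cp437_byte = ch.encode("cp437")

print("latin-1:", latin1_byte.hex())
print("cp437:  ", cp437_byte.hex())

# The same stored byte is read back as a different character
# depending on which code page is assumed.
stored = latin1_byte
print(stored.decode("latin-1"), "vs", stored.decode("cp437"))
```

Both encodings agree on codes 0 to 127, so the difference only appears for characters outside the original ASCII range.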

Unicode

Because ASCII and its 8-bit extensions cannot represent all the characters and symbols used around the world, the Unicode standard was developed.
Unicode assigns every character a unique number called a code point, and its code space has room for over a million code points (1,114,112, from U+0000 to U+10FFFF), making it far better suited to modern software that must handle many languages.
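
A brief sketch of how Unicode numbers characters is given below. The helper name code_point is hypothetical; ord, chr and sys.maxunicode are standard Python. The sample characters are arbitrary choices for illustration.

```python
import sys


def code_point(ch: str) -> str:
    """Return the code point of a character in the usual U+XXXX notation."""
    return f"U+{ord(ch):04X}"


for ch in ["A", "é", "€", "😀"]:
    print(ch, code_point(ch))

# The Unicode code space runs from U+0000 to U+10FFFF inclusive.
total_code_points = 0x10FFFF + 1
print(total_code_points)
print(sys.maxunicode == 0x10FFFF)
```

Note that the code point for "A" is the same number as its ASCII value, since the first 128 Unicode code points match ASCII.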

UTF-8

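UTF-8 is one of the encodings defined by Unicode for storing and transmitting text.
It is built on 8-bit units, which aligns with the base unit of data in computers, the byte, and it is variable-width: each character takes between one and four bytes.
It is backward compatible with ASCII, as the first 128 characters are stored as the same single byte values in both.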
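
The relationship between UTF-8 and the byte can be seen by encoding a few characters and counting the bytes produced. This is a minimal sketch; the helper name utf8_bytes and the sample characters are chosen only for illustration.

```python
def utf8_bytes(ch: str) -> bytes:
    """Encode a single character as UTF-8."""
    return ch.encode("utf-8")


for ch in ["A", "é", "€", "😀"]:
    encoded = utf8_bytes(ch)
    hex_view = " ".join(f"{b:02x}" for b in encoded)
    print(ch, len(encoded), "byte(s):", hex_view)

# ASCII text produces identical bytes under both encodings.
print("Hello".encode("utf-8") == "Hello".encode("ascii"))
```

The byte count grows with the code point, so plain English text stays as compact as ASCII while other scripts and symbols remain representable.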

Importance of Character Storage

Understanding how characters are stored is essential to understanding how computers process text.
It also explains the limitations of a system: characters outside the range of the chosen encoding cannot be stored, and text read back with the wrong encoding is displayed incorrectly.
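
Two such limitations can be reproduced in a few lines. The sketch below is illustrative only; the sample word and the variable names are assumptions, and the codecs used are the standard ones bundled with Python.

```python
text = "café"

# Limitation 1: ASCII has no code for "é", so encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("fits in ASCII:", ascii_ok)

# Limitation 2: bytes written as UTF-8 but read as Latin-1
# come back as the wrong characters.
garbled = text.encode("utf-8").decode("latin-1")
print(garbled)

# Reading the bytes with the encoding they were written in restores the text.
restored = text.encode("utf-8").decode("utf-8")
print(restored == text)
```

In practice this is why the encoding of a file or message needs to be known, or stated, alongside the data itself.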

In Summary

Characters are stored using various encoding standards, each with its own range and purpose: ASCII, Extended ASCII, and Unicode with encodings such as UTF-8.
ASCII and Extended ASCII can represent only 128 and 256 characters respectively, which is a limitation for global applications.
Unicode covers a much wider range of characters, and UTF-8 is a common way to store it because it is backward compatible with ASCII and is built on the byte.