Characters

Understanding Characters

  • In computing, a character is any symbol used to create text, such as a letter, number, punctuation mark, or other symbol.
  • Computers store all data as binary, so each character is represented by its own unique binary code.

ASCII

  • ASCII (American Standard Code for Information Interchange) is one of the most common character encoding standards used to represent characters.
  • ASCII uses 7 bits, which provides 128 unique combinations (0 to 127).
  • These combinations are enough to represent all unaccented English characters (lowercase and uppercase), numbers, common punctuation marks, and some control characters.
  • For example, the capital ‘A’ is represented as 1000001 in binary, or 65 in decimal, as shown in the short sketch after this list.
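
As a rough illustration (the characters chosen are just examples), the Python sketch below uses the built-in ord() and format() functions to print the decimal and 7-bit binary ASCII codes of a few characters:

    # Print the decimal and 7-bit binary ASCII codes of some example characters.
    for ch in ["A", "a", "0", "!"]:
        code = ord(ch)                        # decimal ASCII code, e.g. 65 for 'A'
        print(ch, code, format(code, "07b"))  # 'A' -> 65 -> 1000001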

Extended ASCII and ISO-8859-1

  • Extended ASCII uses an eighth bit, allowing for 256 unique combinations (0 to 255). This extension supports more characters and symbols.
  • ISO-8859-1, or ‘Latin-1’, is another extension of ASCII that includes more special characters used in Western European languages, such as accented letters (see the short example after this list).
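
As a rough sketch (the word “café” is just an example string), the Python snippet below shows that Latin-1 can encode an accented letter in a single byte, whereas plain 7-bit ASCII cannot:

    text = "café"

    # ISO-8859-1 (Latin-1) gives 'é' the single-byte value 233 (0xE9).
    print(list(text.encode("iso-8859-1")))    # [99, 97, 102, 233]

    # Plain 7-bit ASCII has no code for 'é', so encoding it fails.
    try:
        text.encode("ascii")
    except UnicodeEncodeError as err:
        print("ASCII cannot represent 'é':", err)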

Unicode

  • Unicode has replaced ASCII for most purposes in modern computing. It can represent a much larger range of characters and is used to encode many different writing systems from around the world.
  • It was developed to overcome the limitations of ASCII, which can only represent a small set of characters and is largely limited to English.
  • Unicode commonly uses 16 bits per character, which allows for 65,536 unique combinations, although encodings that use more bits per character also exist.
  • Under Unicode, many of the basic Latin characters retain their ASCII values for compatibility, making it a superset of ASCII.
  • It supports virtually every writing system and also includes many special characters, such as mathematical symbols and emojis; a short example follows this list.
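
As a minimal sketch (the characters are arbitrary examples), the Python snippet below prints each character’s Unicode code point and how many bytes it occupies when stored in the widely used UTF-8 encoding:

    # Unicode code points and UTF-8 byte counts for a few example characters.
    for ch in ["A", "é", "€", "😀"]:
        code_point = ord(ch)                    # e.g. 65 for 'A', matching ASCII
        utf8_bytes = ch.encode("utf-8")
        print(ch, code_point, len(utf8_bytes))  # 'A' 65 1 ... '😀' 128512 4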

Representing Characters in Different Systems

  • You must be able to convert characters to binary and vice versa for different systems.
  • You can do this by finding the character’s code in its encoding scheme (e.g., ASCII or Unicode) and converting that code between decimal and binary, as covered earlier; a short sketch follows below.
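
As a minimal sketch of this conversion, using Python’s built-in ord(), format(), chr(), and int() functions:

    # Character -> binary: look up the character's code, then convert it to binary.
    code = ord("A")               # 65 in the ASCII/Unicode table
    binary = format(code, "07b")  # '1000001' as a 7-bit binary string
    print(binary)

    # Binary -> character: convert the binary string back to decimal, then look it up.
    print(chr(int("1000001", 2)))  # 'A'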