Compression

Core Concepts of Compression

  • Compression is a method used to decrease the size of a file or data set.
  • Compressed data takes up less storage space and can be transmitted more quickly, so more data fits on the same device or network link.
  • Compression is especially useful for large files such as videos or high-resolution images.

Lossless and Lossy Compression

  • There are two main types of compression: lossless and lossy.
  • Lossless compression allows the original data to be reconstructed exactly from the compressed data.
  • Lossy compression reduces file size by permanently discarding some information, typically detail that is least noticeable to the viewer or listener. When the file is decompressed, the original cannot be fully restored.
  • The trade-off is that lossy compression usually produces much smaller files, at the cost of some loss in quality. Lossless compression preserves every bit but usually saves less space.
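
The difference can be illustrated with a short Python sketch. The lossless half uses the standard-library zlib module. The lossy half is a toy example only, assuming made-up sample values and a hypothetical scheme that keeps just the top 4 bits of each 8-bit sample. Real lossy formats are far more sophisticated.

```python
import zlib

original = b"the quick brown fox jumps over the lazy dog " * 50

# Lossless: decompressing returns exactly the bytes we started with.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

# Lossy (toy illustration): keep only the top 4 bits of each 8-bit sample.
# The sample values below are invented for this example.
samples = [12, 13, 200, 203, 90, 95]
quantised = [s >> 4 for s in samples]        # stored form: 4 bits per sample
approximated = [q << 4 for q in quantised]   # best reconstruction available

print(restored == original)       # the lossless round trip is exact
print(approximated == samples)    # the lossy round trip is not
```

The discarded low bits cannot be recovered from the quantised values, which is what makes the second scheme lossy.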

Compression Ratio

  • The compression ratio measures how much smaller the compressed file is than the original.
  • It is calculated by dividing the original file size by the compressed file size. For example, a 10 MB file compressed to 2 MB has a compression ratio of 5:1.
  • A higher compression ratio indicates a greater reduction in size.
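
As a minimal sketch of the calculation, the Python below assumes the common convention of original size divided by compressed size, which is the convention under which a higher ratio means more compression. The function names are chosen for this example, and space_saving is an additional, related measure not defined in the notes above.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Return original size divided by compressed size (e.g. 5.0 means 5:1)."""
    if original_size <= 0 or compressed_size <= 0:
        raise ValueError("sizes must be positive")
    return original_size / compressed_size


def space_saving(original_size: int, compressed_size: int) -> float:
    """Return the fraction of the original size that was removed."""
    if original_size <= 0 or compressed_size <= 0:
        raise ValueError("sizes must be positive")
    return 1 - compressed_size / original_size


# A 10 MB file compressed to 2 MB (sizes given in bytes).
ratio = compression_ratio(10_000_000, 2_000_000)
saving = space_saving(10_000_000, 2_000_000)
print(f"{ratio:.1f}:1, {saving:.0%} saved")
```

Both functions need the sizes in the same unit, whether bytes, kilobytes or megabytes.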

Run Length Encoding

  • Run Length Encoding (RLE) is a form of lossless compression where sequences of the same data values are stored as a single data value and count.
  • For example, the string “AAAABBBCC” would be stored as “4A3B2C”. RLE is most effective on data with long runs, such as simple graphics with large areas of a single colour. On data with few runs, the output can be larger than the input.
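
The encoding described above can be sketched in Python as follows. This is a minimal illustration, assuming the count-then-character text format used in the example and input that contains no digit characters (otherwise the counts would be ambiguous). The function names are chosen for this sketch.

```python
from itertools import groupby


def rle_encode(text: str) -> str:
    """Encode each run of identical characters as <count><character>."""
    return "".join(f"{len(list(group))}{char}" for char, group in groupby(text))


def rle_decode(encoded: str) -> str:
    """Reverse rle_encode. Assumes well-formed input with no digits in the data."""
    result = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch                      # counts may span several digits
        else:
            result.append(ch * int(count))   # expand the run
            count = ""
    return "".join(result)


encoded = rle_encode("AAAABBBCC")
print(encoded)              # 4A3B2C
print(rle_decode(encoded))  # AAAABBBCC
print(rle_encode("ABC"))    # 1A1B1C
```

The last call shows the weakness of RLE: with no repeated runs, the encoded form is twice as long as the input.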

Huffman Coding

  • Huffman coding is another lossless compression method. It uses variable-length codes to represent the individual symbols (such as characters) in a data set.
  • Symbols that appear more frequently are assigned shorter codes, while those that appear less frequently are assigned longer codes. No code is a prefix of another, so the encoded data can be decoded unambiguously.
  • Huffman coding is used as one stage in many formats, including JPEG images and the DEFLATE method used by ZIP files.
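
One way to build such a code table is sketched below in Python, using a priority queue from the standard-library heapq module. This is an illustrative sketch rather than the method used by any particular file format. The function names are chosen for this example, and the bits are kept as a string of "0" and "1" characters for readability instead of being packed into bytes.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a symbol -> bit-string table from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:                       # a single symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tie-breaker, {symbol: code so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # the two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(text: str, codes: dict[str, str]) -> str:
    return "".join(codes[ch] for ch in text)


def huffman_decode(bits: str, codes: dict[str, str]) -> str:
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:               # prefix-free, so the first match is correct
            out.append(reverse[current])
            current = ""
    return "".join(out)


codes = huffman_codes("AAAABBBCC")
bits = huffman_encode("AAAABBBCC", codes)
print(codes, len(bits))
```

For “AAAABBBCC” the most frequent symbol, A, receives a 1-bit code and B and C receive 2-bit codes, giving 14 bits in total compared with 72 bits at 8 bits per character. In practice the code table (or the frequencies) must also be stored with the data so that it can be decoded.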

File Formats

  • Different file formats utilise different types of compression.
  • For example, JPEG images use lossy compression and can be compressed to a smaller size, but some image quality is lost in the process.
  • PNG images use lossless compression, which preserves quality exactly but typically results in larger files than JPEG for photographs.

Importance of Compression

  • Compression is vital in maximising storage, reducing transmission time, and saving bandwidth.
  • However, aggressive lossy compression can result in a noticeable loss of quality. The choice between lossless and lossy compression depends on the application and its requirements.