Data Compression

Understanding Data Compression

Data compression is the process of encoding data in fewer bits so that files take up less storage space.
The main aim is to store and transmit data more efficiently.
The two types of data compression are lossless compression and lossy compression.

Lossless Compression

Lossless Compression involves reducing the file size without losing any original data.
When a file is decompressed after Lossless compression, it is exactly the same as the original before compression.
Lossless compression is typically used for text and data files, where loss of words or data could be detrimental.
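
The round trip described above can be checked with Python's standard zlib module. This is a minimal sketch: the sample text is made up for illustration, and the exact compressed size will depend on the data used.

```python
import zlib

# Made-up, highly repetitive sample data (an assumption for illustration).
original = b"the quick brown fox " * 50

compressed = zlib.compress(original)    # lossless compression
restored = zlib.decompress(compressed)  # decompression

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

Because the compression is lossless, the restored bytes compare equal to the original bytes.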

Lossy Compression

Lossy Compression reduces file size by permanently removing some information, usually detail that people are least likely to notice.
When a file is decompressed after Lossy compression, it isn’t exactly the same as the original. Some data is lost during the process.
Lossy compression is commonly used for image, audio and video files (for example JPEG and MP3), where a small loss in quality is typically not noticeable.
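
One simple way to see why lossy compression cannot be reversed is quantisation, where each value is rounded to a coarser scale. The sketch below is only an illustration: the sample values and the step size of 10 are assumptions, and real audio and video formats use far more elaborate methods.

```python
def quantize(samples, step):
    """Store each sample as the nearest whole number of steps (lossy)."""
    return [round(s / step) for s in samples]


def dequantize(codes, step):
    """Rebuild approximate samples from the stored step counts."""
    return [c * step for c in codes]


samples = [101, 103, 98, 102, 250, 251, 249]  # made-up sample values
step = 10                                     # assumed step size

codes = quantize(samples, step)
restored = dequantize(codes, step)

print(codes)                # smaller numbers, with long runs of repeats
print(restored)             # close to the original values
print(restored == samples)  # False: the fine detail cannot be recovered
```

The stored codes are smaller and more repetitive than the original samples, so they compress well, but the exact original values are gone for good.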

The Compression Process

Many compression algorithms work by finding repeated occurrences of data within a file and replacing them with a shorter representation.
RLE (Run Length Encoding) is a simple form of data compression where sequences (runs) of the same data value are stored as a single value and count.
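
A minimal sketch of RLE on a string of characters is shown here. The function names rle_encode and rle_decode are hypothetical, chosen for this example only.

```python
def rle_encode(data):
    """Turn a string into a list of (value, count) runs."""
    runs = []
    for ch in data:
        if runs and runs[-1][0] == ch:
            # Same value as the current run: extend its count.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # Different value: start a new run.
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original string from (value, count) runs."""
    return "".join(ch * count for ch, count in runs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

RLE only saves space when the data contains long runs. For data with few repeats, storing a count for every value can make the result larger than the original.
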
Dictionary-based compression, used in formats like ZIP, replaces sequences that occur repeatedly with short references to entries in a ‘dictionary’ of those sequences.
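
The dictionary idea can be sketched with LZW, a well-known dictionary-based algorithm. Note that LZW is swapped in here because it is easy to follow; it is not the exact method used inside ZIP files. The function names are hypothetical, and the sketch assumes every character in the input has a code below 256.

```python
def lzw_compress(text):
    """Return a list of integer codes for text (characters below code 256)."""
    dictionary = {chr(i): i for i in range(256)}  # start with single characters
    next_code = 256
    current = ""
    output = []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate  # keep growing the matched sequence
        else:
            output.append(dictionary[current])  # emit code for longest match
            dictionary[candidate] = next_code   # remember the new sequence
            next_code += 1
            current = ch
    if current:
        output.append(dictionary[current])
    return output


def lzw_decompress(codes):
    """Rebuild the text, recreating the same dictionary while decoding."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    result = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the sequence that is being defined right now.
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        result.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(result)


text = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(text)
print(len(text), "characters became", len(codes), "codes")
print(lzw_decompress(codes) == text)  # True
```

The dictionary is never stored with the compressed data in this sketch. The decompressor rebuilds it step by step from the codes it has already read.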

Benefits and Drawbacks of Data Compression

Benefits of data compression include saving disk storage space, reducing the time to transmit files over the internet, and saving bandwidth when streaming.
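
The saving in transmission time can be estimated as file size in bits divided by connection speed in bits per second. The figures below (a 100 MB file, a 20 Mbit/s connection, and compression to a quarter of the original size) are assumptions for illustration only, and the helper name transfer_time_seconds is hypothetical.

```python
def transfer_time_seconds(size_bytes, bits_per_second):
    """Estimate transfer time, ignoring protocol overhead and latency."""
    return size_bytes * 8 / bits_per_second


speed = 20_000_000             # assumed connection speed: 20 Mbit/s
original_size = 100_000_000    # assumed file size: 100 MB
compressed_size = 25_000_000   # assumed size after compression: 25 MB

print(transfer_time_seconds(original_size, speed))    # 40.0
print(transfer_time_seconds(compressed_size, speed))  # 10.0
```
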
Drawbacks can include the time and processing power needed to compress and decompress data, and the permanent loss of some data in lossy compression.