Compression is the process of representing data in a different, more efficient way. As the old saying goes, there is more than one way to skin a cat, and there is also more than one way to represent data. For a compression algorithm to be valid, two parties must agree on the scheme: the compressor, which is the process that changes the data representation to a smaller, more efficient form, and the de-compressor, which is the process that converts the data back from its compressed form to its original usable form.
A simple compression algorithm could be, for example, to convert pure text data from 8 bit characters to 7 bit characters, thus saving one eighth of the size needed to store the original data. The English alphabet, including all the special characters, requires no more than 128 symbols, but for ease of processing most computers use 8 bit units, also known as bytes, to represent each character. While this is wasteful in terms of storage, it allows much faster and more efficient CPU usage. So a simple algorithm would cram each 7 bit representation of a letter right next to the previous one, saving one bit per character, or a total of one eighth of the storage.
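The 7 bit idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real-world format: the function names pack7 and unpack7 are made up for this example, the text is assumed to be pure ASCII, and the de-compressor is assumed to be told the character count separately, which is one form of the agreement between compressor and de-compressor.

```python
def pack7(text: str) -> bytes:
    """Pack ASCII text into a bit stream using 7 bits per character."""
    codes = text.encode("ascii")  # raises an error if a character needs more than 7 bits
    acc = 0      # bit accumulator
    nbits = 0    # number of bits currently held in acc
    out = bytearray()
    for c in codes:
        acc = (acc << 7) | c
        nbits += 7
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the bits not yet written
    if nbits:
        # pad the last partial byte with zero bits
        out.append((acc << (8 - nbits)) & 0xFF)
    return bytes(out)


def unpack7(packed: bytes, count: int) -> str:
    """Recover `count` characters from a stream produced by pack7."""
    acc = 0
    nbits = 0
    chars = []
    for b in packed:
        acc = (acc << 8) | b
        nbits += 8
        while nbits >= 7 and len(chars) < count:
            nbits -= 7
            chars.append(chr((acc >> nbits) & 0x7F))
        acc &= (1 << nbits) - 1
    return "".join(chars)


text = "compress"
packed = pack7(text)
print(len(text), "characters stored in", len(packed), "bytes")
print(unpack7(packed, len(text)) == text)
```

Eight characters occupy 56 bits, so they fit in exactly seven bytes, which is the one eighth saving mentioned above.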
In real life, compression algorithms are much more complex and rely on more sophisticated mathematics. In essence, they use statistical tools to measure how often different symbol strings occur in the data and then devise a better scheme for representing those strings, typically by giving shorter codes to the strings that occur most often. You might have heard buzz words like Huffman trees and the Lempel-Ziv-Welch (LZW) algorithm. These are names for such algorithms.
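To make the statistical idea a little more concrete, here is a minimal sketch of Huffman coding in Python. It counts how often each symbol occurs and assigns shorter bit strings to the more frequent symbols. The function names huffman_code, encode and decode are chosen for this example only, and the bits are kept as text strings of "0" and "1" for readability rather than packed into bytes as a real compressor would do.

```python
import heapq
from collections import Counter


def huffman_code(data: str) -> dict:
    """Build a prefix-free code table {symbol: bit string} from symbol counts."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # a single distinct symbol still needs a one bit code
        return {next(iter(freq)): "0"}
    # heap entries: (weight, tie breaker, {symbol: code built so far})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # merge the two least frequent subtrees
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data: str, code: dict) -> str:
    return "".join(code[ch] for ch in data)


def decode(bits: str, code: dict) -> str:
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # unambiguous because no code is a prefix of another
            out.append(inverse[current])
            current = ""
    return "".join(out)


data = "aaaabbc"
code = huffman_code(data)
bits = encode(data, code)
print(code)
print(len(bits), "bits instead of", 8 * len(data))
print(decode(bits, code) == data)
```

The frequent symbol "a" receives a one bit code while the rarer "b" and "c" receive two bit codes, so the seven characters need 10 bits instead of 56. Note that the de-compressor needs the same code table, so a real format would also have to store or transmit it.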
In real life there are also two types of compression: one is known as lossless compression and the other, fittingly, as lossy compression. The difference between the two is the ability to convert the compressed data back to its original form. Lossless compression algorithms do not lose any information in the process of changing the data representation. As such, lossless compression algorithms allow a de-compressor to recreate the original data exactly.
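The lossless property is easy to check in practice. The short Python sketch below uses the zlib module from the standard library, which implements the DEFLATE method (a combination of Lempel-Ziv style matching and Huffman coding). The sample text is made up for the example and is deliberately repetitive so that it compresses well.

```python
import zlib

# repetitive sample data, chosen only for illustration
original = b"there is more than one way to represent data. " * 50

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes compressed to", len(compressed), "bytes")
print(restored == original)  # lossless: every byte comes back unchanged
```

The exact compressed size depends on the data and on the zlib version, but the restored bytes always match the original.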
In contrast, lossy compression cannot recreate the exact original data. Usually the de-compressor of a lossy algorithm can create data that is very close to the original but not exactly the same. So why use lossy compression if it cannot recreate the original data? It sounds like lossless compression is better. The answer is that while lossy compression fails to recreate the exact original data, it can provide much higher compression ratios. It is precisely the looser requirement, recreating an approximation of the original data rather than an exact match, that lets a lossy algorithm discard detail and achieve those high compression ratios.
Lossy compression is used for data whose exact accuracy is not important. For example, when compressing images or videos, the ability to recreate the exact original image or video data is not important, since minor inaccuracies are not noticeable to human viewers.
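A toy example shows the trade-off. The Python sketch below is not how real image or video formats work; it simply assumes the data is a sequence of 8 bit samples (for instance grayscale pixel values) and keeps only the top 4 bits of each one. The function names lossy_compress and lossy_decompress are invented for this illustration.

```python
def lossy_compress(samples: bytes) -> bytes:
    """Keep only the top 4 bits of each 8 bit sample, packing two samples per byte."""
    out = bytearray()
    for i in range(0, len(samples), 2):
        hi = samples[i] >> 4
        lo = samples[i + 1] >> 4 if i + 1 < len(samples) else 0
        out.append((hi << 4) | lo)
    return bytes(out)


def lossy_decompress(packed: bytes, count: int) -> bytes:
    """Rebuild `count` approximate samples; the discarded low bits cannot be recovered."""
    out = bytearray()
    for b in packed:
        for nibble in (b >> 4, b & 0x0F):
            # use the midpoint of the discarded range as the best guess
            out.append((nibble << 4) | 0x08)
    return bytes(out[:count])


samples = bytes([0, 17, 100, 200, 255])
packed = lossy_compress(samples)
restored = lossy_decompress(packed, len(samples))

print(list(samples))
print(list(restored))
print(len(samples), "bytes stored in", len(packed), "bytes")
```

The storage is roughly halved, and each restored value differs from the original by at most 8 out of 255, which is the kind of small inaccuracy a viewer is unlikely to notice. The original values, however, are gone for good.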
Danette Mckay is a well-known author. Read more about this and other subjects at 7-zip.