Compression

High-dimensional data usually carries far more redundancy than signal. Change basis until most coordinates are small, drop the small ones, and reconstruct from what's left. JPEG, TF-IDF, and Huffman codes are the *same procedure*, just applied to pixels, words, and probabilities.

the skeleton

1. change basis: Pick a basis where signal concentrates in few coordinates.

2. drop small: Zero out (or quantise) coordinates below a threshold.

3. reconstruct: Invert the basis change with the surviving coordinates.
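
The three steps can be sketched in a few lines. This is a minimal illustration, not a production codec: the orthonormal Haar basis is an assumption chosen here because it is short to write (the text does not name a basis), and the signal, the threshold of 0.5, and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """Step 1, change basis: orthonormal Haar transform (length must be a power of two)."""
    coeffs = list(x)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        avg = [(coeffs[2 * i] + coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        diff = [(coeffs[2 * i] - coeffs[2 * i + 1]) / SQRT2 for i in range(half)]
        coeffs[:n] = avg + diff  # averages first, details after
        n = half
    return coeffs

def drop_small(coeffs, threshold):
    """Step 2, drop small: zero every coordinate whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

def haar_inverse(coeffs):
    """Step 3, reconstruct: invert the Haar transform."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / SQRT2, (a - d) / SQRT2]
        out[:2 * n] = merged
        n *= 2
    return out

# Hypothetical signal: two flat plateaus plus a little jitter.
signal = [4.0, 4.1, 3.9, 4.0, 1.0, 1.1, 0.9, 1.0]

coeffs = haar_forward(signal)
kept_coeffs = drop_small(coeffs, threshold=0.5)  # threshold is an assumed value
restored = haar_inverse(kept_coeffs)

kept = sum(1 for c in kept_coeffs if c != 0.0)
max_err = max(abs(a - b) for a, b in zip(signal, restored))
print(f"kept {kept} of {len(signal)} coefficients, max error {max_err:.3f}")
```

On this signal only two of the eight coefficients survive the threshold, and the reconstruction comes back as the two clean plateaus: the jitter is what was thrown away.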

instances · 3

graphics · jpeg-compression

Why JPEG Throws Pixels Away

objective: 8×8 pixel block as 64-D vector

stops when: the DCT basis concentrates energy in low frequencies; quantise the coefficients so that most high-frequency ones round to zero; Huffman-code the quantised values.
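
A rough sketch of the DCT-and-quantise step on a single 8×8 block follows. It is an illustration under stated assumptions, not the JPEG codec: the pixel block and the step-size table are made up (the table is not the standard JPEG quantisation table), and the level shift, zigzag ordering, and Huffman stage are left out.

```python
import math

N = 8

def dct_matrix(n):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

D = dct_matrix(N)
DT = transpose(D)

# Hypothetical smooth block: a gentle gradient, typical of natural image content.
block = [[100 + 6 * r + 3 * c for c in range(N)] for r in range(N)]

# Hypothetical step sizes that grow with frequency (NOT the JPEG standard table).
steps = [[8 + 4 * (u + v) for v in range(N)] for u in range(N)]

# Change basis: 2-D DCT, computed as D * block * D^T.
coeffs = matmul(matmul(D, block), DT)

# Drop small: divide by the step and round; small coefficients become 0.
quantised = [[round(coeffs[u][v] / steps[u][v]) for v in range(N)] for u in range(N)]

# Reconstruct: rescale, then invert the DCT as D^T * coeffs * D.
dequantised = [[quantised[u][v] * steps[u][v] for v in range(N)] for u in range(N)]
restored = matmul(matmul(DT, dequantised), D)

nonzero = sum(1 for row in quantised for q in row if q != 0)
max_err = max(abs(block[r][c] - restored[r][c]) for r in range(N) for c in range(N))
print(f"{nonzero} of 64 coefficients are nonzero, max pixel error {max_err:.2f}")
```

The surviving integers are what an entropy coder such as Huffman would then pack; the long runs of zeros are where most of the saving comes from.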

ml / dl · tf-idf

TF-IDF

objective: document as bag of words; high-D sparse vector

stops when: idf = log(N/df) pushes the weight of common words toward zero (exactly zero when a word appears in every document, df = N); rare words carry the signal.
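
The weighting can be checked by hand on a tiny corpus. A minimal sketch, assuming raw term counts for tf and the unsmoothed idf = log(N/df) from the card; the three documents are invented, and library implementations often use smoothed variants that never reach exactly zero.

```python
import math
from collections import Counter

# Hypothetical three-document corpus.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the bird sang",
]
tokenised = [d.split() for d in docs]
n_docs = len(tokenised)

# df: number of documents each word appears in (counted once per document).
df = Counter(word for doc in tokenised for word in set(doc))

# idf = log(N / df); a word present in every document gets log(1) = 0.
idf = {word: math.log(n_docs / count) for word, count in df.items()}

def tfidf(doc):
    """Raw term count times idf for each word in one document."""
    tf = Counter(doc)
    return {word: tf[word] * idf[word] for word in tf}

weights = [tfidf(doc) for doc in tokenised]
for doc, w in zip(docs, weights):
    print(doc, "->", {k: round(v, 3) for k, v in w.items()})
```

Here "the" appears in all three documents, so its weight is zero no matter how often it repeats, while "bird", which appears in only one, gets the largest idf.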

graphics · image-compression

Why Images Compress

objective: raw pixel grid; redundancy across neighbours

stops when: histogram coding plus spatial coding recover the savings the entropy floor predicts.
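
One way to see neighbour redundancy is to compare the entropy of the pixel histogram with the entropy of left-neighbour differences. This is a sketch under assumptions: the 16×16 gradient image is synthetic, the left-neighbour predictor is one simple choice of spatial coding, and the entropies are zeroth-order estimates rather than a true lower bound.

```python
import math
from collections import Counter

def entropy_bits(values):
    """Shannon entropy of the empirical histogram, in bits per symbol."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

SIZE = 16

# Hypothetical image: a smooth diagonal gradient.
image = [[(r + c) * 8 for c in range(SIZE)] for r in range(SIZE)]

# Histogram view: treat every pixel as an independent symbol.
pixels = [p for row in image for p in row]

# Spatial view: store each pixel as its difference from the left neighbour
# (the first pixel in each row is stored as-is).
residuals = [row[c] - row[c - 1] if c > 0 else row[0]
             for row in image for c in range(SIZE)]

pixel_entropy = entropy_bits(pixels)
residual_entropy = entropy_bits(residuals)
print(f"pixels:    {pixel_entropy:.2f} bits/symbol")
print(f"residuals: {residual_entropy:.2f} bits/symbol")
```

The residuals are almost all the same value, so their histogram is far more peaked than the pixel histogram and an entropy coder needs far fewer bits per symbol for them.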

walk the instances

How Compression Works →

One three-step procedure under JPEG, TF-IDF, and the raw-pixel entropy floor — across graphics and ML.