Compression
High-dimensional data with too much redundancy. Change basis until most coordinates are small. Drop the small ones. Reconstruct what's left. JPEG, TF-IDF, and Huffman codes are the *same procedure*, just applied to pixels, words, and probabilities.
the skeleton
- 1 change basis: Pick a basis where signal concentrates in few coordinates.
- 2 drop small: Zero out (or quantise) coordinates below a threshold.
- 3 reconstruct: Invert the basis change with the surviving coordinates.
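The three steps can be sketched on a toy 1-D signal. This is a minimal illustration, not a reference implementation: the orthonormal DCT-II basis, the threshold of 1.0, and the function names (`dct_matrix`, `change_basis`, `drop_small`, `reconstruct`) are all assumptions chosen for the example.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis; each row is one basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n)
                     for i in range(n)])
    return rows

def change_basis(basis, x):
    # Step 1: project the signal onto every basis vector.
    return [sum(b * v for b, v in zip(row, x)) for row in basis]

def drop_small(coeffs, threshold):
    # Step 2: zero out coordinates whose magnitude is below the threshold.
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

def reconstruct(basis, coeffs):
    # Step 3: the basis is orthonormal, so its inverse is its transpose.
    n = len(coeffs)
    return [sum(basis[k][i] * coeffs[k] for k in range(n)) for i in range(n)]

signal = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]  # smooth ramp
basis = dct_matrix(len(signal))
coeffs = change_basis(basis, signal)
kept = drop_small(coeffs, threshold=1.0)  # threshold is an arbitrary choice
approx = reconstruct(basis, kept)

survivors = sum(1 for c in kept if c != 0.0)
max_error = max(abs(a - b) for a, b in zip(signal, approx))
print(survivors, "of", len(signal), "coefficients kept; max error", max_error)
```

Because the ramp is smooth, most of its energy lands in the first couple of coordinates, so only a few coefficients survive the threshold and the reconstruction stays close to the original.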
instances · 3
graphics · jpeg-compression
Why JPEG Throws Pixels Away
objective 8×8 pixel block as 64-D vector
stops when DCT basis concentrates energy in low frequencies; quantise the coefficients, most coarsely at high frequencies; Huffman-code what survives.
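A rough sketch of the JPEG-style step on a single 8×8 block follows. The step sizes here are a made-up table that simply grows with frequency (`base_step * (1 + u + v)`); they are an assumption for illustration and not the quantisation table from the JPEG standard. The entropy-coding stage is omitted.

```python
import math

N = 8

def dct_1d(x):
    """Orthonormal 1-D DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def dct_2d(block):
    # Separable transform: rows first, then columns.
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[i][j] for i in range(N)]) for j in range(N)]
    # cols[j][u] holds vertical frequency u, horizontal frequency j.
    return [[cols[j][u] for j in range(N)] for u in range(N)]

def quantise(coeffs, base_step=4.0):
    # Hypothetical step table: coarser steps at higher frequencies.
    return [[round(coeffs[u][v] / (base_step * (1 + u + v)))
             for v in range(N)] for u in range(N)]

# A smooth gradient block, level-shifted by 128 as JPEG does before the DCT.
block = [[100 + 2 * x + y for x in range(N)] for y in range(N)]
shifted = [[p - 128 for p in row] for row in block]

levels = quantise(dct_2d(shifted))
nonzero = sum(1 for row in levels for q in row if q != 0)
print(nonzero, "of", N * N, "quantised coefficients are non-zero")
```

On a smooth block like this one, nearly all of the 64 quantised coefficients come out as zero, which is what makes the subsequent entropy coding effective.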
ml / dl · tf-idf
TF-IDF
objective document as bag of words; high-D sparse vector
stops when idf = log(N/df) shrinks common words toward zero (exactly zero when df = N); rare words carry the signal.
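The effect of idf = log(N/df) can be seen on a tiny corpus. This is a sketch using the unsmoothed formula exactly as written above; the three documents and the helper name `tfidf` are invented for the example, and real libraries often use smoothed variants.

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "the quantum computer ran the algorithm",
]
tokenised = [d.split() for d in docs]
n_docs = len(tokenised)

# df: number of documents each word appears in (counted once per document).
df = Counter(w for doc in tokenised for w in set(doc))
idf = {w: math.log(n_docs / df[w]) for w in df}

def tfidf(doc):
    """Raw term count times idf for every word in one document."""
    tf = Counter(doc)
    return {w: tf[w] * idf[w] for w in tf}

weights = [tfidf(doc) for doc in tokenised]

for w in ("the", "cat", "quantum"):
    print(w, "df =", df[w], "idf =", round(idf[w], 3))
```

A word that occurs in every document gets idf = log(1) = 0 and drops out entirely, while a word that occurs in only one document gets the largest weight.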
graphics · image-compression
Why Images Compress
objective raw pixel grid; redundancy across neighbours
stops when histogram and spatial coding together reach what the entropy floor predicts.
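One way to see the gap between a histogram-only estimate and one that exploits neighbouring pixels is to compare the entropy of raw values with the entropy of neighbour differences. The sketch below uses a synthetic scanline and a simple previous-pixel predictor; both are assumptions for illustration, not a description of any particular codec.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

# Synthetic scanline: 64 grey levels, each repeated 4 times (slowly varying).
row = [i // 4 for i in range(256)]

# Spatial step: keep the first pixel, then store each pixel minus its left neighbour.
diffs = [row[0]] + [b - a for a, b in zip(row, row[1:])]

h_raw = entropy_bits(row)    # histogram of pixel values only
h_diff = entropy_bits(diffs)  # histogram after neighbour prediction

print("bits/pixel, raw values:      ", round(h_raw, 3))
print("bits/pixel, neighbour diffs: ", round(h_diff, 3))
```

The raw histogram is flat across 64 levels, so it suggests about 6 bits per pixel, while the differences are almost always 0 or 1 and need far fewer bits. The redundancy lives in the relationship between neighbours, not in the value histogram.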
walk the instances
How Compression Works →
One three-step procedure under JPEG, TF-IDF, and the raw-pixel entropy floor — across graphics and ML.