How many times do you multiply 2 to get 1,024?
The answer is 10. That 10 is log₂(1,024). Log is the inverse of exponentiation — it pulls the exponent out of the result. Most big numbers in nature are built from exponents: cells double, interest compounds, starlight dims as 1/r², sound spans twelve orders of magnitude. The result you see (1,024 cells, a 100× gain, magnitude-7.2) is rarely the natural parameter. The exponent (how many doublings, how many years) is. Log is the function that recovers it. You already use a special case — log₁₀(1,000,000) = 6 because the number has six zeros. The everyday digit count is the exponent for base 10. Generalize the digit count to any base and you have the log. And because exponents add when powers of the same base multiply, log turns multiplication into addition: log(a·b) = log(a) + log(b). That one line drives the rest of this page.
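A two-line check of both claims — the exponent recovered, and the digit count as the base-10 special case. A minimal sketch with the standard math module; 8,675,309 is just an arbitrary example number:

```python
import math

math.log2(1024)                          # 10.0 — the number of doublings
math.log10(1_000_000)                    # 6.0 — "count the zeros"

# digit count generalized: floor(log10(n)) + 1
math.floor(math.log10(8_675_309)) + 1    # 7, matching a straight count of digits
```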
log = what’s the exponent. Everything else — the ×→+ identity, slide rules, log-likelihood, digit-count estimates — is consequence.
- what: the inverse of exponentiation. log(a·b) = log(a) + log(b) — products become sums. The whole module is that one identity.
- applies when: a quantity is built from exponents — compound interest, half-life, decibels, earthquake magnitudes, sequence probabilities. The natural parameter is how many factors, and you want to recover it from the result.
- breaks when: the argument is zero or negative — real log is undefined. Base 1 — every exponent gives 1 and the inverse collapses. The most common student error is logging across an addition: log(a + b) ≠ log(a) + log(b). The identity needs a product underneath, every time — see the sketch below.
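A minimal sketch of both failure modes, using nothing beyond the standard math module:

```python
import math

math.isclose(math.log(2 * 3), math.log(2) + math.log(3))  # True — the product rule
math.log(2 + 3)                # 1.609… = log(5)
math.log(2) + math.log(3)      # 1.792… = log(6) — logging across + gives a different number
# Domain edge: math.log(0) and math.log(-1) both raise ValueError — real log needs x > 0.
```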
The identity that does all the work
Log is defined by one rule: log_b(x) is the exponent y with bʸ = x. Pick any base; the rule is the same. (Context picks the base by convention: log₁₀ for digits and decibels, log₂ for bits, ln for calculus — the laws are identical.)
```python
import math

# every log law from one identity:
math.log10(2 * 50)     # ≈ math.log10(2) + math.log10(50)
math.log10(2 ** 10)    # ≈ 10 * math.log10(2)
math.log10(1)          # 0.0 — the log of 1 is 0 in every base
```

Same trick, five places
Exponential quantities scatter across many places — time-growth (compound interest, radioactive decay), power ratios (decibels, earthquake magnitude), information (bits). Five places, one move.
- Compound interest. A million won at 7%/year — when does it double? 1.07ᵗ = 2 → t = log(2)/log(1.07) ≈ 10.2 years. The Rule of 72 (72/7 ≈ 10.3) is this formula sloppily memorized.
- Carbon-14 dating. Carbon-14 halves every 5,730 years after death. If 25% remains: (1/2)^(t/5730) = 0.25 → t = 2 half-lives = 11,460 years. For odd ratios (33%, 17%) only the log expression t = 5,730 · log(remaining)/log(1/2) closes in a single line.
- Decibels. dB = 10·log₁₀(P/P₀). Conversation 60 dB, rock concert 110 dB → acoustic power differs by 10^((110−60)/10) = 10⁵. Your ears don’t perceive a hundred-thousand-fold gap; hearing is logarithmic in power, and decibels track that compression directly.
- Earthquake magnitude. Energy E ∝ 10^(1.5·M). Tōhoku 2011 (M 9.0) vs an ordinary large quake (M 7.0): 10^(1.5·2.0) = 10³ ≈ 1,000× the energy. Two units of magnitude, three orders of energy. Natural earthquake energies span 19 orders of magnitude — comparison is hopeless without the compression Richter applies.
- Bits and binary search. A 1,024-page dictionary, halving each step → log₂(1,024) = 10 steps to find any word. A 32-bit int holds 2³² ≈ 4.3 billion values; identifying one of N items needs ⌈log₂ N⌉ bits. A deck of cards has log₂(52!) ≈ 226 bits of shuffle entropy — 226 yes/no questions to specify a single shuffle exactly.
Five problems, one shape: nature’s equation is exponential, take logs both sides, the exponent falls out. The identity from arc 1 — multiplication into addition — is doing this work every single time.
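One sketch covering all five bullets — the values below reproduce the numbers quoted above:

```python
import math

# interest: 1.07**t = 2, solved for t
math.log(2) / math.log(1.07)            # ≈ 10.24 years to double at 7%

# carbon-14: (1/2)**(t/5730) = 0.25, solved for t
5730 * math.log(0.25) / math.log(0.5)   # = 11,460 years — two half-lives

# decibels: power ratio behind a 50 dB gap
10 ** ((110 - 60) / 10)                 # = 100,000 — five orders of magnitude

# earthquakes: energy ratio for ΔM = 2.0
10 ** (1.5 * (9.0 - 7.0))               # = 1,000 — three orders of energy

# bits: binary-search steps and shuffle entropy
math.log2(1024)                         # = 10.0 steps
math.log2(math.factorial(52))           # ≈ 225.58 → 226 yes/no questions
```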
Napier and the slide rule (×→+ embodied)
John Napier published the first log tables in 1614 because astronomers were dying inside, multiplying nine-digit numbers by hand to predict eclipses. His tables let them look up log(a) and log(b), add the two, and look up what number had that log — the answer to a·b with no multiplication anywhere. Three centuries later, every engineer carried a slide rule: two sticks printed on log scales, so sliding one along the other adds log-lengths — which multiplies numbers.
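The table workflow in miniature — exp and log standing in for Napier’s printed tables (the two factors are arbitrary):

```python
import math

a, b = 57_321.8, 94_062.4
via_tables = math.exp(math.log(a) + math.log(b))  # look up, add, look up backwards
direct = a * b
math.isclose(via_tables, direct)                  # True — × done entirely as +
```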
Underflow — and why log-space saves your model
A float32 can’t represent positive numbers much below 10⁻³⁸ (subnormals stretch the floor to about 1.4 × 10⁻⁴⁵, with vanishing precision). Multiply 50 token probabilities of 0.1 each and the true answer, 10⁻⁵⁰, is below even that floor — the product silently becomes 0.0. The same quantity in log-space is a perfectly ordinary sum: 50 · log(0.1) ≈ −115.
```python
import numpy as np

# Naive: multiply 50 probabilities of 0.1 in float32.
# True product is 1e-50, below float32's smallest subnormal — it underflows.
p = np.float32(0.1)
np.prod([p] * 50)           # → 0.0 (silent death)

# Log-space: add 50 log-probabilities. Survives.
np.sum(np.log([p] * 50))    # → -115.13 (well-defined)
```

Where this shows up — same identity, two pillars
Log is what lets addition answer multiplication’s questions. Every field that compounds things multiplicatively — and many do — eventually needs to ask “how many?” or “how big?” or “how confident?” in a form that adds. Log is the bridge.
- finance: rates compose multiplicatively; log makes them add (years to target, CAGR, continuous compounding).
- ml: independent likelihoods multiply; log makes them add — and *negative* log makes them a loss.

Five live consumers, all leaning on the single identity from arc 1:
- Bitcoin pizza inverts compound growth: P·(1+r)ᵗ = target can’t be solved for t without taking logs — t = log(target/P) / log(1+r). CAGR is the same identity solved for r instead. Three unknowns, one equation, three log-shaped answers.
- Present value bridges from discrete compounding to the continuous form P·eʳᵗ: take the log of the discrete expression P·(1 + r/n)ⁿᵗ, watch it reduce to log(P) + rt in the limit n → ∞. Continuous compounding isn’t a separate operation — it’s the discrete one looked at through log.
- Confidently wrong builds the loss −log(p̂). Multiple training examples have likelihoods that multiply; logs turn that product into a sum, and the negative sign makes “more confident, more wrong” climb instead of vanish.
- TF-IDF measures rarity in bits: idf(term) = log₂(N / df(term)). The logarithm is what makes ‘twice as rare’ into ‘one bit more surprising’ — directly comparable to other bit-measured quantities like password strength and English-letter entropy.
- Model calibration fits temperature T by minimizing log-loss on held-out validation; the logit function logit(p) = log(p / (1 − p)) is the basis change that linearizes the calibration curve in the first place. Two different log uses inside one workflow.
Five problems across two pillars, one identity: the swap from × into +. The slide rule from arc 3 is the same machine running today inside log_softmax and the calibration optimizer.
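A sketch of the first four consumers in a dozen lines — every number here (principal, rate, the toy corpus counts) is a made-up illustration, not data from the linked modules:

```python
import math

# compound growth P*(1+r)**t = target — one equation, three log-shaped answers
P, r, target = 100.0, 0.07, 200.0
t = math.log(target / P) / math.log(1 + r)   # years to target ≈ 10.24
cagr = (target / P) ** (1 / t) - 1           # same identity solved for r → 0.07

# negative log-likelihood: product of per-example probabilities → sum of logs
probs = [0.9, 0.8, 0.95, 0.6]
nll = -sum(math.log(p) for p in probs)       # ≈ 0.89; a confident wrong answer would dominate

# TF-IDF rarity in bits
N, df = 10_000, 40                           # corpus size, documents containing the term
math.log2(N / df)                            # ≈ 7.97 bits of surprise
```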
log(a·b) = log(a) + log(b). The whole module. Everything else — the digit-count rule, the slide rule, the log-loss, the log-space sum — is that one identity wearing different clothes.
- On the Two Stacks widget, set a = 4. What value of b makes a·b land exactly on 100? Read it off the log axis without computing.
- Without a calculator, give log₁₀(2,000,000) using only log₁₀(2) ≈ 0.301.
- You evaluate a 50-token sequence; each token has probability ~0.05. Write the formula your code should compute, and the formula it should avoid. Use the identity from arc 1.
- On Two Stacks, drag a and b so that the gap log(b) − log(a) is exactly the gap from log(1) to log(10). What does b/a always equal, regardless of where you placed them?
- You’re given two probabilities p and q, but you only know log(p) and log(q) (not p, q themselves — they’d underflow). Derive a numerically stable expression for log(p + q). (This is the log-sum-exp trick.)
- A junior says: “log(a + b) = log(a) + log(b).” Give the one-line numerical counterexample that settles it.