Foundations
Cramér-Wold Theorem
A multivariate distribution is uniquely determined by all of its one-dimensional projections. This reduces multivariate convergence in distribution to checking univariate projections, and is the standard tool for proving the multivariate CLT.
Why This Matters
The central limit theorem in one dimension says $\sqrt{n}(\bar X_n - \mu) \xrightarrow{d} N(0, \sigma^2)$. But in statistics and ML, you almost always work with vectors: the MLE $\hat\theta_n \in \mathbb{R}^d$, the gradient $\nabla_\theta \ell(\theta)$, the sample covariance matrix entries. The multivariate CLT says $\sqrt{n}(\bar X_n - \mu) \xrightarrow{d} N(0, \Sigma)$, but proving convergence in distribution for random vectors is harder than for scalars.
The Cramér-Wold theorem solves this: to prove a random vector converges in distribution, it suffices to prove that every one-dimensional projection $t^\top X_n$ converges. This reduces a $d$-dimensional problem to infinitely many one-dimensional problems, each of which can be handled by the scalar CLT.
The Theorem
Cramér-Wold Theorem
Statement
Let $X_n$ and $X$ be random vectors in $\mathbb{R}^d$. Then $X_n \xrightarrow{d} X$ if and only if $t^\top X_n \xrightarrow{d} t^\top X$ for every $t \in \mathbb{R}^d$. In particular:
A multivariate distribution is uniquely determined by the collection of all its one-dimensional marginals (projections onto arbitrary directions).
Intuition
If two distributions agree on every 1D shadow (projection), they must be the same distribution. Conversely, if two sequences of distributions get close in every 1D shadow, they get close in the full $d$-dimensional space. The projection $t^\top X$ is a scalar random variable, so you can use all the scalar tools (characteristic functions, univariate CLT, moment conditions) to check convergence direction by direction.
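The fact that a projection $t^\top X$ is an ordinary scalar random variable, with variance $t^\top \Sigma t$, can be checked numerically. The covariance matrix and direction below are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 3-d covariance matrix and projection direction.
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
t = np.array([1.0, -2.0, 0.5])

# Draw samples of X ~ N(0, Sigma) and project onto t.
X = rng.multivariate_normal(mean=np.zeros(3), cov=Sigma, size=200_000)
proj = X @ t                      # t^T X: a plain scalar random variable

# The scalar projection has variance t^T Sigma t.
print(proj.var())                 # empirical, close to the line below
print(t @ Sigma @ t)              # theoretical: 3.775 for these choices
```

Any scalar diagnostic (histogram, QQ-plot, moment check) can now be applied to `proj` directly.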
Proof Sketch
The characteristic function of $t^\top X$ is $\varphi_{t^\top X}(s) = E[e^{is\,t^\top X}] = \varphi_X(st)$, the characteristic function of $X$ evaluated at the vector $st$.
If $t^\top X_n \xrightarrow{d} t^\top X$ for all $t$, then $\varphi_{t^\top X_n}(s) \to \varphi_{t^\top X}(s)$ for each $s$, since $x \mapsto e^{isx}$ is bounded and continuous. But $\varphi_{X_n}(t) = \varphi_{t^\top X_n}(1)$, so $\varphi_{X_n}(t) \to \varphi_X(t)$ for all $t \in \mathbb{R}^d$. By the multivariate Lévy continuity theorem, $X_n \xrightarrow{d} X$.
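The identity $\varphi_{t^\top X}(s) = \varphi_X(st)$ can be sanity-checked by Monte Carlo for a Gaussian $X$, where $\varphi_X(u) = \exp(-\tfrac12 u^\top \Sigma u)$ is available in closed form. The specific $\Sigma$, $t$, and $s$ below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative choices: a 2-d Gaussian, one direction t, one frequency s.
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])
t = np.array([0.7, -0.3])
s = 1.3

X = rng.multivariate_normal(np.zeros(2), Sigma, size=400_000)

# Characteristic function of the scalar projection t^T X, evaluated at s ...
ecf_projection = np.mean(np.exp(1j * s * (X @ t)))

# ... equals the characteristic function of X evaluated at the vector s*t.
# For N(0, Sigma) the latter is exp(-(st)^T Sigma (st) / 2) in closed form.
u = s * t
cf_closed_form = np.exp(-0.5 * u @ Sigma @ u)

print(abs(ecf_projection - cf_closed_form))  # small Monte Carlo error
```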
Why It Matters
The standard proof of the multivariate CLT uses Cramér-Wold: to show $\sqrt{n}(\bar X_n - \mu) \xrightarrow{d} N(0, \Sigma)$, fix any $t \in \mathbb{R}^d$ and note that $t^\top \bar X_n$ is a sample mean of scalars $t^\top X_i$ with variance $t^\top \Sigma t$. The scalar CLT gives $\sqrt{n}(t^\top \bar X_n - t^\top \mu) \xrightarrow{d} N(0, t^\top \Sigma t)$. Since this holds for all $t$, Cramér-Wold gives the full multivariate result.
This same technique proves asymptotic normality of multivariate MLE, multivariate delta method results, and joint convergence of multiple statistics.
Failure Mode
You must check ALL directions $t \in \mathbb{R}^d$, not just the coordinate directions. Checking only $t = e_1, \dots, e_d$ (the standard basis) establishes convergence of each coordinate marginally, but marginal convergence does not imply joint convergence. The full collection of projections captures the dependence structure that marginals miss.
Application: Multivariate CLT Proof
The multivariate CLT follows immediately from the scalar CLT plus Cramér-Wold:
1. Let $X_1, X_2, \dots$ be i.i.d. random vectors in $\mathbb{R}^d$ with mean $\mu$ and covariance $\Sigma$.
2. Fix any $t \in \mathbb{R}^d$. Define $Y_i = t^\top X_i$. Then $Y_1, Y_2, \dots$ are i.i.d. scalars with mean $t^\top \mu$ and variance $t^\top \Sigma t$.
3. By the scalar CLT: $\sqrt{n}(\bar Y_n - t^\top \mu) \xrightarrow{d} N(0, t^\top \Sigma t)$.
4. But $N(0, t^\top \Sigma t)$ is the distribution of $t^\top Z$ where $Z \sim N(0, \Sigma)$.
5. Since step 3 holds for all $t$, Cramér-Wold gives $\sqrt{n}(\bar X_n - \mu) \xrightarrow{d} N(0, \Sigma)$.
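The steps above can be checked by simulation: for each direction $t$, the standardized projected mean should have standard deviation close to $\sqrt{t^\top \Sigma t}$. The construction below (exponential building blocks, so the $X_i$ are decidedly non-Gaussian) is an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(2)

# Non-Gaussian i.i.d. vectors: X_i = (E1, E1 + E2) with E1, E2 ~ Exp(1),
# so mu = (1, 2) and Sigma = [[1, 1], [1, 2]] (an illustrative construction).
n, reps = 500, 5_000
E = rng.exponential(size=(reps, n, 2))
X = np.stack([E[:, :, 0], E[:, :, 0] + E[:, :, 1]], axis=2)

mu = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 1.0],
                  [1.0, 2.0]])

for t in [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, -1.0])]:
    # Steps 2-3 of the proof: Y_i = t^T X_i, and sqrt(n)(Ybar_n - t^T mu)
    # should be approximately N(0, t^T Sigma t).
    Y_bar = (X @ t).mean(axis=1)               # one sample mean per replication
    Z = np.sqrt(n) * (Y_bar - t @ mu)
    print(t, Z.std(), np.sqrt(t @ Sigma @ t))  # empirical vs. theoretical sd
```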
This proof is a few lines once you have the scalar CLT and Cramér-Wold. Without Cramér-Wold, you would need to work directly with multivariate characteristic functions, which is messier.
Common Confusions
Marginal convergence is not the same as joint convergence
If $(X_n, Y_n) \xrightarrow{d} (X, Y)$, then $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} Y$ (marginals converge). But the converse is false: marginal convergence does not imply joint convergence. Cramér-Wold fixes this by checking ALL linear combinations, not just the individual coordinates. The projection $aX_n + bY_n$ captures the dependence between $X_n$ and $Y_n$.
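A minimal numeric illustration of why coordinate projections miss dependence, using two illustrative bivariate distributions that share $N(0,1)$ marginals:

```python
import numpy as np

rng = np.random.default_rng(3)
size = 100_000

# Two joints with identical N(0,1) marginals (illustrative):
# an independent pair vs. a perfectly correlated pair.
X_ind = rng.standard_normal(size)
Y_ind = rng.standard_normal(size)

X_cor = rng.standard_normal(size)
Y_cor = X_cor.copy()              # Y = X: same marginal, different joint

# Coordinate projections (t = e_1, t = e_2) cannot tell the joints apart:
print(X_ind.std(), X_cor.std())   # both close to 1
print(Y_ind.std(), Y_cor.std())   # both close to 1

# The diagonal projection a = b = 1 does: Var(X + Y) = 2 when the pair is
# independent but 4 when Y = X.
print((X_ind + Y_ind).var())      # close to 2
print((X_cor + Y_cor).var())      # close to 4
```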
Cramér-Wold does require all directions, but the proof handles them uniformly
The theorem requires convergence of $t^\top X_n$ for every $t \in \mathbb{R}^d$. In a symbolic proof you fix an arbitrary but unspecified $t$, derive $t^\top X_n \xrightarrow{d} t^\top X$ using parameters that depend symbolically on $t$ (e.g., the variance $t^\top \Sigma t$ in the multivariate CLT proof), and then conclude that the convergence holds for every $t$, because the argument did not use any specific value of $t$. This is what people informally mean by "checking a generic $t$": one symbolic argument that works for all directions.
What this is not: it is not "check a randomly chosen direction", it is not "check almost every direction", and it is not "check a finite set of directions". Cramér-Wold genuinely requires every $t \in \mathbb{R}^d$, and counterexamples exist where checking only the coordinate directions, or only a finite or measure-zero set of directions, is insufficient.
Exercises
Problem
Use the Cramér-Wold theorem to show that if $X_n \xrightarrow{d} X$ in $\mathbb{R}^d$ and $A$ is a fixed $k \times d$ matrix, then $AX_n \xrightarrow{d} AX$.
Problem
Give an example of random vectors $(X_n, Y_n)$ in $\mathbb{R}^2$ such that $X_n \xrightarrow{d} X$ and $Y_n \xrightarrow{d} Y$ marginally, but $(X_n, Y_n)$ does not converge in distribution to $(X, Y)$.
References
Canonical:
- Billingsley, Convergence of Probability Measures (2nd ed., 1999), Section 29
- van der Vaart, Asymptotic Statistics (1998), Theorem 2.4 (Cramér-Wold device)
- Durrett, Probability: Theory and Examples (5th ed., 2019), Theorem 3.9.5
Historical:
- Cramér & Wold, "Some Theorems on Distribution Functions" (1936)
Last reviewed: April 26, 2026
Canonical graph
Required before and derived from this topic.
Required prerequisites
- Central Limit Theorem (layer 0B · tier 1)
- Measure-Theoretic Probability (layer 0B · tier 1)
Derived topics
- Asymptotic Statistics: M-Estimators, Delta Method, LAN (layer 0B · tier 1)
- High-Dimensional Probability (Vershynin) (layer 2 · tier 1)