
AI Safety

Differential Privacy

Formal privacy guarantees for algorithms: epsilon-delta DP, Laplace and Gaussian mechanisms, composition theorems, DP-SGD for training neural networks, and the privacy-utility tradeoff.

Advanced · Tier 2 · Current · Supporting · ~55 min

Why This Matters

Models trained on sensitive data (medical records, financial transactions, user behavior) can memorize and leak individual records. Membership inference attacks can determine whether a specific person was in the training set. Model inversion attacks can reconstruct training examples. Differential privacy provides a mathematical guarantee: the output distribution of the algorithm changes only slightly when one protected unit is added, removed, or replaced. It is one of the strongest worst-case privacy notions in common use, and it has been deployed in large-scale telemetry, product analytics, and official statistics.

The Definition

Definition

Differential Privacy

A randomized algorithm $\mathcal{M}: \mathcal{D} \to \mathcal{R}$ satisfies $(\varepsilon, \delta)$-differential privacy relative to a fixed neighboring relation if and only if for all neighboring datasets $D$ and $D'$, and for all measurable subsets $S \subseteq \mathcal{R}$:

$$\Pr(\mathcal{M}(D) \in S) \leq e^\varepsilon \cdot \Pr(\mathcal{M}(D') \in S) + \delta$$

The parameter $\varepsilon > 0$ is the privacy budget (smaller is more private). The parameter $\delta \geq 0$ is a small slack probability on which the multiplicative bound may fail; $\delta = 0$ gives pure DP.

Definition

Neighboring Relation

The neighboring relation specifies what counts as changing one protected unit. Two common conventions are:

  • Add/remove adjacency: $D'$ is obtained by adding or removing one record.
  • Replace-one adjacency: $D'$ is obtained by replacing one record by another.

These conventions differ only by constant factors, but those constants matter for sensitivity calculations and for interpreting the final $(\varepsilon, \delta)$ guarantee. In federated or grouped settings, the protected unit might be one user rather than one example.

Definition

Sensitivity

The global sensitivity of a function $f: \mathcal{D} \to \mathbb{R}^d$ is:

$$\Delta f = \max_{D, D' \text{ neighbors}} \|f(D) - f(D')\|_p$$

For $\ell_1$ sensitivity, use $p = 1$. For $\ell_2$ sensitivity, use $p = 2$. Sensitivity measures the maximum change in $f$ from changing one protected unit under the chosen neighboring relation. The norm must match the mechanism: $\ell_1$ sensitivity for Laplace noise, $\ell_2$ sensitivity for Gaussian noise.
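
As a quick illustration, the sketch below works out the sensitivity of a few common queries. It assumes each record's value has been clipped to a known range $[lo, hi]$; the function names are illustrative, not from any particular library.

```python
def count_sensitivity():
    # A counting query ("how many records satisfy P?") changes by at most 1
    # when one record is added or removed: l1 (and l2) sensitivity is 1.
    return 1.0

def bounded_sum_sensitivity(lo, hi):
    # Sum of per-record values clipped to [lo, hi]: adding or removing one
    # record changes the sum by at most max(|lo|, |hi|).
    return max(abs(lo), abs(hi))

def bounded_mean_sensitivity(lo, hi, n):
    # Mean over a fixed, publicly known n: replacing one record moves the
    # mean by at most (hi - lo) / n. If n is itself data-dependent, the
    # analysis is more delicate.
    return (hi - lo) / n

print(bounded_sum_sensitivity(0, 200_000))         # 200000
print(bounded_mean_sensitivity(0, 200_000, 1000))  # 200.0
```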

Core Mechanisms

Theorem

Laplace Mechanism

Statement

For a function $f: \mathcal{D} \to \mathbb{R}^d$ with $\ell_1$ sensitivity $\Delta f$, the Laplace mechanism:

$$\mathcal{M}(D) = f(D) + (Z_1, \ldots, Z_d), \quad Z_i \sim \text{Lap}(0, \Delta f / \varepsilon)$$

satisfies $(\varepsilon, 0)$-differential privacy (pure DP).

Intuition

Add Laplace noise scaled to the sensitivity. If changing one record changes $f$ by at most $\Delta f$, then noise with scale $\Delta f / \varepsilon$ makes it hard to tell which dataset produced the output. Smaller $\varepsilon$ (more privacy) requires more noise.

Proof Sketch

For neighboring $D, D'$: $\frac{P(\mathcal{M}(D) = x)}{P(\mathcal{M}(D') = x)} = \prod_i \frac{\exp(-|x_i - f_i(D)| \varepsilon / \Delta f)}{\exp(-|x_i - f_i(D')| \varepsilon / \Delta f)}$. By the triangle inequality on $|x_i - f_i(D)| - |x_i - f_i(D')|$ and the bound $\|f(D) - f(D')\|_1 \leq \Delta f$, this ratio is at most $\exp(\varepsilon)$.

Why It Matters

The Laplace mechanism is the simplest and most widely used DP mechanism. It achieves pure DP ($\delta = 0$) with a clear noise-privacy tradeoff. For counting queries (sensitivity 1), adding $\text{Lap}(1/\varepsilon)$ noise provides $\varepsilon$-DP.
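
A minimal sketch of the Laplace mechanism for a counting query; the true count and the epsilon values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mechanism(true_value, sensitivity, epsilon):
    # Release true_value + Lap(0, sensitivity / epsilon): (epsilon, 0)-DP.
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

true_count = 137  # counting query: l1 sensitivity 1
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5))
print(laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.05))  # more noise
```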

Failure Mode

The Laplace mechanism uses global sensitivity (worst case over all possible databases). If one unusual record has disproportionate influence on $f$, the sensitivity is large and the noise overwhelms the signal. Local sensitivity or the smooth sensitivity framework can help but requires more sophisticated analysis.

Gaussian Mechanism

For $(\varepsilon, \delta)$-DP with $\delta > 0$, a standard textbook calibration uses Gaussian noise:

$$\mathcal{M}(D) = f(D) + \mathcal{N}(0, \sigma^2 I), \quad \sigma \geq \frac{\Delta_2 f \sqrt{2 \ln(1.25/\delta)}}{\varepsilon}$$

where $\Delta_2 f$ is the $\ell_2$ sensitivity. The Gaussian mechanism is preferred when composing many mechanisms because Gaussian-style accountants are much tighter than repeated Laplace analyses. The bound above is convenient but not sharp: it is the older high-privacy calibration (stated for $\varepsilon < 1$), and modern libraries typically use the analytic Gaussian mechanism or an RDP-based accountant instead.
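
A quick numeric check of this calibration (a sketch: the assertion encodes the classic theorem's restriction to $\varepsilon < 1$; the analytic Gaussian mechanism removes that restriction and yields smaller $\sigma$).

```python
import math

def classic_gaussian_sigma(l2_sensitivity, epsilon, delta):
    # sigma >= Delta_2 f * sqrt(2 ln(1.25/delta)) / epsilon, valid for epsilon < 1.
    assert 0 < epsilon < 1 and 0 < delta < 1
    return l2_sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

print(classic_gaussian_sigma(l2_sensitivity=1.0, epsilon=0.5, delta=1e-5))  # ~9.7
```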

Composition

Theorem

Advanced Composition Theorem

Statement

If mechanisms $\mathcal{M}_1, \ldots, \mathcal{M}_k$ each satisfy $(\varepsilon, \delta)$-DP, then the composed mechanism $(\mathcal{M}_1, \ldots, \mathcal{M}_k)$ satisfies $(\varepsilon', k\delta + \delta')$-DP where:

$$\varepsilon' = \sqrt{2k \ln(1/\delta')} \cdot \varepsilon + k\varepsilon(e^\varepsilon - 1)$$

for any $\delta' > 0$.

Intuition

Naive composition says $k$ applications of $\varepsilon$-DP give $k\varepsilon$-DP (linear growth). Advanced composition shows privacy degrades as $O(\sqrt{k}\,\varepsilon)$ instead of $O(k\varepsilon)$. This square root improvement is critical for iterative algorithms like DP-SGD that compose thousands of mechanism invocations.
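
A small numeric sketch of that comparison, plugging illustrative values into the formula above:

```python
import math

def advanced_composition(eps, k, delta_prime):
    # epsilon' = sqrt(2 k ln(1/delta')) * eps + k * eps * (e^eps - 1)
    return math.sqrt(2 * k * math.log(1 / delta_prime)) * eps + k * eps * (math.exp(eps) - 1)

eps, k, delta_prime = 0.1, 100, 1e-5
print(k * eps)                                    # naive composition: 10.0
print(advanced_composition(eps, k, delta_prime))  # ~5.85; the gap widens as k grows
```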

Proof Sketch

Model the privacy loss $\log(P(\mathcal{M}(D) \in S) / P(\mathcal{M}(D') \in S))$ as a random variable. Each mechanism contributes a bounded privacy loss. By a martingale argument (Azuma-Hoeffding on the privacy loss random variable), the total privacy loss concentrates around its mean with standard deviation $O(\varepsilon\sqrt{k})$.

Why It Matters

DP-SGD runs for thousands of training steps, each with a private gradient computation. Without advanced composition, the total privacy budget would be impractical (thousands of times the per-step $\varepsilon$). With it, the budget grows as $\sqrt{T}$ in the number of steps $T$, making private deep learning feasible.

Failure Mode

Advanced composition still gives loose bounds for large kk. Tighter accounting via Renyi DP or the moments accountant (Abadi et al., 2016) provides better bounds in practice. The numerical privacy accountant (google/dp-accounting) gives the tightest known bounds.
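
To illustrate why tighter accounting helps, here is a minimal sketch (not the moments accountant itself, and ignoring subsampling) of Rényi DP accounting for $k$ Gaussian mechanisms with $\ell_2$ sensitivity 1 and noise scale $\sigma$: one such mechanism satisfies RDP of order $\alpha$ with parameter $\alpha/(2\sigma^2)$, RDP parameters add under composition, and RDP converts to $(\varepsilon, \delta)$-DP via $\varepsilon = \varepsilon_\alpha + \log(1/\delta)/(\alpha - 1)$ (Mironov, 2017).

```python
import math

def gaussian_rdp(alpha, sigma, k):
    # RDP of k composed Gaussian mechanisms (l2 sensitivity 1) at order alpha.
    return k * alpha / (2 * sigma ** 2)

def rdp_to_dp(alpha, rdp_eps, delta):
    # Standard conversion from order-alpha RDP to (eps, delta)-DP.
    return rdp_eps + math.log(1 / delta) / (alpha - 1)

def accountant_epsilon(sigma, k, delta, orders=range(2, 257)):
    # Practical accountants minimize over a grid of RDP orders.
    return min(rdp_to_dp(a, gaussian_rdp(a, sigma, k), delta) for a in orders)

print(accountant_epsilon(sigma=10.0, k=100, delta=1e-5))  # ~5.3
```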

Proposition

Privacy Amplification by Subsampling

Statement

If a mechanism $\mathcal{M}$ is $\varepsilon$-DP and we first Poisson-subsample the dataset at rate $q$, then the resulting mechanism is

$$\left(\log\!\left(1 + q(e^\varepsilon - 1)\right), 0\right)\text{-DP}.$$

For small $\varepsilon$, this is approximately $q\varepsilon$ rather than $\varepsilon$.
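
A one-line sketch of this amplification bound with illustrative numbers:

```python
import math

def amplified_epsilon(eps, q):
    # Poisson subsampling at rate q applied before a pure eps-DP mechanism.
    return math.log(1 + q * (math.exp(eps) - 1))

print(amplified_epsilon(eps=0.1, q=0.01))  # ~0.00105, roughly q * eps
print(amplified_epsilon(eps=2.0, q=0.01))  # ~0.062; amplification weakens for large eps
```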

Intuition

Most people are not sampled on most iterations. If your record is absent from a round, that round contributes no privacy loss for you. Random subsampling therefore amplifies privacy before we even add up the losses across many steps.

Why It Matters

This is one of the key reasons DP-SGD is practical at all. Each training step touches only a small random fraction of the dataset, so the per-step privacy cost is much smaller than if we privatized a full-dataset gradient every time.

Failure Mode

This exact formula is for pure DP with Poisson subsampling. For the subsampled Gaussian mechanism used in deep learning, practitioners normally switch to Renyi DP, moments accountants, or numerical accountants rather than chaining this formula with textbook advanced composition.

Post-Processing Immunity

Proposition

Post-Processing Immunity

Statement

If $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-DP and $g: \mathcal{R} \to \mathcal{R}'$ is any (possibly randomized) function, then $g \circ \mathcal{M}$ also satisfies $(\varepsilon, \delta)$-DP.

Intuition

Once you have a private output, you can compute anything you want from it without degrading privacy. Training a model with DP-SGD and then deploying that model for inference does not consume additional privacy budget. Privacy is a property of how the output was generated, not what you do with it.

Proof Sketch

For deterministic $g$ and any set $S'$ in the range of $g$, let $S = g^{-1}(S')$. Then $P(g(\mathcal{M}(D)) \in S') = P(\mathcal{M}(D) \in S) \leq e^\varepsilon P(\mathcal{M}(D') \in S) + \delta = e^\varepsilon P(g(\mathcal{M}(D')) \in S') + \delta$. The randomized case follows by averaging over $g$'s internal randomness.

Why It Matters

This means you only need to ensure privacy at the mechanism level. All downstream analysis, model deployment, prediction serving, and reporting are automatically private. This greatly simplifies the privacy analysis of complex ML pipelines.

Failure Mode

Post-processing immunity requires that gg does not access the original dataset DD. If you use DD again (e.g., to evaluate the private model on the training set), that counts as additional composition and consumes more privacy budget.

DP-SGD

DP-SGD (Abadi et al., 2016) makes stochastic gradient descent differentially private via two modifications:

  1. Gradient clipping: clip each per-example gradient to norm $C$: $\bar{g}_i = g_i \cdot \min(1, C / \|g_i\|_2)$. This bounds the sensitivity of the clipped sum to at most $C$ under add/remove adjacency, so the clipped average has sensitivity at most $C/B$ for batch size $B$. Under replace-one adjacency the constant becomes $2C$ (or $2C/B$ for the average).

  2. Noise addition: add Gaussian noise calibrated to the clipped sensitivity: $\tilde{g} = \frac{1}{B}\left(\sum_i \bar{g}_i + \mathcal{N}(0, \sigma^2 C^2 I)\right)$.

The noise scale $\sigma$ is set based on the target $(\varepsilon, \delta)$ and the number of training steps, using a moments accountant / RDP accountant for tight composition. The right mental model is "subsampled Gaussian mechanism + accountant," not "naively add the same per-step $\varepsilon$ 10,000 times."
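
A minimal NumPy sketch of one DP-SGD update, assuming per-example gradients are already available as rows of a matrix (real frameworks obtain them via per-sample autodiff) and that the noise multiplier comes from a privacy accountant; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier):
    # per_example_grads has shape (B, d): one gradient row per example.
    B = per_example_grads.shape[0]

    # 1. Clip each per-example gradient to l2 norm at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # 2. Sum, add Gaussian noise with std noise_multiplier * clip_norm, then average.
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    return params - lr * noisy_sum / B

params = np.zeros(10)
grads = rng.normal(size=(32, 10))  # stand-in for per-example gradients
params = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0, noise_multiplier=1.1)
```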

The Privacy-Utility Tradeoff

More privacy (smaller $\varepsilon$) requires more noise, which degrades model accuracy. This tradeoff is inherent, not an artifact of specific mechanisms. For DP-SGD:

  • $\varepsilon \sim 1$: strong privacy, moderate utility loss for large datasets, significant loss for small datasets
  • $\varepsilon \sim 10$: meaningful privacy against practical attacks, small utility loss
  • $\varepsilon \sim 100$: weak formal guarantees, minimal utility loss

The right $\varepsilon$ depends on the threat model. There is no universal threshold for "good enough" privacy.

Common Confusions

Watch Out

DP protects individuals, not aggregate statistics

DP guarantees that no single individual's data significantly affects the output. It does not hide aggregate properties of the dataset. A DP mechanism can still accurately report that 60% of patients have condition X. It just cannot reveal whether you specifically are in the dataset.

Watch Out

Epsilon is not directly interpretable as a probability

$\varepsilon = 1$ does not mean "1% privacy loss." It means the log of the probability ratio for any output event changes by at most 1 when one record is added or removed. For $\varepsilon = 1$, the probability ratio is at most $e^1 \approx 2.72$. Translating $\varepsilon$ into concrete attack success probabilities requires specifying the prior and the attack model.

Watch Out

DP-SGD noise is per step, not per model

Each gradient step adds noise independently. The total privacy cost is the composition of all steps, not the noise in a single step. A model trained for 1000 steps has a much larger total $\varepsilon$ than one trained for 100 steps, all else equal.

Watch Out

Example-level DP and user-level DP are different guarantees

In centralized DP-SGD, the protected unit is usually one example. In federated or grouped settings, the protected unit may be one user and therefore an entire local dataset. User-level DP is much stronger and usually requires substantially more noise or stronger clipping.

Summary

  • $(\varepsilon, \delta)$-DP bounds how much any single record can change the output distribution; smaller $\varepsilon$ is more private
  • The Laplace mechanism achieves pure DP; the Gaussian mechanism achieves approximate DP with better composition properties
  • Advanced composition gives $O(\sqrt{k})$ privacy degradation over $k$ steps instead of $O(k)$
  • Random subsampling amplifies privacy, which is why DP-SGD can run many noisy steps at all
  • Post-processing immunity means all downstream computation on a DP output is automatically private
  • DP-SGD clips per-example gradients and adds Gaussian noise to make neural network training private
  • The privacy-utility tradeoff is fundamental and task-dependent

Exercises

Exercise (Core)

Problem

You want to privately release the average salary of $n = 1000$ employees, where each salary is in $[0, 200000]$. The query $f(D) = \frac{1}{n}\sum_i x_i$ has $\ell_1$ sensitivity $\Delta f = 200000 / 1000 = 200$. What noise scale is needed for $\varepsilon = 1$ using the Laplace mechanism?

Exercise (Advanced)

Problem

You run DP-SGD for $T = 10000$ steps, each satisfying $(0.01, 10^{-5})$-DP. Using naive composition, what is the total privacy guarantee? Using advanced composition with $\delta' = 10^{-5}$, what is the approximate $\varepsilon'$?

References

Canonical:

  • Dwork, McSherry, Nissim, and Smith, Calibrating Noise to Sensitivity in Private Data Analysis (TCC 2006). The foundational Laplace-mechanism paper.
  • Dwork and Roth, The Algorithmic Foundations of Differential Privacy (2014), Chapters 2-3. The standard fundamentals reference.
  • Kairouz, Oh, and Viswanath, The Composition Theorem for Differential Privacy (ICML 2015). Sharp composition results beyond the classical theorem.

Current:

  • Abadi et al., Deep Learning with Differential Privacy (CCS 2016), Sections 2-3. The DP-SGD / moments-accountant paper.
  • Mironov, Renyi Differential Privacy (CSF 2017). The key accounting framework behind modern privacy analyses.
  • Balle and Wang, Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising (ICML 2018). Why the textbook Gaussian calibration is loose.
  • Wang, Balle, and Kasiviswanathan, Subsampled Renyi Differential Privacy and Analytical Moments Accountant (AISTATS 2019). Tight subsampled accounting for iterative private learning.

Next Topics

  • Federated learning: where user-level privacy guarantees, secure aggregation, and DP accounting meet distributed optimization
  • Hypothesis testing for ML: privacy loss and membership inference are closely tied to distinguishability

Last reviewed: April 24, 2026
