Foundations
Basic Logic and Proof Techniques
The proof vocabulary behind serious mathematics and ML theory: implication, quantifiers, contradiction, contrapositive, induction, construction, cases, and counterexamples.
Why This Matters
Every theorem on this site is built from a small set of proof moves. If those moves feel mysterious, learning theory, optimization, probability, algorithms, and statistics all become harder than they need to be.
The goal is not to memorize proof names. The goal is to look at a statement and ask:
- What are the assumptions?
- What must be shown?
- Which quantifier or implication is the bottleneck?
- Would a direct proof, contrapositive, contradiction, induction, construction, or counterexample make the structure simpler?
Propositions And Implications
Proposition
A proposition is a statement that is either true or false. "The model overfits" is not precise enough by itself. "The training error is lower than the validation error by at least " is closer to a mathematical proposition.
Implication
The implication means: whenever is true, must be true. It is false only in the case where is true and is false.
| true | true | true |
| true | false | false |
| false | true | true |
| false | false | true |
The last two rows are called vacuous truth. If the premise is false, the implication has not been violated.
Contrapositive
The contrapositive of is . The original implication and its contrapositive are logically equivalent.
Converse and Inverse
The converse of is . The inverse is . Neither is equivalent to the original implication in general.
Quantifiers
Universal Quantifier
means holds for every element of .
Existential Quantifier
means at least one element of satisfies .
Negating quantifiers is one of the highest-yield proof skills:
For nested quantifiers, negate from the outside inward. For example, the negation of
is
That is the logical shape of discontinuity.
Choosing A Proof Technique
| Statement shape | First proof move to try | Why |
|---|---|---|
| direct proof | assume , derive | |
| where is concrete | contrapositive | derive from |
| "No object has property X" | contradiction | assume a witness exists and break something |
| "There exists..." | construction | build the witness |
| "For all natural numbers ..." | induction | prove base case and step |
| "For all objects..." and claim seems false | counterexample search | one counterexample disproves it |
| multiple regimes | cases | cover exhaustive alternatives |
Direct Proof
Assume the hypotheses and derive the conclusion.
The sum of two even integers is even
Let and for integers . Then , so is even.
Direct proof is usually best when definitions immediately unpack into useful algebra.
Contrapositive
To prove , prove .
If n squared is even, then n is even
Prove the contrapositive: if is odd, then is odd. Write . Then
which is odd.
Contrapositive is especially useful when the conclusion has a simple negation, such as "not injective," "not continuous," or "not linearly independent."
Contradiction
To prove a statement , assume and derive an impossibility.
The square root of 2 is irrational
Assume where are integers with no common factor. Then , so is even. Write . Then , so , hence is even. This contradicts the assumption that and had no common factor.
Contradiction is powerful, but it can hide structure. If a contrapositive proof works cleanly, it is often more informative.
Induction
Weak Induction
To prove for all , prove the base case and prove for every .
Strong Induction
To prove for all , prove enough base cases and prove that together imply .
Principle Of Mathematical Induction
Statement
If is true and for all , then is true for every .
Intuition
The base case starts the chain. The inductive step says every true case forces the next case. Together they cover all later natural numbers.
Proof Sketch
Suppose some fails. By the well-ordering principle, the set of failing has a smallest element . Since the base case holds, . Then is true by minimality of , so the inductive step gives , a contradiction.
Why It Matters
Induction proves formulas for sums, loop invariants, recursion correctness, convergence bounds after steps, and dynamic programming recurrences.
Failure Mode
The most common invalid induction proof assumes what it needs to prove. The inductive hypothesis gives only for the previous case, not for every future case.
Construction, Cases, And Counterexamples
Constructive Proof
To prove with property , explicitly define such an and verify the property.
Proof By Cases
Split the situation into exhaustive cases and prove the claim in each case. The cases must cover everything and should not silently overlap in a way that loses logic.
Counterexample
To disprove a universal claim , find one such that .
Counterexamples are not "negative examples." They are surgical logic tools. A single valid counterexample defeats a universal statement.
ML And Math Examples
| Area | Proof move you will see |
|---|---|
| generalization bounds | union bound, concentration, contradiction, quantifier control |
| convex optimization | direct proof from definitions, separating hyperplanes, KKT implications |
| linear algebra | construction of bases, contradiction for independence, induction on dimension |
| probability | law-of-total-probability decompositions, counterexamples to independence claims |
| algorithms | induction on iterations, loop invariants, cases by input branch |
| learning theory | contrapositive and reductions for lower bounds |
Common Confusions
Contrapositive is not the converse
The contrapositive of is and is equivalent to the original. The converse is a different claim.
A proof by examples is not a proof of a universal statement
Checking many examples can build intuition, but it does not prove . A universal proof must cover every allowed object.
Induction needs a base case
The step can be true even if no case ever starts. The base case is what activates the chain.
Negating an implication changes it into an and statement
The negation of is , not .
Quick Drills
Negate: "For every model, there exists a dataset on which it performs well."
Answer: There exists a model such that for every dataset, it does not perform well.
Identify the proof move: "Assume the algorithm does not terminate; then construct an infinite strictly decreasing sequence of nonnegative integers."
Answer: contradiction, usually with well-ordering or a descent measure.
Find the flaw: "The theorem holds for , so it holds for all ."
Answer: examples are not induction. You still need a step proving .
Contrapositive: "If a matrix has full column rank, then its columns are linearly independent."
Contrapositive: If the columns are linearly dependent, then the matrix does not have full column rank.
What To Remember
- An implication fails only when the premise is true and the conclusion is false.
- Contrapositive is equivalent to the original; converse is not.
- Negating quantifiers swaps and .
- Induction needs both a base case and a valid step.
- To disprove "for all," find one counterexample.
- Good proof technique follows statement structure, not vibes.
Exercises
Problem
Write the negation of: "For every , there exists such that implies ."
Problem
Prove by induction that for all .
Problem
Disprove: "If , then and ."
Problem
Give a proof template for a PAC-style statement of the form: with probability at least , every hypothesis in a finite class satisfies a deviation bound.
References
Canonical:
- Velleman, How to Prove It, 3rd ed., Chapters 1-6.
- Hammack, Book of Proof, 3rd ed., Chapters 1-10.
- Bloch, Proofs and Fundamentals (2011), Chapters 1-3.
Further:
- Rosen, Discrete Mathematics and Its Applications, logic and proof chapters.
- Sipser, Introduction to the Theory of Computation, proof examples in automata and computability.
Next Topics
- Sets, functions, and relations: the objects most proof statements talk about.
- Counting and combinatorics: proof by bijection, double counting, and induction.
- SAT, SMT, and automated reasoning: what happens when logic becomes computation.
Last reviewed: April 25, 2026
Canonical graph
Required before and derived from this topic
These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.
Required prerequisites
0No direct prerequisites are declared; this is treated as an entry point.
Derived topics
16- Sets, Functions, and Relationslayer 0A · tier 1
- Concentration Inequalitieslayer 1 · tier 1
- Hoeffding's Lemmalayer 1 · tier 1
- PAC Learning Frameworklayer 1 · tier 1
- Information Retrieval Foundationslayer 2 · tier 1
+11 more on the derived-topics page.
Graph-backed continuations