QED — Historical (Dirac-Equation) Route

This is the route by which Dirac, Heisenberg, Pauli, and Fermi originally arrived at QED, and the way most introductory texts (Bjorken–Drell, Sakurai's Advanced QM, Griffiths' particle book, the early chapters of Peskin–Schroeder) build it up. Compared with the modern gauge-theory derivation, it is conceptually simpler — it does not presuppose the gauge principle — but pedagogically backward: gauge invariance appears at the end as a consequence of minimal coupling, rather than at the beginning as a postulate.

To make the logical structure transparent, every step below is tagged as one of:

(Postulate) — a primitive assumption of this formulation. These are the genuine physical inputs.
(Definition) — a mathematical object or notation introduced for use later.
(Derived) — a result that follows from the postulates plus standard mathematics or QM.
(Heuristic) — a step originally presented as physically motivated guesswork; later cleaned up by deeper theory.

The historical route rests on three postulates:

#	Postulate	Modern reinterpretation
H1	Relativistic single-particle wave equation must be first-order in derivatives	A choice of representation; equivalent to picking the Dirac field
H2	Minimal coupling $p_{μ} \to p_{μ} - e A_{μ}$	A theorem in the gauge-principle approach
H3	Second quantization with anticommutators (after the wavefunction interpretation breaks down)	Forced by spin–statistics

Everything else — including gauge invariance — is derived.

0. Preliminaries

The general mathematical machinery (Hilbert spaces, Lorentz/Poincaré groups, Lagrangian field theory, Fock space, mode expansions, regularization/renormalization) is collected in the QM and QFT preliminaries pages. This section covers only the prerequisite mathematics and classical physics specific to the wave-equation route to QED: gamma-matrix notation and the Lagrangian formulation of classical electromagnetism. Motivations, derived consequences, and the perturbative machinery used in §5 are deliberately not here — they appear next to the postulates or derivations that use them.

0.1 Gamma Matrices, Dirac Spinors, Notation

The Dirac gamma matrices $γ^{μ}$ ( $μ = 0, 1, 2, 3$ ) are $4 \times 4$ matrices satisfying the Clifford algebra

$\{γ^{μ}, γ^{ν}\} = 2 η^{μν} 1.$

This is the defining relation of the algebra $Cl (1, 3)$ . For the abstract definition (any vector space with a quadratic form), classification (Bott periodicity), Spin groups, and spinor-representation theory in arbitrary dimensions, see math/clifford-algebra.md. For this document we just need the four matrices and the bilinear-covariant notation below.

Common explicit representations:

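Dirac (standard) representation: $γ^{0} = \begin{pmatrix} 1 & 0 \\ 0 & - 1 \end{pmatrix}$ , $γ^{i} = \begin{pmatrix} 0 & σ^{i} \\ - σ^{i} & 0 \end{pmatrix}$ (each entry a $2 \times 2$ block), with $σ^{i}$ the Pauli matrices. Convenient for the non-relativistic limit.
Weyl (chiral) representation: $γ^{0} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ , $γ^{i} = \begin{pmatrix} 0 & σ^{i} \\ - σ^{i} & 0 \end{pmatrix}$ . Convenient for massless / high-energy limits and for separating left- and right-handed components.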
Majorana representation: all $γ^{μ}$ purely imaginary; useful when discussing real (Majorana) fermion fields.

A Dirac spinor $ψ$ is a four-component object on which the $γ^{μ}$ act. Useful definitions:

Dirac adjoint: $\bar{ψ} \equiv ψ^{†} γ^{0}$ . The bilinear $\bar{ψ} ψ$ is a Lorentz scalar; $\bar{ψ} γ^{μ} ψ$ is a vector; $\bar{ψ} γ^{μ} γ^{ν} ψ$ has scalar + antisymmetric tensor parts; etc.
Feynman slash: $\not{a} \equiv γ^{μ} a_{μ}$ for any four-vector $a^{μ}$ . Then $\not{a} \not{b} + \not{b} \not{a} = 2 a \cdot b$ .
Chirality matrix: $γ^{5} \equiv i γ^{0} γ^{1} γ^{2} γ^{3}$ , which anticommutes with all $γ^{μ}$ and squares to $1$ . Projectors $P_{L, R} = \frac{1}{2} (1 \mp γ^{5})$ project onto left- and right-handed components.
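The algebraic claims above can be checked numerically. The following is a minimal pure-Python sketch, assuming the standard textbook form of the Dirac representation ($γ^{0} = \mathrm{diag}(1, - 1)$ in $2 \times 2$ blocks, $γ^{i}$ with $σ^{i}$ in the upper-right block and $- σ^{i}$ in the lower-left) and the metric signature $(+, -, -, -)$ . All helper names (`matmul`, `block`, `close`, and so on) are illustrative and not part of any library.

```python
# Pure-Python check of the Clifford algebra and gamma^5 in the Dirac representation.
# Matrices are nested lists; helper names are illustrative only.

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I4 = block(I2, Z2, Z2, I2)

# gamma^0 = diag(1, -1); gamma^i = [[0, sigma^i], [-sigma^i, 0]] (assumed convention)
gamma = [block(I2, Z2, Z2, scale(-1, I2))]
gamma += [block(Z2, s, scale(-1, s), Z2) for s in sigma]
eta = [1, -1, -1, -1]  # diagonal of the metric, signature (+, -, -, -)

# {gamma^mu, gamma^nu} = 2 eta^{mu nu} 1
clifford_ok = all(
    close(anticommutator(gamma[mu], gamma[nu]),
          scale(2 * eta[mu] if mu == nu else 0, I4))
    for mu in range(4) for nu in range(4)
)

# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3
gamma5 = scale(1j, matmul(matmul(gamma[0], gamma[1]), matmul(gamma[2], gamma[3])))
gamma5_squares_to_one = close(matmul(gamma5, gamma5), I4)
gamma5_anticommutes = all(close(anticommutator(gamma5, g), scale(0, I4)) for g in gamma)

print(clifford_ok, gamma5_squares_to_one, gamma5_anticommutes)
```

Swapping in the Weyl-representation $γ^{0}$ (off-diagonal identity blocks) should leave all three checks unchanged, since the relations are representation-independent.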

0.2 Classical Electromagnetism in Lagrangian Form

Classical electromagnetism has the Lagrangian density

$L_{EM} = - \frac{1}{4} F_{μν} F^{μν} - j^{μ} A_{μ}, \qquad F_{μν} = \partial_{μ} A_{ν} - \partial_{ν} A_{μ},$

whose Euler–Lagrange equations are the inhomogeneous Maxwell equations $\partial_{μ} F^{μν} = j^{ν}$ . The four-potential $A_{μ}$ is defined only up to a classical gauge transformation

$A_{μ} \to A_{μ} + \partial_{μ} Λ,$

with $Λ (x)$ an arbitrary scalar; this leaves $F_{μν}$ — and therefore the physical electric and magnetic fields — invariant. This redundancy is already present at the classical level, before any quantum mechanics.

The classical equation of motion for a point charge $e$ in an external $A_{μ}$ follows from the Lagrangian $L = - m \sqrt{1 - \dot{\mathbf{x}}^{2}} - e A_{μ} \dot{x}^{μ}$ , equivalent to the canonical momentum substitution

$p_{μ} ⟶ p_{μ} - e A_{μ} .$

This is the classical input that Dirac borrowed when postulating minimal coupling (H2, §2.1 below).

1. The Dirac Equation

Scope of this route. Postulates H1+H2+H3 are sufficient to construct the QED Lagrangian, but they are not sufficient to derive the general Poincaré classification of particles. H1 only handles spin- $\frac{1}{2}$ — and only via the first-order-equation algebraic accident — so particles of other spins (scalar, vector, gravitino, graviton, ...) require separate analogous wave-equation postulates, case by case. The general classification, which says exactly which spins / helicities are possible and how they transform under $P_{+}^{↑}$ , requires elevating the Poincaré group to a primitive symmetry acting on a Hilbert space and analyzing its unitary irreducible representations — the Wigner classification (1939). The historical route never performs representation theory of $P_{+}^{↑}$ , so this classification is strictly more general than what H1+H2+H3 can reach. See foundations-modern.md §1.1 — One-particle states (Theorem — Wigner classification) for the modern derivation, and QFT/preliminaries.md § Wigner's Classification for the standalone reference.

1.1 Motivation: Klein–Gordon and the Negative-Probability Problem

The simplest relativistic wave equation is obtained by quantizing the relativistic dispersion $E^{2} = p^{2} + m^{2}$ via $E \to i \partial_{t}$ , $p \to - i \nabla$ :

$(\partial^{μ} \partial_{μ} + m^{2}) ϕ (x) = 0 \qquad \text{(Klein–Gordon)} .$

It is Lorentz-invariant but second-order in time, which has two unwanted consequences for a single-particle wavefunction interpretation:

The conserved current $j^{μ} = i (ϕ^{*} \partial^{μ} ϕ - (\partial^{μ} ϕ^{*}) ϕ)$ has $j^{0}$ that can be negative — incompatible with a probability density.
It admits both positive- and negative-energy solutions $E = \pm \sqrt{p^{2} + m^{2}}$ with no obvious way to discard the negative-energy ones.

Both problems motivated Dirac to seek a first-order relativistic equation — the postulate H1 below.

(In modern QFT, KG is fine: it is the equation of motion of a free scalar field operator, and "negative probability" was a category error from insisting on a wavefunction interpretation.)

Other routes to Klein–Gordon. The canonical-substitution sketch above is the historical route ("Route A"). KG can equivalently be derived as the Euler–Lagrange equation of the free scalar Lagrangian $L_{ϕ} = \frac{1}{2} (\partial ϕ)^{2} - \frac{1}{2} m^{2} ϕ^{2}$ (Lagrangian route, "Route B"), or as a theorem — the position-space form of the first-Poincaré-Casimir constraint $P^{μ} P_{μ} = m^{2}$ on a spin-0 Wigner irrep (representation-theoretic route, "Route C"). See QFT/preliminaries.md § Worked example: free real scalar field and the Klein–Gordon equation for the three-route comparison. The first-order/probability-current concerns motivating H1 are specific to Route A; in the modern routes there is no such problem (KG is the EOM of a scalar field operator, never a single-particle wavefunction).

1.2 Demand a first-order relativistic wave equation (Postulate H1)

Seek a Lorentz-invariant wave equation for a single relativistic spin- $\frac{1}{2}$ particle that is first-order in $\partial_{μ}$ , so as to admit a positive-definite probability current and avoid the negative-norm problems of §1.1.

This is a postulate in two senses: it picks first-order over second-order (a real physical assumption motivated by probability-current concerns), and it implicitly chooses the spinor representation of the Lorentz group over scalar or vector ones.

First quantization: the unstated state-space framework

"Single relativistic spin- $\frac{1}{2}$ particle" is not a neutral phrase — it commits the historical narrative to the framework of first quantization without ever naming it. For symmetry with §3.3 (where second quantization is performed explicitly), it is worth stating what's been silently assumed.

First quantization here means: take a relativistic single particle and put it on the Hilbert space

$H_{1} ≅ L^{2} (R^{3}) \otimes V_{spin},$

a tensor product of two factors:

$L^{2} (R^{3})$ — the standard configuration-space Hilbert space of non-relativistic single-particle QM (square-integrable wavefunctions on space; see QM/preliminaries.md). The historical route carries this factor over by analogy when going relativistic — it is not itself derived from anything in this document.
$V_{spin}$ — a finite-dimensional internal space carrying the spin / Lorentz-representation degrees of freedom of $ψ$ . This is the new ingredient demanded by representing the Lorentz group on a single-particle wavefunction (a scalar wavefunction would take $V_{spin} = C$ trivially).

The dimension of $V_{spin}$ is not fixed by H1 alone. It will be determined in §1.3 by the algebraic consequences of the first-order ansatz: requiring that the operator $i γ^{μ} \partial_{μ} - m$ , multiplied by its conjugate $i γ^{μ} \partial_{μ} + m$ , reproduce the Klein–Gordon operator forces the Clifford algebra $\{γ^{μ}, γ^{ν}\} = 2 η^{μν} 1$ , whose smallest faithful representation is by $4 \times 4$ matrices, so $V_{spin} ≅ C^{4}$ and

$H_{1} ≅ L^{2} (R^{3}) \otimes C^{4} .$

For the rest of §1.2 the precise dimension does not matter; only the product structure does.

The inner product is

$⟨ ψ_{1} ∣ ψ_{2} ⟩ = \int d^{3} x ψ_{1}^{†} (x) ψ_{2} (x) .$

A state is a single vector $∣ ψ (t)⟩ \in H_{1}$ , equivalently a wavefunction $ψ (x, t) \in V_{spin}$ . Time evolution is governed (in the Schrödinger picture) by the Dirac Hamiltonian

$\hat{H}_{Dirac} = α \cdot \hat{p} + β m, \qquad α = γ^{0} γ, \qquad β = γ^{0},$

via $i ℏ \partial_{t} ∣ ψ ⟩ = \hat{H}_{Dirac} ∣ ψ ⟩$ , which is just the Dirac equation $(i γ^{μ} \partial_{μ} - m) ψ = 0$ in disguise.

This is the framework that H1 (here) and H2 (§2.1) operate inside. Particle number is fixed at $N = 1$ : there is no vacuum, no clean description of antiparticles, and no operator that can change the particle count. Only at §3.3 (postulate H3) is this framework replaced by Fock-space second quantization, in which $ψ$ becomes an operator on a multi-particle state space. The relationship $H_{1} \subset F$ and the comparison of pre- and post-quantization states are laid out in QFT/fock-space-inventory.md §0.

Terminology note. "First" and "second" quantization are unfortunate names — there is only one Hilbert-space construction (first quantization); the second step is really field quantization. The historical names have stuck.

Loose end: Lorentz covariance. The position-space wavefunction $ψ (x, t) \in V_{spin}$ singles out a preferred spatial slice (every frame has its own $H_{1}$ ), and so this state space is not manifestly Lorentz-covariant. The genuinely covariant single-particle space is the momentum-space Wigner irrep, where Lorentz transformations act unitarily. The Fourier transform relating the two carries a $1/(2 ω_{p})$ measure mismatch that makes the position-space "wavefunction" not quite a probability amplitude in the relativistic sense; this is connected to the Newton–Wigner localization issues. The historical route ignores all of this; the modern treatment in foundations-modern.md § 1.1.3 (The natural basis: momentum and spin, not position) addresses it head-on.

1.3 Derivation of the Clifford algebra and gamma matrices (Derived)

Writing the ansatz $(i γ^{μ} \partial_{μ} - m) ψ = 0$ and requiring that acting on it with the conjugate operator $(i γ^{ν} \partial_{ν} + m)$ reproduce the Klein–Gordon equation $(\partial^{μ} \partial_{μ} + m^{2}) ψ = 0$ forces

$\{γ^{μ}, γ^{ν}\} = 2 η^{μν} 1.$

Pure algebra then shows the smallest faithful representation is by $4 \times 4$ matrices, so $ψ$ has four complex components. None of this is an additional input — it all follows from H1. The explicit matrix representations and bilinear notation are collected in §0.1.

The Clifford algebra abstractly. The relation above is the defining identity of the Clifford algebra $Cl (1, 3)$ : the associative algebra generated by four vectors $e_{μ}$ with $\{e_{μ}, e_{ν}\} = 2 η_{μν} 1$ . As a general construction (any vector space with a quadratic form), Clifford algebras subsume the complex numbers, quaternions, and exterior algebras as special cases; their classification (Bott periodicity), spinor representations, and connection to Spin groups (e.g. $Spin (1, 3) ≅ S L (2, C)$ , the universal cover used in the modern route) are collected in math/clifford-algebra.md. The collapsible block below specializes to this algebra in $d = 4$ .

Why exactly 4 × 4? (counting argument + small-case obstructions)

The claim "smallest faithful representation is $4 \times 4$ " deserves an argument. It follows from (i) the dimension of the Clifford algebra and (ii) a check that the smaller cases concretely fail. The general $d$ -dimensional version of the result, and its connection to the Bott classification, is in math/clifford-algebra.md § 5.

Where the Clifford algebra comes from (recap). Squaring the Dirac operator:

$(i γ^{ν} \partial_{ν} + m) (i γ^{μ} \partial_{μ} - m) ψ = - γ^{ν} γ^{μ} \partial_{ν} \partial_{μ} ψ - m^{2} ψ .$

Because partial derivatives commute, we may symmetrize:

$γ^{ν} γ^{μ} \partial_{ν} \partial_{μ} = \frac{1}{2} \{γ^{μ}, γ^{ν}\} \partial_{μ} \partial_{ν} .$

For this to equal $\partial^{μ} \partial_{μ} = η^{μν} \partial_{μ} \partial_{ν}$ , so that the second-order equation is exactly $(□ + m^{2}) ψ = 0$ , we need $\frac{1}{2} \{γ^{μ}, γ^{ν}\} = η^{μν} 1$ . So the $γ$ 's cannot be numbers: numbers commute, so $\{γ^{0}, γ^{1}\} = 0$ would force $γ^{0} γ^{1} = 0$ , contradicting $(γ^{0})^{2} = 1$ and $(γ^{1})^{2} = - 1$ . They must be matrices.

Counting argument: $n \geq 4$ . The Clifford algebra $Cl (1, 3)$ generated by four anticommuting symbols subject only to $\{γ^{μ}, γ^{ν}\} = 2 η^{μν} 1$ has a canonical basis of antisymmetric products:

Length	Basis elements	Count
0	$1$	$\binom{4}{0} = 1$
1	$γ^{μ}$	$\binom{4}{1} = 4$
2	$γ^{μν} \equiv \frac{1}{2} [γ^{μ}, γ^{ν}], μ < ν$	$\binom{4}{2} = 6$
3	$γ^{μν ρ}, μ < ν < ρ$	$\binom{4}{3} = 4$
4	$γ^{0123} \propto γ^{5}$	$\binom{4}{4} = 1$

Total $\sum_{k} \binom{4}{k} = 2^{4} = 16$ . A representation on $V = C^{n}$ embeds the algebra in $End (V) = M_{n} (C)$ , which has dimension $n^{2}$ . Faithfulness (injectivity) demands the image be 16-dimensional, hence

$n^{2} \geq 16 ⟹ n \geq 4.$

So $n = 1, 2, 3$ are ruled out purely by counting. But it is illuminating to see why the small cases fail concretely.

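$n = 1$ : numbers. Numbers commute, so $\{γ^{0}, γ^{1}\} = 0$ would force $γ^{0} γ^{1} = 0$ , which contradicts $(γ^{0})^{2} = 1$ and $(γ^{1})^{2} = - 1$ . No one-dimensional representation exists at all, faithful or otherwise.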

$n = 2$ : spatial gammas fit, $γ^{0}$ has no room. A tempting attempt in $M_{2} (C)$ : set $γ^{i} \equiv i σ^{i}$ for $i = 1, 2, 3$ . Then

$\{γ^{i}, γ^{j}\} = - \{σ^{i}, σ^{j}\} = - 2 δ^{ij} 1, \quad ✓$

matching the spacelike Clifford relations. The obstruction is $γ^{0}$ : it must satisfy $(γ^{0})^{2} = + 1$ and $\{γ^{0}, γ^{i}\} = 0$ for all $i$ . But:

Lemma. The only $2 \times 2$ complex matrix $X$ with $\{X, σ^{i}\} = 0$ for all $i \in \{1, 2, 3\}$ is $X = 0$ .

Proof. $M_{2} (C)$ is spanned by $\{1, σ^{1}, σ^{2}, σ^{3}\}$ . Write $X = a 1 + b_{k} σ^{k}$ . Then $\{X, σ^{j}\} = 2 a σ^{j} + 2 b_{j} 1 = 0$ forces $a = 0$ and all $b_{j} = 0$ , so $X = 0$ . $□$

There is no nontrivial $γ^{0}$ in $M_{2} (C)$ — the three spatial gammas already exhaust the "anticommuting room", and a fourth generator that anticommutes with all of them cannot exist. (This is the counting argument made concrete: $dim M_{2} = 4$ , but $dim Cl (1, 3) = 16$ , so 12 dimensions' worth of elements must collapse — and you can see which ones get squeezed out first.)

$n = 3$ : ruled out by counting. $dim M_{3} (C) = 9 < 16$ , so no faithful $3 \times 3$ rep can exist. There is also no natural "Pauli-like" structure in $M_{3}$ to attempt.

$n = 4$ : works, and is unique. $dim M_{4} (C) = 16$ — exactly the Clifford-algebra dimension. The general classification gives

$Cl (1, 3)_{C} ≅ M_{4} (C),$

so a faithful 4-dimensional representation exists, is unique up to similarity transformations, and every element of the Clifford algebra is represented by a distinct $4 \times 4$ matrix. The Dirac, Weyl, and Majorana representations in §0.1 are three concrete choices of similarity transform.

The 4 in $C^{4}$ is therefore not a representational accident; it is the unique minimum dimension at which the algebraic defining relation admits a non-collapsing realization in $d = 4$ spacetime dimensions. For the general- $d$ pattern (Bott periodicity, when Majorana/Weyl/Majorana–Weyl spinors exist) see math/clifford-algebra.md § 5 and § 7.
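The counting argument can be verified numerically. The sketch below builds the 16 ordered products of distinct gamma matrices and checks that they are mutually orthogonal under the trace inner product $⟨ A, B ⟩ = tr (A^{†} B)$ , which implies they are linearly independent and therefore span all of $M_{4} (C)$ . It assumes the standard Dirac representation written out entry by entry; for distinct indices the ordered product equals the antisymmetrized product, because distinct gammas anticommute. Helper names are illustrative.

```python
from itertools import combinations

# Explicit Dirac-representation gamma matrices (assumed convention:
# gamma^i = [[0, sigma^i], [-sigma^i, 0]] in 2x2 blocks).
gamma = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(4)) for j in range(4)] for i in range(4)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(4)] for i in range(4)]

def trace(a):
    return sum(a[i][i] for i in range(4))

# One basis element per subset of {0, 1, 2, 3}: the ordered product of those gammas.
basis = []
for k in range(5):
    for idx in combinations(range(4), k):
        prod = I4
        for mu in idx:
            prod = matmul(prod, gamma[mu])
        basis.append(prod)

# Gram matrix under <A, B> = tr(A^dagger B); orthogonality implies independence.
gram = [[trace(matmul(dagger(a), b)) for b in basis] for a in basis]
is_orthogonal = all(
    abs(gram[i][j] - (4 if i == j else 0)) < 1e-12
    for i in range(16) for j in range(16)
)

print(len(basis), is_orthogonal)
```

Because sixteen linearly independent $4 \times 4$ matrices exhaust $M_{4} (C)$ , this is a concrete instance of the isomorphism quoted above, not a proof of the general classification.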

1.4 The Dirac equation (Derived)

The result of §1.2 + §1.3 is the Dirac equation:

$(i γ^{μ} \partial_{μ} - m) ψ (x) = 0.$

At this stage $ψ$ is treated as an ordinary relativistic wavefunction (a single-particle theory), not yet as a quantum field. Its plane-wave solutions (§1.5), conserved current $j^{μ} = \bar{ψ} γ^{μ} ψ$ , and non-relativistic limit (the Pauli equation, with $g = 2$ , §2.4) are all derived consequences.

1.5 Plane-wave solutions and completeness relations (Derived)

Plugging the plane-wave ansatz $ψ (x) = u (p) e^{- i p \cdot x}$ into the free Dirac equation gives the algebraic equation $(\not{p} - m) u (p) = 0$ , with two linearly independent positive-energy solutions $u (p, s)$ for $s = \pm \frac{1}{2}$ . Similarly, $ψ (x) = v (p) e^{+ i p \cdot x}$ gives $(\not{p} + m) v (p) = 0$ , with two negative-energy solutions $v (p, s)$ that, after second quantization (§3.3), describe antiparticles.

Standard normalization: $\bar{u} (p, s) u (p, r) = 2 m δ_{sr}$ , $\bar{v} (p, s) v (p, r) = - 2 m δ_{sr}$ , and the spin sums

$\sum_{s} u (p, s) \bar{u} (p, s) = \not{p} + m, \qquad \sum_{s} v (p, s) \bar{v} (p, s) = \not{p} - m .$

These appear throughout Feynman-diagram calculations (§5) and are the engine behind Casimir's trick (§5.4.2). The negative-energy spectrum signaled by the $v (p, s)$ family is the technical input to the consistency problem of §3.1, which forces the move to second quantization.
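The spin sums can be checked numerically for a sample momentum. The sketch below assumes the standard Dirac representation and one common explicit choice of spinors, $u = (\sqrt{E + m} χ, σ \cdot p χ / \sqrt{E + m})$ and $v = (σ \cdot p χ / \sqrt{E + m}, \sqrt{E + m} χ)$ with $χ$ running over the two unit two-spinors. That parametrization, the sample values of $m$ and $p$ , and all helper names are assumptions of this sketch rather than definitions taken from the text.

```python
import math

# Explicit Dirac-representation gamma matrices (assumed convention).
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]

def outer_bar(w):
    """Return the 4x4 matrix w w_bar, with w_bar = w^dagger gamma^0."""
    w_bar = [sum(complex(w[l]).conjugate() * G0[l][j] for l in range(4)) for j in range(4)]
    return [[w[i] * w_bar[j] for j in range(4)] for i in range(4)]

def spinors(p, m):
    """Positive- and negative-energy spinors for 3-momentum p and mass m."""
    px, py, pz = p
    energy = math.sqrt(px * px + py * py + pz * pz + m * m)
    norm = math.sqrt(energy + m)
    sp = [[pz, px - 1j * py], [px + 1j * py, -pz]]  # sigma . p
    us, vs = [], []
    for chi in ([1, 0], [0, 1]):
        small = [(sp[i][0] * chi[0] + sp[i][1] * chi[1]) / norm for i in range(2)]
        large = [norm * c for c in chi]
        us.append(large + small)  # u = (large, small)
        vs.append(small + large)  # v = (small, large)
    return energy, us, vs

def pslash(energy, p):
    """gamma^mu p_mu = E gamma^0 - p . gamma for signature (+, -, -, -)."""
    px, py, pz = p
    return [[energy * G0[i][j] - px * G1[i][j] - py * G2[i][j] - pz * G3[i][j]
             for j in range(4)] for i in range(4)]

def spin_sum(ws):
    total = [[0j] * 4 for _ in range(4)]
    for w in ws:
        term = outer_bar(w)
        total = [[total[i][j] + term[i][j] for j in range(4)] for i in range(4)]
    return total

def max_deviation(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

m, p = 0.5, (0.3, -0.4, 1.2)  # arbitrary sample values
energy, us, vs = spinors(p, m)
ps = pslash(energy, p)
plus = [[ps[i][j] + (m if i == j else 0) for j in range(4)] for i in range(4)]
minus = [[ps[i][j] - (m if i == j else 0) for j in range(4)] for i in range(4)]

dev_u = max_deviation(spin_sum(us), plus)   # sum_s u u_bar versus pslash + m
dev_v = max_deviation(spin_sum(vs), minus)  # sum_s v v_bar versus pslash - m
print(dev_u < 1e-12, dev_v < 1e-12)
```

Any other orthonormal choice of the two-spinors $χ$ should give the same sums, since only $\sum_{χ} χ χ^{†} = 1$ enters.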

2. Coupling to Electromagnetism

2.1 Minimal coupling (Postulate H2 — heuristic)

Borrow from classical electrodynamics (§0.2), where a charged particle in an external electromagnetic field obeys equations of motion with the canonical momentum replaced by the kinetic momentum:

$p_{μ} ⟶ p_{μ} - e A_{μ} .$

Apply this minimal substitution to the Dirac equation:

$(i γ^{μ} (\partial_{μ} + i e A_{μ}) - m) ψ = 0.$

This is the second postulate of the historical formulation. It is not derived — it is a physically motivated rule, justified after the fact by the consequences collected in §2.4: simplicity, the correct non-relativistic Pauli equation with $g = 2$ , and the relativistic Lorentz force law.

In the modern gauge-principle derivation H2 becomes a theorem, not a postulate.

2.2 Definition of the covariant derivative (Definition)

Define $D_{μ} \equiv \partial_{μ} + i e A_{μ}$ . The minimally coupled Dirac equation is then

$(i γ^{μ} D_{μ} - m) ψ = 0.$

This is purely notation.

2.3 Gauge invariance (Derived)

Direct calculation shows that the simultaneous transformation

$ψ \to e^{i α (x)} ψ, \qquad A_{μ} \to A_{μ} - \frac{1}{e} \partial_{μ} α (x)$

leaves the minimally coupled Dirac equation invariant. So gauge invariance is a derived consequence of postulate H2 — exactly the inverse of the modern viewpoint, in which gauge invariance is the input and minimal coupling is the output. (See QED.)
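The direct calculation is short enough to write out. In the sketch below, primes denote transformed quantities (notation introduced here for illustration) and $D_{μ} = \partial_{μ} + i e A_{μ}$ is the covariant derivative of §2.2.

```latex
\begin{align}
D'_\mu \psi'
  &= \bigl(\partial_\mu + i e A_\mu - i\,\partial_\mu\alpha\bigr)\bigl(e^{i\alpha}\psi\bigr) \\
  &= e^{i\alpha}\bigl(\partial_\mu\psi + i(\partial_\mu\alpha)\psi
       + i e A_\mu\psi - i(\partial_\mu\alpha)\psi\bigr) \\
  &= e^{i\alpha} D_\mu\psi , \\
(i\gamma^\mu D'_\mu - m)\,\psi'
  &= e^{i\alpha}\,(i\gamma^\mu D_\mu - m)\,\psi = 0 .
\end{align}
```

The derivative of the phase is cancelled exactly by the shift of $A_{μ}$ , so the transformed field solves the transformed equation whenever the original field solves the original one.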

2.4 Post-hoc justifications: Pauli equation, $g = 2$ , Lorentz force (Derived)

H2 is a heuristic at the level of postulation, but several derived consequences vindicate it after the fact. These are consequences of the minimally coupled Dirac equation (§2.1), not independent postulates.

Non-relativistic limit: the Pauli equation. Block-diagonalising the minimally coupled Dirac equation in the standard representation and dropping $O (v^{2} / c^{2})$ terms gives the Pauli equation

$i ℏ \partial_{t} χ = \left[ \frac{(p - e A)^{2}}{2 m} + e Φ - \frac{e ℏ}{2 m} σ \cdot B \right] χ,$

for the two-component non-relativistic spinor $χ$ . The last term is a magnetic moment coupling with gyromagnetic ratio $g = 2$ — the historically dramatic prediction that vindicated the Dirac equation, since experiment had observed $g \approx 2$ for the electron whereas the orbital value would be $g = 1$ .
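The origin of the $σ \cdot B$ term can be made explicit. The following is a sketch of the key algebraic step, assuming the leading-order elimination of the small spinor components has already been carried out and energies are measured relative to the rest energy. The symbol $\boldsymbol{\pi}$ for the kinetic momentum is notation introduced here.

```latex
\begin{align}
i\hbar\,\partial_t \chi
  &\approx \left[ \frac{(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2}{2m} + e\Phi \right]\chi ,
  \qquad \boldsymbol{\pi} \equiv \mathbf{p} - e\mathbf{A}, \\
(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2
  &= \boldsymbol{\pi}^2 + i\,\boldsymbol{\sigma}\cdot(\boldsymbol{\pi}\times\boldsymbol{\pi}), \\
(\boldsymbol{\pi}\times\boldsymbol{\pi})_i
  &= \tfrac{1}{2}\,\epsilon_{ijk}\,[\pi_j,\pi_k]
   = \tfrac{1}{2}\,\epsilon_{ijk}\; i e\hbar\,(\partial_j A_k - \partial_k A_j)
   = i e\hbar\, B_i , \\
\frac{(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2}{2m}
  &= \frac{(\mathbf{p}-e\mathbf{A})^2}{2m} - \frac{e\hbar}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} .
\end{align}
```

Comparing the last term with $- g \frac{e}{2 m} S \cdot B$ for $S = \frac{ℏ}{2} σ$ gives $g = 2$ . The cross product $\boldsymbol{\pi}\times\boldsymbol{\pi}$ is nonzero only because the components of the kinetic momentum do not commute in a magnetic field.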

Classical limit: the Lorentz force. In the WKB / classical limit one recovers the relativistic Lorentz force law $d p^{μ} / d τ = e F^{μν} u_{ν}$ .

Simplicity. Minimal substitution is the simplest Lorentz-invariant coupling of $ψ$ to $A_{μ}$ at the operator level.

All three results were used historically as post hoc justification of H2.

3. From Wave Equation to Quantum Field

3.1 The negative-energy problem (Derived — and why a new postulate is needed)

The free Dirac equation admits both positive- and negative-energy plane-wave solutions:

$ψ^{(+)} \propto u (p, s) e^{- i p \cdot x}, \qquad ψ^{(-)} \propto v (p, s) e^{+ i p \cdot x}, \qquad p^{0} = + \sqrt{\mathbf{p}^{2} + m^{2}} .$

This is a calculation, not a postulate — but it shows that a single-particle interpretation is impossible (the spectrum is unbounded below). A new ingredient must be added.

3.2 The Dirac sea (Heuristic — historical)

Dirac's original resolution: postulate that all negative-energy states are filled, and identify a "hole" in the sea with a positive-energy antiparticle of opposite charge. This predicted the positron, discovered shortly after by Anderson (1932).

This is conceptually awkward (it relies on an infinite filled vacuum) and was eventually superseded.

3.3 Second quantization (Postulate H3)

The Dirac sea is replaced by a sharper axiom on $ψ$ itself.

Postulate H3. Promote the Dirac wavefunction $ψ (x)$ to an operator-valued field on a Fock space, satisfying the equal-time canonical anticommutation relations

$\{\hat{ψ}_{α} (x, t), \hat{ψ}_{β}^{†} (y, t)\} = δ^{3} (x - y) δ_{α β}, \qquad \{\hat{ψ}_{α} (x, t), \hat{ψ}_{β} (y, t)\} = \{\hat{ψ}_{α}^{†} (x, t), \hat{ψ}_{β}^{†} (y, t)\} = 0,$

with $α, β$ the Dirac-spinor indices.

Two things are postulated together here:

The promotion of $ψ$ from c-number wavefunction to operator field.
The choice of anticommutators (Fermi statistics) over commutators.

Both are forced upon you in the modern view by the spin–statistics theorem, but that theorem was proved later (Pauli 1940). Historically, anticommutators were postulated to ensure positive-definite energy and the Pauli exclusion principle.

Consequence: mode expansion. Solving the free Dirac equation $(i γ^{μ} \partial_{μ} - m) \hat{ψ} = 0$ as an operator equation, and using the plane-wave spinor basis $u (p, s), v (p, s)$ from §1.5, fixes $\hat{ψ}$ uniquely up to the operator-valued coefficients of each mode:

$\hat{ψ} (x) = \sum_{s} \int \frac{d^{3} p}{(2 π)^{3}} \frac{1}{\sqrt{2 ω_{p}}} \left( b_{p, s} u (p, s) e^{- i p \cdot x} + d_{p, s}^{†} v (p, s) e^{+ i p \cdot x} \right),$

where $b_{p, s}$ annihilates an electron and $d_{p, s}^{†}$ creates a positron. Substituting this expansion into the postulated equal-time field anticommutators and using the spinor completeness relations of §1.5 reduces them to the ladder-operator anticommutators

$\{b_{p, s}, b_{q, r}^{†}\} = \{d_{p, s}, d_{q, r}^{†}\} = (2 π)^{3} δ^{3} (p - q) δ_{sr}, \qquad \text{all other anticommutators} = 0.$

So the ladder form is a consequence of H3 in the plane-wave basis, not an independent postulate. The two forms are equivalent (one is the spatial Fourier transform of the other).

Consequence: Fock space. Once $\hat{ψ}$ is an operator with the ladder operators $b, b^{†}, d, d^{†}$ above, the natural state space they act on is no longer the single-particle $H_{1} ≅ L^{2} (R^{3}) \otimes C^{4}$ of §1.2, but the Fock space

$F = \bigoplus_{n = 0}^{\infty} H_{n},$

with $H_{0} = C ∣0 ⟩$ the one-dimensional vacuum sector and $H_{n}$ the antisymmetrized $n$ -particle sector built from $H_{1}$ . The vacuum $∣0 ⟩$ is defined by $b_{p, s} ∣0 ⟩ = d_{p, s} ∣0 ⟩ = 0$ for all $p, s$ , and arbitrary multi-particle (and multi-antiparticle) states are generated by acting with creation operators on $∣0 ⟩$ . This is the replacement of first quantization promised in §1.2.

Terminology warning. Before H3, the symbol $ψ (x)$ denoted a single-particle wavefunction (a state). After H3 it denotes a field operator on Fock space. Same symbol, completely different mathematical object. From here on, "the state" of the system is a vector in Fock space (typically the vacuum or an asymptotic Fock state); $ψ (x)$ is one of the operators acting on that space. See QFT/preliminaries.md § States vs. Fields for an extended discussion.

3.4 Antiparticles, vacuum, and positivity of energy (Derived)

Once H3 is in place, several historical puzzles resolve themselves automatically:

The vacuum $∣0 ⟩$ is annihilated by all $b_{p, s}$ and $d_{p, s}$ .
Negative-energy modes have been re-interpreted as positive-energy antiparticle creation; the Dirac sea disappears.
The Hamiltonian is positive after normal-ordering.
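Pauli exclusion follows from $(b_{p, s}^{†})^{2} = 0$ (and likewise for $d_{p, s}^{†}$ ), a direct consequence of the anticommutators: no mode can be occupied twice.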

None of this is an additional input.
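These statements can be illustrated in a finite-dimensional toy model. The sketch below represents two fermionic modes on a four-dimensional Fock space using the Jordan–Wigner construction, which is an illustrative device not used in the text, and it replaces the continuum normalization $(2 π)^{3} δ^{3} (p - q)$ by a Kronecker delta for discrete modes. All names are hypothetical.

```python
# Two fermionic modes on a 4-dimensional Fock space (Jordan-Wigner toy model).
# Basis ordering: |n1 n2> with index 2*n1 + n2; all matrices are real integers.

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def zeros(n):
    return [[0] * n for _ in range(n)]

B = [[0, 1], [0, 0]]   # single-mode annihilator in the basis (|0>, |1>)
Z = [[1, 0], [0, -1]]  # parity factor (-1)^N that enforces anticommutation
I2 = identity(2)

b1 = kron(B, I2)
b2 = kron(Z, B)
b1_dag, b2_dag = transpose(b1), transpose(b2)  # real matrices: dagger = transpose

# Canonical anticommutation relations for discrete modes.
car_ok = (
    anticommutator(b1, b1_dag) == identity(4)
    and anticommutator(b2, b2_dag) == identity(4)
    and anticommutator(b1, b2) == zeros(4)
    and anticommutator(b1, b2_dag) == zeros(4)
)

# Pauli exclusion: no mode can be created twice.
pauli_exclusion = (
    matmul(b1_dag, b1_dag) == zeros(4) and matmul(b2_dag, b2_dag) == zeros(4)
)

# The vacuum |00> is annihilated by every annihilation operator.
vacuum = [[1], [0], [0], [0]]
null = [[0], [0], [0], [0]]
vacuum_annihilated = matmul(b1, vacuum) == null and matmul(b2, vacuum) == null

print(car_ok, pauli_exclusion, vacuum_annihilated)
```

The parity factor `Z` is what makes operators of different modes anticommute instead of commute; dropping it would describe two independent two-level systems, not two fermionic modes.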

4. Quantization of the Electromagnetic Field

4.1 Choice of gauge (Definition / convention)

Pick a gauge — for the historical treatment, Coulomb gauge $\nabla \cdot A = 0$ . This eliminates longitudinal and timelike components as non-dynamical (the latter by the Coulomb constraint) and leaves only the two transverse photon polarizations $λ = \pm 1$ .

This is a convention, not a postulate — a different choice (Lorenz gauge with Gupta–Bleuler, or BRST) gives the same physics.

4.2 Mode expansion of $A_{μ}$ (Postulate — same flavor as H3, applied to bosons)

Postulate the bosonic mode expansion

$A (x) = \sum_{λ} \int \frac{d^{3} k}{(2 π)^{3}} \frac{1}{\sqrt{2 ∣ k ∣}} \left( a_{k, λ} ϵ (k, λ) e^{- i k \cdot x} + a_{k, λ}^{†} ϵ^{*} (k, λ) e^{+ i k \cdot x} \right),$

with bosonic commutators $[a_{k, λ}, a_{k', λ'}^{†}] = (2 π)^{3} δ^{3} (k - k') δ_{λ λ'}$ . Conceptually this is the same kind of postulate as H3 (promotion of a classical field to an operator with prescribed (anti)commutators), applied to a boson rather than a fermion.

The instantaneous Coulomb interaction is reintroduced explicitly to compensate for the elimination of the timelike photon mode.

5. Dynamics and Predictions

5.0 Perturbative machinery: S-matrix, Dyson series, Wick's theorem (Generic QFT setup)

Why we need it. Postulates H1–H3 give us field equations and a Hilbert space, but no recipe for what an experimentalist measures. Real experiments prepare asymptotic states — well-separated wavepackets long before a collision — and detect asymptotic states long after. The S-matrix is the operator whose matrix elements $⟨ f, out ∣ i, in ⟩$ encode exactly these probabilities, packaging all interaction effects into a single object. Cross sections, decay rates, and (via poles) bound-state energies are all built from S-matrix elements. We collect the generic construction here so that §5.2 can apply it to QED without distraction.

Definition. Split the Hamiltonian as $H = H_{0} + H_{int}$ with $H_{0}$ free. In the interaction picture, operators evolve under $H_{0}$ while states evolve under $H_{int}$ . The S-matrix is the interaction-picture evolution operator from $t = - \infty$ to $t = + \infty$ :

$S \equiv U_{I} (+ \infty, - \infty), \qquad ∣ f, out ⟩ = S ∣ i, in ⟩ .$

Sketch of derivation (Dyson series). Solving the interaction-picture Schrödinger equation $i \partial_{t} ∣ ψ ⟩_{I} = H_{int} (t) ∣ ψ ⟩_{I}$ iteratively, and noting that the time-ordering operator $T$ accounts for the non-commutativity of $H_{int}$ at different times, gives

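$S = T \exp \left( - i \int_{- \infty}^{\infty} d t \, H_{int} (t) \right) = \sum_{n = 0}^{\infty} \frac{(- i)^{n}}{n !} \int d^{4} x_{1} \dots d^{4} x_{n} \, T [\mathcal{H}_{int} (x_{1}) \dots \mathcal{H}_{int} (x_{n})] .$

This is the Dyson series: the order-by-order expansion of $S$ in powers of the coupling. In the second form $\mathcal{H}_{int} (x)$ denotes the interaction Hamiltonian density, with $H_{int} (t) = \int d^{3} x \, \mathcal{H}_{int} (t, x)$ .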

How each term is evaluated (Wick's theorem). Each Dyson term is a vacuum/in/out matrix element of a time-ordered product of free fields. Wick's theorem rewrites such a product as a sum of normal-ordered products with all possible pairwise contractions, where each contraction is by definition a vacuum-expectation time-ordered product — a Feynman propagator:

$\overline{\hat{ϕ} (x) \hat{ϕ} (y)} \equiv ⟨ 0∣ T \hat{ϕ} (x) \hat{ϕ} (y) ∣0 ⟩, \qquad ⟨ 0∣ T ψ (x) \bar{ψ} (y) ∣0 ⟩ = i S_{F} (x - y), \qquad ⟨ 0∣ T A_{μ} (x) A_{ν} (y) ∣0 ⟩ = i D_{F μν} (x - y) .$

Dyson series + Wick's theorem together turn perturbation theory into a sum of Feynman diagrams: one diagram per contraction pattern, with vertices coming from $H_{int}$ and lines from the propagators. The QED-specific application is in §5.2; the full mechanical derivation of the momentum-space rules is left to a standard QFT textbook (Peskin–Schroeder Ch. 4, Schwartz Ch. 7, Srednicki Ch. 9).

Transition operator $T$ (preview). $S$ is the primary operator — defined directly as the interaction-picture evolution above. Almost every later use, however, separates out the trivial no-scattering piece by writing

$S = 1 + i T, \qquad \text{equivalently} \qquad T \equiv - i (S - 1) .$

The transition operator (or $T$ -matrix) $T$ is defined from $S$ this way; it carries no information $S$ doesn't already carry. Its purpose is purely bookkeeping: for $∣ f ⟩ \neq ∣ i ⟩$ , the matrix element $⟨ f ∣ S ∣ i ⟩ = i ⟨ f ∣ T ∣ i ⟩$ contains only genuine interaction effects, with the trivial $⟨ f ∣1∣ i ⟩ = δ_{f i}$ removed. This becomes important in §5.3 when we extract the invariant amplitude $M$ by stripping the spacetime-translation $δ^{4}$ from $⟨ f ∣ i T ∣ i ⟩$ .

Notation collision. The symbol $T$ in this document plays two unrelated roles: the time-ordering operator appearing inside the Dyson formula $S = T exp (\dots)$ above (an instruction to reorder operators by time), and the transition operator $T = - i (S - 1)$ just defined (an actual operator on Fock space). Context disambiguates — time-ordering $T$ always sits in front of a product of operators, transition $T$ sits between bra and ket as $⟨ f ∣ i T ∣ i ⟩$ .

Caveat (Haag's theorem). The interaction picture does not strictly exist in interacting QFT; the construction above is a formal expansion, justified after the fact by renormalization. See QFT/remarks.md.

5.1 Interaction Hamiltonian (Derived)

From the minimally coupled Lagrangian (the Lagrangian whose Euler–Lagrange equation is the minimally coupled Dirac equation of §2.1) the interaction Hamiltonian is read off directly:

$H_{int} = \int d^{3} x \, e \bar{ψ} (x) γ^{μ} ψ (x) A_{μ} (x) = \int d^{3} x \, j^{μ} (x) A_{μ} (x) .$

No new input.

5.2 The QED S-matrix and Feynman rules (Derived)

Why we need it (here). §5.0 introduced the S-matrix in general; we now plug in the QED interaction Hamiltonian from §5.1 to get the explicit perturbative expansion for electron–photon processes. The end product is the QED Feynman rules — a graphical algorithm for computing any S-matrix element to any order in $e$ . All of this is derivation, not postulation.

The QED Dyson series. Substituting the interaction Hamiltonian density $\mathcal{H}_{int} (x) = e \bar{ψ} (x) γ^{μ} ψ (x) A_{μ} (x)$ (the integrand of $H_{int}$ in §5.1) into the general Dyson series of §5.0:

$S = \sum_{n = 0}^{\infty} \frac{(- i e)^{n}}{n !} \int d^{4} x_{1} \dots d^{4} x_{n} \, T [(\bar{ψ} γ^{μ} ψ A_{μ}) (x_{1}) \dots (\bar{ψ} γ^{ν} ψ A_{ν}) (x_{n})] .$

Each factor $\bar{ψ} γ^{μ} ψ A_{μ}$ in the integrand is a vertex contribution at one spacetime point.

Sketch of derivation of the Feynman rules. The chain from this expression to the momentum-space rules table consists of three mechanical steps (extending the generic §5.0 Wick step to QED-specific bookkeeping):

1. Apply Wick's theorem to each order- $n$ time-ordered product, producing a sum of normal-ordered terms with all pairwise contractions of $ψ$ , $\bar{ψ}$ , and $A_{μ}$ . Surviving terms (after taking the $⟨ f ∣ \dots ∣ i ⟩$ matrix element) are those where the uncontracted fields exactly match the in/out particle content.
2. Diagrammatic bookkeeping. Identify each algebraic piece with a graphical element:
- Each interaction factor $\bar{ψ} γ^{μ} ψ A_{μ}$ at $x_{i}$ → one vertex with three lines (one electron in, one out, one photon).
- Each contraction $⟨ 0∣ T \hat{ψ} (x) \hat{\bar{ψ}} (y) ∣0 ⟩$ → an internal electron line between $x$ and $y$ , propagator $i S_{F} (x - y)$ .
- Each contraction $⟨ 0∣ T \hat{A}_{μ} (x) \hat{A}_{ν} (y) ∣0 ⟩$ → an internal photon line, propagator $i D_{F μν} (x - y)$ .
- Uncontracted fields acting on $⟨ p, s ∣$ or $∣ p, s ⟩$ → external line factors (spinors $u, \bar{u}, v, \bar{v}$ for fermions; polarization vectors $ϵ_{μ}, ϵ_{μ}^{*}$ for photons).

3. Position → momentum space. Fourier-transform every propagator and external line. Spacetime integrals over each vertex become momentum-conserving delta functions $(2 π)^{4} δ^{4} (\sum p_{in,vertex} - \sum p_{out,vertex})$ . Integrating away the internal delta functions leaves one overall delta function enforcing total energy–momentum conservation; the residue is the momentum-space Feynman rules dictionary:

Diagrammatic element	Algebraic factor
Vertex	$- i e γ^{μ}$
Internal electron line, momentum $p$	$\frac{i ( \not p + m )}{p^{2} - m^{2} + i ϵ}$
Internal photon line, momentum $k$	$\frac{- i η_{μν}}{k^{2} + i ϵ}$ (Feynman gauge)
External electron in / out	$u (p, s)$ / $\bar{u} (p, s)$
External positron in / out	$\bar{v} (p, s)$ / $v (p, s)$
External photon in / out	$ϵ_{μ} (k, λ)$ / $ϵ_{μ}^{*} (k, λ)$
Closed fermion loop	factor of $- 1$ and a trace
Symmetry factor	$1/ S$ for diagrams with internal symmetry

This is the same table referenced in QED Step 6 and applied in QED/compton.md to compute the Klein–Nishina cross section. The rules contain no new physics beyond $L_{QED}$ — they are a reorganization of perturbation theory into a graphical algorithm.
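As a sanity check on the table, the internal electron line can be verified numerically to be $i$ times the inverse of the momentum-space Dirac operator $\not p - m$ . The sketch below is illustrative only: it assumes the Dirac representation of the $γ$ -matrices and the metric signature $(+, -, -, -)$ , drops the $i ϵ$ by using an off-shell momentum, and the helper names (`slash`, `electron_propagator`) and the numerical values are hypothetical.

```python
import numpy as np

# Assumed conventions: Dirac representation, metric signature (+, -, -, -).
I2, Z2, I4 = np.eye(2), np.zeros((2, 2)), np.eye(4)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gammas = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def slash(p):
    """Return p_mu gamma^mu for contravariant components p^mu."""
    p_lower = eta @ np.asarray(p, dtype=float)
    return sum(p_lower[mu] * gammas[mu] for mu in range(4))


def minkowski_dot(p, q):
    """Return p . q with the (+, -, -, -) metric."""
    return float(np.asarray(p, dtype=float) @ eta @ np.asarray(q, dtype=float))


def electron_propagator(p, m):
    """Table entry i(pslash + m)/(p^2 - m^2); the i*epsilon is dropped (p off shell)."""
    return 1j * (slash(p) + m * I4) / (minkowski_dot(p, p) - m**2)


# Clifford algebra {gamma^mu, gamma^nu} = 2 eta^{mu nu}.
clifford_ok = all(
    np.allclose(gammas[a] @ gammas[b] + gammas[b] @ gammas[a], 2 * eta[a, b] * I4)
    for a in range(4)
    for b in range(4)
)

m = 0.511                    # illustrative mass value
p = [1.3, 0.2, -0.4, 0.7]    # arbitrary off-shell momentum
S_F = electron_propagator(p, m)

# (pslash + m)(pslash - m) = p^2 - m^2, so S_F (pslash - m) = i * identity.
inverse_ok = np.allclose(S_F @ (slash(p) - m * I4), 1j * I4)

print("Clifford algebra holds:", clifford_ok)
print("propagator equals i (pslash - m)^(-1):", inverse_ok)
```

The identity being checked does not depend on the representation, so a different basis for the $γ$ -matrices should give the same two results.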

5.3 The Invariant Amplitude $M$ (Definition)

Why we need it. Raw S-matrix elements between momentum eigenstates always carry the overall delta function $(2 π)^{4} δ^{4} (\sum p_{in} - \sum p_{out})$ from translation invariance, plus state-normalization factors $2 E_{p}$ . Squaring such an element naively produces the meaningless $∣ δ^{4} (\cdot) ∣^{2}$ . To extract a finite, Lorentz-invariant probability density usable in a cross-section formula we strip the delta function out by definition and work with the residue. That residue is what the Feynman rules of §5.2 actually compute; giving it a name turns the rules into a self-contained calculational object.

Definition. Recall from §5.0 that the trivial no-scattering piece of $S$ can be peeled off via $S = 1 + i T$ , where the transition operator $T = - i (S - 1)$ encodes all genuine interaction effects. For an off-diagonal momentum-eigenstate matrix element of $T$ , factor the spacetime-translation delta function out:

$⟨ f ∣ i T ∣ i ⟩ \equiv (2 π)^{4} δ^{4} (\sum p_{in} - \sum p_{out}) i M_{f i} .$

The residue $M_{f i}$ is the invariant amplitude (also called the matrix element, or just the M-matrix element). It is a Lorentz scalar (with possible spinor/polarization indices on the external states) and is finite at generic kinematics.

Notation note — $M_{f i}$ vs. $M$ vs. $\overline{∣ M ∣^{2}}$ . Three closely related symbols appear in the literature and in the rest of this doc:

Symbol	Spins / polarizations	Typical use
$∣ M_{f i} ∣^{2}$	fixed, specific channel	definitions, polarized observables, probability of one transition
$∣ M ∣^{2}$	same as above, $f i$ labels suppressed	shorthand when the channel is clear from context
$\overline{∣ M ∣^{2}}$	summed over final, averaged over initial	unpolarized cross sections; what Casimir's trick (§5.4.2) computes

So $∣ M_{f i} ∣^{2} = ∣ M ∣^{2}$ (notation only, same object), while $\overline{∣ M ∣^{2}} = (1/ N_{i}) \sum_{spins} ∣ M_{f i} ∣^{2}$ is a different object — a sum over an ensemble of $∣ M_{f i} ∣^{2}$ values. Below we keep the $f i$ subscript when the channel-specific meaning matters and drop it when it does not.

Sketch of derivation from $S$ .

1. Identity subtraction. The matrix element $⟨ i ∣ S ∣ i ⟩$ trivially contains the no-scattering term $1$ ; only the interacting part $i T = S - 1$ produces transitions $∣ i ⟩ \to ∣ f ⟩$ with $f \neq i$ , which is why $T$ was introduced above.
2. Translation invariance ⇒ overall $δ^{4}$ . The interaction Hamiltonian density $H_{int} (x)$ is invariant under spacetime translations $x \to x + a$ . Each Dyson term, between momentum eigenstates, can therefore be written as $e^{i (\sum p_{in} - \sum p_{out}) \cdot X}$ times a function of relative coordinates, where $X$ is the centre of the diagram. Integrating over $X$ produces the universal factor $(2 π)^{4} δ^{4} (\sum p_{in} - \sum p_{out})$ . Defining $M$ as the residue is the precise statement of step 3 in §5.2 ("after integrating away delta functions, read off the rules").
3. Momentum-space Feynman rules compute $i M$ directly. Each tabulated rule (vertex $- i e γ^{μ}$ , internal propagator, external spinor/polarization) is the contribution to $i M$ from the corresponding diagrammatic element. Summing over all topologically distinct diagrams at order $n$ in $e$ gives $M_{f i}^{(n)}$ .

What you do with it. The whole point of $M$ is that the physically observable quantity is $∣ M ∣^{2}$ , not $M$ itself. The next subsection unpacks what that means and how it is computed.

5.4 The Squared Amplitude $∣ M ∣^{2}$

Why it is the physical object. Quantum mechanics produces complex amplitudes; experiments measure probabilities. The Born rule (QM Postulate 3) supplies the bridge: $P (i \to f) \propto ∣ ⟨ f ∣ \hat{U} ∣ i ⟩ ∣^{2}$ . Applied to scattering with $\hat{U} \to \hat{S}$ and the delta-function-stripping of §5.3, the relevant probability density is

$∣ M_{f i} ∣^{2} = M_{f i}^{*} M_{f i},$

a real, non-negative, Lorentz-invariant function of the external momenta and spins/polarizations. (The $∣ δ^{4} ∣^{2} \to V T \cdot δ^{4}$ subtlety produced by squaring the raw S-matrix element is what makes the $M$ -as-residue definition useful: $∣ M ∣^{2}$ has no leftover delta-function squared.)

The full QED interaction probability density per pair, derived inside QFT/cross-sections.md §3, is

$\frac{d P _{i \to f}}{d t} = \frac{1}{4 E _{a} E _{b} V} ∣ M_{f i} ∣^{2} (2 π)^{4} δ^{4} (p_{f} - p_{i}) d Π_{n},$

from which cross sections and decay rates are read off.

5.4.1 Spin and polarization sums: $\overline{∣ M ∣^{2}}$

Real experiments rarely measure individual spin/polarization channels. Two operations bring $∣ M ∣^{2}$ to a directly comparable form:

Average over initial-state spins/polarizations the experimenter does not control — divide by the multiplicity $\prod_{i} n_{spin, i}$ (e.g. a factor $1/4$ for two spin- $\frac{1}{2}$ particles, $1/2$ per unpolarized photon).
Sum over final-state spins/polarizations the detector does not resolve.

The overlined notation

$\overline{∣ M ∣^{2}} \equiv \frac{1}{N_{i}} \sum_{\text{init spins}} \sum_{\text{final spins}} ∣ M_{f i} ∣^{2}$

is universal in cross-section formulas. Polarized observables (e.g. spin asymmetries) keep specific spin labels instead and use $∣ M_{f i} ∣^{2}$ unaveraged.

5.4.2 Computational technology: Casimir's trick (trace technology)

For a generic QED amplitude built from spinor bilinears, the channel-specific amplitude $M_{f i}$ has the schematic form

$M_{f i} = \bar{u} (p^{'}, s^{'}) Γ u (p, s),$

where $Γ$ is some product of $γ$ -matrices and propagators (and $f, i$ here denote the specific spins $s^{'}, s$ in addition to the fixed external momenta). Taking the modulus squared and using the conjugation identity $(\bar{u}_{1} Γ u_{2})^{*} = \bar{u}_{2} \bar{Γ} u_{1}$ (with $\bar{Γ} \equiv γ^{0} Γ^{†} γ^{0}$ ):

$∣ M_{f i} ∣^{2} = [\bar{u} (p^{'}, s^{'}) Γ u (p, s)] [\bar{u} (p, s) \bar{Γ} u (p^{'}, s^{'})] .$

Summing over the initial- and final-state spins and using the completeness relations

$\sum_{s} u (p, s) \bar{u} (p, s) = \not p + m, \qquad \sum_{s} v (p, s) \bar{v} (p, s) = \not p - m,$

collapses the explicit $u$ -spinors into projectors, and the spinor matrix product becomes a trace over Dirac indices:

$\sum_{s, s^{'}} ∣ M_{f i} ∣^{2} = Tr [(\not p^{'} + m) Γ (\not p + m) \bar{Γ}] .$

This is Casimir's trick: the spin sum has been rewritten as a trace. The remaining computation is purely algebraic: apply the standard trace identities

$Tr (γ^{μ}) = 0, Tr (γ^{μ} γ^{ν}) = 4 η^{μν}, Tr (γ^{μ} γ^{ν} γ^{ρ} γ^{σ}) = 4 (η^{μν} η^{ρ σ} - η^{μ ρ} η^{ν σ} + η^{μ σ} η^{ν ρ}),$

and the vanishing of the trace of any odd number of $γ$ 's, to reduce $\overline{∣ M ∣^{2}}$ to a function of Lorentz-invariant dot products $p \cdot p^{'}$ , $p \cdot k$ , ... (equivalently, of the Mandelstam variables $s, t, u$ ). Photon polarization sums proceed analogously: $\sum_{λ} ϵ_{μ}^{*} (k, λ) ϵ_{ν} (k, λ) \to - η_{μν}$ (in covariant gauges, with the Ward identity guaranteeing the longitudinal/timelike pieces drop out of physical amplitudes).
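The completeness relation and Casimir's trick can both be checked numerically by building explicit spinors and comparing the brute-force spin sum with the trace. The following is a minimal sketch under stated assumptions: Dirac representation, metric $(+, -, -, -)$ , spinors normalized so that $\bar{u} u = 2 m$ , and an arbitrary illustrative choice of $Γ$ . The helper names (`u_spinor`, `bar`, `four`) and all numerical values are hypothetical.

```python
import numpy as np

# Assumed conventions: Dirac representation, metric signature (+, -, -, -).
I2, Z2, I4 = np.eye(2), np.zeros((2, 2)), np.eye(4)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gammas = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def slash(p):
    """Return p_mu gamma^mu for contravariant components p^mu."""
    p_lower = eta @ np.asarray(p, dtype=float)
    return sum(p_lower[mu] * gammas[mu] for mu in range(4))


def four(p3, m):
    """On-shell four-momentum (E, p) built from a three-momentum and a mass."""
    return np.concatenate([[np.sqrt(m**2 + p3 @ p3)], p3])


def u_spinor(p3, m, s):
    """Positive-energy spinor with normalization ubar u = 2m, spin label s in {0, 1}."""
    energy = np.sqrt(m**2 + p3 @ p3)
    chi = np.eye(2, dtype=complex)[s]
    sigma_dot_p = sum(p3[i] * sigma[i] for i in range(3))
    upper = np.sqrt(energy + m) * chi
    lower = sigma_dot_p @ chi / np.sqrt(energy + m)
    return np.concatenate([upper, lower])


def bar(w):
    """Dirac adjoint wbar = w^dagger gamma^0, returned as a row vector."""
    return w.conj() @ gammas[0]


m = 1.0
p3_in = np.array([0.3, -0.2, 0.5])
p3_out = np.array([-0.1, 0.4, 0.2])

# Completeness: sum_s u ubar = pslash + m.
spin_sum = sum(np.outer(u_spinor(p3_in, m, s), bar(u_spinor(p3_in, m, s))) for s in (0, 1))
completeness_ok = np.allclose(spin_sum, slash(four(p3_in, m)) + m * I4)

# An arbitrary Gamma (two vertices around a propagator numerator) and its Dirac conjugate.
Gamma = gammas[1] @ (slash([0.7, 0.1, 0.0, -0.3]) + m * I4) @ gammas[2]
Gamma_bar = gammas[0] @ Gamma.conj().T @ gammas[0]

# Left side: explicit sum over the four spin channels of |ubar' Gamma u|^2.
brute_force = sum(
    abs(bar(u_spinor(p3_out, m, s2)) @ Gamma @ u_spinor(p3_in, m, s1)) ** 2
    for s1 in (0, 1)
    for s2 in (0, 1)
)
# Right side: Casimir's trick, a single Dirac trace.
trace_form = np.trace(
    (slash(four(p3_out, m)) + m * I4) @ Gamma @ (slash(four(p3_in, m)) + m * I4) @ Gamma_bar
).real

print("completeness relation holds:", completeness_ok)
print("spin sum equals trace:", np.isclose(brute_force, trace_form))
```

The brute-force side needs explicit spinors for every channel, while the trace side only needs the momenta, which is the practical reason the trick is used.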

The Klein–Nishina calculation in QED/compton.md is a worked example of this entire pipeline.

5.4.3 What $∣ M ∣^{2}$ means physically

It is a probability density per phase-space point. Multiplied by $d Π_{n} / F$ and integrated, it produces a probability rate (cross section or decay rate). For $2 \to n$ scattering the bare $∣ M ∣^{2}$ has dimensions $[energy]^{4 - 2 n}$ (for a $1 \to n$ decay the exponent is $6 - 2 n$ ); the Lorentz-invariant phase space carries the rest of the dimensions.
It is Lorentz invariant. Both $M$ and the relativistic-normalization phase space $d Π_{n}$ transform as Lorentz scalars, so $\overline{∣ M ∣^{2}}$ can be quoted in any frame and substituted unchanged.
It encodes interference. When several diagrams contribute at the same order in $e$ , $M = M_{1} + M_{2} + \dots$ and $∣ M ∣^{2} = \sum_{i} ∣ M_{i} ∣^{2} + 2 Re \sum_{i < j} M_{i}^{*} M_{j}$ . The cross terms are quantum-mechanical interference between Feynman diagrams — the same phenomenon that distinguishes Compton's $s$ - and $u$ -channel diagrams from an incoherent sum, or Bhabha's $s$ - and $t$ -channel diagrams.
It must be gauge-invariant. Although individual diagrams in covariant gauges may depend on the gauge parameter, the sum contributing to $M$ at fixed external states does not. This is enforced by the Ward identity $k_{μ} M^{μ} = 0$ on amplitudes with an external photon of momentum $k$ .
Crossing symmetry. The same analytic function $M (s, t, u)$ describes processes related by moving particles between initial and final states (e.g. $e^{-} e^{+} \to γγ$ vs. Compton $e^{-} γ \to e^{-} γ$ ); $∣ M ∣^{2}$ inherits this, with kinematic re-labeling.
Optical theorem. $Im M_{i \to i} (forward) = \frac{1}{2} \sum_{X} \int d Π_{X} ∣ M_{i \to X} ∣^{2}$ (from S-matrix unitarity $S^{†} S = 1$ , sandwiched between $⟨ i ∣ \cdot ∣ i ⟩$ , after the $T = - i (S - 1)$ split); see QFT/cross-sections.md § Optical Theorem for the derivation.
Non-relativistic limit. In the kinematic regime where particles are slow and external lines reduce to non-relativistic wavefunctions, $∣ M ∣^{2} \to ∣ V_{f i} ∣^{2}$ where $V_{f i} = ⟨ f ∣ V ∣ i ⟩$ is the matrix element of a non-relativistic scattering potential, and $d σ = (1/ F) \overline{∣ M ∣^{2}} d Π_{n}$ reduces to Fermi's Golden Rule $Γ = (2 π /ℏ) ∣ V_{f i} ∣^{2} ρ_{f}$ . The QFT formula and Fermi's Golden Rule are the same Born-rule statement at different levels of relativistic completeness.
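The algebra behind the optical theorem in the list above can be illustrated with a finite-dimensional toy model: for any unitary matrix $S$ , the split $T = - i (S - 1)$ forces $2 Im T_{i i} = \sum_{f} ∣ T_{f i} ∣^{2}$ . The sketch below is only an analogy: it assumes a random $5 \times 5$ unitary matrix in place of the real S-matrix and has no phase space, delta functions, or state normalization.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = 5  # size of the toy state space (arbitrary)

# A random unitary "S-matrix": the Q factor of a QR decomposition is unitary.
A = rng.normal(size=(n_states, n_states)) + 1j * rng.normal(size=(n_states, n_states))
S, _ = np.linalg.qr(A)
unitary_ok = np.allclose(S.conj().T @ S, np.eye(n_states))

# Transition operator from the split S = 1 + iT.
T = -1j * (S - np.eye(n_states))

# Unitarity gives -i(T - T^dagger) = T^dagger T; its diagonal reads
# 2 Im T_ii = sum_f |T_fi|^2, where T_fi is stored as T[f, i].
forward_imag = 2 * T.diagonal().imag
total_rate = (np.abs(T) ** 2).sum(axis=0)

print("S is unitary:", unitary_ok)
print("2 Im T_ii == sum_f |T_fi|^2:", np.allclose(forward_imag, total_rate))
```

The factor of 2 on the left is the toy-model counterpart of the $\frac{1}{2}$ in the field-theory statement.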

5.4.4 Worked-example pipeline

For any QED process, the calculation chain is:

1. Draw all Feynman diagrams at the desired order in $e$ .
2. Apply momentum-space Feynman rules (§5.2 table) to write $i M$ as a sum of diagram contributions.
3. Take $∣ M_{f i} ∣^{2} = M_{f i}^{*} M_{f i}$ , expanding interference terms.
4. Sum/average over spins and polarizations using completeness relations → $\overline{∣ M ∣^{2}}$ as a Dirac trace.
5. Evaluate the trace with $γ$ -matrix identities, expressing the result in Mandelstam variables.
6. Plug into $d σ = (1/ F) \overline{∣ M ∣^{2}} d Π_{n}$ (or $d Γ$ ) and integrate over the desired phase-space region.

Steps 1–2 are diagrammatic; steps 3–5 are algebra; step 6 is kinematic integration. The Klein–Nishina cross section (QED/compton.md) walks through all six steps explicitly.
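The trace steps of this pipeline can be sketched numerically for a process with a single tree-level diagram. The example below uses $e^{-} e^{+} \to μ^{-} μ^{+}$ (chosen for illustration; it is not the process worked out in this document), treats the electron as massless, and compares the numerically evaluated product of traces with the standard closed form $\overline{∣ M ∣^{2}} = \frac{8 e^{4}}{s^{2}} [(p \cdot k)(p^{'} \cdot k^{'}) + (p \cdot k^{'})(p^{'} \cdot k) + m_{μ}^{2} (p \cdot p^{'})]$ . Conventions (Dirac representation, metric $(+, -, -, -)$ ), function names, and kinematic values are assumptions of the sketch.

```python
import numpy as np

# Assumed conventions: Dirac representation, metric signature (+, -, -, -).
I2, Z2, I4 = np.eye(2), np.zeros((2, 2)), np.eye(4)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gammas = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
gamma_lower = [eta[mu, mu] * gammas[mu] for mu in range(4)]  # gamma_mu with index lowered


def dot(a, b):
    """Minkowski product a . b."""
    return a @ eta @ b


def slash(p):
    """Return p_mu gamma^mu for contravariant components p^mu."""
    p_lower = eta @ p
    return sum(p_lower[mu] * gammas[mu] for mu in range(4))


def cm_momenta(energy, m_mu, theta):
    """Centre-of-mass momenta: massless e-(p), e+(pp) along z; mu-(k), mu+(kp) at angle theta."""
    k_abs = np.sqrt(energy**2 - m_mu**2)
    p = np.array([energy, 0.0, 0.0, energy])
    pp = np.array([energy, 0.0, 0.0, -energy])
    k = np.array([energy, k_abs * np.sin(theta), 0.0, k_abs * np.cos(theta)])
    kp = np.array([energy, -k_abs * np.sin(theta), 0.0, -k_abs * np.cos(theta)])
    return p, pp, k, kp


def m2_from_traces(p, pp, k, kp, m_mu, e):
    """Steps 3-4: (1/4) sum over spins of |M|^2 as a product of two Dirac traces."""
    s = dot(p + pp, p + pp)
    total = 0.0
    for mu in range(4):
        for nu in range(4):
            electron = np.trace(slash(pp) @ gammas[mu] @ slash(p) @ gammas[nu])
            muon = np.trace(
                (slash(k) + m_mu * I4) @ gamma_lower[mu] @ (slash(kp) - m_mu * I4) @ gamma_lower[nu]
            )
            total += electron * muon
    return (e**4 / (4 * s**2)) * total.real


def m2_closed_form(p, pp, k, kp, m_mu, e):
    """Step 5: the same quantity after applying the trace identities by hand."""
    s = dot(p + pp, p + pp)
    return (8 * e**4 / s**2) * (
        dot(p, k) * dot(pp, kp) + dot(p, kp) * dot(pp, k) + m_mu**2 * dot(p, pp)
    )


e = np.sqrt(4 * np.pi / 137.036)       # e^2 = 4 pi alpha in natural units
energy, m_mu, theta = 1.0, 0.10566, 0.8  # illustrative beam energy, muon mass, angle

momenta = cm_momenta(energy, m_mu, theta)
m2_trace = m2_from_traces(*momenta, m_mu, e)
m2_closed = m2_closed_form(*momenta, m_mu, e)

# Massless limit in Mandelstam variables: 2 e^4 (t^2 + u^2) / s^2.
p, pp, k, kp = cm_momenta(energy, 0.0, theta)
s, t, u = dot(p + pp, p + pp), dot(p - k, p - k), dot(p - kp, p - kp)
m2_massless = m2_from_traces(p, pp, k, kp, 0.0, e)

print("trace vs closed form:", np.isclose(m2_trace, m2_closed))
print("massless limit vs Mandelstam form:", np.isclose(m2_massless, 2 * e**4 * (t**2 + u**2) / s**2))
```

Step 6 (phase-space integration) is left out of the sketch; it would multiply the result by the flux and phase-space factors of the cross-section formula.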

5.5 Renormalization (Pragmatic procedure)

Loop integrals diverge; regularize and absorb divergences into multiplicative redefinitions of $ψ$ , $A_{μ}$ , $m$ , $e$ . The Ward–Takahashi identities ( $Z_{1} = Z_{2}$ ) follow from the gauge invariance derived in §2.3. Renormalization is not strictly a postulate, but it relies on the empirical fact that QED is renormalizable, which was proved only later (BPHZ).

6. Summary: Postulates vs. Derived Results

Item	Status
First-order relativistic wave equation	Postulate H1
$γ^{μ}$ Clifford algebra, 4-component spinors	Derived from H1
Free Dirac equation $(i γ^{μ} \partial_{μ} - m) ψ = 0$	Derived from H1
Conserved current, plane-wave solutions, non-rel. limit	Derived
Minimal coupling $p_{μ} \to p_{μ} - e A_{μ}$	Postulate H2 (heuristic)
Covariant derivative $D_{μ} = \partial_{μ} + i e A_{μ}$	Definition
Gauge invariance $ψ \to e^{i α} ψ$ , $A_{μ} \to A_{μ} - \frac{1}{e} \partial_{μ} α$	Derived (consequence of H2)
Negative-energy spectrum	Derived (and motivates need for new input)
$ψ$ as operator field with equal-time canonical anticommutators $\{ ψ_{α} (x, t), ψ_{β}^{†} (y, t) \} = δ^{3} (x - y) δ_{α β}$	Postulate H3
Mode expansion of $\hat{ψ}$ with ladder anticommutators $\{ b, b^{†} \}, \{ d, d^{†} \}$ ; Fock space $F = ⨁_{n} H_{n}$	Derived from H3 + free Dirac equation
Antiparticles, positivity of energy, Pauli exclusion	Derived from H3
Mode expansion of $A_{μ}$ with bosonic commutators	Postulate (same flavor as H3)
Interaction Hamiltonian $H_{int} = j^{μ} A_{μ}$	Derived
S-matrix $S = T \exp (- i \int H_{int})$ , Dyson series, Feynman rules	Derived (definition + interaction-picture algebra)
Invariant amplitude $M$ via $⟨ f ∣ i T ∣ i ⟩ = (2 π)^{4} δ^{4} (\sum p) i M$	Definition (residue after stripping translation $δ^{4}$ )
$\overline{∣ M ∣^{2}}$ via Casimir trace technology, Mandelstam variables	Derived (algebra + completeness relations)
Cross sections $d σ = (1/ F) \overline{∣ M ∣^{2}} d Π_{n}$	Derived (modulo Born-rule postulate inherited from QM)

7. Comparison with the Modern Gauge-Theory Route

Aspect	Historical (this file)	Modern (QED)
Foundational postulates	H1 (first-order eq.) + H2 (minimal coupling) + H3 (anticommutator quantization)	$U (1)$ gauge invariance + renormalizability + Lorentz/ $CPT$
Role of gauge invariance	Derived consequence of H2	Postulate
Role of minimal coupling	Postulate H2	Theorem (forced by gauge invariance)
Role of anticommutators	Postulate H3	Theorem (spin–statistics)
Photon mass	Implicit, justified after the fact	Forbidden by gauge invariance
Positron	Postulated via Dirac sea, then re-derived from H3	Built into the Fock-space construction from the outset
Generalization to other gauge groups	Awkward — no obvious route to Yang–Mills	Direct — replace $U (1)$ by $S U (N)$ to obtain QCD, electroweak, ...
Pedagogical accessibility	High — builds on familiar single-particle QM	Lower — requires accepting that local symmetry is the right organizing principle

Both routes lead to the same Lagrangian

$L_{QED} = \bar{ψ} (i γ^{μ} D_{μ} - m) ψ - \frac{1}{4} F_{μν} F^{μν},$

the same Feynman rules, and the same physical predictions. The choice between them is a matter of pedagogical preference and conceptual framing, not physics.

8. A Brief Word on the Wigner / Weinberg Route

A third, more foundational approach (Weinberg's QFT Vol. 1) starts from the Wigner classification of single-particle states and derives QED from the consistency requirements of a Lorentz-invariant, cluster-decomposable $S$ -matrix. In this view the polarization vector $ϵ^{μ} (k)$ of a massless spin-1 particle does not transform as a true four-vector under Lorentz boosts; the residual transformation $ϵ^{μ} \to ϵ^{μ} + β k^{μ}$ must be a symmetry of the interaction, which is precisely electromagnetic gauge invariance. From this perspective gauge invariance is neither a derived consequence nor a postulate, but a theorem about consistent interactions of massless spin-1 particles.

youyuanwu