QED — Historical (Dirac-Equation) Route
This is the route by which Dirac, Heisenberg, Pauli, and Fermi originally arrived at QED, and the way most introductory texts (Bjorken–Drell, Sakurai's Advanced QM, Griffiths' particle book, the early chapters of Peskin–Schroeder) build it up. Compared with the modern gauge-theory derivation, it is conceptually simpler — it does not presuppose the gauge principle — but pedagogically backward: gauge invariance appears at the end as a consequence of minimal coupling, rather than at the beginning as a postulate.
To make the logical structure transparent, every step below is tagged as one of:
- (Postulate) — a primitive assumption of this formulation. These are the genuine physical inputs.
- (Definition) — a mathematical object or notation introduced for use later.
- (Derived) — a result that follows from the postulates plus standard mathematics or QM.
- (Heuristic) — a step originally presented as physically motivated guesswork; later cleaned up by deeper theory.
The historical route rests on three postulates:
| # | Postulate | Modern reinterpretation |
|---|---|---|
| H1 | Relativistic single-particle wave equation must be first-order in derivatives | A choice of representation; equivalent to picking the Dirac field |
| H2 | Minimal coupling | A theorem in the gauge-principle approach |
| H3 | Second quantization with anticommutators (after the wavefunction interpretation breaks down) | Forced by spin–statistics |
Everything else — including gauge invariance — is derived.
0. Preliminaries
The general mathematical machinery (Hilbert spaces, Lorentz/Poincaré groups, Lagrangian field theory, Fock space, mode expansions, regularization/renormalization) is collected in the QM and QFT preliminaries pages. This section covers only the prerequisite mathematics and classical physics specific to the wave-equation route to QED: gamma-matrix notation and the Lagrangian formulation of classical electromagnetism. Motivations, derived consequences, and the perturbative machinery used in §5 are deliberately not here — they appear next to the postulates or derivations that use them.
0.1 Gamma Matrices, Dirac Spinors, Notation
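The Dirac gamma matrices $\gamma^\mu$ ($\mu = 0, 1, 2, 3$) are $4\times 4$ matrices satisfying the Clifford algebra

$$\{\gamma^\mu, \gamma^\nu\} \equiv \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\eta^{\mu\nu}\,\mathbb{1}_4, \qquad \eta^{\mu\nu} = \mathrm{diag}(+1,-1,-1,-1).$$

This is the defining relation of the algebra $\mathrm{Cl}(1,3)$. For the abstract definition (any vector space with a quadratic form), classification (Bott periodicity), Spin groups, and spinor-representation theory in arbitrary dimensions, see math/clifford-algebra.md. For this document we just need the four matrices and the bilinear-covariant notation below.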
Common explicit representations:
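- Dirac (standard) representation: $\gamma^0 = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}$, with $\sigma^i$ the Pauli matrices. Convenient for the non-relativistic limit.
- Weyl (chiral) representation: $\gamma^0 = \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}$. Convenient for massless / high-energy limits and for separating left- and right-handed components.
- Majorana representation: all $\gamma^\mu$ purely imaginary; useful when discussing real (Majorana) fermion fields.
A Dirac spinor $\psi$ is a four-component object on which the $\gamma^\mu$ act. Useful definitions:
- Dirac adjoint: $\bar\psi \equiv \psi^\dagger\gamma^0$. The bilinear $\bar\psi\psi$ is a Lorentz scalar; $\bar\psi\gamma^\mu\psi$ is a vector; $\bar\psi\gamma^\mu\gamma^\nu\psi$ has scalar + antisymmetric tensor parts; etc.
- Feynman slash: $\not{a} \equiv \gamma^\mu a_\mu$ for any four-vector $a^\mu$. Then $\not{a}\,\not{a} = a^2 \equiv a_\mu a^\mu$.
- Chirality matrix: $\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3$, which anticommutes with all $\gamma^\mu$ and squares to $\mathbb{1}$. Projectors $P_{L,R} = \tfrac12(\mathbb{1} \mp \gamma^5)$ project onto left- and right-handed components.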
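The defining relations above can be checked mechanically. The following is a minimal sketch, assuming the Dirac representation and the metric signature $(+,-,-,-)$; the helper names (`block`, `clifford_holds`, `gamma5`) are illustrative and not part of any library.

```python
# Pauli matrices and 2x2 building blocks (plain nested lists, no dependencies).
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1, -1, -1, -1]  # diagonal of the assumed metric eta^{mu nu}


def neg(a):
    return [[-x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


def scaled_identity(c):
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]


# Dirac representation: gamma^0 = diag(1, -1), gamma^i = [[0, s_i], [-s_i, 0]].
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]


def clifford_holds(gammas):
    """True if {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all index pairs."""
    return all(
        anticommutator(gammas[m], gammas[n])
        == scaled_identity(2 * ETA[m] if m == n else 0)
        for m in range(4) for n in range(4)
    )


def gamma5(gammas):
    """gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3."""
    prod = gammas[0]
    for g in gammas[1:]:
        prod = matmul(prod, g)
    return [[1j * x for x in row] for row in prod]


G5 = gamma5(GAMMA)
print(clifford_holds(GAMMA))
print(matmul(G5, G5) == scaled_identity(1))
print(all(anticommutator(G5, g) == scaled_identity(0) for g in GAMMA))
```

All entries are in $\{0, \pm 1, \pm i\}$, so the comparisons are exact rather than tolerance-based. Swapping in the Weyl-representation matrices should pass the same checks.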
0.2 Classical Electromagnetism in Lagrangian Form
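Classical electromagnetism has the Lagrangian density

$$\mathcal{L}_{\rm EM} = -\tfrac14 F_{\mu\nu}F^{\mu\nu} - J^\mu A_\mu, \qquad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu,$$

whose Euler–Lagrange equations are the inhomogeneous Maxwell equations $\partial_\mu F^{\mu\nu} = J^\nu$. The four-potential $A_\mu$ is defined only up to a classical gauge transformation

$$A_\mu(x) \to A_\mu(x) + \partial_\mu\chi(x),$$

with $\chi(x)$ an arbitrary scalar; this leaves $F_{\mu\nu}$, and therefore the physical electric and magnetic fields, invariant. This redundancy is already present at the classical level, before any quantum mechanics.
The classical equation of motion for a point charge $e$ of mass $m$ in an external $A_\mu = (A^0, -\mathbf{A})$ follows from the Lagrangian $L = -m\sqrt{1 - \dot{\mathbf{x}}^2} - eA^0 + e\,\dot{\mathbf{x}}\cdot\mathbf{A}$ (natural units $\hbar = c = 1$), equivalent to the canonical momentum substitution

$$p^\mu \to p^\mu - eA^\mu.$$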
This is the classical input that Dirac borrowed when postulating minimal coupling (H2, §2.1 below).
1. The Dirac Equation
Scope of this route. Postulates H1+H2+H3 are sufficient to construct the QED Lagrangian, but they are not sufficient to derive the general Poincaré classification of particles. H1 only handles spin-$\tfrac12$, and only via the first-order-equation algebraic accident, so particles of other spins (scalar, vector, gravitino, graviton, ...) require separate analogous wave-equation postulates, case by case. The general classification, which says exactly which spins / helicities are possible and how they transform under the Poincaré group $\mathrm{ISO}(1,3)$, requires elevating the Poincaré group to a primitive symmetry acting on a Hilbert space and analyzing its unitary irreducible representations: the Wigner classification (1939). The historical route never performs representation theory of $\mathrm{ISO}(1,3)$, so this classification is strictly more general than what H1+H2+H3 can reach. See foundations-modern.md §1.1 — One-particle states (Theorem — Wigner classification) for the modern derivation, and QFT/preliminaries.md § Wigner's Classification for the standalone reference.
1.1 Motivation: Klein–Gordon and the Negative-Probability Problem
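The simplest relativistic wave equation is obtained by quantizing the relativistic dispersion $E^2 = \mathbf{p}^2 + m^2$ (natural units $\hbar = c = 1$) via $E \to i\partial_t$, $\mathbf{p} \to -i\nabla$:

$$(\Box + m^2)\,\phi(x) = 0, \qquad \Box \equiv \partial_\mu\partial^\mu = \partial_t^2 - \nabla^2.$$

It is Lorentz-invariant but second-order in time, which has two unwanted consequences for a single-particle wavefunction interpretation:
- The conserved current $j^\mu = i(\phi^*\partial^\mu\phi - \phi\,\partial^\mu\phi^*)$ has a density $j^0$ that can be negative, which is incompatible with a probability density.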
- It admits both positive- and negative-energy solutions with no obvious way to discard the negative-energy ones.
Both problems motivated Dirac to seek a first-order relativistic equation — the postulate H1 below.
(In modern QFT, KG is fine: it is the equation of motion of a free scalar field operator, and "negative probability" was a category error from insisting on a wavefunction interpretation.)
Other routes to Klein–Gordon. The canonical-substitution sketch above is the historical route ("Route A"). KG can equivalently be derived as the Euler–Lagrange equation of the free scalar Lagrangian (Lagrangian route, "Route B"), or as a theorem — the position-space form of the first-Poincaré-Casimir constraint on a spin-0 Wigner irrep (representation-theoretic route, "Route C"). See QFT/preliminaries.md § Worked example: free real scalar field and the Klein–Gordon equation for the three-route comparison. The first-order/probability-current concerns motivating H1 are specific to Route A; in the modern routes there is no such problem (KG is the EOM of a scalar field operator, never a single-particle wavefunction).
1.2 Demand a first-order relativistic wave equation (Postulate H1)
Seek a Lorentz-invariant wave equation for a single relativistic spin-$\tfrac12$ particle that is first-order in $\partial_\mu$, so as to admit a positive-definite probability current and avoid the negative-norm problems of §1.1.
This is a postulate in two senses: it picks first-order over second-order (a real physical assumption motivated by probability-current concerns), and it implicitly chooses the spinor representation of the Lorentz group over scalar or vector ones.
First quantization: the unstated state-space framework
"Single relativistic spin-$\tfrac12$ particle" is not a neutral phrase: it commits the historical narrative to the framework of first quantization without ever naming it. For symmetry with §3.3 (where second quantization is performed explicitly), it is worth stating what has been silently assumed.
First quantization here means: take a relativistic single particle and put it on the Hilbert space

$$\mathcal{H}_1 = L^2(\mathbb{R}^3) \otimes V,$$

a tensor product of two factors:
- $L^2(\mathbb{R}^3)$: the standard configuration-space Hilbert space of non-relativistic single-particle QM (square-integrable wavefunctions on space; see QM/preliminaries.md). The historical route carries this factor over by analogy when going relativistic; it is not itself derived from anything in this document.
- $V$: a finite-dimensional internal space carrying the spin / Lorentz-representation degrees of freedom of $\psi$. This is the new ingredient demanded by representing the Lorentz group on a single-particle wavefunction (a scalar wavefunction would take $V = \mathbb{C}$, carrying the trivial representation).
The dimension of $V$ is not fixed by H1 alone. It will be determined in §1.3 by the algebraic consequences of the first-order ansatz: requiring the operator $i\gamma^\mu\partial_\mu - m$ to square to the Klein–Gordon operator forces the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$, whose smallest faithful representation is by $4\times 4$ matrices, so $V = \mathbb{C}^4$ and

$$\mathcal{H}_1 = L^2(\mathbb{R}^3) \otimes \mathbb{C}^4.$$
For the rest of §1.2 the precise dimension does not matter; only the product structure does.
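The inner product is

$$\langle\psi|\phi\rangle = \int d^3x\,\psi^\dagger(\mathbf{x})\,\phi(\mathbf{x}) = \sum_{a=1}^{4}\int d^3x\,\psi_a^*(\mathbf{x})\,\phi_a(\mathbf{x}).$$

A state is a single vector $|\psi\rangle \in \mathcal{H}_1$, equivalently a wavefunction $\psi_a(\mathbf{x})$. Time evolution is governed (in the Schrödinger picture) by the Dirac Hamiltonian

$$H_D = \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m, \qquad \alpha^i \equiv \gamma^0\gamma^i, \quad \beta \equiv \gamma^0, \quad \mathbf{p} = -i\nabla,$$

via $i\partial_t\psi = H_D\psi$, which is just the Dirac equation in disguise.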
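That the first-order Hamiltonian reproduces the relativistic dispersion can be checked on a single momentum mode. This is a minimal sketch, assuming the Dirac representation (so $\alpha^i$ has the off-diagonal blocks $\sigma^i$ and $\beta = \mathrm{diag}(1,1,-1,-1)$) and an arbitrary illustrative choice $\mathbf{p} = (1,2,2)$, $m = 4$; the function names are hypothetical.

```python
# Pauli matrices as plain nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def dirac_hamiltonian(p, m):
    """Momentum-space H_D(p) = alpha . p + beta m in the Dirac representation."""
    sp = [[sum(p[i] * SIGMA[i][r][c] for i in range(3)) for c in range(2)]
          for r in range(2)]                      # sigma . p
    plus_m = [[m, 0], [0, m]]
    minus_m = [[-m, 0], [0, -m]]
    return block(plus_m, sp, sp, minus_m)


p, m = (1, 2, 2), 4                               # illustrative values, E = 5
H = dirac_hamiltonian(p, m)
energy_squared = sum(x * x for x in p) + m * m    # p^2 + m^2

is_hermitian = dagger(H) == H
squares_to_dispersion = matmul(H, H) == [
    [energy_squared if i == j else 0 for j in range(4)] for i in range(4)
]
print(is_hermitian, squares_to_dispersion)
```

Since $H_D^2 = (\mathbf{p}^2 + m^2)\,\mathbb{1}$, the eigenvalues of $H_D$ are $\pm\sqrt{\mathbf{p}^2 + m^2}$, each doubly degenerate: the sign ambiguity that §3.1 returns to.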
This is the framework that H1 (here) and H2 (§2.1) operate inside. Particle number is fixed at $N = 1$: there is no vacuum, no antiparticles in a clean way, and no operators that can change the particle count. Only at §3.3 (postulate H3) does this framework get replaced by Fock-space second quantization, in which $\psi$ becomes an operator on a multi-particle state space. The relationship and the comparison of pre- and post-quantization states is laid out in QFT/fock-space-inventory.md §0.
Terminology note. "First" and "second" quantization are unfortunate names — there is only one Hilbert-space construction (first quantization); the second step is really field quantization. The historical names have stuck.
Loose end: Lorentz covariance. The position-space wavefunction $\psi(\mathbf{x})$ singles out a preferred spatial slice (every frame has its own $\mathbb{R}^3$), and so this state space is not manifestly Lorentz-covariant. The genuinely covariant single-particle space is the momentum-space Wigner irrep, where Lorentz transformations act unitarily. The Fourier transform relating the two carries a measure mismatch that makes the position-space "wavefunction" not quite a probability amplitude in the relativistic sense, which is connected to the Newton–Wigner localization issues. The historical route ignores all of this; the modern treatment in foundations-modern.md § 1.1.3 — The natural basis: momentum and spin, not position addresses it head-on.
1.3 Derivation of the Clifford algebra and gamma matrices (Derived)
Writing the ansatz $(i\gamma^\mu\partial_\mu - m)\psi = 0$ and requiring that squaring the operator reproduce the Klein–Gordon equation forces

$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}\,\mathbb{1}.$$

Pure algebra then shows the smallest faithful representation is by $4\times 4$ matrices, so $\psi$ has four complex components. None of this is an additional input; it all follows from H1. The explicit matrix representations and bilinear notation are collected in §0.1.
The Clifford algebra abstractly. The relation above is the defining identity of the Clifford algebra $\mathrm{Cl}(1,3)$: the associative algebra generated by four vectors $e_\mu$ with $e_\mu e_\nu + e_\nu e_\mu = 2\eta_{\mu\nu}$. As a general construction (any vector space with a quadratic form), Clifford algebras subsume the complex numbers, quaternions, and exterior algebras as special cases; their classification (Bott periodicity), spinor representations, and connection to Spin groups (e.g. $\mathrm{Spin}(1,3) \cong SL(2,\mathbb{C})$, the universal cover used in the modern route) are collected in math/clifford-algebra.md. The collapsible block below specializes to this algebra in $d = 4$.
Why exactly 4 × 4? (counting argument + small-case obstructions)
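The claim "smallest faithful representation is $4\times 4$" deserves an argument. It follows from (i) the dimension of the Clifford algebra and (ii) a check that the smaller cases concretely fail. The general $d$-dimensional version of the result, and its connection to the Bott classification, is in math/clifford-algebra.md § 5.
Where the Clifford algebra comes from (recap). Squaring the Dirac operator:

$$(i\gamma^\nu\partial_\nu + m)(i\gamma^\mu\partial_\mu - m)\psi = -\left(\gamma^\nu\gamma^\mu\,\partial_\nu\partial_\mu + m^2\right)\psi = 0.$$

Because partial derivatives commute, we may symmetrize:

$$\gamma^\nu\gamma^\mu\,\partial_\nu\partial_\mu = \tfrac12\{\gamma^\mu, \gamma^\nu\}\,\partial_\mu\partial_\nu.$$

For this to equal $\Box = \eta^{\mu\nu}\partial_\mu\partial_\nu$, so that the second-order equation is exactly $(\Box + m^2)\psi = 0$, we need $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. So the $\gamma$'s cannot be numbers (numbers commute, but $\{\gamma^0, \gamma^1\} = 0$ forces non-commutation); they must be matrices.
Counting argument: $\dim_{\mathbb{C}} \mathrm{Cl}(1,3) = 2^4 = 16$. The Clifford algebra generated by four anticommuting symbols subject only to $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$ has a canonical basis of antisymmetric products: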
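| Length | Basis elements | Count |
|---|---|---|
| 0 | $\mathbb{1}$ | 1 |
| 1 | $\gamma^\mu$ | 4 |
| 2 | $\gamma^\mu\gamma^\nu$ ($\mu < \nu$) | 6 |
| 3 | $\gamma^\mu\gamma^\nu\gamma^\rho$ ($\mu < \nu < \rho$) | 4 |
| 4 | $\gamma^0\gamma^1\gamma^2\gamma^3$ | 1 |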
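Total $1 + 4 + 6 + 4 + 1 = 16 = 2^4$. A representation on $\mathbb{C}^n$ embeds the algebra in $M_n(\mathbb{C})$, which has dimension $n^2$. Faithfulness (injectivity) demands the image be 16-dimensional, hence

$$n^2 \ge 16 \quad\Longleftrightarrow\quad n \ge 4.$$

So $n = 1, 2, 3$ are ruled out purely by counting. But it is illuminating to see why the small cases fail concretely.
$n = 1$: numbers. Numbers commute, so $\{\gamma^0, \gamma^1\} = 0$ would force $\gamma^0\gamma^1 = 0$, which is not faithful.
$n = 2$: spatial gammas fit, $\gamma^0$ has no room. A tempting attempt in $M_2(\mathbb{C})$: set $\gamma^i = i\sigma^i$ for $i = 1, 2, 3$. Then

$$\{\gamma^i, \gamma^j\} = -\{\sigma^i, \sigma^j\} = -2\delta^{ij} = 2\eta^{ij},$$

matching the spacelike Clifford relations. The obstruction is $\gamma^0$: it must satisfy $(\gamma^0)^2 = +\mathbb{1}$ and $\{\gamma^0, \sigma^i\} = 0$ for all $i$. But:
Lemma. The only complex $2\times 2$ matrix $M$ with $\{M, \sigma^i\} = 0$ for all $i$ is $M = 0$.
Proof. $M_2(\mathbb{C})$ is spanned by $\{\mathbb{1}, \sigma^1, \sigma^2, \sigma^3\}$. Write $M = a\,\mathbb{1} + b_j\sigma^j$. Then $\{M, \sigma^i\} = 2a\,\sigma^i + 2b_i\,\mathbb{1} = 0$ forces $a = 0$ and all $b_i = 0$, so $M = 0$.
There is no nontrivial $\gamma^0$ in $M_2(\mathbb{C})$: the three spatial gammas already exhaust the "anticommuting room", and a fourth generator that anticommutes with all of them cannot exist. (This is the counting argument made concrete: $\dim \mathrm{Cl}(1,3) = 16$, but $\dim M_2(\mathbb{C}) = 4$, so 12 dimensions' worth of elements must collapse, and you can see which ones get squeezed out first.)
$n = 3$: ruled out by counting. $3^2 = 9 < 16$, so no faithful rep can exist. There is also no natural "Pauli-like" structure in $M_3(\mathbb{C})$ to attempt.
$n = 4$: works, and is unique. $4^2 = 16$, exactly the Clifford-algebra dimension. The general classification gives

$$\mathrm{Cl}(1,3) \otimes \mathbb{C} \cong M_4(\mathbb{C}),$$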
so a faithful 4-dimensional representation exists, is unique up to similarity transformations, and every element of the Clifford algebra is represented by a distinct matrix. The Dirac, Weyl, and Majorana representations in §0.1 are three concrete choices of similarity transform.
The 4 in $\mathbb{C}^4$ is therefore not a representational accident; it is the unique minimum dimension at which the algebraic defining relation admits a non-collapsing realization in $d = 4$ spacetime dimensions. For the general-$d$ pattern (Bott periodicity, when Majorana/Weyl/Majorana–Weyl spinors exist) see math/clifford-algebra.md § 5 and § 7.
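The counting argument can be made concrete numerically: the 16 ordered products of gamma matrices should be linearly independent as $4\times 4$ matrices. The sketch below, assuming the Dirac representation, checks this through the trace inner product $\mathrm{Tr}(A^\dagger B)$, whose Gram matrix should be $4\,\delta_{AB}$. The names `basis` and `gram` are illustrative.

```python
from itertools import combinations

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]


def neg(a):
    return [[-x for x in row] for row in a]


def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def product(mats):
    out = IDENTITY4
    for g in mats:
        out = matmul(out, g)
    return out


# Dirac representation (an assumed choice; any faithful one would do).
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

# One product per subset of {0,1,2,3}, indices in increasing order.
basis = [product([GAMMA[i] for i in idx])
         for k in range(5) for idx in combinations(range(4), k)]

# Gram matrix of the trace inner product Tr(A^dagger B).
gram = [[trace(matmul(dagger(a), b)) for b in basis] for a in basis]
is_orthogonal = gram == [[4 if i == j else 0 for j in range(16)]
                         for i in range(16)]
print(len(basis), is_orthogonal)
```

An orthogonal set of 16 nonzero elements in the 16-dimensional space $M_4(\mathbb{C})$ is a basis, so no element of the algebra collapses to zero in this representation.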
1.4 The Dirac equation (Derived)
The result of §1.2 + §1.3 is the Dirac equation:

$$(i\gamma^\mu\partial_\mu - m)\,\psi(x) = 0.$$

At this stage $\psi$ is treated as an ordinary relativistic wavefunction (a single-particle theory), not yet as a quantum field. Its plane-wave solutions (§1.5), conserved current $j^\mu = \bar\psi\gamma^\mu\psi$, and non-relativistic limit (the Pauli equation, with $g = 2$, §2.4) are all derived consequences.
1.5 Plane-wave solutions and completeness relations (Derived)
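Plugging the plane-wave ansatz $\psi(x) = u(p)\,e^{-ip\cdot x}$ into the free Dirac equation gives the algebraic equation $(\not{p} - m)\,u(p) = 0$, with two linearly independent positive-energy solutions $u^s(p)$ for $s = 1, 2$. Similarly, $\psi(x) = v(p)\,e^{+ip\cdot x}$ gives $(\not{p} + m)\,v(p) = 0$, with two negative-energy solutions $v^s(p)$ that, after second quantization (§3.3), describe antiparticles.
Standard normalization: $\bar u^r(p)\,u^s(p) = 2m\,\delta^{rs}$, $\bar v^r(p)\,v^s(p) = -2m\,\delta^{rs}$, and the spin sums

$$\sum_{s} u^s(p)\,\bar u^s(p) = \not{p} + m, \qquad \sum_{s} v^s(p)\,\bar v^s(p) = \not{p} - m.$$

These appear throughout Feynman-diagram calculations (§5) and are the engine behind Casimir's trick (§5.4.2). The negative-energy spectrum signaled by the $v$ family is the technical input to the consistency problem of §3.1, which forces the move to second quantization.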
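The normalization and the positive-energy spin sum can be verified on explicit spinors. The sketch below assumes the Dirac representation, the standard construction $u^s(p) = \big(\sqrt{E+m}\,\chi_s,\ \tfrac{\boldsymbol\sigma\cdot\mathbf{p}}{\sqrt{E+m}}\,\chi_s\big)$ with $\chi_s$ the two basis two-spinors, and illustrative values $\mathbf{p} = (1,2,2)$, $m = 4$. The function names are hypothetical.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
GAMMA0_DIAG = [1, 1, -1, -1]  # gamma^0 is diagonal in the Dirac representation


def sigma_dot(p):
    return [[sum(p[i] * SIGMA[i][r][c] for i in range(3)) for c in range(2)]
            for r in range(2)]


def energy(p, m):
    return math.sqrt(sum(x * x for x in p) + m * m)


def dirac_u(p, m, chi):
    """Positive-energy spinor u(p) built from a two-spinor chi."""
    norm = math.sqrt(energy(p, m) + m)
    sp = sigma_dot(p)
    lower = [sum(sp[r][c] * chi[c] for c in range(2)) / norm for r in range(2)]
    return [norm * c for c in chi] + lower


def u_ubar(u):
    """Outer product u ubar, with ubar = u^dagger gamma^0."""
    return [[u[i] * u[j].conjugate() * GAMMA0_DIAG[j] for j in range(4)]
            for i in range(4)]


def pslash_plus_m(p, m):
    """E gamma^0 - p . gamma + m as an explicit 4x4 matrix."""
    e, sp = energy(p, m), sigma_dot(p)
    top = [[e + m, 0] + [-x for x in sp[0]], [0, e + m] + [-x for x in sp[1]]]
    bottom = [list(sp[0]) + [m - e, 0], list(sp[1]) + [0, m - e]]
    return top + bottom


def spin_sum(p, m):
    total = [[0] * 4 for _ in range(4)]
    for chi in ([1, 0], [0, 1]):
        term = u_ubar(dirac_u(p, m, chi))
        total = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total


def max_abs_diff(a, b):
    return max(abs(x - y) for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


p, m = (1.0, 2.0, 2.0), 4.0   # illustrative values, E = 5
deviation = max_abs_diff(spin_sum(p, m), pslash_plus_m(p, m))
u_up = dirac_u(p, m, [1, 0])
ubar_u = sum(u_up[j].conjugate() * GAMMA0_DIAG[j] * u_up[j] for j in range(4))
print(deviation < 1e-12, abs(ubar_u - 2 * m) < 1e-12)
```

The comparison uses a small tolerance because the spinor components involve division by $\sqrt{E+m}$.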
2. Coupling to Electromagnetism
2.1 Minimal coupling (Postulate H2 — heuristic)
Borrow from classical electrodynamics (§0.2), where a charged particle in an external electromagnetic field obeys equations of motion with the canonical momentum replaced by the kinetic momentum:

$$p^\mu \to p^\mu - eA^\mu.$$

Apply this minimal substitution to the Dirac equation, using $p_\mu = i\partial_\mu$:

$$\left[\gamma^\mu\,(i\partial_\mu - eA_\mu) - m\right]\psi = 0.$$

This is the second postulate of the historical formulation. It is not derived; it is a physically motivated rule, justified after the fact by the consequences collected in §2.4: simplicity, the correct non-relativistic Pauli equation with $g = 2$, and the relativistic Lorentz force law.
In the modern gauge-principle derivation H2 becomes a theorem, not a postulate.
2.2 Definition of the covariant derivative (Definition)
Define $D_\mu \equiv \partial_\mu + ieA_\mu$. The minimally coupled Dirac equation is then

$$(i\gamma^\mu D_\mu - m)\,\psi = 0.$$
This is purely notation.
2.3 Gauge invariance (Derived)
Direct calculation shows that the simultaneous transformation

$$\psi(x) \to e^{-ie\chi(x)}\,\psi(x), \qquad A_\mu(x) \to A_\mu(x) + \partial_\mu\chi(x)$$

leaves the minimally coupled Dirac equation invariant. So gauge invariance is a derived consequence of postulate H2, exactly the inverse of the modern viewpoint, in which gauge invariance is the input and minimal coupling is the output. (See QED.)
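The direct calculation is short enough to sketch. It assumes the sign conventions $D_\mu = \partial_\mu + ieA_\mu$, $\psi \to \psi' = e^{-ie\chi}\psi$, and $A_\mu \to A'_\mu = A_\mu + \partial_\mu\chi$; other texts differ by the sign of $e$ or of $\chi$, which changes intermediate signs but not the conclusion.

```latex
\begin{align*}
D'_\mu \psi'
  &= \left(\partial_\mu + ieA_\mu + ie\,\partial_\mu\chi\right)\left(e^{-ie\chi}\psi\right) \\
  &= e^{-ie\chi}\left(\partial_\mu - ie\,\partial_\mu\chi + ieA_\mu + ie\,\partial_\mu\chi\right)\psi \\
  &= e^{-ie\chi}\, D_\mu \psi .
\end{align*}
```

The two $\partial_\mu\chi$ terms cancel, so $(i\gamma^\mu D_\mu - m)\psi$ acquires only the overall phase $e^{-ie\chi}$ and vanishes if and only if the untransformed expression does.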
2.4 Post-hoc justifications: Pauli equation, $g = 2$, Lorentz force (Derived)
H2 is a heuristic at the level of postulation, but several derived consequences vindicate it after the fact. These are consequences of the minimally coupled Dirac equation (§2.1), not independent postulates.
Non-relativistic limit: the Pauli equation. Block-diagonalising the minimally coupled Dirac equation in the standard representation and dropping $O(1/m^2)$ terms gives the Pauli equation

$$i\partial_t\varphi = \left[\frac{(\mathbf{p} - e\mathbf{A})^2}{2m} + eA^0 - \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}\right]\varphi$$

for the two-component non-relativistic spinor $\varphi$. The last term is a magnetic moment coupling with gyromagnetic ratio $g = 2$: the historically dramatic prediction that vindicated the Dirac equation, since experiment had observed $g \approx 2$ for the electron whereas the orbital value would be $g = 1$.
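The origin of the $g = 2$ term can be sketched with a single Pauli-matrix identity. The symbol $\boldsymbol{\pi} \equiv \mathbf{p} - e\mathbf{A}$ (kinetic momentum, with $\mathbf{p} = -i\nabla$) and the sign conventions are assumptions of this sketch; the large-component equation in the non-relativistic limit is taken to have kinetic term $(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2 / 2m$.

```latex
\begin{align*}
(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2
  &= \boldsymbol{\pi}^2 + i\,\boldsymbol{\sigma}\cdot(\boldsymbol{\pi}\times\boldsymbol{\pi}), \\
\boldsymbol{\pi}\times\boldsymbol{\pi}
  &= -e\,(\mathbf{p}\times\mathbf{A} + \mathbf{A}\times\mathbf{p})
   = ie\,\nabla\times\mathbf{A}
   = ie\,\mathbf{B}, \\
\frac{(\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2}{2m}
  &= \frac{(\mathbf{p} - e\mathbf{A})^2}{2m} - \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}.
\end{align*}
```

Comparing the last term with $-\boldsymbol{\mu}\cdot\mathbf{B}$, where $\boldsymbol{\mu} = g\,\frac{e}{2m}\,\mathbf{S}$ and $\mathbf{S} = \boldsymbol{\sigma}/2$, gives $g = 2$.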
Classical limit: the Lorentz force. In the WKB / classical limit one recovers the relativistic Lorentz force law $\dfrac{dp^\mu}{d\tau} = e\,F^{\mu\nu}u_\nu$, with $u^\nu = dx^\nu/d\tau$ the four-velocity.
Simplicity. Minimal substitution is the simplest Lorentz-invariant coupling of $\psi$ to $A_\mu$ at the operator level.
All three results were used historically as post hoc justification of H2.
3. From Wave Equation to Quantum Field
3.1 The negative-energy problem (Derived — and why a new postulate is needed)
The free Dirac equation admits both positive- and negative-energy plane-wave solutions:

$$E = \pm\sqrt{\mathbf{p}^2 + m^2}.$$
This is a calculation, not a postulate — but it shows that a single-particle interpretation is impossible (the spectrum is unbounded below). A new ingredient must be added.
3.2 The Dirac sea (Heuristic — historical)
Dirac's original resolution: postulate that all negative-energy states are filled, and identify a "hole" in the sea with a positive-energy antiparticle of opposite charge. This predicted the positron, discovered shortly after by Anderson (1932).
This is conceptually awkward (it relies on an infinite filled vacuum) and was eventually superseded.
3.3 Second quantization (Postulate H3)
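The Dirac sea is replaced by a sharper axiom on $\psi$ itself.
Postulate H3. Promote the Dirac wavefunction $\psi(x)$ to an operator-valued field on a Fock space, satisfying the equal-time canonical anticommutation relations

$$\{\psi_a(t,\mathbf{x}),\, \psi_b^\dagger(t,\mathbf{y})\} = \delta_{ab}\,\delta^{(3)}(\mathbf{x} - \mathbf{y}), \qquad \{\psi_a(t,\mathbf{x}),\, \psi_b(t,\mathbf{y})\} = \{\psi_a^\dagger(t,\mathbf{x}),\, \psi_b^\dagger(t,\mathbf{y})\} = 0,$$

with $a, b$ the Dirac-spinor indices.
Two things are postulated together here:
- The promotion of $\psi$ from c-number wavefunction to operator field.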
- The choice of anticommutators (Fermi statistics) over commutators.
Both are forced upon you in the modern view by the spin–statistics theorem, but that theorem was proved later (Pauli 1940). Historically, anticommutators were postulated to ensure positive-definite energy and the Pauli exclusion principle.
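Consequence: mode expansion. Solving the free Dirac equation as an operator equation, and using the plane-wave spinor basis from §1.5, fixes $\psi(x)$ uniquely up to the operator-valued coefficients of each mode:

$$\psi(x) = \int\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_{s=1,2}\left(b^s_{\mathbf{p}}\,u^s(p)\,e^{-ip\cdot x} + d^{s\dagger}_{\mathbf{p}}\,v^s(p)\,e^{+ip\cdot x}\right),$$

where $b^s_{\mathbf{p}}$ annihilates an electron and $d^{s\dagger}_{\mathbf{p}}$ creates a positron. Substituting this expansion into the postulated equal-time field anticommutators and using the spinor completeness relations of §1.5 reduces them to the ladder-operator anticommutators

$$\{b^r_{\mathbf{p}},\, b^{s\dagger}_{\mathbf{q}}\} = \{d^r_{\mathbf{p}},\, d^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\,\delta^{(3)}(\mathbf{p} - \mathbf{q})\,\delta^{rs},$$

with all other anticommutators vanishing.
So the ladder form is a consequence of H3 in the plane-wave basis, not an independent postulate. The two forms are equivalent (one is the spatial Fourier transform of the other).
Consequence: Fock space. Once $\psi$ is an operator with the ladder operators above, the natural state space they act on is no longer the single-particle $\mathcal{H}_1$ of §1.2, but the Fock space

$$\mathcal{F} = \bigoplus_{N=0}^{\infty}\mathcal{H}^{(N)}, \qquad \mathcal{H}^{(0)} = \mathbb{C}\,|0\rangle, \quad \mathcal{H}^{(N)} = \Lambda^N\mathcal{H}_1,$$

with $\mathcal{H}^{(0)}$ the one-dimensional vacuum sector and $\mathcal{H}^{(N)}$ the antisymmetrized $N$-particle sector built from $\mathcal{H}_1$. The vacuum $|0\rangle$ is defined by $b^s_{\mathbf{p}}|0\rangle = d^s_{\mathbf{p}}|0\rangle = 0$ for all $\mathbf{p}, s$, and arbitrary multi-particle (and multi-antiparticle) states are generated by acting with creation operators on $|0\rangle$. This is the replacement of first quantization promised in §1.2.
Terminology warning. Before H3, the symbol $\psi$ denoted a single-particle wavefunction (a state). After H3 it denotes a field operator on Fock space. Same symbol, completely different mathematical object. From here on, "the state" of the system is a vector in Fock space (typically the vacuum or an asymptotic Fock state); $\psi$ is one of the operators acting on that space. See QFT/preliminaries.md § States vs. Fields for an extended discussion.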
3.4 Antiparticles, vacuum, and positivity of energy (Derived)
Once H3 is in place, several historical puzzles resolve themselves automatically:
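- The vacuum $|0\rangle$ is annihilated by all $b^s_{\mathbf{p}}$ and $d^s_{\mathbf{p}}$.
- Negative-energy modes have been re-interpreted as positive-energy antiparticle creation; the Dirac sea disappears.
- The Hamiltonian $H = \int\frac{d^3p}{(2\pi)^3}\sum_s E_{\mathbf{p}}\left(b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}} + d^{s\dagger}_{\mathbf{p}}d^s_{\mathbf{p}}\right)$ is positive after normal-ordering.
- Pauli exclusion follows from $(b^{s\dagger}_{\mathbf{p}})^2 = 0$.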
None of this is an additional input.
4. Quantization of the Electromagnetic Field
4.1 Choice of gauge (Definition / convention)
Pick a gauge. For the historical treatment this is Coulomb gauge, $\nabla\cdot\mathbf{A} = 0$. This eliminates longitudinal and timelike components as non-dynamical (the latter by the Coulomb constraint) and leaves only the two transverse photon polarizations $\epsilon^\mu_\lambda(\mathbf{k})$, $\lambda = 1, 2$.
This is a convention, not a postulate — a different choice (Lorenz gauge with Gupta–Bleuler, or BRST) gives the same physics.
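4.2 Mode expansion of $A_\mu$ (Postulate — same flavor as H3, applied to bosons)
Postulate the bosonic mode expansion

$$A^\mu(x) = \int\frac{d^3k}{(2\pi)^3}\,\frac{1}{\sqrt{2\omega_{\mathbf{k}}}}\sum_{\lambda=1,2}\left(a^\lambda_{\mathbf{k}}\,\epsilon^\mu_\lambda(\mathbf{k})\,e^{-ik\cdot x} + a^{\lambda\dagger}_{\mathbf{k}}\,\epsilon^{\mu *}_\lambda(\mathbf{k})\,e^{+ik\cdot x}\right), \qquad \omega_{\mathbf{k}} = |\mathbf{k}|,$$

with bosonic commutators $[a^\lambda_{\mathbf{k}},\, a^{\lambda'\dagger}_{\mathbf{k}'}] = (2\pi)^3\,\delta^{(3)}(\mathbf{k} - \mathbf{k}')\,\delta^{\lambda\lambda'}$. Conceptually this is the same kind of postulate as H3 (promotion of a classical field to an operator with prescribed (anti)commutators), applied to a boson rather than a fermion.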
The instantaneous Coulomb interaction is reintroduced explicitly to compensate for the elimination of the timelike photon mode.
5. Dynamics and Predictions
5.0 Perturbative machinery: S-matrix, Dyson series, Wick's theorem (Generic QFT setup)
Why we need it. Postulates H1–H3 give us field equations and a Hilbert space, but no recipe for what an experimentalist measures. Real experiments prepare asymptotic states — well-separated wavepackets long before a collision — and detect asymptotic states long after. The S-matrix is the operator whose matrix elements encode exactly these probabilities, packaging all interaction effects into a single object. Cross sections, decay rates, and (via poles) bound-state energies are all built from S-matrix elements. We collect the generic construction here so that §5.2 can apply it to QED without distraction.
Definition. Split the Hamiltonian as $H = H_0 + H_{\rm int}$ with $H_0$ free. In the interaction picture, operators evolve under $H_0$ while states evolve under $H_I(t) \equiv e^{iH_0 t}\,H_{\rm int}\,e^{-iH_0 t}$. The S-matrix is the interaction-picture evolution operator from $t = -\infty$ to $t = +\infty$:

$$S \equiv U_I(+\infty, -\infty).$$

Sketch of derivation (Dyson series). Solving the interaction-picture Schrödinger equation $i\partial_t U_I(t, t_0) = H_I(t)\,U_I(t, t_0)$ iteratively, and noting that the time-ordering operator $T$ accounts for the non-commutativity of $H_I(t)$ at different times, gives

$$S = T\exp\left(-i\int_{-\infty}^{\infty} dt\, H_I(t)\right) = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int dt_1\cdots dt_n\; T\{H_I(t_1)\cdots H_I(t_n)\}.$$

This is the Dyson series: the order-by-order expansion of $S$ in powers of the coupling.
How each term is evaluated (Wick's theorem). Each Dyson term is a vacuum/in/out matrix element of a time-ordered product of free fields. Wick's theorem rewrites such a product as a sum of normal-ordered products with all possible pairwise contractions, where each contraction is by definition a vacuum-expectation time-ordered product, i.e. a Feynman propagator:

$$T\{\phi(x_1)\cdots\phi(x_n)\} = N\{\phi(x_1)\cdots\phi(x_n) + \text{all possible contractions}\}, \qquad \text{contraction of } \phi(x)\phi(y) \equiv \langle 0|T\{\phi(x)\phi(y)\}|0\rangle = D_F(x - y).$$

Dyson series + Wick's theorem together turn perturbation theory into a sum of Feynman diagrams: one diagram per contraction pattern, with vertices coming from $H_I$ and lines from the propagators. The QED-specific application is in §5.2; the full mechanical derivation of the momentum-space rules is left to a standard QFT textbook (Peskin–Schroeder Ch. 4, Schwartz Ch. 7, Srednicki Ch. 9).
Transition operator (preview). $S$ is the primary operator, defined directly as the interaction-picture evolution above. Almost every later use, however, separates out the trivial no-scattering piece by writing

$$S = \mathbb{1} + iT.$$

The transition operator $T$ (or $T$-matrix) is defined from $S$ this way; it carries no information $S$ doesn't already carry. Its purpose is purely bookkeeping: for $|f\rangle \neq |i\rangle$, the matrix element $\langle f|S|i\rangle = \langle f|iT|i\rangle$ contains only genuine interaction effects, with the trivial $\langle f|i\rangle$ removed. This becomes important in §5.3 when we extract the invariant amplitude $\mathcal{M}$ by stripping the spacetime-translation $\delta^{(4)}$ from $\langle f|iT|i\rangle$.
Notation collision. The symbol $T$ in this document plays two unrelated roles: the time-ordering operator appearing inside the Dyson formula above (an instruction to reorder operators by time), and the transition operator just defined (an actual operator on Fock space). Context disambiguates: time-ordering always sits in front of a product of operators, transition sits between bra and ket as $\langle f|iT|i\rangle$.
Caveat (Haag's theorem). The interaction picture does not strictly exist in interacting QFT; the construction above is a formal expansion, justified after the fact by renormalization. See QFT/remarks.md.
5.1 Interaction Hamiltonian (Derived)
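From the minimally coupled Lagrangian $\mathcal{L} = \bar\psi\,(i\gamma^\mu D_\mu - m)\,\psi - \tfrac14 F_{\mu\nu}F^{\mu\nu}$ (the Lagrangian whose Euler–Lagrange equation is the minimally coupled Dirac equation of §2.1) the interaction Hamiltonian is read off directly:

$$H_{\rm int} = \int d^3x\,\mathcal{H}_{\rm int}, \qquad \mathcal{H}_{\rm int} = -\mathcal{L}_{\rm int} = e\,\bar\psi\gamma^\mu\psi\,A_\mu.$$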
No new input.
5.2 The QED S-matrix and Feynman rules (Derived)
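Why we need it (here). §5.0 introduced the S-matrix in general; we now plug in the QED interaction Hamiltonian from §5.1 to get the explicit perturbative expansion for electron–photon processes. The end product is the QED Feynman rules: a graphical algorithm for computing any S-matrix element to any order in $e$. All of this is derivation, not postulation.
The QED Dyson series. Substituting $\mathcal{H}_{\rm int} = e\,\bar\psi\gamma^\mu\psi\,A_\mu$ into the general Dyson series of §5.0:

$$S = \sum_{n=0}^{\infty}\frac{(-ie)^n}{n!}\int d^4x_1\cdots d^4x_n\; T\left\{\big(\bar\psi\gamma^{\mu_1}\psi A_{\mu_1}\big)(x_1)\cdots\big(\bar\psi\gamma^{\mu_n}\psi A_{\mu_n}\big)(x_n)\right\}.$$

Each factor $-ie\,\bar\psi\gamma^\mu\psi\,A_\mu(x_j)$ in the integrand is a vertex contribution at one spacetime point.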
Sketch of derivation of the Feynman rules. The chain from this expression to the momentum-space rules table consists of three mechanical steps (extending the generic §5.0 Wick step to QED-specific bookkeeping):
1. Apply Wick's theorem to each order-$n$ time-ordered product, producing a sum of normal-ordered terms with all pairwise contractions of $\psi$, $\bar\psi$, and $A_\mu$. Surviving terms (after taking the $\langle f|\cdots|i\rangle$ matrix element) are those where the uncontracted fields exactly match the in/out particle content.
2. Diagrammatic bookkeeping. Identify each algebraic piece with a graphical element:
   - Each interaction factor $-ie\gamma^\mu$ at $x_j$ → one vertex with three lines (one electron in, one out, one photon).
   - Each contraction of $\psi(x)\bar\psi(y)$ → an internal electron line between $x$ and $y$, propagator $S_F(x - y)$.
   - Each contraction of $A_\mu(x)A_\nu(y)$ → an internal photon line, propagator $D_F^{\mu\nu}(x - y)$.
   - Uncontracted fields acting on $|i\rangle$ or $\langle f|$ → external line factors (spinors $u, \bar u, v, \bar v$ for fermions; polarization vectors $\epsilon_\mu, \epsilon^*_\mu$ for photons).
3. Position → momentum space. Fourier-transform every propagator and external line. Spacetime integrals over each vertex become momentum-conserving delta functions $(2\pi)^4\delta^{(4)}\big(\sum p_{\rm in} - \sum p_{\rm out}\big)$. Integrating away the internal delta functions leaves one overall delta function enforcing total energy–momentum conservation; the residue is the momentum-space Feynman rules dictionary:

| Diagrammatic element | Algebraic factor |
|---|---|
| Vertex | $-ie\gamma^\mu$ |
| Internal electron line, momentum $p$ | $\dfrac{i(\not{p} + m)}{p^2 - m^2 + i\epsilon}$ |
| Internal photon line, momentum $k$ (Feynman gauge) | $\dfrac{-i\eta_{\mu\nu}}{k^2 + i\epsilon}$ |
| External electron in / out | $u^s(p)$ / $\bar u^s(p)$ |
| External positron in / out | $\bar v^s(p)$ / $v^s(p)$ |
| External photon in / out | $\epsilon_\mu(k)$ / $\epsilon^*_\mu(k)$ |
| Closed fermion loop | factor of $-1$ and a trace |
| Symmetry factor | $1/S$ for diagrams with internal symmetry |

This is the same table referenced in QED Step 6 and applied in QED/compton.md to compute the Klein–Nishina cross section. The rules contain no new physics beyond $\mathcal{H}_{\rm int}$; they are a reorganization of perturbation theory into a graphical algorithm.
5.3 The Invariant Amplitude (Definition)
Why we need it. Raw S-matrix elements between momentum eigenstates always carry the overall delta function $(2\pi)^4\delta^{(4)}(p_f - p_i)$ from translation invariance, plus state-normalization factors $\sqrt{2E_{\mathbf{p}}}$ per external particle. Squaring such an element naively produces the meaningless $[\delta^{(4)}(p_f - p_i)]^2$. To extract a finite, Lorentz-invariant probability density usable in a cross-section formula we strip the delta function out by definition and work with the residue. That residue is what the Feynman rules of §5.2 actually compute; giving it a name turns the rules into a self-contained calculational object.
Definition. Recall from §5.0 that the trivial no-scattering piece of $S$ can be peeled off via $S = \mathbb{1} + iT$, where the transition operator $T$ encodes all genuine interaction effects. For an off-diagonal momentum-eigenstate matrix element of $iT$, factor the spacetime-translation delta function out:

$$\langle f|\,iT\,|i\rangle = (2\pi)^4\,\delta^{(4)}(p_i - p_f)\; i\mathcal{M}_{fi}.$$

The residue $\mathcal{M}_{fi}$ is the invariant amplitude (also called the matrix element, or just the M-matrix element). It is a Lorentz scalar (with possible spinor/polarization indices on the external states) and is finite at generic kinematics.
Notation note: $\mathcal{M}_{fi}$ vs. $\mathcal{M}$ vs. $\overline{|\mathcal{M}|^2}$. Three closely related symbols appear in the literature and in the rest of this doc:

| Symbol | Spins / polarizations | Typical use |
|---|---|---|
| $\mathcal{M}_{fi}$ | fixed, specific channel | definitions, polarized observables, probability of one transition |
| $\mathcal{M}$ | same as above, labels suppressed | shorthand when the channel is clear from context |
| $\overline{\lvert\mathcal{M}\rvert^2}$ | summed over final, averaged over initial | unpolarized cross sections; what Casimir's trick (§5.4.2) computes |

So $\mathcal{M} = \mathcal{M}_{fi}$ (notation only, same object), while $\overline{|\mathcal{M}|^2}$ is a different object: a sum over an ensemble of $|\mathcal{M}_{fi}|^2$ values. Below we keep the subscript $fi$ when the channel-specific meaning matters and drop it when it does not.
Sketch of derivation from $S$.
- Identity subtraction. Forward scattering trivially has $\langle i|\mathbb{1}|i\rangle \neq 0$; only the interacting part $iT$ produces transitions with $|f\rangle \neq |i\rangle$, which is why $T$ was introduced above.
- Translation invariance ⇒ overall $\delta^{(4)}$. The interaction Hamiltonian density is invariant under spacetime translations $x \to x + a$. Each Dyson term, between momentum eigenstates, can therefore be written as $e^{-i(p_i - p_f)\cdot X}$ times a function of relative coordinates, where $X$ is the centre of the diagram. Integrating over $X$ produces the universal factor $(2\pi)^4\delta^{(4)}(p_i - p_f)$. Defining $\mathcal{M}$ as the residue is the precise statement of step 3 in §5.2 ("after integrating away delta functions, read off the rules").
- Momentum-space Feynman rules compute $i\mathcal{M}$ directly. Each tabulated rule (vertex $-ie\gamma^\mu$, internal propagator, external spinor/polarization) is the contribution to $i\mathcal{M}$ from the corresponding diagrammatic element. Summing over all topologically distinct diagrams at order $n$ in $e$ gives $i\mathcal{M}^{(n)}$.
What you do with it. The whole point of $\mathcal{M}$ is that the physically observable quantity is $|\mathcal{M}|^2$, not $\mathcal{M}$ itself. The next subsection unpacks what that means and how it is computed.
5.4 The Squared Amplitude
Why it is the physical object. Quantum mechanics produces complex amplitudes; experiments measure probabilities. The Born rule (QM Postulate 3) supplies the bridge: $P(i \to f) = |\langle f|i\rangle|^2$. Applied to scattering with amplitude $\langle f|S|i\rangle$ and the delta-function-stripping of §5.3, the relevant probability density is

$$|\mathcal{M}_{fi}|^2 = \mathcal{M}_{fi}^*\,\mathcal{M}_{fi},$$

a real, non-negative, Lorentz-invariant function of the external momenta and spins/polarizations. (The $[\delta^{(4)}]^2$ subtlety produced by squaring the raw S-matrix element is what makes the $\mathcal{M}$-as-residue definition useful: $|\mathcal{M}_{fi}|^2$ has no leftover delta-function squared.)
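The full QED interaction probability density per pair, derived inside QFT/cross-sections.md §3, is (for two colliding particles of energies $E_1, E_2$ in a normalization volume $V$, per unit time $T$)

$$\frac{dP}{T} = \frac{1}{V}\,\frac{|\mathcal{M}_{fi}|^2}{4E_1E_2}\,d\Pi_{\rm LIPS}, \qquad d\Pi_{\rm LIPS} \equiv (2\pi)^4\,\delta^{(4)}\Big(p_1 + p_2 - \sum_f p_f\Big)\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f},$$

from which cross sections and decay rates are read off.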
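5.4.1 Spin and polarization sums: $\overline{|\mathcal{M}|^2}$
Real experiments rarely measure individual spin/polarization channels. Two operations bring $|\mathcal{M}_{fi}|^2$ to a directly comparable form:
- Average over initial-state spins/polarizations the experimenter does not control: divide by the multiplicity (e.g. a factor $\tfrac14 = \tfrac12\cdot\tfrac12$ for two spin-$\tfrac12$ particles, $\tfrac12$ per unpolarized photon).
- Sum over final-state spins/polarizations the detector does not resolve.
The overlined notation

$$\overline{|\mathcal{M}|^2} \equiv \frac{1}{N_{\rm in}}\sum_{\text{initial spins}}\ \sum_{\text{final spins}} |\mathcal{M}_{fi}|^2, \qquad N_{\rm in} = \text{number of initial spin/polarization states},$$

is universal in cross-section formulas. Polarized observables (e.g. spin asymmetries) keep specific spin labels instead and use $|\mathcal{M}_{fi}|^2$ unaveraged.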
5.4.2 Computational technology: Casimir's trick (trace technology)
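For a generic QED amplitude built from spinor bilinears, the channel-specific amplitude has the schematic form

$$\mathcal{M}_{fi} = \bar u^{s'}(p')\,\Gamma\,u^{s}(p),$$

where $\Gamma$ is some product of $\gamma$-matrices and propagators (and $s, s'$ here denote the specific spins in addition to the fixed external momenta). Taking the modulus squared and using the conjugation identity $(\bar u'\,\Gamma\,u)^* = \bar u\,\bar\Gamma\,u'$ (with $\bar\Gamma \equiv \gamma^0\Gamma^\dagger\gamma^0$):

$$|\mathcal{M}_{fi}|^2 = \left[\bar u^{s'}(p')\,\Gamma\,u^{s}(p)\right]\left[\bar u^{s}(p)\,\bar\Gamma\,u^{s'}(p')\right].$$

Summing over the initial- and final-state spins and using the completeness relations

$$\sum_s u^s(p)\,\bar u^s(p) = \not{p} + m, \qquad \sum_s v^s(p)\,\bar v^s(p) = \not{p} - m$$

collapses the explicit $u$-spinors into projectors, and the spinor matrix product becomes a trace over Dirac indices:

$$\sum_{s, s'} |\mathcal{M}_{fi}|^2 = \mathrm{Tr}\left[(\not{p}' + m)\,\Gamma\,(\not{p} + m)\,\bar\Gamma\right].$$

This is Casimir's trick (also called the Casimir trick / spin-sum-as-trace). The remaining computation is purely algebraic: apply the standard trace identities

$$\mathrm{Tr}[\mathbb{1}] = 4, \qquad \mathrm{Tr}[\gamma^\mu\gamma^\nu] = 4\eta^{\mu\nu}, \qquad \mathrm{Tr}[\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma] = 4\left(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho}\right),$$

traces of an odd number of $\gamma$'s vanish, etc., to reduce $\overline{|\mathcal{M}|^2}$ to a function of Lorentz-invariant dot products $p\cdot p'$, $p\cdot k$, ... (i.e. Mandelstam variables $s, t, u$). Photon polarization sums proceed analogously: $\sum_\lambda \epsilon^\mu_\lambda\,\epsilon^{\nu *}_\lambda \to -\eta^{\mu\nu}$ (in covariant gauges, with the Ward identity guaranteeing the longitudinal/timelike pieces drop out of physical amplitudes).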
The Klein–Nishina calculation in QED/compton.md is a worked example of this entire pipeline.
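The spin-sum-to-trace step can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Dirac representation, metric signature $(+,-,-,-)$, the normalization $\bar u u = 2m$, and takes $\Gamma = \gamma^\mu$ as the simplest vertex. The helper names (`slash`, `u_spinors`, `on_shell`) and the sample momenta are hypothetical choices made for illustration.

```python
import numpy as np

# Assumptions of this sketch: Dirac representation, metric (+,-,-,-), ubar u = 2m.
I2, Z2, I4 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex), np.eye(4, dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(p, q):
    """Minkowski product p.q for contravariant components."""
    return p @ metric @ q

def slash(p):
    """gamma^mu p_mu for contravariant components p^mu."""
    return sum(metric[mu, mu] * p[mu] * gamma[mu] for mu in range(4))

def on_shell(m, p3):
    """Four-momentum (E, px, py, pz) with E = sqrt(m^2 + |p|^2)."""
    p3 = np.asarray(p3, dtype=float)
    return np.concatenate([[np.sqrt(m**2 + p3 @ p3)], p3])

def u_spinors(p, m):
    """The two positive-energy spinors u^s(p), s = 1, 2."""
    E = p[0]
    sp = sum(p[i + 1] * sigma[i] for i in range(3))  # sigma . p
    basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
    return [np.sqrt(E + m) * np.concatenate([chi, sp @ chi / (E + m)]) for chi in basis]

def bar(u):
    """Dirac adjoint: u^dagger gamma^0."""
    return u.conj() @ gamma[0]

m = 0.511  # sample mass, arbitrary units
pa = on_shell(m, [0.3, -0.2, 0.7])
pb = on_shell(m, [-0.4, 0.1, 0.25])

# Tr[gamma^mu gamma^nu] = 4 g^{mu nu}
trace2_ok = all(np.isclose(np.trace(gamma[mu] @ gamma[nu]), 4 * metric[mu, nu])
                for mu in range(4) for nu in range(4))

# Completeness relation: sum_s u ubar = pslash + m
completeness_ok = np.allclose(sum(np.outer(u, bar(u)) for u in u_spinors(pa, m)),
                              slash(pa) + m * I4)

# Spin sum of [ubar(a) gamma^mu u(b)] [ubar(a) gamma^nu u(b)]^* computed three ways
L_spins = np.zeros((4, 4), dtype=complex)
L_trace = np.zeros((4, 4), dtype=complex)
L_closed = np.zeros((4, 4), dtype=complex)
for mu in range(4):
    for nu in range(4):
        L_spins[mu, nu] = sum((bar(ua) @ gamma[mu] @ ub) * np.conj(bar(ua) @ gamma[nu] @ ub)
                              for ua in u_spinors(pa, m) for ub in u_spinors(pb, m))
        L_trace[mu, nu] = np.trace(gamma[mu] @ (slash(pb) + m * I4)
                                   @ gamma[nu] @ (slash(pa) + m * I4))
        L_closed[mu, nu] = 4 * (pa[mu] * pb[nu] + pa[nu] * pb[mu]
                                - metric[mu, nu] * (dot(pa, pb) - m**2))

print(trace2_ok, completeness_ok,
      np.allclose(L_spins, L_trace), np.allclose(L_trace, L_closed))
```

The brute-force spin sum, the Dirac trace, and the closed form obtained from the two- and four-gamma trace identities should agree to floating-point precision; a mismatch usually signals a normalization or metric-sign inconsistency.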
5.4.3 What $|\mathcal{M}|^2$ means physically
- It is a probability density per phase-space point. Multiplied by the Lorentz-invariant phase-space element $d\Pi_n$ and integrated, it produces a probability rate (cross section or decay rate). The bare $|\mathcal{M}|^2$ has mass dimension $4 - 2n$ for $2 \to n$ scattering into an $n$-particle final state (Lorentz-invariant phase space carries the rest of the dimensions).
- It is Lorentz invariant. Both $|\mathcal{M}|^2$ and the relativistic-normalization phase space $d\Pi_n$ transform as Lorentz scalars, so $|\mathcal{M}|^2$ can be quoted in any frame and substituted unchanged.
- It encodes interference. When several diagrams contribute at the same order in $e$, $\mathcal{M} = \sum_i \mathcal{M}_i$ and $|\mathcal{M}|^2 = \sum_i |\mathcal{M}_i|^2 + \sum_{i \neq j} \mathcal{M}_i \mathcal{M}_j^*$. The cross terms are quantum-mechanical interference between Feynman diagrams: the same phenomenon that distinguishes Compton's $s$- and $u$-channel diagrams from an incoherent sum, or Bhabha's $s$- and $t$-channel diagrams.
- It must be gauge-invariant. Although individual diagrams in covariant gauges may depend on the gauge parameter, the sum contributing to $\mathcal{M}$ at fixed external states does not. This is enforced by the Ward identity $k_\mu \mathcal{M}^\mu = 0$ on amplitudes with an external photon of momentum $k$.
- Crossing symmetry. The same analytic function $\mathcal{M}(s, t, u)$ describes processes related by moving particles between initial and final states (e.g. $e^+ e^- \to \gamma\gamma$ vs. Compton $e^- \gamma \to e^- \gamma$); $|\mathcal{M}|^2$ inherits this, with kinematic re-labeling.
- Optical theorem. $2\,\mathrm{Im}\,\mathcal{M}(i \to i) = \sum_f \int d\Pi_f\, |\mathcal{M}(i \to f)|^2$ (S-matrix unitarity $S^\dagger S = 1$, sandwiched between $\langle i|$ and $|i\rangle$, after the split $S = 1 + iT$); see QFT/cross-sections.md § Optical Theorem for the derivation.
- Non-relativistic limit. In the kinematic regime where particles are slow and external lines reduce to non-relativistic wavefunctions, $\mathcal{M} \propto \langle f|V|i\rangle$, where $\langle f|V|i\rangle$ is the matrix element of a non-relativistic scattering potential, and the rate formula reduces to Fermi's Golden Rule $\Gamma_{i \to f} = 2\pi\, |\langle f|V|i\rangle|^2\, \rho(E_f)$ (in units with $\hbar = 1$). The QFT formula and Fermi's Golden Rule are the same Born-rule statement at different levels of relativistic completeness.
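The Lorentz-invariance statement in the list above can be illustrated with the Mandelstam variables, which are the only kinematic quantities a spin-summed $2 \to 2$ amplitude can depend on. The following sketch assumes elastic equal-mass scattering set up in the centre-of-mass frame and a boost along $z$; the function names and numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

metric = np.diag([1.0, -1.0, -1.0, -1.0])  # signature (+,-,-,-), an assumption of this sketch

def dot(p, q):
    """Minkowski product of two four-vectors."""
    return p @ metric @ q

def boost_z(beta):
    """Lorentz boost matrix along z with velocity beta (c = 1)."""
    g = 1.0 / np.sqrt(1.0 - beta**2)
    L = np.eye(4)
    L[0, 0] = L[3, 3] = g
    L[0, 3] = L[3, 0] = g * beta
    return L

def mandelstam(p1, p2, p3, p4):
    """s, t, u for the process 1 + 2 -> 3 + 4."""
    s = dot(p1 + p2, p1 + p2)
    t = dot(p1 - p3, p1 - p3)
    u = dot(p1 - p4, p1 - p4)
    return np.array([s, t, u])

def elastic_cm(m, p, theta):
    """Equal-mass elastic 2 -> 2 momenta in the CM frame, scattering angle theta."""
    E = np.sqrt(m**2 + p**2)
    p1 = np.array([E, 0.0, 0.0, p])
    p2 = np.array([E, 0.0, 0.0, -p])
    p3 = np.array([E, p * np.sin(theta), 0.0, p * np.cos(theta)])
    p4 = np.array([E, -p * np.sin(theta), 0.0, -p * np.cos(theta)])
    return p1, p2, p3, p4

m, p, theta = 0.511, 2.0, 0.7
momenta_cm = elastic_cm(m, p, theta)
momenta_lab = tuple(boost_z(0.6) @ q for q in momenta_cm)

stu_cm = mandelstam(*momenta_cm)
stu_lab = mandelstam(*momenta_lab)

print(stu_cm, stu_lab)       # same values in both frames
print(stu_cm.sum(), 4 * m**2)  # s + t + u equals the sum of squared masses
```

Because $s$, $t$, $u$ come out identical in both frames, any function of them, including $\overline{|\mathcal{M}|^2}$, can be evaluated in whichever frame is most convenient.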
5.4.4 Worked-example pipeline
For any QED process, the calculation chain is:
1. Draw all Feynman diagrams at the desired order in $e$.
2. Apply momentum-space Feynman rules (§5.2 table) to write $\mathcal{M}$ as a sum of diagram contributions.
3. Take $|\mathcal{M}|^2$, expanding interference terms.
4. Sum/average over spins and polarizations using completeness relations → $\overline{|\mathcal{M}|^2}$ as a Dirac trace.
5. Evaluate the trace with $\gamma$-matrix identities, expressing the result in Mandelstam variables.
6. Plug into $d\sigma$ (or $d\Gamma$) and integrate over the desired phase-space region.
Steps 1–2 are diagrammatic; steps 3–5 are algebra; step 6 is kinematic integration. The Klein–Nishina cross section (QED/compton.md) walks through all six steps explicitly.
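The last step of the chain can be sketched numerically. The example below does not reproduce the Klein–Nishina calculation; as an assumption for illustration it takes the standard textbook result for unpolarized $e^+ e^- \to \mu^+ \mu^-$ in the massless limit, $\overline{|\mathcal{M}|^2} = 2 e^4 (t^2 + u^2)/s^2$, as the output of steps 1–5, then applies the massless $2 \to 2$ centre-of-mass formula $d\sigma/d\Omega = \overline{|\mathcal{M}|^2}/(64\pi^2 s)$ and integrates over angles. Function names and the chosen energy are hypothetical.

```python
import numpy as np

ALPHA = 1 / 137.035999  # fine-structure constant (low-energy value, approximate)

def msq_avg(s, t, u, alpha=ALPHA):
    """Spin-averaged |M|^2 for e+ e- -> mu+ mu- with all masses neglected."""
    e2 = 4 * np.pi * alpha  # e^2 in natural units
    return 2 * e2**2 * (t**2 + u**2) / s**2

def dsigma_domega(s, cos_theta, alpha=ALPHA):
    """CM-frame differential cross section for massless 2 -> 2 kinematics."""
    t = -0.5 * s * (1 - cos_theta)
    u = -0.5 * s * (1 + cos_theta)
    return msq_avg(s, t, u, alpha) / (64 * np.pi**2 * s)

def total_cross_section(s, n=2001, alpha=ALPHA):
    """Integrate over solid angle: 2*pi from azimuth, trapezoid rule in cos(theta)."""
    c = np.linspace(-1.0, 1.0, n)
    f = dsigma_domega(s, c, alpha)
    dc = c[1] - c[0]
    integral = dc * (f.sum() - 0.5 * (f[0] + f[-1]))
    return 2 * np.pi * integral

s = 10.0**2                                   # (10 GeV)^2, arbitrary sample energy
sigma_num = total_cross_section(s)            # in GeV^-2 (natural units)
sigma_exact = 4 * np.pi * ALPHA**2 / (3 * s)  # known closed form

print(sigma_num, sigma_exact)
```

The numerical integral should reproduce the closed form $\sigma = 4\pi\alpha^2/(3s)$ up to the small trapezoid-rule error, which is a useful sanity check on the flux and phase-space factors before tackling a process with non-trivial masses.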
5.5 Renormalization (Pragmatic procedure)
Loop integrals diverge; regularize them and absorb the divergences into multiplicative redefinitions of $\psi$, $A_\mu$, $m$, $e$. The Ward–Takahashi identities ($Z_1 = Z_2$) follow from the gauge invariance derived in §2.3. Renormalization is not strictly a postulate, but it relies on QED being renormalizable, a property that was used in practice long before it was proved in general (BPHZ).
6. Summary: Postulates vs. Derived Results
| Item | Status |
|---|---|
| First-order relativistic wave equation | Postulate H1 |
| Clifford algebra, 4-component spinors | Derived from H1 |
| Free Dirac equation | Derived from H1 |
| Conserved current, plane-wave solutions, non-rel. limit | Derived |
| Minimal coupling | Postulate H2 (heuristic) |
| Covariant derivative | Definition |
| Gauge invariance (local phase rotation of $\psi$ with the compensating shift of $A_\mu$) | Derived (consequence of H2) |
| Negative-energy spectrum | Derived (and motivates need for new input) |
| $\psi$ as operator field with equal-time canonical anticommutators | Postulate H3 |
| Mode expansion of $\psi$ with ladder-operator anticommutators; Fock space | Derived from H3 + free Dirac equation |
| Antiparticles, positivity of energy, Pauli exclusion | Derived from H3 |
| Mode expansion of $A_\mu$ with bosonic commutators | Postulate (same flavor as H3) |
| Interaction Hamiltonian | Derived |
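| S-matrix $S = T\exp\!\left(-i\int d^4x\, \mathcal{H}_I\right)$, Dyson series, Feynman rules | Derived (definition + interaction-picture algebra) |
| Invariant amplitude $\mathcal{M}$ via $\langle f \lvert (S - 1) \rvert i \rangle \propto (2\pi)^4 \delta^4(p_f - p_i)\, \mathcal{M}$ | Definition (residue after stripping the translation-invariance $\delta^4$) |
| $\overline{\lvert\mathcal{M}\rvert^2}$ via Casimir trace technology, Mandelstam variables | Derived (algebra + completeness relations) |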
| Cross sections | Derived (modulo Born-rule postulate inherited from QM) |
7. Comparison with the Modern Gauge-Theory Route
| Aspect | Historical (this file) | Modern (QED) |
|---|---|---|
| Foundational postulates | H1 (first-order eq.) + H2 (minimal coupling) + H3 (anticommutator quantization) | $U(1)$ gauge invariance + renormalizability + Lorentz/ |
| Role of gauge invariance | Derived consequence of H2 | Postulate |
| Role of minimal coupling | Postulate H2 | Theorem (forced by gauge invariance) |
| Role of anticommutators | Postulate H3 | Theorem (spin–statistics) |
| Photon mass | Implicit, justified after the fact | Forbidden by gauge invariance |
| Positron | Postulated via Dirac sea, then re-derived from H3 | Built into the Fock-space construction from the outset |
| Generalization to other gauge groups | Awkward: no obvious route to Yang–Mills | Direct: replace $U(1)$ by a non-abelian group such as $SU(3)$ or $SU(2) \times U(1)$ to obtain QCD, electroweak, ... |
| Pedagogical accessibility | High — builds on familiar single-particle QM | Lower — requires accepting that local symmetry is the right organizing principle |
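Both routes lead to the same Lagrangian

$$\mathcal{L}_{\text{QED}} = \bar{\psi}\,(i\gamma^\mu D_\mu - m)\,\psi - \tfrac{1}{4} F_{\mu\nu} F^{\mu\nu},$$

the same Feynman rules, and the same physical predictions. The choice between them is a matter of pedagogical preference and conceptual framing, not physics.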
8. A Brief Word on the Wigner / Weinberg Route
A third, more foundational approach (Weinberg's QFT Vol. 1) starts from the Wigner classification of single-particle states and derives QED from the consistency requirements of a Lorentz-invariant, cluster-decomposable $S$-matrix. In this view the polarization vector of a massless spin-1 particle does not transform as a true four-vector under Lorentz boosts; the residual transformation must be a symmetry of the interaction, which is precisely electromagnetic gauge invariance. From this perspective gauge invariance is neither a derived consequence nor a postulate, but a theorem about consistent interactions of massless spin-1 particles.