Home
Docs and works of youyuanwu.
Notes in this book
- Mathematics — group theory, Clifford algebras, topology and smooth manifolds.
- Physics — quantum mechanics and quantum field theory (postulates, Fock space, observables, QED / QCD / electroweak / Standard Model).
External projects
- Quantum Computing — qcfront
Mathematics
Reference notes on the mathematical structures used throughout the physics sections. These pages are companion material — physics docs link in for definitions and structure theorems, and the math pages stay self-contained and general.
Contents
- Group Theory: Definitions and Axioms. Covers group axioms, Lie groups and Lie algebras, representations, and the universal cover; the foundation behind the Lorentz, Poincaré, and internal symmetry groups ($U(1)$, $SU(2)$, $SU(3)$) used in QFT/preliminaries.md.
- Clifford Algebras. Covers the abstract definition and structure theory of $\mathrm{Cl}(V, q)$, the gamma-matrix and Dirac-spinor algebra underlying relativistic-fermion fields; supplies the algebra used by QED/historical.md.
- Topology, Manifolds, and Smooth Structure. Covers point-set topology, smooth manifolds, tangent spaces, and smooth maps; the prerequisites that physics texts typically take for granted.
Reading order
The dependency arrows run

Topology, Manifolds, and Smooth Structure → Group Theory → Clifford Algebras

so newcomers should read in that order. Readers who only need a specific definition can jump in anywhere, since each page is cross-linked to its prerequisites.
Group Theory: Definitions and Axioms
The Lorentz, Poincaré, and internal symmetry groups (e.g. $U(1)$, $SU(2)$, $SU(3)$) used throughout physics are all instances of Lie groups with associated Lie algebras and representations. This document collects the abstract definitions; specific examples appear in the next section. It is intended as a reference companion to QFT/preliminaries.md, which uses these definitions to set up the Lorentz and Poincaré groups and Wigner's classification.
For the manifold/topology prerequisites referenced below — manifold, tangent space, smooth map — see topology-manifolds.md.
Group axioms
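A group is a set $G$ together with a binary operation $\cdot : G \times G \to G$ satisfying:
- (G1) Closure: $a \cdot b \in G$ for all $a, b \in G$.
- (G2) Associativity: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$ for all $a, b, c \in G$.
- (G3) Identity: there exists $e \in G$ such that $e \cdot a = a \cdot e = a$ for all $a \in G$.
- (G4) Inverses: for every $a \in G$ there exists $a^{-1} \in G$ such that $a \cdot a^{-1} = a^{-1} \cdot a = e$.
If additionally $a \cdot b = b \cdot a$ for all elements, $G$ is abelian (commutative).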
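For a finite set the four axioms can be checked by brute force. The following is a minimal sketch, not part of the original notes; the helper names `is_group` and `is_abelian` are illustrative, and the two test operations (addition and multiplication modulo 6) are chosen only as examples.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of (G1)-(G4) for a finite set with binary operation op."""
    elements = list(elements)
    # (G1) closure: every product lands back in the set
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # (G2) associativity
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # (G3) identity: look for a two-sided neutral element
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # (G4) inverses: every element needs a two-sided inverse
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

def is_abelian(elements, op):
    """Check commutativity of op on a finite set."""
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))

Z6 = range(6)
add_mod6 = lambda a, b: (a + b) % 6
mul_mod6 = lambda a, b: (a * b) % 6

print(is_group(Z6, add_mod6), is_abelian(Z6, add_mod6))
print(is_group(Z6, mul_mod6))  # fails (G4): 0 has no multiplicative inverse
```

Addition modulo 6 passes all four checks, while multiplication modulo 6 is closed, associative, and has an identity but fails on inverses.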
Subgroups, normal subgroups, quotients
- $H \subseteq G$ is a subgroup ($H \le G$) if $H$ is itself a group under the inherited operation.
- For $g \in G$ and $H \le G$, the conjugate of $H$ by $g$ is the set $gHg^{-1} = \{ghg^{-1} : h \in H\}$ obtained by sandwiching every element of $H$ between $g$ and $g^{-1}$. It is itself a subgroup of $G$, isomorphic to $H$.
- The left coset of $H$ by $g$ is the set $gH = \{gh : h \in H\}$; the right coset is $Hg = \{hg : h \in H\}$. Cosets partition $G$ into disjoint pieces of equal size.
- $N$ is a normal subgroup ($N \trianglelefteq G$) if $gNg^{-1} = N$ for all $g \in G$; equivalently, every left coset equals the corresponding right coset ($gN = Ng$). Normal subgroups are exactly the kernels of group homomorphisms.
- For $N \trianglelefteq G$, the quotient group $G/N$ has elements the cosets $gN$, with multiplication $(g_1 N)(g_2 N) = (g_1 g_2) N$. (Normality is what makes this product well-defined independently of which representatives are chosen.)
Example: in $SU(2)$, the centre $\{\pm I\} \cong \mathbb{Z}_2$ is a normal subgroup; the quotient $SU(2)/\mathbb{Z}_2 \cong SO(3)$.
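To make cosets and normality concrete, here is a small sketch on the permutation group $S_3$, chosen only because it is the smallest non-abelian group. Permutations are stored as tuples, and the helpers `compose`, `inverse`, `left_cosets`, and `is_normal` are illustrative names, not taken from the notes.

```python
from itertools import permutations

def compose(p, q):
    """Composition (p o q)(i) = p[q[i]] for permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse permutation: inv[p[i]] = i."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

S3 = list(permutations(range(3)))

def left_cosets(G, H):
    """Set of distinct left cosets gH."""
    return {frozenset(compose(g, h) for h in H) for g in G}

def is_normal(G, H):
    """Check g H g^{-1} = H for every g in G."""
    Hset = set(H)
    return all({compose(compose(g, h), inverse(g)) for h in H} == Hset for g in G)

A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # identity and the two 3-cycles
H2 = [(0, 1, 2), (1, 0, 2)]             # identity and one transposition

print(len(left_cosets(S3, A3)), is_normal(S3, A3))
print(len(left_cosets(S3, H2)), is_normal(S3, H2))
```

The subgroup of 3-cycles has two cosets and is normal, so the quotient is a group of order 2. The two-element subgroup has three cosets of equal size but is not normal, so its cosets do not inherit a group structure.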
Homomorphisms and isomorphisms
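A homomorphism is a map $\phi : G \to H$ preserving the group operation: $\phi(ab) = \phi(a)\phi(b)$. It automatically satisfies $\phi(e_G) = e_H$ and $\phi(a^{-1}) = \phi(a)^{-1}$. A bijective homomorphism is an isomorphism, written $G \cong H$.
The kernel of $\phi$ is the preimage of the identity:
$$\ker\phi = \{ g \in G : \phi(g) = e_H \}.$$
It is the set of elements that $\phi$ collapses to the identity, and is always a normal subgroup of $G$. Properties:
- $\phi$ is injective iff $\ker\phi = \{e_G\}$ (only the identity gets sent to the identity).
- The image $\operatorname{im}\phi = \phi(G)$ is a subgroup of $H$ (not necessarily normal).
- First isomorphism theorem: $G/\ker\phi \cong \operatorname{im}\phi$.
Example: the determinant $\det : GL(n, \mathbb{C}) \to \mathbb{C}^\times$ is a homomorphism with kernel $SL(n, \mathbb{C})$ (matrices of determinant 1). The covering map $SU(2) \to SO(3)$ has kernel $\{\pm I\}$.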
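The kernel, image, and the counting consequence of the first isomorphism theorem can be checked on a small finite example. This sketch uses reduction modulo 4 as a map from $\mathbb{Z}_{12}$ to $\mathbb{Z}_4$; the choice of groups is an illustrative assumption, not an example from the notes.

```python
n, m = 12, 4
G = range(n)

def phi(x):
    """Reduction mod 4, a homomorphism (Z_12, +) -> (Z_4, +) because 4 divides 12."""
    return x % m

# Homomorphism property: phi(a + b) = phi(a) + phi(b)
is_hom = all(phi((a + b) % n) == (phi(a) + phi(b)) % m for a in G for b in G)

kernel = [x for x in G if phi(x) == 0]   # preimage of the identity
image = sorted({phi(x) for x in G})

print(is_hom, kernel, image)
# Counting form of the first isomorphism theorem: |G| / |ker| = |im|
print(len(G) // len(kernel) == len(image))
```

The kernel has more than one element, so the map is not injective, and the quotient by the kernel has exactly as many elements as the image.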
Group actions and representations
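A (left) group action of $G$ on a set $X$ is a map $G \times X \to X$, $(g, x) \mapsto g \cdot x$, satisfying $e \cdot x = x$ and $g \cdot (h \cdot x) = (gh) \cdot x$.
A representation of $G$ on a vector space $V$ is a homomorphism $\rho : G \to GL(V)$ (the group of invertible linear maps on $V$). Equivalently, $G$ acts linearly on $V$:
$$g \cdot v = \rho(g)\,v, \qquad \rho(gh) = \rho(g)\rho(h).$$
- $\rho$ is unitary if $V$ has an inner product preserved by $\rho(g)$ for all $g$ (i.e. $\rho(g)^\dagger \rho(g) = 1$). Unitary representations are what physical Hilbert-space transformations must be (probabilities are conserved).
- $\rho$ is irreducible (irrep) if no proper non-trivial subspace is invariant under all $\rho(g)$. Irreps are the elementary building blocks: every unitary representation, and hence every finite-dimensional representation of a finite or compact group, decomposes as a direct sum of irreps (complete reducibility). Schur's lemma then says that any operator commuting with an irrep is a multiple of the identity.
- A projective representation satisfies $\rho(g)\rho(h) = e^{i\theta(g,h)} \rho(gh)$ for some phase $\theta(g,h)$. Quantum mechanics uses projective reps because physical states are rays, not vectors. When the Lie algebra has no non-trivial central extensions (as for semisimple groups and the Poincaré group), projective reps of $G$ correspond to ordinary reps of the universal cover $\tilde{G}$ (this is why $SL(2, \mathbb{C})$ rather than $SO^+(1,3)$ shows up in physics; see QFT/preliminaries.md § Universal cover).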
Standard matrix groups (notation)
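The following classical matrix Lie groups appear throughout this repository and in the rest of the QFT literature. Let $\mathbb{F}$ denote either $\mathbb{R}$ or $\mathbb{C}$.
| Symbol | Name | Definition | Dimension over $\mathbb{R}$ |
|---|---|---|---|
| $GL(n, \mathbb{F})$ | General linear group | invertible $n \times n$ matrices over $\mathbb{F}$ (equivalently, $\det A \neq 0$) | $n^2$ ($\mathbb{F} = \mathbb{R}$) or $2n^2$ ($\mathbb{F} = \mathbb{C}$) |
| $SL(n, \mathbb{F})$ | Special linear group | $\{A \in GL(n, \mathbb{F}) : \det A = 1\}$ | $n^2 - 1$ or $2(n^2 - 1)$ |
| $O(n)$ | Orthogonal group | $\{A \in GL(n, \mathbb{R}) : A^T A = I\}$ (preserves Euclidean inner product) | $n(n-1)/2$ |
| $SO(n)$ | Special orthogonal group | $\{A \in O(n) : \det A = 1\}$ (rotations: orientation-preserving orthogonal) | $n(n-1)/2$ |
| $O(p, q)$ | Indefinite orthogonal group | preserves a metric of signature $(p, q)$; e.g. $O(1,3)$ is the Lorentz group | $n(n-1)/2$ with $n = p + q$ |
| $SO^+(p, q)$ | Proper orthochronous component | identity component of $O(p, q)$ | same |
| $U(n)$ | Unitary group | $\{A \in GL(n, \mathbb{C}) : A^\dagger A = I\}$ (preserves Hermitian inner product) | $n^2$ |
| $SU(n)$ | Special unitary group | $\{A \in U(n) : \det A = 1\}$ | $n^2 - 1$ |
| $Sp(2n, \mathbb{R})$ | Symplectic group | preserves a symplectic form $\omega$ | $n(2n+1)$ |
| $Spin(n)$ | Spin group | universal double cover of $SO(n)$ for $n \geq 3$ | $n(n-1)/2$ |
Inclusion lattice (for the most-used cases): $SO(n) \subset O(n) \subset GL(n, \mathbb{R})$, and $SU(n) \subset U(n) \subset GL(n, \mathbb{C})$, with $SL$, $SO$, $SU$ being the determinant-1 subgroups.
Compactness summary (matters for representation theory, see Compact vs. non-compact below):
- Compact: $O(n)$, $SO(n)$, $U(n)$, $SU(n)$, $Spin(n)$.
- Non-compact: $GL(n, \mathbb{F})$, $SL(n, \mathbb{F})$ for $n \geq 2$, $O(p, q)$ with $p, q \geq 1$, $Sp(2n, \mathbb{R})$.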
Lie groups and Lie algebras
A Lie group is a group $G$ that is also a smooth manifold, with the group operations $(g, h) \mapsto gh$ and $g \mapsto g^{-1}$ being smooth maps. (Here "smooth" means $C^\infty$: infinitely differentiable in local coordinates; for matrix Lie groups it just means the matrix entries depend smoothly on parameters. Full definitions of manifold, topological space, and smooth map are collected in topology-manifolds.md.) Examples: $\mathbb{R}^n$ (translations), $GL(n, \mathbb{R})$, $SO(n)$, $U(n)$, $SU(n)$, $SL(2, \mathbb{C})$, the Lorentz group $O(1,3)$, the Poincaré group $ISO(1,3)$.
The Lie algebra $\mathfrak{g}$ of a Lie group $G$ is the tangent space at the identity, $\mathfrak{g} = T_e G$, equipped with the Lie bracket $[\cdot, \cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$, which is bilinear, antisymmetric, and satisfies the Jacobi identity
$$[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0.$$
For matrix Lie groups (the only kind we use), the bracket is the matrix commutator $[X, Y] = XY - YX$.
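The exponential map takes a Lie algebra element $X \in \mathfrak{g}$ to a one-parameter subgroup of $G$:
$$t \mapsto \exp(tX), \qquad \exp((s+t)X) = \exp(sX)\exp(tX).$$
For matrix groups, $\exp(X) = \sum_{k=0}^{\infty} X^k / k!$. The map $\exp$ is a local diffeomorphism from a neighbourhood of $0 \in \mathfrak{g}$ onto a neighbourhood of the identity in $G$, so it gives a near-identity parameterization. For compact connected groups every element lies on some one-parameter subgroup; for non-compact groups such as $SL(2, \mathbb{R})$ this can fail, and $\exp$ is only guaranteed to cover a neighbourhood of the identity (which still generates the identity component).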
Generators and structure constants. A generator of $G$ is an element of the Lie algebra $\mathfrak{g}$ (the tangent space at the identity element; see topology-manifolds.md § Tangent Spaces). Choose a basis $\{T_a\}$ of $\mathfrak{g}$, i.e. a linearly independent set spanning $\mathfrak{g}$, so that every $X \in \mathfrak{g}$ has a unique expansion $X = \theta^a T_a$ with real coefficients $\theta^a$. Every group element near the identity can then be written as
$$g(\theta) = \exp(i\,\theta^a T_a),$$
so the $T_a$ "generate" group elements via the exponential map, hence the name. The Lie algebra structure is encoded in the structure constants $f_{ab}{}^{c}$:
$$[T_a, T_b] = i\, f_{ab}{}^{c}\, T_c.$$
(The factor of $i$ is a physics convention; mathematicians omit it. With this convention, Hermitian $T_a$ produce unitary $g$.) The structure constants are antisymmetric in $a, b$, satisfy a Jacobi identity inherited from the Lie bracket, and determine the entire group structure near the identity.
Casimir operators. A Casimir is an element of the universal enveloping algebra $U(\mathfrak{g})$ of $\mathfrak{g}$ that commutes with all generators. By Schur's lemma, it acts as a scalar multiple of the identity on every irrep, providing a labelling of irreps. For semisimple $\mathfrak{g}$, the number of independent Casimirs equals the rank of $\mathfrak{g}$ (= dimension of a Cartan subalgebra).
Examples:
- $\mathfrak{su}(2)$ has rank 1; the single Casimir is $J^2 = J_1^2 + J_2^2 + J_3^2$ with eigenvalue $j(j+1)$.
- $\mathfrak{su}(3)$ has rank 2; two Casimirs label irreps by Dynkin labels $(p, q)$.
- The Poincaré algebra has two Casimirs: $P^2 = P_\mu P^\mu$ (mass-squared) and $W^2 = W_\mu W^\mu$ (Pauli–Lubanski-squared, related to spin). See QFT/preliminaries.md § Casimir invariants.
Direct products and semidirect products
Two ways of building a group from smaller pieces.
Direct product
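The direct product $G \times H$ is the set of ordered pairs $(g, h)$ with componentwise multiplication:
$$(g_1, h_1)(g_2, h_2) = (g_1 g_2,\; h_1 h_2).$$
Both factors are normal subgroups, and they commute with each other inside the product (elements of $G$ and $H$, viewed as subgroups, satisfy $gh = hg$). The Lie algebra is the direct sum $\mathfrak{g} \oplus \mathfrak{h}$ with $[\mathfrak{g}, \mathfrak{h}] = 0$.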
Semidirect product
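The semidirect product $N \rtimes_\varphi H$ combines a normal factor $N$ and a non-normal factor $H$ that acts on $N$ by automorphisms. The data is a homomorphism
$$\varphi : H \to \mathrm{Aut}(N), \qquad h \mapsto \varphi_h,$$
so each $h \in H$ defines an automorphism $\varphi_h$ of $N$. The underlying set is pairs $(n, h)$, but multiplication is twisted by $\varphi$:
$$(n_1, h_1)(n_2, h_2) = (n_1\, \varphi_{h_1}(n_2),\; h_1 h_2).$$
The second entry $n_2$ is first transformed by $\varphi_{h_1}$ before being combined with $n_1$. The inverse is $(n, h)^{-1} = (\varphi_{h^{-1}}(n^{-1}),\; h^{-1})$.
Key structural properties:
- $N$ embedded as $\{(n, e_H)\}$ is a normal subgroup (the kernel of the projection $N \rtimes_\varphi H \to H$).
- $H$ embedded as $\{(e_N, h)\}$ is a subgroup, but generally not normal.
- Inside the product, $N$ and $H$ do not commute: $h n h^{-1} = \varphi_h(n)$. When $\varphi$ is the trivial action ($\varphi_h = \mathrm{id}_N$ for all $h$), the semidirect product degenerates to the direct product $N \times H$.
- The Lie algebra is the semidirect sum $\mathfrak{n} \rtimes \mathfrak{h}$, with the mixed bracket $[\mathfrak{h}, \mathfrak{n}] \subseteq \mathfrak{n}$ given by the derivative of $\varphi$ at the identity of $H$.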
Examples.
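- Poincaré group $ISO(1,3) = \mathbb{R}^{1,3} \rtimes O(1,3)$: Lorentz transformations act on translations by $a \mapsto \Lambda a$. The twisted multiplication encodes the physical fact that translating then boosting is not the same as boosting then translating, because the translation vector itself gets rotated. Translations form the normal subgroup; the Lorentz group is not normal.
- Euclidean group $E(n) = \mathbb{R}^n \rtimes O(n)$: the same structure with rotations acting on translations. The little group of a massless particle, $ISO(2) = \mathbb{R}^2 \rtimes SO(2)$, is the two-dimensional case (see QFT/preliminaries.md § Little groups).
- Dihedral group $D_n = \mathbb{Z}_n \rtimes \mathbb{Z}_2$: reflection $s$ acts on rotation $r$ by inversion ($s r s^{-1} = r^{-1}$). Symmetries of a regular $n$-gon.
- Affine group $\mathrm{Aff}(n) = \mathbb{R}^n \rtimes GL(n, \mathbb{R})$: translations and linear transformations of $\mathbb{R}^n$.
- Trivial direct product is the special case where $\varphi$ is trivial: $N \rtimes_{\mathrm{triv}} H = N \times H$.
When does a group decompose as a semidirect product? If $G$ has a normal subgroup $N$ and a subgroup $H$ with $N \cap H = \{e\}$ and $G = NH$ (a split extension), then $G \cong N \rtimes_\varphi H$ with $\varphi_h(n) = h n h^{-1}$ (conjugation inside $G$). Not every extension splits: e.g. $\mathbb{Z}_4$ has the normal subgroup $\mathbb{Z}_2$, but there is no complementary $H \cong \mathbb{Z}_2$ giving $\mathbb{Z}_4 \cong \mathbb{Z}_2 \rtimes \mathbb{Z}_2$ (such a product would be the Klein four-group instead).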
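The twisted multiplication rule can be exercised on the dihedral group, built as $\mathbb{Z}_N \rtimes \mathbb{Z}_2$ with the reflection acting by inversion. This is a sketch under stated assumptions: pairs $(n, h)$ multiply as $(n_1, h_1)(n_2, h_2) = (n_1 + \varphi_{h_1}(n_2), h_1 + h_2)$, written additively, and the names `phi`, `mul`, `inv` and the choice $N = 5$ are illustrative.

```python
from itertools import product

N = 5  # rotations form Z_5; H = Z_2 = {0, 1} acts on Z_5 by inversion

def phi(h, n):
    """Automorphism phi_h of Z_N: identity for h = 0, inversion n -> -n for h = 1."""
    return n % N if h == 0 else (-n) % N

def mul(x, y):
    """Twisted product (n1, h1)(n2, h2) = (n1 + phi_{h1}(n2), h1 + h2)."""
    (n1, h1), (n2, h2) = x, y
    return ((n1 + phi(h1, n2)) % N, (h1 + h2) % 2)

def inv(x):
    """Inverse (n, h)^{-1} = (phi_{h^{-1}}(-n), h^{-1}); in Z_2 each h is its own inverse."""
    n, h = x
    return (phi(h, -n), h)

D = list(product(range(N), range(2)))  # all 2N elements
e = (0, 0)
r, s = (1, 0), (0, 1)                  # generating rotation and reflection

associative = all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a in D for b in D for c in D)
inverses_ok = all(mul(a, inv(a)) == e == mul(inv(a), a) for a in D)

print(associative, inverses_ok)
print(mul(mul(s, r), inv(s)) == inv(r))  # s r s^{-1} = r^{-1}
print(mul(r, s) != mul(s, r))            # the two factors do not commute
```

Replacing `phi` by the identity for both values of `h` would turn the same code into the direct product, where the last check fails because everything commutes.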
Compact vs. non-compact
A Lie group is compact if its underlying manifold is compact (closed and bounded for matrix groups). Compactness has dramatic consequences for representation theory:
- Compact groups ($SO(n)$, $SU(n)$, $U(n)$, $Spin(n)$): every finite-dimensional rep is equivalent to a unitary one; every unitary rep decomposes as a direct sum of finite-dimensional irreps.
- Non-compact groups (Lorentz $SO^+(1,3)$, Poincaré $ISO(1,3)$, $SL(n, \mathbb{R})$, $SL(2, \mathbb{C})$): finite-dimensional reps are generically non-unitary; apart from the trivial rep, unitary irreps are infinite-dimensional.
This is why field components transform under finite-dimensional non-unitary reps of the Lorentz group, while Hilbert-space states must transform under infinite-dimensional unitary reps of the Poincaré group (Wigner's classification).
Connectedness and discrete components
A Lie group can have multiple connected components (e.g. the Lorentz group has four; see QFT/preliminaries.md § The Lorentz group). The component containing the identity is the identity component $G_0$, a normal subgroup; the component group $G/G_0$ is discrete.
A connected Lie group can be simply connected or multiply connected, depending on whether all loops contract to a point. The universal cover $\tilde{G}$ is the simply connected Lie group with the same Lie algebra $\mathfrak{g}$, related to $G$ by a covering homomorphism $\pi : \tilde{G} \to G$ whose kernel is a discrete (central) subgroup. Examples:
$$\widetilde{U(1)} = \mathbb{R}, \qquad \widetilde{SO(3)} = SU(2), \qquad \widetilde{SO^+(1,3)} = SL(2, \mathbb{C}).$$
Universal covers matter in physics because, by the projective-representation theorem, projective reps of $G$ are ordinary reps of $\tilde{G}$. Quantum mechanics admits projective reps (states are rays), so the relevant symmetry group is always $\tilde{G}$, not $G$ itself.
Worked examples: $SO(2)$ and $SU(2)$
Two compact Lie groups that exercise nearly all of the machinery above without notational overhead.
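$SO(2)$: rotations in the plane
$SO(2)$ is the group of $2 \times 2$ rotation matrices
$$R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \qquad \theta \in [0, 2\pi).$$
- Group axioms: $R(\theta_1) R(\theta_2) = R(\theta_1 + \theta_2)$ (closure + associativity); $R(0) = I$ (identity); $R(\theta)^{-1} = R(-\theta)$ (inverses). Abelian.
- Topology: $SO(2)$ is diffeomorphic to the circle $S^1$: compact and connected, but not simply connected (the loop $\theta \mapsto R(\theta)$, $\theta \in [0, 2\pi]$, does not contract). Its universal cover is $\mathbb{R}$ (the additive group) with covering map $\theta \mapsto R(\theta)$ and kernel $2\pi\mathbb{Z}$.
- Lie algebra: $\mathfrak{so}(2) \cong \mathbb{R}$, 1-dimensional. Single generator $X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (anti-Hermitian); the physics-convention generator is $J = -iX$, which is Hermitian. Then $R(\theta) = e^{\theta X} = e^{i\theta J}$. No structure constants (a 1-dim algebra is automatically abelian, $[J, J] = 0$).
- Representations: All irreps of $SO(2)$ are 1-dimensional (compact + abelian ⇒ irreps are characters). They are labelled by an integer $n$: $\rho_n(R(\theta)) = e^{in\theta}$. The label is an integer because $R(2\pi) = R(0)$ forces $e^{2\pi i n} = 1$. Non-integer $n$ would give multi-valued reps; these are projective reps of $SO(2)$, equivalently ordinary reps of the universal cover $\mathbb{R}$.
- Casimir: $J^2$, which is the identity on the defining two-dimensional rep and takes the value $n^2$ on the irrep $\rho_n$. It carries no information beyond $n$ itself, since the algebra is 1-dimensional and $J$ already commutes with everything. The genuine label is $n$.
- Physics use: rotations in the plane (e.g. for 2D systems), the gauge group of QED ($U(1) \cong SO(2)$ as Lie groups, with a different identification: $U(1)$ is parameterized by $e^{i\theta}$, which is also a circle).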
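The claim that exponentiating the single generator reproduces the rotation matrices can be checked numerically. The sketch below assumes the real generator $X = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (the derivative of the rotation matrix at angle zero) and uses a truncated power series for the matrix exponential; `expm`, `rotation`, and `close` are illustrative helper names.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(A, terms=30):
    """Truncated power series sum_{k < terms} A^k / k! (adequate for small matrices)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]  # A^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def rotation(theta):
    """Rotation of the plane by angle theta."""
    return [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]

def close(A, B, tol=1e-12):
    """Entrywise comparison within a tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

X = [[0.0, -1.0], [1.0, 0.0]]  # assumed generator: d/dtheta of rotation(theta) at 0
theta = 0.7
exp_theta_X = expm([[theta * x for x in row] for row in X])

print(close(exp_theta_X, rotation(theta)))                          # exp(theta X) = R(theta)
print(close(matmul(rotation(0.3), rotation(0.4)), rotation(0.7)))   # angles add
```

The second check is the group law for rotations: composing two rotations adds their angles, which is also why the group is abelian.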
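$SU(2)$: the simplest non-abelian compact group
$SU(2)$ is the group of $2 \times 2$ complex unitary matrices with determinant 1:
$$SU(2) = \left\{ \begin{pmatrix} \alpha & -\bar\beta \\ \beta & \bar\alpha \end{pmatrix} : \alpha, \beta \in \mathbb{C},\; |\alpha|^2 + |\beta|^2 = 1 \right\}.$$
- Group axioms: standard matrix multiplication; identity $I$; inverse $U^{-1} = U^\dagger$. Non-abelian (matrix multiplication doesn't commute in general).
- Topology: the constraint $|\alpha|^2 + |\beta|^2 = 1$ identifies $SU(2)$ with the unit 3-sphere $S^3 \subset \mathbb{R}^4$: compact, connected, and simply connected (so $SU(2)$ is its own universal cover). It double-covers $SO(3)$: $SU(2)/\{\pm I\} \cong SO(3)$.
- Lie algebra: $\mathfrak{su}(2)$, the anti-Hermitian traceless $2 \times 2$ matrices, real-3-dimensional. Conventional generators are $T_a = \sigma_a / 2$ for $a = 1, 2, 3$, with $\sigma_a$ the Pauli matrices $\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$: Hermitian and traceless, so $i T_a \in \mathfrak{su}(2)$. Structure constants: $[T_a, T_b] = i\,\epsilon_{abc} T_c$, so $f_{abc} = \epsilon_{abc}$ (the Levi-Civita symbol).
- Casimir: rank 1 ⇒ one Casimir, $T^2 = T_a T_a = \tfrac{3}{4} I$ on the defining rep. On a general spin-$j$ irrep it acts as $j(j+1)$.
- Representations: irreps labelled by $j \in \{0, \tfrac12, 1, \tfrac32, \dots\}$, of dimension $2j+1$. The defining rep ($j = \tfrac12$) is the spinor rep; $j = 1$ is the vector rep that descends to the defining rep of $SO(3)$ (since $SO(3)$ admits only integer-$j$ reps). Half-integer reps are projective reps of $SO(3)$, equivalently ordinary reps of $SU(2)$; this is the deep reason a $2\pi$ rotation flips the sign of a spin-$\tfrac12$ wavefunction.
- Physics use: spin in non-relativistic QM, isospin, the $SU(2)_L$ factor of the electroweak gauge group, and (in complexified form $\mathfrak{so}(1,3)_{\mathbb{C}} \cong \mathfrak{su}(2)_{\mathbb{C}} \oplus \mathfrak{su}(2)_{\mathbb{C}}$) the Lorentz algebra.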
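The commutation relations and the Casimir value on the defining rep are easy to verify by direct matrix arithmetic. This sketch assumes the conventional generators $T_a = \sigma_a/2$ built from the Pauli matrices and the physics convention $[T_a, T_b] = i\,\epsilon_{abc} T_c$; all helper names are illustrative.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(A, B, scale=1):
    """Return A + scale * B."""
    return [[A[i][j] + scale * B[i][j] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison (works for complex entries)."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

def commutator(A, B):
    return add(matmul(A, B), matmul(B, A), -1)

def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

ZERO = [[0, 0], [0, 0]]
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[x / 2 for x in row] for row in s] for s in sigma]  # T_a = sigma_a / 2

def structure_constants_ok():
    """Check [T_a, T_b] = i * sum_c eps(a, b, c) * T_c for all a, b."""
    for a in range(3):
        for b in range(3):
            rhs = ZERO
            for c in range(3):
                rhs = add(rhs, T[c], 1j * eps(a, b, c))
            if not close(commutator(T[a], T[b]), rhs):
                return False
    return True

casimir = ZERO
for Ta in T:
    casimir = add(casimir, matmul(Ta, Ta))  # T_1^2 + T_2^2 + T_3^2

print(structure_constants_ok())
print(close(casimir, [[0.75, 0], [0, 0.75]]))  # j(j+1) with j = 1/2
```

The Casimir comes out proportional to the identity, as Schur's lemma requires on an irrep, with the value $j(j+1) = 3/4$ for $j = 1/2$.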
Side-by-side summary
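| Property | $SO(2) \cong U(1)$ | $SU(2)$ |
|---|---|---|
| Dimension | 1 | 3 |
| Topology | $S^1$ | $S^3$ |
| Connected | yes | yes |
| Simply connected | no | yes |
| Universal cover | $\mathbb{R}$ | itself |
| Abelian | yes | no |
| Compact | yes | yes |
| Generators | $J$ (1) | $T_a = \sigma_a/2$ (3) |
| Structure constants | trivial | $\epsilon_{abc}$ |
| Rank / # Casimirs | 1 / 1 | 1 / 1 |
| Casimir eigenvalue (irrep) | $n^2$ | $j(j+1)$ |
| Irreps | 1-dim, labelled by $n \in \mathbb{Z}$ | $(2j+1)$-dim, labelled by $j \in \{0, \tfrac12, 1, \dots\}$ |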
Why this matters
Group theory is the language in which every QFT symmetry is expressed:
- Spacetime symmetries (Lorentz, Poincaré, conformal, supersymmetry) — see QFT/preliminaries.md § Lorentz and Poincaré Groups.
- Internal symmetries ($U(1)$, $SU(2)$, $SU(3)$): gauge groups of the Standard Model.
- Discrete symmetries ($C$, $P$, $T$ and their products).
- Spontaneous symmetry breaking is the breaking of the global / gauge symmetry group $G$ to a subgroup $H$, with the coset $G/H$ parameterizing the Goldstone modes.
- Anomalies are obstructions to lifting a classical symmetry to the quantum theory, classified by group cohomology of $G$.
References
- Hall, Lie Groups, Lie Algebras, and Representations — modern mathematical reference.
- Fulton & Harris, Representation Theory: A First Course — finite groups and Lie algebras with concrete examples.
- Tung, Group Theory in Physics — physics-oriented, with detailed Poincaré-group treatment.
- Georgi, Lie Algebras in Particle Physics — pragmatic computational reference.
- Weinberg, The Quantum Theory of Fields, Vol. 1, Ch. 2 — the Poincaré-group classification using the structure laid out here.
Clifford Algebras
This document collects the abstract definition and structure theory of Clifford algebras — the algebraic objects underlying gamma matrices, Dirac spinors, Spin groups, and the relativistic-fermion fields used throughout QFT. It is intended as a reference companion: physics docs link here for the algebra they use (e.g. the Dirac algebra in QED/historical.md § 1.3), and this page stays general.
For the Lie group / Lie algebra prerequisites (representations, universal covers, the classical matrix groups) see group-theory.md.
1. Abstract Definition
1.1 Tensor-algebra construction
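Let $V$ be a finite-dimensional vector space over a field $\mathbb{F}$ (here $\mathbb{R}$ or $\mathbb{C}$) equipped with a non-degenerate symmetric bilinear form $g : V \times V \to \mathbb{F}$, equivalently a quadratic form $q$ with $q(v) = g(v, v)$ and polarization $g(u, v) = \tfrac12 [q(u+v) - q(u) - q(v)]$.
The Clifford algebra $\mathrm{Cl}(V, q)$ is the associative unital $\mathbb{F}$-algebra
$$\mathrm{Cl}(V, q) = T(V) \,/\, \langle v \otimes v - q(v)\,1 : v \in V \rangle,$$
where $T(V) = \bigoplus_{k \geq 0} V^{\otimes k}$ is the tensor algebra over $V$ and the brackets denote the two-sided ideal generated by the displayed elements. Concretely: take all formal words in vectors of $V$, multiply by concatenation, and impose
$$v^2 = q(v)\,1.$$
Polarizing this relation (apply it to $u + v$, expand, and subtract):
$$uv + vu = 2\,g(u, v)\,1.$$
This is the defining anticommutation relation: vectors do not commute, but their failure to commute is exactly twice the inner product. The two displayed forms are equivalent.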
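The polarization step can be written out line by line. The derivation below is a sketch that assumes the sign convention in which a vector squares to plus its quadratic form, $v^2 = q(v)\,1$ (consistent with the generators squaring to $+1$ in the positive directions later in this page), with $g$ the bilinear form and $q$ the quadratic form.

```latex
\begin{aligned}
(u+v)^2 &= q(u+v)\,1 \\
u^2 + uv + vu + v^2 &= \bigl(q(u) + 2\,g(u,v) + q(v)\bigr)\,1 \\
uv + vu &= 2\,g(u,v)\,1
  \qquad \text{(subtracting } u^2 = q(u)\,1 \text{ and } v^2 = q(v)\,1\text{)}
\end{aligned}
```

Conversely, setting $u = v$ in the last line recovers $v^2 = g(v,v)\,1 = q(v)\,1$, so neither form carries more information than the other.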
1.2 Universal property
$\mathrm{Cl}(V, q)$ is characterized, up to unique isomorphism, by the following universal property:
For any associative unital $\mathbb{F}$-algebra $A$ and any linear map $f : V \to A$ satisfying $f(v)^2 = q(v)\,1_A$ for all $v \in V$, there exists a unique algebra homomorphism $\tilde{f} : \mathrm{Cl}(V, q) \to A$ extending $f$.
In other words, $\mathrm{Cl}(V, q)$ is the "freest" associative algebra in which the defining relation holds. Every concrete realization of vectors-as-things-squaring-to-$q(v)$ (gamma matrices, complex numbers, quaternions, etc.) is a representation (i.e. an algebra homomorphism from $\mathrm{Cl}(V, q)$ to some matrix algebra) of one of these.
1.3 Dimension and basis
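Pick an orthogonal basis $\{e_1, \dots, e_n\}$ of $V$ with $g(e_i, e_j) = \eta_i\,\delta_{ij}$, $\eta_i = \pm 1$ (always possible by a Gram–Schmidt-type diagonalization for non-degenerate $g$ over $\mathbb{R}$; over $\mathbb{C}$ all $\eta_i$ can be taken $+1$). The defining relation reduces to
$$e_i e_j + e_j e_i = 2\,\eta_i\,\delta_{ij}, \qquad \text{i.e.} \quad e_i^2 = \eta_i, \quad e_i e_j = -e_j e_i \ (i \neq j).$$
Any product of the $e_i$ can therefore be reordered (with sign flips) so each $e_i$ appears at most once and in a fixed canonical order. The independent monomials are indexed by subsets $I = \{i_1 < \dots < i_k\} \subseteq \{1, \dots, n\}$:
$$e_I = e_{i_1} e_{i_2} \cdots e_{i_k},$$
with $e_\emptyset = 1$. These form a basis of $\mathrm{Cl}(V, q)$ as a vector space, so
$$\dim \mathrm{Cl}(V, q) = 2^n.$$
The grading by $k = |I|$ recovers the $k$-fold antisymmetric products; see § 2 for the relation to the exterior algebra.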
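The reordering argument translates directly into a small multiplication routine for basis monomials. In this sketch each monomial is a bitmask (bit $i$ set means generator $i$ is present), the sign comes from counting the swaps needed to reach canonical order, and repeated generators contract to their squares. The signature tuple `eta`, the bitmask encoding, and the function names are illustrative assumptions.

```python
def blade_product(a, b, eta):
    """Multiply basis monomials e_a * e_b given as bitmasks.

    Returns (sign, bitmask) with e_a e_b = sign * e_{a xor b}.
    eta[i] = +1 or -1 is the square of generator i.
    """
    # Each generator of b must move past every higher-index generator of a;
    # each such move is one anticommutation and contributes a factor -1.
    swaps = 0
    shifted = a >> 1
    while shifted:
        swaps += bin(shifted & b).count("1")
        shifted >>= 1
    sign = -1 if swaps % 2 else 1
    # Generators present in both factors square to eta[i].
    common, i = a & b, 0
    while common:
        if common & 1:
            sign *= eta[i]
        common >>= 1
        i += 1
    return sign, a ^ b

def anticommutator(i, j, eta):
    """Coefficient and monomial of e_i e_j + e_j e_i."""
    s1, m1 = blade_product(1 << i, 1 << j, eta)
    s2, m2 = blade_product(1 << j, 1 << i, eta)
    assert m1 == m2  # both orderings give the same monomial up to sign
    return s1 + s2, m1

eta = (1, -1, -1, -1)           # illustrative signature (+, -, -, -)
n = len(eta)
basis = list(range(1 << n))     # one monomial per subset of generators

print(len(basis))               # 2**n basis monomials
print(all(anticommutator(i, j, eta) == (2 * eta[i], 0) if i == j
          else anticommutator(i, j, eta)[0] == 0
          for i in range(n) for j in range(n)))
```

Distinct generators anticommute (coefficient 0), and equal ones give twice their square times the unit monomial, which is the diagonalized form of the defining relation.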
2. Relation to the Exterior Algebra
The exterior algebra $\Lambda V$ is the special case $q = 0$:
$$\mathrm{Cl}(V, 0) = \Lambda V.$$
When $q = 0$ the relation $uv = -vu$ is exact, so all products are antisymmetric; this is the Grassmann algebra used in differential geometry (differential forms) and in fermionic path integrals.
For general $q$, define the antisymmetric piece of a product:
$$u \wedge v = \tfrac12 (uv - vu).$$
So Clifford multiplication splits as
$$uv = g(u, v) + u \wedge v,$$
a symmetric (scalar) piece plus an antisymmetric (2-form) piece. Higher products extend this: every Clifford product is a sum of $k$-form parts. As a vector space (forgetting the multiplication)
$$\mathrm{Cl}(V, q) \cong \Lambda V = \bigoplus_{k=0}^{n} \Lambda^k V,$$
with the same total dimension $2^n$. Turning on $q$ "deforms" the multiplication so that products absorb scalar pieces, but the underlying vector space is unchanged. This is sometimes called the Chevalley identification.
3. Standard Notations and Signatures
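Over $\mathbb{R}$, by Sylvester's law of inertia, every non-degenerate quadratic form has a signature $(p, q)$ with $p$ positive and $q$ negative squares ($p + q = n$). Write
$$\mathrm{Cl}(p, q) = \mathrm{Cl}(\mathbb{R}^{p,q}).$$
Equivalently, the orthogonal generators satisfy $e_i^2 = +1$ for $i \leq p$ and $e_i^2 = -1$ for $i > p$. The Lorentzian conventions of physics:
- $\mathrm{Cl}(1,3)$: signature $(+,-,-,-)$, the Dirac algebra with $\gamma^\mu$ satisfying $(\gamma^0)^2 = +1$, $(\gamma^i)^2 = -1$. Used in this repo. See QED/historical.md § 0.1.
- $\mathrm{Cl}(3,1)$: signature $(-,+,+,+)$, the "mostly-plus" convention common in GR-flavored texts. Not isomorphic to $\mathrm{Cl}(1,3)$ over $\mathbb{R}$ (they are different real algebras), but isomorphic after complexification.
Over $\mathbb{C}$, signature stops mattering: any quadratic form can be diagonalized to $\sum_i z_i^2$, so there is a unique complex Clifford algebra in each dimension,
$$\mathbb{C}l(n) = \mathrm{Cl}(p, q) \otimes_{\mathbb{R}} \mathbb{C}, \qquad p + q = n.$$
This is the version directly relevant to QFT representations (since spinors are complex).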
4. Small-Dimensional Examples
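| Algebra | Generators | $\dim_{\mathbb{R}}$ | Isomorphic to |
|---|---|---|---|
| $\mathrm{Cl}(0,0)$ | none | 1 | $\mathbb{R}$ |
| $\mathrm{Cl}(1,0)$ | $e$, $e^2 = +1$ | 2 | $\mathbb{R} \oplus \mathbb{R}$ (split-complex; projectors $\tfrac12(1 \pm e)$) |
| $\mathrm{Cl}(0,1)$ | $e$, $e^2 = -1$ | 2 | $\mathbb{C}$ (with $e = i$) |
| $\mathrm{Cl}(2,0)$ | $e_1, e_2$, both $e_i^2 = +1$ | 4 | $M_2(\mathbb{R})$ |
| $\mathrm{Cl}(0,2)$ | $e_1, e_2$, both $e_i^2 = -1$ | 4 | $\mathbb{H}$ (quaternions; $i = e_1$, $j = e_2$, $k = e_1 e_2$) |
| $\mathrm{Cl}(1,1)$ | $e_1^2 = +1$, $e_2^2 = -1$ | 4 | $M_2(\mathbb{R})$ |
| $\mathrm{Cl}(3,0)$ | Euclidean 3-space | 8 | $M_2(\mathbb{C})$ |
| $\mathrm{Cl}(0,3)$ | "negative" Euclidean 3-space | 8 | $\mathbb{H} \oplus \mathbb{H}$ |
| $\mathrm{Cl}(1,3)$ | $\gamma^\mu$ (Dirac) | 16 | $M_2(\mathbb{H})$ |
| $\mathrm{Cl}(3,1)$ | $\gamma^\mu$ (mostly-plus convention) | 16 | $M_4(\mathbb{R})$ |
| $\mathbb{C}l(4)$ | complexified $\gamma^\mu$ | 16 (over $\mathbb{C}$) | $M_4(\mathbb{C})$ |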
Two structural takeaways from the small cases:
- Complex numbers, quaternions, and split-complex numbers are all Clifford algebras — historically, this is how Clifford (1878) found his algebras: by generalizing Hamilton's quaternions and Grassmann's exterior algebra into a single framework.
- The 16-dim Dirac algebra $\mathrm{Cl}(1,3)$ is isomorphic to $M_2(\mathbb{H})$ over $\mathbb{R}$ (i.e. $2 \times 2$ matrices of quaternions) and to $M_4(\mathbb{C})$ after complexification. The latter is what guarantees the Dirac, Weyl, and Majorana representations of § 0.1 of QED-historical exist and are all equivalent.
5. Classification: Bott Periodicity
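The complete classification of real Clifford algebras (Atiyah–Bott–Shapiro 1964, building on earlier work) shows that $\mathrm{Cl}(p, q)$ is always a matrix algebra (or a direct sum of two) over $\mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$, and the pattern is periodic in $(p - q) \bmod 8$ for real algebras and periodic in $n \bmod 2$ for complex algebras. The full table:
| $(p - q) \bmod 8$ | structure |
|---|---|
| 0 | $M_N(\mathbb{R})$ |
| 1 | $M_N(\mathbb{R}) \oplus M_N(\mathbb{R})$ |
| 2 | $M_N(\mathbb{R})$ |
| 3 | $M_N(\mathbb{C})$ |
| 4 | $M_N(\mathbb{H})$ |
| 5 | $M_N(\mathbb{H}) \oplus M_N(\mathbb{H})$ |
| 6 | $M_N(\mathbb{H})$ |
| 7 | $M_N(\mathbb{C})$ |
with $N$ fixed by $\dim_{\mathbb{R}} \mathrm{Cl}(p, q) = 2^{p+q}$. For complex Clifford algebras the period is 2:
$$\mathbb{C}l(2k) \cong M_{2^k}(\mathbb{C}), \qquad \mathbb{C}l(2k+1) \cong M_{2^k}(\mathbb{C}) \oplus M_{2^k}(\mathbb{C}).$$
The minimum faithful matrix dimension (the spinor dimension) is therefore $2^{\lfloor n/2 \rfloor}$ over $\mathbb{C}$, and may be larger or smaller over $\mathbb{R}$ depending on signature: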
| Spacetime $d$ | Complex spinor dim $2^{\lfloor d/2 \rfloor}$ | Notable signatures (Lorentzian $(1, d-1)$) |
|---|---|---|
| 2 | 2 | 2D Majorana–Weyl exists |
| 3 | 2 | Real (Majorana) spinors |
| 4 | 4 | Dirac, Weyl (2 complex components), Majorana (4 real components) |
| 5 | 4 | No Majorana, no Weyl |
| 6 | 8 | Weyl exists |
| 9 | 16 | Majorana exists |
| 10 | 32 | Majorana–Weyl (16 real components; relevant for superstrings) |
| 11 | 32 | Majorana (M-theory) |
This pattern — when each kind of spinor (Dirac/Weyl/Majorana/Majorana–Weyl) exists — is governed entirely by the algebraic classification above, and is a non-trivial input to supergravity and string theory.
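The periodicity table can be turned into a lookup that also solves for the matrix size from the dimension count. The sketch below assumes the convention used on this page ($p$ generators squaring to $+1$, $q$ squaring to $-1$) and the standard 8-fold table indexed by $(p - q) \bmod 8$; the function names and the string format of the result are illustrative.

```python
import math

# (p - q) mod 8  ->  (division algebra, number of matrix-algebra summands)
BOTT = {0: ("R", 1), 1: ("R", 2), 2: ("R", 1), 3: ("C", 1),
        4: ("H", 1), 5: ("H", 2), 6: ("H", 1), 7: ("C", 1)}
REAL_DIM = {"R": 1, "C": 2, "H": 4}  # real dimension of each division algebra

def classify(p, q):
    """Return the matrix-algebra structure of Cl(p, q) as a string."""
    field, copies = BOTT[(p - q) % 8]
    # Solve copies * dim(field) * N^2 = 2^(p+q) for the matrix size N.
    n_squared = 2 ** (p + q) // (copies * REAL_DIM[field])
    N = math.isqrt(n_squared)
    assert N * N == n_squared
    block = f"M_{N}({field})"
    return block if copies == 1 else f"{block} + {block}"

def complex_spinor_dim(d):
    """Dimension of the Dirac spinor in d dimensions: 2^floor(d/2)."""
    return 2 ** (d // 2)

for signature in [(0, 1), (0, 2), (3, 0), (1, 3), (3, 1)]:
    print(signature, classify(*signature))
print([complex_spinor_dim(d) for d in (2, 3, 4, 5, 6, 9, 10, 11)])
```

Here `M_1(C)` and `M_1(H)` are just the complex numbers and the quaternions, so the small cases reproduce the familiar identifications, and the Dirac algebra comes out as 2-by-2 quaternionic matrices in the mostly-minus signature but 4-by-4 real matrices in the mostly-plus one.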
6. Pin and Spin Groups
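Sitting inside $\mathrm{Cl}(p, q)^\times$ (the multiplicative group of invertible elements) are two distinguished subgroups built from products of unit vectors.
6.1 Definitions
A unit vector is $v \in V$ with $q(v) = \pm 1$ (i.e. $v^2 = \pm 1$).
- $Pin(p, q)$ = subgroup of $\mathrm{Cl}(p, q)^\times$ generated by all unit vectors of $V$.
- $Spin(p, q)$ = subgroup of $Pin(p, q)$ generated by products of an even number of unit vectors.
Equivalently, $Spin(p, q) = Pin(p, q) \cap \mathrm{Cl}^0(p, q)$, where $\mathrm{Cl}^0(p, q)$ is the even subalgebra (linear combinations of $e_I$ with $|I|$ even).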
6.2 Connection to the orthogonal group
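Every element $a \in Pin(p, q)$ acts on $V$ by the twisted adjoint action
$$\rho(a)\,v = \alpha(a)\, v\, a^{-1},$$
where $\alpha$ is the grading automorphism $\alpha(e_I) = (-1)^{|I|} e_I$. This map sends $Pin(p, q)$ to the orthogonal group $O(p, q)$ and $Spin(p, q)$ to the special orthogonal group $SO(p, q)$:
$$1 \to \mathbb{Z}_2 \to Pin(p, q) \to O(p, q) \to 1, \qquad 1 \to \mathbb{Z}_2 \to Spin(p, q) \to SO(p, q) \to 1.$$
These are double covers (kernel $\{\pm 1\}$). The Spin one is the universal cover of $SO(n)$ in dimensions $n \geq 3$; see group-theory.md § Connectedness and discrete components.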
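6.3 Worked example: $Spin(1,3) \cong SL(2, \mathbb{C})$
For the physically central case $\mathrm{Cl}(1,3)$:
- The even subalgebra $\mathrm{Cl}^0(1,3)$ has real dimension $2^{4-1} = 8$.
- $\mathrm{Cl}^0(1,3) \cong M_2(\mathbb{C})$ (complex dimension 4, i.e. real dimension 8).
- $Spin^+(1,3) \cong \{A \in M_2(\mathbb{C}) : \det A = 1\} = SL(2, \mathbb{C})$.
So $Spin^+(1,3) \cong SL(2, \mathbb{C})$, exactly the universal cover used in QFT/preliminaries.md § Universal cover and foundations-modern.md to handle spinor representations. The Clifford-algebra construction is where this universal cover comes from algebraically; the Lorentz-group story only sees it as "the simply connected group with the same Lie algebra as $SO^+(1,3)$".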
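One standard way to see the double cover concretely, sketched here as an illustration rather than as the construction used in the linked pages, is to pack a four-vector into a $2 \times 2$ Hermitian matrix $X = t\,I + x\,\sigma_1 + y\,\sigma_2 + z\,\sigma_3$, whose determinant is the Minkowski norm, and let a determinant-1 matrix $A$ act by $X \mapsto A X A^\dagger$. The helper names and the sample matrices are assumptions made for the example.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def to_matrix(v):
    """Four-vector (t, x, y, z) -> Hermitian matrix t*1 + x*s1 + y*s2 + z*s3."""
    t, x, y, z = v
    return [[t + z, x - 1j * y], [x + 1j * y, t - z]]

def from_matrix(X):
    """Inverse of to_matrix for a Hermitian matrix."""
    t = (X[0][0] + X[1][1]).real / 2
    z = (X[0][0] - X[1][1]).real / 2
    return (t, X[1][0].real, X[1][0].imag, z)

def act(A, v):
    """Action v -> A X A^dagger, read back as a four-vector."""
    return from_matrix(matmul(matmul(A, to_matrix(v)), dagger(A)))

def minkowski(v):
    """Norm t^2 - x^2 - y^2 - z^2, equal to det(to_matrix(v))."""
    t, x, y, z = v
    return t * t - x * x - y * y - z * z

rapidity, angle = 0.8, 1.1
boost = [[math.exp(rapidity / 2), 0.0], [0.0, math.exp(-rapidity / 2)]]  # det = 1
rot = [[math.cos(angle / 2), -math.sin(angle / 2)],
       [math.sin(angle / 2), math.cos(angle / 2)]]                       # det = 1
A = matmul(boost, rot)
minus_A = [[-a for a in row] for row in A]

v = (1.0, 0.2, -0.3, 0.5)
w = act(A, v)
print(abs(minkowski(w) - minkowski(v)) < 1e-12)                  # norm is preserved
print(all(abs(a - b) < 1e-12 for a, b in zip(w, act(minus_A, v))))  # A and -A agree
```

Because $A$ appears twice in the action, $A$ and $-A$ induce the same Lorentz transformation, which is the two-to-one kernel of the covering map.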
Other physically relevant cases:
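| Group | $\cong$ to | Where it appears |
|---|---|---|
| $Spin(2)$ | $U(1)$ (double cover via $z \mapsto z^2$) | trivial |
| $Spin(3)$ | $SU(2)$ | non-rel. spin, isospin |
| $Spin(4)$ | $SU(2) \times SU(2)$ | Euclidean Lorentz / Wick-rotated |
| $Spin^+(1,3)$ | $SL(2, \mathbb{C})$ | Lorentz spinors |
| $Spin(6)$ | $SU(4)$ | $\mathcal{N} = 4$ SYM R-symmetry |
| $Spin(1,9)$ | (none) | superstring spinor |
| $Spin(10)$ | (none) | $SO(10)$ GUT (one fermion family = $\mathbf{16}$ rep) |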
7. Spinor Representations
A spinor representation of $Spin(p, q)$ is an irreducible representation that does not descend to $SO(p, q)$, i.e. it is genuinely double-valued on the orthogonal group, single-valued only on the cover. The available spinor types in signature $(p, q)$ are governed by the Bott classification of § 5:
- Dirac spinor. A faithful representation of the full complex Clifford algebra $\mathbb{C}l(n)$, restricted to $Spin(p, q)$. Always exists; dimension $2^{\lfloor n/2 \rfloor}$.
- Weyl spinor. When $n$ is even, the chirality element $\Gamma \propto e_1 e_2 \cdots e_n$ can be normalized so that $\Gamma^2 = 1$. It anticommutes with the generators but commutes with the even subalgebra, and hence with $Spin(p, q)$. Its $\pm 1$ eigenspaces give two inequivalent half-dimensional irreps of $Spin(p, q)$: the left- and right-handed Weyl spinors. For $\mathrm{Cl}(1,3)$: $\Gamma = \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, projectors $P_{L,R} = \tfrac12(1 \mp \gamma^5)$.
- Majorana spinor. When a real structure (charge conjugation $C$) on the spinor space commutes with the $Spin(p, q)$ action and squares to $+1$, we can impose $\psi^c = \psi$, halving the real dimension. Exists in certain signatures by the Bott table, including $(1,3)$ (real Majorana, 4 real components).
- Majorana–Weyl spinor. Both conditions simultaneously. Requires Weyl projector and charge conjugation to be compatible. By the Bott table this exists only in signatures with $p - q \equiv 0 \pmod 8$, including Minkowski $(1,9)$, the dimension that makes superstring theory consistent.
The story in $d = 4$ specifically is exactly the table in QED/historical.md § 0.1: the Dirac (standard), Weyl (chiral), and Majorana representations of the $\gamma$-matrices are three concrete realizations of the same abstract Clifford module on $\mathbb{C}^4$, related by similarity transformations.
8. Bilinear Covariants (Dirac field, )
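The 16-dim basis of $\mathrm{Cl}(1,3)$ classifies the independent Lorentz-covariant bilinears that can be built from a single Dirac field:
| Length $k$ | Count | Basis element | Bilinear | Lorentz type |
|---|---|---|---|---|
| 0 | 1 | $1$ | $\bar\psi \psi$ | Scalar |
| 1 | 4 | $\gamma^\mu$ | $\bar\psi \gamma^\mu \psi$ | Vector |
| 2 | 6 | $\sigma^{\mu\nu} = \tfrac{i}{2}[\gamma^\mu, \gamma^\nu]$ | $\bar\psi \sigma^{\mu\nu} \psi$ | Antisymmetric tensor |
| 3 | 4 | $\gamma^\mu \gamma^5$ | $\bar\psi \gamma^\mu \gamma^5 \psi$ | Pseudo-vector (axial current) |
| 4 | 1 | $\gamma^5$ | $\bar\psi \gamma^5 \psi$ | Pseudo-scalar |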
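Total $1 + 4 + 6 + 4 + 1 = 16 = 2^4$. These are the building blocks of every Lorentz-invariant fermion interaction:
- $\bar\psi \gamma^\mu \psi\, A_\mu$ in QED (vector × vector).
- $\bar\psi \gamma^\mu (1 - \gamma^5) \psi$ in the weak interaction (V−A coupling, parity-violating).
- $\bar\psi \sigma^{\mu\nu} \psi\, F_{\mu\nu}$ as the magnetic-moment term (Pauli term).
- $(\bar\psi \Gamma \psi)(\bar\psi \Gamma \psi)$ four-fermion contact interactions (Fermi theory, NJL, EFT).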
The 16-dim count is therefore not abstract trivia — it is exactly the dimension of the operator basis you build interactions from.
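The counts in the table are binomial coefficients: a basis element of length $k$ is a choice of $k$ distinct generators out of 4. A short check, using only the standard library:

```python
from math import comb

n = 4  # number of gamma matrices in four spacetime dimensions
counts = [comb(n, k) for k in range(n + 1)]  # one entry per length k = 0..4

print(counts)                  # scalar, vector, tensor, axial vector, pseudoscalar
print(sum(counts) == 2 ** n)   # total size of the Clifford basis
```

The symmetry of the list under $k \leftrightarrow 4 - k$ reflects multiplication by the chirality element, which maps length-$k$ elements to length-$(4-k)$ ones.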
9. Pointers and References
This document is a reference; for derivations and worked examples in the physics context:
- QED/historical.md § 0.1, Gamma matrices, Dirac spinors, notation: the three concrete $\gamma$-matrix representations and bilinear notation conventions used throughout the QFT docs.
- QED/historical.md § 1.3, Derivation of the Clifford algebra and gamma matrices: how $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$ falls out of squaring the Dirac operator (Route A above).
- QFT/preliminaries.md § Universal cover $SL(2, \mathbb{C})$: the Lorentz/Spin story from the Wigner-classification side.
- group-theory.md § Connectedness and discrete components: the general universal-cover machinery that $Spin(p, q)$ instantiates.
External references:
- Lawson & Michelsohn, Spin Geometry — comprehensive mathematical reference (Cl, Spin, Dirac operators, K-theory).
- Hamilton, Mathematical Gauge Theory — Chapters 8–9 give a physics-friendly Clifford / Spin treatment.
- Atiyah, Bott, Shapiro, "Clifford modules" (Topology 3, 1964) — the original classification paper.
- Doran & Lasenby, Geometric Algebra for Physicists — Clifford algebra as a unified language for classical mechanics, EM, and relativity (a different pedagogical tradition).
- Polchinski, String Theory Vol. 2, App. B — Clifford-algebra / spinor tables in arbitrary dimensions, indispensable for SUSY / string spectrum.
Topology, Manifolds, and Smooth Structure
This document collects the point-set topology and differential-geometry definitions that physics texts (including this repository) generally take for granted: topological space, continuity, manifold, smooth structure, smooth map, tangent space, and friends. It is intended as a reference, not a course; for a real treatment see Lee, Introduction to Smooth Manifolds; Munkres, Topology; Nakahara, Geometry, Topology and Physics.
1. Topological Spaces
1.1 Definition
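A topological space is a pair $(X, \tau)$ where $X$ is a set and $\tau$ is a collection of subsets of $X$ called open sets, satisfying:
- (T1) $\emptyset \in \tau$ and $X \in \tau$.
- (T2) Arbitrary unions of open sets are open: $U_\alpha \in \tau$ for all $\alpha$ implies $\bigcup_\alpha U_\alpha \in \tau$.
- (T3) Finite intersections of open sets are open: $U_1, \dots, U_k \in \tau$ implies $U_1 \cap \dots \cap U_k \in \tau$.
The collection $\tau$ is called the topology on $X$. A subset $C \subseteq X$ is closed if its complement $X \setminus C$ is open.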
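On a finite set the axioms reduce to a finite check, because arbitrary unions are finite unions and closure under pairwise unions and intersections extends to any finite number by induction. The function name `is_topology` and the example collections are illustrative.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check (T1)-(T3) for a collection tau of subsets of a finite set X."""
    X = frozenset(X)
    opens = {frozenset(U) for U in tau}
    # (T1) the empty set and the whole space are open
    if frozenset() not in opens or X not in opens:
        return False
    # (T2), (T3): on a finite collection, pairwise closure is enough
    return all((U | V) in opens and (U & V) in opens
               for U, V in combinations(opens, 2))

X = {1, 2, 3}
nested = [set(), {1}, {1, 2}, {1, 2, 3}]          # a chain of open sets
missing_union = [set(), {1}, {2}, {1, 2, 3}]      # {1} | {2} = {1, 2} is absent

print(is_topology(X, nested))
print(is_topology(X, missing_union))
```

The second collection fails only because a union is missing; adding the set containing 1 and 2 would repair it.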
1.2 Continuity
A map $f : X \to Y$ between topological spaces is continuous if the preimage of every open set is open:
$$V \in \tau_Y \;\Longrightarrow\; f^{-1}(V) \in \tau_X.$$
For $\mathbb{R}^n$ with the standard topology this is equivalent to the $\varepsilon$-$\delta$ definition. The advantage of the topological formulation is that it generalizes to arbitrary spaces with no notion of distance.
A homeomorphism is a bijective continuous map with continuous inverse. Two spaces are homeomorphic ($X \cong Y$) if there exists a homeomorphism between them; they are then "the same" topologically.
1.3 Examples
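- Discrete topology: $\tau = \mathcal{P}(X)$, i.e. every set is open. All maps from a discrete space are continuous.
- Indiscrete topology: $\tau = \{\emptyset, X\}$, i.e. only $\emptyset$ and $X$ are open. All maps to an indiscrete space are continuous, while few maps from an indiscrete space are.
- Standard topology on $\mathbb{R}^n$: open sets are arbitrary unions of open balls $B_r(x) = \{y : |y - x| < r\}$.
- Subspace topology: for $A \subseteq X$, $\tau_A = \{U \cap A : U \in \tau\}$.
- Product topology: for $X \times Y$, generated by sets of the form $U \times V$ with $U \in \tau_X$, $V \in \tau_Y$.
- Quotient topology: for an equivalence relation $\sim$ on $X$, declare $U \subseteq X/\!\sim$ open iff its preimage in $X$ is open.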
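The behaviour of the two extreme topologies under maps can be confirmed by enumeration on a two-point set. This sketch counts the continuous maps between each pairing of the discrete and indiscrete topologies; the names `is_continuous` and `count_continuous` are illustrative.

```python
from itertools import product

def is_continuous(f, X, tau_X, tau_Y):
    """f is a dict X -> Y; continuous iff the preimage of every open set is open."""
    opens_X = {frozenset(U) for U in tau_X}
    return all(frozenset(x for x in X if f[x] in V) in opens_X for V in tau_Y)

X = Y = (0, 1)
discrete = [set(), {0}, {1}, {0, 1}]
indiscrete = [set(), {0, 1}]

# All four maps from the two-point set to itself.
all_maps = [dict(zip(X, images)) for images in product(Y, repeat=len(X))]

def count_continuous(tau_X, tau_Y):
    return sum(is_continuous(f, X, tau_X, tau_Y) for f in all_maps)

print(count_continuous(discrete, indiscrete))    # every map out of a discrete space
print(count_continuous(indiscrete, indiscrete))  # every map into an indiscrete space
print(count_continuous(indiscrete, discrete))    # only the constant maps survive
```

Out of a discrete space or into an indiscrete space every map is continuous; from an indiscrete space into a discrete one only the two constant maps are.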
1.4 Useful properties
- Hausdorff (or $T_2$): for any two distinct points $x \neq y$, there exist disjoint open sets $U \ni x$ and $V \ni y$. This separates points by neighbourhoods. (Most physically reasonable spaces are Hausdorff.)
- Second countable: there exists a countable collection $\{B_i\}$ of open sets such that every open set is a union of $B_i$'s. Roughly, "the space is not too big"; this is necessary to avoid pathologies in manifold theory.
- Connected: $X$ cannot be written as a disjoint union of two non-empty open sets. The Lorentz group has four connected components (see QFT/preliminaries.md).
- Compact: every open cover has a finite subcover. For subsets of $\mathbb{R}^n$, compact = closed and bounded (Heine–Borel).
- Path-connected: any two points can be joined by a continuous path $\gamma : [0, 1] \to X$.
- Simply connected: path-connected, and every loop $S^1 \to X$ contracts to a point. The universal cover of a connected space is the simply connected version (see Universal cover in QFT/preliminaries.md).
2. Manifolds
2.1 Topological manifold
A topological manifold of dimension $n$ is a topological space $M$ that is:
- Hausdorff,
- Second countable,
- Locally Euclidean of dimension $n$: every point $p \in M$ has an open neighbourhood $U$ together with a homeomorphism $\varphi : U \to \varphi(U)$ to an open subset of $\mathbb{R}^n$.
The pair $(U, \varphi)$ is called a chart (or local coordinate system); the components $x^i$ of $\varphi$ are local coordinates.
The intuition: "looks like $\mathbb{R}^n$ near every point", but globally may have non-trivial topology (a sphere is a 2-manifold, but no single chart covers all of it).
2.2 Atlas and smooth structure
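A smooth atlas on $M$ is a collection of charts $\{(U_\alpha, \varphi_\alpha)\}$ covering $M$ (i.e. $\bigcup_\alpha U_\alpha = M$), such that any two charts are smoothly compatible: whenever $U_\alpha \cap U_\beta \neq \emptyset$, the transition map
$$\varphi_\beta \circ \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta)$$
is a smooth ($C^\infty$) map between open subsets of $\mathbb{R}^n$.
What "smooth" means. A map $F$ between open subsets of $\mathbb{R}^n$ and $\mathbb{R}^m$ is smooth (or $C^\infty$) if all partial derivatives of all orders exist and are continuous. In coordinates, $F$ has components $F^j(x^1, \dots, x^n)$, and smoothness means $\partial^k F^j / \partial x^{i_1} \cdots \partial x^{i_k}$ exists and is continuous for all $k$. This is the standard multivariable-calculus notion; the manifold framework just lets us say "smooth" on more general spaces by reducing to coordinate patches.
A smooth structure on $M$ is a maximal smooth atlas (one that contains every chart smoothly compatible with all charts in the atlas). A smooth manifold is a topological manifold equipped with a smooth structure.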
2.3 Smooth maps between manifolds
Given smooth manifolds $M$ and $N$, a map $f : M \to N$ is smooth if for every chart $(U, \varphi)$ of $M$ and $(V, \psi)$ of $N$ with $f(U) \subseteq V$, the coordinate representation
$$\psi \circ f \circ \varphi^{-1} : \varphi(U) \to \psi(V)$$
is smooth in the standard sense.
A diffeomorphism is a smooth bijective map with smooth inverse. Two smooth manifolds are diffeomorphic if there is a diffeomorphism between them.
2.4 Examples
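- $\mathbb{R}^n$ itself is a smooth $n$-manifold, with a single global chart $(\mathbb{R}^n, \mathrm{id})$.
- The $n$-sphere $S^n$ is a smooth $n$-manifold; it requires (at minimum) two charts (e.g. stereographic projection from north and south poles).
- The torus $T^n = S^1 \times \cdots \times S^1$.
- The real projective space $\mathbb{RP}^n$ = lines through the origin in $\mathbb{R}^{n+1}$.
- Lie groups: $GL(n, \mathbb{R})$ is an open subset of $\mathbb{R}^{n^2}$ and inherits a smooth structure; $SL(n, \mathbb{R})$, $O(n)$, etc. are smooth submanifolds carved out by polynomial constraints.
- The graph of any smooth function $f : \mathbb{R}^n \to \mathbb{R}^m$ is a smooth $n$-manifold sitting inside $\mathbb{R}^{n+m}$.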
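The two-chart atlas on a sphere can be made explicit in the simplest case, the circle $x^2 + y^2 = 1$. The sketch below assumes stereographic projection from the north pole $(0, 1)$ and the south pole $(0, -1)$ onto the horizontal axis; with these conventions the transition map on the overlap is $u \mapsto 1/u$, which is smooth for $u \neq 0$. The function names are illustrative.

```python
import math

def chart_north(x, y):
    """Stereographic coordinate from the north pole; undefined at (0, 1)."""
    return x / (1 - y)

def chart_south(x, y):
    """Stereographic coordinate from the south pole; undefined at (0, -1)."""
    return x / (1 + y)

def inverse_north(u):
    """Point of the circle with north-chart coordinate u."""
    return (2 * u / (u * u + 1), (u * u - 1) / (u * u + 1))

def transition(u):
    """chart_south composed with inverse_north, valid on the overlap u != 0."""
    return 1 / u

# Sample points on the circle away from both poles.
angles = [0.3, 1.0, 2.5, 4.0, 5.5]
points = [(math.cos(a), math.sin(a)) for a in angles]

print(all(abs(chart_south(x, y) - transition(chart_north(x, y))) < 1e-12
          for x, y in points))
print(all(abs(chart_north(*inverse_north(u)) - u) < 1e-12 for u in (-2.0, 0.5, 3.0)))
```

Neither chart alone covers the circle, but each misses only one point, and on the overlap the change of coordinates is a rational function with no poles, hence smooth.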
2.5 Manifolds with boundary
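Replace "locally Euclidean of dimension $n$" with "locally homeomorphic to either $\mathbb{R}^n$ or the half-space $\mathbb{H}^n = \{x \in \mathbb{R}^n : x^n \geq 0\}$". Points mapped to $\{x^n = 0\}$ form the boundary $\partial M$, itself a manifold of dimension $n - 1$. Examples: the closed disk $D^n$, the closed interval $[0, 1]$.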
3. Tangent Spaces
At each point $p$ of a smooth $n$-manifold $M$ there is a vector space $T_p M$ of dimension $n$, the tangent space at $p$. Three equivalent definitions, each useful in different contexts:
3.1 Velocities of curves
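A smooth curve through $p$ is a smooth map $\gamma : (-\epsilon, \epsilon) \to M$ with $\gamma(0) = p$. Two such curves are equivalent ($\gamma_1 \sim \gamma_2$) if for any chart $(U, \varphi)$ around $p$,
$$\frac{d}{dt}(\varphi \circ \gamma_1)\Big|_{t=0} = \frac{d}{dt}(\varphi \circ \gamma_2)\Big|_{t=0}.$$
The tangent space $T_p M$ is the set of equivalence classes $[\gamma]$.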
3.2 Derivations
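A derivation at $p$ is a linear map $D : C^\infty(M) \to \mathbb{R}$ satisfying the Leibniz rule
$$D(fg) = D(f)\,g(p) + f(p)\,D(g).$$
The space of all derivations at $p$ is $T_p M$. In a chart with coordinates $x^i$, every derivation is uniquely a linear combination
$$D = v^i \frac{\partial}{\partial x^i}\Big|_p,$$
so the partial derivatives $\partial/\partial x^i$ form a basis.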
3.3 Coordinate vectors
In a chosen chart, a tangent vector is just an $n$-tuple $(v^1, \dots, v^n)$, with the transformation rule under change of chart $x \to x'$:
$$v'^i = \frac{\partial x'^i}{\partial x^j}\, v^j.$$
This is the "physicist's definition" — a tangent vector is "something with an upper index that transforms as a contravariant vector".
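The transformation rule can be tried out on a concrete change of chart. The sketch below is only an illustration: the choice of Cartesian and polar charts on the plane, and the helper names `jacobian_cartesian_to_polar` and `transform_components`, are assumptions made for this example.

```python
import math

def jacobian_cartesian_to_polar(x, y):
    """Jacobian matrix d(r, theta)/d(x, y), valid away from the origin."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def transform_components(jacobian, v):
    """Apply v'^i = (dx'^i / dx^j) v^j."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in jacobian]

# Velocity of counterclockwise motion on the unit circle, evaluated at (0, 1).
point = (0.0, 1.0)
v_cartesian = [-2.0, 0.0]
v_polar = transform_components(jacobian_cartesian_to_polar(*point), v_cartesian)
print(v_polar)
```

In polar components the same vector has no radial part and an angular part equal to the angular speed, which is what one expects for motion along a circle.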
3.4 The tangent bundle
The disjoint union $TM = \bigsqcup_{p \in M} T_pM$ is itself a smooth manifold of dimension $2n$, called the tangent bundle. A vector field is a smooth map $X: M \to TM$ with $X(p) \in T_pM$ — equivalently, a smooth section of the projection $\pi: TM \to M$.
4. Lie Groups (where physics meets topology)
A Lie group is a smooth manifold $G$ that is also a group, such that the group operations
$$\mu: G \times G \to G,\ (g, h) \mapsto gh, \qquad \iota: G \to G,\ g \mapsto g^{-1}$$
are smooth maps. Equivalently, multiplication and inversion are differentiable when expressed in local coordinates.
For matrix Lie groups (subgroups of $GL(n, \mathbb{C})$ defined by smooth equations on matrix entries — see group-theory.md § Standard matrix groups), the definition reduces to: the group elements depend smoothly on a finite number of real parameters, and so do products and inverses. Concretely:
- $O(n)$, $U(n)$, $SU(n)$ are all smooth submanifolds of the appropriate $GL(n)$, defined by smooth polynomial equations ($R^T R = 1$, $U^\dagger U = 1$, etc.).
- The Lorentz group $O(1,3)$ is a smooth submanifold of $GL(4, \mathbb{R})$ defined by $\Lambda^T \eta \Lambda = \eta$.
- The Poincaré group is the semidirect product of $\mathbb{R}^{1,3}$ (a manifold) and $O(1,3)$ (a manifold) — itself a manifold.
Lie algebra as the tangent space at the identity
The Lie algebra $\mathfrak{g}$ of $G$ is by definition
$$\mathfrak{g} = T_eG,$$
the tangent space at the identity element $e$. It is a vector space of dimension $\dim G$, equipped with a Lie bracket $[\cdot, \cdot]$ inherited from the group structure (for matrix Lie groups, this is the matrix commutator $[X, Y] = XY - YX$).
The exponential map $\exp: \mathfrak{g} \to G$ takes a Lie algebra element $X$ to the point $\exp(X)$ on the one-parameter subgroup $t \mapsto \exp(tX)$. See group-theory.md § Lie groups and Lie algebras for the physics-oriented treatment using these definitions.
5. Why physics gets away with skipping all this
Almost every Lie group used in physics is a matrix group — a closed subgroup of $GL(n, \mathbb{C})$ for some $n$. For matrix groups:
- "Smooth manifold structure" reduces to "matrix entries depend smoothly on parameters", which is intuitively obvious from calculus.
- The Lie algebra is a subspace of the $n \times n$ matrices $M_n(\mathbb{C})$, with bracket = commutator.
- The exponential map is the matrix exponential $e^X = \sum_{k=0}^\infty X^k / k!$.
So most QFT texts (Peskin–Schroeder, Schwartz, Srednicki, Weinberg, Mandl–Shaw, Georgi) skip the manifold formalism entirely and proceed by example, defining each Lie group concretely as a matrix group. This document exists for the reader who wants the formal definitions filled in.
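To make the matrix-group statements concrete, here is a minimal sketch that exponentiates an element of $\mathfrak{so}(2)$ with a truncated power series and recovers a rotation matrix. The helper names `mat_mul` and `mat_exp` and the truncation at 40 terms are choices made for this illustration, not part of any library.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(x, terms=40):
    """Truncated power series exp(X) = sum_k X^k / k!."""
    n = len(x)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds X^k / k!
        term = [[entry / k for entry in row] for row in mat_mul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

theta = 0.7
generator = [[0.0, -theta], [theta, 0.0]]  # antisymmetric, so an element of so(2)
rotation = mat_exp(generator)
print(rotation)
```

The result agrees with the rotation matrix with entries $\cos\theta$ and $\pm\sin\theta$, which is the one-parameter subgroup generated by this Lie algebra element.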
When the manifold structure does become essential:
- General relativity: spacetime is a non-trivial 4-manifold, and the entire formalism of GR (covariant derivatives, curvature, geodesics) is differential geometry on this manifold. See e.g. Carroll, Spacetime and Geometry; Wald, General Relativity.
- Gauge theory / fibre bundles: the global structure of gauge fields (instantons, monopoles, anomalies) requires viewing gauge fields as connections on principal bundles, which needs full manifold theory. See Nakahara, Geometry, Topology and Physics.
- Topological QFT: the entire framework lives on manifolds, where the topology of the manifold is the central object of study.
6. Pointers
- This repository: math/group-theory.md, QFT/preliminaries.md § Lorentz and Poincaré Groups.
- Lee, Introduction to Topological Manifolds / Introduction to Smooth Manifolds: standard mathematics references.
- Munkres, Topology: standard reference for point-set topology (Chs. 2–4).
- Nakahara, Geometry, Topology and Physics: physics-friendly comprehensive treatment.
- Bredon, Topology and Geometry: a more advanced unified reference.
Physics
Notes on the formal structure of modern physics, organized around the postulational presentations of quantum mechanics and quantum field theory.
Contents
- Quantum Mechanics — non-relativistic QM presented from its seven standard postulates, plus the Heisenberg and path-integral reformulations.
- Quantum Field Theory — relativistic QFT in the Wightman axiomatic style: postulates, the modern Wigner–Weinberg derivation, Fock space, observables (cross sections, decay rates, collider measurements), and specific theories (QED, QCD, electroweak, the Standard Model).
Mathematical prerequisites
The physics pages assume working familiarity with the structures collected in the Mathematics section:
- Group theory — Lie groups, Lie algebras, representations.
- Clifford algebras — for Dirac spinors and relativistic fermions.
- Topology and manifolds — for spacetime and gauge-bundle backgrounds.
Reading order
QM → QFT. Within QFT the preliminaries → postulates → modern foundations → Fock space → particles-as-excitations chain is linear; the observables and specific theories subtrees are independent and can be read in any order.
Quantum Mechanics
Notes on the formal structure of non-relativistic quantum mechanics.
Contents
- Mathematical Preliminaries — Hilbert spaces, bra–ket notation, operators, eigenstates, observables, tensor products, density operators.
- Postulates of Quantum Mechanics — the seven standard postulates: state space, observables, Born rule, measurement collapse, Schrödinger evolution, composite systems, symmetrization.
- The Heisenberg Picture — equivalent reformulation in which operators carry the time dependence and states are fixed; comparison with Schrödinger and interaction pictures.
- Path Integral Formulation — Feynman's sum-over-paths formulation, including preliminaries on Lagrangian mechanics, functionals, Wick rotation, and the generating functional.
Mathematical Preliminaries
This section collects the core mathematical objects used to formulate quantum mechanics. Familiarity with linear algebra over the complex numbers is assumed.
Hilbert Space
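A Hilbert space $\mathcal{H}$ is a complex vector space equipped with an inner product $\langle \cdot | \cdot \rangle$ that is complete with respect to the norm $\|\psi\| = \sqrt{\langle \psi | \psi \rangle}$. The inner product satisfies:
- Conjugate symmetry: $\langle \phi | \psi \rangle = \langle \psi | \phi \rangle^*$.
- Linearity in the second argument: $\langle \phi | a\psi_1 + b\psi_2 \rangle = a \langle \phi | \psi_1 \rangle + b \langle \phi | \psi_2 \rangle$ (physics convention).
- Positive-definiteness: $\langle \psi | \psi \rangle \ge 0$, with equality iff $\psi = 0$.
In quantum mechanics, $\mathcal{H}$ is taken to be separable (it admits a countable orthonormal basis). Examples: $\mathbb{C}^n$ for finite-dimensional systems (e.g. spin), and $L^2(\mathbb{R}^3)$ — square-integrable wavefunctions — for a particle in three-dimensional space.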
Notation Key for Common Hilbert Spaces
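The symbols $\mathbb{C}^n$ and $L^2$ recur throughout these notes. Their meaning:
$\mathbb{C}$ and $\mathbb{C}^n$
- $\mathbb{C}$ is the field of complex numbers; $\mathbb{R}$, $\mathbb{Z}$, $\mathbb{Q}$ are the reals, integers, rationals.
- $\mathbb{C}^n$ is the $n$-dimensional complex vector space: $n$-tuples $(z_1, \dots, z_n)$ with $z_i \in \mathbb{C}$, componentwise addition and complex scalar multiplication.
- It comes with the standard inner product $\langle z | w \rangle = \sum_{i=1}^n z_i^* w_i$, making it a finite-dimensional Hilbert space.
- Typical uses: $\mathbb{C}^2$ for a single qubit / electron spin, $\mathbb{C}^4$ for a Dirac spinor, $\mathbb{C}^N$ for any internal/spin/flavor index space.
$L^p$ and $L^2$
$L^p(X, \mu)$ is the space of complex-valued functions $f: X \to \mathbb{C}$ that are $p$-integrable with respect to the measure $\mu$:
$$L^p(X, \mu) = \left\{ f: X \to \mathbb{C} \ :\ \int_X |f|^p \, d\mu < \infty \right\}.$$
The "$p$" is just the exponent inside the integrand — different choices give different function spaces: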
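| $p$ | Condition on $f$ | Name |
|---|---|---|
| $1$ | $\int \lvert f \rvert \, d\mu < \infty$ | Integrable (absolutely integrable) |
| $2$ | $\int \lvert f \rvert^2 \, d\mu < \infty$ | Square-integrable |
| general $p$ | $\int \lvert f \rvert^p \, d\mu < \infty$ | "$p$-integrable" |
| $\infty$ | $\operatorname{ess\,sup} \lvert f \rvert < \infty$ | Essentially bounded |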
So "$p$-integrable" is the umbrella term, of which "integrable" ($p = 1$) and "square-integrable" ($p = 2$) are the most common special cases. A function being in $L^p$ for one value of $p$ does not imply it is in $L^q$ for another — e.g. $1/x$ on $[1, \infty)$ is in $L^2$ and $L^\infty$ but not in $L^1$.
The "$L$" honors Henri Lebesgue (the integral is the Lebesgue integral, not the Riemann integral). The case $p = 2$ is special:
- $L^2$ is the only $L^p$ that is a Hilbert space, with inner product $\langle f | g \rangle = \int f^* g \, d\mu$.
- It is the natural space for wavefunctions, since $|\psi(x)|^2$ is the probability density (Born rule), and the normalization condition $\int |\psi|^2 \, d^3x = 1$ is precisely the $L^2$ condition.
The argument $X$ specifies the underlying measure space:
| Notation | Meaning |
|---|---|
| $L^2(\mathbb{R})$ | Square-integrable functions on the real line — 1D wavefunctions |
| $L^2(\mathbb{R}^3)$ or $L^2(\mathbb{R}^3, d^3x)$ | Square-integrable functions on 3-space — standard 3D wavefunctions |
| $L^2(\mathbb{R}^3, d^3p)$ | Same Hilbert space, viewed in momentum coordinates (related by Fourier transform) |
| $L^2(S^2)$ | Square-integrable functions on the sphere — angular-momentum eigenfunctions |
| $L^2(\mathbb{R}^{3N})$ | Square-integrable functions of $N$ position vectors — $N$ spinless particles |
Two technicalities usually glossed over in physics: elements of $L^2$ are equivalence classes of functions agreeing almost everywhere (the "value at a single point" is not really meaningful), and position eigenstates $|x\rangle$ are not in $L^2$ — they are distributions in a larger rigged-Hilbert-space construction.
Tensor Products and Direct Sums
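The symbols $\otimes$ and $\oplus$ are used to compose Hilbert spaces (see also Tensor Product below):
- $\mathcal{H}_1 \otimes \mathcal{H}_2$ — both kinds of data simultaneously (e.g. spatial × spin, particle 1 × particle 2). A vector is a "product" $|\psi_1\rangle \otimes |\psi_2\rangle$ or a sum of such products.
- $\mathcal{H}_1 \oplus \mathcal{H}_2$ — either kind of data (orthogonal direct sum). A vector is a pair $(\psi_1, \psi_2)$, with norm $\|(\psi_1, \psi_2)\|^2 = \|\psi_1\|^2 + \|\psi_2\|^2$.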
Common Hilbert-Space Building Blocks
| Hilbert space | Physical system |
|---|---|
| $\mathbb{C}^2$ | Single qubit; electron spin alone (no spatial degrees of freedom) |
| $L^2(\mathbb{R})$ | Spinless particle on a line |
| $L^2(\mathbb{R}^3)$ | Spinless particle in 3D (Schrödinger / Klein–Gordon wavefunction) |
| $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$ | Non-relativistic spin-$\tfrac12$ particle in 3D (Pauli wavefunction) |
| $L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$ | Single Dirac particle in 3D (see QFT/fock-space-inventory.md §0.8) |
| $\left(L^2(\mathbb{R}^3) \otimes \mathbb{C}^2\right)^{\otimes N}$ | $N$ non-relativistic spin-$\tfrac12$ particles |
These building blocks are composed via $\otimes$ and $\oplus$ to give every Hilbert space encountered in QM and QFT. In particular, Fock space (see QFT/fock-space-inventory.md) is built as an orthogonal direct sum of (anti)symmetrized tensor powers of a one-particle space.
Bra–Ket (Dirac) Notation
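- A ket $|\psi\rangle$ denotes a vector in $\mathcal{H}$.
- A bra $\langle\psi|$ denotes the corresponding dual vector (a continuous linear functional on $\mathcal{H}$), so that $\langle\phi|\psi\rangle$ is the inner product.
- The outer product $|\psi\rangle\langle\phi|$ is a linear operator acting as $\big(|\psi\rangle\langle\phi|\big)|\chi\rangle = \langle\phi|\chi\rangle\,|\psi\rangle$.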
State Vector
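A state vector is a unit vector $|\psi\rangle \in \mathcal{H}$ (i.e. $\langle\psi|\psi\rangle = 1$) representing the physical state of a system. State vectors are defined only up to a global phase: $|\psi\rangle$ and $e^{i\theta}|\psi\rangle$ describe the same physical state.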
Linear Operators
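A linear operator $\hat{A}: \mathcal{H} \to \mathcal{H}$ satisfies $\hat{A}\big(a|\psi\rangle + b|\phi\rangle\big) = a\hat{A}|\psi\rangle + b\hat{A}|\phi\rangle$.
- The adjoint $\hat{A}^\dagger$ is defined by $\langle\phi|\hat{A}\psi\rangle = \langle\hat{A}^\dagger\phi|\psi\rangle$ for all $|\phi\rangle, |\psi\rangle \in \mathcal{H}$.
- $\hat{A}$ is Hermitian (self-adjoint) if $\hat{A}^\dagger = \hat{A}$.
- $\hat{U}$ is unitary if $\hat{U}^\dagger\hat{U} = \hat{U}\hat{U}^\dagger = \hat{1}$. Unitary operators preserve the inner product: $\langle\hat{U}\phi|\hat{U}\psi\rangle = \langle\phi|\psi\rangle$.
- $\hat{P}$ is a projector if $\hat{P}^2 = \hat{P}$ and $\hat{P}^\dagger = \hat{P}$.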
Eigenvalues, Eigenstates, and Eigenspaces
For an operator $\hat{A}$ on $\mathcal{H}$:
- An eigenvalue is a scalar $a \in \mathbb{C}$ for which there exists a nonzero vector $|a\rangle$ such that $\hat{A}|a\rangle = a|a\rangle$.
- An eigenstate (or eigenvector) is such a vector $|a\rangle$.
- The eigenspace associated with eigenvalue $a$ is the subspace $E_a = \{|\psi\rangle \in \mathcal{H} : \hat{A}|\psi\rangle = a|\psi\rangle\}$. Its dimension is the degeneracy of $a$.
- The spectrum of $\hat{A}$ is the set of its eigenvalues (more generally, the set of $\lambda \in \mathbb{C}$ for which $\hat{A} - \lambda\hat{1}$ is not invertible).
For a Hermitian operator, all eigenvalues are real, eigenstates corresponding to distinct eigenvalues are orthogonal, and the eigenstates form a complete orthonormal basis of $\mathcal{H}$ (spectral theorem).
Observable
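An observable is a Hermitian operator $\hat{A}$ representing a measurable physical quantity. By the spectral theorem it admits the decomposition
$$\hat{A} = \sum_n a_n \hat{P}_n,$$
where $a_n$ are the (real) eigenvalues and $\hat{P}_n$ is the projector onto the eigenspace $E_{a_n}$. The projectors satisfy $\hat{P}_m \hat{P}_n = \delta_{mn}\hat{P}_n$ and $\sum_n \hat{P}_n = \hat{1}$ (completeness relation).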
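As a small sanity check of the spectral decomposition and the projector identities, the sketch below uses the Pauli matrix $\sigma_x$, whose eigenvalues are $\pm 1$. The choice of $\sigma_x$ and the helper names are assumptions made for this illustration only.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(c, a):
    return [[c * entry for entry in row] for row in a]

identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sigma_x = [[0, 1], [1, 0]]

# Projectors onto the eigenvalue +1 and -1 eigenspaces of sigma_x.
p_plus = [[0.5, 0.5], [0.5, 0.5]]
p_minus = [[0.5, -0.5], [-0.5, 0.5]]

# A = sum_n a_n P_n with a_n = +1, -1.
reconstructed = mat_add(scale(+1, p_plus), scale(-1, p_minus))

print(reconstructed == sigma_x)                # spectral decomposition
print(mat_mul(p_plus, p_plus) == p_plus)       # idempotent
print(mat_mul(p_plus, p_minus) == zero)        # mutually orthogonal
print(mat_add(p_plus, p_minus) == identity)    # completeness
```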
Expectation Value
The expectation value of an observable $\hat{A}$ in the state $|\psi\rangle$ is
$$\langle \hat{A} \rangle_\psi = \langle\psi|\hat{A}|\psi\rangle.$$
It is the statistical mean of measurement outcomes over many identically prepared systems.
Commutator
The commutator of two operators is $[\hat{A}, \hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}$. Two observables can be measured simultaneously with arbitrary precision (they share a common eigenbasis) if and only if $[\hat{A}, \hat{B}] = 0$.
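A quick numerical illustration, using the Pauli matrices as example operators (an assumption of this sketch, chosen because their commutators are well known): $\sigma_x$ and $\sigma_y$ do not commute, while two diagonal matrices do.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[A, B] = AB - BA."""
    ab, ba = mat_mul(a, b), mat_mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_z = [[1, 0], [0, -1]]
diagonal = [[2, 0], [0, 5]]  # arbitrary diagonal observable for comparison

comm_xy = commutator(sigma_x, sigma_y)   # equals 2i * sigma_z
comm_zd = commutator(sigma_z, diagonal)  # vanishes: both are diagonal

print(comm_xy)
print(comm_zd)
```

Since `comm_xy` is nonzero, $\sigma_x$ and $\sigma_y$ have no common eigenbasis; `sigma_z` and `diagonal` share the standard basis.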
Tensor Product
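For Hilbert spaces $\mathcal{H}_A$ and $\mathcal{H}_B$ with bases $\{|i\rangle_A\}$ and $\{|j\rangle_B\}$, the tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$ has basis $\{|i\rangle_A \otimes |j\rangle_B\}$. A general element is
$$|\Psi\rangle = \sum_{i,j} c_{ij}\, |i\rangle_A \otimes |j\rangle_B.$$
A state is separable if it can be written as $|\psi\rangle_A \otimes |\phi\rangle_B$, and entangled otherwise.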
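For two qubits the separability condition can be tested directly on the coefficients: a nonzero pure state $\sum_{ij} c_{ij}\,|i\rangle \otimes |j\rangle$ is a product state exactly when the $2 \times 2$ matrix $c$ has rank 1, i.e. $\det c = 0$. The sketch below applies this criterion; the function name `is_product_state` and the tolerance are illustrative choices, and the determinant test applies only to the two-qubit pure-state case.

```python
import math

def is_product_state(c, tol=1e-12):
    """Two-qubit pure state with coefficients c[i][j]: product state iff det(c) = 0."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol

s = 1 / math.sqrt(2)
bell = [[s, 0], [0, s]]      # (|00> + |11>) / sqrt(2)
product = [[s, s], [0, 0]]   # |0> (x) (|0> + |1>) / sqrt(2)

print(is_product_state(bell))
print(is_product_state(product))
```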
Density Operator (brief)
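A more general description of a quantum state — including statistical mixtures — is given by a density operator $\hat\rho$: a Hermitian, positive semidefinite operator with $\operatorname{Tr}\hat\rho = 1$. A pure state corresponds to $\hat\rho = |\psi\rangle\langle\psi|$; a mixed state is a convex combination $\hat\rho = \sum_k p_k |\psi_k\rangle\langle\psi_k|$ with $p_k \ge 0$ and $\sum_k p_k = 1$.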
Postulates of Quantum Mechanics
See also: Mathematical Preliminaries for definitions of Hilbert space, observables, eigenstates, etc.
Postulate 1 — State Space
The state of an isolated physical system is completely described by a unit vector $|\psi\rangle$ (the state vector) in a complex separable Hilbert space $\mathcal{H}$, called the state space of the system. Two state vectors that differ only by a global phase represent the same physical state.
Postulate 2 — Observables
Every measurable physical quantity $A$ (an observable) is represented by a self-adjoint (Hermitian) linear operator $\hat{A}$ acting on the Hilbert space $\mathcal{H}$. The possible outcomes of a measurement of $A$ are the eigenvalues of the operator:
$$\hat{A}|a_n\rangle = a_n |a_n\rangle.$$
Because $\hat{A}$ is Hermitian, its eigenvalues are real and its eigenvectors form a complete orthonormal basis of $\mathcal{H}$.
Postulate 3 — Measurement (Born Rule)
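If the system is in the normalized state $|\psi\rangle$, the probability of obtaining the eigenvalue $a_n$ when measuring the observable $A$ is
$$P(a_n) = |\langle a_n|\psi\rangle|^2,$$
assuming a non-degenerate spectrum. For a degenerate eigenvalue with projector $\hat{P}_n$ onto its eigenspace,
$$P(a_n) = \langle\psi|\hat{P}_n|\psi\rangle.$$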
Postulate 4 — Collapse (Projective Measurement)
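Immediately after a measurement of $A$ that yields the eigenvalue $a_n$, the state of the system collapses to the normalized projection onto the corresponding eigenspace:
$$|\psi\rangle \ \longrightarrow\ \frac{\hat{P}_n|\psi\rangle}{\sqrt{\langle\psi|\hat{P}_n|\psi\rangle}}.$$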
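Postulates 3 and 4 can be traced through on a three-level system with a doubly degenerate eigenvalue. The following is a sketch under assumed inputs: the state, the projector, and the helper names `apply`, `inner`, and `measure` are all chosen for illustration.

```python
import math

def apply(matrix, vec):
    """Matrix-vector product."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]

def inner(u, v):
    """<u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def measure(projector, state):
    """Return the Born probability <psi|P|psi> and the collapsed, renormalized state.

    Assumes the probability is nonzero; otherwise the outcome cannot occur.
    """
    projected = apply(projector, state)
    probability = inner(state, projected).real
    post_state = [amp / math.sqrt(probability) for amp in projected]
    return probability, post_state

amp = 1 / math.sqrt(3)
state = [amp, amp, amp]                          # equal superposition of three levels
projector = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # degenerate eigenspace spanned by levels 0 and 1

probability, post_state = measure(projector, state)
print(probability, post_state)
```

The outcome occurs with probability 2/3, and the post-measurement state is the equal superposition of the two levels in the degenerate eigenspace: the relative amplitudes inside the eigenspace are preserved, which is the content of projecting rather than picking a basis vector.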
Postulate 5 — Time Evolution
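The time evolution of the state vector of a closed quantum system is governed by the Schrödinger equation:
$$i\hbar \frac{d}{dt}|\psi(t)\rangle = \hat{H}|\psi(t)\rangle,$$
where $\hat{H}$ is the Hamiltonian operator (the observable corresponding to the total energy of the system). Equivalently, evolution is given by a unitary operator $\hat{U}(t, t_0)$ such that $|\psi(t)\rangle = \hat{U}(t, t_0)|\psi(t_0)\rangle$.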
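For a time-independent Hamiltonian the unitary is $e^{-i\hat{H}t/\hbar}$, which is easy to write down when $\hat{H}$ is diagonal. The sketch below assumes a two-level system with $\hat{H} = \tfrac{1}{2}\hbar\omega\,\sigma_z$ (an example chosen here, not taken from the text) and checks that the norm is preserved while the overlap with the initial state oscillates.

```python
import cmath
import math

def evolve(state, omega, t):
    """Apply U(t) = exp(-i H t / hbar) for H = (hbar * omega / 2) * sigma_z.

    H is diagonal in this basis, so U(t) just multiplies each component by a phase.
    """
    phase = cmath.exp(-0.5j * omega * t)
    return [phase * state[0], phase.conjugate() * state[1]]

s = 1 / math.sqrt(2)
plus = [s, s]            # initial state (|0> + |1>) / sqrt(2)
omega, t = 2.0, 0.4

psi_t = evolve(plus, omega, t)
norm = sum(abs(a) ** 2 for a in psi_t)
survival = abs(sum(p * a for p, a in zip(plus, psi_t))) ** 2  # |<+|psi(t)>|^2

print(norm, survival)
```

Analytically the survival probability is $\cos^2(\omega t/2)$, and the norm stays equal to 1 because the evolution is unitary.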
Postulate 6 — Composite Systems
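The state space of a composite physical system is the tensor product of the state spaces of its components. If systems $A$ and $B$ have state spaces $\mathcal{H}_A$ and $\mathcal{H}_B$, then the joint system has state space
$$\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B.$$
If the subsystems are prepared independently in states $|\psi\rangle_A$ and $|\phi\rangle_B$, the joint state is $|\psi\rangle_A \otimes |\phi\rangle_B$. General states in $\mathcal{H}_{AB}$ may be entangled and not expressible as such a product.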
Postulate 7 — Symmetrization (Identical Particles)
The state vector of a system of $N$ identical particles is either fully symmetric under the exchange of any two particles (bosons, integer spin) or fully antisymmetric (fermions, half-integer spin):
$$|\psi(\dots, i, \dots, j, \dots)\rangle = \pm\, |\psi(\dots, j, \dots, i, \dots)\rangle.$$
The Heisenberg Picture
The Heisenberg picture is one of three mathematically equivalent formulations of the time evolution prescribed by Postulate 5 of quantum mechanics — the others being the Schrödinger picture (used implicitly in the postulates page) and the interaction (Dirac) picture. The three pictures are related by a unitary change of basis in time and predict identical physical results; they differ only in what carries the time dependence.
1. The Three Pictures at a Glance
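Take a closed quantum system with time-independent Hamiltonian $\hat{H}$. The unitary time-evolution operator (see postulates §5) is
$$\hat{U}(t) = e^{-i\hat{H}t/\hbar}.$$
| Picture | States | Operators | Equation of motion |
|---|---|---|---|
| Schrödinger (S) | $\lvert\psi_S(t)\rangle = \hat{U}(t)\lvert\psi_S(0)\rangle$ | time-independent | $i\hbar\,\partial_t \lvert\psi_S\rangle = \hat{H}\lvert\psi_S\rangle$ |
| Heisenberg (H) | time-independent | $\hat{A}_H(t) = \hat{U}^\dagger(t)\hat{A}_S\hat{U}(t)$ | $\frac{d\hat{A}_H}{dt} = \frac{i}{\hbar}[\hat{H}, \hat{A}_H]$ |
| Interaction (I) | $\lvert\psi_I(t)\rangle = e^{i\hat{H}_0 t/\hbar}\lvert\psi_S(t)\rangle$ | $\hat{A}_I(t) = e^{i\hat{H}_0 t/\hbar}\hat{A}_S e^{-i\hat{H}_0 t/\hbar}$ | mixed (states under $\hat{V}_I$, operators under $\hat{H}_0$) |
All three reproduce identical matrix elements:
$$\langle\phi_S(t)|\hat{A}_S|\psi_S(t)\rangle = \langle\phi_H|\hat{A}_H(t)|\psi_H\rangle = \langle\phi_I(t)|\hat{A}_I(t)|\psi_I(t)\rangle.$$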
2. Definition
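Choose a reference time $t_0 = 0$ at which the Heisenberg and Schrödinger pictures coincide. For all later times define:
- States are frozen: $|\psi_H\rangle = |\psi_S(0)\rangle$.
- Operators carry all time dependence: $\hat{A}_H(t) = \hat{U}^\dagger(t)\,\hat{A}_S\,\hat{U}(t)$. At the reference time, $\hat{A}_H(0) = \hat{A}_S$.
The Hamiltonian itself is the same in both pictures whenever it is time-independent, since $[\hat{H}, \hat{U}(t)] = 0$:
$$\hat{H}_H(t) = \hat{U}^\dagger(t)\,\hat{H}\,\hat{U}(t) = \hat{H}.$$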
3. The Heisenberg Equation of Motion
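Differentiating the definition with respect to $t$ (and using $i\hbar\, d\hat{U}/dt = \hat{H}\hat{U}$) gives the Heisenberg equation:
$$\frac{d\hat{A}_H}{dt} = \frac{i}{\hbar}[\hat{H}, \hat{A}_H(t)] + \left(\frac{\partial \hat{A}_S}{\partial t}\right)_H.$$
The last term is present only if $\hat{A}_S$ has explicit time dependence (e.g. a time-dependent external field); for most observables ($\hat{x}$, $\hat{p}$, the components of angular momentum, etc.) it vanishes and the equation reduces to
$$i\hbar\,\frac{d\hat{A}_H}{dt} = [\hat{A}_H, \hat{H}].$$
This is the direct quantum analogue of the classical Hamilton equation $dA/dt = \{A, H\}$ with the Poisson bracket replaced by $[\cdot, \cdot]/(i\hbar)$ — a manifestation of the canonical quantization correspondence
$$\{A, B\} \ \longrightarrow\ \frac{1}{i\hbar}[\hat{A}, \hat{B}].$$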
4. Conserved Quantities
If $\hat{A}$ has no explicit time dependence and $[\hat{A}, \hat{H}] = 0$, then $\hat{A}_H(t)$ is constant in time:
$$\frac{d\hat{A}_H}{dt} = 0.$$
Symmetries of the Hamiltonian thus give rise to conserved Heisenberg-picture observables, just as in classical mechanics. The energy itself is always conserved (since $[\hat{H}, \hat{H}] = 0$) for time-independent $\hat{H}$.
5. Example: Harmonic Oscillator
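For $\hat{H} = \frac{\hat{p}^2}{2m} + \frac{1}{2} m\omega^2 \hat{x}^2$, the Heisenberg equations are
$$\frac{d\hat{x}_H}{dt} = \frac{\hat{p}_H}{m}, \qquad \frac{d\hat{p}_H}{dt} = -m\omega^2\, \hat{x}_H,$$
which are formally identical to the classical equations and have solutions
$$\hat{x}_H(t) = \hat{x}\cos\omega t + \frac{\hat{p}}{m\omega}\sin\omega t, \qquad \hat{p}_H(t) = \hat{p}\cos\omega t - m\omega\,\hat{x}\sin\omega t.$$
Equivalently, in terms of ladder operators (which satisfy $[\hat{a}, \hat{a}^\dagger] = 1$, $[\hat{H}, \hat{a}] = -\hbar\omega\,\hat{a}$),
$$\hat{a}_H(t) = \hat{a}\, e^{-i\omega t}, \qquad \hat{a}_H^\dagger(t) = \hat{a}^\dagger e^{i\omega t}.$$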
The Heisenberg picture makes the quantum-classical correspondence completely transparent here: operators evolve along orbits indistinguishable from classical phase-space trajectories.
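For readers who want the intermediate steps, the ladder-operator evolution can be derived in a few lines. The worked equations below assume the standard conventions $\hat{H} = \hbar\omega(\hat{a}^\dagger\hat{a} + \tfrac12)$, $\hat{x} = \sqrt{\hbar/2m\omega}\,(\hat{a} + \hat{a}^\dagger)$ and $\hat{p} = -i\sqrt{m\omega\hbar/2}\,(\hat{a} - \hat{a}^\dagger)$; other normalizations change the prefactors but not the structure.

$$[\hat{H}, \hat{a}] = \hbar\omega\,[\hat{a}^\dagger\hat{a}, \hat{a}] = \hbar\omega\,[\hat{a}^\dagger, \hat{a}]\,\hat{a} = -\hbar\omega\,\hat{a}$$

$$\frac{d\hat{a}_H}{dt} = \frac{i}{\hbar}[\hat{H}, \hat{a}_H] = -i\omega\,\hat{a}_H \quad\Longrightarrow\quad \hat{a}_H(t) = \hat{a}\,e^{-i\omega t}$$

$$\hat{x}_H(t) = \sqrt{\frac{\hbar}{2m\omega}}\left(\hat{a}\,e^{-i\omega t} + \hat{a}^\dagger e^{i\omega t}\right) = \sqrt{\frac{\hbar}{2m\omega}}\left[(\hat{a} + \hat{a}^\dagger)\cos\omega t - i(\hat{a} - \hat{a}^\dagger)\sin\omega t\right] = \hat{x}\cos\omega t + \frac{\hat{p}}{m\omega}\sin\omega t$$

The last equality uses $-i\sqrt{\hbar/2m\omega}\,(\hat{a} - \hat{a}^\dagger) = \hat{p}/(m\omega)$, which follows from the assumed expression for $\hat{p}$.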
6. Example: Free Particle
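For $\hat{H} = \hat{p}^2/2m$,
$$\frac{d\hat{p}_H}{dt} = 0, \qquad \frac{d\hat{x}_H}{dt} = \frac{\hat{p}_H}{m},$$
so $\hat{p}_H(t) = \hat{p}$ and $\hat{x}_H(t) = \hat{x} + \hat{p}\,t/m$. The unequal-time commutator is non-trivial:
$$[\hat{x}_H(t), \hat{x}_H(0)] = \frac{t}{m}[\hat{p}, \hat{x}] = -\frac{i\hbar t}{m}.$$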
This kind of unequal-time commutator becomes central in QFT, where it underlies microcausality.
7. Equivalence with the Schrödinger Picture
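Inserting $|\psi_S(t)\rangle = \hat{U}(t)|\psi_S(0)\rangle$ on either side of the Schrödinger expectation value,
$$\langle\psi_S(t)|\hat{A}_S|\psi_S(t)\rangle = \langle\psi_S(0)|\hat{U}^\dagger(t)\hat{A}_S\hat{U}(t)|\psi_S(0)\rangle = \langle\psi_H|\hat{A}_H(t)|\psi_H\rangle,$$
so all measurable quantities (probabilities, expectation values, transition amplitudes) coincide. Eigenvalues of $\hat{A}_H(t)$ at any fixed $t$ are the same as those of $\hat{A}_S$ — only the eigenvectors shift in time:
$$\hat{A}_S|a\rangle = a|a\rangle \quad\Longrightarrow\quad \hat{A}_H(t)\,\hat{U}^\dagger(t)|a\rangle = a\,\hat{U}^\dagger(t)|a\rangle.$$
The Heisenberg-picture eigenstates evolve backward in time relative to the Schrödinger states they coincide with at $t = 0$.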
8. When to Use Which Picture
| Situation | Preferred picture | Why |
|---|---|---|
| Solving the Schrödinger equation for a wavefunction | Schrödinger | The wavefunction is the unknown to be evolved |
| Studying conserved quantities and their algebraic structure | Heisenberg | $d\hat{A}_H/dt = 0$ is manifest |
| Quantum-classical correspondence | Heisenberg | EoM look like classical Hamilton equations |
| Time-dependent perturbation theory | Interaction | Splits "easy" $\hat{H}_0$ from "hard" $\hat{V}$ |
| Relativistic field theory | Heisenberg | Lorentz-covariant — operators carry the spacetime label |
| Numerical integration of TDSE | Schrödinger | One state vector vs. many evolving operators |
9. Generalization to QFT
In quantum field theory the Heisenberg picture is the standard choice: a field operator $\hat\phi(x) = \hat\phi(t, \mathbf{x})$ carries the full spacetime label, and Lorentz covariance is manifest. The state space is fixed (typically the vacuum or an asymptotic Fock state), and dynamics live entirely in the time evolution of the operators:
$$\hat\phi(t, \mathbf{x}) = e^{i\hat{H}t/\hbar}\,\hat\phi(0, \mathbf{x})\,e^{-i\hat{H}t/\hbar}.$$
Vacuum correlation functions $\langle 0|T\,\hat\phi(x_1)\cdots\hat\phi(x_n)|0\rangle$ — the central objects of QFT — are inherently Heisenberg-picture quantities. See QFT/preliminaries.md § States vs. Fields.
Path Integral Formulation
Feynman's path integral is an alternative — but equivalent — formulation of quantum mechanics. Instead of evolving a state vector with the Schrödinger equation (see the postulates), it expresses transition amplitudes as a sum over all possible classical trajectories, weighted by a phase determined by the classical action.
To make the logical structure transparent, every subsection below is tagged as one of:
- (Definition) — a mathematical object or notation introduced for use later.
- (Postulate) — a primitive assumption of the path-integral formulation. These are the axioms; everything else follows.
- (Derived) — a result that can be proved from the postulates plus the standard postulates of QM.
The path integral is built on two postulates (P1 and P2 below). Everything else in this document is either a definition or a derived consequence.
1. Preliminaries
The path integral uses several concepts from classical mechanics and functional analysis that go beyond the Hilbert-space framework of the Mathematical Preliminaries. Everything in this section is a definition or a recap of classical mechanics — no quantum-mechanical content yet.
1.1 Lagrangian Mechanics (Definition / classical recap)
In the Lagrangian formulation of classical mechanics, the dynamics of a system with generalized coordinates $q(t)$ are encoded in a Lagrangian
$$L(q, \dot{q}, t) = T - V,$$
where $T$ is the kinetic energy and $V$ the potential energy. The action is the time integral of the Lagrangian along a trajectory:
$$S[q] = \int_{t_a}^{t_b} L(q, \dot{q}, t)\, dt.$$
Hamilton's principle of stationary action (a postulate of classical mechanics, not of QM) states that the classical trajectory between fixed endpoints is the one for which $S$ is stationary, yielding the Euler–Lagrange equations
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0.$$
1.2 Functionals and Functional Derivatives (Definition)
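A functional $F[q]$ is a map from a space of functions to numbers. The functional derivative $\delta F/\delta q(t)$ is defined by the variation
$$\delta F = \int dt\, \frac{\delta F}{\delta q(t)}\, \delta q(t).$$
For the action,
$$\frac{\delta S}{\delta q(t)} = \frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}}.$$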
1.3 Functional Integration (Definition)
A functional integral generalizes ordinary integration to integration over a space of functions:
$$\int \mathcal{D}q(t)\; F[q].$$
The symbol $\mathcal{D}q$ is defined by a limiting procedure (time-slicing):
- Discretize time into $N$ slices $t_k = t_a + k\epsilon$ with $\epsilon = (t_b - t_a)/N$.
- Replace the path $q(t)$ by its values $q_k = q(t_k)$ at the slice points.
- Replace $\mathcal{D}q$ by $\prod_{k=1}^{N-1} dq_k$ together with an appropriate normalization factor (fixed by Postulate P1 below so that the resulting evolution is unitary).
- Take $N \to \infty$, $\epsilon \to 0$.
For oscillatory integrands this limit is formal; it acquires rigorous mathematical meaning only after Wick rotation (§1.4), where it becomes the Wiener measure of Brownian motion.
1.4 Wick Rotation (Definition)
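A Wick rotation is the analytic continuation $t \to -i\tau$ (with $\tau$ real). It maps the Lorentzian time axis to the Euclidean one and converts oscillatory factors into exponentially damped ones:
$$e^{iS/\hbar} \ \longrightarrow\ e^{-S_E/\hbar},$$
where $S_E$ is the Euclidean action, obtained from $S$ by replacing $t \to -i\tau$, which flips the relative sign between the kinetic and potential terms, so that $S_E = \int d\tau \left[\frac{m}{2}\left(\frac{dq}{d\tau}\right)^2 + V(q)\right]$ is bounded below whenever $V$ is.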
1.5 Stationary-Phase / Saddle-Point Approximation (Definition / mathematical lemma)
For an integral $I = \int dx\, e^{if(x)/\hbar}$ in the limit $\hbar \to 0$, the integrand oscillates rapidly except near stationary points $f'(x_0) = 0$. Expanding to quadratic order,
$$I \approx e^{if(x_0)/\hbar} \sqrt{\frac{2\pi i\hbar}{f''(x_0)}}.$$
The infinite-dimensional generalization applies to path integrals: in the limit $\hbar \to 0$ they are dominated by paths satisfying $\delta S = 0$. This is a fact about oscillatory integrals, not about quantum mechanics.
1.6 Time Ordering (Definition)
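The time-ordering operator $T$ rearranges a product of operators in order of decreasing time argument:
$$T\big[\hat{A}(t_1)\hat{B}(t_2)\big] = \begin{cases} \hat{A}(t_1)\hat{B}(t_2), & t_1 > t_2, \\ \hat{B}(t_2)\hat{A}(t_1), & t_2 > t_1, \end{cases}$$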
with an extra sign for each pair-exchange of fermionic operators.
1.7 Propagator (Definition)
The propagator (or transition amplitude) is the position-basis matrix element of the time-evolution operator:
$$K(x_b, t_b; x_a, t_a) = \langle x_b|\hat{U}(t_b, t_a)|x_a\rangle.$$
This is purely a definition in terms of objects already present in the canonical formulation (see postulates).
2. The Postulates of the Path Integral
These two statements are the entire content of the path-integral formulation. Everything in §3 is derived from them (combined with the standard postulates of QM).
Postulate P1 — Amplitude as a Sum Over Paths
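The propagator is given by a functional integral over all continuous paths connecting $(x_a, t_a)$ to $(x_b, t_b)$:
$$K(x_b, t_b; x_a, t_a) = \int_{x(t_a) = x_a}^{x(t_b) = x_b} \mathcal{D}x(t)\; e^{iS[x]/\hbar},$$
where $S[x]$ is the classical action along the path. Each path contributes a complex number of unit modulus with phase $S[x]/\hbar$.
Concretely, using the time-slicing definition (§1.3), for a non-relativistic particle with $L = \frac{1}{2}m\dot{x}^2 - V(x)$,
$$K(x_b, t_b; x_a, t_a) = \lim_{N \to \infty} \left(\frac{m}{2\pi i\hbar\epsilon}\right)^{N/2} \int \prod_{k=1}^{N-1} dx_k\; \exp\left[\frac{i}{\hbar}\sum_{k=0}^{N-1} \epsilon\left(\frac{m}{2}\frac{(x_{k+1} - x_k)^2}{\epsilon^2} - V(x_k)\right)\right],$$
with $x_0 = x_a$ and $x_N = x_b$. The normalization factor $(m/2\pi i\hbar\epsilon)^{N/2}$ is part of the postulate — it is fixed by requiring that the resulting evolution be unitary.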
Postulate P2 — Composition (Superposition of Alternatives)
The amplitude for a process composed of two successive sub-processes is the integral over intermediate alternatives of the product of sub-amplitudes:
$$K(x_c, t_c; x_a, t_a) = \int dx_b\; K(x_c, t_c; x_b, t_b)\, K(x_b, t_b; x_a, t_a), \qquad t_a < t_b < t_c.$$
Note. P2 is not logically independent of P1 — given the time-slicing definition of $\mathcal{D}x$, P2 follows from P1. In Feynman's original formulation P1 and P2 were stated as independent axioms, with P2 (the "sum over alternatives" rule) playing the conceptual role of replacing the classical-probability composition law. We list both because the conceptual content is split between them.
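The composition law can be checked numerically in a case where the kernel is known in closed form. Because the real-time integrand is oscillatory, the sketch below works with the Wick-rotated free-particle kernel (the heat kernel) instead, which is an assumption made for numerical convenience; units with $m = \hbar = 1$, the integration window, and the function names are likewise illustrative choices.

```python
import math

def euclidean_kernel(x_b, x_a, tau, m=1.0, hbar=1.0):
    """Free-particle Euclidean propagator for elapsed imaginary time tau > 0."""
    prefactor = math.sqrt(m / (2.0 * math.pi * hbar * tau))
    return prefactor * math.exp(-m * (x_b - x_a) ** 2 / (2.0 * hbar * tau))

def compose(x_c, x_a, tau_2, tau_1, half_width=12.0, steps=4000):
    """Trapezoid-rule integral over the intermediate position x_b."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x_b = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * euclidean_kernel(x_c, x_b, tau_2) * euclidean_kernel(x_b, x_a, tau_1)
    return total * h

# Propagate for tau_1 = 0.3, then tau_2 = 0.5, and compare with a single step of 0.8.
direct = euclidean_kernel(-0.4, 0.2, 0.8)
composed = compose(-0.4, 0.2, tau_2=0.5, tau_1=0.3)
print(direct, composed)
```

The two numbers agree to high precision, reflecting the fact that the convolution of two Gaussians with variances proportional to $\tau_1$ and $\tau_2$ is a Gaussian with variance proportional to $\tau_1 + \tau_2$.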
3. Derived Results
Everything below is a consequence of the postulates above (together with the canonical postulates of QM where applicable).
3.1 Wavefunction Evolution → Schrödinger Equation (Derived)
Given an initial wavefunction $\psi(x_a, t_a)$, inserting a position completeness relation and applying P1 gives
$$\psi(x_b, t_b) = \int dx_a\; K(x_b, t_b; x_a, t_a)\, \psi(x_a, t_a).$$
Expanding the time-sliced form of $K$ for an infinitesimal $\epsilon = t_b - t_a$ and keeping terms to $O(\epsilon)$ reproduces the Schrödinger equation $i\hbar\,\partial_t\psi = \hat{H}\psi$. Hence Postulate 5 of the canonical formulation is derivable from P1 — the path integral is a complete alternative to canonical quantization, not an addition to it.
3.2 Classical Limit (Derived)
Applying the stationary-phase approximation (§1.5) to P1 in the limit $\hbar \to 0$, the integral is dominated by paths satisfying
$$\frac{\delta S}{\delta x(t)} = 0.$$
These are precisely the classical Euler–Lagrange equations. Classical mechanics therefore emerges from the path integral in the $\hbar \to 0$ limit; this is a theorem, not an additional postulate.
3.3 Euclidean (Imaginary-Time) Path Integral (Derived)
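Wick-rotating P1 with $t = -i\tau$ gives the Euclidean propagator
$$K_E(x_b, \tau_b; x_a, \tau_a) = \int \mathcal{D}x(\tau)\; e^{-S_E[x]/\hbar},$$
with Euclidean action $S_E[x] = \int d\tau \left[\frac{m}{2}\left(\frac{dx}{d\tau}\right)^2 + V(x)\right]$. This is mathematically a genuine measure (the Wiener measure in the free case) and is the foundation for non-perturbative methods (instantons, lattice quantization, the QFT–statistical-mechanics correspondence).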
3.4 Correlation Functions (Derived)
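Time-ordered vacuum expectation values of Heisenberg-picture operators have the path-integral representation
$$\langle 0|T\big[\hat{x}(t_1)\cdots\hat{x}(t_n)\big]|0\rangle = \frac{\int \mathcal{D}x\; x(t_1)\cdots x(t_n)\, e^{iS[x]/\hbar}}{\int \mathcal{D}x\; e^{iS[x]/\hbar}}.$$
The time-ordering on the operator side appears automatically because the $x(t_i)$ under the integral are ordinary commuting numbers.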
3.5 Generating Functional (Derived)
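A compact way to encode all correlation functions is the generating functional
$$Z[J] = \int \mathcal{D}x\; \exp\left[\frac{i}{\hbar}\left(S[x] + \int dt\, J(t)\,x(t)\right)\right].$$
Time-ordered correlators are obtained as functional derivatives:
$$\langle 0|T\big[\hat{x}(t_1)\cdots\hat{x}(t_n)\big]|0\rangle = \frac{1}{Z[0]}\left(\frac{\hbar}{i}\right)^n \frac{\delta^n Z[J]}{\delta J(t_1)\cdots\delta J(t_n)}\bigg|_{J=0}.$$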
This formalism generalizes directly to quantum field theory.
4. Equivalence with the Canonical Formulation
The path-integral and canonical (operator) formulations of quantum mechanics are mathematically equivalent: each can be derived from the other.
- Canonical → Path integral: P1 can be proved by repeated insertion of position-basis completeness relations into $\langle x_b|e^{-i\hat{H}(t_b - t_a)/\hbar}|x_a\rangle$ and using the Trotter product formula. From this perspective P1 is a theorem, not a postulate.
- Path integral → Canonical: Conversely, taking P1 (and P2) as primitive, one derives the Schrödinger equation (§3.1), the Hilbert-space structure, and the Born rule. From this perspective the canonical postulates are theorems.
Whether to call P1 a "postulate" or a "theorem" therefore depends on which formulation one takes as foundational. The two are interchangeable axiomatic starting points for the same physical theory; their advantages are complementary:
- Canonical formulation: transparent treatment of states, measurement, and the Hilbert-space structure; well-suited to non-relativistic problems and perturbation theory in the interaction picture.
- Path integral: manifest classical limit, natural treatment of symmetries (especially gauge symmetries), straightforward generalization to field theory and curved backgrounds, and a direct bridge to statistical mechanics via Wick rotation.
Quantum Field Theory
Notes on the formal structure of relativistic quantum field theory, in the Wightman axiomatic style.
General formalism
- Definitions and Preliminaries — Minkowski spacetime, the Lorentz/Poincaré groups, Wigner's classification, classical fields and Lagrangians, canonical structure, operator-valued distributions, Fock space, vacuum, correlation functions, the S-matrix, symmetries and Noether currents, gauge fields, regularization and renormalization.
- Postulates of Quantum Field Theory — the ten Wightman-style postulates: relativistic state space, spectrum condition, unique vacuum, field operators, Poincaré covariance, microcausality, spin–statistics, vacuum cyclicity, dynamics from a local action, asymptotic completeness / S-matrix.
- Modern Foundations — Wigner–Weinberg Derivation — the modern derivation of the QFT postulates from three primitive inputs (special relativity + quantum mechanics + cluster decomposition); fields, microcausality, spin–statistics, antiparticles, , and gauge invariance for massless spin-1 all emerge as theorems.
- Fock Space Inventory — what spaces, states, and operators exist after second quantization; clarifies the distinct roles of field operators, ladder operators, mode coefficients, and state vectors.
- Particles as Excitations of Quantum Fields — concrete unpacking of the slogan "the electron is an excitation of the electron field", in terms of specific Fock-space vectors and operators.
- Observables of QFT — the map of experimentally measurable quantities (cross sections, decay rates, branching ratios, asymmetries, bound-state energies, form factors, IR-safe QCD observables, masses, couplings, etc.) and which part of the QFT machinery produces each. Concrete master formulas live in the observables/ subfolder:
  - Cross Sections — the master formula for the differential cross section; flux factor, Lorentz-invariant phase space, Mandelstam variables, optical theorem, units (barns).
  - Decay Rates — the parallel master formula for the differential decay rate; lifetimes, branching ratios, the muon-lifetime worked example.
- Remarks and Open Issues — the Wightman reconstruction theorem, Haag's theorem, gauge theories, the status of rigorous construction, the algebraic (Haag–Kastler) reformulation, and the status of measurement and collapse in QFT.
Specific theories
See theories/ for concrete QFTs:
Definitions and Preliminaries
Familiarity with non-relativistic quantum mechanics (Hilbert spaces, Hermitian operators, the Born rule, etc.) is assumed — see the QM notes. This page collects the additional structures specific to QFT.
Minkowski Spacetime
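QFT is formulated on Minkowski spacetime $\mathbb{R}^{1,3}$ with metric
$$\eta_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1).$$
A spacetime point is denoted $x^\mu = (t, \mathbf{x})$ (with $c = 1$). The invariant interval $x^2 = \eta_{\mu\nu}x^\mu x^\nu = t^2 - \mathbf{x}^2$ classifies separations as timelike ($x^2 > 0$), lightlike ($x^2 = 0$), or spacelike ($x^2 < 0$). Greek indices run over $0, 1, 2, 3$ and are raised/lowered with $\eta_{\mu\nu}$; repeated indices are summed (Einstein convention).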
Group Theory Prerequisites
The Lorentz, Poincaré, and internal symmetry groups (e.g. $U(1)$, $SU(2)$, $SU(3)$) used throughout QFT are all instances of Lie groups with associated Lie algebras and representations. The abstract definitions — group axioms, subgroups and quotients, homomorphisms, representations (unitary, irreducible, projective), the standard matrix groups, Lie groups and Lie algebras (generators, structure constants, Casimirs), direct and semidirect products, compactness, connectedness and universal covers, and worked examples ($SO(3)$ and $SU(2)$) — are collected separately in math/group-theory.md. The rest of this page assumes that material as background.
Lorentz and Poincaré Groups
The Lorentz and Poincaré groups are the symmetry groups of special relativity, and are the geometric input of every relativistic quantum theory. They are 6- and 10-dimensional Lie groups respectively; this section catalogues their components, generators, algebra, representations, and the Casimir invariants used to label particles.
The Lorentz group
The Lorentz group $O(1,3)$ consists of linear transformations $\Lambda$ preserving the metric:
$$O(1,3) = \{\Lambda \in GL(4, \mathbb{R}) : \Lambda^T \eta \Lambda = \eta\}.$$
It is a 6-dimensional non-compact Lie group with four disconnected components, distinguished by two $\mathbb{Z}_2$-valued discrete invariants:
- $\det\Lambda = \pm 1$ — proper ($+1$) vs. improper ($-1$).
- $\operatorname{sgn}\Lambda^0{}_0 = \pm 1$ — orthochronous ($\Lambda^0{}_0 \ge 1$, preserves the direction of time) vs. non-orthochronous ($\Lambda^0{}_0 \le -1$).
The four components are connected to one another by the discrete operations parity $P$ and time reversal $T$:
| Component | Symbol | Contains | Got from $L_+^\uparrow$ by |
|---|---|---|---|
| Proper, orthochronous | $L_+^\uparrow$ | identity, rotations, boosts | (identity component) |
| Improper, orthochronous | $L_-^\uparrow$ | parity-flipped rotations | $P$ |
| Proper, non-orthochronous | $L_+^\downarrow$ | $PT$-combined rotations | $PT$ |
| Improper, non-orthochronous | $L_-^\downarrow$ | time-flipped rotations | $T$ |
The proper orthochronous Lorentz group $L_+^\uparrow = SO^+(1,3)$ is the identity-component subgroup; the discrete symmetries $P$ and $T$ are separate (and may or may not be symmetries of a given physical theory — the weak interaction violates both).
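The two discrete invariants are easy to compute for explicit matrices. The sketch below checks the defining condition $\Lambda^T\eta\Lambda = \eta$ and classifies a boost, a parity-flipped boost, and time reversal. The helper names are illustrative; the metric is written as $\mathrm{diag}(1,-1,-1,-1)$, though the defining condition is unchanged under an overall sign flip of $\eta$.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def det(a):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def boost_x(rapidity):
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def preserves_metric(lam, tol=1e-12):
    lhs = mat_mul(mat_mul(transpose(lam), ETA), lam)
    return all(abs(lhs[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

def classify(lam):
    """Return the (det sign, sign of Lambda^0_0) labels of a Lorentz matrix."""
    properness = "proper" if det(lam) > 0 else "improper"
    chronology = "orthochronous" if lam[0][0] > 0 else "non-orthochronous"
    return properness, chronology

boost = boost_x(0.5)
print(preserves_metric(boost), classify(boost))
print(classify(mat_mul(PARITY, boost)))
print(classify(TIME_REVERSAL))
```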
The Poincaré group
Spacetime translations commute with each other but not with Lorentz transformations: combining a Lorentz transformation $\Lambda$ followed by a translation $a$ gives $x \mapsto \Lambda x + a$, and the order matters. This is captured by the semidirect product structure:
$$\mathcal{P} = \mathbb{R}^{1,3} \rtimes O(1,3),$$
with group multiplication
$$(\Lambda_1, a_1)(\Lambda_2, a_2) = (\Lambda_1\Lambda_2,\ a_1 + \Lambda_1 a_2).$$
The four-dimensional translation subgroup $\mathbb{R}^{1,3}$ is normal in $\mathcal{P}$; the Lorentz subgroup is not (boosting and translating do not commute). $\mathcal{P}$ is 10-dimensional (6 Lorentz + 4 translations) and has the same four-component structure as the Lorentz group.
The proper orthochronous Poincaré group $\mathcal{P}_+^\uparrow$ is the connected identity component. It is the symmetry group of relativistic physics (excluding the discrete operations, which are treated separately).
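The semidirect-product multiplication rule can be verified against the action on points. The sketch below assumes the action $x \mapsto \Lambda x + a$ and the product $(\Lambda_1, a_1)(\Lambda_2, a_2) = (\Lambda_1\Lambda_2,\ a_1 + \Lambda_1 a_2)$, and works in 1+1 dimensions to keep the matrices small; all function names are illustrative.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def vec_add(u, v):
    return [a + b for a, b in zip(u, v)]

def boost(rapidity):
    """Lorentz boost in 1+1 dimensions acting on (t, x)."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, sh], [sh, ch]]

def act(g, x):
    """Poincare action x -> Lambda x + a."""
    lam, a = g
    return vec_add(mat_vec(lam, x), a)

def compose(g1, g2):
    """Group product: apply g2 first, then g1."""
    (lam1, a1), (lam2, a2) = g1, g2
    return mat_mul(lam1, lam2), vec_add(a1, mat_vec(lam1, a2))

g1 = (boost(0.3), [1.0, 0.0])
g2 = (boost(-0.7), [0.0, 2.0])
x = [0.5, -1.5]

sequential = act(g1, act(g2, x))
combined = act(compose(g1, g2), x)
print(sequential, combined)
print(compose(g1, g2)[1], compose(g2, g1)[1])  # translation parts depend on the order
```

The translation part of the product picks up the factor $\Lambda_1 a_2$, which is why translations form a normal subgroup while the two orderings of a boost and a translation give different group elements.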
Lie algebra and generators
The Lie algebra of $\mathcal{P}_+^\uparrow$ has 10 generators:
- 4 translation generators $P^\mu$ (energy–momentum), $\mu = 0, 1, 2, 3$.
- 6 Lorentz generators $M^{\mu\nu} = -M^{\nu\mu}$, conventionally split into:
  - 3 rotation generators $J_i$ (angular momentum),
  - 3 boost generators $K_i$.
The defining commutation relations are
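In rotation/boost form these read:
$$[J_i, J_j] = i\epsilon_{ijk}J_k, \qquad [J_i, K_j] = i\epsilon_{ijk}K_k, \qquad [K_i, K_j] = -i\epsilon_{ijk}J_k.$$
The combinations $A_i = \tfrac12(J_i + iK_i)$ and $B_i = \tfrac12(J_i - iK_i)$ satisfy two independent $\mathfrak{su}(2)$ algebras:
$$[A_i, A_j] = i\epsilon_{ijk}A_k, \qquad [B_i, B_j] = i\epsilon_{ijk}B_k, \qquad [A_i, B_j] = 0,$$
so the complexified Lorentz algebra factorizes as $\mathfrak{so}(1,3)_{\mathbb{C}} \cong \mathfrak{su}(2)_{\mathbb{C}} \oplus \mathfrak{su}(2)_{\mathbb{C}}$. Finite-dimensional representations are therefore labelled by two half-integers $(j_1, j_2)$ — see Representations below. (The $A_i$, $B_i$ are not Hermitian on their own, since $iK_i$ is anti-Hermitian in unitary representations; the factorization is at the level of the complexified algebra.)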
Universal cover
The proper orthochronous Lorentz group $SO^+(1,3)$ is doubly connected: there are loops in it (e.g. $2\pi$ rotations) that cannot be continuously contracted to a point, but all such loops contract after being traversed twice (a $4\pi$ rotation). Its universal cover is
$$SL(2,\mathbb{C}) \to SO^+(1,3),$$
where the kernel of the covering map is $\{\pm 1\} \cong \mathbb{Z}_2$. The Poincaré universal cover is correspondingly $\mathbb{R}^{1,3} \rtimes SL(2,\mathbb{C})$.
Why we care. Single-valued representations of $SL(2,\mathbb{C})$ that do not factor through the covering map are double-valued representations of $SO^+(1,3)$, i.e. spinor representations. To accommodate spin-$\tfrac{1}{2}$ particles (electrons, quarks, neutrinos) the relevant covering group is $SL(2,\mathbb{C})$, not $SO^+(1,3)$. This is the deep reason a $2\pi$ rotation flips the sign of a spinor wavefunction: at the level of the fundamental physical group, "$2\pi$ rotation" is not the identity but $-1$.
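The sign flip under a $2\pi$ rotation is easy to see numerically. This is a minimal sketch for rotations about the $z$ axis only, using the standard spin-$\tfrac{1}{2}$ matrix $e^{-i\theta\sigma_z/2}$ and the ordinary 3×3 rotation matrix; the function names are illustrative.

```python
import cmath
import math


def su2_rotation_z(theta):
    """exp(-i theta sigma_z / 2): the spin-1/2 rotation about z (diagonal)."""
    return [[cmath.exp(-0.5j * theta), 0j], [0j, cmath.exp(0.5j * theta)]]


def so3_rotation_z(theta):
    """Ordinary rotation of 3-vectors about z by the same angle."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def is_multiple_of_identity(m, factor, tol=1e-12):
    n = len(m)
    return all(
        abs(m[i][j] - (factor if i == j else 0)) < tol
        for i in range(n) for j in range(n)
    )
```

At $\theta = 2\pi$ the vector rotation is the identity while the spinor rotation is $-1$; only at $\theta = 4\pi$ do both return to the identity.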
Representations
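Finite-dimensional irreducible representations of $SL(2,\mathbb{C})$ (= projective irreps of $SO^+(1,3)$) are labelled by a pair $(j_L, j_R)$ with $j_L, j_R \in \{0, \tfrac{1}{2}, 1, \dots\}$, of dimension $(2j_L+1)(2j_R+1)$. The most-used cases:
| $(j_L, j_R)$ | Dimension | Field type | Examples |
|---|---|---|---|
| $(0,0)$ | 1 | Lorentz scalar | Higgs, pion |
| $(\tfrac{1}{2},0)$ | 2 | left-handed Weyl spinor | left-handed neutrino field |
| $(0,\tfrac{1}{2})$ | 2 | right-handed Weyl spinor | right-handed component of the electron |
| $(\tfrac{1}{2},0)\oplus(0,\tfrac{1}{2})$ | 4 | Dirac spinor | electron, quark |
| $(\tfrac{1}{2},\tfrac{1}{2})$ | 4 | 4-vector | photon $A^\mu$ |
| $(1,0)\oplus(0,1)$ | 6 | self-dual + anti-self-dual 2-form | field strength $F^{\mu\nu}$ |
| $(1,\tfrac{1}{2})\oplus(\tfrac{1}{2},1)$ | 12 | Rarita–Schwinger spinor-vector | gravitino |
| $(1,1)$ | 9 | symmetric traceless tensor | graviton (linearized) |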
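The highest spin contained in a finite-dimensional representation $(j_L, j_R)$ is $j_L + j_R$: under the rotation subgroup $SU(2)$, the rep decomposes as $\bigoplus_{j = |j_L - j_R|}^{j_L + j_R} (j)$, with $j$ increasing in integer steps.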
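The dimension formula and the rotation-subgroup decomposition can be tabulated with exact half-integers. A minimal sketch, assuming the labels are passed as `fractions.Fraction` values; `dimension` and `spin_content` are illustrative names.

```python
from fractions import Fraction


def dimension(j_left, j_right):
    """Dimension (2 j_L + 1)(2 j_R + 1) of the irrep (j_L, j_R)."""
    return int((2 * j_left + 1) * (2 * j_right + 1))


def spin_content(j_left, j_right):
    """Spins appearing under the rotation subgroup: |j_L - j_R|, ..., j_L + j_R."""
    spins = []
    j = abs(j_left - j_right)
    while j <= j_left + j_right:
        spins.append(j)
        j += 1
    return spins


HALF = Fraction(1, 2)
VECTOR = (HALF, HALF)            # the 4-vector representation
SPINOR_VECTOR = (Fraction(1), HALF)
```

For the 4-vector, `spin_content(*VECTOR)` gives spins 0 and 1 (the time component and the spatial 3-vector), and the sum of $2j+1$ over the listed spins reproduces `dimension`.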
These are the representations carried by classical fields and by field operators in QFT. Apart from the trivial representation $(0,0)$ they are all non-unitary (because $SL(2,\mathbb{C})$ is non-compact), which is why they describe field components, not Hilbert-space states. Unitary representations of the Poincaré group are infinite-dimensional and are labelled differently (see Wigner's Classification below).
Casimir invariants
The Poincaré algebra has two independent Casimir operators (commuting with all generators):
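- $P^2 = P^\mu P_\mu$: the mass-squared invariant. On any irreducible unitary representation it acts as a scalar $m^2$.
- $W^2 = W^\mu W_\mu$, where $W^\mu = \tfrac{1}{2}\epsilon^{\mu\nu\rho\sigma} P_\nu M_{\rho\sigma}$ is the Pauli–Lubanski vector. On an irrep with $m > 0$ it acts as $-m^2 s(s+1)$ (mostly-minus signature) with $s$ the spin; on a massless irrep its eigenvalues are different (see Wigner classification below).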
These two scalars are the only Poincaré-invariant labels of single-particle states, and they are the basis of Wigner's classification.
Little groups
For each non-zero momentum $p$, the little group $G_p$ is the subgroup of Lorentz transformations leaving $p$ fixed. Wigner's strategy is to classify states first at a standard momentum $k$ (a representative of each Lorentz orbit) by their little-group transformation, then propagate to all other momenta by Lorentz boost.
| Orbit of $p$ | Standard $k$ | Little group | Physical interpretation |
|---|---|---|---|
| $p^2 = m^2 > 0$, $p^0 > 0$ | $(m, 0, 0, 0)$ | $SO(3)$ (rotations) | massive particle, spin $s$ |
| $p^2 = 0$, $p^0 > 0$ | $(E, 0, 0, E)$ | $ISO(2)$ (2D Euclidean) | massless particle, helicity $h$ |
| $p^2 < 0$ | $(0, 0, 0, \mu)$ | $SO(1,2)$ | tachyon (unphysical) |
| $p = 0$ | $(0, 0, 0, 0)$ | $SO(1,3)$ | vacuum |
| $p^2 = 0$, but with the $ISO(2)$ translations acting non-trivially | $(E, 0, 0, E)$ | $ISO(2)$ | "continuous-spin" reps (not observed) |
The little group structure is what produces the discrete spin/helicity quantum numbers attached to single-particle states.
Why this matters
Three things in QFT all trace back to the Poincaré group structure laid out above:
- Wigner classification of single-particle states (next subsection) — uses the Casimirs and little groups.
- Field representations — the table above lists which classical/operator-valued field types are available, used in QFT Postulate 5 and in specifying field content for any specific theory (QED, QCD, ...).
- Spin–statistics, $CPT$, gauge invariance for massless spin-1: all derive from the Poincaré structure plus locality / cluster decomposition in the modern story (foundations-modern.md).
Wigner's Classification of Particles
A particle is identified with an irreducible unitary representation of the Poincaré group on a complex separable Hilbert space. Wigner's theorem (1939) classifies these irreps. The result is that every physical irrep is labelled by two Casimirs (mass and spin/helicity); the structure of the classification follows from Mackey's induced-representation theorem applied to the semidirect-product structure $\mathbb{R}^{1,3} \rtimes SL(2,\mathbb{C})$.
Statement
The irreducible unitary representations of $\mathcal{P}_+^\uparrow$ (or, more precisely, of its universal cover) are classified by:
- Mass-squared $m^2 = p^\mu p_\mu$ with $m^2 \in \mathbb{R}$, the eigenvalue of the first Casimir.
- A choice of irrep of the little group of a standard momentum on the corresponding mass shell:
  - $m^2 > 0$: little group $SU(2)$, irreps labelled by spin $s \in \{0, \tfrac{1}{2}, 1, \dots\}$.
  - $m^2 = 0$: little group $ISO(2)$ (in its covering form), irreps labelled by helicity $h \in \tfrac{1}{2}\mathbb{Z}$ (with helicity quantized in integer/half-integer units by rotation closure of the universal cover; "continuous-spin" reps are mathematically allowed but empirically absent).
  - $m^2 < 0$ (tachyons): little group $SO(1,2)$, unphysical and excluded by the spectrum condition (P2).
  - $p = 0$: trivial rep, the vacuum.
Sketch of derivation (Mackey induction)
The classification proceeds in five steps. Each is mechanical given the Lie-algebra and topology data already in § Lorentz and Poincaré Groups; we sketch the logic and defer technical proofs to references.
Step 1 — Diagonalize translations. $\mathbb{R}^{1,3}$ is abelian and normal; on any unitary representation it can be simultaneously diagonalized, with eigenvalues labelled by a four-momentum $p^\mu$ (the spectrum of $P^\mu$). So the Hilbert space decomposes as a direct integral
$$\mathcal{H} = \int^{\oplus} d\mu(p)\, \mathcal{H}_p,$$
with $\mathcal{H}_p$ labelling extra degrees of freedom at each $p$.
Step 2 — Lorentz orbits. Lorentz transformations move $p$ around. Irreducibility forces the support of $\mu$ to be a single Lorentz orbit (otherwise the representation would split into pieces supported on different orbits). The orbits are:
| Orbit | Standard $k$ | Sign of $p^2$ |
|---|---|---|
| Massive forward | $(m, 0, 0, 0)$, $m > 0$ | $p^2 > 0$ |
| Massless forward | $(E, 0, 0, E)$, $E > 0$ | $p^2 = 0$ |
| Tachyonic | $(0, 0, 0, \mu)$ | $p^2 < 0$ |
| Trivial | $(0, 0, 0, 0)$ | $p = 0$ |
(Backward-pointing orbits with $p^0 < 0$ are excluded by the spectrum condition.)
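Step 3 — Pick a standard momentum and identify the little group. For each orbit $\mathcal{O}$, pick a representative $k \in \mathcal{O}$ and define the little group
$$G_k = \{\Lambda \in SO^+(1,3) : \Lambda k = k\},$$
i.e. the Lorentz transformations that fix $k$. The little groups for each orbit (computed by direct algebra):
| Orbit | Little group $G_k$ | Covering group acting in irreps |
|---|---|---|
| Massive | $SO(3)$ | $SU(2)$ |
| Massless | $ISO(2)$ | covered by $\widetilde{ISO(2)}$ |
| Tachyonic | $SO(1,2)$ | $SL(2,\mathbb{R})$ |
The relevant little group is the covering group (since we want projective reps of $SO^+(1,3)$, i.e. ordinary reps of its universal cover $SL(2,\mathbb{C})$; see math/group-theory.md § Connectedness and discrete components).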
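Step 4 — Mackey induction: irreps of $\mathcal{P}$ ↔ irreps of $G_k$. Mackey's induced-representation theorem (a general result for semidirect products $N \rtimes H$ with $N$ abelian) states:
The irreducible unitary representations of $\mathcal{P}$ supported on a Lorentz orbit $\mathcal{O}$ are in bijective correspondence with the irreducible unitary representations of the little group $G_k$ of any standard momentum $k \in \mathcal{O}$.
Concretely, given an irrep $D$ of $G_k$ on a space $V_D$, the induced rep is built on the Hilbert space $L^2(\mathcal{O}, d\mu) \otimes V_D$. A Lorentz transformation $\Lambda$ acts by $U(\Lambda)\,|p, \sigma\rangle = \sum_{\sigma'} D_{\sigma'\sigma}\big(W(\Lambda, p)\big)\,|\Lambda p, \sigma'\rangle$ (up to a normalization factor fixed by the choice of measure), where the Wigner rotation $W(\Lambda, p) = L(\Lambda p)^{-1}\,\Lambda\,L(p)$ is the little-group-valued rotation that compensates for the $p$-dependent standard boost $L(p)$ taking $k$ to $p$ (see Weinberg Vol. 1 §2.5 for the explicit construction). Irreducibility on the $\mathcal{P}$ side ⇔ irreducibility of $D$ on the $G_k$ side.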
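The statement that the Wigner rotation lands in the little group can be checked numerically in the massive case. The sketch below assumes the standard momentum $k = (m, 0, 0, 0)$, takes $L(p)$ to be the pure boost from rest to $p$, and uses $L(p)^{-1} = L(-\mathbf{p})$, which holds for pure boosts. The function names and the sample numbers are illustrative.

```python
import math


def matmul(a, b):
    return [[sum(a[i][n] * b[n][j] for n in range(4)) for j in range(4)] for i in range(4)]


def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def standard_boost(p_spatial, mass):
    """Pure boost L(p) taking k = (mass, 0, 0, 0) to p = (E, p_spatial)."""
    energy = math.sqrt(mass ** 2 + sum(c * c for c in p_spatial))
    boost = [[0.0] * 4 for _ in range(4)]
    boost[0][0] = energy / mass
    for i in range(3):
        boost[0][i + 1] = boost[i + 1][0] = p_spatial[i] / mass
        for j in range(3):
            boost[i + 1][j + 1] = (1.0 if i == j else 0.0) + (
                p_spatial[i] * p_spatial[j] / (mass * (energy + mass))
            )
    return boost


def boost_x(rapidity):
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s, 0.0, 0.0], [s, c, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


def wigner_rotation(lam, p_spatial, mass):
    """W(lam, p) = L(lam p)^{-1} lam L(p), with L(q)^{-1} = L(-q) for a pure boost."""
    to_p = standard_boost(p_spatial, mass)
    p_new = matvec(lam, matvec(to_p, [mass, 0.0, 0.0, 0.0]))
    back = standard_boost([-c for c in p_new[1:]], mass)
    return matmul(back, matmul(lam, to_p))


MASS = 1.5
W = wigner_rotation(boost_x(0.7), [0.3, 0.4, -0.2], MASS)
```

The resulting `W` fixes the rest-frame momentum and has a trivial time row and column, so it is a pure spatial rotation, even though it was assembled from three boosts.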
Step 5 — Classify irreps of each little group.
- $SU(2)$ (massive): irreps are the spin-$s$ reps with $s \in \{0, \tfrac{1}{2}, 1, \dots\}$ of dimension $2s+1$. (Standard representation theory; see the worked examples in math/group-theory.md.)
- Universal cover of $ISO(2)$ (massless): the abelian subgroup $\mathbb{R}^2$ has unitary characters labelled by a vector $\vec{\kappa} \in \mathbb{R}^2$. Two cases:
  - $\vec{\kappa} = 0$: trivial action, leaving the rotation generator $J_3$ to label irreps by helicity $h$. For the cover, $h$ is quantized in $\tfrac{1}{2}\mathbb{Z}$.
  - $\vec{\kappa} \neq 0$: the continuous-spin representations, parameterized by $|\vec{\kappa}|^2 > 0$. Allowed mathematically, not observed in nature (their absence is empirical, sometimes posed as an extra "Wigner condition").
- $SO(1,2)$ (tachyonic): irreps exist but tachyonic states violate causality and are excluded by the spectrum condition (P2).
Conclusion
Combining Steps 1–5: every irreducible unitary representation of $\mathcal{P}$ supported on a forward orbit (massive, or massless with discrete helicity) is labelled by
- mass $m \ge 0$, and
- spin $s$ (massive) or helicity $h$ (massless).
This is Wigner's classification. The labels are exactly the two Casimirs $P^2 = m^2$ and $W^2 = -m^2 s(s+1)$ (for $m > 0$), where $W^\mu$ is the Pauli–Lubanski vector. The full proof, including the technical analysis of induced representations, projective representations, and continuous-spin exclusion, is laid out in:
- Weinberg, The Quantum Theory of Fields, Vol. 1, Ch. 2 (the standard physics treatment).
- Streater & Wightman, PCT, Spin and Statistics, and All That, Ch. 1 (axiomatic version).
- Tung, Group Theory in Physics, Chs. 9–10 (representation-theoretic emphasis).
- Bargmann–Wigner (1948), the original induced-representation construction.
Classical Fields and Lagrangians
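A classical field is a function $\phi: \mathbb{R}^{1,3} \to V$ taking values in some target space $V$ carrying a representation of the Lorentz group:
- Scalar field $\phi(x)$: trivial representation, e.g. the Higgs field.
- Spinor field $\psi(x)$: spin-$\tfrac{1}{2}$ representation of $SL(2,\mathbb{C})$, e.g. the Dirac field.
- Vector field $A^\mu(x)$: the four-vector representation, e.g. the electromagnetic potential.
- Tensor / spinor-tensor fields: higher-spin generalizations.
Dynamics are encoded in a Lagrangian density $\mathcal{L}(\phi, \partial_\mu \phi)$, a Lorentz scalar, with action $S = \int d^4x\, \mathcal{L}$. The classical equations of motion follow from the Euler–Lagrange equations
$$\partial_\mu \frac{\partial \mathcal{L}}{\partial(\partial_\mu \phi)} - \frac{\partial \mathcal{L}}{\partial \phi} = 0.$$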
Worked example: free real scalar field and the Klein–Gordon equation
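The simplest non-trivial Lagrangian field theory is a single real scalar $\phi(x)$ with the free scalar Lagrangian
$$\mathcal{L} = \tfrac{1}{2}\,\partial_\mu \phi\, \partial^\mu \phi - \tfrac{1}{2}\, m^2 \phi^2.$$
This is essentially uniquely fixed by demanding: Lorentz invariance, polynomial in $\phi$ and $\partial_\mu \phi$ and quadratic overall (a free theory), at most two derivatives (for second-order field equations), reflection symmetry $\phi \to -\phi$, and a kinetic term with conventional sign and normalization. Applying the Euler–Lagrange equation, with $\partial\mathcal{L}/\partial(\partial_\mu\phi) = \partial^\mu \phi$ and $\partial\mathcal{L}/\partial\phi = -m^2\phi$, gives
$$\partial_\mu \partial^\mu \phi + m^2 \phi = 0,$$
i.e. the Klein–Gordon equation
$$(\Box + m^2)\,\phi = 0.$$
In this Lagrangian route the Klein–Gordon equation is derived, not postulated; the postulate has moved from "the equation" to "the Lagrangian $\mathcal{L}$".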
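As a quick sanity check, a plane wave solves the Klein–Gordon equation exactly when its energy sits on the mass shell. The sketch below works in 1+1 dimensions with $\Box = \partial_t^2 - \partial_x^2$ and central finite differences; the step size, sample point and function names are assumptions of the sketch.

```python
import math


def make_wave(energy, momentum):
    """Real plane wave cos(E t - p x) with the given energy and momentum."""
    def wave(t, x):
        return math.cos(energy * t - momentum * x)
    return wave


def kg_residual(field, t, x, mass, h=1e-3):
    """Finite-difference estimate of (d_t^2 - d_x^2 + m^2) field at (t, x)."""
    d2t = (field(t + h, x) - 2.0 * field(t, x) + field(t - h, x)) / h ** 2
    d2x = (field(t, x + h) - 2.0 * field(t, x) + field(t, x - h)) / h ** 2
    return d2t - d2x + mass ** 2 * field(t, x)


MASS, MOMENTUM = 1.3, 0.8
ON_SHELL = make_wave(math.sqrt(MOMENTUM ** 2 + MASS ** 2), MOMENTUM)
OFF_SHELL = make_wave(2.0, MOMENTUM)   # energy deliberately off the mass shell
```

The residual of `ON_SHELL` vanishes up to discretization error, while `OFF_SHELL` leaves a residual proportional to $m^2 + p^2 - E^2$.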
Three routes to Klein–Gordon
For completeness, the Klein–Gordon equation can be reached three ways, with the genuine input shifting in each:
| Route | KG status | Genuine input |
|---|---|---|
| A. Canonical-substitution heuristic | Postulated by analogy | Take the classical dispersion $E^2 = \mathbf{p}^2 + m^2$ and apply $E \to i\partial_t$, $\mathbf{p} \to -i\nabla$ to a wavefunction. The substitution rule and the choice $E^2$ over $E$ are themselves not derived from anything. See QED/historical.md § 1.1 for the historical version. |
| B. Lagrangian field theory | Derived from Euler–Lagrange | The scalar-field Lagrangian $\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}m^2\phi^2$. (This subsection.) |
| C. Wigner / Casimir | Theorem | The definition of "spin-0 massive particle" as a Poincaré irrep with first Casimir $P^2 = m^2$. On a position-space realization $\phi(x)$, with translations acting as $P_\mu = i\partial_\mu$, this Casimir constraint is $(\Box + m^2)\phi = 0$. The mass-shell condition from Wigner's Classification automatically forces KG on any scalar interpolating field. |
All three give the same equation. The physical input differs: a substitution rule (A), a choice of Lagrangian (B), or the Casimir-eigenvalue definition of a massive spin-0 species (C). Routes B and C are the modern viewpoint; Route A survives only as a heuristic motivating Dirac's first-order ansatz in QED-historical §1.1.
As a field operator
After quantization (whether canonical or in the Wigner-construction sense), $\phi(x)$ is promoted to an operator-valued distribution on Fock space, still satisfying $(\Box + m^2)\phi = 0$ as an operator equation. Its mode expansion and ladder structure are given below in § Fock Space; its propagator enters Feynman-rule calculations; the Klein–Gordon operator $(\Box + m^2)$ reappears as the external-leg amputation operator in § LSZ Reduction Formula.
The free Dirac field is the spin-$\tfrac{1}{2}$ analogue (postulating $\mathcal{L} = \bar{\psi}(i\gamma^\mu\partial_\mu - m)\psi$ instead, derived in QED/historical.md § 1.4); the free photon field is the massless spin-1 analogue, from $\mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu}$ in QED/historical.md § 0.2. Each is built by the same Route-B recipe: write down the simplest Lorentz-scalar Lagrangian in the appropriate field, derive the EOM, quantize.
Canonical Structure
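The conjugate momentum to a field $\phi$ is
$$\pi(x) = \frac{\partial \mathcal{L}}{\partial(\partial_0 \phi)}.$$
The classical Hamiltonian density is $\mathcal{H} = \pi\,\dot{\phi} - \mathcal{L}$.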
Operator-Valued Distributions
In QFT, fields cannot be ordinary operator-valued functions of $x$: products like $\phi(x)^2$ are too singular. Instead, $\phi$ is an operator-valued tempered distribution: it is well-defined only after smearing against a Schwartz test function $f \in \mathcal{S}(\mathbb{R}^4)$,
$$\phi(f) = \int d^4x\, f(x)\, \phi(x),$$
yielding an (unbounded) operator on a dense domain $D \subset \mathcal{H}$.
Fock Space
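For a free field of mass $m$ and spin $s$, the Hilbert space is the Fock space
$$\mathcal{F} = \bigoplus_{n=0}^{\infty} \mathrm{Sym}^n(\mathcal{H}_1),$$
or with antisymmetrization for fermions. Here $\mathcal{H}_1$ is the one-particle Hilbert space (an irreducible Wigner representation). Fock space is built from a vacuum $|0\rangle$ via creation ($a^\dagger(\mathbf{p})$) and annihilation ($a(\mathbf{p})$) operators satisfying canonical (anti)commutation relations (here in the normalization without a factor of $2E_{\mathbf{p}}$):
$$[a(\mathbf{p}), a^\dagger(\mathbf{p}')]_\mp = (2\pi)^3\, \delta^3(\mathbf{p} - \mathbf{p}'), \qquad [a(\mathbf{p}), a(\mathbf{p}')]_\mp = 0.$$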
Terminology: ladder operators. The pair $(a, a^\dagger)$ is collectively called ladder operators (or raising/lowering operators): $a^\dagger$ raises the particle number by one ($N \to N+1$), $a$ lowers it ($N \to N-1$, with $a|0\rangle = 0$). The name comes from the harmonic oscillator in QM (see QM/heisenberg-picture.md), where the same algebra moves between energy eigenstates on the "ladder" of equally spaced levels. The QFT usage is the same algebra applied per Fourier mode: each momentum mode $\mathbf{p}$ of a free field is an independent harmonic oscillator, and $a(\mathbf{p})$, $a^\dagger(\mathbf{p})$ are its ladder operators. Particle number is the count of excitations across all modes. In QFT-specific contexts "creation/annihilation operators" is more common than "ladder operators", but the terms are interchangeable.
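The single-mode ladder algebra can be made concrete with finite matrices. This is a minimal sketch for one harmonic-oscillator mode truncated at occupation number `n_max`. The truncation is an assumption of the sketch, and it spoils the commutator in the last basis state only.

```python
import math


def ladder_matrices(n_max):
    """Return (a, a_dagger) on the basis |0>, ..., |n_max>, with a|n> = sqrt(n)|n-1>."""
    dim = n_max + 1
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    a_dagger = [[a[j][i] for j in range(dim)] for i in range(dim)]
    return a, a_dagger


def matmul(x, y):
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)] for i in range(dim)]


def diagonal(m):
    return [m[i][i] for i in range(len(m))]


N_MAX = 5
A, A_DAGGER = ladder_matrices(N_MAX)
NUMBER = matmul(A_DAGGER, A)                       # number operator a† a
A_AD, AD_A = matmul(A, A_DAGGER), NUMBER
COMMUTATOR_DIAGONAL = [u - v for u, v in zip(diagonal(A_AD), diagonal(AD_A))]
```

The diagonal of the number operator counts excitations $0, 1, \dots, n_{\max}$, and the commutator equals $1$ on every state below the cutoff. The entry $-n_{\max}$ in the last state is a truncation artifact that disappears as the cutoff is removed.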
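A free scalar field admits the mode expansion
$$\phi(x) = \int \frac{d^3p}{(2\pi)^3}\, \frac{1}{\sqrt{2E_{\mathbf{p}}}} \left( a(\mathbf{p})\, e^{-ip\cdot x} + a^\dagger(\mathbf{p})\, e^{ip\cdot x} \right), \qquad p^0 = E_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2}.$$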
In an interacting theory, no such Fock space exists for the full interacting field (this is the content of Haag's theorem — see Remarks), but Fock spaces remain the appropriate description for asymptotic in/out states.
Vacuum
The vacuum $|\Omega\rangle$ is the lowest-energy, Poincaré-invariant state. For a free theory it is annihilated by all $a(\mathbf{p})$. In an interacting theory the physical vacuum differs from the free (Fock) vacuum and is in general unitarily inequivalent to it.
States vs. Fields: Why QFT Looks "Operator-Heavy"
A reader coming from non-relativistic QM may notice that QFT seems to focus almost entirely on operators (the field operators $\phi(x)$, $\psi(x)$, and their correlators) while saying very little about states. This is real, and worth being explicit about.
| | Non-relativistic QM | QFT |
|---|---|---|
| Primary object | State $\lvert\psi\rangle$ (or wavefunction $\psi(x)$) | Field operators $\phi(x)$ |
| Time evolution | State evolves (Schrödinger picture) | Operators evolve (Heisenberg picture) |
| What you compute | $\langle\psi\rvert A \lvert\psi\rangle$, transition amplitudes | $\langle\Omega\rvert T\{\phi(x_1)\cdots\phi(x_n)\}\lvert\Omega\rangle$, then S-matrix elements |
The state is still primary in principle, but in practice it is usually fixed implicitly to be one of:
- the vacuum $|\Omega\rangle$: for vacuum correlation functions and most perturbative computations,
- an asymptotic Fock state $|p_1, \dots, p_n\rangle$: for S-matrix calculations,
- a coherent state of a bosonic field: for connecting to classical fields and for IR problems,
- a bound-state wavefunction (positronium, hydrogen): handled non-perturbatively (Bethe–Salpeter), and rarely written down explicitly,
- a density matrix $\rho$: for thermal QFT, decoherence, and open systems.
Several reasons drive the operator-centric emphasis:
- Heisenberg picture is manifestly Lorentz-covariant. Putting all spacetime dependence into operators avoids singling out the time slice that the Schrödinger picture requires.
- The state space is Fock space, not $L^2(\mathbb{R}^3)$. A multi-particle state is a function over arbitrary numbers of particles with arbitrary momenta; there is no useful single "wavefunction in position space" to write down.
- Most observables of interest are scattering amplitudes. Prepare an asymptotic in-state, evolve, take the overlap with an asymptotic out-state. The details of the state during the interaction never appear; only $\langle \mathrm{out} | S | \mathrm{in} \rangle$ does, computed from operator correlators via LSZ.
- Haag's theorem (see Remarks) says the interacting vacuum and Fock states are unitarily inequivalent to the free ones — there is no concrete Hilbert space on which to write down "the interacting state." So one works with operator correlators and asymptotic states only.
Warning: the Field Operator Is Not a Wavefunction
This is a major terminological collision. In some derivations of QED (notably the historical / Dirac-equation route), $\psi(x)$ starts out as a single-particle relativistic wavefunction, i.e. a state. After second quantization, the symbol $\psi(x)$ is reused to denote a field operator on Fock space. From that point on:
- $\psi(x)$ is an operator, not a state.
- $\psi^\dagger(x)|0\rangle$ is the (improper) state of one particle localized at $x$.
- The matrix element $\langle 0|\psi(x)|\mathbf{p}, s\rangle = u_s(p)\, e^{-ip\cdot x}$ is a Dirac wavefunction, but it's the matrix element of the field operator between two particular states, not the state itself.
Conflating these two meanings of is one of the most common sources of confusion when crossing from QM to QFT.
Time-Ordering Operator
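The time-ordering operator $T$ is the instruction to permute a product of operators so that those with larger time arguments stand to the left:
$$T\{O_1(t_1) \cdots O_n(t_n)\} = O_{\sigma(1)}(t_{\sigma(1)}) \cdots O_{\sigma(n)}(t_{\sigma(n)}),$$
where $\sigma$ is the permutation that gives $t_{\sigma(1)} \ge t_{\sigma(2)} \ge \cdots \ge t_{\sigma(n)}$. For the two-operator case:
$$T\{A(t_1) B(t_2)\} = \theta(t_1 - t_2)\, A(t_1) B(t_2) + \theta(t_2 - t_1)\, B(t_2) A(t_1).$$
For fermionic operators an extra sign from the permutation is included (so $T\{\psi(x)\bar{\psi}(y)\} = -\bar{\psi}(y)\psi(x)$ for $y^0 > x^0$).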
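The time-ordering rule is a sort plus, for fermions, the sign of the permutation. A minimal sketch, assuming each operator is represented by a hypothetical `(label, time)` pair and that no two times coincide:

```python
def time_order(operators, fermionic=False):
    """Reorder (label, time) pairs so later times stand to the left.

    Returns (ordered_labels, sign); sign is the parity of the permutation
    for fermionic operators and +1 for bosonic ones.
    """
    indexed = list(enumerate(operators))
    ordered = sorted(indexed, key=lambda item: -item[1][1])   # latest time first
    permutation = [index for index, _ in ordered]
    sign = 1
    if fermionic:
        inversions = sum(
            1
            for i in range(len(permutation))
            for j in range(i + 1, len(permutation))
            if permutation[i] > permutation[j]
        )
        sign = -1 if inversions % 2 else 1
    return [op[0] for _, op in ordered], sign
```

For two fermionic operators given out of time order, the function swaps them and returns a sign of $-1$, matching the two-operator formula above; an even permutation of three operators returns $+1$.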
$T$ is not an operator on Fock space; it is a notational rule for handling non-commuting operators at different times. It plays a central role in:
- The Dyson series for the S-matrix, $S = T\exp\!\left(-i\int d^4x\, \mathcal{H}_I(x)\right)$ (see QED/historical.md §0.6).
- Time-ordered correlators / Green's functions (next subsection), which are the inputs to the LSZ reduction formula.
- Path-integral derivations of operator correlators (see QM/path-integral.md §1.6 for the full discussion in the simpler QM setting).
Notation collision warning. The same symbol $T$ is used in QFT for the transition operator, a genuine operator on Fock space, that appears in the splitting $S = 1 + iT$. The two meanings are entirely unrelated; context disambiguates: time-ordering $T$ always sits in front of a product of operators ($T\{\cdots\}$ or $T\exp(\cdots)$); transition $T$ sits between a bra and a ket as $\langle f|T|i\rangle$. See QED/historical.md §0.6 for the corresponding callout in context.
Correlation Functions
The fundamental observables of QFT are vacuum expectation values of products of fields:
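- Wightman functions: $W_n(x_1, \dots, x_n) = \langle\Omega|\phi(x_1)\cdots\phi(x_n)|\Omega\rangle$.
- Time-ordered (Green's) functions: $G_n(x_1, \dots, x_n) = \langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle$, where $T$ orders fields by decreasing $x^0$ (with a sign for fermion exchanges).
Time-ordered correlators are what enter the LSZ reduction formula (next subsection) to compute $S$-matrix elements.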
LSZ Reduction Formula
The Lehmann–Symanzik–Zimmermann (LSZ) reduction formula is the bridge between time-ordered correlators of local fields (what perturbation theory and Feynman rules naturally compute) and S-matrix elements between asymptotic states (what observables require).
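Setup. Pick any local field $\phi(x)$ with non-zero matrix element to a one-particle state of the species of interest:
$$\langle\Omega|\phi(x)|\mathbf{p}\rangle = \sqrt{Z}\, e^{-ip\cdot x}, \qquad Z \neq 0.$$
Such a $\phi$ is called an interpolating field for that species. Different choices of $\phi$ give the same on-shell $S$-matrix elements (this is the LSZ equivalence of field redefinitions).
The formula for $n$ external particles (in the simplest scalar case; spin and species labels suppressed; all momenta $p_i$ counted as incoming, so an outgoing particle enters with $p_i \to -p_i$; disconnected terms omitted):
$$\langle \mathrm{out} | \mathrm{in} \rangle = \prod_{i=1}^{n} \left[ \frac{i}{\sqrt{Z}} \int d^4x_i\, e^{-ip_i\cdot x_i}\, (\Box_{x_i} + m^2) \right] \langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle.$$
In words: each external leg contributes a Klein–Gordon operator $(\Box + m^2)$ (which on the mass shell extracts the residue of the single-particle pole in momentum space) plus a Fourier factor; everything else is the time-ordered $n$-point correlator.
What it actually says. When the external momenta are on-shell ($p_i^2 = m^2$, $i = 1, \dots, n$), the momentum-space time-ordered correlator has poles in each external momentum at $p_i^2 = m^2$ from the propagation of single-particle intermediate states (the Källén–Lehmann spectral representation). The LSZ recipe extracts the residue at all those poles simultaneously and identifies it with the on-shell $S$-matrix element.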
Why this is significant.
- No Hamiltonian required. LSZ takes correlators as input. Correlators can be computed from a Lagrangian via path integrals, but they could equally come from lattice simulations, the conformal bootstrap, integrability, or any other source. So LSZ provides an $H$-free route to $S$-matrix elements (compare the Møller-operator construction in foundations-modern.md §2.0.1, which does require $H$).
- Field-redefinition equivalence. Replacing $\phi \to \phi + F(\phi)$ for any local function $F$ (such that the new field still has non-zero overlap with the one-particle state) leaves on-shell $S$-matrix elements invariant. This is why effective field theories with different operator bases can describe identical physics, and why the "fundamental field" choice is largely conventional.
- Bridges Feynman rules to observables. Step 6 of any QFT calculation (in QED, QED/historical.md §5.2, and elsewhere) implicitly uses LSZ to convert time-ordered diagrams into $S$-matrix elements: the $(\Box + m^2)$ operators on the external legs cancel the external propagators of the full correlator, leaving exactly the amputated amplitude $i\mathcal{M}$ (up to wavefunction renormalization factors $\sqrt{Z}$).
For the full derivation see Peskin–Schroeder Ch. 7.2 or Weinberg Vol. 1 §10.3. For its role in pinning down asymptotic states without invoking $H$, see foundations-modern.md §2.0.1.
S-Matrix and Cross Sections
The S-matrix maps asymptotic in-states (free particles in the far past) to asymptotic out-states (free particles in the far future):
$$S_{fi} = \langle f; \mathrm{out} | i; \mathrm{in} \rangle.$$
Writing $S = 1 + iT$ with $\langle f|T|i\rangle = (2\pi)^4\, \delta^4(p_f - p_i)\, \mathcal{M}_{fi}$, the invariant amplitude $\mathcal{M}$ determines physical observables: differential cross sections $d\sigma$, decay rates $\Gamma$, etc. The full master formulas connecting $\mathcal{M}$ to measurable rates and the necessary kinematic ingredients (flux factor, Lorentz-invariant phase space, units, the optical theorem) are collected separately in Cross Sections and Decay Rates; the broader observable inventory is in Observables.
Symmetries and Noether Currents
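A continuous symmetry of the action implies, by Noether's theorem, the existence of a conserved current $j^\mu$ with $\partial_\mu j^\mu = 0$ and a conserved charge $Q = \int d^3x\, j^0$. In the quantum theory, $Q$ generates the symmetry on operators via $\delta\phi = i\epsilon\,[Q, \phi]$ (up to a sign convention).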
Symmetries are classified as:
- Global (parameter independent of $x$) vs. local / gauge (parameter depends on $x$).
- Internal (acting on field indices) vs. spacetime (Poincaré, conformal, ...).
- Continuous (Lie group) vs. discrete ($C$, $P$, $T$).
Gauge Fields
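A gauge theory has a local internal symmetry $G$. To make the Lagrangian invariant one introduces a gauge field $A_\mu = A_\mu^a T^a$ valued in the Lie algebra of $G$, and a covariant derivative $D_\mu = \partial_\mu - i g A_\mu^a T^a$, where $T^a$ are generators of $G$ in the relevant representation. The field strength
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g f^{abc} A^b_\mu A^c_\nu$$
(with structure constants $f^{abc}$) generalizes the electromagnetic $F_{\mu\nu}$. Quantization requires gauge fixing to remove redundant degrees of freedom.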
Regularization and Renormalization
Naive computations in interacting QFT yield divergent loop integrals. Regularization (cutoff, dimensional, Pauli–Villars, lattice) parametrizes the divergences; renormalization absorbs them into a redefinition of a finite number of parameters (masses, couplings, field normalizations). A theory is renormalizable if this can be done with finitely many counterterms; otherwise it is an effective field theory valid only below some energy scale.
The renormalization group describes how renormalized parameters depend on the chosen energy scale $\mu$, governed by beta functions $\beta(g) = \mu\, \partial g / \partial \mu$.
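To see what a beta function does in practice, the sketch below integrates a one-loop-type flow $\mu\, dg/d\mu = b\, g^3$ and compares it with the closed-form solution $g^2(\mu) = g_0^2 / \big(1 - 2 b\, g_0^2 \ln(\mu/\mu_0)\big)$. The cubic form and the coefficient `b` are hypothetical placeholders, not the beta function of any particular theory; the integrator is a plain fourth-order Runge–Kutta step in $t = \ln\mu$.

```python
import math


def beta(g, b):
    """Hypothetical one-loop-type beta function: mu dg/dmu = b g^3."""
    return b * g ** 3


def run_coupling(g0, mu0, mu, b, steps=1000):
    """Integrate dg/dt = beta(g) in t = ln(mu) from mu0 to mu with RK4."""
    g = g0
    dt = math.log(mu / mu0) / steps
    for _ in range(steps):
        k1 = beta(g, b)
        k2 = beta(g + 0.5 * dt * k1, b)
        k3 = beta(g + 0.5 * dt * k2, b)
        k4 = beta(g + dt * k3, b)
        g += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return g


def analytic_coupling(g0, mu0, mu, b):
    """Closed-form solution of the same flow equation."""
    return g0 / math.sqrt(1.0 - 2.0 * b * g0 ** 2 * math.log(mu / mu0))


B_COEFF = -0.05            # negative: the coupling shrinks at high scales
G_HIGH = run_coupling(1.0, 1.0, 100.0, B_COEFF)
```

With a negative coefficient the coupling decreases as $\mu$ grows; a positive coefficient would instead make it grow and eventually hit the pole of the closed-form expression.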
Postulates of Quantum Field Theory
Quantum field theory (QFT) extends quantum mechanics to systems with infinitely many degrees of freedom and incorporates special relativity. The following postulates are stated in the Wightman axiomatic style, which provides a mathematically precise framework for relativistic QFT in Minkowski spacetime with metric $\eta_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1)$.
See Definitions and Preliminaries for the underlying mathematical objects, and Remarks and Open Issues for the Wightman reconstruction theorem, Haag's theorem, gauge theories, and the status of rigorous construction.
Postulate 1 — Relativistic State Space
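The states of the system are unit rays in a complex separable Hilbert space $\mathcal{H}$. There exists a strongly continuous unitary representation $U(\Lambda, a)$ of the (proper, orthochronous) Poincaré group $\mathcal{P}_+^\uparrow$ acting on $\mathcal{H}$:
$$U(\Lambda_1, a_1)\, U(\Lambda_2, a_2) = U(\Lambda_1 \Lambda_2,\; a_1 + \Lambda_1 a_2).$$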
Physical predictions (probabilities, expectation values) are independent of the choice of inertial frame.
Postulate 2 — Spectrum Condition
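The generators of spacetime translations form the energy–momentum operator $P^\mu$, defined by
$$U(1, a) = e^{i P^\mu a_\mu}.$$
Its joint spectrum lies in the closed forward light cone:
$$\operatorname{spec}(P) \subset \overline{V}_+ = \{\, p : p^2 \ge 0,\ p^0 \ge 0 \,\}.$$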
This guarantees positive energy in every inertial frame and forbids tachyonic or negative-energy states.
Postulate 3 — Unique Poincaré-Invariant Vacuum
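There exists a unique (up to phase) state $|\Omega\rangle \in \mathcal{H}$, the vacuum, that is invariant under all Poincaré transformations:
$$U(\Lambda, a)\,|\Omega\rangle = |\Omega\rangle \quad \text{for all } (\Lambda, a) \in \mathcal{P}_+^\uparrow.$$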
Uniqueness implements the absence of spontaneous symmetry breaking of the Poincaré group; it is essential for the cluster decomposition property.
Postulate 4 — Field Operators
Quantum fields are operator-valued tempered distributions on Minkowski spacetime: for every test function $f \in \mathcal{S}(\mathbb{R}^4)$, the smeared field
$$\phi(f) = \int d^4x\, f(x)\, \phi(x)$$
is a (generally unbounded) operator on a common dense domain $D \subset \mathcal{H}$ that contains the vacuum and is stable under the action of all $\phi(f)$ and $U(\Lambda, a)$.
Fields are not pointwise operators because $\phi(x)|\Omega\rangle$ generally has infinite norm; smearing with $f$ regularizes this.
Postulate 5 — Poincaré Covariance of Fields
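Under a Poincaré transformation, the fields transform according to a finite-dimensional representation $S(\Lambda)$ of the Lorentz group (e.g. scalar, spinor, vector):
$$U(\Lambda, a)\, \phi_\alpha(x)\, U(\Lambda, a)^{-1} = \sum_\beta S_{\alpha\beta}(\Lambda^{-1})\, \phi_\beta(\Lambda x + a).$$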
This ties the algebraic structure of the fields to the geometry of spacetime.
Postulate 6 — Microcausality (Local Commutativity)
Fields at spacelike-separated points either commute or anticommute, according to their spin–statistics:
$$[\phi(x), \phi(y)]_\mp = 0 \quad \text{for } (x - y)^2 < 0,$$
with the commutator ($-$) for bosonic (integer-spin) fields and the anticommutator ($+$) for fermionic (half-integer-spin) fields. This expresses relativistic causality: measurements at spacelike separation cannot influence each other.
Postulate 7 — Spin–Statistics
In a Lorentz-invariant local QFT with positive energy, fields of integer spin must be quantized as bosons (commutators) and fields of half-integer spin as fermions (anticommutators). Any other choice leads to either a non-positive Hilbert space, violation of microcausality, or violation of the spectrum condition. (This is a theorem — the spin–statistics theorem — in the Wightman framework, but it is often listed alongside the postulates.)
Postulate 8 — Cyclicity of the Vacuum
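Polynomials in the smeared field operators acting on the vacuum yield a dense subspace of $\mathcal{H}$:
$$\overline{\operatorname{span}\{\, \phi(f_1) \cdots \phi(f_n)\,|\Omega\rangle : n \ge 0,\ f_i \in \mathcal{S}(\mathbb{R}^4) \,\}} = \mathcal{H}.$$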
Equivalently, every state can be approximated by acting with local field operators on the vacuum. This ensures the field algebra is large enough to describe all physical states.
Postulate 9 — Dynamics from a Local Action
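The dynamics of the fields are derived from a Poincaré-invariant, local action
$$S[\phi] = \int d^4x\, \mathcal{L}(\phi, \partial_\mu \phi),$$
where the Lagrangian density $\mathcal{L}$ is a polynomial in the fields and their first derivatives at the same spacetime point. Classical equations of motion follow from $\delta S = 0$. The corresponding quantum theory is defined either through:
- Canonical quantization: imposing equal-time (anti)commutation relations on the fields and their conjugate momenta $\pi(x)$, or
- Path integral quantization: defining correlation functions via
$$\langle\Omega|T\{\phi(x_1)\cdots\phi(x_n)\}|\Omega\rangle = \frac{\int \mathcal{D}\phi\; \phi(x_1)\cdots\phi(x_n)\, e^{iS[\phi]}}{\int \mathcal{D}\phi\; e^{iS[\phi]}}.$$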
Postulate 10 — Asymptotic Completeness and the S-Matrix
In scattering theory, the framework rests on two non-trivial assumptions, with the existence and unitarity of the S-matrix following as a consequence:
(P10a) Existence of asymptotic states. The Møller operators
$$\Omega_\pm = \lim_{t \to \mp\infty} e^{iHt}\, e^{-iH_0 t}$$
exist on a dense subspace of the free Fock space $\mathcal{F}$ (limits taken in a strong-operator-on-wavepackets sense). They map free Fock states to the corresponding dressed in/out states of the interacting theory:
$$|\alpha; \mathrm{in}\rangle = \Omega_+ |\alpha\rangle_0, \qquad |\alpha; \mathrm{out}\rangle = \Omega_- |\alpha\rangle_0.$$
This fails when interactions do not fall off fast enough at large times and separations, e.g. long-range Coulomb potentials, or QED with massless soft photons.
(P10b) Asymptotic completeness. The asymptotic in- and out-spaces both equal the full interacting Hilbert space:
$$\mathcal{H}_{\mathrm{in}} = \mathcal{H}_{\mathrm{out}} = \mathcal{H}.$$
That is, every state of the interacting theory is reachable as some asymptotic in-state, and equivalently as some asymptotic out-state. This fails in confining theories (QCD: quarks/gluons are not asymptotic states; only hadrons are).
Consequence: the S-matrix. Given (P10a) and (P10b), the S-matrix
$$S = \Omega_-^\dagger\, \Omega_+$$
is automatically a unitary operator on $\mathcal{F}$. Its matrix elements are computed from time-ordered correlation functions via the LSZ reduction formula.
Status of P10. The existence of asymptotic states (P10a) is provable from the Wightman axioms + isolated mass shells via the Haag–Ruelle theorem (1962); asymptotic completeness (P10b) does not follow from those axioms and remains an independent assumption. Both are conjectural for realistic 4D theories like QED and QCD (where rigorous construction is an open problem, see Remarks), and are modified or replaced in IR-divergent and confining cases (Faddeev–Kulish dressing for soft photons; reformulation on hadron Fock space for QCD; abandoned entirely for conformal field theories without a mass gap).
See foundations-modern.md §2.0.1 for the modern derivation and discussion of failure modes.
Implicit Empirical Inputs
Beyond the ten postulates above, the standard QFT framework silently absorbs several empirical inputs that the modern Wigner–Weinberg derivation makes explicit (see foundations-modern.md §1 and its §7.1 reverse mapping):
- Particle = irreducible unitary Poincaré representation. P1 names "Hilbert space + unitary Poincaré action" but does not say particles are irreps. The identification is logically prior to P1 — it is what makes P1 the natural starting point. See foundations-modern.md §1.1.
- Symmetrization postulate. P7 (spin–statistics) packages two distinct claims: the binary "states are symmetric OR antisymmetric under exchange" and the assignment "integer spin → bosons, half-integer → fermions". The first is a separate postulate (an additional input in $3+1$ dimensions that can be relaxed to anyons in $2+1$ dimensions); the second is a theorem given the first plus M1+M2+M3 + locality. Most treatments collapse the two; see foundations-modern.md §1.3.
- Variable particle number. No postulate above states that particle number is non-conserved; this is empirically motivated (relativity allows pair production via $E = mc^2$; decays change $N$; scattering admits different in/out counts) and is silently used by P4 (operator-valued field distributions) and P10 (multi-particle in/out spaces). See foundations-modern.md §1.4.
- Field content (which species, which masses, which couplings). The postulates set the framework; the actual content of any given QFT (QED, QCD, electroweak) is empirical.
These inputs are universally accepted and never controversial in standard QFT — but they are not derivable from the ten postulates alone, and a reader using this page in isolation would not see them flagged anywhere. The modern derivation makes the assumption budget transparent at the cost of more upfront machinery.
Modern Foundations — From SR + QM to the QFT Postulates
This is the Wigner–Weinberg derivation of relativistic quantum field theory: starting from special relativity, ordinary quantum mechanics, and the cluster-decomposition principle, derive the framework whose end-product is the QFT postulates. It is the modern counterpart to the historical / canonical-quantization route, and the conceptual prequel to every specific theory built on the resulting postulates — QED, QCD, electroweak theory, the Standard Model, and any other theory obtained by adding a choice of field content, gauge symmetry, and renormalizable Lagrangian.
The companion in the historical direction is QED/historical.md, which goes the other way (single-particle Dirac equation → quantization → field theory). Where the historical route postulates relativistic wave equations and derives gauge invariance later as a consequence, the modern route here postulates relativistic invariance + cluster decomposition and derives the wave-equation structure, the spin–statistics theorem, the field operator framework, and gauge invariance for massless particles all as theorems.
To keep the logical structure transparent we tag every step as one of:
- (Postulate) — a primitive assumption (an empirical input or general principle).
- (Definition) — mathematical notation introduced for use later.
- (Theorem) — a result that follows from previous postulates plus standard mathematics.
- (Heuristic) — physically motivated step whose rigorous justification is deferred.
The modern route rests on three primitive postulates:
| # | Postulate | Role |
|---|---|---|
| M1 | Special relativity: physics is invariant under the proper orthochronous Poincaré group | Geometric backdrop |
| M2 | Quantum mechanics: states are unit rays in a complex separable Hilbert space; observables are self-adjoint operators; probabilities follow the Born rule; time evolution is unitary | Inherited from QM postulates |
| M3 | Cluster decomposition: distant experiments are uncorrelated. Formally, connected $S$-matrix elements between widely separated wavepackets vanish | Locality / no-spooky-distant-correlations |
Everything in this document (fields as operator-valued distributions, microcausality, spin–statistics, $CPT$, and the connection to gauge invariance) is derived from M1 + M2 + M3 plus standard mathematical machinery (group representation theory, distribution theory, $C^*$-algebras).
Reference. Weinberg, The Quantum Theory of Fields, Vol. 1 (1995), Chs. 2–5, is the canonical reference. This document is a structural summary of that derivation, with cross-references to where each ingredient lives in this repository.
0. Preliminaries
The mathematical machinery used below (Lorentz / Poincaré groups, Wigner's classification, Lie algebras, irreducible unitary representations, cluster decomposition, $C^*$-algebras) is collected in QFT/preliminaries.md. This section names the specific ingredients we will rely on, without re-deriving them.
0.1 The Poincaré group and its Lie algebra
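The Poincaré group is the semidirect product $\mathbb{R}^{1,3} \rtimes SO^+(1,3)$ of spacetime translations and Lorentz transformations. Its Lie algebra is generated by:
- 4 translation generators $P^\mu$ (energy-momentum),
- 6 Lorentz generators $J^{\mu\nu} = -J^{\nu\mu}$ (3 rotations $J^i$ + 3 boosts $K^i$).
Commutation relations (written for Hermitian generators and metric $\eta = \mathrm{diag}(+,-,-,-)$; overall signs differ between references):
$$[P^\mu, P^\nu] = 0, \qquad [J^{\mu\nu}, P^\rho] = i\left(\eta^{\nu\rho}P^\mu - \eta^{\mu\rho}P^\nu\right),$$
$$[J^{\mu\nu}, J^{\rho\sigma}] = i\left(\eta^{\nu\rho}J^{\mu\sigma} - \eta^{\mu\rho}J^{\nu\sigma} - \eta^{\nu\sigma}J^{\mu\rho} + \eta^{\mu\sigma}J^{\nu\rho}\right).$$
The Casimir invariants are
$$P^2 = P_\mu P^\mu, \qquad W^2 = W_\mu W^\mu,$$
with $W_\mu = \tfrac{1}{2}\epsilon_{\mu\nu\rho\sigma}J^{\nu\rho}P^\sigma$ the Pauli–Lubanski vector.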
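As a sanity check on the Lorentz part of the algebra, here is a minimal Python sketch. It assumes the mostly-minus metric and the vector-representation generators $(J^{\mu\nu})^\alpha{}_\beta = i(\eta^{\mu\alpha}\delta^\nu_\beta - \eta^{\nu\alpha}\delta^\mu_\beta)$, and verifies $[J^{\mu\nu}, J^{\rho\sigma}] = i(\eta^{\nu\rho}J^{\mu\sigma} - \eta^{\mu\rho}J^{\nu\sigma} - \eta^{\nu\sigma}J^{\mu\rho} + \eta^{\mu\sigma}J^{\nu\rho})$ for all index choices. The convention and the helper names (`J`, `comm`, `algebra_closes`) are illustrative assumptions, not taken from these notes.

```python
from itertools import product

# Assumed convention: mostly-minus metric eta = diag(+,-,-,-).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def J(mu, nu):
    """4x4 matrix (J^{mu nu})^a_b = i (eta^{mu a} delta^nu_b - eta^{nu a} delta^mu_b)."""
    return [[1j * (ETA[mu][a] * (nu == b) - ETA[nu][a] * (mu == b))
             for b in range(4)] for a in range(4)]


def matmul(A, B):
    """Plain 4x4 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(4)] for i in range(4)]


def rhs(mu, nu, rho, sig):
    """Right-hand side of the Lorentz algebra in the assumed convention."""
    terms = [(ETA[nu][rho], J(mu, sig)), (-ETA[mu][rho], J(nu, sig)),
             (-ETA[nu][sig], J(mu, rho)), (ETA[mu][sig], J(nu, rho))]
    return [[1j * sum(c * M[i][j] for c, M in terms) for j in range(4)]
            for i in range(4)]


def algebra_closes():
    """True if the commutation relation holds for every (mu, nu, rho, sigma)."""
    return all(comm(J(m, n), J(r, s)) == rhs(m, n, r, s)
               for m, n, r, s in product(range(4), repeat=4))


print(algebra_closes())
```

All matrix entries are small integers times $i$, so the comparison is exact and needs no floating-point tolerance.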
0.2 Cluster decomposition principle (M3, restated)
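Let $|p_1\sigma_1, \ldots, p_n\sigma_n\rangle$ denote a multi-particle scattering state with momenta $p_i$ and other quantum numbers $\sigma_i$. The connected part of the $S$-matrix, $S^C_{\beta\alpha}$, is defined recursively by removing all "factorizable" pieces. Cluster decomposition demands
$$S^C_{\beta\alpha} \to 0 \quad \text{whenever any subset of the particles in } \alpha, \beta \text{ is translated a distance } R \to \infty \text{ away from the rest.}$$
Equivalently (and more usefully): the full $S$-matrix factorizes when the experiment splits into well-separated, non-overlapping subexperiments. This is the relativistic generalization of "labs in different cities don't influence each other" and is the seed of locality.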
0.3 The minimal additional inputs (where empiricism enters)
On top of M1+M2+M3, the framework needs empirical inputs to specify which QFT we live in:
- A list of species (one-particle representations) observed in nature.
- For each, mass and spin (or, if massless, helicity).
- A choice of interaction consistent with M1+M2+M3.
The first two are pinned down by Wigner's classification (§1 below); the third is constrained but not fixed by the framework, and is where QED, QCD, electroweak theory, etc. diverge as different specializations of the same postulates.
1. The State Space
This section builds the state space of a relativistic quantum theory step by step, tagging each step as (Postulate), (Definition), (Theorem), or (Empirical input) so the assumption budget is transparent. The goal is to expose which assumptions feed into the Fock-space construction — several of them are silently invoked in textbook treatments.
1.1 Defining "particle" (Definition + empirical input)
The Wigner-Weinberg framework rests on a specific operational identification:
Definition. A particle in a relativistic quantum theory is an irreducible unitary representation of the proper orthochronous Poincaré group acting on a complex separable Hilbert space.
This is a definition, not a theorem. The next three subsections unpack what each clause means in concrete physical terms; the definitional content and its empirical commitments are summarised in §1.1.4. Two pieces of empirical input are baked in:
- (Empirical-1) Identification of "particle" with "Poincaré irrep". Justified physically: an elementary excitation should be characterized by quantum numbers (mass, spin) that are invariant under choice of inertial frame. Irreducibility = "cannot be split into uncoupled subspecies" = elementarity. Mathematically tight, but the physical claim that nature's elementary objects are organized this way is empirical.
- (Empirical-2) Hilbert space is complex and separable. Inherited from M2 (standard QM). Real-Hilbert-space and quaternionic-Hilbert-space alternatives exist mathematically but are not used in the Standard Model.
1.1.1 What "representation" means here
A unitary representation of the proper orthochronous Poincaré group $\mathcal{P}_+^\uparrow$ on a Hilbert space $\mathcal{H}$ is a group homomorphism
$$U: \mathcal{P}_+^\uparrow \to \mathcal{U}(\mathcal{H}), \qquad (\Lambda, a) \mapsto U(\Lambda, a),$$
where each $U(\Lambda, a)$ is a unitary operator on $\mathcal{H}$ and $U(g_1)U(g_2) = U(g_1 g_2)$. Equivalently: every Poincaré symmetry (boost, rotation, translation) is realized as a concrete unitary operator moving state vectors around inside $\mathcal{H}$. Three adjectives carry the content:
- Unitary, because Poincaré transformations are symmetries of the probabilistic structure of QM: $|\langle\phi|\psi\rangle|^2$ must be invariant under change of inertial frame.
- Irreducible, meaning no proper invariant subspace exists: there is no non-trivial closed subspace $\mathcal{H}' \subsetneq \mathcal{H}$ mapped into itself by all $U(\Lambda, a)$. Physically: the representation does not split into uncoupled subsectors that the Poincaré group keeps separate. Anything bigger than an irrep would be several species in one Hilbert space.
- Projective allowed, i.e. $U(g_1)U(g_2) = e^{i\phi(g_1, g_2)}U(g_1 g_2)$, equality up to a phase, because physical states are rays not vectors. The projective representations of $\mathcal{P}_+^\uparrow$ correspond to ordinary representations of its universal cover $\mathbb{R}^{1,3} \rtimes SL(2,\mathbb{C})$ (see math/group-theory.md § Group actions and representations and § Connectedness and discrete components). This is what allows half-integer-spin particles.
1.1.2 "Particle = irrep" is a statement about the Hilbert space, not a single state
This is the most common point of confusion. The definition does not say "a particle is a state vector $|\psi\rangle$"; it says
"a particle species" $\equiv$ the entire one-particle Hilbert space $\mathcal{H}_1$ together with the $\mathcal{P}_+^\uparrow$ action on it.
The single space $\mathcal{H}_1$ contains all possible states of one specimen of that species (any momentum, any spin orientation, any superposition). The Poincaré group shuffles them around: boosts change momentum, rotations rotate spin, translations multiply momentum eigenstates by a phase $e^{ip\cdot a}$. Irreducibility says you can reach every state in $\mathcal{H}_1$ from any other by some Poincaré action (plus superposition); that is what makes the species one species, with no leftover "different kind of electron" sub-Hilbert-space.
| Level | Object |
|---|---|
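| Species (e.g. "electron") | An irrep: a Hilbert space $\mathcal{H}_1$ with a $\mathcal{P}_+^\uparrow$ action, labelled by Casimirs $(m, s)$ or $(0, h)$ |
| Particular particle in a particular state | A single vector $\lvert\psi\rangle \in \mathcal{H}_1$ (modulo phase) |
| $N$ identical particles | A vector in $\mathrm{Sym}^N\mathcal{H}_1$ or $\Lambda^N\mathcal{H}_1$ (§1.3) |
| Arbitrary number | A vector in Fock space $\mathcal{F}$ (§1.4) |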
So "what is a particle?" has two answers depending on which question you are asking: the kind of particle is an irrep; a particle in a state is a vector in that irrep's Hilbert space.
1.1.3 The natural basis: momentum and spin, not position
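Wigner's classification (§1.2) hands you more than the existence of $\mathcal{H}_1$: it hands you a canonical basis of simultaneous eigenstates of the translation generators $P^\mu$, labelled by the momentum $p$ and a little-group component $\sigma$:
$$P^\mu\,|p, \sigma\rangle = p^\mu\,|p, \sigma\rangle,$$
where $\sigma$ runs over the spin $z$-component (massive case) or the helicity (massless case). A general one-particle state is
$$|\psi\rangle = \sum_\sigma \int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\;\psi_\sigma(\mathbf{p})\,|p, \sigma\rangle,$$
with $d^3p / [(2\pi)^3\,2E_{\mathbf{p}}]$ the Lorentz-invariant momentum measure. The momentum-space wavefunction $\psi_\sigma(\mathbf{p})$ is the natural description because momentum and spin component are Casimir-respecting quantum numbers handed to you by the group action itself.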
Why momentum and not position? Three reasons, in order of severity:
- Position is not a generator of the Poincaré algebra. The algebra has $P^\mu$ (translation generators ↔ momentum) and $J^{\mu\nu}$ (Lorentz generators ↔ angular momentum / boosts). There is no $X^\mu$. Momentum is intrinsic; position would have to be defined on top of $\mathcal{H}_1$, and no canonical choice exists.
- There is no Lorentz-covariant position operator. The best-known attempt, the Newton–Wigner position operator, is self-adjoint on $\mathcal{H}_1$ but
  - is not Lorentz-covariant: a state localized at $\mathbf{x}$ in one frame is not localized in any boosted frame;
  - has eigenstates with non-local tails of width $\sim 1/m$ (the Compton wavelength, in units $\hbar = c = 1$);
  - fails entirely for massless particles of helicity $|h| \geq 1$ (no localized photon states: the Newton–Wigner / Hegerfeldt obstructions).
- Causality is in tension with localization. Any state strictly localized inside a bounded spatial region at $t = 0$ instantaneously develops support everywhere for any $t > 0$ (Hegerfeldt's theorem). For single-particle states there is no finite-signal-velocity rescue.
The way QFT resolves this is to stop localizing particles and localize fields instead. Field operators $\phi(x)$ (built in §2) live at spacetime points and satisfy microcausality $[\phi(x), \phi(y)]_\mp = 0$ at spacelike separation (§3). The matrix element
$$\psi(x) \equiv \langle 0|\,\phi(x)\,|\psi\rangle$$
looks like a position-space wavefunction but is genuinely the matrix element of an operator-valued distribution between two states (see QFT/preliminaries.md § States vs. Fields). Position-dependence re-enters at the operator level, not the state level. This inversion is exactly what makes relativistic quantum theory a field theory rather than a particle theory: particles are irreps (defined by momentum-space data), and the spacetime point $x$ is an argument of the operator, not a label on a state.
1.1.4 Summary
- "Representation" = a unitary action of $\mathcal{P}_+^\uparrow$ on $\mathcal{H}_1$. Irreducibility = no invariant sub-Hilbert-space = elementarity.
- A particle is a state only in the loose sense that states of that species live in $\mathcal{H}_1$. The species itself is the entire irrep, labelled by $(m, s)$ or $(0, h)$.
- The natural basis is momentum + spin component, not position. There is no good relativistic position operator, and any attempt at one breaks Lorentz covariance or causality. Position re-enters as the argument of field operators in §2, not as a label of single-particle states.
With this definition fixed, the rest of §1 unfolds with very few additional postulates.
1.2 One-particle states (Theorem — Wigner classification)
Premise: M1 (special relativity) + M2 (QM) + the §1.1 definition of particle.
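Wigner's theorem (1939). The irreducible unitary representations of $\mathcal{P}_+^\uparrow$ are classified by the eigenvalues of the two Casimirs:
- Mass-squared $m^2$ (the eigenvalue of $P^2$) with $m^2 \geq 0$.
- For $m > 0$: spin $s \in \{0, \tfrac12, 1, \ldots\}$ (irrep of the little group $SO(3)$, or its double cover $SU(2)$ once projective representations are allowed).
- For $m = 0$: helicity $h \in \{0, \pm\tfrac12, \pm 1, \ldots\}$ (irrep of the little group $ISO(2)$, with $h$ quantized in integer/half-integer units by rotation closure).
Each irrep is realized on a one-particle Hilbert space $\mathcal{H}_1$. The full derivation lives in QFT/preliminaries.md § Wigner's Classification.
Conclusion. $\mathcal{H}_1$ is forced, not chosen. Specifying a species means picking $(m, s)$ or $(0, h)$.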
Empirical-3 (caveat). Continuous-spin representations of the massless little group exist mathematically but are not observed in nature. Their absence is empirical, not a theorem.
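To make "mass is a Casimir" concrete, the following minimal Python sketch boosts a four-momentum along $z$ and checks that $p_\mu p^\mu$ does not change, so every inertial frame assigns the same $m^2$ to the state. It assumes units with $c = 1$ and the mostly-minus metric; the function names `boost_z` and `mass_squared` are illustrative only.

```python
import math


def boost_z(p, rapidity):
    """Boost the four-momentum p = (E, px, py, pz) along z by the given rapidity."""
    E, px, py, pz = p
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return (E * ch + pz * sh, px, py, pz * ch + E * sh)


def mass_squared(p):
    """Invariant p.p with the (assumed) mostly-minus metric."""
    E, px, py, pz = p
    return E * E - px * px - py * py - pz * pz


# A unit-mass particle with spatial momentum (0.3, 0.4, 1.2).
spatial = (0.3, 0.4, 1.2)
energy = math.sqrt(1.0 + sum(c * c for c in spatial))
p_rest_frame_lab = (energy, *spatial)
p_boosted = boost_z(p_rest_frame_lab, 1.3)

print(mass_squared(p_rest_frame_lab), mass_squared(p_boosted))
```

The momentum components change under the boost while $m^2$ stays fixed, which is why $m$ labels the species and $p$ labels a state within it.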
1.3 Identical particles and (anti)symmetrization (Postulate — symmetrization)
Premise: $N$ copies of $\mathcal{H}_1$ for the same species, plus the empirical fact that identical particles are truly indistinguishable.
The naive $N$-fold tensor product $\mathcal{H}_1^{\otimes N}$ contains many states distinguished by labels on the particles (which particle has which momentum). Indistinguishability requires that the physical state space be only the part of $\mathcal{H}_1^{\otimes N}$ on which the symmetric group $S_N$ acts trivially up to a phase:
$$U(\pi)|\Psi\rangle = |\Psi\rangle \;\;(\mathrm{Sym}^N\mathcal{H}_1) \qquad \text{or} \qquad U(\pi)|\Psi\rangle = \mathrm{sgn}(\pi)\,|\Psi\rangle \;\;(\Lambda^N\mathcal{H}_1), \qquad \pi \in S_N.$$
(Postulate — symmetrization) Multi-particle states of identical species lie in either $\mathrm{Sym}^N\mathcal{H}_1$ or $\Lambda^N\mathcal{H}_1$, never in any other subspace. This is an additional assumption beyond M1+M2+M3 in $d \geq 3+1$ spacetime dimensions. (In $d = 2+1$ it can be relaxed to give anyons: irreps of the braid group with arbitrary phases.)
Note on bosons vs. fermions. Which of $\mathrm{Sym}^N\mathcal{H}_1$ vs $\Lambda^N\mathcal{H}_1$ each species uses is not postulated here. It is determined later by the spin–statistics theorem (§4) from M1+M2+M3 + the field construction of §2. The symmetrization postulate only fixes the binary structure, not the assignment.
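The (anti)symmetrization step can be sketched in a toy setting where the one-particle space has finitely many basis labels instead of continuous momenta (an assumption made purely for illustration). A state is stored as a dictionary from label tuples to amplitudes, and `project` averages over all permutations with or without the sign. The names `project`, `perm_sign`, `dim_sym`, and `dim_antisym` are hypothetical helpers, not part of these notes.

```python
from itertools import permutations
from math import comb, factorial


def perm_sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def project(state, fermionic):
    """Apply (1/N!) * sum over permutations (with sign if fermionic) to a state.

    `state` maps tuples of one-particle labels to amplitudes.
    """
    n = len(next(iter(state)))
    out = {}
    for labels, amp in state.items():
        for perm in permutations(range(n)):
            new = tuple(labels[k] for k in perm)
            sign = perm_sign(perm) if fermionic else 1
            out[new] = out.get(new, 0) + sign * amp / factorial(n)
    # Drop components that cancelled exactly.
    return {k: v for k, v in out.items() if abs(v) > 1e-12}


def dim_sym(d, n):
    """Dimension of Sym^N of a d-dimensional one-particle space."""
    return comb(d + n - 1, n)


def dim_antisym(d, n):
    """Dimension of Lambda^N of a d-dimensional one-particle space."""
    return comb(d, n)


print(project({(0, 1): 1.0}, fermionic=False))
print(project({(0, 1): 1.0}, fermionic=True))
print(project({(0, 0): 1.0}, fermionic=True))
```

The last call returns an empty dictionary: two identical fermions with the same label have no antisymmetric component, which anticipates the exclusion principle discussed in §4.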
1.4 Variable particle number → Fock space (Empirical input + Definition)
Premise: the empirical observation that physical processes change the number of particles.
In non-relativistic QM, particle number is conserved and the state space is a single $\mathcal{H}_N$ for fixed $N$. In relativistic QFT this fails:
- $E = mc^2$ permits pair creation/annihilation when sufficient energy is available.
- Decays ($N = 1$ becomes $N \geq 2$).
- Scattering with different in/out particle counts.
(Empirical-4) Variable particle number is an empirical input of relativistic QFT, justified by relativity ($E = mc^2$) and observation. It is not derivable from M1+M2+M3 alone; it is a fact about nature that the formalism must accommodate.
Definition. Given variable $N$, the natural state space is the Fock space direct sum over all sectors:
$$\mathcal{F} = \bigoplus_{N=0}^{\infty} \mathcal{H}_N = \mathcal{H}_0 \oplus \mathcal{H}_1 \oplus \mathcal{H}_2 \oplus \cdots,$$
with $\mathcal{H}_0 = \mathbb{C}\,|0\rangle$ the one-dimensional vacuum sector and $\mathcal{H}_N$ the appropriately (anti)symmetrized $N$-particle sector from §1.3.
Inputs combined. Constructing $\mathcal{F}$ thus requires:
- M1 + M2 + §1.1 definition (gives $\mathcal{H}_1$);
- §1.3 symmetrization postulate (gives $\mathcal{H}_N$ for each $N$);
- §1.4 variable-number empirical input (motivates the direct sum $\bigoplus_N$).
The detailed structure of $\mathcal{F}$ (sectors, vacuum, ladder action, etc.) is in fock-space-inventory.md.
Caveat — free vs. interacting. $\mathcal{F}$ is the free Fock space, built from one-particle Wigner reps. The interacting Hilbert space of a non-trivial theory is not unitarily equivalent to $\mathcal{F}$ (Haag's theorem). $\mathcal{F}$ is the appropriate state space for asymptotic in/out states only; see §2.0.1 and QFT/remarks.md § Haag's theorem.
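The sector-by-sector structure of the direct sum can be illustrated with a toy fermionic Fock space built on a finite number of one-particle modes. The finite mode count is an assumption for illustration (real momenta are continuous), and `fermionic_sectors` is a hypothetical helper name.

```python
from itertools import combinations
from math import comb


def fermionic_sectors(num_modes):
    """Map N -> basis of the N-particle sector, each basis state given by its occupied modes."""
    return {n: list(combinations(range(num_modes), n))
            for n in range(num_modes + 1)}


sectors = fermionic_sectors(3)
for n, basis in sectors.items():
    print(n, basis)

# Total dimension of the toy Fock space: sum over all sectors.
total_dimension = sum(len(basis) for basis in sectors.values())
print(total_dimension)
```

The $N = 0$ sector contains the single empty configuration (the vacuum), and the sector dimensions add up to $2^3 = 8$, the count of all occupation patterns of three fermionic modes.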
1.5 Cluster decomposition forces creation/annihilation operators (Theorem)
Premise: Fock space from §1.4 + M3 (cluster decomposition).
Working with $N$-particle states sector by sector is unwieldy: every $N$ requires its own $\mathcal{H}_N$. Cluster decomposition singles out a much more economical formalism: build everything from operators that raise or lower particle number by one.
Definition. Creation operators $a^\dagger(\mathbf{p}, \sigma)$ take an $N$-particle state to an $(N+1)$-particle state by adjoining a particle of momentum $\mathbf{p}$ and spin component $\sigma$. Annihilation operators $a(\mathbf{p}, \sigma)$ are their adjoints. On Fock space they satisfy
$$[a(\mathbf{p}, \sigma),\, a^\dagger(\mathbf{p}', \sigma')]_\mp = (2\pi)^3\,2E_{\mathbf{p}}\;\delta_{\sigma\sigma'}\,\delta^3(\mathbf{p} - \mathbf{p}'), \qquad [a, a]_\mp = [a^\dagger, a^\dagger]_\mp = 0,$$
with $[\cdot,\cdot]_-$ the bosonic commutator or $[\cdot,\cdot]_+$ the fermionic anticommutator according to the species statistics. (See QFT/preliminaries.md § Fock Space for the Terminology — ladder operators callout connecting these to the QM harmonic-oscillator $a, a^\dagger$.)
Theorem (Weinberg Vol. 1 §4.4). Given $\mathcal{F}$ from §1.4, any operator whose connected matrix elements satisfy cluster decomposition can be written as a polynomial in $a$ and $a^\dagger$. Conversely, an interaction Hamiltonian not expressible in terms of these operators violates cluster decomposition (its connected part fails to vanish at large separation).
Conclusion. M3 + §1.4 ⇒ ladder-operator formalism is unique among the alternatives. Cluster decomposition does not derive Fock space itself (that came from §1.4); it derives the ladder-operator structure on top of an already-given Fock space.
1.6 Summary of §1: assumption budget
| Step | Status | Premise | Conclusion |
|---|---|---|---|
| §1.1 | Definition + Empirical | "particle" = Poincaré irrep on a complex separable Hilbert space | One-particle space $\mathcal{H}_1$ exists |
| §1.2 | Theorem | M1 + M2 + §1.1 | Wigner classification: $\mathcal{H}_1$ labelled by $(m, s)$ or $(0, h)$ |
| §1.3 | Postulate | identical-particle indistinguishability | Multi-particle sectors are $\mathrm{Sym}^N\mathcal{H}_1$ or $\Lambda^N\mathcal{H}_1$ |
| §1.4 | Empirical + Definition | variable particle number | Fock space $\mathcal{F} = \bigoplus_N \mathcal{H}_N$ |
| §1.5 | Theorem | M3 + §1.4 | Ladder operators are the unique cluster-decomposable formalism |
Three inputs beyond M1+M2+M3 are quietly invoked along the way: the operational definition of particle (§1.1), the symmetrization postulate (§1.3), and the empirical fact of variable particle number (§1.4). All three are universally accepted in standard QFT but are additional assumptions, not consequences of the core M1+M2+M3.
2. The Need for Local Fields
2.0 The S-matrix as the central object (Postulate / framing)
Before constructing fields, it is worth being explicit about what we are constructing them for. In the modern story, the S-matrix is closer to a primitive than the Hamiltonian is. This is the conceptual inversion relative to the historical route, and clarifies the logic of the rest of §2. The next three sub-subsections introduce the asymptotic-state framework that $S$ acts on (§2.0.1), define $S$ together with its three structural constraints (§2.0.2), and unpack what it means to take $S$ as the primary object (§2.0.3).
2.0.1 Asymptotic states (Definition)
Real experiments prepare configurations that are effectively free long before scattering and detect configurations that are effectively free long after. The mathematical objects representing these are asymptotic in-states $|\alpha\rangle_{\text{in}}$ and asymptotic out-states $|\beta\rangle_{\text{out}}$.
Heuristic content. An asymptotic state is a multi-particle Fock-space configuration of the species classified by Wigner (§1.2), i.e. a definite collection of momenta, spins, and species, whose interactions are negligible in the relevant time limit. The interacting Hilbert space does not literally contain free states (interactions never fully turn off), but it contains states that behave as if free in the limits $t \to \mp\infty$.
Formal construction (deferred). Two equivalent formal definitions exist; both are technically delicate and we defer derivations to standard references:
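- Møller operators. Define $\Omega_{\text{in/out}} = \lim_{t \to \mp\infty} e^{iHt}e^{-iH_0 t}$ (limits taken in a strong-operator-on-wavepackets sense). $\Omega_{\text{in}}$ maps a free Fock state to its dressed in-counterpart, $\Omega_{\text{out}}$ to its dressed out-counterpart: $|\alpha\rangle_{\text{in}} = \Omega_{\text{in}}|\alpha\rangle_0$, $|\alpha\rangle_{\text{out}} = \Omega_{\text{out}}|\alpha\rangle_0$.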
- LSZ. Asymptotic states are extracted as residues of single-particle poles in the time-ordered correlators of interpolating fields (any local field with non-zero matrix element between the vacuum and a one-particle state). Unlike the Møller construction above, LSZ does not require a Hamiltonian — it takes correlators as input, which can come from a Lagrangian, the lattice, the conformal bootstrap, or any other source. See QFT/preliminaries.md § LSZ Reduction Formula for the formula and properties.
Asymptotic Hilbert spaces. The in-states span $\mathcal{H}_{\text{in}}$; the out-states span $\mathcal{H}_{\text{out}}$. Each is naturally a Fock space built on one-particle Wigner reps (§1.2–1.4). The two spaces are independent by construction, and may a priori be different from each other and from the full interacting $\mathcal{H}$.
The two-part structure of P10. QFT Postulate 10 packages two genuinely separate assumptions, plus a derived consequence:
| Part | Statement | Where it can fail |
|---|---|---|
| (P10a) Existence | The Møller operators $\Omega_{\text{in/out}}$ exist on a dense set of free states | Long-range potentials (Coulomb), QED soft photons: the interaction does not switch off enough at large times |
| (P10b) Asymptotic completeness | $\mathrm{Ran}\,\Omega_{\text{in}} = \mathrm{Ran}\,\Omega_{\text{out}} = \mathcal{H}$, equivalently $\mathcal{H}_{\text{in}} = \mathcal{H}_{\text{out}} = \mathcal{H}$: every interacting state is reachable as some asymptotic state | Confinement (QCD): quarks/gluons are not asymptotic states; only hadrons are |
| Consequence: $S$ is unitary | $S = \Omega_{\text{out}}^\dagger\,\Omega_{\text{in}}$ is automatically a well-defined unitary operator | (follows from P10a + P10b) |
So "$S$ exists as a unitary operator" is not an independent assumption; it is a theorem given (P10a) and (P10b). The substantive content of P10 is exactly (P10a) + (P10b), and the failure modes of $S$ in real theories trace back to whichever of these two is violated.
Where these are guaranteed vs. assumed.
- Free theory: (a) and (b) are trivially satisfied, $S = 1$.
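- Constructive QFT in $d = 2, 3$: (a) is a theorem (Haag–Ruelle scattering theory, given Wightman axioms + isolated mass shells); (b) does not follow from the axioms alone and has been established only in special cases.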
- Realistic 4D QFT (QED, QCD): (a) and (b) are conjectural. No rigorous construction exists; they are physically motivated assumptions justified retroactively by the agreement of perturbative $S$-matrix elements with experiment.
- Modified asymptotics: When (a) or (b) fails, the framework is modified rather than abandoned — Faddeev–Kulish coherent states for QED soft photons; hadron Fock space for QCD; conformal-bootstrap data for CFTs without a mass gap.
For the rest of this document we take (P10a) + (P10b) for granted, consistent with how Wightman / Weinberg structure the framework.
2.0.2 The S-matrix and its three structural constraints (Definition)
Given asymptotic states from §2.0.1, the S-matrix is the unitary operator linking the in- and out-bases of $\mathcal{H}$:
$$S_{\beta\alpha} = {}_{\text{out}}\langle\beta|\alpha\rangle_{\text{in}}.$$
Its matrix elements encode all observable scattering probabilities. Three structural constraints, each a direct consequence of one of the modern postulates M1, M2, M3 (plus P10), pin down what $S$ can look like:
- Unitarity (M2 ⇒ probability conservation): $S^\dagger S = S S^\dagger = 1$. Equivalent to "total probability of some outcome equals 1": $\sum_\beta |S_{\beta\alpha}|^2 = 1$. The optical theorem ($2\,\mathrm{Im}\,T_{\alpha\alpha} = \sum_\beta |T_{\beta\alpha}|^2$, with $S = 1 + iT$) is the perturbative content of this identity.
- Lorentz invariance (M1 ⇒ frame-independence): $U(\Lambda, a)\,S\,U(\Lambda, a)^{-1} = S$, with $U(\Lambda, a)$ the strongly continuous unitary representation of the Poincaré group on $\mathcal{H}$ (see QFT/preliminaries.md § Lorentz and Poincaré Groups). This is what forces the spacetime-translation $\delta^4(p_\beta - p_\alpha)$ to factor out of every matrix element, leading to the $\mathcal{M}$-residue construction (see Computational machinery below).
- Cluster decomposition (M3 ⇒ no spooky long-distance correlations): if a multi-particle process splits into well-separated, non-overlapping subprocesses, the $S$-matrix factorizes, $S_{\beta_1 + \beta_2,\;\alpha_1 + \alpha_2} \to S_{\beta_1\alpha_1}\,S_{\beta_2\alpha_2}$, so its connected part vanishes. This is what §1.5 used to force the ladder-operator structure, and §2.1 will use to force locality of the interaction density.
Together, unitarity + Lorentz invariance + cluster decomposition form the modern definition of "a relativistic quantum scattering theory". Specifying which $S$ (i.e. which species couple to which, and how) is the empirical content of any given theory (QED, QCD, …).
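The link between unitarity and the optical theorem can be checked numerically on a toy example. The sketch below assumes a two-channel system whose S-matrix is a generic $2 \times 2$ unitary matrix (chosen for illustration, not derived from any field theory), defines $T$ through $S = 1 + iT$, and evaluates the difference between the two sides of $2\,\mathrm{Im}\,T_{\alpha\alpha} = \sum_\beta |T_{\beta\alpha}|^2$. All function names are hypothetical.

```python
import cmath
import math


def s_matrix(theta, phi):
    """A toy two-channel unitary S-matrix: e^{i phi} [[cos, i sin], [i sin, cos]]."""
    c, s = math.cos(theta), math.sin(theta)
    phase = cmath.exp(1j * phi)
    return [[c * phase, 1j * s * phase], [1j * s * phase, c * phase]]


def transition(S):
    """Transition matrix T defined by S = 1 + i T."""
    n = len(S)
    return [[(S[i][j] - (1 if i == j else 0)) / 1j for j in range(n)]
            for i in range(n)]


def total_probability(S, alpha):
    """Sum over final channels beta of |S_{beta alpha}|^2."""
    return sum(abs(S[beta][alpha]) ** 2 for beta in range(len(S)))


def optical_theorem_gap(S, alpha):
    """2 Im T_{alpha alpha} minus sum_beta |T_{beta alpha}|^2 (zero if S is unitary)."""
    T = transition(S)
    lhs = 2 * T[alpha][alpha].imag
    rhs = sum(abs(T[beta][alpha]) ** 2 for beta in range(len(S)))
    return lhs - rhs


S = s_matrix(theta=0.7, phi=0.3)
print(total_probability(S, 0), optical_theorem_gap(S, 0))
```

For this matrix the identity can also be verified by hand: both sides equal $2 - 2\cos\theta\cos\phi$.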
The transition operator $T$ and the invariant amplitude $\mathcal{M}$ are introduced from $S$ exactly as in QED/historical.md §5.3.
2.0.3 The status of $S$ as primary (Framing)
What this means structurally. "Specifying a relativistic quantum theory" is operationally the same as "specifying a Lorentz-invariant, cluster-decomposable, unitary $S$-matrix". The Hamiltonian formalism, the Lagrangian formalism, and Feynman-diagram perturbation theory are all means to that end: concrete parameterizations of valid $S$-matrices, none more privileged than the others. (In its purest form, this idea is the S-matrix bootstrap / modern amplitudes program, where one writes down constraints such as Lorentz invariance, unitarity, analyticity, and locality directly on $S$ and never introduces a Lagrangian at all. This is most powerful for theories where Lagrangian descriptions are awkward or non-existent, e.g. higher-spin, certain CFTs, and the on-shell amplitudes literature.)
Why fields then appear. Demanding that $S$ be Lorentz invariant + cluster-decomposable + built from the ladder operators of §1.5 forces the construction in §2.1: the only consistent way to package the ladder operators into a Lorentz-covariant local interaction is via local fields $\psi_\ell(x)$. Fields are construction tools for $S$, not the fundamental objects.
Comparison with the historical route. QED/historical.md §5.0 starts from a Hamiltonian $H = H_0 + H_{\text{int}}$ and defines $S$ as the interaction-picture evolution operator. There $S$ is derived from $H$. Here we run the logic the other way: $S$ is postulated to exist (P10), and the Hamiltonian (or Lagrangian) is one of several conventional ways to specify which $S$. Both routes converge at the level of the Dyson + Wick + Feynman-rules computation: the same machinery, motivated differently. The distinction is conceptual: "what is fundamental?" not "what do we actually compute?".
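Computational machinery (deferred). Once $S$ has been specified by a choice of Lagrangian / Hamiltonian, the entire computational chain
$$\text{Dyson series} \;\to\; \text{Wick's theorem} \;\to\; \text{Feynman rules} \;\to\; \mathcal{M} \;\to\; |\mathcal{M}|^2 \;\to\; d\sigma,\ \Gamma$$
is identical to the one in the historical route, and we do not reproduce it here. The full derivation lives in: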
- QED/historical.md §5.0 — Perturbative machinery: S-matrix, Dyson series, Wick's theorem — Dyson series, Wick's theorem, generic Feynman-diagram structure.
- QED/historical.md §5.2 — The QED S-matrix and Feynman rules — momentum-space Feynman-rules table.
- QED/historical.md §5.3 — The Invariant Amplitude — definition of $T$ and $\mathcal{M}$ as the residue after stripping the spacetime-translation $\delta^4$.
- QED/historical.md §5.4 — The Squared Amplitude — Casimir's trick, spin/polarization sums, physical interpretation, six-step worked-example pipeline.
- QFT/cross-sections.md — master formula for $d\sigma$ in terms of $|\mathcal{M}|^2$; the parallel decay-rate formula lives in decay-rates.md.
- QFT/preliminaries.md § Time-Ordering Operator — the $T$-ordering convention used in the Dyson series.
These computational results carry over verbatim to the modern story; only their interpretive status differs (e.g. the Hamiltonian whose Dyson series is being expanded is, in the modern view, just a parameterization of $S$ rather than the primary object).
Aside: is $H$ derived from $S$? The modern viewpoint elevates $S$ above $H$, but does not provide a constructive recipe to extract $H$ from $S$. The relationship is more subtle:
- $H_0$ is fully determined by the species content (Wigner classification, §1). For free fields the Hamiltonian is forced: $H_0 = \sum_\sigma \int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\,E_{\mathbf{p}}\,a^\dagger(\mathbf{p}, \sigma)\,a(\mathbf{p}, \sigma)$ with $E_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2}$, summed over species.
- $H_{\text{int}}$ is only existentially constructive. Weinberg Vol. 1 Ch. 3 shows that any Lorentz-invariant, cluster-decomposable, unitary $S$ admits some interaction-picture Hamiltonian density $\mathcal{H}_{\text{int}}(x)$ built from local fields satisfying $[\mathcal{H}_{\text{int}}(x), \mathcal{H}_{\text{int}}(y)] = 0$ for spacelike $x - y$. But this is not unique: many different Hamiltonians (differing by field redefinitions) produce the same $S$.
- Field-redefinition equivalence (LSZ). Two $\mathcal{L}$'s related by a local field redefinition give the same on-shell $S$-matrix. This is why effective field theories using different operator bases can describe identical physics, and why the choice of "fundamental" field is partly conventional.
- The bootstrap view. Modern programs (on-shell amplitudes, conformal bootstrap, double-copy constructions) compute $S$-matrix elements directly from Lorentz / unitarity / locality / analyticity constraints without writing down any $H$, which is empirical evidence that $H$ is genuinely a parameterization, not a derivation target.
So the modern claim is not "$H$ is computed from $S$" but rather "$H$ exists, is non-unique, and is one of several equivalent parameterizations of the same $S$." The historical and modern routes thus differ on which is primary, but agree that they are equally complete descriptions of any given theory.
2.1 Lorentz invariance + cluster decomposition demand a special structure (Theorem)
To build a Lorentz-invariant interaction Hamiltonian $H_{\text{int}}$, the natural ansatz is
$$H_{\text{int}}(t) = \int d^3x\;\mathcal{H}_{\text{int}}(\mathbf{x}, t),$$
with $\mathcal{H}_{\text{int}}(x)$ a scalar density under the Poincaré group. But this Hamiltonian density must be built from creation/annihilation operators (by §1.5, to satisfy cluster decomposition), and must be a Lorentz scalar.
Naive ladder-operator combinations don't transform covariantly. Under a Lorentz transformation, an individual $a(\mathbf{p}, \sigma)$ transforms via Wigner's little-group rotation $W(\Lambda, p)$, which depends on $p$, and not as an ordinary tensor field. Building a Lorentz scalar density out of these is non-trivial.
Solution: package the ladder operators into local fields. Define
$$\psi^+_\ell(x) = \sum_\sigma \int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\;u_\ell(\mathbf{p}, \sigma)\,e^{-ip\cdot x}\,a(\mathbf{p}, \sigma), \qquad \psi^-_\ell(x) = \sum_\sigma \int \frac{d^3p}{(2\pi)^3\,2E_{\mathbf{p}}}\;v_\ell(\mathbf{p}, \sigma)\,e^{+ip\cdot x}\,a^{c\dagger}(\mathbf{p}, \sigma),$$
with mode functions $u_\ell$ and $v_\ell$ chosen so that the combined field
$$\psi_\ell(x) = \kappa\,\psi^+_\ell(x) + \lambda\,\psi^-_\ell(x)$$
transforms as a finite-dimensional representation of the Lorentz group:
$$U(\Lambda, a)\,\psi_\ell(x)\,U(\Lambda, a)^{-1} = \sum_{\bar\ell} D_{\ell\bar\ell}(\Lambda^{-1})\,\psi_{\bar\ell}(\Lambda x + a).$$
The mode functions are uniquely determined by this requirement (up to normalization and basis choice) for each species; for spin $\tfrac12$ they are the Dirac spinors $u, v$ of the historical route, for spin 1 the polarization vectors $\epsilon^\mu$, etc.
Two non-trivial consequences emerge automatically:
- Antiparticles. The $\psi^-_\ell$ piece must contain $a^{c\dagger}$: creation operators for a different species (the antiparticle) with the same mass and spin and opposite charge. (For neutral self-conjugate fields, particle = antiparticle.) The existence of antiparticles is not postulated separately; it is forced by the requirement that $\psi_\ell$ transform locally and covariantly.
- Microcausality. For the interaction Hamiltonian density to commute with itself at spacelike separation (which is necessary for Lorentz-invariant time-evolution and for cluster decomposition of the $S$-matrix), one finds that the fields themselves must (anti)commute at spacelike separation: $[\psi_\ell(x), \psi^\dagger_{\ell'}(y)]_\mp = 0$ for spacelike $x - y$. This is microcausality: derived, not postulated.
2.2 Fields are operator-valued distributions (Theorem — heuristic)
The fields defined above are not operators on $\mathcal{F}$ in the strict sense: $\psi_\ell(x)|0\rangle$ has infinite norm (the integrand involves $e^{ip\cdot x}$ at one spacetime point, with no damping at large momentum). They are operator-valued tempered distributions: only the smeared objects
$$\psi_\ell(f) = \int d^4x\;f(x)\,\psi_\ell(x), \qquad f \in \mathcal{S}(\mathbb{R}^4),$$
are bona fide (unbounded) operators. This is the source of the technical complication that distinguishes QFT from finite-DOF QM.
Why "heuristic"? The above is rigorous for free fields. For interacting fields in $d = 3+1$, Haag's theorem shows the construction does not survive intact and the fields must be reconstructed via renormalization. See QFT/remarks.md § Haag's Theorem.
3. Microcausality and Locality
§2.1 already established microcausality
$$[\psi_\ell(x), \psi^\dagger_{\ell'}(y)]_\mp = 0 \quad \text{for spacelike } x - y$$
as a consequence of M1 + M3 plus the field construction. The (anti)commutator choice is fixed by spin–statistics (§4 below). Three points worth highlighting:
- Microcausality is what makes locality precise. Cluster decomposition (M3) is a statement about scattering at large spatial separation; microcausality is the operator-level consequence in Hilbert space.
- It is necessary for relativistic causality. Two operators that don't (anti)commute at spacelike separation could in principle propagate signals faster than light by repeated measurement.
- The (anti)commutator must vanish, not just be small. This is a much stronger condition than thermodynamic locality and is what fails in (e.g.) string field theory, leading to its non-local behavior on small scales.
The full statement is QFT Postulate 6; the modern derivation just outlined is summarized in §2.1.
4. Spin–Statistics (Theorem)
In §1.3 we left the choice between $\mathrm{Sym}^N\mathcal{H}_1$ (bosonic) and $\Lambda^N\mathcal{H}_1$ (fermionic) open. The spin–statistics theorem fixes it:
Theorem (Pauli 1940; Lüders–Zumino 1958). In a Lorentz-invariant local QFT with positive energy and a unique vacuum, fields of integer spin must be quantized as bosons (commutators) and fields of half-integer spin as fermions (anticommutators). Any other choice produces either negative-norm states, violation of microcausality, or violation of the spectrum condition.
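Proof sketch. Construct the field for spin $j$ using both choices (commutators vs anticommutators) and compute $[\psi_\ell(x), \psi^\dagger_{\ell'}(y)]_\mp$ at spacelike separation. The result is a polynomial in derivatives acting on an invariant function: schematically $(i\gamma^\mu\partial_\mu + m)\,\Delta(x - y)$ for spin $\tfrac12$, $(\eta^{\mu\nu} + \partial^\mu\partial^\nu / m^2)\,\Delta(x - y)$ for spin 1 (up to overall constants), etc., where $\Delta(x - y) = \Delta_+(x - y) - \Delta_+(y - x)$ is the commutator function, which vanishes at spacelike separation. For half-integer spin, anticommutators give the spacelike-vanishing combination; commutators give a result that does not vanish (and would violate microcausality). For integer spin it's the reverse.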
So spin–statistics is not an independent postulate in the modern derivation — it is forced by M1 + M2 + M3 + the field construction of §2.
Corollary — Pauli exclusion. Anticommutators of fermion creation operators give $\left(a^\dagger(\mathbf{p}, \sigma)\right)^2 = 0$, forbidding two identical fermions in the same state. The Pauli exclusion principle is a downstream consequence of M1 + M2 + M3, not an independent postulate.
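The exclusion corollary can be seen in the smallest possible example: a single fermionic mode, whose Fock space is spanned by the empty state and the occupied state. The sketch below assumes the basis ordering (empty, occupied) and a discrete mode with unit-normalized anticommutator instead of the continuum normalization used for momentum eigenstates. The matrix names are illustrative.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


# Basis ordering (assumed): column 0 = empty state, column 1 = occupied state.
A_OP = [[0, 1], [0, 0]]   # annihilation: occupied -> empty
A_DAG = [[0, 0], [1, 0]]  # creation: empty -> occupied

anticommutator = add(matmul(A_OP, A_DAG), matmul(A_DAG, A_OP))
double_creation = matmul(A_DAG, A_DAG)
number_operator = matmul(A_DAG, A_OP)

print(anticommutator)   # identity matrix
print(double_creation)  # zero matrix: a second fermion cannot be added
print(number_operator)  # diag(0, 1): occupation is 0 or 1
```

The vanishing of `double_creation` is the matrix form of the statement that the creation operator squares to zero.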
5. Massless Particles and the Origin of Gauge Invariance (Theorem)
Massless particles have a richer little group than massive ones ($ISO(2)$ instead of $SO(3)$), with consequences for the construction in §2.1.
5.1 Polarization vectors of massless spin 1 don't transform as 4-vectors
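For a massive spin-1 particle, the polarization vectors $\epsilon^\mu(\mathbf{p}, \sigma)$ form a true 4-vector representation of the little group (3 polarizations, all satisfying $p\cdot\epsilon = 0$). For a massless spin-1 particle, the little group acts on $\epsilon^\mu(\mathbf{p}, \pm 1)$ with an inhomogeneous piece (phase and sign conventions vary between references):
$$\Lambda^\mu{}_\nu\,\epsilon^\nu(\mathbf{p}, \pm 1) = e^{\pm i\theta(\Lambda, p)}\,\epsilon^\mu(\Lambda\mathbf{p}, \pm 1) + \alpha_\pm(\Lambda, p)\,(\Lambda p)^\mu.$$
The piece proportional to $p^\mu$ is not removable. So $\epsilon^\mu$, and with it the field $A^\mu$ built from it, is not a true 4-vector under Lorentz transformations.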
5.2 Lorentz invariance forces gauge invariance
For an interaction term $A_\mu J^\mu$ to be Lorentz-invariant despite the non-tensorial transformation of $A_\mu$, the inhomogeneous piece must drop out. This requires
$$p_\mu J^\mu(p) = 0,$$
i.e. the current must be conserved: $\partial_\mu J^\mu = 0$.
By Noether's theorem, a conserved current corresponds to an internal symmetry. For an external photon line with polarization $\epsilon^\mu$, the residual transformation $\epsilon^\mu \to \epsilon^\mu + \alpha\,p^\mu$ must be a symmetry of the interaction, and that residual transformation is exactly gauge invariance:
$$A_\mu(x) \to A_\mu(x) + \partial_\mu\lambda(x).$$
So gauge invariance of QED is a theorem in the modern derivation: it is what is required to make a massless spin-1 particle interact in a Lorentz-invariant way. The photon's gauge invariance is not a separate postulate; it is a consistency requirement of M1 + M2 + M3 applied to a massless spin-1 species.
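The decoupling of the inhomogeneous piece can be illustrated numerically. The sketch below assumes the mostly-minus metric, a massless momentum along $z$, and hand-picked current components (purely illustrative numbers, not from any specific process). It compares the amplitude $\epsilon_\mu J^\mu$ before and after the shift $\epsilon^\mu \to \epsilon^\mu + \alpha\,k^\mu$ for a conserved and a non-conserved current. The names `minkowski_dot`, `shift_polarization`, and `amplitude_shift` are hypothetical helpers.

```python
from math import sqrt


def minkowski_dot(a, b):
    """Contract two 4-vectors with the (assumed) mostly-minus metric."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))


def shift_polarization(eps, alpha, k):
    """Residual little-group / gauge shift: eps -> eps + alpha * k."""
    return tuple(e + alpha * q for e, q in zip(eps, k))


k = (2.0, 0.0, 0.0, 2.0)                     # massless momentum along z: k.k = 0
eps = (0.0, 1 / sqrt(2), 1j / sqrt(2), 0.0)  # a circular polarization, k.eps = 0
alpha = 3.7 - 1.2j                           # arbitrary shift parameter

conserved = (1.5, 0.4 - 0.2j, -0.7, 1.5)      # J^3 = J^0, so k.J = 0
non_conserved = (1.5, 0.4 - 0.2j, -0.7, 0.5)  # k.J = 2, current not conserved


def amplitude_shift(current):
    """Change in eps.J caused by the shift of the polarization vector."""
    shifted = shift_polarization(eps, alpha, k)
    return minkowski_dot(shifted, current) - minkowski_dot(eps, current)


print(abs(amplitude_shift(conserved)))  # vanishes up to rounding
print(amplitude_shift(non_conserved))   # equals alpha * (k.J), not zero
```

The shift changes the amplitude by exactly $\alpha\,k_\mu J^\mu$, so it is invisible precisely when the current is conserved.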
Generalization. For multiple massless spin-1 particles forming a non-trivial multiplet under an internal symmetry $G$, the same argument forces the Yang–Mills structure: the connection 1-form, structure constants $f^{abc}$, and self-interactions are all uniquely determined. See QED for the abelian case ($U(1)$, one massless spin-1) and QCD for the non-abelian realization ($SU(3)$, eight massless spin-1 in the adjoint).
What this story does not derive. The existence of multiple massless spin-1 particles forming a $G$-multiplet is an empirical input, not a theorem. Likewise, massive gauge bosons (e.g. $W^\pm$, $Z$) are permitted by Wigner classification without gauge invariance, but their longitudinal polarizations break perturbative unitarity at high energies. The resolution (the Higgs mechanism, with a scalar VEV breaking the gauge symmetry spontaneously) is a postulate added on top of the modern story, not a derivation. See QFT/remarks.md § Spontaneous Symmetry Breaking and the electroweak theory doc, where the Higgs mechanism is worked out in detail.
6. CPT Theorem (Theorem)
Theorem (Pauli 1955; Lüders 1957; Jost 1957). Every Lorentz-invariant local QFT with positive energy and a unique vacuum is invariant under the combined action $CPT$ (charge conjugation × parity × time reversal), where $C$ swaps particles and antiparticles, $P$ inverts spatial coordinates, and $T$ reverses time.
The theorem is a consequence of M1 + M2 + M3 + the field construction of §2 (specifically, the analytic-continuation properties of the Wightman functions in complexified Minkowski space). $C$, $P$, and $T$ individually are not required to be symmetries, and indeed are violated by the weak interaction, but their combined product must be.
For the proof structure see Streater–Wightman PCT, Spin and Statistics, and All That (1964); for the modern statement see Weinberg Vol. 1 §5.8.
7. Arrival at the QFT Postulates
We can now read off the QFT postulates as theorems / definitions / postulates of the modern derivation:
| QFT Postulate | Status in this derivation |
|---|---|
| P1 — Relativistic state space | Theorem (M1 + M2 + Wigner classification, §1) |
| P2 — Spectrum condition | Theorem (Wigner: physical particles have $p^2 = m^2 \ge 0$, $p^0 > 0$; §1.2) |
| P3 — Unique Poincaré-invariant vacuum | Theorem (cluster decomposition + irreducibility; the alternative — degenerate vacua — corresponds to spontaneous symmetry breaking, treated as a separate empirical case) |
| P4 — Field operators (operator-valued distributions) | Theorem (M1 + M3 + Weinberg's theorem on cluster decomposition force ladder operators; Lorentz covariance forces packaging into local fields; §2) |
| P5 — Poincaré covariance of fields | Theorem (built into the construction of §2.1) |
| P6 — Microcausality | Theorem (M3 + scalar interaction density; §3) |
| P7 — Spin–statistics | Theorem (Pauli; §4) |
| P8 — Cyclicity of the vacuum | Theorem (cluster decomposition + irreducibility ⇒ the field algebra acts cyclically on the vacuum) |
| P9 — Dynamics from a local action | Definition / convention (Lagrangian field theory is the practical way to specify the interaction density $\mathcal{H}(x)$ subject to the constraints derived above) |
| P10 — Asymptotic completeness / S-matrix | Postulate (an additional assumption — that the interacting theory has a well-defined asymptotic limit; equivalent to the LSZ reduction). Not derivable from M1 + M2 + M3 alone. |
Two postulates remain genuinely additional in the modern story:
- P9 is more of a definitional convention — the Lagrangian formalism is a practical way to specify a Lorentz-invariant interaction, but other formalisms (Hamiltonian, path integral, S-matrix bootstrap) would also work.
- P10 is a real assumption about the dynamics: that the in/out spaces $\mathcal{H}_{\text{in}}$, $\mathcal{H}_{\text{out}}$ exist and equal $\mathcal{H}$. Examples (confinement in QCD) where individual asymptotic-particle interpretations break down show this is not automatic.
So the modern story compresses 10 postulates into 3 (M1, M2, M3) + 1 dynamical assumption (P10) + empirical inputs (field content, masses, couplings).
7.1 Reverse mapping: where do §1's non-M assumptions live in the QFT postulates?
The §1 derivation invokes three inputs beyond the core M1+M2+M3 (see §1.6 assumption budget). Tracing each back into QFT/postulates.md shows where they are explicit, where they are absorbed into other postulates, and where they are silently assumed:
| §1 non-M input | Maps to QFT postulate(s) | Status of mapping |
|---|---|---|
| §1.1 — particle = Poincaré irrep on a complex separable Hilbert space | P1 (relativistic state space) + P5 (Poincaré covariance) | Upstream framing. P1 says "states form a complex separable Hilbert space with the Poincaré group acting unitarily", but doesn't say particles = irreps. The "particle = irrep" identification is logically prior to P1; P1 is its formalization. |
| §1.3 — symmetrization postulate (states are sym OR antisym under exchange) | P7 (spin–statistics) | Logically prior, finer-grained. P7 packages two distinct claims: (a) the binary sym/antisym structure (= §1.3) and (b) the assignment "integer spin → bosons, half-integer → fermions" (= the spin–statistics theorem of §4). Most treatments don't separate (a) from (b); the modern derivation makes the split visible. |
| §1.4 — variable particle number (empirical input) | None directly | Hidden empirical input. No QFT postulate explicitly states "particle number is not conserved". P4 (operator-valued field distributions) and P10 (asymptotic completeness with multi-particle in/out) implicitly use variable-$N$ Fock space, but the empirical input is silently absorbed rather than stated. This is a gap in the standard axiomatic statement. |
So the 10 P-postulates are not literally a complete restatement of the modern derivation: variable particle number in particular is an unflagged empirical input in postulates.md. For a corresponding note on the postulates side, see QFT/postulates.md § Implicit Empirical Inputs.
8. Comparison with Other Routes
| Aspect | Modern (this doc) | Historical (QED/historical.md) | Wightman-axiomatic (postulates.md) |
|---|---|---|---|
| Foundational input | SR + QM + cluster decomposition + species data | Relativistic single-particle wave equation + minimal coupling + second quantization | All 10 Wightman postulates stated upfront |
| Where fields come from | Theorem (forced by Lorentz + cluster decomposition) | Promoted from wavefunction by hand at second quantization (postulate H3) | Postulated as operator-valued distributions (P4) |
| Spin–statistics | Theorem (Pauli 1940) | Postulated (anticommutators by hand) | Postulated as P7 (with note that it's a theorem) |
| Microcausality | Theorem (consequence of locality of $\mathcal{H}(x)$) | Inherited from canonical quantization | Postulated as P6 |
| Antiparticles | Theorem (forced by Lorentz covariance of the field $\psi(x)$) | Postulated via Dirac sea, then reformulated via $b$ vs. $d^\dagger$ | Built into the field operator definition |
| Gauge invariance (massless spin 1) | Theorem (forced by Lorentz consistency of polarization vectors) | Derived from minimal coupling postulate | Not a Wightman postulate; added as a specialization |
| Pedagogical accessibility | Hardest (group theory + abstract framework) | Easiest (extends single-particle QM step by step) | Cleanest mathematically, opaque physically |
| Generalization to non-abelian, Higgs | Multi-particle + multiplet gives YM directly; Higgs is a separate empirical postulate | Gauge generalization requires a leap of faith (no obvious route) | Symmetry structure built in by hand |
All three routes converge on the same physical content — the Wightman postulates and the Lagrangians of the Standard Model — but they organize the inputs differently. The modern route is the most economical (fewest primitive postulates) but requires the most mathematical machinery; the historical route is most intuitive but forces gauge invariance to be discovered later as a "lucky accident"; the Wightman axioms are the most rigorous but presuppose what other routes try to motivate.
9. What Comes Next
Given the postulates of §7, the next questions are:
- Specialize to a specific theory. Pick field content, internal symmetry, renormalizable Lagrangian. Worked examples:
  - QED: $U(1)$ with Dirac matter.
  - QCD: $SU(3)$ with quark matter.
  - Electroweak: $SU(2)_L \times U(1)_Y$ with chiral fermions, broken to $U(1)_{\text{em}}$ by the Higgs mechanism.
  - Standard Model: the union with cross-sector content (anomaly cancellation, generations + GIM, CP problems).
  - GUTs, supersymmetric extensions, effective theories (forthcoming).
- Extract observables. S-matrix elements via LSZ + Feynman rules; cross sections via cross-sections.md, decay rates via decay-rates.md; the broader observable inventory is in observables.md.
- Address foundational caveats. Haag's theorem, rigorous construction in $d = 4$, measurement in QFT: see QFT/remarks.md.
- Move beyond. EFT / Wilsonian framing; QFT as the universal IR description of relativistic systems; possible UV completion (string theory, asymptotic safety).
Fock Space Inventory: Spaces, States, and Operators After Second Quantization
This page is a reference for what is what in a quantum field theory once second quantization has been performed. It expands on QFT/preliminaries.md § Fock Space and § States vs. Fields, and is meant to clarify the distinct roles of state vectors, field operators, ladder operators, and c-number mode coefficients.
The conventions and notation follow the historical-route QED notes, but the content is general.
0.0 Mathematical Preliminaries for Fock Space
The Fock-space construction below uses three pieces of linear-algebraic machinery: the tensor product, the direct sum, and the (anti)symmetric tensor power. The basic versions of $\otimes$ and $\oplus$ are recapped in QM/preliminaries.md § Notation Key; this section adds the parts specific to identical-particle Hilbert spaces.
0.0.1 Tensor Powers
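For a Hilbert space $\mathcal{H}$, the $n$-fold tensor power is
$$\mathcal{H}^{\otimes n} = \underbrace{\mathcal{H} \otimes \mathcal{H} \otimes \cdots \otimes \mathcal{H}}_{n \text{ factors}}.$$
Vectors are linear combinations of product states $|\psi_1\rangle \otimes \cdots \otimes |\psi_n\rangle$ with $|\psi_i\rangle \in \mathcal{H}$. The inner product is defined factor-by-factor and extended by linearity:
$$\langle \psi_1 \otimes \cdots \otimes \psi_n \,|\, \phi_1 \otimes \cdots \otimes \phi_n \rangle = \prod_{i=1}^{n} \langle \psi_i | \phi_i \rangle.$$
This is the natural Hilbert space for $n$ distinguishable particles. For identical particles, the symmetry of the wavefunction under particle exchange must be specified, which leads to the symmetric / antisymmetric subspaces below.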
0.0.2 The Symmetric Group
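A permutation of $\{1, \ldots, n\}$ is a bijection $\sigma : \{1, \ldots, n\} \to \{1, \ldots, n\}$. The set of all such permutations forms the symmetric group $S_n$, with $|S_n| = n!$.
Each permutation has a sign $\operatorname{sgn}(\sigma) \in \{+1, -1\}$:
- $\operatorname{sgn}(\sigma) = +1$ if $\sigma$ is a product of an even number of transpositions (swaps).
- $\operatorname{sgn}(\sigma) = -1$ if $\sigma$ is a product of an odd number of transpositions.
For example, in $S_3$: the identity has sign $+1$; any single swap (e.g. $(1\,2)$) has sign $-1$; the cyclic permutation $(1\,2\,3)$ has sign $+1$ (it's a product of two transpositions).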
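The sign rule can be checked mechanically. The following is a minimal sketch, assuming permutations are stored as tuples of 0-based images; the helper name `sign` is illustrative and not taken from these notes. It uses the standard fact that the parity of the inversion count equals the parity of any decomposition into transpositions.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    # An inversion is a pair i < j whose images appear in the wrong order.
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


# All 3! = 6 elements of the symmetric group on three letters, with signs.
s3 = list(permutations(range(3)))
signs = {perm: sign(perm) for perm in s3}

identity = (0, 1, 2)      # identity permutation
single_swap = (1, 0, 2)   # swaps the first two letters
three_cycle = (1, 2, 0)   # cyclic permutation of all three letters
```

Half of the six permutations are even and half are odd, so the signs sum to zero.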
0.0.3 (Anti)Symmetrization Projectors
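Permutations act on $\mathcal{H}^{\otimes n}$ by permuting factors:
$$U_\sigma \big( |\psi_1\rangle \otimes \cdots \otimes |\psi_n\rangle \big) = |\psi_{\sigma^{-1}(1)}\rangle \otimes \cdots \otimes |\psi_{\sigma^{-1}(n)}\rangle.$$
Two orthogonal projectors are constructed by averaging over $S_n$:
$$P_S = \frac{1}{n!} \sum_{\sigma \in S_n} U_\sigma, \qquad P_A = \frac{1}{n!} \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, U_\sigma.$$
Both satisfy $P_S^2 = P_S$, $P_S^\dagger = P_S$, $P_A^2 = P_A$, $P_A^\dagger = P_A$.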
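The projector properties can be verified on a small example. The sketch below assumes a finite-dimensional single-particle space $\mathbb{C}^3$ and three tensor factors (both choices are illustrative), and builds the symmetrizer and antisymmetrizer as exact rational matrices. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def sign(perm):
    """Sign of a permutation (tuple of 0-based images), via inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def projector(d, n, antisymmetric=False):
    """Matrix of (1/n!) * sum over permutations of [sign *] U_sigma on (C^d)^(tensor n)."""
    basis = list(product(range(d), repeat=n))       # basis labels (i_1, ..., i_n)
    index = {label: k for k, label in enumerate(basis)}
    dim = len(basis)
    mat = [[Fraction(0)] * dim for _ in range(dim)]
    for perm in permutations(range(n)):
        weight = Fraction(sign(perm) if antisymmetric else 1, factorial(n))
        for label in basis:
            # U_sigma sends a product basis vector to the one with permuted factors.
            permuted = tuple(label[perm[k]] for k in range(n))
            mat[index[permuted]][index[label]] += weight
    return mat


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


sym = projector(3, 3)                        # symmetrizer on a 27-dimensional space
antisym = projector(3, 3, antisymmetric=True)
zero = [[Fraction(0)] * 27 for _ in range(27)]
```

The matrices are real, so the transpose plays the role of the adjoint. The traces give the dimensions of the symmetric and antisymmetric subspaces.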
0.0.4 Symmetric Tensor Power
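The $n$-fold symmetric tensor power is the image of the symmetrizer $P_S = \frac{1}{n!} \sum_{\sigma \in S_n} U_\sigma$:
$$\mathrm{Sym}^n(\mathcal{H}) = P_S\, \mathcal{H}^{\otimes n}.$$
Vectors in $\mathrm{Sym}^n(\mathcal{H})$ are invariant under any permutation of the tensor factors. Explicitly, the symmetrized product of single-particle states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ is
$$P_S \big( |\psi_1\rangle \otimes \cdots \otimes |\psi_n\rangle \big) = \frac{1}{n!} \sum_{\sigma \in S_n} |\psi_{\sigma(1)}\rangle \otimes \cdots \otimes |\psi_{\sigma(n)}\rangle.$$
For $n = 2$:
$$\tfrac{1}{2} \big( |\psi_1\rangle \otimes |\psi_2\rangle + |\psi_2\rangle \otimes |\psi_1\rangle \big).$$
For $n = 3$:
$$\tfrac{1}{6} \big( |\psi_1 \psi_2 \psi_3\rangle + |\psi_2 \psi_3 \psi_1\rangle + |\psi_3 \psi_1 \psi_2\rangle + |\psi_2 \psi_1 \psi_3\rangle + |\psi_1 \psi_3 \psi_2\rangle + |\psi_3 \psi_2 \psi_1\rangle \big), \qquad |\psi_a \psi_b \psi_c\rangle \equiv |\psi_a\rangle \otimes |\psi_b\rangle \otimes |\psi_c\rangle.$$
This is the Hilbert space of $n$ identical bosons: the wavefunction is unchanged when any two particles are swapped, and the same single-particle state can be occupied any number of times.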
Notation aliases used elsewhere: , , .
0.0.5 Antisymmetric (Exterior) Tensor Power
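The $n$-fold antisymmetric (exterior) tensor power is the image of the antisymmetrizer $P_A = \frac{1}{n!} \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, U_\sigma$:
$$\Lambda^n(\mathcal{H}) = P_A\, \mathcal{H}^{\otimes n}.$$
Vectors in $\Lambda^n(\mathcal{H})$ pick up a sign $\operatorname{sgn}(\sigma)$ under a permutation $\sigma$ of the tensor factors. The antisymmetrized product of single-particle states $|\psi_1\rangle, \ldots, |\psi_n\rangle$ is the Slater determinant
$$\psi_1 \wedge \cdots \wedge \psi_n = \frac{1}{n!} \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, |\psi_{\sigma(1)}\rangle \otimes \cdots \otimes |\psi_{\sigma(n)}\rangle.$$
For $n = 2$:
$$\psi_1 \wedge \psi_2 = \tfrac{1}{2} \big( |\psi_1\rangle \otimes |\psi_2\rangle - |\psi_2\rangle \otimes |\psi_1\rangle \big).$$
In particular, $\psi \wedge \psi = 0$: two identical fermions cannot occupy the same single-particle state. This is the Pauli exclusion principle, built directly into the algebra.
This is the Hilbert space of $n$ identical fermions.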
Notation aliases used elsewhere: , , .
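0.0.6 Conventions for $n = 0$ and $n = 1$
Both extreme cases give the same result for bosons and fermions:
- $n = 0$: $\mathcal{H}^{\otimes 0} = \mathbb{C}$, a one-dimensional space, identified with the vacuum sector spanned by $|0\rangle$.
- $n = 1$: $\mathcal{H}^{\otimes 1} = \mathcal{H}$, so the single-particle sector is just $\mathcal{H}$ itself; there are no permutations to (anti)symmetrize.
Bosons and fermions therefore differ only for $n \ge 2$.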
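A quick dimension count makes the comparison concrete. This sketch assumes a finite-dimensional single-particle space of dimension `d` (an illustrative simplification, since the spaces in these notes are infinite-dimensional) and uses the standard counting formulas: multisets of size `n` for bosons, subsets of size `n` for fermions.

```python
from math import comb


def boson_sector_dim(d, n):
    """Dimension of the n-fold symmetric power of a d-dimensional space."""
    return comb(d + n - 1, n)


def fermion_sector_dim(d, n):
    """Dimension of the n-fold antisymmetric power of a d-dimensional space."""
    return comb(d, n)  # math.comb returns 0 when n > d


d = 4  # assumed single-particle dimension, for illustration only
table = [(n, boson_sector_dim(d, n), fermion_sector_dim(d, n)) for n in range(6)]
```

The two counts agree for `n = 0` and `n = 1` and first differ at `n = 2`. The fermionic count drops to zero once `n` exceeds `d`, which is Pauli exclusion expressed as a dimension statement.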
0.0.7 Hilbert Direct Sum
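The Hilbert direct sum $\bigoplus_{n=0}^{\infty} \mathcal{H}_n$ of countably many Hilbert spaces $\mathcal{H}_n$ consists of sequences $(\psi_0, \psi_1, \psi_2, \ldots)$ with $\psi_n \in \mathcal{H}_n$ and finite total norm:
$$\sum_{n=0}^{\infty} \|\psi_n\|^2 < \infty.$$
The inner product is
$$\langle \psi | \phi \rangle = \sum_{n=0}^{\infty} \langle \psi_n | \phi_n \rangle_{\mathcal{H}_n}.$$
Different summands are mutually orthogonal: a vector in $\mathcal{H}_n$ is orthogonal to a vector in $\mathcal{H}_m$ for $n \neq m$.
A generic element of $\bigoplus_n \mathcal{H}_n$ is therefore a superposition across all sectors, not a single vector in one $\mathcal{H}_n$. The "particle number" $\hat{N}$ has integer eigenvalues, but generic states (coherent states, the interacting vacuum, ...) are not eigenstates of $\hat{N}$.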
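As an illustration of a state spread across sectors, consider a single-mode coherent state, whose weight in the $n$-particle sector follows the standard Poisson distribution $e^{-|\alpha|^2} |\alpha|^{2n} / n!$. The sketch below is a numerical check under that single-mode assumption; the function name and the truncation `n_max` are illustrative.

```python
from math import exp, factorial


def sector_weights(alpha_abs_sq, n_max):
    """Squared norms of the components of a single-mode coherent state in
    sectors n = 0 .. n_max (Poisson distribution with mean alpha_abs_sq)."""
    return [
        exp(-alpha_abs_sq) * alpha_abs_sq ** n / factorial(n)
        for n in range(n_max + 1)
    ]


weights = sector_weights(2.0, 60)     # |alpha|^2 = 2, truncated at 60 quanta
total_norm = sum(weights)             # finite total norm, very close to 1
mean_n = sum(n * w for n, w in enumerate(weights))
var_n = sum((n - mean_n) ** 2 * w for n, w in enumerate(weights))
```

The nonzero variance shows that the state is not an eigenstate of the number operator, even though its total norm is finite.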
0.0.8 Putting It Together: Fock Space
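With this machinery, the Fock space definitions used below are:
$$\mathcal{F}_{\text{boson}} = \bigoplus_{n=0}^{\infty} \mathrm{Sym}^n(\mathcal{H}_1), \qquad \mathcal{F}_{\text{fermion}} = \bigoplus_{n=0}^{\infty} \Lambda^n(\mathcal{H}_1),$$
where $\mathrm{Sym}^n$ and $\Lambda^n$ denote the symmetric and antisymmetric tensor powers.
Each Fock space is a Hilbert direct sum of $n$-particle sectors $\mathcal{H}_n$, where each $\mathcal{H}_n$ is the (anti)symmetric tensor power of the single-particle space $\mathcal{H}_1$. The vacuum sits at the bottom ($n = 0$); the single-particle space sits at $n = 1$ and is the same as before second quantization (see §0.2 below).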
0. Before vs. After Second Quantization
It is worth pausing to ask: what was the Hilbert space before second quantization, and how does it relate to Fock space? Answering this carefully clears up a recurring confusion: are the "states" of the pre-QFT theory the same as the states of the QFT?
0.1 Two Quantizations, Not One
The historical route to QFT involves two distinct constructions, both called "quantization":
| Stage | What is quantized | Resulting Hilbert space |
|---|---|---|
| 0. Classical | Classical field $\psi(x)$: a 4-component c-number (or Grassmann) function on spacetime (see note below) | None |
| 1. "First quantization" | Promote $\psi$ to a relativistic wavefunction; treat as the state of a single particle | $\mathcal{H}_1 = L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$ (for a Dirac particle) |
| 2. "Second quantization" | Promote $\psi$ to an operator $\hat\psi$ on a many-particle space | Fock space $\mathcal{F}$ |
The terminology is regrettable — there is only one Hilbert-space construction (first quantization); the second step is really field quantization — but the historical names have stuck.
Note on what "stage 0" means. Several different objects could legitimately be called "the classical field":
| Object | Components | Source of the count |
|---|---|---|
| Classical relativistic particle (worldline $x^\mu(\tau)$) | 4 spacetime coords, 0 internal components | Spacetime dimension |
| Classical Dirac field $\psi_\alpha(x)$ | 4 spinor components per spacetime point | Clifford algebra (see §0.8.1), not the spacetime dimension |
| Classical Maxwell field $A_\mu(x)$ | 4 components per spacetime point | Spacetime dimension (genuinely; $\mu = 0, 1, 2, 3$) |

The table above uses the classical Dirac field as "stage 0", the modern field-theory starting point. Note that this is not what Dirac himself started from historically: he began with a relativistic particle (no field), found his equation by demanding a first-order relativistic wave equation (postulate H1), and only later did anyone write down the corresponding classical Lagrangian. So the modern chain "classical field → first quantization → second quantization" is a retrospective tidying-up; the original sequence was "classical particle → relativistic wavefunction → field operator". Either ordering arrives at the same QFT.
Note also that the two "4"s in the Dirac and Maxwell rows above are different things: spinor index vs. spacetime index. They coincide only in 4D spacetime — see the caveat in §0.8.1.
0.2 Pre-Quantization Setup (Single-Particle Relativistic QM)
After first quantization but before second quantization:
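- Hilbert space: $\mathcal{H}_1 = L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$ (Dirac case).
- State: a single vector $|\psi\rangle \in \mathcal{H}_1$, i.e. a wavefunction $\psi_\alpha(\mathbf{x})$.
- Particle number: fixed at $N = 1$ by construction.
- Operators: $\hat{\mathbf{x}}$, $\hat{\mathbf{p}}$, $\hat{H}_D$, etc. They act on a single wavefunction; they cannot create or destroy particles.
- Missing: no vacuum, no antiparticles (only awkward Dirac-sea heuristics), no scattering with changing particle number, no second particle.
For multi-particle relativistic QM (which is rarely written down because it does not really work as a stand-alone theory), one would put $n$ identical particles into
$$\mathcal{H}_n = \Lambda^n(\mathcal{H}_1) \ \text{(fermions)} \qquad \text{or} \qquad \mathcal{H}_n = \mathrm{Sym}^n(\mathcal{H}_1) \ \text{(bosons)},$$
separately for each $n$, with no operators connecting different sectors.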
0.3 Post-Quantization Setup (QFT)
After second quantization:
- Hilbert space: Fock space $\mathcal{F} = \bigoplus_{n \ge 0} \mathcal{H}_n$, with the same $\mathcal{H}_n$ as direct summands, plus the new vacuum sector $\mathcal{H}_0$.
- State: a vector in $\mathcal{F}$, generically a superposition over different particle-number sectors.
- Particle number: no longer fixed. The number operator $\hat{N}$ has integer eigenvalues $0, 1, 2, \ldots$, but generic states (coherent states, the interacting vacuum) are not eigenstates of $\hat{N}$.
- Operators: $b$, $b^\dagger$, $d$, $d^\dagger$ that connect different sectors; field operators $\hat\psi(x)$, $\hat{\bar\psi}(x)$; observables $\hat{H}$, $\hat{\mathbf{P}}$, $\hat{Q}$, $\hat{N}$.
- Gained: vacuum, antiparticles as a derived concept (not the Dirac sea), variable particle number, locality of operators, automatic (anti)symmetrization.
0.4 Are the States "the Same"?
The one-particle states are essentially the same; everything else is genuinely new.
| State | Pre-2nd-quantization | Post-2nd-quantization | Same? |
|---|---|---|---|
| One-particle wavefunction | $\psi \in \mathcal{H}_1$ | $\lvert\psi\rangle \in \mathcal{H}_1 \subset \mathcal{F}$ | Yes: same vector, embedded in a larger space |
| Two-particle Slater state | $\psi_1 \wedge \psi_2 \in \Lambda^2(\mathcal{H}_1)$ | $b^\dagger b^\dagger \lvert 0\rangle$ contracted with the wavefunction | Yes, but only because antisymmetrization was postulated by hand before |
| Vacuum $\lvert 0\rangle$ | Does not exist | Sector $\mathcal{H}_0$ | No, genuinely new |
| Antiparticle states | Do not exist (or: filled negative-energy states, awkward) | $d^\dagger \lvert 0\rangle$, clean and positive-energy | No, genuinely new |
| Superposition of different $n$ | Forbidden: different sectors are unrelated | Allowed: e.g. coherent states | No, genuinely new |
| Interacting vacuum $\lvert\Omega\rangle$ | Does not exist | A specific vector in (extended) $\mathcal{F}$ | No, genuinely new |
So every pre-quantization state has a faithful image in $\mathcal{F}$, but $\mathcal{F}$ contains many states with no pre-quantization counterpart.
0.5 The Mathematical Relationship
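Concretely, Fock space is built from the single-particle space:
$$\mathcal{F} = \bigoplus_{n=0}^{\infty} \mathcal{H}_n = \mathbb{C} \oplus \mathcal{H}_1 \oplus \mathcal{H}_2 \oplus \cdots, \qquad \mathcal{H}_n = \text{(anti)symmetrized } \mathcal{H}_1^{\otimes n}.$$
Consequences:
- The single-particle space $\mathcal{H}_1$ is a closed subspace of $\mathcal{F}$, so no information is lost.
- The multi-particle sectors $\mathcal{H}_n$ are also subspaces, but pre-quantization theory needed a separate construction for each $n$, while Fock space gives them all at once.
- The truly new ingredients are (i) the vacuum $|0\rangle$, (ii) operators $b$, $b^\dagger$ connecting different sectors, and (iii) coherent superpositions across sectors.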
0.6 Why Bother?
Two reasons that are not obvious until you do it:
- Operators that change particle number become local fields. The map "particle at $x$" $\mapsto \hat\psi^\dagger(x)|0\rangle$ lets you write down operators like $\hat{\bar\psi}(x)\gamma^\mu\hat\psi(x)$ (local densities, currents, energy-momentum tensors) that have no analog in pre-quantization theory because there is no vacuum to create from. This is what makes interactions and locality possible.
- Antisymmetrization becomes automatic. In pre-quantization theory you postulate Slater determinants by hand and have to keep track of permutation signs. In second quantization the anticommutator $\{b^\dagger_i, b^\dagger_j\} = 0$ does this for you: Pauli exclusion is built into the algebra of operators.
0.7 A Useful Slogan
First quantization turns a classical particle into a quantum state. Second quantization turns a quantum state into a quantum operator.
Or, in terms of objects:
Classical: position x(t) field φ(x) (numbers/functions)
↓ first q. ↓ first q.
1-particle QM: operator x̂ wavefunction ψ(x) (acts on / is in 𝓗_1)
↓ second q.
QFT: field operator ψ̂(x) (acts on 𝓕)
The right-hand column is the relevant one: a wavefunction (a state) is promoted to a field operator on a larger Fock space. The symbol $\psi(x)$ becomes $\hat\psi(x)$, but it now means an operator, not a state. Meanwhile the state of the system becomes a vector in Fock space, which is something genuinely new (see QFT/preliminaries.md § States vs. Fields).
0.8 The Pre-Quantization Hilbert Space in Detail (Dirac Case)
The single-particle Dirac Hilbert space looks deceptively simple — "square-integrable 4-component spinor wavefunctions" — but several things are non-obvious and worth spelling out, because they motivate why second quantization is needed in the first place.
0.8.1 The Naïve Definition
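$\mathcal{H}_1$ is an abstract complex separable Hilbert space carrying the spin-$\tfrac12$, mass-$m$ representation of the (universal cover of the) Poincaré group. The most familiar concrete realization is the position representation
$$\mathcal{H}_1 \cong L^2(\mathbb{R}^3) \otimes \mathbb{C}^4,$$
in which a state vector $|\psi\rangle$ is represented by a 4-spinor wavefunction
$$\psi(\mathbf{x}) = \big( \psi_1(\mathbf{x}), \psi_2(\mathbf{x}), \psi_3(\mathbf{x}), \psi_4(\mathbf{x}) \big)^{T}, \qquad \psi_\alpha \in L^2(\mathbb{R}^3).$$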
Why 4 components? The number is forced, not chosen. Three equivalent derivations:
- Algebraic. Postulate H1 (first-order relativistic equation, see QED/historical.md §1.2) requires $\gamma^\mu$ satisfying the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. The smallest faithful matrix representation of this algebra in 4D Minkowski space has dimension $4 \times 4$. (In $D$ dimensions it would be $2^{\lfloor D/2 \rfloor} \times 2^{\lfloor D/2 \rfloor}$; larger sizes are reducible direct sums.) The general pattern (Bott periodicity, when Majorana/Weyl spinors exist) is collected in math/clifford-algebra.md. Hence $\psi$ must be a 4-component column.
- Group-theoretic. Under Lorentz transformations $\psi$ must carry a representation of $SL(2, \mathbb{C})$ (the universal cover of the Lorentz group). The smallest faithful spinor reps are the two 2-component Weyl spinors $(\tfrac12, 0)$ and $(0, \tfrac12)$. A massive particle's mass term couples both chiralities, so a massive spin-$\tfrac12$ field needs $(\tfrac12, 0) \oplus (0, \tfrac12)$, of dimension $2 + 2 = 4$.
- Physical. The 4 components decompose as $4 = 2 \times 2$: two for the spin-$\tfrac12$ doublet, doubled by the antiparticle degree of freedom that relativity unavoidably introduces (§0.8.5–§0.8.7).
Non-relativistic spin-$\tfrac12$ (Pauli theory) gets away with $\mathbb{C}^2$ because it has no antiparticle sector and uses Galilean rather than Lorentz boosts.
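The algebraic route can be checked directly on explicit matrices. The sketch below assumes the Dirac (standard) representation and the mostly-minus metric signature $(+,-,-,-)$; both are conventions chosen for illustration and may differ from those in QED/historical.md. It verifies that the four $4 \times 4$ matrices anticommute to twice the metric.

```python
def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [r1 + r2 for r1, r2 in zip(tl, tr)]
    bottom = [r1 + r2 for r1, r2 in zip(bl, br)]
    return top + bottom


def neg(m):
    return [[-x for x in row] for row in m]


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
pauli = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Dirac representation (assumed): gamma^0 = diag(1, -1), gamma^i = [[0, s_i], [-s_i, 0]].
gamma = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in pauli]
eta = [1, -1, -1, -1]  # diagonal of the assumed metric


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def clifford_holds():
    """True if {gamma^mu, gamma^nu} equals 2 * eta^{mu nu} times the identity."""
    for mu in range(4):
        for nu in range(4):
            scalar = 2 * eta[mu] if mu == nu else 0
            ac = anticommutator(gamma[mu], gamma[nu])
            for i in range(4):
                for j in range(4):
                    if ac[i][j] != (scalar if i == j else 0):
                        return False
    return True
```

The check only confirms that a 4-dimensional representation exists. That no smaller faithful one exists is the representation-theory statement quoted in the list above.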
Caveat: the "4" in $\mathbb{C}^4$ is not the spacetime dimension. It is a coincidence that the spinor dimension equals the spacetime dimension in $D = 4$. The spinor dimension is $2^{\lfloor D/2 \rfloor}$ in $D$ spacetime dimensions:

| Spacetime dim $D$ | Spinor dimension $2^{\lfloor D/2 \rfloor}$ |
|---|---|
| 4 | 4 ← our universe |
| 6 | 8 |
| 10 | 32 (relevant in superstring theory) |

In 6D spinors would be 8-component, in 10D 32-component. The "4 of $x^\mu$" (counting $\mu = 0, 1, 2, 3$) and the "4 of $\psi_\alpha$" (counting spinor components) are two different things that happen to coincide only in our 4D spacetime. The spatial dimension enters separately, in the $\mathbb{R}^3$ inside $L^2(\mathbb{R}^3)$.
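The counting rule is easy to tabulate. A minimal sketch, using the Dirac-spinor dimension formula $2^{\lfloor D/2 \rfloor}$ quoted above (the function name is illustrative):

```python
def spinor_dimension(spacetime_dim):
    """Number of components of a Dirac spinor in the given spacetime dimension."""
    return 2 ** (spacetime_dim // 2)


# Spacetime dimensions where spinor and spacetime dimensions coincide.
coincidences = [d for d in range(2, 12) if spinor_dimension(d) == d]
```

In the scanned range the two counts agree only for `d = 2` and `d = 4`, which is why the coincidence in four dimensions should not be read as a general rule.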
Equivalent realizations of the same Hilbert space are obtained by unitary change of basis:
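| Representation | Concrete model | Vector becomes |
|---|---|---|
| Position | $L^2(\mathbb{R}^3_{\mathbf{x}}) \otimes \mathbb{C}^4$ | $\psi_\alpha(\mathbf{x})$ |
| Momentum | $L^2(\mathbb{R}^3_{\mathbf{p}}) \otimes \mathbb{C}^4$ (via Fourier) | $\tilde\psi_\alpha(\mathbf{p})$ |
| Energy/spin | $\mathcal{H}_1^{+} \oplus \mathcal{H}_1^{-}$ | components along the $\pm E$ split (§0.8.6) |
| Bound-state energy basis | $\ell^2 \oplus L^2$ | discrete + continuum amplitudes (e.g. for hydrogen) |
| Abstract | no concrete model | just $\lvert\psi\rangle$ |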
None of these is "the" Hilbert space — the Hilbert space is the abstract equivalence class; the rest are coordinate choices that diagonalize different operators (position, momentum, energy, ...).
There is an additional, independent choice for the $\mathbb{C}^4$ factor: a representation of the Clifford algebra (Dirac, Weyl, Majorana; see QED/historical.md §0.1 for the explicit matrix forms and math/clifford-algebra.md § 7 for the abstract classification). Different Clifford-algebra representations are related by a similarity transformation $\gamma^\mu \to S \gamma^\mu S^{-1}$, $\psi \to S\psi$, and again give the same abstract $\mathcal{H}_1$.
The position representation is used below by default because the Dirac equation is most familiar in that form. Time $t$ is not part of the Hilbert-space label; it parametrizes how a vector $|\psi(t)\rangle$ evolves under the Schrödinger-form Dirac equation.
0.8.2 The Inner Product
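$$\langle \psi | \phi \rangle = \int d^3x\, \psi^\dagger(\mathbf{x})\, \phi(\mathbf{x}) = \sum_{\alpha=1}^{4} \int d^3x\, \psi_\alpha^*(\mathbf{x})\, \phi_\alpha(\mathbf{x}).$$
Note carefully: this uses $\psi^\dagger$ (Hermitian conjugate), not the relativistic-looking $\bar\psi = \psi^\dagger \gamma^0$. The latter is a Lorentz scalar but not positive-definite, so it cannot be an inner product. The price of the former is that the inner product is not manifestly Lorentz-invariant: it singles out the time component of $j^\mu$. This is one of the first signs that $\mathcal{H}_1$ does not sit comfortably with relativity.
The norm is $\|\psi\|^2 = \int d^3x\, \psi^\dagger \psi = \int d^3x\, j^0$, where $j^\mu = \bar\psi \gamma^\mu \psi$ is the Dirac probability current. Since $j^0 = \psi^\dagger \psi \ge 0$, the norm is positive-definite. This was the whole point of demanding a first-order equation (postulate H1 in QED/historical.md).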
0.8.3 The Three Tensor-Product Factors
It helps to unpack $\mathcal{H}_1 = L^2(\mathbb{R}^3) \otimes \mathbb{C}^2 \otimes \mathbb{C}^2$ explicitly:
| Factor | Dimension | What it carries |
|---|---|---|
| $L^2(\mathbb{R}^3)$ | $\infty$ (separable) | Spatial wavefunction information |
| $\mathbb{C}^2$ | 2 | Upper vs. lower 2-component blocks (in standard rep.) |
| $\mathbb{C}^2$ | 2 | Spin-up vs. spin-down within each block |
In the standard (Dirac) representation, the four components split as $\psi = (\varphi, \chi)^T$, where $\varphi$ ("upper") dominates for non-relativistic particle states and $\chi$ ("lower") dominates for antiparticles. In the Weyl (chiral) representation the same four components split into left- and right-handed Weyl spinors instead. Different representations of the Clifford algebra give different physical interpretations of the four components, but the abstract Hilbert space is the same.
0.8.4 Position-Basis "Eigenstates" (Improper Vectors)
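As in non-relativistic QM, position eigenstates $|\mathbf{x}, \alpha\rangle$ ($\alpha$ a spinor index) are not normalizable elements of $\mathcal{H}_1$; they are improper distributions:
$$\langle \mathbf{x}, \alpha | \mathbf{x}', \beta \rangle = \delta_{\alpha\beta}\, \delta^3(\mathbf{x} - \mathbf{x}').$$
They are the rigged-Hilbert-space generalized eigenvectors of the position operator $\hat{\mathbf{x}}$. They live in a larger space $\Phi' \supset \mathcal{H}_1$, never in $\mathcal{H}_1$ itself.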
0.8.5 Momentum Basis and the Free Hamiltonian
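Going to momentum space by Fourier transform,
$$\tilde\psi(\mathbf{p}) = \int \frac{d^3x}{(2\pi)^{3/2}}\, e^{-i \mathbf{p} \cdot \mathbf{x}}\, \psi(\mathbf{x}),$$
the Hilbert space is unitarily equivalent to $L^2(\mathbb{R}^3_{\mathbf{p}}) \otimes \mathbb{C}^4$. The free Dirac Hamiltonian acts on each momentum mode as the $4 \times 4$ matrix
$$H(\mathbf{p}) = \boldsymbol{\alpha} \cdot \mathbf{p} + \beta m,$$
whose eigenvalues are $\pm E_{\mathbf{p}} = \pm\sqrt{\mathbf{p}^2 + m^2}$, each two-fold degenerate (for the two spin states). The corresponding eigenvectors are the c-number 4-spinors $u(\mathbf{p}, s)$ (positive energy) and $v(\mathbf{p}, s)$ (negative energy). This is the source of the negative-energy problem: half of the spectrum of $H$ on $\mathcal{H}_1$ is negative, so $H$ is unbounded below.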
0.8.6 Decomposition into Positive- and Negative-Energy Subspaces
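Since $H(\mathbf{p})^2 = (\mathbf{p}^2 + m^2)\, \mathbb{1} = E_{\mathbf{p}}^2\, \mathbb{1}$, one can build orthogonal projectors
$$P_\pm(\mathbf{p}) = \frac{1}{2} \left( \mathbb{1} \pm \frac{H(\mathbf{p})}{E_{\mathbf{p}}} \right),$$
inducing an orthogonal decomposition
$$\mathcal{H}_1 = \mathcal{H}_1^{+} \oplus \mathcal{H}_1^{-}.$$
Each summand is isomorphic to $L^2(\mathbb{R}^3) \otimes \mathbb{C}^2$ (the $\mathbb{C}^2$ being the two-fold spin degeneracy). Two key facts:
- Each summand carries an irreducible unitary representation of the Poincaré group (mass $m$, spin $\tfrac12$, positive or negative energy). $\mathcal{H}_1^{+}$ is the Wigner one-particle space, which is what Fock space ought to be built from.
- The decomposition is non-local in position space. The projectors $P_\pm$ become non-local integral kernels when transformed back to $\mathbf{x}$-space, a signature of the Newton–Wigner localization problem (§0.8.7).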
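The projector construction can be checked on a single momentum mode. The sketch below assumes the Dirac representation, natural units, and illustrative values $m = 4$, $\mathbf{p} = (0, 0, 3)$, chosen so that the energy is exactly 5 and all arithmetic stays rational. The momentum-mode Hamiltonian then has the block form $[[m, p_z \sigma_z], [p_z \sigma_z, -m]]$.

```python
from fractions import Fraction

m, pz = 4, 3   # assumed mass and z-momentum (natural units)
E = 5          # sqrt(pz**2 + m**2) for these values

# H(p) = alpha . p + beta m for p along z, Dirac representation (assumed).
H = [
    [m, 0, pz, 0],
    [0, m, 0, -pz],
    [pz, 0, -m, 0],
    [0, -pz, 0, -m],
]


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def energy_projector(sign):
    """(1 + sign * H / E) / 2 as an exact rational matrix."""
    return [
        [(Fraction(int(i == j)) + sign * Fraction(H[i][j], E)) / 2 for j in range(4)]
        for i in range(4)
    ]


P_plus = energy_projector(+1)
P_minus = energy_projector(-1)
H_squared = matmul(H, H)
rank_plus = sum(P_plus[i][i] for i in range(4))     # trace of a projector = its rank
rank_minus = sum(P_minus[i][i] for i in range(4))
```

Each projector has rank 2, matching the two-fold spin degeneracy of each energy branch.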
After second quantization, $\mathcal{H}_1^{+}$ becomes the electron sector of Fock space, and $\mathcal{H}_1^{-}$ (with charge conjugation applied) becomes the positron sector, both with positive energy.
0.8.7 Why $\mathcal{H}_1$ Is Not a Satisfactory Relativistic Hilbert Space
Several pathologies show up if one tries to take $\mathcal{H}_1$ seriously as the state space of a relativistic quantum particle:
- Negative-energy spectrum. Already noted; the Hamiltonian is unbounded below.
- No relativistically invariant inner product. The natural Lorentz-scalar bilinear $\bar\psi \phi$ is indefinite; the positive-definite $\psi^\dagger \phi$ singles out a frame.
- Newton–Wigner localization. There is no position operator on $\mathcal{H}_1^{+}$ whose eigenstates are Lorentz-covariantly localized. The naïve $\hat{\mathbf{x}}$ does not commute with the projection onto $\mathcal{H}_1^{+}$: projecting a $\delta$-localized state onto positive energies smears it out over a Compton wavelength $\lambda_C = \hbar / (mc)$.
- Zitterbewegung. Solutions of the Dirac equation on $\mathcal{H}_1$ exhibit a rapid oscillation between positive- and negative-energy components, with no classical analogue. It vanishes once one restricts to $\mathcal{H}_1^{+}$.
- Klein paradox. A barrier of height $V_0 > 2mc^2$ "transmits" more current than is incident: pair production sneaking in through a single-particle interpretation.
- No multi-particle states. $\mathcal{H}_1$ has $N = 1$ baked in; it cannot describe scattering processes that change particle number, which is exactly what relativistic kinematics permits.
All of these are symptoms of the same underlying issue: a relativistic theory cannot be a one-particle theory. Energy can be converted to particle number ($E = mc^2$), so any consistent theory must allow particle creation and annihilation. Second quantization is the resolution.
0.8.8 What Survives in QFT
Despite its problems as a relativistic state space, $\mathcal{H}_1$ (or really its positive-energy half $\mathcal{H}_1^{+}$) has a clean and important role after second quantization:
- It is the one-particle sector of Fock space (the $n = 1$ summand): $\mathcal{H}_1^{+} \subset \mathcal{F}$.
- It carries the Wigner irreducible representation for a mass-$m$, spin-$\tfrac12$ particle.
- One-particle wavepackets
$$|f\rangle = \sum_s \int \frac{d^3p}{(2\pi)^3\, 2E_{\mathbf{p}}}\, f(\mathbf{p}, s)\, b^\dagger(\mathbf{p}, s)\, |0\rangle$$
are vectors in $\mathcal{H}_1^{+}$, with $f(\mathbf{p}, s)$ playing the role of the post-quantization "wavefunction" (see §3.1 below).
So the historical $\mathcal{H}_1$ does not disappear: its positive-energy half is reborn cleanly inside Fock space, and the negative-energy half is reinterpreted (via charge conjugation) as the antiparticle sector.
1. The State Space: Fock Space
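After second quantization, the Hilbert space is no longer the single-particle space $\mathcal{H}_1$ (e.g. $L^2(\mathbb{R}^3) \otimes \mathbb{C}^4$ for one Dirac particle), but the much larger Fock space
$$\mathcal{F} = \bigoplus_{n=0}^{\infty} \mathcal{H}_n = \mathcal{H}_0 \oplus \mathcal{H}_1 \oplus \mathcal{H}_2 \oplus \cdots,$$
where:
- $\mathcal{H}_0 = \mathbb{C}\, |0\rangle$: the one-dimensional vacuum sector.
- $\mathcal{H}_1$: the one-particle Hilbert space (an irreducible Wigner representation of the Poincaré group, e.g. mass $m$, spin $s$).
- $\mathcal{H}_n$: the $n$-particle space, (anti)symmetrized tensor product:
$$\mathcal{H}_n = \mathrm{Sym}^n(\mathcal{H}_1) \ \text{(bosons)}, \qquad \mathcal{H}_n = \Lambda^n(\mathcal{H}_1) \ \text{(fermions)}.$$
Concretely:
- A single fermion lives in $\mathcal{H}_1$.
- A two-fermion state is an antisymmetrized two-particle wavefunction, a Slater determinant, in $\mathcal{H}_2 = \Lambda^2(\mathcal{H}_1)$.
- An $n$-particle state has $n!$ permutations symmetrized/antisymmetrized.
- The vacuum $|0\rangle$ is a distinguished one-dimensional sector. It is not "no wavefunction"; it is a normalized vector orthogonal to all $\mathcal{H}_{n \ge 1}$.
For a theory with several particle species, the full Fock space is a tensor product of one Fock space per species. For QED:
$$\mathcal{F}_{\text{QED}} = \mathcal{F}_{e^-} \otimes \mathcal{F}_{e^+} \otimes \mathcal{F}_{\gamma}.$$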
2. The Operators
There are two distinct families.
2.1 Creation and Annihilation Operators
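These are the building blocks of every other operator. For each mode (e.g. $(\mathbf{p}, s)$ for an electron):
- $b^\dagger(\mathbf{p}, s)$ creates an electron in mode $(\mathbf{p}, s)$. Raises sector: $\mathcal{H}_n \to \mathcal{H}_{n+1}$.
- $b(\mathbf{p}, s)$ annihilates one. Lowers: $\mathcal{H}_n \to \mathcal{H}_{n-1}$.
- $d^\dagger(\mathbf{p}, s)$, $d(\mathbf{p}, s)$: analogous for positrons.
- $a^\dagger(\mathbf{k}, \lambda)$, $a(\mathbf{k}, \lambda)$: analogous for photons.
Algebra (in covariant normalization):
$$\{ b(\mathbf{p}, s), b^\dagger(\mathbf{p}', s') \} = \{ d(\mathbf{p}, s), d^\dagger(\mathbf{p}', s') \} = (2\pi)^3\, 2E_{\mathbf{p}}\, \delta^3(\mathbf{p} - \mathbf{p}')\, \delta_{ss'}, \qquad [ a(\mathbf{k}, \lambda), a^\dagger(\mathbf{k}', \lambda') ] = (2\pi)^3\, 2\omega_{\mathbf{k}}\, \delta^3(\mathbf{k} - \mathbf{k}')\, \delta_{\lambda\lambda'},$$
with all other (anti)commutators vanishing. These are densely-defined unbounded operators on $\mathcal{F}$.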
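A finite toy model shows how anticommutation relations of this kind enforce Pauli exclusion. The sketch below is an illustration under stated assumptions, not the continuum algebra above: it takes two discrete fermionic modes with unit normalization and represents them on a 4-dimensional Fock space using the standard Jordan–Wigner construction. All names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def anticommutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] + yx[i][j] for j in range(len(xy))] for i in range(len(xy))]


I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # parity string that supplies the fermionic signs
lower = [[0, 1], [0, 0]]     # single-mode annihilator: |1> -> |0>

# Basis ordering |n1 n2> with index 2*n1 + n2; the vacuum is index 0.
b = [kron(lower, I2), kron(Z, lower)]
b_dag = [transpose(op) for op in b]   # real matrices, so transpose = adjoint

identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
zero4 = [[0] * 4 for _ in range(4)]
vacuum = [[1], [0], [0], [0]]         # column vector
```

Because the two creation operators anticommute, creating the two particles in the opposite order flips the sign of the state, and creating the same mode twice gives zero.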
2.2 Field Operators
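The electron/positron field is not a state. It is an operator-valued distribution on Fock space:
$$\hat\psi(x) = \sum_s \int \frac{d^3p}{(2\pi)^3\, 2E_{\mathbf{p}}} \left[ b(\mathbf{p}, s)\, u(\mathbf{p}, s)\, e^{-i p \cdot x} + d^\dagger(\mathbf{p}, s)\, v(\mathbf{p}, s)\, e^{+i p \cdot x} \right].$$
So $\hat\psi(x)$ is a linear combination of creation and annihilation operators: at each spacetime point it is a 4-component (in spinor index) operator-valued distribution. The spinors $u(\mathbf{p}, s)$ and $v(\mathbf{p}, s)$ are c-number coefficients (numerical 4-vectors), not operators or states.
The photon field $\hat{A}_\mu(x)$ admits an analogous expansion in terms of $a(\mathbf{k}, \lambda)$, $a^\dagger(\mathbf{k}, \lambda)$, and polarization vectors $\epsilon_\mu(\mathbf{k}, \lambda)$.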
2.3 Observables Built from Fields
Energy, momentum, charge, and number are all expressible as integrals of bilinears in the fields (after normal-ordering to remove the zero-point divergences):
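- Hamiltonian: $\hat{H} = \int d^3x\, {:}\hat{\bar\psi} \left( -i \boldsymbol{\gamma} \cdot \nabla + m \right) \hat\psi{:}$.
- Momentum: $\hat{\mathbf{P}} = \int d^3x\, {:}\hat\psi^\dagger (-i \nabla) \hat\psi{:}$.
- Electric charge: $\hat{Q} = -e \int d^3x\, {:}\hat\psi^\dagger \hat\psi{:}$ (with $e > 0$).
- Number operators: $\hat{N}_{e^-} = \sum_s \int \frac{d^3p}{(2\pi)^3\, 2E_{\mathbf{p}}}\, b^\dagger(\mathbf{p}, s)\, b(\mathbf{p}, s)$, etc.
All act on $\mathcal{F}$.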
3. The States
Once second quantization is in place, the "wavefunction" of pre-QFT QM is gone — its role is taken over by state vectors in Fock space, of which there are several useful kinds.
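| Kind of state | Construction | Lives in |
|---|---|---|
| Vacuum | $\lvert 0\rangle$, defined by $b\lvert 0\rangle = d\lvert 0\rangle = a\lvert 0\rangle = 0$ | $\mathcal{H}_0$ |
| One-electron momentum eigenstate | $\lvert \mathbf{p}, s\rangle = b^\dagger(\mathbf{p}, s)\lvert 0\rangle$ | $\mathcal{H}_1$ (electron) |
| One-positron | $d^\dagger(\mathbf{p}, s)\lvert 0\rangle$ | $\mathcal{H}_1$ (positron) |
| One-photon | $a^\dagger(\mathbf{k}, \lambda)\lvert 0\rangle$ | $\mathcal{H}_1$ (photon) |
| Two-electron Slater state | $b^\dagger(\mathbf{p}_1, s_1)\, b^\dagger(\mathbf{p}_2, s_2)\lvert 0\rangle$ (auto-antisymmetric) | $\mathcal{H}_2$ |
| Multi-particle scattering state | $b^\dagger \cdots d^\dagger \cdots a^\dagger \cdots \lvert 0\rangle$ | $\mathcal{H}_n$ |
| Single-particle wavepacket | $\sum_s \int \frac{d^3p}{(2\pi)^3 2E_{\mathbf{p}}}\, f(\mathbf{p}, s)\, b^\dagger(\mathbf{p}, s)\lvert 0\rangle$ | $\mathcal{H}_1$ |
| Coherent state of photons | $\exp\!\left(\alpha a^\dagger - \alpha^* a\right)\lvert 0\rangle$ | superposition across all photon-$\mathcal{H}_n$ |
| Bound state (e.g. positronium) | $\int f\, b^\dagger d^\dagger \lvert 0\rangle + \cdots$, but $f$ is non-perturbative | $\mathcal{F}$ |
| Mixed state (thermal, decoherent) | density matrix $\hat\rho$ on $\mathcal{F}$ | not a vector; see §3.3 |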
3.1 Single-Particle States Are Not Wavefunctions
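Single-particle states $|\mathbf{p}, s\rangle$ are not the wavefunctions of pre-QFT. They are vectors in the one-particle sector $\mathcal{H}_1 \subset \mathcal{F}$. The connection to Dirac wavefunctions is through matrix elements of the field operator:
$$\langle 0 | \hat\psi(x) | \mathbf{p}, s \rangle = u(\mathbf{p}, s)\, e^{-i p \cdot x}.$$
The right-hand side is a Dirac wavefunction, but it is the matrix element of an operator between two specific states, not the state itself.
The proper "single-particle wavefunctions" in the QFT framework are wavepackets:
$$|f\rangle = \sum_s \int \frac{d^3p}{(2\pi)^3\, 2E_{\mathbf{p}}}\, f(\mathbf{p}, s)\, |\mathbf{p}, s\rangle,$$
with $f(\mathbf{p}, s)$ playing the role of the momentum-space wavefunction. Its position-space transform is, in many ways, the closest analogue of the pre-QFT $\psi(\mathbf{x})$.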
3.2 The Interacting Vacuum
The interacting (physical) vacuum $|\Omega\rangle$ is not the same as the free Fock vacuum $|0\rangle$. Loop diagrams continually create and annihilate virtual particles, so $|\Omega\rangle$ has nonzero overlap with all $\mathcal{H}_n$ for the free Fock decomposition.
In fact, by Haag's theorem (see QFT/remarks.md), $|\Omega\rangle$ lives in a Hilbert space unitarily inequivalent to $\mathcal{F}$. This is a deep structural issue that textbook QFT typically glosses over by working with formal power series and renormalization, never with $|\Omega\rangle$ explicitly.
In perturbation theory, $|\Omega\rangle$ is reached from $|0\rangle$ by adiabatic switching of the interaction (Gell-Mann–Low), and most computations only ever require the correlator $\langle \Omega | T\{ \hat\phi(x_1) \cdots \hat\phi(x_n) \} | \Omega \rangle$, not the state itself.
3.3 Density Operators (Mixed States)
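Density operators $\hat\rho$ on $\mathcal{F}$ describe statistical mixtures: thermal QED ($\hat\rho = e^{-\beta \hat{H}} / Z$), decoherence, and partial-trace reduced states for entanglement studies. They are not vectors but positive-semidefinite, trace-class operators on $\mathcal{F}$ with $\operatorname{Tr} \hat\rho = 1$.
Expectation values are then $\langle \hat{O} \rangle = \operatorname{Tr}(\hat\rho\, \hat{O})$, generalizing the pure-state formula $\langle \psi | \hat{O} | \psi \rangle$.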
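The trace formula can be illustrated on the smallest possible example. The sketch below assumes a single fermionic mode of energy `energy`, so the Fock space is two-dimensional (empty or occupied), and a thermal density matrix proportional to $e^{-\beta H}$. This toy truncation is chosen for illustration and is not the full QED state space.

```python
from math import exp


def thermal_state(beta, energy):
    """Diagonal density matrix exp(-beta*H)/Z for one fermionic mode,
    in the basis (empty, occupied) with H = energy * N."""
    weights = [1.0, exp(-beta * energy)]
    partition = sum(weights)
    return [[weights[0] / partition, 0.0], [0.0, weights[1] / partition]]


def expectation(rho, operator):
    """Tr(rho * operator) for square matrices given as lists of rows."""
    size = len(rho)
    return sum(rho[i][k] * operator[k][i] for i in range(size) for k in range(size))


number_op = [[0.0, 0.0], [0.0, 1.0]]   # occupation number of the mode
beta, energy = 2.0, 1.5                # assumed inverse temperature and mode energy
rho = thermal_state(beta, energy)
mean_occupation = expectation(rho, number_op)
```

The mean occupation reproduces the Fermi–Dirac value $1 / (e^{\beta \varepsilon} + 1)$, which is the expected result for a thermal mixture of a single fermionic mode.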
4. What's "Wavefunction-Like" After Second Quantization
If you really want to recover something wavefunction-shaped from the Fock-space machinery, there are three places it shows up:
- Single-particle wavepacket coefficient $f(\mathbf{p}, s)$ or its Fourier transform $\tilde{f}(\mathbf{x}, s)$: same role as the QM wavefunction, but only for a one-particle sector.
- $n$-particle wavefunction $f(\mathbf{p}_1, s_1; \ldots; \mathbf{p}_n, s_n)$: recoverable in principle for a fixed-$n$ sector, but rarely useful in practice (relativistic kinematics, no fixed particle number under interactions).
- Mode-function coefficients $u(p, s)$, $v(p, s)$, $\epsilon^\mu(k, \lambda)$: the c-number coefficients in the field expansion. Often called "wavefunctions" loosely, but they are Lorentz-covariant coefficient objects, not states.
5. Inventory at a Glance
┌──────────────────────────────────────────────────────┐
│ Fock space 𝓕 = ⨁ₙ 𝓗ₙ │
│ │
│ States (vectors) Operators │
│ ────────────────── ────────── │
│ |0⟩ b, b†, d, d†, a, a† │
│ |p,s⟩ = b†|0⟩ ψ̂(x), Â_μ(x) │
│ |p₁,s₁;p₂,s₂⟩ Ĥ, P̂, Q̂, N̂ │
│ wavepackets │
│ coherent states │
│ bound states (built from a/a†) │
│ density matrices ρ̂ │
│ │
│ c-numbers (not on 𝓕) │
│ ───────────────────── │
│ u(p,s), v(p,s), ε^μ(k,λ) — mode coefficients │
│ ⟨0|ψ̂(x)|p,s⟩ = u(p,s)e^{-ip·x} — matrix element │
└──────────────────────────────────────────────────────┘
The two most common confusions:
- The field operator $\hat\psi(x)$ is not a state and not a wavefunction. It is an operator-valued distribution on $\mathcal{F}$.
- The mode-function spinors $u(p,s)$, $v(p,s)$ are c-numbers, not states. They appear as numerical coefficients multiplying $b$, $d^\dagger$ in the field expansion, and as matrix elements $\langle 0|\hat\psi(x)|p,s\rangle = u(p,s)\,e^{-ip\cdot x}$.
Particles as Excitations of Quantum Fields
"The electron is an excitation of the electron field" is one of the central slogans of QFT. This page unpacks what it means concretely — as a statement about specific vectors in Fock space and operators on it, not as philosophical metaphor.
The page complements QFT/fock-space-inventory.md (which catalogues the spaces, states, and operators) and QFT/preliminaries.md § States vs. Fields (which explains why QFT is operator-centric).
1. The Slogan, Decoded
The statement "the electron is an excitation of the electron field" packages three concrete claims:
- The field is the fundamental object, not the electron.
- The field has a vacuum (lowest-energy state) in which the field has zero (or minimum) excitation.
- A one-electron state is what you get by acting on the vacuum with the field's creation operator: $|\mathbf p,s\rangle = b^\dagger(\mathbf p,s)\,|0\rangle$.
The electron is not the field, and not a chunk carved out of the field. It is a state of the system — specifically, the next discrete eigenstate above the vacuum in the appropriate sector.
2. The Mechanical Analogy
The terminology comes directly from the mechanics of vibrating systems. A guitar string has:
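| Mechanical system | QFT analogue |
|---|---|
| String displacement field $y(x,t)$ | Quantum field operator $\hat\psi(x)$ |
| String at rest, $y = 0$ | Vacuum $\lvert 0\rangle$ |
| Discrete vibrational modes (fundamental + harmonics) | Single-particle modes $(\mathbf p, s)$ |
| Excited mode (one quantum of vibration) | One-particle state $b^\dagger\lvert 0\rangle$ |
| Two excitations of the same mode | $(a^\dagger)^2\lvert 0\rangle$ for bosons; $(b^\dagger)^2 = 0$ for fermions |
| Superposition of modes | Wavepacket $\int d^3p\,f(\mathbf p)\,b^\dagger(\mathbf p,s)\lvert 0\rangle$ |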
A "particle" in QFT is what an "excited mode" is for the string: a discrete eigenstate of the field's energy spectrum, parametrized by momentum and spin (or polarization). And just as a string can have one harmonic excited, two of the same mode (for bosons), or a superposition of different modes, the field can have zero, one, two, ... particles of various momenta — the Fock space structure of fock-space-inventory.md §1.
3. The Mathematical Content, Step by Step
3.1 There Is a Field, Not Particles
In QFT the fundamental object on the spacetime manifold is the field operator $\hat\psi(x)$, defined for every spacetime point $x$. It is a 4-component (in spinor index) operator-valued distribution on Fock space, not a wavefunction.
This is genuinely different from saying "there's a population of electrons each obeying the Dirac equation". There is one field, and the question "how many electrons are there?" is a question about what state of the field you are in.
3.2 The Vacuum Is the Field's Ground State
The vacuum $|0\rangle$ is the unique lowest-energy state of the field's Hamiltonian, defined by
$$b(\mathbf p,s)\,|0\rangle = d(\mathbf p,s)\,|0\rangle = a(\mathbf k,\lambda)\,|0\rangle = 0$$
for all modes of the electron, positron, and photon fields. The vacuum is not "no field"; it is "the field in its ground state", analogous to a string at rest. Its zero-point energy is nonzero (formally divergent, and removed by normal-ordering), and so are its quantum fluctuations (after suitable regularization).
3.3 A One-Particle State Is One Quantum of Excitation
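The simplest excitation is created by applying a creation operator:
$$|\mathbf p,s\rangle = b^\dagger(\mathbf p,s)\,|0\rangle$$
This vector lives in the one-particle sector $\mathcal{H}_1$. Its physical interpretation is given by the eigenvalues of the standard observables:
| Observable | Eigenvalue on $\lvert\mathbf p,s\rangle$ |
|---|---|
| 4-momentum $\hat P^\mu$ | $p^\mu = (E_{\mathbf p}, \mathbf p)$ |
| Number operator $\hat N$ | $1$ |
| Charge $\hat Q$ | $-e$ |
| Spin magnitude | $\tfrac12$ |
| Spin projection | $\pm\tfrac12$ depending on $s$ |
So $|\mathbf p,s\rangle$ has all the properties textbooks attribute to "an electron of momentum $\mathbf p$ and spin $s$". The electron is this state.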
3.4 Multi-Particle States Are Multiple Excitations
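Two electrons:
$$|\mathbf p_1,s_1;\mathbf p_2,s_2\rangle = b^\dagger(\mathbf p_1,s_1)\,b^\dagger(\mathbf p_2,s_2)\,|0\rangle$$
By the anticommutator $\{b^\dagger(\mathbf p_1,s_1),\,b^\dagger(\mathbf p_2,s_2)\} = 0$, swapping the two creation operators gives a sign flip: Pauli antisymmetry is automatic, with no need to antisymmetrize by hand. Three, four, ... electrons are obtained by applying more creation operators. Coherent states of photons are exponentials of $a^\dagger$ acting on $|0\rangle$, and so on.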
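The sign flip and the exclusion principle can be checked explicitly in a tiny model. The sketch below represents two fermionic modes by 4×4 matrices using the Jordan–Wigner construction; restricting to two discrete modes (instead of a continuum of momenta) and the names `b1`, `b2` are assumptions made for this illustration.

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
# Single-mode annihilation operator in the basis (|0>, |1>): b|1> = |0>.
b_single = np.array([[0.0, 1.0],
                     [0.0, 0.0]])

# Jordan-Wigner: the Z factor on mode 1 makes the two modes anticommute.
b1 = np.kron(b_single, I2)
b2 = np.kron(Z, b_single)
b1_dag, b2_dag = b1.T, b2.T  # real matrices, so dagger = transpose

def anticommutator(x, y):
    return x @ y + y @ x

vacuum = np.array([1.0, 0.0, 0.0, 0.0])  # |0,0>: annihilated by b1 and b2

one_particle = b1_dag @ vacuum
state_12 = b1_dag @ b2_dag @ vacuum
state_21 = b2_dag @ b1_dag @ vacuum

number_op = b1_dag @ b1 + b2_dag @ b2

creation_anticommute = np.allclose(anticommutator(b1_dag, b2_dag), 0.0)
canonical_norm = np.allclose(anticommutator(b1, b1_dag), np.eye(4))
sign_flip = np.allclose(state_12, -state_21)        # Pauli antisymmetry
exclusion = np.allclose(b1_dag @ b1_dag, 0.0)       # (b^dagger)^2 = 0
one_quantum = np.allclose(number_op @ one_particle, one_particle)
two_quanta = np.allclose(number_op @ state_12, 2.0 * state_12)
```

Nothing is antisymmetrized by hand: the sign in `state_21` comes entirely from the operator algebra.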
3.5 The Field Operator Creates and Destroys Excitations
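The field itself is a linear combination of operators that create and destroy excitations:
$$\hat\psi(x) = \sum_s\int\frac{d^3p}{(2\pi)^3\,2E_{\mathbf p}}\left[\,b(\mathbf p,s)\,u(p,s)\,e^{-ip\cdot x} + d^\dagger(\mathbf p,s)\,v(p,s)\,e^{+ip\cdot x}\right]$$
So acting with $\hat\psi(x)$ on a state can:
- annihilate an electron (via $b$), or
- create a positron (via $d^\dagger$),
both at the spacetime point $x$ (in a smeared, distributional sense).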
This is what makes interactions and locality possible: the QED interaction Hamiltonian is built from these creation/destruction operators evaluated at the same spacetime point.
4. Concrete Consequences
The slogan has real predictive content; it is not philosophy.
4.1 Indistinguishability Is Automatic
In pre-QFT QM you have to postulate that "all electrons are identical" and impose antisymmetrization by hand (QM Postulate 7). In QFT, since every electron is an excitation of the same field, indistinguishability is automatic: there is only one field, so its excitations have no individual identity beyond their quantum numbers.
The anticommutator $\{b^\dagger, b^\dagger\} = 0$ then gives Pauli exclusion automatically: two excitations with the same quantum numbers don't exist, since $\bigl(b^\dagger(\mathbf p,s)\bigr)^2 = 0$.
4.2 Variable Particle Number Is Automatic
A particular state of the electron field can have $N = 0, 1, 2, \dots$ electrons, or be a superposition of different values of $N$. There is no separate "$N$-electron theory" for each $N$; the same Hilbert space accommodates all of them. This is what makes scattering processes that change both particle species and particle number describable.
4.3 Antiparticles Arise Naturally
The same field has two kinds of mode operators: $b$ (annihilates electrons) and $d^\dagger$ (creates positrons). The positron is not a separate object; it is another excitation of the same electron field, in a different sector. The field operator's structure encodes both at once.
This is the modern resolution of the negative-energy puzzle that plagued Dirac's original single-particle theory (see QED/historical.md §3.1–§3.4).
4.4 Vacuum Fluctuations and Virtual Particles
Because $\hat\psi(x)$ does not commute with itself at different spacetime points (microcausality only enforces anticommutation at spacelike separation), even the vacuum has nonzero correlations:
$$\langle 0|\,T\{\hat\psi(x)\,\bar{\hat\psi}(y)\}\,|0\rangle = S_F(x-y) \neq 0$$
This is the Feynman propagator: the amplitude for a "virtual" excitation to propagate between $x$ and $y$. So the vacuum, far from being empty, contains nonzero correlations of the field across spacetime; heuristically, a sea of virtual electron–positron pairs blinking in and out of existence. Concretely, vacuum expectation values of field bilinears are nonzero (after regularization) and contribute to physical effects:
- The Casimir force between conducting plates.
- The Lamb shift in hydrogen (see QED/hydrogen.md §2 Level 2).
- Vacuum polarization modifying the photon propagator.
4.5 Locality of Interactions
Because $\hat\psi(x)$ is defined at every spacetime point, interaction terms like
$$\mathcal{L}_{\text{int}}(x) = -e\,\bar{\hat\psi}(x)\,\gamma^\mu\,\hat\psi(x)\,\hat A_\mu(x)$$
(with the overall sign set by the charge convention) are local: they couple field operators at the same spacetime point. This is the QED interaction (see QED/historical.md §5.1). Locality is what makes Feynman rules read off directly from the Lagrangian, and what guarantees relativistic causality (microcausality, QFT Postulate 6).
If the electron were a discrete object localized at a single point, you would need nonlocal interactions ("the electron at point $x$ felt the photon emitted at point $y$"). The field formulation makes the interaction a contact term at each spacetime point.
5. The Picture vs. the Alternatives
| Picture | Fundamental object | What is "an electron"? |
|---|---|---|
| Pre-QFT relativistic QM | Wavefunction $\psi(x)$ | The state itself |
| Schrödinger non-relativistic | Wavefunction $\psi(\mathbf x,t)$ | The state itself |
| Canonical QFT | Field operator $\hat\psi(x)$ on $\mathcal F$ | A vector in $\mathcal H_1 \subset \mathcal F$: one quantum of excitation of the electron field |
| Algebraic QFT | Net of local algebras $\mathcal A(\mathcal O)$ | A state on the algebra with the right superselection numbers |
| Path integral | Field configuration $\psi(x)$ | A pole in correlation functions |
In all the QFT pictures, the electron is a secondary, derived concept; the field is primary. This inverts the pre-QFT QM picture, where the wavefunction is the state of "the electron", and the electron is the primary object.
6. A Three-Line Summary
The field is the fundamental object — it exists everywhere in spacetime.
The vacuum is the field's ground state — what is "there" when nothing is there.
An excitation is one quantum of energy above the vacuum — what we call an "electron".
So "the electron is an excitation of the quantum field" unpacks to: the electron is a vector in Fock space obtained by acting on the vacuum with one $b^\dagger$, which the field operator $\hat\psi^\dagger(x)$ produces when smeared against a test function. Its energy, momentum, charge, and spin are eigenvalues of operators built bilinearly out of $b^\dagger$ and $b$. It has no further substance beyond that.
Observables of QFT
QFT predicts numbers measured by experiments. The range of such numbers is wider than scattering cross sections alone: particle physics, atomic physics, condensed-matter QFT, and cosmology each draw on different parts of the formalism. This page is the map of those observables: what each one is, what mathematical object inside the QFT machinery produces it, and where in this repository it lives (or should live).
Cross sections and decay rates are by far the most prominent and have their own dedicated pages (cross-sections.md, decay-rates.md); this doc places them inside the broader inventory. Per-theory observable inventories live in electroweak.md (EW-sector observables with overlap tags) and standard-model.md (cross-sector SM observables). The methodology layer — how a theory prediction becomes a published number — is in collider-measurements.md, and the direct-vs-inferred taxonomy (what the detector actually records vs. what is fit-extracted) is in direct-vs-inferred.md.
1. The Three Fundamental Sources
Almost every QFT-derived experimental number falls into one of three structural classes, each tied to a different feature of the underlying machinery:
| Class | Built from | Examples |
|---|---|---|
| (A) Squared S-matrix elements | On-shell amplitudes computed via Feynman rules and LSZ | Cross sections, decay rates, branching ratios, asymmetries, distribution shapes |
| (B) Pole positions and residues of correlators | Singularities of $\langle\Omega\vert T\{\cdots\}\vert\Omega\rangle$ in momentum space | Particle masses, bound-state energies, lifetimes (via complex poles), form factors |
| (C) Static / on-shell matrix elements | $\langle p',s'\vert\hat O(0)\vert p,s\rangle$ for a local operator $\hat O$ at zero momentum transfer | $g-2$, electric/magnetic moments, charge radii, weak nucleon couplings |
Most observables are reducible to one of these, possibly through a long but mechanical chain.
2. The Inventory
2.1 Cross sections (Class A)
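The dominant observable in collider physics. For $2 \to n$ scattering processes:
$$d\sigma = \frac{1}{4\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}}\;|\mathcal M|^2\,d\Phi_n$$
Subtypes: total, differential ($d\sigma/d\Omega$, $d\sigma/dt$, ...), inclusive (sum over final states), exclusive (specific final state). Full treatment in cross-sections.md.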
Where measured. Colliders (LHC, LEP, B-factories), fixed-target experiments (Rutherford, electron–nucleon DIS), neutrino experiments.
2.2 Decay rates and lifetimes (Class A)
For $1 \to n$ processes, with the analogous master formula
$$d\Gamma = \frac{1}{2M}\;|\mathcal M|^2\,d\Phi_n$$
where $M$ is the mass of the decaying particle. Conceptually distinct from cross sections: measured for unstable particles in their rest frame. Shares the same machinery; covered in decay-rates.md.
Where measured. Particle lifetimes ($B$-meson, $\tau$-lepton, muon, neutron), nuclear $\beta$-decay, cosmological evolution of unstable particles.
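The link between a total width and a lifetime is $\tau = \hbar/\Gamma$. The sketch below converts between the two in natural units. The function names are hypothetical, and the numerical inputs (the muon lifetime and the $Z$ width) are approximate reference values quoted from memory, so treat them as illustrative assumptions.

```python
HBAR_GEV_S = 6.582119569e-25  # hbar in GeV * s

def lifetime_from_width(width_gev):
    """Mean lifetime tau = hbar / Gamma, with Gamma the total width in GeV."""
    if width_gev <= 0:
        raise ValueError("width must be positive")
    return HBAR_GEV_S / width_gev

def width_from_lifetime(lifetime_s):
    """Total width Gamma = hbar / tau, in GeV."""
    if lifetime_s <= 0:
        raise ValueError("lifetime must be positive")
    return HBAR_GEV_S / lifetime_s

# Approximate inputs (assumed values, for illustration only).
TAU_MUON_S = 2.197e-6   # muon mean lifetime in seconds
GAMMA_Z_GEV = 2.495     # Z boson total width in GeV

gamma_muon = width_from_lifetime(TAU_MUON_S)   # a very narrow state
tau_z = lifetime_from_width(GAMMA_Z_GEV)       # a very short-lived state
```

The two examples span about nineteen orders of magnitude in width, which is why narrow states are usually quoted by lifetime and broad resonances by width.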
2.3 Branching ratios (Class A, derived)
Ratios of partial decay widths. Cancel many normalization uncertainties (overall coupling constants, wavefunction renormalizations) and are often the cleanest predictions from QFT for hadronic processes.
Examples. Neutron $\beta$-decay branching to $p\,e^-\,\bar\nu_e$ versus radiative modes.
2.4 Asymmetries (Class A, derived)
Designed-to-cancel ratios of differential cross sections / decay rates. They expose subtle effects (interferences, parity violation, $CP$ violation) by removing dominant common factors.
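| Asymmetry | Formula schema | Probes |
|---|---|---|
| Forward–backward | $A_{FB} = \dfrac{\sigma_F - \sigma_B}{\sigma_F + \sigma_B}$ | $Z$-mediated interference, e.g. in $e^+e^- \to \mu^+\mu^-$ |
| CP | $A_{CP} = \dfrac{\Gamma(X\to f) - \Gamma(\bar X\to\bar f)}{\Gamma(X\to f) + \Gamma(\bar X\to\bar f)}$ | $CP$-violating phase in the CKM matrix |
| Spin | $\sigma$ depends on initial-state polarizations | Spin structure of nucleon (DIS), electroweak parameters |
| Charge | $A_C = \dfrac{\sigma^+ - \sigma^-}{\sigma^+ + \sigma^-}$ | Differences between $u$ and $d$ quark distributions |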
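All of these asymmetries share the counting form $A = (N_+ - N_-)/(N_+ + N_-)$. A minimal sketch of how such a ratio and its statistical uncertainty could be computed from two event counts follows; the function name, the binomial error model, and the event counts are assumptions chosen for illustration, not data from any experiment.

```python
import math

def counting_asymmetry(n_plus, n_minus):
    """Return (A, sigma_A) for A = (N+ - N-) / (N+ + N-).

    The uncertainty assumes the total N is fixed and each event falls into
    the '+' class with a binomial probability, giving sqrt((1 - A^2) / N).
    """
    total = n_plus + n_minus
    if total <= 0:
        raise ValueError("need at least one event")
    asym = (n_plus - n_minus) / total
    sigma = math.sqrt((1.0 - asym * asym) / total)
    return asym, sigma

# Hypothetical counts: 5200 forward and 4800 backward events.
a_fb, a_fb_err = counting_asymmetry(5200, 4800)
```

Because luminosity and most efficiency factors multiply both counts equally, they cancel in the ratio, which is the practical reason asymmetries are so clean.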
2.5 Distribution shapes / kinematic spectra (Class A)
The shape of an observable distribution rather than its absolute rate:
- Invariant-mass spectra: peaks reveal new particles (for example $J/\psi$, $\Upsilon$, $Z$, $H$).
- Transverse-momentum distributions.
- Angular distributions of decay products (e.g. $H \to ZZ^* \to 4\ell$ decay angles probe spin/parity of the Higgs).
- Rapidity / pseudorapidity distributions.
These are differential cross sections viewed as functions; the underlying machinery is identical.
2.6 Bound-state energy levels (Class B)
Spectroscopy of bound systems (atoms, hadrons). The bound-state masses appear as poles in the appropriate channel of multi-particle correlators, computed via:
- Bethe–Salpeter equation — relativistic two-body bound-state framework.
- NRQED / NRQCD — effective theories integrating out the relativistic scale.
- Lattice QCD — Euclidean two-point functions in the meson / baryon channels.
Examples. Hydrogen Lamb shift (see QED/hydrogen.md), positronium / muonium / hydrogenic-ion spectra, hyperfine splittings, the entire light hadron spectrum (lattice QCD), quarkonium ($J/\psi$, $\Upsilon$ families) levels.
Conceptually distinct from cross sections. A bound state is not an asymptotic in/out state; it is a singularity of the off-shell propagator, not an entry in the $S$-matrix.
2.7 Static electromagnetic / weak properties (Class C)
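Properties of a particle as seen by an external static probe. Extracted from the on-shell vertex form factors in the parameterization
$$\langle p',s'|\,\hat J^\mu(0)\,|p,s\rangle = \bar u(p',s')\left[\gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}\,F_2(q^2)\right]u(p,s), \qquad q = p' - p,$$
with $\hat J^\mu$ in units of the particle's charge $Q$, so that $F_1(0) = 1$; $CP$-violating interactions add a further $CP$-odd form factor $F_3(q^2)$.
| Observable | Definition | Example |
|---|---|---|
| Charge | $Q\,F_1(0)$ | Universal: $-e$ for the electron, $+e$ for the proton |
| Anomalous magnetic moment | $a = (g-2)/2 = F_2(0)$ | $a_e$: most precise QFT prediction in physics, agreement at roughly 1 part in $10^{12}$ in $g$ |
| Charge radius | $\langle r^2\rangle = 6\,dG_E/dq^2$ at $q^2 = 0$, with $G_E = F_1 + \frac{q^2}{4m^2}F_2$ | Proton-radius puzzle (muonic vs electronic measurements) |
| Electric dipole moment (EDM) | $CP$-odd form factor $F_3(0)$ | Searches for new physics; extremely small in SM |
| Weak charges, axial coupling $g_A$ | Analogous extraction from weak current matrix elements | Neutron $\beta$-decay |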
These are all static (zero or near-zero momentum-transfer) limits of vertex functions, not cross sections.
2.8 IR-safe QCD observables (Class A, specialised)
In QCD, infrared and collinear divergences make the naive $|\mathcal M|^2$ for individual quark/gluon final states ill-defined. IR-safe observables are constructed to be insensitive to these divergences:
- Jet cross sections: defined via jet algorithms (anti-$k_T$, Cambridge–Aachen) on inclusive final states.
- Event shapes: thrust $T$, sphericity, $C$-parameter; characterize the "shape" of a multi-jet event.
- Inclusive structure functions $F_1$, $F_2$, $F_3$ in deep inelastic scattering.
- Total hadronic cross section $\sigma(e^+e^- \to \text{hadrons})$.
All are class A in the sense that they reduce to $|\mathcal M|^2$, but with sums-over-final-states designed for IR safety.
2.9 Particle masses and coupling constants (Class B + RG)
These are parameters of any specific QFT, but they are also observables — fixed by experiment to specify a renormalization scheme.
- Pole masses — from the location of the propagator pole. Universal but suffers from renormalon ambiguities at higher loops.
- $\overline{\mathrm{MS}}$ masses: running masses defined by minimal-subtraction renormalization. Scheme-dependent but theoretically clean.
- Running coupling constants $\alpha(\mu)$, $\alpha_s(\mu)$, $\sin^2\theta_W(\mu)$: extracted from fits to many observables across an energy range, satisfying RG equations.
- Mixing matrices — CKM (quark) and PMNS (neutrino) matrix elements. Extracted from a global fit to a large set of decay rates and asymmetries.
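To make "running" concrete, here is a sketch of the one-loop renormalization-group solution for the strong coupling. It is a deliberately crude illustration: the flavour number is held fixed at $n_f = 5$ (no threshold matching), higher loops are ignored, and the reference value $\alpha_s(M_Z) \approx 0.118$ with $M_Z \approx 91.19$ GeV is an assumed input rather than a fit result.

```python
import math

def alpha_s_one_loop(mu_gev, alpha_ref=0.118, mu_ref_gev=91.19, n_f=5):
    """One-loop running coupling:
        alpha_s(mu) = alpha_ref / (1 + alpha_ref * b0 * ln(mu^2 / mu_ref^2)),
    with b0 = (33 - 2 n_f) / (12 pi).  Fixed n_f, no flavour thresholds.
    """
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    log_ratio = math.log(mu_gev**2 / mu_ref_gev**2)
    denominator = 1.0 + alpha_ref * b0 * log_ratio
    if denominator <= 0:
        raise ValueError("scale is below the one-loop Landau pole")
    return alpha_ref / denominator

alpha_at_mz = alpha_s_one_loop(91.19)
alpha_at_10_gev = alpha_s_one_loop(10.0)
alpha_at_1_tev = alpha_s_one_loop(1000.0)
```

The coupling decreases with increasing scale (asymptotic freedom), so `alpha_at_10_gev > alpha_at_mz > alpha_at_1_tev`.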
2.10 Vacuum and topological observables (misc)
Less common in particle-physics tables, but real:
- Vacuum stability bounds — does the SM electroweak vacuum decay? Computed from the effective potential at large field values.
- Anomaly coefficients: the chiral anomaly has measurable consequences (e.g. the $\pi^0 \to \gamma\gamma$ rate).
- Theta-vacuum / instanton effects — measured via the neutron EDM bound; baryogenesis via electroweak sphalerons.
- CMB / cosmological observables — primordial scalar and tensor power spectra from inflationary QFT, BBN abundances from electroweak-era equilibrium.
2.11 Information-theoretic / quantum-correlation observables
A growing area in modern QFT, motivated by quantum information and condensed matter:
- Entanglement entropy of a spatial region (computed from the reduced density matrix of the QFT vacuum).
- Bell-type inequalities in particle physics (recent measurements in top-quark pair production at LHC).
- Out-of-time-order correlators (OTOCs): diagnostic of quantum chaos / scrambling, e.g. $\langle W(t)\,V(0)\,W(t)\,V(0)\rangle$.
These are not "cross sections" in any classical sense but are real QFT observables increasingly being measured.
3. Relating Each Class to the QFT Machinery
| Observable | Built from | Computed via | Measured at |
|---|---|---|---|
| Cross sections | $\lvert\mathcal M\rvert^2$ + phase space + flux | LSZ + Feynman rules + integration | Colliders, fixed targets |
| Decay rates | $\lvert\mathcal M\rvert^2$ + phase space + $1/(2M)$ | Same | Lifetime measurements |
| Branching ratios | Ratios of $\Gamma$'s | Same, ratios cancel normalizations | Decay experiments |
| Asymmetries | Ratios of differential observables | Same | Collider, parity / $CP$ experiments |
| Distribution shapes | Differential cross sections | Same | Collider event reconstruction |
| Bound-state energies | Poles of multi-particle correlators | Bethe–Salpeter, NRQED, lattice | Spectroscopy |
| $g-2$, form factors | Vertex function $\Gamma^\mu(p',p)$ at on-shell kinematics | Loop diagrams + LSZ amputation of external legs | Penning traps, $g-2$ rings, scattering with low $q^2$ |
| IR-safe QCD observables | $\lvert\mathcal M\rvert^2$ summed over IR-safe final-state classes | Jet algorithms, factorization theorems | Colliders |
| Masses, couplings | Propagator poles, vertex functions, RG running | Renormalization conditions + global fits | All of the above (parametric fits) |
| Vacuum / topological | Effective potential, anomaly coefficients, instantons | Various non-perturbative methods | Cosmology, neutron EDM, $\pi^0 \to \gamma\gamma$ |
4. Where each lives in this repository
| Observable class | Existing coverage | Status |
|---|---|---|
| Cross sections | cross-sections.md | ✅ Full treatment |
| Decay rates | decay-rates.md | ✅ Master formula + worked muon example |
| Electroweak observables (full inventory) | electroweak.md | ✅ Per-class with overlap tags |
| Standard Model cross-sector observables | standard-model.md | ✅ Global EW fit, CKM-UT, GIM, lepton universality |
| Compton scattering (worked example) | QED/compton.md | ✅ End-to-end calculation |
| Hydrogen levels (bound-state example) | QED/hydrogen.md | ✅ Bethe–Salpeter sketch |
| $g-2$, Lamb shift, hyperfine | QED precision tests, QED/hydrogen.md | ⚠️ Mentioned, not derived |
| QED observables (full inventory) | None | ❌ Gap (would parallel electroweak.md) |
| QCD observables (full inventory) | None | ❌ Gap (jets, DIS structure fns, $\alpha_s$, lattice spectrum) |
| Form factors framework | None | ❌ Gap |
| Vacuum / topological | None | ❌ Gap |
| Information / Bell / entanglement | None | ❌ Gap |
5. The Logical Flow Across Observables
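The chain that connects QFT to any observable: the Lagrangian fixes the correlators of local operators; LSZ reduction turns correlators into S-matrix elements, and hence cross sections and decay rates (Class A); the poles and residues of correlators give masses and bound-state energies (Class B); and on-shell vertex functions give static properties (Class C).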
So every observable in QFT goes through correlators of local operators. The differences between cross sections, bound-state energies, and form factors are differences in what part of the correlator structure you extract — not in the underlying machinery.
6. Pointers
- cross-sections.md — the master formula, flux factor, Lorentz-invariant phase space, Mandelstam variables, optical theorem, units (barns).
- decay-rates.md — the master formula, lifetimes, branching ratios, the muon-lifetime worked example.
- direct-vs-inferred.md — the taxonomy of what is directly measured by detectors (track hits, calorimeter deposits, magnetic-field curvature, timing, polarization) vs. what is fit-extracted (essentially everything else in particle physics); per-observable inference-distance table; the "which mass?" renormalization-scheme ambiguity.
- collider-measurements.md — the theory → variable → reconstruction → result pipeline, with 8 worked case studies ( scan, transverse mass, peak, Higgs discovery, cross section, branching ratio, asymmetry, coupling modifier).
- electroweak.md: per-class inventory of EW observables (gauge-boson masses, $Z$-pole asymmetries, $\sin^2\theta_W$, CKM elements, Higgs sector, neutrinos), tagged by overlap with QED/QCD/SM.
- standard-model.md — cross-sector SM observables (CKM unitarity triangle, global EW fit, GIM, lepton universality, cosmological constraints).
- LSZ Reduction Formula — extracts S-matrix elements from correlators.
- QED/compton.md — worked tree-level cross-section calculation.
- QED/hydrogen.md — bound-state observables.
- QED § Successes and Tested Predictions — historical highlights of QED-derived observables.
Cross Sections
QFT predicts numbers measurable in experiments via two parallel observables: cross sections for scattering processes and decay rates for unstable particles. Both are constructed in a uniform way from the invariant amplitude $\mathcal M$ produced by Feynman-rule calculations (see QED/historical.md §5.2 and QED Step 6).
Scope. This page focuses on cross sections. The companion observable, decay rates, shares the same kinematic ingredients but has its own master formula and conceptual status; that material lives in decay-rates.md. For the broader landscape (branching ratios, asymmetries, bound-state energies, form factors, IR-safe QCD observables, masses, couplings, etc.) see observables.md, which places this page inside a wider inventory and explains how each observable type relates to the underlying $S$-matrix / correlator / vertex-function machinery.
This page collects the definitions, the formulas, and the kinematic ingredients (flux factor, Lorentz-invariant phase space) that the rest of the notes use without explanation. It complements QFT/preliminaries.md § S-Matrix and Cross Sections, which only sketches them in one line.
1. What Is a Cross Section?
1.1 What Is a Cross Section?
A cross section is, in the original sense, the effective transverse area that one target particle presents to an incoming beam: a beam particle that crosses this area triggers a scattering reaction; one that misses it does not. It has dimensions of area — hence the name.
This is the picture Rutherford used in 1911 to interpret his alpha-on-gold-foil scattering experiments, and it is the historical origin of the term. We take it as the primary definition below; the per-particle probability (ii), the operational rate formula (iii), the quantum amplitude formula (iv), and the S-matrix expression (v) are then all derived from it (or from its quantum generalization), in the chain
$$\text{(i)} \;\to\; \text{(ii)} \;\to\; \text{(iii)} \;\leftrightarrow\; \text{(iv)} \;\leftarrow\; \text{(v)}$$
(i) Geometric / effective-area definition — primary
For a single target particle and a single beam particle approaching with impact parameter $\mathbf b$, define $\sigma$ as the area in the plane perpendicular to the beam such that the beam particle triggers the reaction if and only if it passes through that area:
$$\sigma = \int d^2b\;\Theta(\mathbf b), \qquad \Theta(\mathbf b) = \begin{cases} 1 & \text{if the reaction occurs at impact parameter } \mathbf b, \\ 0 & \text{otherwise.} \end{cases}$$
Differential version: the part of this area that sends the outgoing particle into the solid angle $d\Omega$ around a given direction defines the differential cross section $d\sigma/d\Omega$, with $\sigma = \int d\Omega\,\frac{d\sigma}{d\Omega}$.
For a hard sphere of radius $R$, this is exact and literal: any beam particle with impact parameter $b < R$ hits the sphere, any with $b > R$ misses, so $\sigma = \pi R^2$. For other potentials (Coulomb, Yukawa, ...), the picture remains conceptually correct if you allow $\sigma$ to depend on the reaction (different final states correspond to different effective areas) and on the energy (slower particles spend more time near the target and have larger effective areas).
This was Rutherford's working concept and remains the cleanest mental picture. It also fixes the units of $\sigma$ (area) and motivates the unit name barn (§1.2 below).
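The effective-area definition can be tested numerically for the hard sphere: fire beam particles at uniformly random transverse positions and count the fraction that hit. The sketch below is a Monte Carlo estimate under assumed, arbitrary values for the sphere radius, the size of the sampled square, and the sample count.

```python
import numpy as np

def hard_sphere_cross_section_mc(radius, half_width, n_samples, seed=0):
    """Estimate sigma as (hit fraction) x (sampled transverse area).

    Beam particles are placed uniformly in the square [-half_width, half_width]^2
    of the plane perpendicular to the beam; a hit means impact parameter b < radius.
    """
    if half_width < radius:
        raise ValueError("sampled square must contain the target disk")
    rng = np.random.default_rng(seed)
    xy = rng.uniform(-half_width, half_width, size=(n_samples, 2))
    impact_parameter = np.hypot(xy[:, 0], xy[:, 1])
    hit_fraction = np.mean(impact_parameter < radius)
    sampled_area = (2.0 * half_width) ** 2
    return hit_fraction * sampled_area

R = 1.0  # sphere radius, arbitrary length units
sigma_mc = hard_sphere_cross_section_mc(R, half_width=2.0, n_samples=200_000)
sigma_exact = np.pi * R**2
```

With these settings the statistical error on `sigma_mc` is of order 0.01 in units of $R^2$, so the estimate should agree with $\pi R^2$ to within a few percent at worst.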
(ii) Single-target (per-particle) reformulation — derived from (i)
Consider a single beam particle traversing a thin slab of target material of thickness $dx$, transverse area $A$, and target density $n_t$. The slab contains $n_t\,A\,dx$ target particles, each presenting effective area $\sigma$ to the beam (definition (i)). The total effective scattering area in the slab is
$$A_{\text{eff}} = \sigma\,n_t\,A\,dx$$
A single beam particle entering the slab at a random transverse position (within $A$) scatters iff it crosses one of these effective disks, so its scattering probability is
$$dP = \frac{A_{\text{eff}}}{A} = n_t\,\sigma\,dx$$
Both $A$ and the slab transverse geometry have cancelled: the probability depends only on $n_t$, $\sigma$, and $dx$.
Integrating along a path of length $L$, the fraction of beam particles that have scattered is $1 - e^{-n_t\sigma L}$, and the mean free path is $\lambda = 1/(n_t\,\sigma)$.
(Some texts take this as the primary definition of $\sigma$: declare the cross section to be whatever number makes $dP = n_t\,\sigma\,dx$ hold for a thin slab. That gives the same $\sigma$ as (i), with the advantage of not requiring the literal "effective area" picture for soft potentials.)
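The step from the thin-slab probability to exponential attenuation can be checked by stacking many thin slabs and multiplying survival probabilities. In the sketch below the density, cross section, and path length are hypothetical round numbers, and the function names are not from the original notes.

```python
import math

def scattered_fraction(n_t, sigma, length):
    """Closed form: fraction of beam particles scattered after path `length`."""
    return 1.0 - math.exp(-n_t * sigma * length)

def mean_free_path(n_t, sigma):
    """lambda = 1 / (n_t sigma)."""
    return 1.0 / (n_t * sigma)

def scattered_fraction_by_slabs(n_t, sigma, length, n_slabs):
    """Same quantity from n_slabs thin slabs, each with survival 1 - n_t sigma dx."""
    dx = length / n_slabs
    survival = (1.0 - n_t * sigma * dx) ** n_slabs
    return 1.0 - survival

# Hypothetical inputs in cm-based units.
N_T = 1.0e22      # targets per cm^3
SIGMA = 1.0e-24   # cm^2 (one barn)
LENGTH = 50.0     # cm, i.e. half a mean free path

exact = scattered_fraction(N_T, SIGMA, LENGTH)
stacked = scattered_fraction_by_slabs(N_T, SIGMA, LENGTH, n_slabs=100_000)
mfp_cm = mean_free_path(N_T, SIGMA)
```

As the slabs get thinner, `stacked` converges to `exact`, which is the discrete version of integrating $dP = n_t\,\sigma\,dx$ along the path.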
(iii) Operational (rate) reformulation — derived from (ii)
For a beam of particles (number density $n_b$, velocity $v$) impinging on the slab (target density $n_t$, transverse area $A$, thickness $dx$), multiply the per-particle probability (ii) by the rate at which beam particles enter the slab. Throughout, $N$ denotes the cumulative number of scattering events of the reaction considered; $dN/dt$ is the event rate.
- Beam particles entering the slab per unit time. The number of beam particles crossing the slab face per unit time is $\Phi A = n_b\,v\,A$, where $\Phi = n_b v$ is the incident flux (beam particles per unit area per unit time).
- Multiply by per-particle probability. By linearity of expectation, the expected event rate is (attempts per unit time) × (probability per attempt):
  $$\frac{dN}{dt} = (n_b\,v\,A)\,(n_t\,\sigma\,dx)$$
  The dice analogy: "if you roll $n$ dice per second and each scatters with probability $p$, you get $np$ events per second on average."
- Divide by slab volume $V = A\,dx$:
  $$\boxed{\frac{dN}{dV\,dt} = n_b\,n_t\,v\,\sigma}$$
  The slab geometry has cancelled: $A$ and $dx$ both drop out. Equivalently, in terms of the incident flux $\Phi$ and target areal density $n_t\,dx$ (targets per unit transverse area along the beam),
  $$\frac{1}{A}\frac{dN}{dt} = \Phi\,(n_t\,dx)\,\sigma$$
This is the experimentalist's working formula: measure $dN/dt$, divide by $\Phi\,A\,n_t\,dx$, and what comes out is $\sigma$. Many practical texts take this rate formula as the operational definition of $\sigma$ instead of (i); the two are equivalent.
Each ingredient has a clear physical role:
- $n_b$: more beam particles per unit volume → more chances to scatter.
- $n_t$: more targets per unit volume → more targets to scatter from.
- $v$: faster relative motion → more beam particles encounter each target per unit time.
- $\sigma$: bigger effective area → each encounter is more likely to react.
Relativistic generalization
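For relativistic beams the simple product $n_b\,n_t\,v$ is not Lorentz-invariant, but the rate density $dN/(dV\,dt)$ is (events per spacetime volume). Labelling the beam particle 1 and the target particle 2, the Lorentz-invariant generalization replaces $n_b\,n_t\,v$ with the Møller flux
$$n_b\,n_t\,v_{\text{Møl}}, \qquad v_{\text{Møl}} = \sqrt{(\mathbf v_1 - \mathbf v_2)^2 - (\mathbf v_1\times\mathbf v_2)^2} = \frac{\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}}{E_1\,E_2} \quad (c = 1),$$
which reduces to $n_b\,n_t\,|\mathbf v_1 - \mathbf v_2|$ in the non-relativistic limit and gives the boxed formula in the lab frame with $\mathbf v_2 = 0$.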
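The Møller velocity has two standard equivalent forms, one in terms of the two velocities and one in terms of four-momentum invariants. The sketch below evaluates both for arbitrary non-collinear momenta and checks the lab-frame limit. The masses and momenta are arbitrary assumed values in units with $c = 1$, and the helper names are hypothetical.

```python
import numpy as np

def four_momentum(mass, p3):
    """On-shell four-momentum (E, px, py, pz) in units with c = 1."""
    p3 = np.asarray(p3, dtype=float)
    energy = np.sqrt(mass**2 + p3 @ p3)
    return np.concatenate(([energy], p3))

def minkowski_dot(a, b):
    """Mostly-minus Minkowski product: a0*b0 - a_vec . b_vec."""
    return a[0] * b[0] - a[1:] @ b[1:]

def moller_velocity_from_velocities(p1, p2):
    """sqrt(|v1 - v2|^2 - |v1 x v2|^2), with v = p_vec / E."""
    v1, v2 = p1[1:] / p1[0], p2[1:] / p2[0]
    rel = v1 - v2
    cross = np.cross(v1, v2)
    return np.sqrt(rel @ rel - cross @ cross)

def moller_velocity_from_invariants(p1, p2):
    """sqrt((p1.p2)^2 - m1^2 m2^2) / (E1 E2)."""
    m1_sq, m2_sq = minkowski_dot(p1, p1), minkowski_dot(p2, p2)
    return np.sqrt(minkowski_dot(p1, p2) ** 2 - m1_sq * m2_sq) / (p1[0] * p2[0])

# Arbitrary non-collinear kinematics (assumed values, GeV-like units).
p1 = four_momentum(0.938, [3.0, 0.5, 0.0])
p2 = four_momentum(0.938, [-1.0, 0.2, 0.4])
v_moller_a = moller_velocity_from_velocities(p1, p2)
v_moller_b = moller_velocity_from_invariants(p1, p2)

# Lab frame: target at rest, so the Moller velocity is just the beam speed.
p2_rest = four_momentum(0.938, [0.0, 0.0, 0.0])
v_lab = moller_velocity_from_invariants(p1, p2_rest)
beam_speed = np.linalg.norm(p1[1:]) / p1[0]
```

Note that $v_{\text{Møl}}$ itself is frame-dependent; it is the combination with the two densities that is invariant.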
Worked example: numerical estimate
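Suppose a 1 mA proton beam (about $6\times10^{15}$ protons per second, i.e. $\Phi \approx 6\times10^{15}\ \text{cm}^{-2}\,\text{s}^{-1}$ for a focused beam of $1\ \text{cm}^2$ cross-section) hits a 1 mm thick lead target ($n_t \approx 3.3\times10^{22}\ \text{cm}^{-3}$). The geometric cross section of a Pb nucleus is $\sigma \approx \pi R^2 \approx 1.6\ \text{b} = 1.6\times10^{-24}\ \text{cm}^2$ (for $R \approx 7.1\ \text{fm}$). Then
- Target areal density: $n_t\,dx \approx 3.3\times10^{22}\ \text{cm}^{-3} \times 0.1\ \text{cm} = 3.3\times10^{21}\ \text{cm}^{-2}$.
- Scattering probability per beam particle: $n_t\,\sigma\,dx \approx 3.3\times10^{21} \times 1.6\times10^{-24} \approx 5\times10^{-3}$.
- Event rate: $dN/dt \approx 6\times10^{15}\ \text{s}^{-1} \times 5\times10^{-3} \approx 3\times10^{13}\ \text{s}^{-1}$.
So at the geometric cross-section level, about $3\times10^{13}$ scattering events per second occur. Cross sections for specific reactions (e.g. nuclear excitations, deep-inelastic scattering, particular final states) are typically much smaller, picobarn to femtobarn, yielding correspondingly fewer events.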
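The arithmetic of this estimate can be scripted end to end. The sketch below makes its inputs explicit; the lead density and molar mass, the nuclear radius rule $R \approx 1.2\,A^{1/3}$ fm with $A = 208$, and the variable names are assumptions of this example and may differ slightly from the values used in the text.

```python
import math

E_CHARGE_C = 1.602176634e-19   # elementary charge in coulombs
N_AVOGADRO = 6.02214076e23     # particles per mole

# Assumed inputs.
beam_current_a = 1.0e-3        # 1 mA of protons
thickness_cm = 0.1             # 1 mm lead target
density_g_cm3 = 11.34          # lead mass density (approximate)
molar_mass_g = 207.2           # lead molar mass (approximate)
r0_fm, mass_number = 1.2, 208  # nuclear radius rule R = r0 * A^(1/3)

protons_per_s = beam_current_a / E_CHARGE_C
n_t = density_g_cm3 / molar_mass_g * N_AVOGADRO    # nuclei per cm^3
areal_density = n_t * thickness_cm                 # nuclei per cm^2

radius_fm = r0_fm * mass_number ** (1.0 / 3.0)
sigma_cm2 = math.pi * radius_fm**2 * 1.0e-26       # 1 fm^2 = 1e-26 cm^2
sigma_barn = sigma_cm2 / 1.0e-24

scatter_probability = areal_density * sigma_cm2    # thin-target limit
event_rate = protons_per_s * scatter_probability   # events per second
```

Because `scatter_probability` comes out well below 1, the thin-target (single-scattering) formula is self-consistent for these inputs.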
Caveats
- Single-scattering assumption. The formula assumes each beam particle scatters at most once. For thick or dense targets ($n_t\,\sigma\,L \gtrsim 1$) one must use the exponential attenuation from (ii) instead, and account for multiple scattering.
- Reaction-specific $\sigma$. A given collision can produce many different final states $X$, each with its own $\sigma_X$. The total cross section $\sigma_{\text{tot}} = \sum_X \sigma_X$ counts any reaction.
- Coherent vs. incoherent. The formula assumes incoherent scattering off independent targets. Coherent processes (e.g. Bragg diffraction off a crystal) require summing amplitudes, not probabilities, and produce different angular distributions.
(iv) Quantum / amplitude reformulation — derived from (v); the working formula
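For potentials (Coulomb, Yukawa, ...) where there is no literal "area", the cross section is defined by the rate formula (iii) rather than by the geometric picture (i), and computed from the quantum-mechanical scattering amplitude $\mathcal M$. Extracting $\mathcal M$ from the off-diagonal $S$-matrix elements via $\langle f|S - 1|i\rangle = i\,(2\pi)^4\delta^4(p_f - p_i)\,\mathcal M$, the cross section in (v) below takes the form:
$$d\sigma = \frac{1}{4\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}}\;|\mathcal M|^2\,d\Phi_n$$
with the flux factor $4\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}$ and the Lorentz-invariant phase space $d\Phi_n$ (which includes the momentum-conserving delta function) defined in §2 below.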
This is the working formula used throughout perturbative QFT — nearly all standard graduate textbooks (Peskin–Schroeder, Schwartz, Srednicki, Mandl–Shaw, Bjorken–Drell) take this as the quoted definition, with (v) shown only as motivation. The numerical agreement between this computed value and an experimental measurement via (iii) is what makes QFT predictive.
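As a small end-to-end use of the working formula, the sketch below integrates a known amplitude over the two-body phase space and compares with the textbook closed form. It assumes the tree-level, photon-exchange-only, massless-limit result for $e^+e^- \to \mu^+\mu^-$, with spin-averaged $|\mathcal M|^2 = e^4(1 + \cos^2\theta)$ and the centre-of-mass expression $d\sigma/d\Omega = |\mathcal M|^2/(64\pi^2 s)$; the constants are approximate and the function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.036            # fine-structure constant (approximate)
GEV2_TO_NB = 0.3893794e6         # (hbar c)^2 in nb * GeV^2 (approximate)

def amplitude_squared(cos_theta):
    """Spin-averaged |M|^2 = e^4 (1 + cos^2 theta), massless tree level."""
    e_squared = 4.0 * math.pi * ALPHA
    return e_squared**2 * (1.0 + cos_theta**2)

def total_cross_section_gev(sqrt_s_gev, n_steps=2000):
    """Integrate dsigma/dOmega = |M|^2 / (64 pi^2 s) over the sphere (midpoint rule).

    The azimuthal integral is trivial and contributes a factor 2 pi.
    Returns sigma in GeV^-2.
    """
    s = sqrt_s_gev**2
    d_cos = 2.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        cos_theta = -1.0 + (i + 0.5) * d_cos
        total += amplitude_squared(cos_theta) / (64.0 * math.pi**2 * s) * 2.0 * math.pi * d_cos
    return total

def closed_form_gev(sqrt_s_gev):
    """sigma = 4 pi alpha^2 / (3 s), in GeV^-2."""
    return 4.0 * math.pi * ALPHA**2 / (3.0 * sqrt_s_gev**2)

sigma_numeric = total_cross_section_gev(10.0)
sigma_closed = closed_form_gev(10.0)
sigma_nb = sigma_closed * GEV2_TO_NB   # convert GeV^-2 to nanobarns
```

At $\sqrt s = 10$ GeV this gives a cross section a little under 1 nb, which illustrates the scale of typical electromagnetic processes at that energy.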
(iv) ↔ (iii): how the chains meet
The chain (i) → (ii) → (iii) is purely classical counting; the chain (v) → (iv) is purely quantum (extracting $\mathcal M$ from $S$). What links them is the observation that the quantum-side rate density turns out to have exactly the same functional form as (iii): bilinear in the beam density $n_b$ and the target density $n_t$, proportional to the relative velocity $v$, with a kinematic factor of dimensions area. This is not a derivation of (iii) from (iv) (or vice versa); it is a non-trivial structural compatibility check that allows the quantum coefficient to be defined as a cross section.
Three steps make the compatibility manifest.
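1. Transition probability per particle pair. The S-matrix element between asymptotic momentum eigenstates contains a momentum-conserving delta function $(2\pi)^4\delta^4(p_f - p_i)$. Squaring it produces $(2\pi)^4\delta^4(p_f - p_i)\cdot VT$ in the long-time, large-volume limit (one delta function evaluates to the spacetime volume $VT$ at zero argument). With the standard relativistic normalization $\langle\mathbf p|\mathbf p'\rangle = (2\pi)^3\,2E_{\mathbf p}\,\delta^3(\mathbf p - \mathbf p')$ and $\langle\mathbf p|\mathbf p\rangle = 2E_{\mathbf p}V$, the transition probability per unit time per particle pair (beam particle 1, target particle 2) is
   $$\frac{dP}{dt} = \frac{1}{V}\,\frac{|\mathcal M|^2}{4E_1E_2}\,(2\pi)^4\delta^4(p_f - p_i)\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}$$
   The $V$'s from the delta-squaring and from the state norms partially cancel; one residual $1/V$ remains.
2. Sum over all pairs in the box. A box of volume $V$ contains $n_bV$ beam particles and $n_tV$ target particles, giving $n_b\,n_t\,V^2$ pairs (assuming each pair scatters independently: the perturbative single-pair regime). Multiplying:
   $$\frac{dN}{dt} = n_b\,n_t\,V^2\,\frac{dP}{dt} = n_b\,n_t\,V\,\frac{|\mathcal M|^2}{4E_1E_2}\,(2\pi)^4\delta^4(p_f - p_i)\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}$$
3. Divide by box volume to get the rate density. Dividing by $V$ and integrating over the final-state phase space (the delta function fixes one momentum):
   $$\boxed{\frac{dN}{dV\,dt} = n_b\,n_t\,\frac{1}{4E_1E_2}\int|\mathcal M|^2\,(2\pi)^4\delta^4(p_f - p_i)\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}}$$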
This is a derived result from the quantum side, with no input from (iii). What's striking about it: it is bilinear in $n_b$ and $n_t$, with a kinematic prefactor depending only on the two beam-particle energies and the matrix element. That bilinearity is not assumed; it follows from the standard normalization of momentum-eigenstate Fock states and from each beam particle scattering off one target at a time. (If the rate density had turned out to depend on $V$ or $T$, the cross-section interpretation would not be available.)
Now compare with (iii): $dN/(dV\,dt) = n_b\,n_t\,v\,\sigma$. The functional forms match. We define the cross section in the quantum framework as the coefficient that makes the matching work:
$$\sigma = \frac{1}{4E_1E_2\,v}\int|\mathcal M|^2\,(2\pi)^4\delta^4(p_f - p_i)\prod_f\frac{d^3p_f}{(2\pi)^3\,2E_f}$$
The combination $4E_1E_2\,v$ in the denominator is exactly the lab-frame flux factor (which generalizes covariantly to $4\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}$, the Lorentz-invariant Møller flux of §2.3). Absorbing the delta function into the phase-space integral recovers (iv) in its standard form ✓.
Two things to take away:
- The structural fact that the quantum rate density is bilinear in densities, proportional to $v$, and has a kinematic factor of dimensions area is derived from the S-matrix formalism (steps 1–3). It is a non-trivial check; it could have failed for a non-perturbative or coherent process.
- The identification of that kinematic factor as the cross section is a definition — one chosen so that the quantum framework reproduces the classical counting picture (iii). It is what makes the symbol in (iv) refer to the same physical quantity as in (iii).
So the quantum chain (v) → (iv) and the classical chain (i) → (ii) → (iii) do not derive each other; they meet at this definitional matching, made possible by the structural compatibility shown in steps 1–3.
What is being postulated
The bridge above relies on two unstated assumptions, both of which are part of QFT's foundational postulate set rather than theorems derivable within the S-matrix formalism:
-
Born rule for scattering. is interpreted as the probability of a transition between asymptotic states. This is QM Postulate 3 applied to scattering — the same postulate, with and the bookkeeping of delta-function-normalized states. The non-relativistic single-particle analog is Fermi's Golden Rule , which has the same (squared matrix element) × (final-state density) × (kinematic prefactor) structure as . See QFT/remarks.md § Born Rule for Scattering for the full QM↔QFT comparison and the technical subtlety of non-normalizable asymptotic states. Without this postulate, is just an algebraic object with no physical meaning.
- Quantum-classical rate correspondence. The S-matrix transition rate per pair (computed in an idealized infinite-volume, infinite-time limit) equals the classical event rate per pair that an experimentalist measures with a real detector of finite size operating for finite time. This is a correspondence-principle postulate: it asserts that the asymptotic-state idealization captures what real apparatus does, modulo finite-size and resolution effects.
Without these two, the boxed quantum formula in step 3 would be a formal expression with no connection to laboratory measurements, and the matching with (iii) would not exist as anything more than dimensional coincidence. With them, (iv) becomes a prediction of the experimentally-measured (iii) — and the agreement of those two numbers, on which all of perturbative-QFT phenomenology depends, becomes the empirical content of the theory.
(v) S-matrix definition — quantum-foundational form
In axiomatic / S-matrix-based QFT, is defined in terms of the S-matrix acting between asymptotic in/out Fock states (see QFT/postulates.md Postulate 10). The cross section is what falls out of the squared modulus of after dividing by the natural normalization of momentum-eigenstate Fock states (which produces the flux factor ) and integrating over the final-state phase space:
Factoring out the spacetime-translation delta function and packaging the residue into the invariant amplitude recovers (iv) as the practical form.
This is the form used in foundational / axiomatic texts (Weinberg QFT Vol. 1, Itzykson–Zuber, Streater–Wightman, Haag), where the S-matrix is taken as the primitive quantum object and (iv) is derived from it. The two are mathematically equivalent; the choice between them is a matter of which is taken as the definition vs. the working formula.
All five describe the same object. (i) is taken as the classical foundational definition, following Rutherford's original geometric picture. (ii) is derived from (i) by counting targets in a thin slab. (iii) is then derived from (ii) by multiplying by the rate of beam particles entering the slab. On the quantum side, (v) is the most foundational form (S-matrix on asymptotic Fock states), and (iv) is the working formula derived from it by extracting the invariant amplitude $\mathcal{M}$. Numerical agreement between (iv)/(v) on the quantum side and the classical chain (i)→(ii)→(iii) on the experimental side is the empirical content of QFT predictions.
1.2 Units
Cross sections have units of area. Common units in particle physics:
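| Unit | Value |
|---|---|
| barn (b) | $10^{-28}\,\text{m}^2 = 10^{-24}\,\text{cm}^2$ |
| millibarn (mb) | $10^{-27}\,\text{cm}^2$ |
| microbarn (μb) | $10^{-30}\,\text{cm}^2$ |
| nanobarn (nb) | $10^{-33}\,\text{cm}^2$ |
| picobarn (pb) | $10^{-36}\,\text{cm}^2$ |
| femtobarn (fb) | $10^{-39}\,\text{cm}^2$ |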
The barn is roughly the geometric cross-section of a heavy nucleus ("as big as a barn door" — wartime Manhattan-Project lab humor).
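In natural units ($\hbar = c = 1$), cross sections have dimensions of $[\text{energy}]^{-2}$. The conversion is $1\,\text{GeV}^{-2} = (\hbar c)^2/\text{GeV}^2 \approx 0.3894\,\text{mb}$.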
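As a rough sketch of how this conversion is used in practice, the helper below turns a cross section computed in natural units ($\text{GeV}^{-2}$) into barn-based units. The function and table names are illustrative and not part of these notes; the only physical input is $(\hbar c)^2 \approx 0.3894\,\text{mb}\cdot\text{GeV}^2$.

```python
# (hbar * c)^2 in mb * GeV^2, quoted to four significant figures.
HBARC2_MB_GEV2 = 0.3894

# Size of each unit relative to one barn (names are illustrative).
UNIT_IN_BARN = {"b": 1.0, "mb": 1e-3, "ub": 1e-6, "nb": 1e-9, "pb": 1e-12, "fb": 1e-15}


def gev2_to_unit(sigma_gev2: float, unit: str = "mb") -> float:
    """Convert a cross section from GeV^-2 (natural units) to a barn-based unit."""
    sigma_mb = sigma_gev2 * HBARC2_MB_GEV2
    return sigma_mb * UNIT_IN_BARN["mb"] / UNIT_IN_BARN[unit]


if __name__ == "__main__":
    print(gev2_to_unit(1.0, "mb"))   # 1 GeV^-2 expressed in mb
    print(gev2_to_unit(1e-9, "pb"))  # a typical electroweak-scale value in pb
```

Keeping the conversion in one place avoids the common slip of mixing $\text{mb}$ and $\text{pb}$ prefactors when comparing a calculation with a published number.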
1.3 Differential vs. Total
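- Differential cross section $d\sigma/d\Omega$ (or $d\sigma/dt$, $d\sigma/dE$, etc.) is the cross section per solid angle (or per Mandelstam $t$, etc.).
- Total cross section $\sigma_{\text{tot}} = \int \frac{d\sigma}{d\Omega}\,d\Omega$ integrates over the full final-state phase space.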
The differential form contains the angular / energy distribution of scattered particles; the total contains only the overall reaction rate.
1.4 Classical Analog
The cross section is not a quantum invention — it was defined and used in classical mechanics decades before QM existed. The QFT formula is a generalization of the classical one and reduces to it in appropriate limits.
Classical definition
For a beam of point particles incident on a fixed target, the operational definition (rate ∝ flux × density × $\sigma$) is literally identical to §1.1 above; only the computation differs between classical and quantum mechanics. Particles incident with impact parameter $b$ (perpendicular distance from the target's center to the beam line) and azimuthal angle $\varphi$ scatter to angle $\theta(b)$ determined by the classical equations of motion. Then

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|.$$

This is purely geometric: no $\hbar$, no amplitude $\mathcal{M}$.
The area formula
It's worth pausing to note what is exact and what is an approximation in the formula $dA = b\,db\,d\varphi = \frac{d\sigma}{d\Omega}\,d\Omega$:
- The differential cross-section relation $\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$ is exact only in the limit $db,\, d\varphi \to 0$. Equating the cross-sectional area $b\,db\,d\varphi$ with $\frac{d\sigma}{d\Omega}\,d\Omega$ gives a true equality only when both differentials are infinitesimal. For finite patches, the Jacobian $\frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$ varies across the patch, so the ratio (annulus area)/(solid angle) is approximate.
  - For hard-sphere scattering specifically, the Jacobian is the constant $R^2/4$ (independent of $\theta$), so the ratio is exact even for finite patches. This is a feature of the hard-sphere geometry, not a general fact.
  - For Coulomb scattering, the Jacobian diverges at $\theta \to 0$, so finite patches give noticeably different ratios.
- In the interactive visualizations above, the rendered annulus is a polygonal approximation of the smooth annular sector (triangulated mesh; a few subdivisions per edge). Bumping the subdivision counts would make it visually smooth, but does not change the algebraic content: both the area formula $dA = b\,db\,d\varphi$ and (for the hard sphere) the ratio $dA/d\Omega = R^2/4$ are exact at any subdivision level.
Interactive 3D demo. An interactive three.js visualization of classical scattering — hard-sphere reflection and repulsive/attractive Coulomb trajectories — is available in cross-sections-3d.html. Open it in a browser; drag to rotate, scroll to zoom, and use the controls to switch modes, vary the impact-parameter range, and adjust the energy.
Spherical-coordinate visualization. A second three.js page, cross-sections-spherical.html, focuses specifically on the definition of $d\sigma/d\Omega$: the incoming annulus at the source plane is shown together with its image on a detection sphere surrounding the target, after hard-sphere reflection. Sliders control $b$, $\Delta b$, $\varphi$, $\Delta\varphi$, letting you watch how the two patches transform into each other.
[Figure: classical scattering geometry. An incoming particle travels parallel to the beam axis ($\theta = 0$) at perpendicular distance $b$ (the impact parameter), is deflected at the scattering center (the target), and leaves along an outgoing ray at scattering angle $\theta$ measured from the incoming beam axis.]
Key: b = perpendicular distance from beam axis to the incoming-ray asymptote.
φ = azimuthal angle of that ray around the beam axis (not shown — out of page).
θ = polar deflection angle measured from the beam axis to the outgoing ray.
Geometrically, all incoming particles whose impact parameter lies in the annulus $[b,\, b+db]$ scatter into the same conical region $[\theta,\, \theta+d\theta]$ around the beam axis (modulo the azimuthal range):
[Figure: annulus → solid angle, viewed along the beam axis. The annular patch of area $dA = b\,db\,d\varphi$ between radii $b$ and $b+db$ is mapped by the scattering onto the solid angle $d\Omega$ on a cone of opening angle $\theta$.]
Two textbook examples
- Hard sphere of radius $R$. Specular reflection at the surface (angle of incidence $\alpha$ equals angle of reflection) gives $\sin\alpha = b/R$ and $\theta = \pi - 2\alpha$, so $b = R\cos(\theta/2)$, hence

  $$\frac{d\sigma}{d\Omega} = \frac{R^2}{4}, \qquad \sigma_{\text{tot}} = \pi R^2.$$

  The total cross section equals the geometric cross-sectional area, which is the source of the "effective area" intuition. Classical and quantum results agree at energies where the de Broglie wavelength is much smaller than $R$.
- Rutherford scattering. A point charge $Z_1 e$ scattering off a fixed point charge $Z_2 e$ with kinetic energy $E$:

  $$\frac{d\sigma}{d\Omega} = \left(\frac{Z_1 Z_2 e^2}{4E}\right)^2 \frac{1}{\sin^4(\theta/2)}.$$

  Remarkably, the non-relativistic Born-approximation quantum-mechanical calculation gives exactly the same answer, and tree-level QED (Mott) reduces to it in the non-relativistic limit. Rutherford scattering is one of the few cases where classical, NRQM, and QED results all coincide, which makes it a useful consistency check.

  The trajectory is a hyperbola whose incoming asymptote runs parallel to the beam axis at distance $b$; the scattering angle $\theta$ is the angle between the incoming and outgoing asymptotes. Solving the orbit equation gives $\cot(\theta/2) = 2Eb/(Z_1 Z_2 e^2)$, and inverting yields the Rutherford formula above. The cross section diverges at small $\theta$ because the Coulomb potential has infinite range ($1/r$ tail).
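Both textbook results can be checked numerically from the single geometric relation $d\sigma/d\Omega = (b/\sin\theta)\,|db/d\theta|$. The sketch below is illustrative only: the function names, the finite-difference step, and the numerical values of `R` and `a` (standing for $Z_1 Z_2 e^2/(2E)$) are assumptions chosen for the demonstration.

```python
import math


def dsigma_domega(b_of_theta, theta, h=1e-6):
    """Classical d(sigma)/d(Omega) = (b / sin(theta)) * |db/dtheta| (central difference)."""
    db_dtheta = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db_dtheta)


R = 2.0  # hard-sphere radius, arbitrary length units (assumed value)
a = 0.5  # Z1*Z2*e^2 / (2E), the Coulomb length scale (assumed value)


def hard_sphere(theta):
    return R * math.cos(theta / 2)  # b = R cos(theta/2)


def coulomb(theta):
    return a / math.tan(theta / 2)  # b = a cot(theta/2)


theta = 1.0
hs = dsigma_domega(hard_sphere, theta)  # should be close to R^2 / 4
ru = dsigma_domega(coulomb, theta)      # should be close to (a/2)^2 / sin^4(theta/2)

# Total hard-sphere cross section: midpoint rule over the polar angle.
n = 2000
sigma_tot = 0.0
for i in range(n):
    th = (i + 0.5) * math.pi / n
    sigma_tot += dsigma_domega(hard_sphere, th) * 2 * math.pi * math.sin(th) * math.pi / n

if __name__ == "__main__":
    print(hs, R**2 / 4)
    print(ru, (a / 2) ** 2 / math.sin(theta / 2) ** 4)
    print(sigma_tot, math.pi * R**2)
```

The same `dsigma_domega` helper works for any monotonic $b(\theta)$, which is the point of the classical definition: only the trajectory map changes between potentials.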
How the QFT formula reduces to the classical one
The QFT master formula reduces to $\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$ in the WKB / eikonal limit, where:
- Wavelengths are small compared to the target ($\lambda \ll R$).
- The wavepacket localization is much smaller than the impact-parameter resolution.
In this regime the amplitude is dominated by stationary-phase paths, which are exactly the classical trajectories. The squared amplitude reproduces the geometric Jacobian $\frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$.
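| Regime | Cross-section formula | Equivalent description |
|---|---|---|
| Classical | $\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left\lvert\frac{db}{d\theta}\right\rvert$ | Trajectories with definite impact parameter |
| Non-relativistic QM | $\frac{d\sigma}{d\Omega} = \lvert f(\theta)\rvert^2$ ($f$ = scattering amplitude) | Wavefunctions, Born / partial-wave expansion |
| Relativistic QFT | $d\sigma = \frac{1}{F}\,\overline{\lvert\mathcal{M}\rvert^2}\,d\Pi_n$ | Feynman amplitudes, Fock-space states |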
These are one formula in three regimes, not three different formulas. The meaning of $\sigma$ as "effective area" is unchanged across all of them.
Where the classical picture breaks
The classical analog is misleading or absent in several genuinely quantum cases:
- Diffraction at long wavelength. When $\lambda \gg R$, the cross section is no longer geometric. Hard-sphere scattering at low energy gives $\sigma_{\text{tot}} = 4\pi R^2$, a factor of 4 larger than the classical $\pi R^2$, due to wave diffraction.
- Resonances and threshold cusps. Breit–Wigner peaks and channel-opening cusps have no classical analog.
- Polarization-dependent processes. Spin / polarization don't exist classically (in the relevant sense), so cross sections involving them have no classical limit.
- Identical-particle interference. Møller and Bhabha scattering have antisymmetric amplitudes producing interference patterns absent in classical-particle scattering.
- Production processes. creates new particles — fundamentally a quantum process with no classical analog.
In all these cases the QFT formula still applies and gives finite cross sections; there's just nothing classical to compare against.
(For the analogous discussion of decay rates and their weaker classical analog, see decay-rates.md §4.1.)
2. The Master Formula
2.1 General Cross Section
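For a $2 \to n$ process with incoming momenta $p_1$, $p_2$,

$$d\sigma = \frac{1}{F}\,\overline{|\mathcal{M}|^2}\,d\Pi_n,$$

with three ingredients explained below: the flux factor $F$, the squared invariant amplitude $\overline{|\mathcal{M}|^2}$, and the Lorentz-invariant phase space (LIPS) $d\Pi_n$.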
2.2 The Invariant Amplitude
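The S-matrix element factorizes as

$$\langle f|S|i\rangle = \delta_{fi} + i\,(2\pi)^4\,\delta^{(4)}(P_f - P_i)\,\mathcal{M}_{fi}.$$

$\mathcal{M}$ is what Feynman-rule calculations directly produce. The overline denotes averaging over initial spins / polarizations and summing over final ones, appropriate for unpolarized beams and detectors:

$$\overline{|\mathcal{M}|^2} = \frac{1}{N_{\text{init}}}\sum_{\text{initial}}\;\sum_{\text{final}} |\mathcal{M}|^2,$$

where $N_{\text{init}}$ is the number of initial spin / polarization states.
For example, $e^-\gamma \to e^-\gamma$ has 2 initial electron spins × 2 photon polarizations = 4 initial states, so the prefactor is $1/4$, exactly as in QED/compton.md §4.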
2.3 The Flux Factor
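The flux factor $F$ accounts for the relative motion of the incoming beam and target. The Lorentz-invariant form is

$$F = 4\sqrt{(p_1\cdot p_2)^2 - m_1^2\,m_2^2}.$$

Equivalent expressions in common frames:
- Lab frame (target at rest, $\mathbf{p}_2 = 0$): $F = 4\,m_2\,|\mathbf{p}_1|$.
- CM frame (total 3-momentum zero): $F = 4\,|\mathbf{p}_{\text{cm}}|\,\sqrt{s}$, where $\sqrt{s}$ is the CM energy.
The factor of 4 is a convention tied to the $\langle p|p'\rangle = (2\pi)^3\,2E\,\delta^{(3)}(\mathbf{p}-\mathbf{p}')$ normalization used throughout the notes (see fock-space-inventory.md §3).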
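A quick numerical check that the invariant flux $F = 4\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}$ agrees with the lab-frame and CM-frame shortcuts $4 m_2 |\mathbf{p}_1|$ and $4|\mathbf{p}_{\text{cm}}|\sqrt{s}$. This is a minimal sketch: the helper names and the masses and momentum used are assumptions made for illustration, and the CM momentum is obtained from the standard relation $|\mathbf{p}_{\text{cm}}| = m_2 |\mathbf{p}_{\text{lab}}|/\sqrt{s}$.

```python
import math


def minkowski_dot(p, q):
    """Four-vector product with metric (+, -, -, -); vectors are (E, px, py, pz)."""
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))


def flux_factor(p1, p2):
    """Invariant flux F = 4 * sqrt((p1.p2)^2 - m1^2 * m2^2)."""
    m1_sq = minkowski_dot(p1, p1)
    m2_sq = minkowski_dot(p2, p2)
    return 4.0 * math.sqrt(minkowski_dot(p1, p2) ** 2 - m1_sq * m2_sq)


# Lab frame: beam particle (mass m1) with momentum p_lab on a target (mass m2) at rest.
m1, m2, p_lab = 0.5, 1.0, 3.0  # GeV, illustrative values
p1 = (math.hypot(m1, p_lab), 0.0, 0.0, p_lab)
p2 = (m2, 0.0, 0.0, 0.0)

F_invariant = flux_factor(p1, p2)
F_lab = 4.0 * m2 * p_lab

# CM-frame shortcut, using s = (p1 + p2)^2 and |p_cm| = m2 * p_lab / sqrt(s).
total = tuple(a + b for a, b in zip(p1, p2))
s = minkowski_dot(total, total)
p_cm = m2 * p_lab / math.sqrt(s)
F_cm = 4.0 * p_cm * math.sqrt(s)

if __name__ == "__main__":
    print(F_invariant, F_lab, F_cm)
```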
2.4 Lorentz-Invariant Phase Space
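For $n$ final-state particles with momenta $k_j$ and energies $E_j$,

$$d\Pi_n = \left[\prod_{j=1}^{n} \frac{d^3k_j}{(2\pi)^3\,2E_j}\right] (2\pi)^4\,\delta^{(4)}\!\Big(p_1 + p_2 - \sum_{j=1}^{n} k_j\Big).$$

Each final-state particle gets a Lorentz-invariant measure $d^3k_j/[(2\pi)^3\,2E_j]$; the overall delta function imposes 4-momentum conservation. The "$2E_j$" denominator is the same normalization factor that appears in the asymptotic states (see §2.3 above).
For commonly encountered cases:
- $n = 2$: after using the delta function and integrating over one final-state magnitude, $d\Pi_2 = \frac{|\mathbf{p}_f|}{16\pi^2\sqrt{s}}\,d\Omega$ in the CM frame, where $|\mathbf{p}_f|$ is the final-state CM momentum; equivalently the classic CM-frame formula $\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2 s}\,\frac{|\mathbf{p}_f|}{|\mathbf{p}_i|}\,\overline{|\mathcal{M}|^2}$.
- $n = 3$: more involved; requires Dalitz-plot variables or a numerical integration.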
2.5 The Differential Cross Section
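Putting flux and phase space together for the most common case ($2 \to 2$ scattering in the CM frame):

$$\left(\frac{d\sigma}{d\Omega}\right)_{\text{CM}} = \frac{1}{64\pi^2 s}\,\frac{|\mathbf{p}_f|}{|\mathbf{p}_i|}\,\overline{|\mathcal{M}|^2},$$

where $|\mathbf{p}_i|$ and $|\mathbf{p}_f|$ are the initial- and final-state CM momenta.
For elastic scattering ($|\mathbf{p}_f| = |\mathbf{p}_i|$) the kinematic ratio is unity and one is left with $\frac{d\sigma}{d\Omega} = \frac{\overline{|\mathcal{M}|^2}}{64\pi^2 s}$.
This is the formula applied (with a frame-conversion to the lab frame) to derive the Klein–Nishina cross section in QED/compton.md §5.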
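To make the $2 \to 2$ CM-frame formula $\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2 s}\frac{|\mathbf{p}_f|}{|\mathbf{p}_i|}\overline{|\mathcal{M}|^2}$ concrete, here is a small sketch that evaluates it for a given squared amplitude. The function names are hypothetical, the CM momenta come from the standard Källén-function expression, and the constant $\overline{|\mathcal{M}|^2} = 1$ is a toy input rather than a real process.

```python
import math


def kallen(x, y, z):
    """Kallen triangle function lambda(x, y, z)."""
    return x * x + y * y + z * z - 2.0 * (x * y + y * z + z * x)


def p_cm(s, ma, mb):
    """CM momentum of a two-particle state with masses ma, mb at invariant mass^2 s."""
    return math.sqrt(kallen(s, ma**2, mb**2)) / (2.0 * math.sqrt(s))


def dsigma_domega_cm(m2_avg, s, m1, m2, m3, m4):
    """(1 / (64 pi^2 s)) * (|p_f| / |p_i|) * spin-averaged |M|^2, in GeV^-2 per steradian."""
    return m2_avg * p_cm(s, m3, m4) / p_cm(s, m1, m2) / (64.0 * math.pi**2 * s)


# Toy elastic example: constant |M|^2 = 1 at s = 100 GeV^2 (illustrative numbers).
s = 100.0
dsdo = dsigma_domega_cm(1.0, s, 0.1, 0.2, 0.1, 0.2)
sigma_total = 4.0 * math.pi * dsdo  # isotropic, so integrate by multiplying by 4 pi

if __name__ == "__main__":
    print(dsdo, sigma_total)
```

For an angle-dependent $\overline{|\mathcal{M}|^2}$ one would pass a function of $\theta$ and integrate over $d\Omega$ numerically instead of multiplying by $4\pi$.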
3. Decay Rates
Decay rates use the parallel master formula

$$d\Gamma = \frac{1}{2M}\,\overline{|\mathcal{M}|^2}\,d\Pi_n,$$

with the same $\overline{|\mathcal{M}|^2}$ and $d\Pi_n$ as above and the flux factor replaced by the rest-frame normalization $2M$ of the decaying particle. The full treatment (master formula, lifetime, branching ratios, the muon-lifetime worked example, classical analog, and connection to complex propagator poles) lives in decay-rates.md. Cross-references from the rest of the notes that previously pointed here for $\Gamma$ should now point there.
4. Practical Conversions
A few rules of thumb for going between formulas and numbers.
4.1 Useful Numerical Factors
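| Quantity | Value |
|---|---|
| Classical electron radius $r_e = \alpha/m_e$ | $2.818\,\text{fm}$ |
| Thomson cross section $\sigma_T = \frac{8\pi}{3}\,r_e^2$ | $0.665\,\text{b}$ |
| Bohr radius $a_0 = 1/(\alpha m_e)$ | $0.529\times 10^{-10}\,\text{m}$ |
| Bohr cross section $\pi a_0^2$ | $8.80\times 10^{-17}\,\text{cm}^2$ |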
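The entries of this table follow from $\alpha$, $m_e$, and $\hbar c$ alone. The snippet below is a sketch of that arithmetic, assuming the standard relations $r_e = \alpha\hbar c/(m_e c^2)$, $\sigma_T = \frac{8\pi}{3} r_e^2$, and $a_0 = \hbar c/(\alpha\, m_e c^2)$; the variable names are illustrative and the constants are rounded CODATA-style values.

```python
import math

ALPHA = 1.0 / 137.035999   # fine-structure constant
M_E_MEV = 0.51099895       # electron mass in MeV
HBARC_MEV_FM = 197.3269804  # hbar * c in MeV * fm
FM2_TO_BARN = 1e-2         # 1 fm^2 = 1e-30 m^2 = 0.01 b

r_e_fm = ALPHA * HBARC_MEV_FM / M_E_MEV                       # classical electron radius
sigma_thomson_b = 8.0 * math.pi / 3.0 * r_e_fm**2 * FM2_TO_BARN
a0_fm = HBARC_MEV_FM / (ALPHA * M_E_MEV)                      # Bohr radius
sigma_bohr_cm2 = math.pi * (a0_fm * 1e-13) ** 2               # 1 fm = 1e-13 cm

if __name__ == "__main__":
    print(f"r_e     = {r_e_fm:.4f} fm")
    print(f"sigma_T = {sigma_thomson_b:.4f} b")
    print(f"a_0     = {a0_fm * 1e-15:.4e} m")
    print(f"pi a0^2 = {sigma_bohr_cm2:.3e} cm^2")
```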
4.2 Mandelstam Variables
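For a $2 \to 2$ process $1 + 2 \to 3 + 4$, the Lorentz-invariant kinematic variables are

$$s = (p_1 + p_2)^2, \qquad t = (p_1 - p_3)^2, \qquad u = (p_1 - p_4)^2,$$

satisfying $s + t + u = m_1^2 + m_2^2 + m_3^2 + m_4^2$. Differential cross sections are often quoted as $d\sigma/dt$, which has the advantage of being manifestly Lorentz-invariant.
The relation to the CM scattering angle $\theta$ for elastic scattering is $t = -2\,|\mathbf{p}_{\text{cm}}|^2\,(1 - \cos\theta)$.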
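The Mandelstam identities are easy to verify numerically. The sketch below builds CM-frame four-momenta for equal-mass elastic scattering and checks $s + t + u = \sum_i m_i^2$ and $t = -2|\mathbf{p}_{\text{cm}}|^2(1-\cos\theta)$; the helper names and the values of the mass, momentum, and angle are assumptions for illustration.

```python
import math


def minkowski_dot(p, q):
    """Four-vector product with metric (+, -, -, -); vectors are (E, px, py, pz)."""
    return p[0] * q[0] - sum(a * b for a, b in zip(p[1:], q[1:]))


def add(p, q):
    return tuple(a + b for a, b in zip(p, q))


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def square(p):
    return minkowski_dot(p, p)


# Equal-mass elastic scattering in the CM frame (illustrative numbers, GeV).
m, p, theta = 0.5, 2.0, 0.7
E = math.hypot(m, p)
p1 = (E, 0.0, 0.0, p)
p2 = (E, 0.0, 0.0, -p)
p3 = (E, p * math.sin(theta), 0.0, p * math.cos(theta))
p4 = (E, -p * math.sin(theta), 0.0, -p * math.cos(theta))

s = square(add(p1, p2))
t = square(sub(p1, p3))
u = square(sub(p1, p4))

if __name__ == "__main__":
    print(s, t, u, s + t + u)
```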
4.3 Converting Between Frames
Since $\sigma_{\text{tot}}$ is Lorentz-invariant, total cross sections are the same in any frame. But differential cross sections are not: $d\sigma/d\Omega$ in the lab frame differs from $(d\sigma/d\Omega)_{\text{CM}}$ by the Jacobian of the angle transformation, which is computable from kinematics.
Frame-invariant differential forms ($d\sigma/dt$, $d\sigma/dy$ in rapidity, etc.) are preferred when the choice of frame is irrelevant.
4.4 Optical Theorem
A useful consistency check: the optical theorem relates the imaginary part of the forward scattering amplitude to the total cross section,

$$\operatorname{Im}\,\mathcal{M}(i \to i)\Big|_{\theta = 0} = 2\,|\mathbf{p}_{\text{cm}}|\,\sqrt{s}\;\sigma_{\text{tot}},$$

a direct consequence of $S^\dagger S = 1$ (unitarity). It provides a non-trivial constraint on perturbative calculations: every loop-level imaginary part must correspond to a contribution to a physical cross section.
5. Where Cross Sections Appear in the Notes
| Place | Use |
|---|---|
| QFT/preliminaries.md § S-Matrix and Cross Sections | One-line mention of , as observables built from |
| QFT/postulates.md Postulate 10 | The S-matrix is the formal object; cross sections are extracted from it |
| QED Step 6 | Feynman rules → amplitude → $\overline{\lvert\mathcal{M}\rvert^2}$ → cross section |
| QED/historical.md §5.2 | Same chain, with the Feynman-rule derivation sketched |
| QED/compton.md | Worked example: Klein–Nishina cross section for $e^-\gamma \to e^-\gamma$ |
| QED/hydrogen.md | Not a cross-section calculation — bound-state spectroscopy uses different machinery |
6. Take-Aways
- A cross section has units of area; it is the proportionality between scattering rate and (flux × target density). The barn ($10^{-28}\,\text{m}^2$) is the standard unit.
- The master formula is $d\sigma = \frac{1}{F}\,\overline{|\mathcal{M}|^2}\,d\Pi_n$: flux factor, squared amplitude, Lorentz-invariant phase space.
- The parallel decay-rate formula is $d\Gamma = \frac{1}{2M}\,\overline{|\mathcal{M}|^2}\,d\Pi_n$, treated in decay-rates.md.
- Total cross sections are Lorentz-invariant; differential ones are not (use $d\sigma/dt$ for invariance).
- The optical theorem ties total cross sections to forward-amplitude imaginary parts via S-matrix unitarity.
- A measured cross section is itself an inferred quantity, obtained from event counts via $\sigma = N/(\mathcal{L}\cdot A\cdot\epsilon)$ (luminosity, acceptance, efficiency). The taxonomy of what is directly measured by detectors vs. what is fit-extracted is in direct-vs-inferred.md.
These formulas are the universal interface between "what QFT computes" () and "what experiments measure" (, , , ).
Decay Rates and Lifetimes
Decay rates govern the kinematics of unstable particles: how fast they disappear, into what final states, and with what distribution of products. They are computed from the same invariant amplitude that produces scattering cross sections, with one initial particle instead of two.
Companion page. This doc focuses on decays specifically. The general kinematic ingredients (flux factor, Lorentz-invariant phase space , Mandelstam variables, units) and the parallel structure for scattering are in Cross Sections. For the broader observable map, see Observables.
1. Master Formula
For an unstable particle of mass $M$ at rest decaying into $n$ final-state particles,

$$d\Gamma = \frac{1}{2M}\,\overline{|\mathcal{M}|^2}\,d\Pi_n.$$

The replacement $\frac{1}{F} \to \frac{1}{2M}$ relative to the scattering master formula is straightforward:
- There is no flux factor (only one incoming particle, evaluated in its rest frame).
- The "$2M$" is the rest-frame normalization of the decaying particle's state in the relativistic-normalization convention $\langle p|p'\rangle = (2\pi)^3\,2E\,\delta^{(3)}(\mathbf{p}-\mathbf{p}')$.
The squared amplitude $\overline{|\mathcal{M}|^2}$ and Lorentz-invariant phase-space measure $d\Pi_n$ are exactly as defined in Cross Sections §2.
2. Lifetime, Total Rate, and Branching Ratios
2.1 Total decay rate
$$\Gamma_{\text{tot}} = \sum_f \Gamma(i \to f),$$

summed over all kinematically accessible final states $f$.
2.2 Lifetime
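$$\tau = \frac{1}{\Gamma_{\text{tot}}}.$$

In natural units ($\hbar = 1$), $\Gamma$ has dimensions of energy; $\tau$ is in inverse-energy = time units. A useful conversion: $\hbar = 6.582\times 10^{-25}\,\text{GeV}\cdot\text{s}$.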
2.3 Branching ratios
$$\text{BR}(i \to f) = \frac{\Gamma(i \to f)}{\Gamma_{\text{tot}}}.$$

Branching ratios are dimensionless and cancel many normalization uncertainties (overall coupling constants, wavefunction renormalizations), making them the cleanest predictions from QFT for many hadronic and electroweak processes. They satisfy $\sum_f \text{BR}(i \to f) = 1$ by construction.
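The three definitions above (total rate, lifetime, branching ratios) amount to a few lines of arithmetic. The sketch below uses made-up partial widths for two hypothetical channels, so the numbers carry no physical meaning; only the conversion constant $\hbar = 6.582\times10^{-25}\,\text{GeV}\cdot\text{s}$ is physical, and the function names are illustrative.

```python
HBAR_GEV_S = 6.582119569e-25  # hbar in GeV * s


def total_width(partial_widths_gev):
    """Gamma_tot = sum of partial widths over all open channels (GeV)."""
    return sum(partial_widths_gev.values())


def lifetime_seconds(partial_widths_gev):
    """tau = hbar / Gamma_tot, converted from natural units to seconds."""
    return HBAR_GEV_S / total_width(partial_widths_gev)


def branching_ratios(partial_widths_gev):
    """BR_f = Gamma_f / Gamma_tot for each channel f."""
    gamma_tot = total_width(partial_widths_gev)
    return {channel: g / gamma_tot for channel, g in partial_widths_gev.items()}


# Hypothetical partial widths in GeV (illustrative only).
widths = {"channel_a": 1.5e-3, "channel_b": 0.5e-3}
tau = lifetime_seconds(widths)
brs = branching_ratios(widths)

if __name__ == "__main__":
    print(tau)
    print(brs)
```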
3. Worked Example: Muon Lifetime
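Muon decay $\mu^- \to e^-\,\bar\nu_e\,\nu_\mu$ is a $1 \to 3$ process with three (essentially) massless final-state particles. Using the four-Fermi effective interaction

$$\mathcal{L}_{\text{eff}} = -\frac{G_F}{\sqrt{2}}\,\big[\bar\nu_\mu\,\gamma^\alpha(1-\gamma_5)\,\mu\big]\big[\bar e\,\gamma_\alpha(1-\gamma_5)\,\nu_e\big],$$

the standard tree-level computation yields

$$\Gamma(\mu^- \to e^-\bar\nu_e\nu_\mu) = \frac{G_F^2\,m_\mu^5}{192\pi^3}.$$

Plugging in $G_F = 1.166\times 10^{-5}\,\text{GeV}^{-2}$ and $m_\mu = 105.66\,\text{MeV}$:

$$\tau_\mu = \frac{1}{\Gamma} \approx 2.19\times 10^{-6}\,\text{s},$$

agreeing with experiment to part-per-million precision once electroweak and radiative corrections are included.
The $m_\mu^5$ scaling is generic for purely leptonic decays via a four-fermion interaction: three powers from the phase space, two from the $\overline{|\mathcal{M}|^2}$ at fixed kinematics (squared coupling × dimensionful amplitude factor). The same pattern produces $\Gamma \propto G_F^2\,m_\tau^5$ for tau decays into leptons.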
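The tree-level number can be reproduced directly from $\Gamma = G_F^2 m_\mu^5/(192\pi^3)$ and $\tau = \hbar/\Gamma$. This is a minimal sketch with rounded input constants; it deliberately omits the electron-mass and radiative corrections mentioned above, so it should land close to, but not exactly on, the measured lifetime.

```python
import math

G_F = 1.1663787e-5            # Fermi constant in GeV^-2
M_MU = 0.1056583755           # muon mass in GeV
HBAR_GEV_S = 6.582119569e-25  # hbar in GeV * s

# Tree-level width for mu -> e nu nu with massless final-state particles.
gamma_mu = G_F**2 * M_MU**5 / (192.0 * math.pi**3)  # GeV
tau_mu = HBAR_GEV_S / gamma_mu                      # seconds

if __name__ == "__main__":
    print(f"Gamma = {gamma_mu:.4e} GeV")
    print(f"tau   = {tau_mu:.4e} s")
```

Replacing `M_MU` by the tau mass and multiplying by the number of open leptonic channels gives a first estimate of the leptonic tau width, illustrating the fifth-power mass scaling.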
4. Conceptual Status
4.1 The classical analog is weaker than for cross sections
Decay rates have a weaker classical analog than cross sections. The "classical decay law" is the exponential $N(t) = N_0\,e^{-\Gamma t}$, which describes any Markovian stochastic decay process (radioactive decay, Arrhenius rates, photon spontaneous emission viewed semiclassically), but the origin of the rate $\Gamma$ (a quantum transition matrix element) has no direct classical counterpart. Compare with cross sections, which have a literal geometric area interpretation classically (effective transverse cross-section of the target).
So decay rates are conceptually closer to quantum-only territory than scattering cross sections. The exponential decay law itself is an emergent stochastic phenomenon (Wigner–Weisskopf approximation in the more careful quantum-mechanical treatment), not a fundamental QFT prediction.
4.2 Born rule for decay
The interpretation of $d\Gamma$ as a probability density for decay is the Born rule applied to a decaying state; see Cross Sections §1.6 and QFT/remarks.md § Born Rule for Scattering for the full QM↔QFT comparison. The non-relativistic single-particle analog is Fermi's Golden Rule

$$\Gamma_{i\to f} = \frac{2\pi}{\hbar}\,|\langle f|V|i\rangle|^2\,\rho(E_f),$$

which has the same (squared matrix element) × (final-state density) × (kinematic prefactor) structure as $d\Gamma = \frac{1}{2M}\,\overline{|\mathcal{M}|^2}\,d\Pi_n$.
4.3 Decays as poles of correlators
A decay rate is also extractable from the structure of two-point correlators: an unstable particle's propagator has a complex pole

$$p^2 = M^2 - i\,M\,\Gamma,$$

where the imaginary part of the pole is (proportional to) the total decay rate $\Gamma$. This is the optical-theorem view: the imaginary part of the forward two-point function corresponds to the sum of all decay channels. See Observables §2.6 for the connection to the broader pole-residue framework (Class B observables).
5. Where Decay-Rate Observables Appear in the Notes
| Place | Use |
|---|---|
| Observables §2.2 | Decays in the Class A (squared-amplitude) framework |
| Observables §2.3 | Branching ratios as derived observables |
| Cross Sections | Companion master formula , kinematic ingredients shared with this page |
| QED § Successes | Historical highlights of decay-rate predictions (positronium lifetimes, etc.) |
| QED/historical.md §5.4 | The machinery feeding both and |
6. Take-Aways
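- The decay-rate master formula is $d\Gamma = \frac{1}{2M}\,\overline{|\mathcal{M}|^2}\,d\Pi_n$: same machinery as cross sections, with $F \to 2M$.
- Lifetime $\tau = 1/\Gamma_{\text{tot}}$, with $\Gamma_{\text{tot}}$ summed over all decay channels.
- Branching ratios $\text{BR}_f = \Gamma_f/\Gamma_{\text{tot}}$ cancel normalization uncertainties.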
- The muon-lifetime calculation is the canonical worked example.
- Decay rates have a weaker classical analog than cross sections (the exponential law is stochastic, but the rate's origin is purely quantum).
- Decays also appear as complex poles of propagators, bridging Class A (rates) and Class B (pole/residue) observables.
- A measured decay rate is itself an inferred quantity — fit from an exponential time distribution or from a Breit–Wigner peak width, both of which require a theoretical model. See direct-vs-inferred.md for the taxonomy of detector primitives vs. fit-extracted observables.
Direct vs. Inferred Observables
Almost nothing in published collider results is directly observed. The detector records a small universal set of primitives — track hits, calorimeter energy deposits, magnetic-field curvature, timing, polarization — and everything else (cross sections, masses, branching ratios, couplings, , CKM elements, decay rates) is a fit-extracted parameter inferred from those primitives via a theoretical model.
This page collects the taxonomy: what is directly measured vs. what is inferred, ordered by the inference distance between raw data and quoted result. It is the conceptual companion to:
- cross-sections.md — the master formula whose parameters are extracted by the fits described here.
- decay-rates.md — the master formula, same structure.
- collider-measurements.md — the full theory → variable → reconstruction → result pipeline for eight concrete worked examples.
1. The Direct Observables
These are what a particle-physics detector actually records. Every other number in this folder is derived from these:
| Primitive | Physical signal | Detector subsystem |
|---|---|---|
| Track hits | Position + time of charged-particle ionization | Silicon pixel + strip tracker |
| Calorimeter energy deposits | Total ionization in absorber, summed across a particle shower | ECAL ($e^\pm$, $\gamma$); HCAL (hadrons) |
| Magnetic-field curvature | Bending radius of charged-particle track in known -field → particle momentum | Tracker (in solenoidal -field, ) |
| Timing | Bunch-crossing timestamp; time-of-flight; decay-vertex displacement | Timing layers, trigger |
| Polarization (selected experiments) | Spin-precession frequency in storage rings, or decay-product angular distribution | E.g. resonant depolarization (LEP), muon $g-2$ ring (Fermilab) |
That's it. All five together fit in a single short table. Everything else in particle physics is inference from these.
Two ancillary direct measurements complete the picture:
| Primitive | Physical signal |
|---|---|
| Luminosity | Beam-overlap measurement via van der Meer scan (vertical/horizontal beam separation vs. rate); cross-checked against forward elastic scattering (TOTEM, LHCf). Per-experiment uncertainty . |
| Beam energy | At : resonant depolarization (to at LEP). At hadron colliders: from accelerator RF + lattice optics, with auxiliary calibration. |
These two — luminosity and beam energy — are directly measured external quantities. Every other "measurement" is internal to the theory model.
2. The Inferred Observables
Every quantity below is a parameter whose value is obtained by:
- Building a theoretical model that depends on the parameter + a long list of nuisance parameters.
- Predicting a histogram of some kinematic variable from the model.
- Maximizing the likelihood that the observed histogram of detector primitives matches the model prediction, while profiling out the nuisances.
The result is the best-fit value of the parameter, with (stat) ⊕ (calibration / detector) ⊕ (theory / PDF / scheme) uncertainties.
2.1 Cross sections, decay rates, and their derived ratios (Family A)
| Observable | Inferred from | Master formula |
|---|---|---|
| Total cross section | $\sigma = N/(\mathcal{L}\cdot A\cdot\epsilon)$: count events, divide by luminosity, acceptance, efficiency | $d\sigma$ master formula (cross-sections.md) |
| Differential cross section | Same as above, binned in the chosen kinematic variable | same |
| Decay rate / width | Exponential fit of decay-time distribution $e^{-t/\tau}$, or Breit–Wigner peak width | $d\Gamma$ master formula (decay-rates.md) |
| Lifetime | Same; sometimes measured directly from displaced-vertex statistics | same |
| Branching ratio | $\text{BR}_f = \Gamma_f/\Gamma_{\text{tot}}$: ratio of fits | (ratio) |
| Asymmetries | $A = (N_+ - N_-)/(N_+ + N_-)$: counts in two regions | (ratio of cross sections) |
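The first and last rows of this table can be sketched in a few lines. The example below assumes the counting relation $\sigma = N/(\mathcal{L}\cdot A\cdot\epsilon)$ and the asymmetry $A = (N_+ - N_-)/(N_+ + N_-)$; the function names and all numbers are hypothetical, and a real analysis would also subtract backgrounds and propagate uncertainties.

```python
def cross_section_pb(n_events, lumi_inv_pb, acceptance, efficiency):
    """sigma = N / (L * A * eps), with integrated luminosity L in pb^-1."""
    return n_events / (lumi_inv_pb * acceptance * efficiency)


def asymmetry(n_plus, n_minus):
    """A = (N+ - N-) / (N+ + N-): a pure ratio of counts."""
    return (n_plus - n_minus) / (n_plus + n_minus)


# Hypothetical inputs: 1200 signal events in 100 pb^-1, 50% acceptance, 80% efficiency.
sigma = cross_section_pb(1200, 100.0, 0.5, 0.8)

# The asymmetry is unchanged if both counts scale with luminosity (here by a factor 3).
a_small = asymmetry(600, 400)
a_large = asymmetry(1800, 1200)

if __name__ == "__main__":
    print(sigma, a_small, a_large)
```

The contrast is the point: the cross section needs an absolute luminosity, acceptance, and efficiency, whereas the asymmetry is insensitive to any factor common to both counts.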
2.2 Particle masses (Family B — propagator poles)
| Observable | Inferred from | Inference distance |
|---|---|---|
| $M_Z$ | Breit–Wigner peak position in $\sqrt{s}$ scan at LEP1 | Short: propagator pole |
| $M_H$ | Gaussian-broadened peak position in invariant-mass histograms | Short: invariant-mass peak |
| $M_W$ | Jacobian peak in transverse-mass template fit | Long: needs PDFs, recoil model, FSR, calibration |
| $m_t$ | Endpoint / peak of the reconstructed mass distribution in $t\bar{t}$ events; or template fit | Long: needs jet-energy scale, $b$-tag calibration, color reconnection, scheme translation |
| Lattice hadron masses () | Exponential falloff of Euclidean two-point correlator | Short (within lattice setup) |
| Quark masses () | Combined inputs: spectroscopy + lattice + perturbative matching | Long + scheme-dependent |
2.3 Coupling constants and mixing angles (Family B + global fits)
| Observable | Inferred from |
|---|---|
| $\alpha$ | Combined fit to QED observables; runs from low-energy $\alpha(0)$ to $\alpha(M_Z)$ via hadronic vacuum polarization |
| $\alpha_s$ | Multiple QCD observables (jet rates, $R$-ratio, lattice, $\tau$ decay, DIS) fit together |
| $\sin^2\theta_W$ | $Z$-pole asymmetries (LEP/SLC) + parity-violating scattering + atomic PV |
| $G_F$ | Muon lifetime via leptonic three-body decay formula |
| CKM elements | Semileptonic decay rates × lattice form factors (overdetermined, fit together) |
| , | Time-dependent CP asymmetries in -meson decays |
| PMNS angles + | Neutrino-oscillation rates |
2.4 Static and vertex observables (Family C)
| Observable | Inferred from |
|---|---|
| Anomalous moments | Spin-precession frequency in Penning trap (electron) or storage ring (muon) |
| Charge radii | Low-$Q^2$ extrapolation of elastic-scattering form factors; or atomic spectroscopy |
| EDMs | Atomic-physics interferometry (ACME, JILA); upper limits in SM, BSM if positive |
| Form factors | Multiple-energy scattering measurements stitched into a function of $Q^2$ |
| Axial couplings | $\beta$-decay angular correlations |
2.5 Oscillation parameters (mixing)
| Observable | Inferred from |
|---|---|
| (-meson mass differences) | Oscillation frequency in flavor-tagged time-dependent decays |
| (neutrino mass splittings) | Oscillation pattern vs. |
| Time-dependent and direct CP-violating decay-rate ratios in -meson decays |
2.6 Composite / global-fit parameters
| Observable | Inferred from |
|---|---|
| PDFs | Global fit to data points across DIS + DY + jets + top + |
| CKM unitarity-triangle apex | Combined fit to all CKM-related measurements (standard-model.md §1) |
| Global EW fit predictions ( from inputs) | Profile-likelihood over all -pole + low-energy EW measurements |
| Higgs coupling modifiers | Profile fit over all measured Higgs production × decay channels |
3. Inference Distance: Short → Long
The chain from primitives to parameter has very different lengths depending on the observable. This is the most operationally important distinction across collider measurements — it determines whether the published uncertainty is dominated by data statistics or by theory modeling.
| Distance | Example | Why |
|---|---|---|
| Zero | Event count in a histogram bin | The universal base level — every collider measurement starts here. Counts in bins of some reconstructed variable, summed over a run. Everything below is one or more steps removed from this. |
| Trivial | Charge of a track ( from curvature direction) | Sign-only inference |
| Trivial | Asymmetry | Pure ratio of two histogram-bin counts; luminosity + most efficiencies cancel |
| Trivial | Branching ratio | Pure ratio; luminosity and total cross section cancel |
| Short | $M_Z$ from line-shape scan | Single dominant feature (pole position); only beam-energy calibration enters |
| Short | $M_H$ from $m_{\gamma\gamma}$ / $m_{4\ell}$ peak | Narrow Gaussian peak; only lepton/photon energy scale |
| Short | Hadron masses from lattice-QCD Euclidean correlators | Exponential falloff; only statistical noise + lattice-spacing extrapolation |
| Short | Lifetime from exponential decay-time fit | Slope of ; needs no cross-section quantity |
| Medium | Neutrino from oscillation pattern | Frequency in ; need calibrated + flux normalization |
| Medium | Total cross section | Counting + luminosity + acceptance — but the shape of the histogram is irrelevant |
| Long | $M_W$ from transverse-mass templates | Needs PDFs, recoil model, FSR, calibration |
| Long | $m_t$ from kinematic fits | Needs jet-energy scale, $b$-tag calibration, MC color reconnection, scheme-translation theory |
| Long | $\alpha_s$ from jet observables | Many NLO/NNLO QCD corrections, PDFs, non-perturbative effects |
| Very long | Higgs self-coupling | Requires di-Higgs cross section + production modeling; uncertainty even at HL-LHC |
The zero-row is worth pausing on: all collider measurements share the same base-level quantity — counts in histogram bins. Cross sections, branching ratios, asymmetries, masses, lifetimes, couplings, and discovery significances are different functionals of the same primitive. Cross sections are one common such functional — but branching ratios, asymmetries, and lifetimes are extracted without ever forming a cross-section quantity. So "everything at colliders relies on cross sections" is too strong; the correct statement is "everything at colliders relies on histograms of event counts, of which the cross section is the most familiar consumer".
In the long-inference cases (notably and ) the theoretical-modeling uncertainty competes with or exceeds the statistical uncertainty in the final published number. This is why theory progress (better PDFs, higher-order calculations) reduces published uncertainties even without new data.
4. The Renormalization-Scheme Wrinkle
For all mass and coupling observables, the inference does not stop at "best-fit parameter value" — even the meaning of the parameter depends on a renormalization-scheme choice. This is a separate inference step on top of the experimental fit.
4.1 The "which mass?" question
| Quantity | Scheme | Definition |
|---|---|---|
| On-shell | Real part of propagator pole | |
| Mass parameter in -renormalized Lagrangian, depends on scale | ||
| Complex-pole | where | |
| On-shell | Real part of top propagator pole | |
| Running mass | ||
| Monte Carlo | Value of parameter inside the event generator (Pythia, Herwig, ...) |
Numerical differences:
| Comparison | Difference |
|---|---|
| vs. | |
| vs. | (translation theory uncertainty) |
| vs. | |
| vs. |
All of these are comparable to or larger than current experimental precision. The published value carries an implicit scheme tag, and translating between schemes is a theoretical operation that adds its own uncertainty to the experimental fit.
4.2 The "which coupling?" question
$\alpha_s(\mu)$ runs by orders of magnitude across observable scales. Saying "$\alpha_s = 0.118$" is meaningless without specifying $\mu$ and the scheme ($\overline{\text{MS}}$ for most modern conventions). Similarly:
- at the -pole vs. vs. on-shell — these differ at the level.
- vs. — same quantity, two scales.
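To see how strongly the quoted value depends on the scale, the sketch below evaluates the one-loop running $\alpha_s(\mu) = \alpha_s(M_Z)\,/\,[1 + b_0\,\alpha_s(M_Z)\ln(\mu^2/M_Z^2)]$ with $b_0 = (33 - 2 n_f)/(12\pi)$. This is a deliberately crude illustration: it assumes a fixed $n_f = 5$ with no flavor thresholds and no higher-loop terms, takes $\alpha_s(M_Z) = 0.118$ as input, and the function name is hypothetical.

```python
import math

M_Z = 91.1876          # GeV
ALPHA_S_MZ = 0.118     # reference value at mu = M_Z (input assumption)


def alpha_s_one_loop(mu_gev, n_f=5):
    """One-loop running coupling, anchored at M_Z, with a fixed number of flavors."""
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return ALPHA_S_MZ / (1.0 + b0 * ALPHA_S_MZ * math.log(mu_gev**2 / M_Z**2))


if __name__ == "__main__":
    for mu in (10.0, M_Z, 1000.0):
        print(f"alpha_s({mu:g} GeV) = {alpha_s_one_loop(mu):.4f}")
```

Even in this approximation the coupling changes by roughly a factor of two between 10 GeV and 1 TeV, which is why a bare number without $\mu$ and scheme cannot be compared across measurements.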
5. The Full Chain
Combining the §1 pipeline (collider-measurements.md) with scheme translation gives:
The only left-end node is data. Every subsequent step is model + inference.
6. Why This Matters
- Comparing measurements across experiments requires comparing apples to apples: same scheme, same inputs to nuisance parameters. A tension may evaporate if one experiment used a different PDF set or a different MC tune.
- Quoted uncertainty bands can mislead. The full chain has stat ⊕ syst ⊕ theory + (scheme-translation) — the latter is often not in the published uncertainty.
- Theory progress directly reduces experimental uncertainties for long-inference observables. NNLO/NNNLO QCD calculations, better PDF fits, and improved MC generators have shrunk and error bars without any new data.
- What counts as "data-driven" is itself ambiguous. Even the "data-driven" hadronic vacuum polarization in muon $g-2$ uses lattice QCD as a cross-check; pure first-principles measurement is rare.
7. See Also
- Cross Sections — the master formula whose parameters are inferred via the fits described here. See §1.1 for the five equivalent definitions of (geometric, single-target, operational, quantum amplitude, S-matrix) and how the empirical chain (counting events in a detector → ) ties to the theoretical chain ().
- Decay Rates — the master formula, with the muon-lifetime worked example showing how a direct-time-domain measurement of is also an inferred quantity (lifetime fit + theory model).
- Collider Measurements — eight worked case studies of the theory → variable → reconstruction → result pipeline; this page is the conceptual filter on top.
- Electroweak Observables — the EW-sector inventory, each entry tagged with its inference structure.
- Standard Model Observables — cross-sector observables (CKM unitarity triangle, global EW fit) — all long-inference by construction.
- Observables — General Map — the structural Class A/B/C classification that all of the above instantiate.
Collider Measurements: Theory → Variable → Result
A collider experiment is a pipeline that turns a QFT prediction into a number with an uncertainty. This doc traces that pipeline end-to-end for the most important measurements, showing for each one:
- What theory predicts (the QFT computation, formula, free parameter(s))
- Which detector-level variable carries the prediction's signal
- How experimentalists reconstruct it from raw detector data
- The published result + dominant systematic uncertainty
The companion observables docs (electroweak.md, standard-model.md, cross-sections.md, decay-rates.md) catalogue what is measured; this doc explains how the measurement is actually done.
1. The Generic Collider-Measurement Pipeline
Every collider measurement follows the same chain. The vocabulary differs by measurement type but the structure does not:
The right end ( — a mass, coupling, cross section, asymmetry) is what gets published; everything to its left is what makes the comparison possible. The five chain steps map to standard software stacks:
| Step | What it does | Typical tools |
|---|---|---|
| Theory | Compute fixed-order amplitudes; convolve with PDFs | MadGraph, NLOJET++, MCFM, FEWZ, NNLOJET, OpenLoops |
| Parton shower + hadronization | Dress partons into hadrons | Pythia, Herwig, Sherpa |
| Detector simulation | Trace particles through magnetic field, calorimeters, trackers | Geant4, Delphes (fast) |
| Reconstruction | Tracks, vertices, jets, leptons, missing | Experiment-specific (CMSSW, Athena) |
| Statistical analysis | Likelihood, profile fits, unfolding | RooFit, RooStats, HistFactory |
Two general design principles:
- Calibrate against a known reference. Every measurement uses another measurement as a calibration anchor. is calibrated against ; cross sections are normalized to a known luminosity (itself measured via van der Meer scans against forward elastic scattering); jet energies are calibrated using photon+jet balance. Nothing is purely ab initio.
- Make a histogram, fit a template. Almost every published result reduces to: build a histogram of some variable ; predict its shape from SM + nuisance parameters; vary the parameter(s) of interest + nuisance parameters until the predicted histogram matches data. The "fit" is a maximum-likelihood / profile-likelihood-ratio procedure with hundreds to thousands of nuisance parameters.
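The histogram-and-template idea can be illustrated with a toy binned Poisson likelihood scan. This is a minimal sketch, not any experiment's code: the Gaussian template, the binning, the flat background, and the names `template` and `nll` are all assumptions made for illustration, and no nuisance parameters are profiled.

```python
import math

# Toy setup (all numbers hypothetical): 20 bins of width 2 in an arbitrary variable x.
EDGES = [60.0 + 2.0 * i for i in range(21)]
CENTERS = [0.5 * (lo + hi) for lo, hi in zip(EDGES[:-1], EDGES[1:])]

def template(mass, n_sig=5000.0, width=4.0, n_bkg_per_bin=50.0):
    """Expected counts per bin: Gaussian peak at `mass` plus a flat background."""
    bin_width = EDGES[1] - EDGES[0]
    norm = n_sig * bin_width / (width * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-0.5 * ((x - mass) / width) ** 2) + n_bkg_per_bin
            for x in CENTERS]

def nll(observed, expected):
    """Binned Poisson negative log-likelihood, dropping the constant log(n!) term."""
    return sum(mu - n * math.log(mu) for n, mu in zip(observed, expected))

# "Data": the noise-free expectation for a true mass of 80.4 (an Asimov-style dataset).
data = template(80.4)

# Scan the parameter of interest and keep the value with the smallest NLL.
scan = [80.0 + 0.01 * i for i in range(81)]
best_mass = min(scan, key=lambda m: nll(data, template(m)))
print(f"best-fit mass = {best_mass:.2f}")
```

A real analysis replaces the grid scan with a minimizer and adds one constrained term per nuisance parameter to the likelihood.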
2. Anatomy of a Modern Collider Experiment
Detectors at modern colliders (LHC) and at earlier ones (LEP, SLC, and the B factories that hosted BaBar and Belle) all share a common layered geometry:
- Beam pipe — vacuum, with bunches of accelerated particles crossing at the interaction point (IP) every (LHC).
- Tracker — silicon pixel + strip layers in a solenoidal magnetic field (). Measures charged-particle momenta via curvature; resolves primary (collision) and secondary (displaced, from -decays) vertices.
- Electromagnetic calorimeter (ECAL) — measures energies of by total absorption.
- Hadronic calorimeter (HCAL) — measures energies of hadrons (charged + neutral) by absorption.
- Muon spectrometer — outermost, beyond the calorimeters. Muons are the only charged particles that punch through; identified by hits in muon chambers.
- Trigger — multi-level system reducing the bunch-crossing rate to the that can be written to disk, by online cuts on , energy, isolation.
From these raw signals, physics objects are reconstructed:
| Object | Reconstruction | Resolution at LHC |
|---|---|---|
| Charged tracks | Helix fits to tracker hits | at |
| Electrons | Track + ECAL cluster matched | on energy |
| Photons | ECAL cluster, no track | same |
| Muons | Tracker + muon-spectrometer combined fit | on |
| Jets | Sequential recombination (anti- algorithm) of calorimeter towers / particle-flow objects | on energy |
| -jets | Jets + displaced-vertex / secondary-vertex tagger | efficiency, light-jet mistag |
| Missing () | in | |
| -jets | Narrow jet + 1 or 3 tracks + ID variables | ID, jet mistag |
The neutrinos, the dark matter, and any new weakly-interacting BSM particles leave no signal — they show up as , the vector sum of all visible-transverse momenta with a sign flip.
Every measurement that follows uses these objects as primitives.
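The missing-transverse-momentum definition above (minus the vector sum of the visible transverse momenta) can be sketched as follows. The function name and the example event are illustrative assumptions, not part of any reconstruction framework.

```python
import math

def missing_pt(visible):
    """Return (magnitude, phi) of the missing transverse momentum.

    `visible` is a list of (pt, phi) pairs for reconstructed objects;
    the result is minus their vector sum in the transverse plane.
    """
    px = -sum(pt * math.cos(phi) for pt, phi in visible)
    py = -sum(pt * math.sin(phi) for pt, phi in visible)
    return math.hypot(px, py), math.atan2(py, px)

# Hypothetical event: a 40 GeV lepton at phi = 0 and 10 GeV of recoil at phi = pi/2.
met, met_phi = missing_pt([(40.0, 0.0), (10.0, math.pi / 2)])
print(f"missing pT = {met:.2f} GeV at phi = {met_phi:.2f} rad")
```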
3. Case Studies
3.1 Measuring the $Z$ boson mass
Theory. In $e^+e^- \to f\bar f$ near $\sqrt{s} \approx M_Z$, the cross section is a relativistic Breit–Wigner resonance:
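One common lowest-order form of this line shape, given here as a sketch that assumes the fixed-width convention and neglects photon exchange and initial-state radiation, is

$$
\sigma_{f\bar f}(s) \simeq \frac{12\pi}{M_Z^2}\,\frac{s\,\Gamma_{ee}\,\Gamma_{f\bar f}}{(s - M_Z^2)^2 + M_Z^2\,\Gamma_Z^2}
$$

where $\Gamma_{ee}$ and $\Gamma_{f\bar f}$ are the partial widths to the initial and final states and $\Gamma_Z$ is the total width. The symbol names are assumptions chosen to match common usage; analyses that use an energy-dependent width obtain a mass and width that differ from these by a known, fixed amount.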
The peak position fixes ; the FWHM fixes ; the peak height (combined with , known from ) fixes the absolute coupling normalization. Three independent quantities from one curve.
Variable. measured at energy points around the peak (a scan). Hadronic channel because — highest statistics.
Reconstruction. events are trivially identified at LEP1 (huge visible energy, charged tracks; has 2 tracks, has 2–6 tracks with displaced vertices — all separable). Counted with efficiency.
The hard part is calibrating :
- LEP measured beam energies via resonant depolarization: transversely polarized beams have a spin-precession frequency ; sweep an RF kicker, find the depolarizing resonance, read to .
- Corrections were applied for: lunar tides distorting the LEP tunnel (1 cm radial deformation ↔ 1 MeV beam-energy shift), train passages on the Geneva–Bellegarde TGV line (leakage currents flowing along the LEP vacuum chamber and perturbing the dipole field), seasonal water-table changes (Versoix river level), and even the time of day.
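The link between the depolarizing resonance and the beam energy is the spin tune, the number of spin precessions per turn, $\nu = a_e \gamma = a_e E_{\text{beam}} / (m_e c^2)$. A minimal sketch with approximate constants (the function names are illustrative):

```python
# Approximate constants; consult CODATA/PDG for current values.
ELECTRON_MASS_GEV = 0.51099895e-3   # m_e c^2 in GeV
ELECTRON_ANOMALY = 1.15965218e-3    # a_e = (g - 2) / 2

def spin_tune(beam_energy_gev):
    """Spin precessions per turn, nu = a_e * gamma."""
    return ELECTRON_ANOMALY * beam_energy_gev / ELECTRON_MASS_GEV

def beam_energy(nu):
    """Invert the relation: beam energy in GeV for a measured spin tune."""
    return nu * ELECTRON_MASS_GEV / ELECTRON_ANOMALY

nu = spin_tune(45.6)  # beam energy near half the Z mass
print(f"spin tune at 45.6 GeV: {nu:.2f}")
print(f"energy per unit of spin tune: {beam_energy(1.0) * 1e3:.2f} MeV")
```

Because $a_e$ and $m_e$ are known far more precisely than any collider energy scale, the precision of the beam energy is set by how well the resonance frequency is located.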
Result. — relative uncertainty . Beam-energy calibration is the dominant systematic; statistical uncertainty was negligible after M s.
Pipeline summary.
3.2 Measuring the $W$ boson mass
Theory. Unlike the , the cannot be produced via (charge conservation forbids it), and its decay contains a neutrino whose full 4-momentum is not reconstructible — only the transverse component . So never appears as a clean pole in a directly-measured ; it shows up in three cross-section formulas, each carrying differently:
-
Pair-threshold formula ( at LEP2):
Near the cross section rises from zero as ; the rate of rise pins down . LEP2 scanned GeV across threshold and the on-shell region.
-
Single- Breit–Wigner in transverse mass (, used by Tevatron + LHC):
where the transverse mass
plays for the the role plays for the — the variable in which the cross section has a Breit–Wigner pole. For a at rest ; for a recoiling the distribution has a Jacobian peak at . The lepton- spectrum has the analogous Jacobian peak at . The kinematic Jacobian contains all the hadron-collider complexity (PDFs, recoil distribution, QED FSR off the lepton) and is the source of most systematic uncertainty.
-
Propagator factor in -mediated processes (Drell–Yan, deep-inelastic , , etc.):
At the propagator collapses to and gives Fermi's — the source of the historical "weak force is weak". At (deep-inelastic neutrino scattering, -search regions) the full propagator resolves and the cross section is directly sensitive to as an independent parameter. Historically used to constrain from DIS before its 1983 direct discovery.
Modern precision measurements use formula 2 — the transverse-mass Breit–Wigner — fit as a template.
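The transverse mass in formula 2 is conventionally defined as $m_T^2 = 2\,p_T^{\ell}\,p_T^{\text{miss}}\,(1 - \cos\Delta\phi)$, with the neutrino transverse momentum taken from the missing transverse momentum. A minimal sketch (the function name and the example kinematics are illustrative assumptions):

```python
import math

def transverse_mass(pt_lep, phi_lep, pt_miss, phi_miss):
    """m_T = sqrt(2 pT(lepton) pT(miss) (1 - cos(dphi))), in the units of the inputs."""
    dphi = phi_lep - phi_miss
    return math.sqrt(2.0 * pt_lep * pt_miss * (1.0 - math.cos(dphi)))

# W at rest decaying in the transverse plane: lepton and neutrino back to back,
# each carrying half of an assumed W mass of 80.4 GeV.
m_t = transverse_mass(40.2, 0.0, 40.2, math.pi)
print(f"m_T = {m_t:.1f} GeV")
```

For decays that are not confined to the transverse plane, $m_T$ falls below the $W$ mass, which is what produces the Jacobian edge in the distribution.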
Variable. Three nearly equivalent variables, often combined:
- distribution (most robust, less sensitive to recoil)
- Lepton distribution (most sensitive to at the Jacobian peak)
- distribution (least used because of pileup contamination)
Reconstruction.
- Charged lepton — directly from tracker + muon-system (muons) or tracker + ECAL (electrons). Lepton momentum scale is the leading systematic; calibrated against events whose is known to from LEP1.
- — over the entire detector. Requires modeling the "hadronic recoil": every particle in the event that is not the signal lepton.
- Hadronic recoil response — calibrated using events, where the recoil is fully measurable (no missing ). The same calibration is then transferred to the analysis.
- PDF inputs — proton PDFs (NNPDF, MSHT, CT) determine the rapidity / template; PDF uncertainties contribute to the systematic budget.
The measurement is a template fit: simulate for many values of (with all nuisance parameters profiled — calibration scales, PDFs, QED FSR), find the value of that best matches the data histogram.
Result. Per-experiment values (transverse-mass + combination):
| Experiment | Year | $M_W$ [GeV] | Dominant uncertainty / notes |
|---|---|---|---|
| LEP2 combination | ~2013 | | dominated by stats |
| D0 (Tevatron Run II) | 2013 | | calibration-dominated |
| CDF (Tevatron Run II) | 2022 | | calibration-dominated |
| ATLAS (LHC, 7 TeV) | 2018, 2024 | | PDF+calibration |
| LHCb (LHC, 13 TeV) | 2022 | | clean low-pileup |
| CMS (LHC, 13 TeV) | 2024 | | calibration+PDF |
| World average (excluding CDF) | | | |
The CDF result is above the others. Subsequent ATLAS, LHCb, and CMS measurements all agreed with the non-CDF world average; the CDF anomaly remains unresolved as of 2026 and is the single most active open puzzle in EW precision physics.
Pipeline summary.
3.3 Measuring the Higgs boson mass
Theory. The Higgs is produced (Section 3.5) and decays. Two clean channels give an invariant-mass peak over a smooth background:
- : SM BR , but two photons are cleanly reconstructed with energy resolution.
- (): SM BR , but signal-to-background is enormous and resolution on is .
In either case, theory predicts a Breit–Wigner of width (intrinsic) broadened by detector resolution to .
Variable.
- — diphoton invariant mass.
- — four-lepton invariant mass.
Reconstruction.
- Diphoton. Two ECAL clusters above (or so), each isolated from tracks (to reject jets). Photon energy scale calibrated using (electrons treated as photons after dropping the track requirement). .
- Four-lepton. Four well-identified, isolated leptons; reconstruct closest to , the other pair, then . Lepton momentum scale again calibrated on .
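For two massless photons the invariant mass reduces to $m_{\gamma\gamma}^2 = 2 E_1 E_2 (1 - \cos\theta_{12})$, where $\theta_{12}$ is the opening angle. A minimal sketch, assuming photon energies and unit direction vectors have already been reconstructed (names and numbers are illustrative):

```python
import math

def diphoton_mass(e1, dir1, e2, dir2):
    """Invariant mass of two massless particles from energies and unit direction vectors."""
    cos_open = sum(a * b for a, b in zip(dir1, dir2))
    return math.sqrt(2.0 * e1 * e2 * (1.0 - cos_open))

# Hypothetical back-to-back photons of 62.5 GeV each.
mass = diphoton_mass(62.5, (1.0, 0.0, 0.0), 62.5, (-1.0, 0.0, 0.0))
print(f"m_gg = {mass:.1f} GeV")
```

The energy scale enters linearly: a common fractional shift of both photon energies shifts the reconstructed mass by the same fraction, which is why the scale calibration dominates the systematic budget.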
Result. (ATLAS + CMS Run-2 combination). Photon-energy / lepton-momentum scale is the dominant systematic; statistics enter at the same level after Run 2.
Pipeline summary.
3.4 Discovering a new particle: the Higgs in 2012
Theory. Compute expected signal yield for a Higgs of mass in a chosen channel, vs. expected background yield from QCD continuum (estimated from data sidebands or simulation).
Variable. The signal channel's invariant-mass distribution (, , ).
Reconstruction + statistical extraction.
- Reconstruct events passing the analysis selection; histogram them in .
- Define a likelihood ratio: where is the signal-strength modifier (best-fit signal divided by SM expectation) and the likelihood is built from a Poisson product over mass bins.
- The test statistic has a -like distribution under the no-signal null; convert observed to a -value, then to a number of Gaussian-equivalent .
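The conversion from test statistic to significance can be sketched as follows, assuming the asymptotic regime in which $Z = \sqrt{q_0}$ and the one-sided Gaussian tail gives the $p$-value. The single-bin formula is the likelihood-ratio statistic for one Poisson count with a perfectly known background; the yields are hypothetical.

```python
import math

def p_value_from_z(z):
    """One-sided Gaussian tail probability for a significance of z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def q0_counting(n_obs, b):
    """Discovery test statistic for one Poisson bin with known background b."""
    if n_obs <= b:
        return 0.0  # downward fluctuations are not evidence for a signal
    return 2.0 * (n_obs * math.log(n_obs / b) - (n_obs - b))

# Hypothetical single-bin example: 150 events observed on an expected background of 100.
q0 = q0_counting(150.0, 100.0)
z = math.sqrt(q0)
print(f"q0 = {q0:.2f}, Z = {z:.2f} sigma, p = {p_value_from_z(z):.2e}")
print(f"p-value at 5 sigma: {p_value_from_z(5.0):.2e}")
```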
Result. Both ATLAS and CMS reported excesses of about $5\sigma$ near $m_H \approx 125$ GeV on July 4, 2012, combining $H \to \gamma\gamma$ and $H \to ZZ^* \to 4\ell$ (and $H \to WW^*$ as supporting evidence). The "$5\sigma$ discovery threshold" corresponds to a one-sided $p$-value of about $2.9 \times 10^{-7}$ for a single channel, chosen conservatively to account for the look-elsewhere effect (the fact that "a bump anywhere in the allowed mass range" is more likely than "a bump at this specific mass").
Pipeline summary.
3.5 Measuring a cross section: Higgs production at the LHC
Theory. Factorization theorem (see QCD § Factorization):
For (the dominant channel, ), is computed via a top-quark loop. NNLO QCD K-factors are relative to LO; full N3LO (NNNLO) results are now standard. Predicted total at is for .
Variable. Number of observed events:
with the kinematic acceptance (fraction of events passing fiducial cuts) and the reconstruction efficiency.
Reconstruction.
- Luminosity . Measured by van der Meer scans: vertically and horizontally separate the colliding bunches, measure the rate vs. separation, fit a Gaussian, extract the bunch overlap integral. Cross-checked against forward elastic scattering (TOTEM, ALFA at the LHC). Per-experiment uncertainty: .
- Signal count . Bump-fit (Section 3.4) or counting in a signal region after sideband background subtraction.
- Acceptance and efficiency . From signal Monte Carlo, validated against data control regions.
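Putting the three ingredients together, the event-count relation inverts to $\sigma = (N_{\text{obs}} - N_{\text{bkg}}) / (\mathcal{L}\, A\, \varepsilon)$. A minimal sketch with hypothetical inputs; the explicit background term and all numbers are assumptions for illustration, and only the Poisson uncertainty on the observed count is propagated.

```python
import math

def cross_section(n_obs, n_bkg, lumi, acceptance, efficiency):
    """Return (sigma, stat_error) in the inverse units of `lumi`."""
    denom = lumi * acceptance * efficiency
    sigma = (n_obs - n_bkg) / denom
    stat = math.sqrt(n_obs) / denom  # Poisson fluctuation of the observed count
    return sigma, stat

# Hypothetical numbers: 140 fb^-1, 40% acceptance, 50% efficiency.
sigma, stat = cross_section(n_obs=3000, n_bkg=1600, lumi=140.0,
                            acceptance=0.40, efficiency=0.50)
print(f"sigma = {sigma:.1f} +/- {stat:.1f} (stat) fb")
```

The luminosity uncertainty enters as a fully correlated relative uncertainty on the result, which is why ratios of cross sections are often quoted alongside absolute values.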
Result. Typical published format:
equivalently as a signal-strength modifier . All five production modes (ggH, VBF, VH, , ) are now individually measured.
Pipeline summary.
3.6 Measuring a branching ratio: $B_s^0 \to \mu^+\mu^-$
Theory. A FCNC process forbidden at tree level (GIM), proceeding through a -box + -penguin loop. The SM rate is
computed from CKM elements (), top-quark loop function, and the decay constant from lattice QCD.
Variable. Number of reconstructed candidates relative to a normalization channel (typically ):
The normalization channel cancels luminosity, PDFs, and many systematics; is the fragmentation-fraction ratio (separately measured).
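The normalization can be written as $\mathcal{B}_{\text{sig}} = \mathcal{B}_{\text{norm}} \cdot (N_{\text{sig}}/N_{\text{norm}}) \cdot (\varepsilon_{\text{norm}}/\varepsilon_{\text{sig}}) \cdot (f_{\text{norm}}/f_{\text{sig}})$, where $f$ denotes the production (fragmentation) fraction of the parent hadron. A minimal sketch; the function name and all input values are hypothetical and chosen only to exercise the formula.

```python
def branching_ratio(n_sig, n_norm, eff_sig, eff_norm, frag_ratio_norm_over_sig, br_norm):
    """BR(signal) from yields relative to a normalization channel.

    frag_ratio_norm_over_sig is the production-fraction ratio f_norm / f_sig;
    br_norm is the known branching ratio of the normalization channel.
    """
    return br_norm * (n_sig / n_norm) * (eff_norm / eff_sig) * frag_ratio_norm_over_sig

# Hypothetical inputs (not measured values).
br = branching_ratio(n_sig=100, n_norm=2.0e6, eff_sig=0.02, eff_norm=0.01,
                     frag_ratio_norm_over_sig=4.0, br_norm=6.0e-5)
print(f"BR = {br:.2e}")
```

Luminosity and the production cross section never appear: they multiply both yields and cancel in the ratio.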
Reconstruction.
- Two opposite-sign muons forming a vertex displaced from the primary vertex by .
- Vertex isolation cuts to reject background from .
- Multivariate classifier (BDT or neural net) trained on simulation, then transferred to data.
Result. LHCb 2022 + ATLAS + CMS combined:
Consistent with SM at . Constrains BSM models with new scalars or modified -penguins.
Pipeline summary.
3.7 Measuring an asymmetry: $A_{FB}$ at the $Z$-pole
Theory. In near , the differential cross section in (the direction relative to the beam in the c.m.) is
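In the lowest-order parametrization commonly used for this distribution (a sketch that neglects fermion masses and radiative corrections), the angular dependence is

$$
\frac{d\sigma}{d\cos\theta} \propto 1 + \cos^2\theta + \frac{8}{3}\,A_{FB}\,\cos\theta
$$

Integrating over the forward ($\cos\theta > 0$) and backward ($\cos\theta < 0$) hemispheres gives $\sigma_F \propto \tfrac{4}{3}(1 + A_{FB})$ and $\sigma_B \propto \tfrac{4}{3}(1 - A_{FB})$, so the normalized difference $(\sigma_F - \sigma_B)/(\sigma_F + \sigma_B)$ returns $A_{FB}$.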
The asymmetry depends on through the vector and axial couplings. SM predicts at the peak — a small but very clean number.
Variable.
with = events with , = events with .
Reconstruction. Identify events (two opposite-sign isolated muons with near ). Compute from the muon directions. Count.
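The counting step can be sketched as follows, using $A_{FB} = (N_F - N_B)/(N_F + N_B)$ and the binomial statistical uncertainty $\sqrt{(1 - A_{FB}^2)/N}$. The event counts are hypothetical.

```python
import math

def forward_backward_asymmetry(n_forward, n_backward):
    """Return (A_FB, statistical error) from forward/backward event counts."""
    total = n_forward + n_backward
    afb = (n_forward - n_backward) / total
    err = math.sqrt((1.0 - afb ** 2) / total)  # binomial error on the asymmetry
    return afb, err

# Hypothetical counts.
afb, err = forward_backward_asymmetry(50_850, 49_150)
print(f"A_FB = {afb:.4f} +/- {err:.4f}")
```

With $10^5$ events the statistical error is about 0.003, which shows why millions of events per flavor are needed to resolve an asymmetry of order a percent.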
Result. Combined from LEP1 with millions of events per flavor:
The ratio cancels luminosity, efficiency-symmetric reconstruction effects, and most acceptance corrections — that's the point of asymmetries. Combined across all asymmetries, the -pole data pins down .
Pipeline summary.
3.8 Measuring a coupling: Higgs $\kappa_\tau$ from $H \to \tau\tau$
Theory. Higgs couples to fermions via Yukawa . Predicted SM branching for . The coupling modifier
is in the SM by definition.
Variable. Signal-strength modifier , then convert: where depends on the production mode and is the total-width modifier.
Reconstruction.
- Identify -leptons in their decay modes: (one muon), (one electron), (a narrow 1-prong or 3-prong -jet). Six final-state combinations (, ...).
- Reconstruct via the SVfit / collinear-approximation algorithm (the neutrinos add to , which is decomposed back along the directions).
- Multivariate analysis to separate signal from background and from QCD multijet fakes.
Result. ATLAS + CMS combination: — consistent with SM () at .
Similar fits for — all consistent with 1. Predicted couplings to first-generation fermions () are below current LHC sensitivity; HL-LHC and future colliders are needed.
Pipeline summary.
4. Common Threads
The eight case studies above span very different physical observables — masses, cross sections, branching ratios, asymmetries, discovery significance, coupling modifiers — but the measurement structure is essentially the same:
| Element | Universal role |
|---|---|
| Differential prediction | Always either a cross section or a rate; the QFT-side output |
| Histogram in some | The bridge between theory and data; everything fits a histogram in the end |
| Template fit / counting | Find the SM parameter values (or BSM contribution) that match the observed histogram |
| Calibration anchor | Every measurement borrows accuracy from a previously-measured reference: for energy scale, for vertex resolution, for hadronic recoil, van der Meer for luminosity |
| Profile-likelihood fit | The standard statistical machinery; treats systematics as nuisance parameters and profiles them out (maximizes the likelihood over them at each value of the parameter of interest) |
| Quoted uncertainty | (stat) ⊕ (calibration / detector systematics) ⊕ (theory / PDF / higher-order) |
A cross-cutting consequence: no LHC measurement is purely "data-driven". All of them depend on (a) Monte-Carlo simulation of signal + background, (b) NLO/NNLO QCD predictions, (c) PDF parametrizations, (d) parton-shower modeling, and (e) detector calibration anchors. The error budget always has a non-trivial "theory" component, and progress on the theory side (e.g. better PDFs, higher-order QCD calculations) reduces published uncertainties even without new data.
4.1 Direct vs. inferred observables
A second cross-cutting point: almost nothing in published collider results is "directly observed". The detector records a small universal set of primitives (track hits, calorimeter energy deposits, magnetic-field curvature, timing, polarization), and everything else — cross sections, masses, branching ratios, couplings, , CKM elements — is a fit-extracted parameter obtained by inferring the value that makes a theoretical model match a histogram of those primitives.
The full taxonomy — direct primitives, the per-class list of inferred observables (cross sections, masses, couplings, vertex form factors, oscillation parameters, composite global fits), the inference-distance spectrum (short for and , long for and ), and the renormalization-scheme wrinkle (the "which mass?" question) — lives in direct-vs-inferred.md. The short version of the chain:
The only left-end node is data; everything to its right is model + inference.
5. Where Each Measurement Lives in This Repo
| Quantity | Observable doc | Theory doc |
|---|---|---|
| , , | electroweak.md §1–§2 | electroweak/from-postulates.md §3.2 |
| , Higgs couplings | electroweak.md §5 | electroweak/from-postulates.md §3 |
| CKM elements | electroweak.md §4 | electroweak/from-postulates.md §3.4 |
| FCNC rare decays | electroweak.md §4.3 | standard-model/from-postulates.md §B GIM |
| Cross sections, master formula | cross-sections.md | (general) |
| Decay rates, master formula | decay-rates.md | (general) |
| Global EW fit, UT triangle | standard-model.md | standard-model/from-postulates.md |
6. See Also
- Cross Sections — master formula, flux factor, phase space.
- Decay Rates — master formula, worked muon-lifetime example.
- Electroweak Observables — full inventory of what is measured at EW experiments.
- Standard Model Observables — cross-sector observables (UT triangle, global EW fit, GIM, lepton universality).
- Observables — General Map — the structural classification (Class A/B/C) all of the above instantiate.
- QCD § Factorization — the PDF × partonic-cross-section structure used by all hadron-collider measurements.
- QED/compton.md — the cleanest worked theory calculation in this repo, for comparison with the experimental extraction side covered here.
Electroweak Observables
The set of experimentally measured quantities that test the electroweak theory. This page is the inventory; the theoretical machinery used to compute each entry lives in cross-sections.md, decay-rates.md, and the general observables map, specialized to the field content of electroweak theory.
Each observable is tagged with overlap status with QED, QCD, and the Standard Model:
| Tag | Meaning |
|---|---|
| [EW-exclusive] | No analogue in QED or QCD alone — relies on or chiral structure |
| [QED-shared] | QED contribution at low energy; EW contribution required at the -pole or for sensitivity |
| [QCD-shared] | EW vertex but hadronic input from QCD (form factors, PDFs, decay constants) |
| [SM-cross-sector] | Tests the combined QCD + EW structure; lives in standard-model.md |
1. Gauge-Boson Properties [EW-exclusive]
The masses and widths of the $W$ and $Z$ are the headline EW observables: they directly fix the gauge couplings and the EW scale $v$.
1.1 Boson masses
| Observable | Value | Where measured |
|---|---|---|
| LEP1 (1989–95, line-shape scan) | ||
| (PDG 2024 world avg.) | LEP2, Tevatron (CDF/D0), LHC (ATLAS, CMS, LHCb) | |
| LEP1 line-shape | ||
| LEP2, Tevatron, LHC |
How these masses are actually measured. comes from a Breit–Wigner line-shape scan at LEP1 ( beam energy stepped through the resonance, beam energy calibrated to via resonant depolarization). is much harder because every decay involves a neutrino — it is extracted from the transverse-mass distribution at hadron colliders, using events as a lepton-momentum calibration anchor. Full pipelines, including the CDF anomaly, are walked through in collider-measurements.md §3.1–3.2.
CDF tension. In 2022 the CDF collaboration reported , above the world average. Subsequent ATLAS and CMS measurements have agreed with the world average; the CDF result remains an unresolved outlier as of 2026. Either the CDF measurement is affected by an unidentified systematic effect, or the other experiments share a common bias, or this is the cleanest hint of BSM physics in the EW sector.
1.2 $Z$-pole line shape and partial widths
At the $Z$ pole the cross section is a pure resonance:
Here $s$ is the Mandelstam invariant, the squared total CM energy, equal to $4E_{\text{beam}}^2$ at a symmetric collider like LEP. The resonance peaks at $s = M_Z^2$, i.e. $\sqrt{s} = M_Z$. (See cross-sections.md § Mandelstam variables for the general kinematics.)
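How a line-shape scan encodes the mass and width can be illustrated numerically. The sketch below assumes a fixed-width relativistic Breit–Wigner with arbitrary normalization and approximate values for the mass and width; it ignores initial-state radiation, which substantially distorts the real LEP line shape.

```python
M_Z = 91.1876     # Z mass in GeV (approximate)
GAMMA_Z = 2.4952  # Z total width in GeV (approximate)

def line_shape(sqrt_s):
    """Fixed-width relativistic Breit-Wigner shape with arbitrary normalization."""
    s = sqrt_s ** 2
    return s * GAMMA_Z ** 2 / ((s - M_Z ** 2) ** 2 + M_Z ** 2 * GAMMA_Z ** 2)

# Scan sqrt(s) from 88 to 94 GeV in 1 MeV steps.
STEP = 0.001
energies = [88.0 + STEP * i for i in range(6001)]
values = [line_shape(e) for e in energies]

peak_value = max(values)
peak_energy = energies[values.index(peak_value)]
above_half = [e for e, v in zip(energies, values) if v >= 0.5 * peak_value]
fwhm = above_half[-1] - above_half[0]
print(f"peak at {peak_energy:.3f} GeV, FWHM = {fwhm:.3f} GeV")
```

In this parametrization the maximum sits slightly above $M_Z$ because of the factor of $s$ in the numerator, a reminder that the fitted mass depends on the line-shape convention.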
Fitting the line shape across at LEP1 pinned down:
-
All hadronic and leptonic partial widths to .
-
The invisible width — equals — giving
This is the measurement that pins down the number of light ($m_\nu < M_Z/2$) neutrino flavors. It rules out a fourth SM generation with a light neutrino.
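The neutrino-counting arithmetic can be sketched as $N_\nu = \Gamma_{\text{inv}} / \Gamma_{\nu\bar\nu}^{\text{SM}}$ with $\Gamma_{\text{inv}} = \Gamma_Z - \Gamma_{\text{had}} - 3\,\Gamma_{\ell\ell}$. The widths below are approximate LEP-era values used as illustrative inputs, and the published determination works with ratios of widths to cancel common uncertainties, which this simplified version does not do.

```python
# Approximate LEP-era widths in MeV (illustrative inputs; consult the PDG for current values).
GAMMA_Z = 2495.2        # total width
GAMMA_HAD = 1744.4      # hadronic partial width
GAMMA_LEPTON = 83.98    # partial width per charged-lepton flavor
GAMMA_NU_SM = 167.2     # SM prediction per neutrino flavor

gamma_invisible = GAMMA_Z - GAMMA_HAD - 3.0 * GAMMA_LEPTON
n_nu = gamma_invisible / GAMMA_NU_SM
print(f"invisible width = {gamma_invisible:.1f} MeV, N_nu = {n_nu:.2f}")
```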
1.3 $W$ branching ratios
| Channel | BR (SM tree-level) | Comment |
|---|---|---|
| corrections small | ||
| same | Lepton universality test: ratios = 1 to | |
| same | Same | |
| via CKM; the factor of 3 from color |
Lepton universality is enforced by construction in the SM (all three generations have identical EW couplings); the measured BR ratios are universality tests at .
2. $Z$-Pole Asymmetries and $\sin^2\theta_W$ [EW-exclusive]
Parity-violating asymmetries at the $Z$ pole are the cleanest way to measure the Weinberg angle. They expose the unequal vector and axial-vector couplings of the $Z$ to fermions.
| Observable | Definition | Best measurement |
|---|---|---|
| Forward–backward | for near -pole | LEP1, all |
| Left–right | with polarized beam | SLC (SLAC, precision) |
| polarization | helicity asymmetry of produced in | LEP1 |
| quark-flavor-tagged FB asymmetries | LEP1 |
These all reduce to one underlying parameter, the effective leptonic weak mixing angle:
This single number, combined with and , constrains the entire EW radiative-correction structure — including (historically) predicting before its 1995 discovery and before 2012.
2.1 The $\rho$ parameter
At tree level (custodial symmetry of the Higgs sector). Measured . BSM physics with isospin-breaking would shift — encoded in the Peskin–Takeuchi oblique parameters :
| Parameter | Probes | Current bound |
|---|---|---|
| Custodial-isospin breaking | ||
| Non-universal new physics in gauge-boson self-energies | ||
| doublet symmetry |
Most BSM extensions (Two-Higgs-Doublet Models, technicolor, extra , heavy fourth-generation fermions) shift one or more of by an amount the data already excludes.
3. Fermi Constant and Low-Energy Charged Currents [Mostly QCD-shared]
The most precisely measured electroweak quantity:
extracted from the muon lifetime via the purely leptonic decay :
[EW-exclusive] — no hadronic input needed.
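The extraction can be sketched with the tree-level relation $1/\tau_\mu = G_F^2 m_\mu^5 / (192\pi^3)$. This version deliberately omits the electron-mass phase-space factor and the QED radiative corrections, so it lands a few per mille below the quoted value; the constants are approximate and the function name is illustrative.

```python
import math

HBAR_GEV_S = 6.582119569e-25    # hbar in GeV s
MUON_MASS_GEV = 0.1056583755    # muon mass in GeV
MUON_LIFETIME_S = 2.1969811e-6  # measured muon lifetime in s

def fermi_constant_tree_level(lifetime_s, mass_gev):
    """G_F in GeV^-2 from Gamma = G_F^2 m^5 / (192 pi^3), with no corrections applied."""
    width_gev = HBAR_GEV_S / lifetime_s
    return math.sqrt(192.0 * math.pi ** 3 * width_gev / mass_gev ** 5)

g_f = fermi_constant_tree_level(MUON_LIFETIME_S, MUON_MASS_GEV)
print(f"G_F (tree level) = {g_f:.4e} GeV^-2")
```

The fifth power of the muon mass means the mass must be known five times better, in relative terms, than the target precision on the rate.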
Other low-energy CC observables that do require QCD form factors:
| Observable | EW input | QCD input |
|---|---|---|
| Neutron lifetime | , | (axial coupling, lattice or low-energy fit) |
| super-allowed nuclear -decays | Nuclear structure corrections (Vud-dominant) | |
| (lattice) | ||
| (lattice) | ||
| (lattice) |
The CKM matrix elements (Section 4) are extracted by combining these EW + QCD ingredients.
3.1 Lepton-universality ratios
Ratios cancel hadronic form factors and isolate pure EW structure:
agrees with SM to — a stringent universality test.
3.2 Parity-violating atomic / electron physics [QED-shared]
The -exchange weak charge of a nucleus produces a parity-violating energy shift in heavy atoms:
| System | Probes | Best result |
|---|---|---|
| Cesium-133 () | SM: ; consistent at | |
| Møller scattering at SLAC (E158) / JLab (MOLLER) | Tests running of from down | |
| Proton weak charge ( at JLab) | Tests SM at the level |
These low-energy parity-violation measurements probe BSM scales of , complementing direct LHC searches.
4. Flavor Physics: CKM and CP Violation [Heavily QCD-shared]
The CKM matrix has 4 physical parameters (3 angles + 1 phase). They are overdetermined by many independent measurements that all must agree.
4.1 Magnitudes
| Element | Best determination | Value | QCD input |
|---|---|---|---|
| Super-allowed nuclear -decay | Nuclear structure | ||
| form factor, | , form factor (lattice) | ||
| , neutrino DIS | Form factors (lattice) | ||
| , | Form factors (lattice) | ||
| (excl.), (incl.) | Form factors / OPE matrix elements | ||
| (excl.), (incl.) | Same | ||
| – oscillation | Bag parameter | ||
| – oscillation | Bag parameter | ||
| branching | at 95% CL | (clean — single-top production) |
A long-standing tension between exclusive and inclusive determinations () is one of the unresolved puzzles of flavor physics; it remains as of 2026.
4.2 CP-violating observables
| Observable | Physical meaning | SM prediction |
|---|---|---|
| Indirect CP violation in mixing | Consistent within | |
| Direct CP violation in | Consistent (large lattice uncertainty) | |
| from | Mixing-induced CP violation (interference of mixing and decay) | consistent with global UT fit |
| from | CKM angle, theoretically cleanest | Consistent |
| from | CP phase in mixing | Consistent with tiny SM value |
The unitarity triangle — when plotted in the complex plane — must close. All current measurements close it to accuracy. This is the central SM flavor test, sensitive to a wide range of BSM models. See standard-model.md § Unitarity-Triangle Fit for the consolidated cross-sector view.
4.3 Rare FCNC decays [Pure quantum loop probes]
In the SM, flavor-changing neutral currents are forbidden at tree level by the GIM mechanism (see standard-model.md § GIM). The measured rates probe loop physics directly.
| Decay | SM BR | Measured |
|---|---|---|
| (LHCb 2022) — ✅ | ||
| — consistent | ||
| , | computed | LFU ratios — once anomalous (2014–22), now consistent with SM (LHCb 2023) |
| (NA62) — consistent | ||
| in SM with | (MEG) — any observation = BSM | |
| conversion | in SM | (SINDRUM-II); Mu2e/COMET aim for |
Lepton-flavor violation in charged leptons would be a smoking gun for BSM physics. SM predicts essentially zero; current experimental sensitivity is , future — enormous discovery potential.
5. Higgs Sector Observables [Mixed: QCD-shared production, mostly EW-exclusive couplings]
After 2012 the Higgs sector is its own subfield.
5.1 Mass
Combined with this fixes in the Higgs potential to .
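The arithmetic can be sketched as follows, assuming the convention $V = -\mu^2 |\phi|^2 + \lambda |\phi|^4$, for which $m_H^2 = 2\lambda v^2$, and taking $v = (\sqrt{2}\, G_F)^{-1/2}$. Other normalizations of the potential change $\lambda$ by a fixed factor; the input values are approximate.

```python
import math

G_F = 1.1663787e-5   # Fermi constant in GeV^-2
M_H = 125.25         # Higgs mass in GeV (approximate world average)

v = (math.sqrt(2.0) * G_F) ** -0.5   # electroweak vacuum expectation value in GeV
lam = M_H ** 2 / (2.0 * v ** 2)      # quartic coupling in the m_H^2 = 2 lambda v^2 convention
print(f"v = {v:.2f} GeV, lambda = {lam:.3f}")
```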
5.2 Production cross sections [QCD-shared]
ATLAS/CMS resolve five production modes, each probing different couplings:
| Mode | Diagram | Fraction at | Probes |
|---|---|---|---|
| Gluon fusion (ggH) | top loop | top Yukawa (and any BSM colored states) | |
| Vector-boson fusion (VBF) | via | couplings, tagging signature | |
| Associated | etc. | couplings | |
| Direct measurement | |||
| (small, hard) |
Heavy QCD K-factors ( for ggH) — Higgs production calculations require matching to be percent-level.
5.3 Branching ratios
| Channel | BR (SM) | Measured |
|---|---|---|
| ✓ | ||
| ✓ | ||
| inferred | ||
| ✓ | ||
| first evidence 2022, still poorly measured | ||
| "golden channel": final state | ||
| "diphoton" — original 2012 discovery channel | ||
| first evidence 2023, currently above SM | ||
| evidence (2020); the smallest measured Higgs coupling |
5.4 Coupling modifiers
A common parameterization defines for each Higgs vertex. Current LHC measurements:
All consistent with (pure SM). Total Higgs width — also consistent with measurements.
5.5 CP and spin
Angular analyses of and confirm to high confidence; large CP-violating Higgs–top coupling already excluded.
5.6 Higgs self-couplings
enters quadratically into di-Higgs production:
- Current bounds: at 95% CL (HL-LHC projection: precision on ).
- FCC-hh would reach precision on .
This is the least well-tested part of the SM Higgs sector and the cleanest probe of the shape of the Higgs potential — relevant to the cosmological electroweak phase transition and the universe's matter–antimatter asymmetry.
6. Anomalous Moments and Precision Tests [QED-shared]
The anomalous magnetic moments are predominantly QED observables but receive small required EW contributions.
| Observable | Contribution from EW | Significance |
|---|---|---|
| Far below experimental sensitivity | ||
| Required for SM consistency at | ||
| Not yet measured |
The muon anomaly (Fermilab E989, results 2021–2023) shows a excess over SM theory if the data-driven hadronic vacuum polarization is correct. Lattice QCD (BMW 2020 and subsequent reproductions) gives a different hadronic value that brings the SM closer to experiment. As of 2026 the situation remains contested: either (a) the dispersive, data-driven hadronic VP is right and there is BSM physics, or (b) the lattice value is right and the dispersive analysis has a subtle issue.
6.1 Electric dipole moments [BSM-sensitive]
SM-CKM contribution is many orders of magnitude below current experimental sensitivity. Any positive electron-EDM measurement = BSM CP violation.
7. Neutrino Sector Observables [EW-exclusive, BSM-driven]
Neutrinos feel only the weak force (and gravity), so neutrino observables are 100% EW. Most current results require physics beyond the renormalizable SM (neutrino masses).
| Observable | Source | Status |
|---|---|---|
| Atmospheric mass-squared diff. | Super-K, T2K, NOvA, IceCube | Measured; sign (NH vs IH) still unresolved |
| Solar mass-squared diff. | SNO, KamLAND, Super-K | Measured |
| PMNS mixing angles | Combined oscillation data | All large (unlike CKM); measured 2012 |
| Leptonic CP phase | T2K + NOvA combined | Hints of ; not yet |
| Sum of neutrino masses | Cosmology (CMB + LSS) | (Planck 2018 + BAO) |
| Neutrinoless double- decay | KamLAND-Zen, GERDA, LEGEND | (depending on nuclear matrix element); positive observation would prove Majorana nature |
Within the renormalizable SM neutrinos are massless. All of the above are evidence for BSM physics — either right-handed neutrinos (Dirac masses) or the Weinberg dim-5 operator (Majorana masses). See electroweak Caveats.
8. Computational Pipelines
The structural map of how each observable connects back to QFT machinery:
| Observable type | Method |
|---|---|
| Gauge-boson masses & widths | Tree-level + radiative corrections in gauge → cross-sections.md, decay-rates.md |
| -pole asymmetries | Differential cross sections at ; ratios cancel normalizations → observables/README.md § 2.4 |
| from | Decay rate of three-body final state → decay-rates.md |
| CKM extraction | EW vertex + QCD form factor (lattice) — pair-wise → master formulas of cross-sections.md, decay-rates.md |
| Higgs cross sections | Multi-loop QCD + EW; production matched to NNLO+NNLL accuracy → general observables/README.md § 2.8 |
| Higgs branching ratios | Decay-rate master formula × spectrum of channels → decay-rates.md |
| Anomalous moments | Vertex function → observables/README.md § 2.7 |
| Neutrino oscillations | Quantum-mechanical superposition of mass eigenstates; requires BSM neutrino mass term |
9. See Also
- Electroweak Theory — the underlying theory whose observables are listed here.
- Standard Model Observables — cross-sector observables (CKM unitarity, GIM, anomaly fits, lepton universality, global EW fit).
- Observables — General Map — the structural classification (Class A/B/C) and inventory of all QFT observables.
- Cross Sections, Decay Rates — master formulas behind most observables above.
- The Standard Model — how the SM combines EW + QCD; defines the global parameter space.
Standard Model Observables
This document collects observables that test the Standard Model as a whole — i.e. ones that probe the combined structure of QCD + electroweak rather than either sector in isolation. Observables that live entirely inside one sector are owned by:
- QED — anomalous magnetic moments, Lamb shift, hydrogenic atoms, Compton/Bhabha/Møller scattering.
- QCD — running, jet rates, deep inelastic structure functions, hadron spectroscopy.
- Electroweak Observables — masses & widths, , , individual CKM elements, Higgs sector, neutrino oscillations.
Each observable here requires simultaneous use of EW and QCD machinery and is what makes "the Standard Model" a single predictive theory rather than three independent ones.
1. The CKM Unitarity-Triangle Fit
The CKM matrix has 4 physical parameters but many more independent measurements (Section 4 of electroweak.md). The SM requires all of them to fit a single point in the $(\bar\rho, \bar\eta)$ plane.
1.1 The triangle
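Pick the $d$–$b$ unitarity relation (orthogonality of the $d$ and $b$ columns of the CKM matrix):
$$V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0 .$$
Plotting these three terms in the complex plane gives a closed triangle with vertices, sides, and angles all measurable from independent experiments.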
| Quantity | What measures it | Status |
|---|---|---|
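| Side $\lvert V_{ub}/V_{cb}\rvert$ | Semileptonic $B$ decays + lattice form factors | Persistent inclusive/exclusive tension |
| Side $\lvert V_{td}/V_{ts}\rvert$ | $\Delta m_d/\Delta m_s$ from $B_d$ and $B_s$ oscillations + lattice ratio | Clean (ratios cancel hadronic uncertainties) |
| Angle $\beta$ | $\sin 2\beta$ from $B^0 \to J/\psi\,K_S$ | |
| Angle $\alpha$ | $B \to \pi\pi,\ \rho\rho$ time-dependent CP | |
| Angle $\gamma$ | $B \to DK$ tree-level interference | Theoretically cleanest |
| $\epsilon_K$ (indirect CP violation) | Box diagrams in $K^0$–$\bar K^0$ mixing + lattice $B_K$ | Constrains $(\bar\rho, \bar\eta)$ |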
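The closure of the triangle is a direct consequence of unitarity, which can be checked numerically. The sketch below assumes the standard three-angle, one-phase parametrization of a unitary $3\times 3$ matrix; the function names and the numerical angles are illustrative round numbers chosen for this example, not fit results.

```python
import cmath
import math


def ckm_standard(theta12, theta13, theta23, delta):
    """Unitary 3x3 matrix in the standard three-angle, one-phase parametrization."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    phase = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / phase],
        [-s12 * c23 - c12 * s23 * s13 * phase,
         c12 * c23 - s12 * s23 * s13 * phase,
         s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * phase,
         -c12 * s23 - s12 * c23 * s13 * phase,
         c23 * c13],
    ]


def unitarity_triangle(V):
    """Return the three complex sides and the angles (alpha, beta, gamma)."""
    d, b = 0, 2  # column indices of the d and b quarks
    side_u = V[0][d] * V[0][b].conjugate()  # V_ud V_ub^*
    side_c = V[1][d] * V[1][b].conjugate()  # V_cd V_cb^*
    side_t = V[2][d] * V[2][b].conjugate()  # V_td V_tb^*
    alpha = cmath.phase(-side_t / side_u)
    beta = cmath.phase(-side_c / side_t)
    gamma = cmath.phase(-side_u / side_c)
    return (side_u, side_c, side_t), (alpha, beta, gamma)


# Illustrative (hypothetical) mixing angles and phase, in radians.
V = ckm_standard(0.227, 0.0037, 0.042, 1.2)
sides, angles = unitarity_triangle(V)
closure = sum(sides)  # vanishes for any unitary matrix
```

For any unitary input the three sides sum to zero and the three angles sum to $\pi$, so a measured set of sides and angles that fails either check cannot come from a single $3\times 3$ unitary matrix.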
1.2 Global fit and BSM constraints
All measurements above are compared against a single overconstrained fit (CKMfitter, UTfit collaborations). The triangle closes to accuracy, with a single global minimum. Any one measurement that disagreed with the others would be a smoking gun for BSM physics, since adding new particles to the loops contributing to etc. would shift them differently.
Current state (2026): triangle closes; all individual measurements within of the global best fit. This is the cleanest test of the SM flavor structure, and rules out generic BSM physics at scales below for many operator structures.
2. The Global Electroweak Fit
The SM is overconstrained: given a small set of input parameters (e.g. $\alpha$, $G_F$, $M_Z$ in the "EW input scheme") all other EW observables are predictions with computable radiative corrections.
2.1 Inputs vs. predictions
| Role | Observable | Value |
|---|---|---|
| Input | ||
| Input | ||
| Input | ||
| Predicted | SM: vs. exp. | |
| Predicted | SM: vs. exp. | |
| Predicted | SM: vs. exp. | |
| Predicted | SM matches measured at |
2.2 Predicting $m_t$ and $m_H$ before discovery
Radiative corrections to EW observables depend on $m_t$ and $m_H$. Using LEP/SLC precision data, the global EW fit predicted:
- (1995, before Tevatron discovery at ).
- at 95% CL (2011, before LHC discovery at ).
Both confirmed. As of 2026 the global fit shows no significant tension between predictions and direct measurements — the CDF $M_W$ anomaly (if real) is the only outlier.
2.3 Why this is SM-level, not EW-alone
The fit requires QCD inputs:
- $\alpha_s(M_Z)$ (from QCD observables) enters in hadronic widths and QCD corrections.
- Hadronic vacuum polarization $\Delta\alpha_{\text{had}}(M_Z)$ — computed via dispersive integrals over $e^+e^- \to \text{hadrons}$ data or lattice QCD — is the dominant input uncertainty.
So the global EW fit is implicitly a global SM fit at the EW scale; pure-EW input is insufficient.
3. GIM Mechanism and FCNC Suppression
In the SM, flavor-changing neutral currents (FCNCs) are absent at tree level. The reason — the Glashow–Iliopoulos–Maiani (GIM) mechanism — requires simultaneous use of all generations + the structure of CKM:
At loop level, FCNCs are generated but GIM-suppressed: the amplitude is proportional to mass-squared differences such as $(m_c^2 - m_u^2)/M_W^2$ or $(m_t^2 - m_c^2)/M_W^2$. Without a heavy charm quark, $K_L \to \mu^+\mu^-$ would proceed too fast — historically predicting the charm quark before its 1974 discovery.
| Observable | SM rate | Sensitivity |
|---|---|---|
| Discovery of GIM-required charm | ||
| Top-quark loops dominant | ||
| Theoretically cleanest FCNC | ||
| inclusive rate | NLO-NNLO QCD + EW |
Suppression of FCNCs is a cross-sector SM prediction: it works because the up-type and down-type quark masses are both small (compared to $M_W$) and because CKM is unitary. Either ingredient alone would not give the observed FCNC structure.
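The tree-level cancellation and the loop-level mass dependence can both be seen in a two-generation toy model. The sketch below assumes a single Cabibbo angle and keeps only the leading term of the loop function in the mass expansion; the function names and numerical inputs are hypothetical and chosen only for illustration.

```python
import math


def tree_level_ds_coupling(theta_c, with_charm=True):
    """Coefficient of the flavor-changing (d-bar s) term in the neutral current."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    # u pairs with d' = c*d + s*s, so (d'-bar d') contains +c*s (d-bar s).
    coeff = c * s
    if with_charm:
        # c pairs with s' = -s*d + c*s, so (s'-bar s') contains -c*s (d-bar s).
        coeff += -s * c
    return coeff


def loop_gim_factor(theta_c, m_u, m_c, m_w):
    """Leading GIM factor of a loop-induced s -> d amplitude (toy model)."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    # Unitarity gives lambda_u = -lambda_c, so only the mass difference survives.
    return c * s * (m_c**2 - m_u**2) / m_w**2


THETA_C = 0.227  # illustrative Cabibbo angle in radians
no_charm = tree_level_ds_coupling(THETA_C, with_charm=False)
with_charm = tree_level_ds_coupling(THETA_C, with_charm=True)
suppressed = loop_gim_factor(THETA_C, m_u=0.002, m_c=1.27, m_w=80.4)  # masses in GeV
```

Without the charm quark the $\bar d s$ neutral-current coupling is of order $\sin\theta_C\cos\theta_C$; with it the coupling vanishes exactly, and the loop remnant scales with the mass-squared difference over $M_W^2$.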
4. Anomaly-Cancellation Verification
The hypercharge assignments in the SM (standard-model.md § Cross-Sector Content) are tuned so that all gauge anomalies cancel per generation:
- $SU(3)^3$ — automatically zero (vector-like in color).
- $SU(2)^3$ — automatically zero (SU(2) reps are self-conjugate).
- $SU(2)^2\,U(1)_Y$ — requires $\sum_{\text{doublets}} Y = 0$, i.e. $3Y_Q + Y_L = 0$. Quark factor of 3 from color exactly cancels lepton contribution.
- — requires per generation.
- — requires .
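The hypercharge conditions can be checked with exact rational arithmetic. The sketch below assumes the normalization $Q = T_3 + Y$ and writes every field of one generation as a left-handed Weyl fermion (right-handed fields enter as their conjugates); this convention and all names in the code are assumptions of the example, and the cancellation itself does not depend on the overall normalization of $Y$.

```python
from fractions import Fraction as F

# (name, color multiplicity, SU(2) multiplicity, hypercharge Y)
# Assumed convention: Q = T3 + Y, all fields written as left-handed Weyl fermions.
GENERATION = [
    ("Q_L",   3, 2, F(1, 6)),
    ("u_R^c", 3, 1, F(-2, 3)),
    ("d_R^c", 3, 1, F(1, 3)),
    ("L_L",   1, 2, F(-1, 2)),
    ("e_R^c", 1, 1, F(1)),
]


def anomaly_coefficients(fields):
    """Hypercharge-dependent anomaly coefficients for a list of Weyl fermions."""
    return {
        # Only colored fields contribute; each counts once per SU(2) component.
        "SU(3)^2 U(1)": sum(n2 * y for _, n3, n2, y in fields if n3 == 3),
        # Only doublets contribute; each counts once per color.
        "SU(2)^2 U(1)": sum(n3 * y for _, n3, n2, y in fields if n2 == 2),
        "U(1)^3": sum(n3 * n2 * y**3 for _, n3, n2, y in fields),
        "grav^2 U(1)": sum(n3 * n2 * y for _, n3, n2, y in fields),
    }


full = anomaly_coefficients(GENERATION)
leptons_only = anomaly_coefficients([f for f in GENERATION if f[1] == 1])
```

With the full generation every coefficient is exactly zero; dropping the quarks leaves nonzero values, which is the sense in which the color factor of 3 cancels the lepton contribution.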
Indirect experimental verification: the consistency of EW measurements at the loop level (where uncanceled anomalies would produce uncontrolled divergences) is itself the test. The SM passes; ad-hoc extensions adding fermions that don't cancel anomalies are immediately ruled out at the radiative-correction level.
The deeper hint. Color factor of 3 + matching hypercharges is the strongest signal inside the SM that quarks and leptons belong in larger multiplets — the motivation for $SU(5)$ and $SO(10)$ GUTs, where one generation fits into a $\bar{5} \oplus 10$ of $SU(5)$ or a single $16$ of $SO(10)$.
5. Lepton Universality
The SM postulates identical gauge quantum numbers for all three lepton generations. Non-trivially, this must show up across many observables that test electrons / muons / taus separately.
| Observable ratio | SM value | Measured | Status |
|---|---|---|---|
| ✓ | |||
| ✓ | |||
| ✓ | |||
| matches to | ✓ | ||
| (LHCb 2023) | ✓ (previous "anomaly" resolved) | ||
| (HFLAV 2023) | tension persists |
The persistent tension in semileptonic $B \to D^{(*)}\tau\nu$ vs. $B \to D^{(*)}\ell\nu$ is the most active cross-sector lepton-universality probe as of 2026 — combining EW vertices, third-generation hadronic form factors (QCD), and the puzzle of why $\tau$ would behave differently from lighter leptons.
6. Cosmological / Astrophysical SM Observables
A few SM observables come from cosmology rather than colliders:
| Observable | Sector | Constraint |
|---|---|---|
| Big-Bang Nucleosynthesis (BBN) light-element abundances ($^4$He, D, $^7$Li) | EW + QCD + cosmology | Number of relativistic species $N_{\text{eff}} \approx 3$ — agrees with 3 SM neutrinos |
| CMB anisotropy | All sectors at high | |
| Baryon-to-photon ratio $\eta_B$ | Requires $B$ violation + $C$/$CP$ violation + out-of-equilibrium dynamics (Sakharov 1967) | SM has all three in principle, but CKM CP violation is too small and EW phase transition is smooth — evidence for BSM |
| Dark matter abundance | None in SM | SM has no viable candidate |
| Dark energy / | Vacuum energy of SM diverges absurdly | Cosmological constant problem |
7. Outlook: What the SM Cannot Predict
The SM passes every laboratory test described above. Its failures are entirely above the lab scale:
- No graviton; quantum gravity is not part of the SM.
- No dark matter candidate.
- No mechanism for the observed matter–antimatter asymmetry.
- No explanation for the hierarchical Yukawa pattern.
- No solution to the strong CP problem.
- Neutrino masses (now established) require an extension of the minimal renormalizable SM: either right-handed neutrinos (Dirac masses) or the dim-5 Weinberg operator $(LH)(LH)/\Lambda$ (Majorana masses).
Detailed treatment: standard-model § Caveats and Open Issues.
8. See Also
- The Standard Model — the underlying theory.
- Electroweak Observables — the larger inventory of EW (single-sector) measurements.
- Electroweak, QCD, QED — the constituent gauge theories.
- Observables — General Map — the structural classification (Class A/B/C) that all of the above instantiate.
- Cross Sections, Decay Rates — master formulas behind most observables here.
Remarks and Open Issues
A collection of foundational results, caveats, and alternative formulations that complement the Wightman-style postulates of QFT.
Wightman Reconstruction Theorem
From a set of vacuum expectation values
$$W_n(x_1, \dots, x_n) = \langle\Omega|\,\phi(x_1)\cdots\phi(x_n)\,|\Omega\rangle$$
satisfying Poincaré covariance, the spectrum condition, locality, hermiticity, and positivity, one can reconstruct the entire QFT — Hilbert space, fields, and vacuum — uniquely up to unitary equivalence. This makes the Wightman functions the fundamental data of a Wightman QFT.
Haag's Theorem
The interaction picture, used heuristically in textbook perturbation theory, does not strictly exist in interacting QFT: the free and interacting fields cannot act on the same Hilbert space (they are unitarily inequivalent). Perturbative QFT must therefore be understood as a formal expansion, justified by renormalization, rather than as a strict consequence of the Wightman axioms.
Gauge Theories
For theories with local gauge symmetry (e.g. QED, QCD) the postulates above must be supplemented with a gauge-fixing procedure — Gupta–Bleuler, Faddeev–Popov, BRST quantization, etc. The physical Hilbert space is then a quotient or subspace of the full state space, defined by gauge-invariance conditions.
In particular, gauge fields like the photon do not satisfy strict Wightman positivity in covariant gauges (unphysical polarizations have negative norm); they only become consistent on the physical subspace.
Status of Rigorous Construction
No interacting QFT in four spacetime dimensions has been rigorously shown to satisfy all Wightman axioms. Constructive QFT has succeeded in lower dimensions:
- $d = 2$: $P(\phi)_2$, sine-Gordon, Thirring, ...
- $d = 3$: $\phi^4_3$, Yukawa, ...
The four-dimensional case — including QED, QCD, and the entire Standard Model — remains an open problem. Establishing the existence of a non-trivial Yang–Mills theory in $d = 4$ with a mass gap is one of the Millennium Prize Problems.
Algebraic QFT (Haag–Kastler)
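An alternative axiomatic framework takes the primary objects to be local algebras of observables $\mathcal{A}(\mathcal{O})$ associated with bounded spacetime regions $\mathcal{O}$, rather than fields. The postulates are then phrased as conditions on the net of algebras $\mathcal{O} \mapsto \mathcal{A}(\mathcal{O})$:
- Isotony: $\mathcal{O}_1 \subset \mathcal{O}_2 \Rightarrow \mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$.
- Locality: $\mathcal{A}(\mathcal{O}_1)$ and $\mathcal{A}(\mathcal{O}_2)$ commute when $\mathcal{O}_1$, $\mathcal{O}_2$ are spacelike-separated.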
- Covariance: the Poincaré group acts by automorphisms compatible with the net.
- Spectrum condition: as in the Wightman framework.
Algebraic QFT clarifies several conceptual issues — superselection sectors, charge structure, particle statistics in low dimensions (anyons) — that are awkward in the field-based formulation.
Measurement and Collapse: Inherited, Not Derived
QFT is a quantum theory, so it inherits the entire QM measurement framework — but does it derive the Born rule (Postulate 3) and collapse (Postulate 4) of QM? Strictly: no. They are presupposed by QFT, not produced by it.
How Each QM Postulate Fares in QFT
| QM postulate | Status in QFT |
|---|---|
| 1 — State space | Inherited; generalized to relativistic state space (QFT Postulate 1) |
| 2 — Observables | Inherited; observables are local field operators / algebras (QFT Postulate 4) |
| 3 — Born rule | Inherited as primitive. The form of the rule is partially derived (Gleason's theorem); that probabilities exist is postulated. |
| 4 — Collapse | Inherited as primitive. Decoherence partially explains the appearance of collapse but not the selection of a unique outcome. |
| 5 — Schrödinger evolution | Generalized to dynamics from a local action (QFT Postulate 9); Lorentz-covariantized. |
| 6 — Composite systems (tensor product) | Inherited; QFT Hilbert spaces are tensor products of per-species Fock spaces. |
| 7 — Symmetrization | Promoted to a theorem — the spin–statistics theorem. |
So QFT promotes one QM postulate (symmetrization → spin–statistics theorem), generalizes others (state space, observables, evolution), and inherits unchanged the measurement and collapse postulates.
Where Measurement Appears in QFT Practice
In practice, "measurement" in QFT means three things, all of which use the Born rule and none of which derive it:
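- S-matrix elements $S_{fi} = \langle f|S|i\rangle$, interpreted as transition probabilities via $P_{i\to f} = |S_{fi}|^2$.
- Cross sections $\sigma$ and decay rates $\Gamma$, derived from $|S_{fi}|^2$ by standard kinematic factors.
- Vacuum expectation values and correlation functions $\langle\Omega|\phi(x_1)\cdots\phi(x_n)|\Omega\rangle$, interpreted as expectation values via $\langle A\rangle = \langle\psi|A|\psi\rangle$.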
The collapse postulate is rarely invoked explicitly in standard QFT because S-matrix calculations only ask for initial and final state probabilities, never for the post-measurement state. So one can do a great deal of QFT without ever using collapse — but it lurks whenever a post-measurement state is required.
Born Rule for Scattering
The "Born rule for scattering" — invoked in QFT/cross-sections.md as a foundational postulate behind the cross-section formula — is literally the QM Born rule applied to a particular kind of measurement. The same name with different-looking formulas reflects bookkeeping, not new physics.
Specialization, not a new postulate
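The general QM Born rule for transitions: a system in $|\psi_i\rangle$ at $t_i$ evolves under unitary $U(t_f, t_i)$, and the probability of finding it in $|\psi_f\rangle$ at $t_f$ is
$$P(i \to f) = |\langle\psi_f|\,U(t_f, t_i)\,|\psi_i\rangle|^2 .$$
For scattering, take:
- $|\psi_i\rangle$, $|\psi_f\rangle$ = asymptotic in/out states (free particles in the far past / future),
- $U = S$ = the S-matrix, the unitary that maps in to out under the full interacting Hamiltonian.
The Born rule then reads $P(i \to f) = |\langle f|S|i\rangle|^2$. No new postulate; just a specialization to asymptotic-state amplitudes.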
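The same rule can be exercised on the smallest possible example. The sketch below applies $P = |\langle f|U|i\rangle|^2$ to a resonantly driven two-level system, with $U(t) = \exp(-i\,\Omega t\,\sigma_x/2)$ standing in for the S-matrix; the model and all names are assumptions of this illustration and are not taken from cross-sections.md.

```python
import math


def rabi_unitary(omega, t):
    """U(t) = exp(-i * omega * t * sigma_x / 2) written out as a 2x2 matrix."""
    c = math.cos(omega * t / 2)
    s = math.sin(omega * t / 2)
    return [[c, -1j * s], [-1j * s, c]]


def transition_probability(final, unitary, initial):
    """Born rule: |<final| U |initial>|^2 for two-component state vectors."""
    amplitude = sum(
        final[i].conjugate() * unitary[i][j] * initial[j]
        for i in range(2)
        for j in range(2)
    )
    return abs(amplitude) ** 2


GROUND, EXCITED = [1, 0], [0, 1]
U = rabi_unitary(omega=2.0, t=0.6)
p_stay = transition_probability(GROUND, U, GROUND)
p_flip = transition_probability(EXCITED, U, GROUND)
```

Unitarity of `U` guarantees `p_stay + p_flip == 1` up to rounding, which is the two-state analogue of $\sum_f |\langle f|S|i\rangle|^2 = 1$.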
Why the formula looks different
The QFT cross-section formula is the Born rule wrapped in three layers of bookkeeping:
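- Subtract the identity: scattering is the part of $S$ beyond the identity, $S = \mathbb{1} + i\mathcal{T}$, so $\langle f|S|i\rangle = \delta_{fi} + i(2\pi)^4\delta^4(p_f - p_i)\,\mathcal{M}_{fi}$.
- The squared delta function gives spacetime volume: $[(2\pi)^4\delta^4(p_f - p_i)]^2 = (2\pi)^4\delta^4(p_f - p_i)\,VT$. Dividing by $T$ converts probability to rate.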
- Dividing by the flux converts rate to cross section. Integrating over the final-state phase space sums over indistinguishable outcomes.
The cross-to-classical-rate "bridge" derivation in cross-sections.md §1.1 (iv) makes each step explicit.
Comparison with the QM Born rule and its cousins
The Born rule shows up in five recognizable forms across QM and QFT, all the same postulate with different bookkeeping:
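| Setting | Form |
|---|---|
| QM Postulate 3 (general) | $P(a) = \lvert\langle a\vert\psi\rangle\rvert^2$ |
| QM transition (Schrödinger picture) | $P_{i\to f} = \lvert\langle\psi_f\vert U(t_f, t_i)\vert\psi_i\rangle\rvert^2$ |
| Fermi's Golden Rule (NRQM, perturbative) | $\Gamma_{i\to f} = \frac{2\pi}{\hbar}\,\lvert\langle f\vert H'\vert i\rangle\rvert^2\,\rho(E_f)$ |
| QFT scattering | $d\sigma = \frac{1}{4E_1E_2\lvert v_1 - v_2\rvert}\,\lvert\mathcal{M}\rvert^2\,(2\pi)^4\delta^4(p_f - p_i)\,d\Pi_n$ |
| QFT decay | $d\Gamma = \frac{1}{2M}\,\lvert\mathcal{M}\rvert^2\,(2\pi)^4\delta^4(p_f - p_i)\,d\Pi_n$ |
Each of the last three has the structure (squared transition matrix element) × (sum over final states) × (kinematic conversion to a rate), where $d\Pi_n = \prod_f d^3p_f / [(2\pi)^3\,2E_f]$ is the $n$-body phase-space measure: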
| Element | Fermi's Golden Rule | QFT cross section |
|---|---|---|
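| Squared matrix element | $\lvert\langle f\vert H'\vert i\rangle\rvert^2$ | $\lvert\mathcal{M}\rvert^2$ |
| Density / measure of final states | $\rho(E_f)$ | $\prod_f \frac{d^3p_f}{(2\pi)^3\,2E_f}$ (Lorentz-invariant phase space) |
| Conservation enforcer | implicit in $\rho(E_f)$ | $(2\pi)^4\delta^4(p_f - p_i)$ |
| Kinematic prefactor | $2\pi/\hbar$ | $1/(4E_1E_2\lvert v_1 - v_2\rvert)$ (flux factor) |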
Fermi's Golden Rule is the non-relativistic single-particle scattering / decay limit of the QFT formulas. Both are the Born rule plus standard QM time-dependent perturbation theory; the QFT version adds Lorentz-covariant phase space, antiparticles, and Feynman-rule machinery for computing , but the underlying postulate is the same.
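As a concrete instance of the (matrix element) × (phase space) × (kinematic prefactor) structure, the two-body decay rate in the parent rest frame reduces to $\Gamma = |\mathbf{p}^*|\,|\mathcal{M}|^2 / (8\pi M^2)$. The sketch below assumes natural units, a constant spin-summed and spin-averaged $|\mathcal{M}|^2$, and distinguishable final-state particles; the function names are hypothetical.

```python
import math


def kallen(a, b, c):
    """Kallen triangle function lambda(a, b, c)."""
    return a * a + b * b + c * c - 2.0 * (a * b + b * c + c * a)


def two_body_decay_rate(parent_mass, m1, m2, matrix_element_sq):
    """Gamma = |p*| |M|^2 / (8 pi M^2) for a constant |M|^2 (natural units)."""
    if m1 + m2 >= parent_mass:
        return 0.0  # decay is kinematically closed
    # Momentum of either daughter in the parent rest frame.
    p_star = math.sqrt(kallen(parent_mass**2, m1**2, m2**2)) / (2.0 * parent_mass)
    return p_star * matrix_element_sq / (8.0 * math.pi * parent_mass**2)


rate_massless = two_body_decay_rate(2.0, 0.0, 0.0, 1.0)
rate_closed = two_body_decay_rate(2.0, 1.5, 1.0, 1.0)
```

For massless daughters $|\mathbf{p}^*| = M/2$, so the rate collapses to $|\mathcal{M}|^2/(16\pi M)$; the phase-space factor alone shuts the decay off at threshold, independent of the matrix element.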
The Born rule appears in all the standard QM observables — transmission probabilities $T = |t|^2$, Rabi oscillation $P(t) = \sin^2(\Omega t/2)$, decay rates, etc. The QFT cross section is just the most elaborate dressed-up version, decorated with relativistic and many-body bookkeeping.
A technical subtlety: non-normalizable states
The QM Born rule's literal probabilistic interpretation requires normalizable states for the inner product. In QFT:
- Asymptotic in/out states are momentum eigenstates — not normalizable. $\langle\mathbf{p}|\mathbf{p}\rangle \propto \delta^3(\mathbf{0})$ is a formal infinity (interpreted as proportional to the spatial volume $V$).
- Squaring the S-matrix element produces $[\delta^4(p_f - p_i)]^2 = \delta^4(p_f - p_i)\,\delta^4(0)$, also formally infinite (proportional to the spacetime volume $VT$).
The infinities cancel in the ratio (rate per particle pair) only because the delta-function squaring and the state-norm conventions pick up the same factors of $V$ and $T$. This is why physicists work with probability per unit time per unit volume rather than probability directly; box-normalization (or any equivalent finite-spacetime regularization) is the rigorous way to derive the cross-section formula. Conceptually it is still the Born rule; mechanically, you need the regulator to interpret it.
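The bookkeeping can be made explicit in a finite box. The following is a sketch for $2 \to n$ scattering, assuming relativistic state normalization $\langle\mathbf{p}|\mathbf{p}\rangle = 2E\,V$ in a box of volume $V$ observed for a time $T$; these conventions are an assumption of the example and may differ from those of cross-sections.md by factors that cancel in the final result.

```latex
% Regulated delta functions
(2\pi)^3\,\delta^3(\mathbf{0}) = \int_V d^3x = V,
\qquad
(2\pi)^4\,\delta^4(0) = \int_{V \times T} d^4x = VT .

% Transition rate: Born probability, normalized per initial state, divided by T
\frac{dP}{T}
  = \frac{(2\pi)^4\,\delta^4(p_f - p_i)\,|\mathcal{M}|^2\,V}{(2E_1 V)(2E_2 V)}
    \prod_f \frac{d^3p_f}{(2\pi)^3\,2E_f} .

% Divide by the incident flux |v_1 - v_2| / V: every factor of V and T cancels
d\sigma
  = \frac{dP/T}{|v_1 - v_2|/V}
  = \frac{1}{4E_1E_2\,|v_1 - v_2|}\,|\mathcal{M}|^2\,
    (2\pi)^4\,\delta^4(p_f - p_i)\prod_f \frac{d^3p_f}{(2\pi)^3\,2E_f} .
```

The regulator appears only in intermediate steps: one power of $VT$ from the squared delta function is absorbed by the division by $T$ and by the state norms and flux, leaving a result that is finite as $V, T \to \infty$.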
In algebraic QFT (see § Algebraic QFT (Haag–Kastler)) the issue is sharper: local algebras are Type III, with no minimal projectors, so the literal form $P = |\langle a|\psi\rangle|^2$ of the Born rule does not even apply to local observables. One uses expectation values $\omega(A)$ directly, and cross sections are reconstructed from these.
Partial Reductions
Several frameworks make parts of the measurement axioms less fundamental, without fully eliminating them:
- Decoherence. Tracing out the environment from a system + apparatus + environment composite produces an effectively diagonal density matrix in a "pointer basis" on a short decoherence timescale $\tau_D$. This explains why we never see macroscopic superpositions, but does not pick out a single outcome — that step still requires either the Born rule as an additional postulate, an interpretive move (Many-Worlds), or a dynamical-collapse model. QFT is the natural arena for decoherence, since the environment is typically a quantum field.
- Gleason's theorem (1957). Any probability measure on the projection lattice of a Hilbert space (dimension $\geq 3$) satisfying positivity, normalization, and additivity over orthogonal projectors must take the form $\mu(P) = \operatorname{Tr}(\rho P)$ for some density operator $\rho$ — the Born rule. So the form of the Born rule is forced once one accepts that probabilities exist; only their existence remains a postulate.
- Many-Worlds derivations. Deutsch, Wallace, and Zurek (envariance) attempt to derive the Born rule within an Everettian framework with no collapse. Whether these arguments succeed is contested. They apply equally to QM and QFT.
- Dynamical collapse models (GRW, CSL). Modify unitary evolution by adding a stochastic term that effects spontaneous collapse, replacing the collapse postulate with a dynamical equation. Lorentz-covariant relativistic versions are an active research area, but these are alternatives to QFT, not consequences of it.
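The gap between decoherence and outcome selection shows up already for one qubit. The sketch below assumes a pure-dephasing toy model in which the off-diagonal elements of the density matrix decay as $e^{-\gamma t}$, and evaluates outcome probabilities with the trace form $\operatorname{Tr}(\rho P)$; the model, the rate `gamma`, and the function names are assumptions of this illustration.

```python
import math


def dephase(rho, gamma, t):
    """Pure dephasing in the pointer basis: off-diagonals decay, diagonals stay."""
    damp = math.exp(-gamma * t)
    return [
        [rho[0][0], rho[0][1] * damp],
        [rho[1][0] * damp, rho[1][1]],
    ]


def born_probability(rho, projector):
    """Outcome probability Tr(rho P) for 2x2 matrices."""
    trace = sum(rho[i][j] * projector[j][i] for i in range(2) for j in range(2))
    return complex(trace).real


P0 = [[1, 0], [0, 0]]
P1 = [[0, 0], [0, 1]]
rho_plus = [[0.5, 0.5], [0.5, 0.5]]  # the pure state (|0> + |1>)/sqrt(2)

rho_late = dephase(rho_plus, gamma=5.0, t=10.0)
coherence_left = abs(rho_late[0][1])
probs_before = (born_probability(rho_plus, P0), born_probability(rho_plus, P1))
probs_after = (born_probability(rho_late, P0), born_probability(rho_late, P1))
```

The coherence is gone at late times, yet the state is still a 50/50 mixture: the probabilities are unchanged and nothing in the dynamics selects one of the two outcomes.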
Why It's Harder, Not Easier, in QFT
QFT actually complicates the orthodox measurement story in several ways:
- Type-III local algebras. Local algebras in algebraic QFT (see above) are typically Type III in the von Neumann classification. Type-III algebras have no minimal projectors and admit no decomposition of the identity into orthogonal one-dimensional projectors. So the collapse postulate as stated in QM — "project onto the eigenspace of $\hat A$" — does not even quite apply to local observables in QFT.
- Reeh–Schlieder theorem. In a relativistic QFT, every state can be approximated arbitrarily well by acting on the vacuum with operators localized in any bounded spacetime region. This means there is no clean tensor-factorization of the Hilbert space into "system" and "environment" — yet such a factorization is implicit in every textbook discussion of measurement.
- Lorentz-covariant collapse is awkward. Naïve "instantaneous collapse on a spacelike hypersurface" picks a frame and breaks Lorentz invariance. Constructing a fully covariant collapse rule (or covariant CSL model) is a long-standing open problem.
Summary
The Born rule and collapse postulate sit in the same uncomfortable position in both QM and QFT: operationally indispensable, foundationally unexplained, and not derivable from the other postulates. QFT does not solve the measurement problem — it carries it forward, with extra technical complications from Type-III algebras and Lorentz covariance. Discussion of "measurement in QFT" is therefore really a discussion of how to apply QM-style measurement to QFT states, not a derivation of measurement from QFT first principles.
Specific Quantum Field Theories
Notes on concrete instances of quantum field theory. The general formalism (postulates, definitions, foundational issues) is collected in the parent QFT folder; this subfolder covers specific theories obtained by choosing field content, symmetries, and a renormalizable Lagrangian.
Contents
- Quantum Electrodynamics (QED) — the gauge theory of electrons, positrons, and photons (modern gauge-principle derivation).
- Historical (Dirac-Equation) Route — the same theory built up the textbook way: Dirac equation, minimal coupling, second quantization.
- Compton Scattering — the easiest real QED calculation: tree-level $e^-\gamma \to e^-\gamma$, closed-form Klein–Nishina cross section.
- The Hydrogen Atom in QED — worked example showing how Schrödinger, Dirac fine structure, the Lamb shift, and hyperfine splitting arise as successive approximations in and .
- Quantum Chromodynamics (QCD) — the gauge theory of quarks and gluons; structurally parallel to QED but qualitatively different in dynamics (asymptotic freedom, confinement, ghost-dependence, gluon self-coupling).
- Electroweak Theory — the $SU(2)_L \times U(1)_Y$ gauge theory unifying QED with the weak interaction, spontaneously broken to $U(1)_{\text{EM}}$ by the Higgs mechanism. Introduces chiral fermions, massive gauge bosons ($W^\pm$, $Z$), the Higgs boson, and CKM mixing.
- The Standard Model — the union of QCD and electroweak, with the cross-sector content (anomaly cancellation, generations + GIM, two CP problems, accidental symmetries) that only makes sense once both are combined.
Quantum Electrodynamics
Quantum Electrodynamics (QED) is the relativistic quantum field theory of electrons, positrons, and photons. It is obtained by specializing the general postulates of Quantum Field Theory with three additional inputs: a choice of field content, a gauge symmetry, and the renormalizable Lagrangian uniquely determined by these together with Lorentz and discrete-symmetry invariance.
Three routes to QED. This document takes the gauge-theory viewpoint, in which local $U(1)$ invariance is postulated and minimal coupling is derived. Two companion presentations cover the same theory from different starting points:
- Modern Foundations — Wigner–Weinberg derivation shows (§5.2) that gauge invariance is itself a theorem of Lorentz consistency for any massless spin-1 species. This document picks up where that one ends, taking the QFT postulates as given and adding the QED-specific ingredients. See §Equivalent framings in Step 2 for the relation.
- QED — Historical (Dirac-Equation) Route starts from Dirac's single-particle equation; minimal coupling is postulated there and gauge invariance is derived as a consequence.
Following the tag convention of foundations-modern.md, each step below is labelled as (Empirical input), (Postulate), (Theorem), or (Standard machinery) so the assumption budget on top of the QFT postulates is transparent.
Derivation of QED from QFT
QED inherits all postulates of QFT (relativistic state space, spectrum condition, unique Poincaré-invariant vacuum, fields as operator-valued distributions, microcausality, spin–statistics, cyclicity of the vacuum, dynamics from a local action, and asymptotic completeness). What follows is the chain of choices that turns the generic QFT framework into QED.
Step 1 — Specify the Field Content (Empirical input; specializes QFT Postulate 4)
QED postulates two fundamental fields:
- A Dirac spinor field $\psi(x)$, the electron/positron field, transforming in the $(\tfrac12, 0) \oplus (0, \tfrac12)$ representation of the Lorentz group.
- A vector field $A_\mu(x)$, the photon field, transforming in the $(\tfrac12, \tfrac12)$ representation.
By the spin–statistics theorem (QFT Postulate 7), $\psi$ is quantized with anticommutators (fermion) and $A_\mu$ with commutators (boson).
Step 2 — Postulate a Local Gauge Symmetry (Postulate)
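The genuinely new ingredient — generic QFTs have no such requirement — is the assumption of an internal local $U(1)$ symmetry:
$$\psi(x) \to e^{i\alpha(x)}\,\psi(x), \qquad A_\mu(x) \to A_\mu(x) - \frac{1}{e}\,\partial_\mu\alpha(x),$$
with $\alpha(x)$ an arbitrary real function. This single postulate has two immediate consequences:
- The photon must be massless: a term $\tfrac12 m_\gamma^2 A_\mu A^\mu$ is not gauge-invariant.
- The way electrons couple to photons is uniquely fixed by the requirement that derivatives of $\psi$ appear only through the covariant derivative $D_\mu = \partial_\mu + ieA_\mu$, which transforms as $D_\mu\psi \to e^{i\alpha(x)}\,D_\mu\psi$.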
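Both consequences follow from a two-line computation. The sketch below assumes the sign convention $D_\mu = \partial_\mu + ieA_\mu$ with $\psi \to e^{i\alpha}\psi$ and $A_\mu \to A_\mu - \tfrac{1}{e}\partial_\mu\alpha$; other texts flip the sign of $e$ or of $\alpha$, which changes intermediate signs but not the conclusions.

```latex
% Covariance of the covariant derivative
D_\mu\psi \;\to\;
  \Big(\partial_\mu + ieA_\mu - i\,\partial_\mu\alpha\Big)\big(e^{i\alpha}\psi\big)
  = e^{i\alpha}\Big(\partial_\mu\psi + i(\partial_\mu\alpha)\psi
      + ieA_\mu\psi - i(\partial_\mu\alpha)\psi\Big)
  = e^{i\alpha}\,D_\mu\psi .

% Invariance of the field strength (partial derivatives commute)
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu
  \;\to\; F_{\mu\nu}
  - \tfrac{1}{e}\big(\partial_\mu\partial_\nu - \partial_\nu\partial_\mu\big)\alpha
  = F_{\mu\nu} .

% Non-invariance of a photon mass term
A_\mu A^\mu \;\to\; A_\mu A^\mu - \tfrac{2}{e}\,A^\mu\partial_\mu\alpha
  + \tfrac{1}{e^2}\,(\partial_\mu\alpha)(\partial^\mu\alpha)
  \;\neq\; A_\mu A^\mu .
```

The inhomogeneous shift of $A_\mu$ is exactly what cancels the derivative of the phase, so $\bar\psi\,\gamma^\mu D_\mu\psi$ is invariant while $A_\mu A^\mu$ is not.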
Equivalent framings. Postulating gauge invariance and deriving photon masslessness (this doc) is logically equivalent to postulating a massless spin-1 species and deriving gauge invariance — the route taken in foundations-modern.md §5.2. The two are the same theory viewed from opposite ends: foundations-modern starts from Wigner classification (massless spin-1 exists as an irrep) and shows gauge invariance is forced by Lorentz consistency of the non-tensorial polarization vectors; this document starts from gauge invariance and shows the photon must be massless. Either choice fixes the same $\mathcal{L}_{\text{QED}}$. The gauge-symmetry-first framing is taken here because it generalizes more cleanly to the non-abelian case (see QCD).
Step 3 — Build the Lagrangian (Theorem; specializes QFT Postulate 9)
The most general Lagrangian density that is
- Lorentz-invariant,
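- gauge-invariant under the $U(1)$ transformation above,
- $C$-, $P$-, and $T$-invariant (separately),
- renormalizable (operators of mass dimension $\leq 4$ in $d = 4$ spacetime dimensions),
- built only from $\psi$, $\bar\psi$, $A_\mu$, and their derivatives,
is uniquely
$$\mathcal{L}_{\text{QED}} = \bar\psi\,(i\gamma^\mu D_\mu - m)\,\psi - \tfrac14 F_{\mu\nu}F^{\mu\nu},$$
where
- $D_\mu = \partial_\mu + ieA_\mu$ is the covariant derivative,
- $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ is the electromagnetic field strength,
- $\gamma^\mu$ are the Dirac gamma matrices satisfying $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$,
- $e$ is the electric charge and $m$ the electron mass — the only two free parameters.
Renormalizability rules out higher-dimension operators such as $(\bar\psi\psi)^2$ (dim 6), $\bar\psi\,\sigma^{\mu\nu}\psi\,F_{\mu\nu}$ (dim 5, the Pauli term), and $F^4$-type interactions. The hypothetical $CP$-violating term $\theta\,F_{\mu\nu}\tilde F^{\mu\nu}$ is a total derivative in QED and so does not contribute to perturbation theory.
The classical equations of motion that follow from $\mathcal{L}_{\text{QED}}$ are the Dirac equation in an external field and Maxwell's equations with a Dirac source:
$$(i\gamma^\mu D_\mu - m)\,\psi = 0, \qquad \partial_\mu F^{\mu\nu} = e\,\bar\psi\gamma^\nu\psi .$$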
Step 4 — Quantize (Standard machinery)
The fields are promoted to operators following the standard QFT machinery (QFT Postulate 9). Two equivalent routes:
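Canonical quantization. Compute conjugate momenta:
$$\pi_\psi = \frac{\partial\mathcal{L}}{\partial\dot\psi} = i\psi^\dagger, \qquad \pi^\mu = \frac{\partial\mathcal{L}}{\partial(\partial_0 A_\mu)} = -F^{0\mu} \;\Rightarrow\; \pi^0 = 0 .$$
The vanishing of $\pi^0$ is a primary constraint, signalling gauge redundancy in $A_\mu$. Resolving it requires a gauge-fixing prescription — Lorenz gauge $\partial_\mu A^\mu = 0$ (Gupta–Bleuler), Coulomb gauge $\nabla\cdot\mathbf{A} = 0$, axial gauge, or (most systematically) BRST quantization. After gauge fixing, impose canonical (anti)commutation relations on the physical degrees of freedom.
Path-integral quantization. Define the generating functional
$$Z[J, \eta, \bar\eta] = \int \mathcal{D}A\,\mathcal{D}\bar\psi\,\mathcal{D}\psi\;\exp\!\left\{ i\int d^4x\,\big[\mathcal{L}_{\text{QED}} + \mathcal{L}_{\text{gf}} + J_\mu A^\mu + \bar\eta\psi + \bar\psi\eta\big]\right\}$$
with a covariant gauge-fixing term, e.g.
$$\mathcal{L}_{\text{gf}} = -\frac{1}{2\xi}\,(\partial_\mu A^\mu)^2,$$
and Faddeev–Popov ghosts $c, \bar c$. In QED the ghosts decouple (they only matter in non-abelian gauge theories), so they can be ignored for practical computations.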
Step 5 — Renormalize (Standard machinery)
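Naive perturbation theory in $e$ produces UV-divergent loop integrals. Renormalizability of $\mathcal{L}_{\text{QED}}$ guarantees that all divergences can be absorbed into a finite number of multiplicative redefinitions:
$$\psi_0 = Z_2^{1/2}\,\psi, \qquad A_0^\mu = Z_3^{1/2}\,A^\mu, \qquad m_0 = Z_m\,m, \qquad e_0 = \frac{Z_1}{Z_2\,Z_3^{1/2}}\,e .$$
The Ward–Takahashi identities, which are exact consequences of gauge invariance, imply
$$Z_1 = Z_2 \quad\Rightarrow\quad e_0 = Z_3^{-1/2}\,e .$$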
Thus the renormalization of the electric charge is governed entirely by the photon self-energy — a non-trivial check that gauge invariance survives quantization.
Step 6 — Predictions via the S-Matrix (Standard machinery; specializes QFT Postulate 10)
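Time-ordered correlators are computed perturbatively in $e$ using Feynman rules read off from $\mathcal{L}_{\text{QED}}$:
| Object | Feynman rule (Feynman gauge $\xi = 1$) |
|---|---|
| Electron propagator | $\dfrac{i(\gamma^\mu p_\mu + m)}{p^2 - m^2 + i\epsilon}$ |
| Photon propagator | $\dfrac{-i\,\eta_{\mu\nu}}{k^2 + i\epsilon}$ |
| Vertex | $-ie\gamma^\mu$ |
| External electron / positron | spinors $u(p)$, $\bar u(p)$ / $v(p)$, $\bar v(p)$ |
| External photon | polarization vector $\epsilon_\mu(k)$ (incoming) or $\epsilon^*_\mu(k)$ (outgoing) |
The LSZ reduction formula extracts $S$-matrix elements from amputated time-ordered correlators; cross sections follow from $|\mathcal{M}|^2$ via standard kinematics (master formulas and conventions in QFT/cross-sections.md, with the parallel decay-rate treatment in decay-rates.md).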
Summary: What QED Adds to QFT
| Ingredient | Source |
|---|---|
| Hilbert space, locality, Poincaré covariance, spectrum, vacuum, spin–statistics, ... | Generic QFT postulates |
| Field content: Dirac $\psi$ + vector $A_\mu$ | Choice (Step 1) |
| Local $U(1)$ gauge invariance | New postulate (Step 2) |
| Lagrangian $\mathcal{L}_{\text{QED}}$ | Forced by gauge invariance + renormalizability + Lorentz/$C$, $P$, $T$ (Step 3) |
| Quantization, gauge-fixing, renormalization | Standard QFT machinery (Steps 4–5) |
| Cross sections, decay rates, precision observables | LSZ + perturbation theory (Step 6) |
So the "derivation" of QED from QFT amounts to three choices: pick the matter and gauge fields, demand local $U(1)$, and require renormalizability. The dynamics — including the precise form of the electron–photon coupling and all of QED's celebrated predictions ($g - 2$, Lamb shift, Compton scattering, ...) — are then uniquely fixed.
Successes and Tested Predictions
QED is the most precisely tested theory in physics. A few highlights:
- Anomalous magnetic moment of the electron: agrees with experiment to better than one part in when combined with the most precise measurement of the fine-structure constant.
- Lamb shift in hydrogen: the splitting between $2S_{1/2}$ and $2P_{1/2}$ levels, predicted by QED loop corrections.
- Compton, Bhabha, and Møller scattering: tree-level cross sections and their loop corrections.
- Hyperfine structure of hydrogen and positronium.
Caveats and Open Issues
- Landau pole / triviality. The QED coupling $\alpha(\mu)$ grows with energy scale $\mu$ via the renormalization-group equation; extrapolated naively, $\alpha(\mu)$ diverges at an enormous (and unphysical) Landau pole. This suggests pure QED is only an effective theory, valid below some cutoff — and indeed it is embedded in the electroweak theory above $\sim 100\ \text{GeV}$.
- Haag's theorem prevents the interaction picture from existing rigorously; perturbative QED is best understood as a formal expansion justified by renormalization rather than as a strict consequence of the Wightman axioms.
- No rigorous construction in $d = 4$. As with all interacting QFTs in four spacetime dimensions, a fully rigorous mathematical construction of QED satisfying the Wightman/Haag–Kastler axioms is an open problem.
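To get a feel for how remote the Landau pole is, the one-loop running can be evaluated directly. The sketch below assumes one-loop running with only the electron in the loop, $\alpha(\mu) = \alpha_0 / [1 - \tfrac{2\alpha_0}{3\pi}\ln(\mu/m_e)]$; the function names are hypothetical, and the real running also includes the other charged fermions, which makes $\alpha$ grow faster than this toy version.

```python
import math

ALPHA_0 = 1.0 / 137.035999  # fine-structure constant at low energy
M_E_GEV = 0.000510999       # electron mass in GeV


def alpha_one_loop(mu_gev, alpha0=ALPHA_0, m_gev=M_E_GEV):
    """One-loop QED coupling at scale mu, electron loop only (toy assumption)."""
    denom = 1.0 - (2.0 * alpha0 / (3.0 * math.pi)) * math.log(mu_gev / m_gev)
    if denom <= 0.0:
        raise ValueError("scale is at or beyond the one-loop Landau pole")
    return alpha0 / denom


def landau_pole_log10_gev(alpha0=ALPHA_0, m_gev=M_E_GEV):
    """log10 of the scale (in GeV) where the one-loop denominator vanishes."""
    return (math.log(m_gev) + 3.0 * math.pi / (2.0 * alpha0)) / math.log(10.0)


alpha_at_mz = alpha_one_loop(91.19)
pole_exponent = landau_pole_log10_gev()
```

In this electron-only approximation the pole sits near $10^{277}\ \text{GeV}$, vastly beyond any physical scale, while the coupling at $M_Z$ has grown by only about two percent.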
QED — Historical (Dirac-Equation) Route
This is the route by which Dirac, Heisenberg, Pauli, and Fermi originally arrived at QED, and the way most introductory texts (Bjorken–Drell, Sakurai's Advanced QM, Griffiths' particle book, the early chapters of Peskin–Schroeder) build it up. Compared with the modern gauge-theory derivation, it is conceptually simpler — it does not presuppose the gauge principle — but pedagogically backward: gauge invariance appears at the end as a consequence of minimal coupling, rather than at the beginning as a postulate.
To make the logical structure transparent, every step below is tagged as one of:
- (Postulate) — a primitive assumption of this formulation. These are the genuine physical inputs.
- (Definition) — a mathematical object or notation introduced for use later.
- (Derived) — a result that follows from the postulates plus standard mathematics or QM.
- (Heuristic) — a step originally presented as physically motivated guesswork; later cleaned up by deeper theory.
The historical route rests on three postulates:
| # | Postulate | Modern reinterpretation |
|---|---|---|
| H1 | Relativistic single-particle wave equation must be first-order in derivatives | A choice of representation; equivalent to picking the Dirac field |
| H2 | Minimal coupling $p^\mu \to p^\mu - eA^\mu$ | A theorem in the gauge-principle approach |
| H3 | Second quantization with anticommutators (after the wavefunction interpretation breaks down) | Forced by spin–statistics |
Everything else — including gauge invariance — is derived.
0. Preliminaries
The general mathematical machinery (Hilbert spaces, Lorentz/Poincaré groups, Lagrangian field theory, Fock space, mode expansions, regularization/renormalization) is collected in the QM and QFT preliminaries pages. This section covers only the prerequisite mathematics and classical physics specific to the wave-equation route to QED: gamma-matrix notation and the Lagrangian formulation of classical electromagnetism. Motivations, derived consequences, and the perturbative machinery used in §5 are deliberately not here — they appear next to the postulates or derivations that use them.
0.1 Gamma Matrices, Dirac Spinors, Notation
The Dirac gamma matrices () are matrices satisfying the Clifford algebra
This is the defining relation of the algebra . For the abstract definition (any vector space with a quadratic form), classification (Bott periodicity), Spin groups, and spinor-representation theory in arbitrary dimensions, see math/clifford-algebra.md. For this document we just need the four matrices and the bilinear-covariant notation below.
Common explicit representations:
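- Dirac (standard) representation: $\gamma^0 = \begin{pmatrix} \mathbb{1} & 0 \\ 0 & -\mathbb{1} \end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}$, with $\sigma^i$ the Pauli matrices. Convenient for the non-relativistic limit.
- Weyl (chiral) representation: $\gamma^0 = \begin{pmatrix} 0 & \mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}$, $\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix}$. Convenient for massless / high-energy limits and for separating left- and right-handed components.
- Majorana representation: all $\gamma^\mu$ purely imaginary; useful when discussing real (Majorana) fermion fields.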
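A Dirac spinor $\psi$ is a four-component object on which the $\gamma^\mu$ act. Useful definitions:
- Dirac adjoint: $\bar\psi \equiv \psi^\dagger\gamma^0$. The bilinear $\bar\psi\psi$ is a Lorentz scalar; $\bar\psi\gamma^\mu\psi$ is a vector; $\bar\psi\gamma^\mu\gamma^\nu\psi$ has scalar + antisymmetric tensor parts; etc.
- Feynman slash: $\not\!a \equiv \gamma^\mu a_\mu$ for any four-vector $a^\mu$. Then $\not\!a\,\not\!a = a^2\,\mathbb{1}_4$.
- Chirality matrix: $\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3$, which anticommutes with all $\gamma^\mu$ and squares to $\mathbb{1}_4$. Projectors $P_{L,R} = \tfrac12(\mathbb{1}_4 \mp \gamma^5)$ project onto left- and right-handed components.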
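The relations above can be checked numerically. The following is a minimal sketch in pure Python, assuming the Dirac representation and the signature $(+,-,-,-)$; the helper names (`block`, `matmul`, `anticommutator`, and so on) are illustrative and not part of any library.

```python
# Dirac-representation gamma matrices assembled from 2x2 blocks (no NumPy needed).
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [1, -1, -1, -1]  # assumed metric signature (+,-,-,-)

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def neg(m):
    return [[-x for x in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled_identity(c, n=4):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

# Clifford relation: {gamma^mu, gamma^nu} = 2 eta^{mu nu} * identity
clifford_ok = all(
    anticommutator(GAMMA[m], GAMMA[n]) == scaled_identity(2 * ETA[m] if m == n else 0)
    for m in range(4) for n in range(4)
)

# gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3
g5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))
g5 = [[1j * x for x in row] for row in g5]
g5_squares_to_one = matmul(g5, g5) == scaled_identity(1)
g5_anticommutes = all(anticommutator(g5, g) == scaled_identity(0) for g in GAMMA)

print(clifford_ok, g5_squares_to_one, g5_anticommutes)
```

All entries are small Gaussian integers, so the comparisons are exact and no floating-point tolerance is needed. Swapping in the Weyl-representation blocks should pass the same three checks, since the relations are representation-independent.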
0.2 Classical Electromagnetism in Lagrangian Form
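Classical electromagnetism has the Lagrangian density

$$\mathcal{L}_{\text{EM}} = -\tfrac14 F_{\mu\nu}F^{\mu\nu} - J^\mu A_\mu, \qquad F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu,$$

whose Euler–Lagrange equations are the inhomogeneous Maxwell equations $\partial_\mu F^{\mu\nu} = J^\nu$. The four-potential $A_\mu$ is defined only up to a classical gauge transformation

$$A_\mu \to A_\mu + \partial_\mu\alpha(x),$$

with $\alpha(x)$ an arbitrary scalar; this leaves $F_{\mu\nu}$ (and therefore the physical electric and magnetic fields) invariant. This redundancy is already present at the classical level, before any quantum mechanics.
The classical equation of motion for a point charge $e$ in an external $A^\mu = (\phi, \mathbf{A})$ follows from the Lagrangian $L = -m\sqrt{1 - \mathbf{v}^2} - e\phi + e\,\mathbf{v}\cdot\mathbf{A}$, equivalent to the canonical momentum substitution

$$p^\mu \to p^\mu - eA^\mu.$$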
This is the classical input that Dirac borrowed when postulating minimal coupling (H2, §2.1 below).
1. The Dirac Equation
Scope of this route. Postulates H1+H2+H3 are sufficient to construct the QED Lagrangian, but they are not sufficient to derive the general Poincaré classification of particles. H1 only handles spin-$\tfrac12$ (and only via the first-order-equation algebraic accident), so particles of other spins (scalar, vector, gravitino, graviton, ...) require separate analogous wave-equation postulates, case by case. The general classification, which says exactly which spins / helicities are possible and how they transform under the Poincaré group, requires elevating the Poincaré group to a primitive symmetry acting on a Hilbert space and analyzing its unitary irreducible representations: the Wigner classification (1939). The historical route never performs representation theory of the Poincaré group, so this classification is strictly more general than what H1+H2+H3 can reach. See foundations-modern.md §1.1 — One-particle states (Theorem — Wigner classification) for the modern derivation, and QFT/preliminaries.md § Wigner's Classification for the standalone reference.
1.1 Motivation: Klein–Gordon and the Negative-Probability Problem
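The simplest relativistic wave equation is obtained by quantizing the relativistic dispersion $E^2 = \mathbf{p}^2 + m^2$ via $E \to i\partial_t$, $\mathbf{p} \to -i\nabla$:

$$(\partial_\mu\partial^\mu + m^2)\,\phi(x) = 0.$$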
It is Lorentz-invariant but second-order in time, which has two unwanted consequences for a single-particle wavefunction interpretation:
- The conserved current $j^\mu = i(\phi^*\partial^\mu\phi - \phi\,\partial^\mu\phi^*)$ has $j^0$ that can be negative, which is incompatible with a probability density.
- It admits both positive- and negative-energy solutions with no obvious way to discard the negative-energy ones.
Both problems motivated Dirac to seek a first-order relativistic equation — the postulate H1 below.
(In modern QFT, KG is fine: it is the equation of motion of a free scalar field operator, and "negative probability" was a category error from insisting on a wavefunction interpretation.)
Other routes to Klein–Gordon. The canonical-substitution sketch above is the historical route ("Route A"). KG can equivalently be derived as the Euler–Lagrange equation of the free scalar Lagrangian (Lagrangian route, "Route B"), or as a theorem — the position-space form of the first-Poincaré-Casimir constraint on a spin-0 Wigner irrep (representation-theoretic route, "Route C"). See QFT/preliminaries.md § Worked example: free real scalar field and the Klein–Gordon equation for the three-route comparison. The first-order/probability-current concerns motivating H1 are specific to Route A; in the modern routes there is no such problem (KG is the EOM of a scalar field operator, never a single-particle wavefunction).
1.2 Demand a first-order relativistic wave equation (Postulate H1)
Seek a Lorentz-invariant wave equation for a single relativistic spin-$\tfrac12$ particle that is first-order in $\partial_\mu$, so as to admit a positive-definite probability current and avoid the negative-norm problems of §1.1.
This is a postulate in two senses: it picks first-order over second-order (a real physical assumption motivated by probability-current concerns), and it implicitly chooses the spinor representation of the Lorentz group over scalar or vector ones.
First quantization: the unstated state-space framework
"Single relativistic spin-$\tfrac12$ particle" is not a neutral phrase: it commits the historical narrative to the framework of first quantization without ever naming it. For symmetry with §3.3 (where second quantization is performed explicitly), it is worth stating what has been silently assumed.
First quantization here means: take a relativistic single particle and put it on the Hilbert space

$$\mathcal{H}_1 = L^2(\mathbb{R}^3) \otimes \mathbb{C}^N,$$

a tensor product of two factors:
- $L^2(\mathbb{R}^3)$: the standard configuration-space Hilbert space of non-relativistic single-particle QM (square-integrable wavefunctions on space; see QM/preliminaries.md). The historical route carries this factor over by analogy when going relativistic; it is not itself derived from anything in this document.
- $\mathbb{C}^N$: a finite-dimensional internal space carrying the spin / Lorentz-representation degrees of freedom of $\psi$. This is the new ingredient demanded by representing the Lorentz group on a single-particle wavefunction (a scalar wavefunction would take $N = 1$ trivially).
The dimension $N$ of $\mathbb{C}^N$ is not fixed by H1 alone. It will be determined in §1.3 by the algebraic consequences of the first-order ansatz: requiring the first-order operator $\gamma^\mu\partial_\mu$ to square to the d'Alembertian $\partial_\mu\partial^\mu$ that appears in the Klein–Gordon operator forces the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$, whose smallest faithful representation is by $4 \times 4$ matrices, so $N = 4$ and

$$\mathcal{H}_1 = L^2(\mathbb{R}^3) \otimes \mathbb{C}^4.$$

For the rest of §1.2 the precise dimension does not matter; only the product structure does.
The inner product is

$$\langle\psi|\phi\rangle = \int d^3x\;\psi^\dagger(\mathbf{x})\,\phi(\mathbf{x}).$$

A state is a single vector $|\psi\rangle \in \mathcal{H}_1$, equivalently a wavefunction $\psi(\mathbf{x}) \in \mathbb{C}^4$. Time evolution is governed (in the Schrödinger picture) by the Dirac Hamiltonian

$$H_D = \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m, \qquad \alpha^i \equiv \gamma^0\gamma^i,\quad \beta \equiv \gamma^0,\quad \mathbf{p} = -i\nabla,$$

via $i\partial_t\psi = H_D\psi$, which is just the Dirac equation in disguise.
This is the framework that H1 (here) and H2 (§2.1) operate inside. Particle number is fixed at $1$: there is no vacuum, no antiparticles in a clean way, and no operators that can change the particle count. Only at §3.3 (postulate H3) does this framework get replaced by Fock-space second quantization, in which $\psi$ becomes an operator on a multi-particle state space. The relationship and the comparison of pre- and post-quantization states is laid out in QFT/fock-space-inventory.md §0.
Terminology note. "First" and "second" quantization are unfortunate names — there is only one Hilbert-space construction (first quantization); the second step is really field quantization. The historical names have stuck.
Loose end: Lorentz covariance. The position-space wavefunction $\psi(t, \mathbf{x})$ singles out a preferred spatial slice (every frame has its own $t = \text{const}$ hypersurface), and so this state space is not manifestly Lorentz-covariant. The genuinely covariant single-particle space is the momentum-space Wigner irrep, where Lorentz transformations act unitarily. The Fourier transform relating the two carries a measure mismatch that makes the position-space "wavefunction" not quite a probability amplitude in the relativistic sense, which is connected to the Newton–Wigner localization issues. The historical route ignores all of this; the modern treatment in foundations-modern.md § 1.1.3 — The natural basis: momentum and spin, not position addresses it head-on.
1.3 Derivation of the Clifford algebra and gamma matrices (Derived)
Writing the ansatz $(i\gamma^\mu\partial_\mu - m)\psi = 0$ and requiring that squaring the operator reproduce the Klein–Gordon equation forces

$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}.$$

Pure algebra then shows the smallest faithful representation is by $4 \times 4$ matrices, so $\psi$ has four complex components. None of this is an additional input; it all follows from H1. The explicit matrix representations and bilinear notation are collected in §0.1.
The Clifford algebra abstractly. The relation above is the defining identity of the Clifford algebra $\mathrm{Cl}(1,3)$: the associative algebra generated by four vectors $\gamma^\mu$ with $\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2\eta^{\mu\nu}$. As a general construction (any vector space with a quadratic form), Clifford algebras subsume the complex numbers, quaternions, and exterior algebras as special cases; their classification (Bott periodicity), spinor representations, and connection to Spin groups (e.g. $\mathrm{Spin}(1,3) \cong SL(2,\mathbb{C})$, the universal cover used in the modern route) are collected in math/clifford-algebra.md. The collapsible block below specializes to this algebra in $d = 4$.
Why exactly 4 × 4? (counting argument + small-case obstructions)
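The claim "smallest faithful representation is $4 \times 4$" deserves an argument. It follows from (i) the dimension of the Clifford algebra and (ii) a check that the smaller cases concretely fail. The general $d$-dimensional version of the result, and its connection to the Bott classification, is in math/clifford-algebra.md § 5.
Where the Clifford algebra comes from (recap). Squaring the Dirac operator:

$$(\gamma^\mu\partial_\mu)^2 = \gamma^\mu\gamma^\nu\,\partial_\mu\partial_\nu.$$

Because partial derivatives commute, we may symmetrize:

$$\gamma^\mu\gamma^\nu\,\partial_\mu\partial_\nu = \tfrac12\{\gamma^\mu, \gamma^\nu\}\,\partial_\mu\partial_\nu.$$

For this to equal $\eta^{\mu\nu}\partial_\mu\partial_\nu = \partial_\mu\partial^\mu$ (so that the second-order equation is exactly $(\partial_\mu\partial^\mu + m^2)\psi = 0$) we need $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$. So the $\gamma$'s cannot be numbers (numbers commute, but $\gamma^\mu\gamma^\nu = -\gamma^\nu\gamma^\mu$ for $\mu \neq \nu$ forces non-commutation); they must be matrices.
Counting argument: $\dim\mathrm{Cl}(1,3) = 2^4 = 16$. The Clifford algebra generated by four anticommuting symbols $\gamma^\mu$ subject only to $\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu}$ has a canonical basis of antisymmetric products: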
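| Length | Basis elements | Count |
|---|---|---|
| 0 | $\mathbb{1}$ | 1 |
| 1 | $\gamma^\mu$ | 4 |
| 2 | $\gamma^\mu\gamma^\nu$ ($\mu < \nu$) | 6 |
| 3 | $\gamma^\mu\gamma^\nu\gamma^\rho$ ($\mu < \nu < \rho$) | 4 |
| 4 | $\gamma^0\gamma^1\gamma^2\gamma^3$ | 1 |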
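Total $1 + 4 + 6 + 4 + 1 = 16$. A representation on $\mathbb{C}^N$ embeds the algebra in $M_N(\mathbb{C})$, which has dimension $N^2$. Faithfulness (injectivity) demands the image be 16-dimensional, hence

$$N^2 \geq 16 \quad\Longrightarrow\quad N \geq 4.$$

So $N = 1, 2, 3$ are ruled out purely by counting. But it is illuminating to see why the small cases fail concretely.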
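The counting argument can be reproduced mechanically. The sketch below enumerates ordered products of distinct generators (the labels `g0` to `g3` are placeholders, not matrices) and reads off the smallest matrix size compatible with faithfulness.

```python
from itertools import combinations
from math import comb, isqrt

GENERATORS = ["g0", "g1", "g2", "g3"]  # placeholder labels for the four gamma matrices

# Canonical basis: ordered products of distinct generators, one entry per subset.
basis = [combo for k in range(len(GENERATORS) + 1) for combo in combinations(GENERATORS, k)]
counts_by_length = [comb(len(GENERATORS), k) for k in range(len(GENERATORS) + 1)]
algebra_dim = len(basis)

# Smallest N with N^2 >= dim: a necessary condition for a faithful rep on C^N.
min_matrix_size = isqrt(algebra_dim - 1) + 1

print(counts_by_length, algebra_dim, min_matrix_size)  # [1, 4, 6, 4, 1] 16 4
```

This only establishes the lower bound $N \geq 4$; that $N = 4$ is actually achieved relies on the explicit representations of §0.1.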
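$N = 1$: numbers. Numbers commute, so $\{\gamma^0, \gamma^1\} = 0$ would force $\gamma^0\gamma^1 = 0$, which is not faithful.
$N = 2$: spatial gammas fit, $\gamma^0$ has no room. A tempting attempt in $M_2(\mathbb{C})$: set $\gamma^i = i\sigma^i$ for $i = 1, 2, 3$. Then

$$\{\gamma^i, \gamma^j\} = -\{\sigma^i, \sigma^j\} = -2\delta^{ij},$$

matching the spacelike Clifford relations. The obstruction is $\gamma^0$: it must satisfy $(\gamma^0)^2 = +\mathbb{1}$ and $\{\gamma^0, \gamma^i\} = 0$ for all $i$. But:
Lemma. The only complex $2 \times 2$ matrix $M$ with $\{M, \sigma^i\} = 0$ for all $i = 1, 2, 3$ is $M = 0$.
Proof. $M_2(\mathbb{C})$ is spanned by $\{\mathbb{1}, \sigma^1, \sigma^2, \sigma^3\}$. Write $M = a_0\mathbb{1} + a_j\sigma^j$. Then $\{M, \sigma^i\} = 2a_0\sigma^i + 2a_i\mathbb{1} = 0$ forces $a_0 = 0$ and all $a_i = 0$, so $M = 0$.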
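The identity used in the proof, $\{M, \sigma^i\} = 2a_0\sigma^i + 2a_i\mathbb{1}$, can be spot-checked for a sample matrix. This is a sketch in pure Python; the coefficient values are arbitrary illustrative choices and the helper names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

# Arbitrary sample coefficients (a_0, a_1, a_2, a_3) with integer real and imaginary parts.
coeffs = [2 + 1j, -1, 3j, 4]

# M = a_0 * 1 + a_j sigma^j
M = scale(coeffs[0], I2)
for a_j, sigma_j in zip(coeffs[1:], SIGMA):
    M = add(M, scale(a_j, sigma_j))

# {M, sigma^i} should equal 2 a_0 sigma^i + 2 a_i * 1 for each i.
identity_holds = all(
    anticommutator(M, SIGMA[i]) == add(scale(2 * coeffs[0], SIGMA[i]), scale(2 * coeffs[i + 1], I2))
    for i in range(3)
)
print(identity_holds)
```

Because $\mathbb{1}$ and $\sigma^i$ are linearly independent, the right-hand side vanishes for all three $i$ only when every coefficient is zero, which is the content of the lemma.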
There is no nontrivial $\gamma^0$ in $M_2(\mathbb{C})$: the three spatial gammas already exhaust the "anticommuting room", and a fourth generator that anticommutes with all of them cannot exist. (This is the counting argument made concrete: $\dim\mathrm{Cl}(1,3) = 16$, but $\dim M_2(\mathbb{C}) = 4$, so 12 dimensions' worth of elements must collapse, and you can see which ones get squeezed out first.)
$N = 3$: ruled out by counting. $\dim M_3(\mathbb{C}) = 9 < 16$, so no faithful rep can exist. There is also no natural "Pauli-like" structure in $M_3(\mathbb{C})$ to attempt.
$N = 4$: works, and is unique. $\dim M_4(\mathbb{C}) = 16$, exactly the Clifford-algebra dimension. The general classification gives

$$\mathrm{Cl}(1,3) \otimes_{\mathbb{R}} \mathbb{C} \;\cong\; M_4(\mathbb{C}),$$

so a faithful 4-dimensional representation exists, is unique up to similarity transformations, and every element of the Clifford algebra is represented by a distinct matrix. The Dirac, Weyl, and Majorana representations in §0.1 are three concrete choices of similarity transform.
The 4 in $\psi \in \mathbb{C}^4$ is therefore not a representational accident; it is the unique minimum dimension at which the algebraic defining relation admits a non-collapsing realization in $d = 4$ spacetime dimensions. For the general-$d$ pattern (Bott periodicity, when Majorana/Weyl/Majorana–Weyl spinors exist) see math/clifford-algebra.md § 5 and § 7.
1.4 The Dirac equation (Derived)
The result of §1.2 + §1.3 is the Dirac equation:

$$(i\gamma^\mu\partial_\mu - m)\,\psi(x) = 0.$$

At this stage $\psi$ is treated as an ordinary relativistic wavefunction (a single-particle theory), not yet as a quantum field. Its plane-wave solutions (§1.5), conserved current $j^\mu = \bar\psi\gamma^\mu\psi$, and non-relativistic limit (the Pauli equation, with $g = 2$, §2.4) are all derived consequences.
1.5 Plane-wave solutions and completeness relations (Derived)
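Plugging the plane-wave ansatz $\psi(x) = u(p)\,e^{-ip\cdot x}$ into the free Dirac equation gives the algebraic equation $(\not\!p - m)\,u(p) = 0$, with two linearly independent positive-energy solutions $u^s(p)$ for $s = 1, 2$. Similarly, $\psi(x) = v(p)\,e^{+ip\cdot x}$ gives $(\not\!p + m)\,v(p) = 0$, with two negative-energy solutions $v^s(p)$ that, after second quantization (§3.3), describe antiparticles.
Standard normalization: $\bar u^r(p)\,u^s(p) = 2m\,\delta^{rs}$, $\bar v^r(p)\,v^s(p) = -2m\,\delta^{rs}$, and the spin sums

$$\sum_s u^s(p)\,\bar u^s(p) = \not\!p + m, \qquad \sum_s v^s(p)\,\bar v^s(p) = \not\!p - m.$$

These appear throughout Feynman-diagram calculations (§5) and are the engine behind Casimir's trick (§5.4.2). The negative-energy spectrum signaled by the $v$ family is the technical input to the consistency problem of §3.1, which forces the move to second quantization.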
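As a numerical sanity check of the positive-energy statements, the sketch below builds $u^s(p) = \sqrt{E+m}\,\bigl(\chi^s,\ \tfrac{\boldsymbol\sigma\cdot\mathbf p}{E+m}\chi^s\bigr)$ and tests the Dirac condition, the normalization, and the spin sum. The explicit spinor form, the Dirac representation, the signature $(+,-,-,-)$, and the sample values of `mass` and `p` are assumptions made for illustration.

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(*mats):
    """Entrywise sum of any number of equally sized matrices."""
    return [[sum(entries) for entries in zip(*rows)] for rows in zip(*mats)]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

ID4 = block(I2, Z2, Z2, I2)
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))] + [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]

mass = 1.0
p = [0.3, -0.4, 1.2]  # illustrative three-momentum
energy = math.sqrt(mass**2 + sum(x * x for x in p))

# pslash = gamma^0 E - gamma . p  (signature +,-,-,-)
pslash = add(scale(energy, GAMMA[0]), *[scale(-p[i], GAMMA[i + 1]) for i in range(3)])
sigma_dot_p = add(*[scale(p[i], SIGMA[i]) for i in range(3)])

def u_spinor(chi):
    """u(p) = sqrt(E+m) * (chi, sigma.p/(E+m) chi) in the Dirac representation."""
    lower = matvec(scale(1 / (energy + mass), sigma_dot_p), chi)
    norm = math.sqrt(energy + mass)
    return [norm * x for x in chi + lower]

def dirac_adjoint(u):
    """Row vector ubar = u^dagger gamma^0."""
    return [sum(complex(u[i]).conjugate() * GAMMA[0][i][j] for i in range(4)) for j in range(4)]

spinors = [u_spinor(chi) for chi in ([1, 0], [0, 1])]

# (pslash - m) u = 0 for both spin states
dirac_op = add(pslash, scale(-mass, ID4))
max_residual = max(abs(x) for u in spinors for x in matvec(dirac_op, u))

# ubar u = 2m
norms = [sum(a * b for a, b in zip(dirac_adjoint(u), u)) for u in spinors]

# sum_s u ubar = pslash + m
outer_products = []
for u in spinors:
    ubar = dirac_adjoint(u)
    outer_products.append([[u[i] * ubar[j] for j in range(4)] for i in range(4)])
spin_sum = add(*outer_products)
target = add(pslash, scale(mass, ID4))
max_spin_sum_error = max(abs(spin_sum[i][j] - target[i][j]) for i in range(4) for j in range(4))

print(max_residual, norms, max_spin_sum_error)
```

The residual and the spin-sum error should be at the level of floating-point round-off, and both entries of `norms` should be close to `2 * mass`.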
2. Coupling to Electromagnetism
2.1 Minimal coupling (Postulate H2 — heuristic)
Borrow from classical electrodynamics (§0.2), where a charged particle in an external electromagnetic field obeys equations of motion with the canonical momentum replaced by the kinetic momentum:

$$p^\mu \to p^\mu - eA^\mu.$$

Apply this minimal substitution to the Dirac equation (with $p_\mu \to i\partial_\mu$):

$$\bigl(i\gamma^\mu(\partial_\mu + ieA_\mu) - m\bigr)\psi = 0.$$

This is the second postulate of the historical formulation. It is not derived; it is a physically motivated rule, justified after the fact by the consequences collected in §2.4: simplicity, the correct non-relativistic Pauli equation with $g = 2$, and the relativistic Lorentz force law.
In the modern gauge-principle derivation H2 becomes a theorem, not a postulate.
2.2 Definition of the covariant derivative (Definition)
Define $D_\mu \equiv \partial_\mu + ieA_\mu$. The minimally coupled Dirac equation is then

$$(i\gamma^\mu D_\mu - m)\,\psi = 0.$$
This is purely notation.
2.3 Gauge invariance (Derived)
Direct calculation shows that the simultaneous transformation

$$\psi(x) \to e^{-ie\alpha(x)}\,\psi(x), \qquad A_\mu(x) \to A_\mu(x) + \partial_\mu\alpha(x)$$
leaves the minimally coupled Dirac equation invariant. So gauge invariance is a derived consequence of postulate H2 — exactly the inverse of the modern viewpoint, in which gauge invariance is the input and minimal coupling is the output. (See QED.)
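The "direct calculation" can be spelled out in a few lines. The following sketch assumes the conventions $D_\mu = \partial_\mu + ieA_\mu$, $\psi \to e^{-ie\alpha}\psi$, $A_\mu \to A_\mu + \partial_\mu\alpha$; with a different sign convention for the charge the same cancellation goes through with the signs flipped consistently.

```latex
\begin{aligned}
D_\mu\psi \;\to\;& \bigl(\partial_\mu + ie(A_\mu + \partial_\mu\alpha)\bigr)\bigl(e^{-ie\alpha}\psi\bigr) \\
=\;& e^{-ie\alpha}\bigl(\partial_\mu\psi - ie(\partial_\mu\alpha)\psi + ieA_\mu\psi + ie(\partial_\mu\alpha)\psi\bigr) \\
=\;& e^{-ie\alpha}\,D_\mu\psi, \\[4pt]
(i\gamma^\mu D_\mu - m)\psi \;\to\;& e^{-ie\alpha}\,(i\gamma^\mu D_\mu - m)\psi = 0 .
\end{aligned}
```

The derivative of the phase and the shift of $A_\mu$ cancel term by term, so $D_\mu\psi$ transforms with the same phase as $\psi$ and the equation is preserved.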
2.4 Post-hoc justifications: Pauli equation, $g = 2$, Lorentz force (Derived)
H2 is a heuristic at the level of postulation, but several derived consequences vindicate it after the fact. These are consequences of the minimally coupled Dirac equation (§2.1), not independent postulates.
Non-relativistic limit: the Pauli equation. Block-diagonalising the minimally coupled Dirac equation in the standard representation and dropping $O(1/m^2)$ terms gives the Pauli equation

$$i\partial_t\varphi = \left[\frac{(\mathbf{p} - e\mathbf{A})^2}{2m} + eA^0 - \frac{e}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}\right]\varphi$$

for the two-component non-relativistic spinor $\varphi$. The last term is a magnetic moment coupling with gyromagnetic ratio $g = 2$. This was the historically dramatic prediction that vindicated the Dirac equation, since experiment had observed $g \approx 2$ for the electron whereas the orbital value would be $g = 1$.
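The origin of $g = 2$ can be made explicit with one identity. The sketch below assumes the convention $\boldsymbol\pi = \mathbf p - e\mathbf A$ with $\mathbf p = -i\nabla$, and the standard step in which eliminating the lower spinor components leaves the kinetic operator $(\boldsymbol\sigma\cdot\boldsymbol\pi)^2/2m$ acting on $\varphi$.

```latex
\begin{aligned}
(\boldsymbol\sigma\cdot\boldsymbol\pi)^2
  &= \boldsymbol\pi^2 + i\,\boldsymbol\sigma\cdot(\boldsymbol\pi\times\boldsymbol\pi), \\
\boldsymbol\pi\times\boldsymbol\pi
  &= -e\,(\mathbf p\times\mathbf A + \mathbf A\times\mathbf p)
   = ie\,(\nabla\times\mathbf A) = ie\,\mathbf B, \\
\frac{(\boldsymbol\sigma\cdot\boldsymbol\pi)^2}{2m}
  &= \frac{(\mathbf p - e\mathbf A)^2}{2m} - \frac{e}{2m}\,\boldsymbol\sigma\cdot\mathbf B
   = \frac{(\mathbf p - e\mathbf A)^2}{2m} - g\,\frac{e}{2m}\,\mathbf S\cdot\mathbf B,
  \qquad \mathbf S = \tfrac12\boldsymbol\sigma,\quad g = 2 .
\end{aligned}
```

The factor of 2 arises because the spin operator is $\boldsymbol\sigma/2$ while the Pauli term carries a full $\boldsymbol\sigma$.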
Classical limit: the Lorentz force. In the WKB / classical limit one recovers the relativistic Lorentz force law $m\,\dfrac{du^\mu}{d\tau} = e\,F^{\mu\nu}u_\nu$.
Simplicity. Minimal substitution is the simplest Lorentz-invariant coupling of $\psi$ to $A_\mu$ at the operator level.
All three results were used historically as post hoc justification of H2.
3. From Wave Equation to Quantum Field
3.1 The negative-energy problem (Derived — and why a new postulate is needed)
The free Dirac equation admits both positive- and negative-energy plane-wave solutions:

$$E = \pm\sqrt{\mathbf{p}^2 + m^2}.$$
This is a calculation, not a postulate — but it shows that a single-particle interpretation is impossible (the spectrum is unbounded below). A new ingredient must be added.
3.2 The Dirac sea (Heuristic — historical)
Dirac's original resolution: postulate that all negative-energy states are filled, and identify a "hole" in the sea with a positive-energy antiparticle of opposite charge. This predicted the positron, discovered shortly after by Anderson (1932).
This is conceptually awkward (it relies on an infinite filled vacuum) and was eventually superseded.
3.3 Second quantization (Postulate H3)
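The Dirac sea is replaced by a sharper axiom on $\psi$ itself.
Postulate H3. Promote the Dirac wavefunction $\psi(x)$ to an operator-valued field on a Fock space, satisfying the equal-time canonical anticommutation relations

$$\{\psi_a(t, \mathbf{x}),\ \psi_b^\dagger(t, \mathbf{y})\} = \delta_{ab}\,\delta^3(\mathbf{x} - \mathbf{y}), \qquad \{\psi_a(t, \mathbf{x}),\ \psi_b(t, \mathbf{y})\} = 0,$$

with $a, b$ the Dirac-spinor indices.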
Two things are postulated together here:
- The promotion of $\psi$ from c-number wavefunction to operator field.
- The choice of anticommutators (Fermi statistics) over commutators.
Both are forced upon you in the modern view by the spin–statistics theorem, but that theorem was proved later (Pauli 1940). Historically, anticommutators were postulated to ensure positive-definite energy and the Pauli exclusion principle.
Consequence: mode expansion. Solving the free Dirac equation as an operator equation, and using the plane-wave spinor basis from §1.5, fixes $\psi(x)$ uniquely up to the operator-valued coefficients of each mode:

$$\psi(x) = \int\!\frac{d^3p}{(2\pi)^3}\,\frac{1}{\sqrt{2E_{\mathbf{p}}}}\sum_s\left[a^s_{\mathbf{p}}\,u^s(p)\,e^{-ip\cdot x} + b^{s\dagger}_{\mathbf{p}}\,v^s(p)\,e^{+ip\cdot x}\right],$$

where $a^s_{\mathbf{p}}$ annihilates an electron and $b^{s\dagger}_{\mathbf{p}}$ creates a positron. Substituting this expansion into the postulated equal-time field anticommutators and using the spinor completeness relations of §1.5 reduces them to the ladder-operator anticommutators

$$\{a^r_{\mathbf{p}},\ a^{s\dagger}_{\mathbf{q}}\} = \{b^r_{\mathbf{p}},\ b^{s\dagger}_{\mathbf{q}}\} = (2\pi)^3\,\delta^3(\mathbf{p} - \mathbf{q})\,\delta^{rs}, \qquad \text{all other anticommutators vanishing}.$$

So the ladder form is a consequence of H3 in the plane-wave basis, not an independent postulate. The two forms are equivalent (one is the spatial Fourier transform of the other).
Consequence: Fock space. Once $\psi$ is an operator with the ladder operators above, the natural state space they act on is no longer the single-particle $\mathcal{H}_1$ of §1.2, but the Fock space

$$\mathcal{F} = \bigoplus_{n=0}^{\infty} \wedge^n\mathcal{H}_1,$$

with $\wedge^0\mathcal{H}_1 = \mathbb{C}$ the one-dimensional vacuum sector and $\wedge^n\mathcal{H}_1$ the antisymmetrized $n$-particle sector built from $\mathcal{H}_1$. The vacuum $|0\rangle$ is defined by $a^s_{\mathbf{p}}|0\rangle = b^s_{\mathbf{p}}|0\rangle = 0$ for all $\mathbf{p}, s$, and arbitrary multi-particle (and multi-antiparticle) states are generated by acting with creation operators on $|0\rangle$. This is the replacement of first quantization promised in §1.2.
Terminology warning. Before H3, the symbol $\psi$ denoted a single-particle wavefunction (a state). After H3 it denotes a field operator on Fock space. Same symbol, completely different mathematical object. From here on, "the state" of the system is a vector in Fock space (typically the vacuum or an asymptotic Fock state); $\psi$ is one of the operators acting on that space. See QFT/preliminaries.md § States vs. Fields for an extended discussion.
3.4 Antiparticles, vacuum, and positivity of energy (Derived)
Once H3 is in place, several historical puzzles resolve themselves automatically:
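- The vacuum $|0\rangle$ is annihilated by all $a^s_{\mathbf{p}}$ and $b^s_{\mathbf{p}}$.
- Negative-energy modes have been re-interpreted as positive-energy antiparticle creation; the Dirac sea disappears.
- The Hamiltonian $H = \int\!\frac{d^3p}{(2\pi)^3}\sum_s E_{\mathbf{p}}\bigl(a^{s\dagger}_{\mathbf{p}}a^s_{\mathbf{p}} + b^{s\dagger}_{\mathbf{p}}b^s_{\mathbf{p}}\bigr)$ is positive after normal-ordering.
- Pauli exclusion follows from $(a^{s\dagger}_{\mathbf{p}})^2 = 0$.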
None of this is an additional input.
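A finite toy model makes the exclusion statement concrete. The sketch below represents two fermionic modes on a 4-dimensional space using a Jordan–Wigner construction (a technique swapped in for illustration; it is not how the continuum field is built) and checks the canonical anticommutators, $(a^\dagger)^2 = 0$, and the antisymmetry of two-particle states.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def kron(a, b):
    """Kronecker product of two square matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # parity string that enforces anticommutation between modes
LOWER = [[0, 1], [0, 0]]     # single-mode annihilator: |1> -> |0>

# Jordan-Wigner annihilators for two modes
ops = [kron(LOWER, I2), kron(Z, LOWER)]
dags = [transpose(a) for a in ops]   # real matrices, so transpose equals dagger

ID4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

car_ok = all(
    anticommutator(ops[i], dags[j]) == (ID4 if i == j else ZERO4)
    and anticommutator(ops[i], ops[j]) == ZERO4
    for i in range(2) for j in range(2)
)
exclusion_ok = all(matmul(d, d) == ZERO4 for d in dags)

vacuum = [1, 0, 0, 0]
vacuum_annihilated = all(matvec(a, vacuum) == [0, 0, 0, 0] for a in ops)
state_12 = matvec(dags[0], matvec(dags[1], vacuum))
state_21 = matvec(dags[1], matvec(dags[0], vacuum))
antisymmetric = state_12 == [-x for x in state_21]

print(car_ok, exclusion_ok, vacuum_annihilated, antisymmetric)
```

The same algebra, with the Kronecker delta replaced by $(2\pi)^3\delta^3(\mathbf p - \mathbf q)\delta^{rs}$, is what the ladder anticommutators encode in the continuum.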
4. Quantization of the Electromagnetic Field
4.1 Choice of gauge (Definition / convention)
Pick a gauge; for the historical treatment, Coulomb gauge $\nabla\cdot\mathbf{A} = 0$. This eliminates longitudinal and timelike components as non-dynamical (the latter by the Coulomb constraint) and leaves only the two transverse photon polarizations $\epsilon^\mu_\lambda(\mathbf{k})$, $\lambda = 1, 2$.
This is a convention, not a postulate — a different choice (Lorenz gauge with Gupta–Bleuler, or BRST) gives the same physics.
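4.2 Mode expansion of $A^\mu$ (Postulate, same flavor as H3, applied to bosons)
Postulate the bosonic mode expansion

$$A^\mu(x) = \int\!\frac{d^3k}{(2\pi)^3}\,\frac{1}{\sqrt{2\omega_{\mathbf{k}}}}\sum_{\lambda=1,2}\left[\epsilon^\mu_\lambda(\mathbf{k})\,a_{\mathbf{k}\lambda}\,e^{-ik\cdot x} + \epsilon^{\mu *}_\lambda(\mathbf{k})\,a^\dagger_{\mathbf{k}\lambda}\,e^{+ik\cdot x}\right],$$

with bosonic commutators $[a_{\mathbf{k}\lambda},\ a^\dagger_{\mathbf{k}'\lambda'}] = (2\pi)^3\,\delta^3(\mathbf{k} - \mathbf{k}')\,\delta_{\lambda\lambda'}$. Conceptually this is the same kind of postulate as H3 (promotion of a classical field to an operator with prescribed (anti)commutators), applied to a boson rather than a fermion.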
The instantaneous Coulomb interaction is reintroduced explicitly to compensate for the elimination of the timelike photon mode.
5. Dynamics and Predictions
5.0 Perturbative machinery: S-matrix, Dyson series, Wick's theorem (Generic QFT setup)
Why we need it. Postulates H1–H3 give us field equations and a Hilbert space, but no recipe for what an experimentalist measures. Real experiments prepare asymptotic states — well-separated wavepackets long before a collision — and detect asymptotic states long after. The S-matrix is the operator whose matrix elements encode exactly these probabilities, packaging all interaction effects into a single object. Cross sections, decay rates, and (via poles) bound-state energies are all built from S-matrix elements. We collect the generic construction here so that §5.2 can apply it to QED without distraction.
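Definition. Split the Hamiltonian as $H = H_0 + H_I$ with $H_0$ free. In the interaction picture, operators evolve under $H_0$ while states evolve under $H_I(t)$. The S-matrix is the interaction-picture evolution operator from $t = -\infty$ to $t = +\infty$:

$$S = \lim_{t_f \to +\infty,\ t_i \to -\infty} U_I(t_f, t_i).$$

Sketch of derivation (Dyson series). Solving the interaction-picture Schrödinger equation iteratively, and noting that the time-ordering operator $T$ accounts for the non-commutativity of $H_I(t)$ at different times, gives

$$S = T\exp\!\left(-i\int_{-\infty}^{\infty} dt\,H_I(t)\right) = \sum_{n=0}^{\infty}\frac{(-i)^n}{n!}\int dt_1\cdots dt_n\;T\{H_I(t_1)\cdots H_I(t_n)\}.$$

This is the Dyson series: the order-by-order expansion of $S$ in powers of the coupling.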
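To see what the time-ordering symbol and the $1/n!$ are doing, it helps to write out the second-order term. The sketch below uses the notation $S^{(2)}$ for that term (a label introduced here for illustration) and $H_I(t)$ for the interaction-picture Hamiltonian.

```latex
\begin{aligned}
S^{(2)} &= \frac{(-i)^2}{2!}\int_{-\infty}^{\infty}\!dt_1\int_{-\infty}^{\infty}\!dt_2\;
           T\{H_I(t_1)H_I(t_2)\} \\
        &= \frac{(-i)^2}{2!}\left[\int_{t_1>t_2}\!dt_1\,dt_2\;H_I(t_1)H_I(t_2)
           + \int_{t_2>t_1}\!dt_1\,dt_2\;H_I(t_2)H_I(t_1)\right] \\
        &= (-i)^2\int_{-\infty}^{\infty}\!dt_1\int_{-\infty}^{t_1}\!dt_2\;H_I(t_1)H_I(t_2).
\end{aligned}
```

The two regions contribute equally after relabeling $t_1 \leftrightarrow t_2$, so the $1/2!$ cancels and one recovers the nested integral produced by iterating the Schrödinger equation twice. At order $n$ the same argument cancels the $1/n!$ against the $n!$ orderings.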
How each term is evaluated (Wick's theorem). Each Dyson term is a vacuum/in/out matrix element of a time-ordered product of free fields. Wick's theorem rewrites such a product as a sum of normal-ordered products with all possible pairwise contractions, where each contraction is by definition a vacuum-expectation time-ordered product, i.e. a Feynman propagator (shown here for a generic free field $\phi$):

$$\text{contraction of } \phi(x)\,\phi(y) \;\equiv\; \langle 0|\,T\{\phi(x)\phi(y)\}\,|0\rangle = D_F(x - y).$$

Dyson series + Wick's theorem together turn perturbation theory into a sum of Feynman diagrams: one diagram per contraction pattern, with vertices coming from $H_I$ and lines from the propagators. The QED-specific application is in §5.2; the full mechanical derivation of the momentum-space rules is left to a standard QFT textbook (Peskin–Schroeder Ch. 4, Schwartz Ch. 7, Srednicki Ch. 9).
Transition operator (preview). $S$ is the primary operator, defined directly as the interaction-picture evolution above. Almost every later use, however, separates out the trivial no-scattering piece by writing

$$S = \mathbb{1} + iT.$$

The transition operator $T$ (or $T$-matrix) is defined from $S$ this way; it carries no information $S$ doesn't already carry. Its purpose is purely bookkeeping: for $|f\rangle \neq |i\rangle$, the matrix element $\langle f|S|i\rangle = \langle f|iT|i\rangle$ contains only genuine interaction effects, with the trivial $\langle f|i\rangle$ removed. This becomes important in §5.3 when we extract the invariant amplitude $\mathcal{M}$ by stripping the spacetime-translation $\delta^4$ from $\langle f|iT|i\rangle$.
Notation collision. The symbol $T$ in this document plays two unrelated roles: the time-ordering operator appearing inside the Dyson formula above (an instruction to reorder operators by time), and the transition operator just defined (an actual operator on Fock space). Context disambiguates: time-ordering always sits in front of a product of operators, transition sits between bra and ket as $\langle f|T|i\rangle$.
Caveat (Haag's theorem). The interaction picture does not strictly exist in interacting QFT; the construction above is a formal expansion, justified after the fact by renormalization. See QFT/remarks.md.
5.1 Interaction Hamiltonian (Derived)
From the minimally coupled Lagrangian (the Lagrangian whose Euler–Lagrange equation is the minimally coupled Dirac equation of §2.1) the interaction Hamiltonian is read off directly:

$$H_{\text{int}} = e\int d^3x\;\bar\psi\gamma^\mu\psi\,A_\mu.$$
No new input.
5.2 The QED S-matrix and Feynman rules (Derived)
Why we need it (here). §5.0 introduced the S-matrix in general; we now plug in the QED interaction Hamiltonian from §5.1 to get the explicit perturbative expansion for electron–photon processes. The end product is the QED Feynman rules, a graphical algorithm for computing any S-matrix element to any order in $e$. All of this is derivation, not postulation.
The QED Dyson series. Substituting $H_{\text{int}}$ into the general Dyson series of §5.0:

$$S = T\exp\!\left(-ie\int d^4x\;\bar\psi(x)\gamma^\mu\psi(x)\,A_\mu(x)\right).$$

Each factor $-ie\,\bar\psi\gamma^\mu\psi A_\mu$ in the integrand is a vertex contribution at one spacetime point.
Sketch of derivation of the Feynman rules. The chain from this expression to the momentum-space rules table consists of three mechanical steps (extending the generic §5.0 Wick step to QED-specific bookkeeping):
1. Apply Wick's theorem to each order-$n$ time-ordered product, producing a sum of normal-ordered terms with all pairwise contractions of $\psi$, $\bar\psi$, and $A_\mu$. Surviving terms (after taking the $\langle f|\cdots|i\rangle$ matrix element) are those where the uncontracted fields exactly match the in/out particle content.
2. Diagrammatic bookkeeping. Identify each algebraic piece with a graphical element:
   - Each interaction factor $-ie\,\bar\psi\gamma^\mu\psi A_\mu$ at $x$ → one vertex with three lines (one electron in, one out, one photon).
   - Each $\psi(x)\bar\psi(y)$ contraction → an internal electron line between $x$ and $y$, propagator $S_F(x - y)$.
   - Each $A_\mu(x)A_\nu(y)$ contraction → an internal photon line, propagator $D_F^{\mu\nu}(x - y)$.
   - Uncontracted fields acting on $|i\rangle$ or $\langle f|$ → external line factors (spinors $u, \bar u, v, \bar v$ for fermions; polarization vectors $\epsilon_\mu, \epsilon^*_\mu$ for photons).
3. Position → momentum space. Fourier-transform every propagator and external line. Spacetime integrals over each vertex become momentum-conserving delta functions $(2\pi)^4\delta^4(\sum p)$. Integrating away the internal delta functions leaves one overall delta function enforcing total energy–momentum conservation; the residue is the momentum-space Feynman rules dictionary:
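| Diagrammatic element | Algebraic factor |
|---|---|
| Vertex | $-ie\gamma^\mu$ |
| Internal electron line, momentum $p$ | $\dfrac{i(\not p + m)}{p^2 - m^2 + i\epsilon}$ |
| Internal photon line, momentum $k$ (Feynman gauge) | $\dfrac{-i\eta_{\mu\nu}}{k^2 + i\epsilon}$ |
| External electron in / out | $u^s(p)$ / $\bar u^s(p)$ |
| External positron in / out | $\bar v^s(p)$ / $v^s(p)$ |
| External photon in / out | $\epsilon_\mu(k)$ / $\epsilon^*_\mu(k)$ |
| Closed fermion loop | factor of $-1$ and a trace |
| Symmetry factor | $1/S$ for diagrams with internal symmetry |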
This is the same table referenced in QED Step 6 and applied in QED/compton.md to compute the Klein–Nishina cross section. The rules contain no new physics beyond $H_{\text{int}}$; they are a reorganization of perturbation theory into a graphical algorithm.
5.3 The Invariant Amplitude (Definition)
Why we need it. Raw S-matrix elements between momentum eigenstates always carry the overall delta function $(2\pi)^4\delta^4(p_f - p_i)$ from translation invariance, plus state-normalization factors $\sqrt{2E}$. Squaring such an element naively produces the meaningless $[\delta^4(p_f - p_i)]^2 \propto \delta^4(0)$. To extract a finite, Lorentz-invariant probability density usable in a cross-section formula we strip the delta function out by definition and work with the residue. That residue is what the Feynman rules of §5.2 actually compute; giving it a name turns the rules into a self-contained calculational object.
Definition. Recall from §5.0 that the trivial no-scattering piece of $S$ can be peeled off via $S = \mathbb{1} + iT$, where the transition operator $T$ encodes all genuine interaction effects. For an off-diagonal momentum-eigenstate matrix element of $iT$, factor the spacetime-translation delta function out:

$$\langle f|\,iT\,|i\rangle = (2\pi)^4\,\delta^4(p_f - p_i)\;i\mathcal{M}_{fi}.$$

The residue $\mathcal{M}_{fi}$ is the invariant amplitude (also called the matrix element, or just the M-matrix element). It is a Lorentz scalar (with possible spinor/polarization indices on the external states) and is finite at generic kinematics.
Notation note: $\mathcal{M}_{fi}$ vs. $\mathcal{M}$ vs. $\overline{|\mathcal{M}|^2}$. Three closely related symbols appear in the literature and in the rest of this doc:

| Symbol | Spins / polarizations | Typical use |
|---|---|---|
| $\mathcal{M}_{fi}$ | fixed, specific channel | definitions, polarized observables, probability of one transition |
| $\mathcal{M}$ | same as above, labels suppressed | shorthand when the channel is clear from context |
| $\overline{\lvert\mathcal{M}\rvert^2}$ | summed over final, averaged over initial | unpolarized cross sections; what Casimir's trick (§5.4.2) computes |

So $\mathcal{M} \equiv \mathcal{M}_{fi}$ (notation only, same object), while $\overline{|\mathcal{M}|^2}$ is a different object: a sum over an ensemble of $|\mathcal{M}_{fi}|^2$ values. Below we keep the subscript when the channel-specific meaning matters and drop it when it does not.
Sketch of derivation from .
- Identity subtraction. Forward scattering trivially has $\langle f|\mathbb{1}|i\rangle = \langle f|i\rangle$; only the interacting part $iT$ produces transitions with $|f\rangle \neq |i\rangle$, which is why $T$ was introduced above.
- Translation invariance ⇒ overall $\delta^4$. The interaction Hamiltonian density is invariant under spacetime translations $x \to x + a$. Each Dyson term, between momentum eigenstates, can therefore be written as $e^{i(p_f - p_i)\cdot X}$ times a function of relative coordinates, where $X$ is the centre of the diagram. Integrating over $X$ produces the universal factor $(2\pi)^4\delta^4(p_f - p_i)$. Defining $\mathcal{M}$ as the residue is the precise statement of step 3 in §5.2 ("after integrating away delta functions, read off the rules").
- Momentum-space Feynman rules compute $i\mathcal{M}$ directly. Each tabulated rule (vertex $-ie\gamma^\mu$, internal propagator, external spinor/polarization) is the contribution to $i\mathcal{M}$ from the corresponding diagrammatic element. Summing over all topologically distinct diagrams at order $n$ in $e$ gives $i\mathcal{M}$ to that order.
What you do with it. The whole point of $\mathcal{M}$ is that the physically observable quantity is $|\mathcal{M}|^2$, not $\mathcal{M}$ itself. The next subsection unpacks what that means and how it is computed.
5.4 The Squared Amplitude
Why it is the physical object. Quantum mechanics produces complex amplitudes; experiments measure probabilities. The Born rule (QM Postulate 3) supplies the bridge: $P(i \to f) = |\langle f|S|i\rangle|^2$. Applied to scattering with $|f\rangle \neq |i\rangle$ and the delta-function-stripping of §5.3, the relevant probability density is

$$|\mathcal{M}_{fi}|^2 = \mathcal{M}_{fi}^{*}\,\mathcal{M}_{fi},$$

a real, non-negative, Lorentz-invariant function of the external momenta and spins/polarizations. (The $\delta^4(0)$ subtlety produced by squaring the raw S-matrix element is what makes the $\mathcal{M}$-as-residue definition useful: $|\mathcal{M}|^2$ has no leftover delta-function squared.)
The full QED interaction probability density per pair, derived inside QFT/cross-sections.md §3, is
from which cross sections and decay rates are read off.
5.4.1 Spin and polarization sums: $\overline{|\mathcal{M}|^2}$
Real experiments rarely measure individual spin/polarization channels. Two operations bring $|\mathcal{M}_{fi}|^2$ to a directly comparable form:
- Average over initial-state spins/polarizations the experimenter does not control: divide by the multiplicity (e.g. a factor $\tfrac14$ for two spin-$\tfrac12$ particles, $\tfrac12$ per unpolarized photon).
- Sum over final-state spins/polarizations the detector does not resolve.
The overlined notation

$$\overline{|\mathcal{M}|^2} \equiv \frac{1}{N_{\text{initial}}}\sum_{\text{initial spins}}\ \sum_{\text{final spins}} |\mathcal{M}_{fi}|^2$$

is universal in cross-section formulas. Polarized observables (e.g. spin asymmetries) keep specific spin labels instead and use $|\mathcal{M}_{fi}|^2$ unaveraged.
5.4.2 Computational technology: Casimir's trick (trace technology)
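For a generic QED amplitude built from spinor bilinears, the channel-specific amplitude has the schematic form

$$\mathcal{M}_{fi} = \bar u^{s'}(p')\,\Gamma\,u^{s}(p),$$

where $\Gamma$ is some product of $\gamma$-matrices and propagators (and here $s, s'$ denote the specific spins in addition to the fixed external momenta). Taking the modulus squared and using the conjugation identity $(\bar u'\,\Gamma\,u)^* = \bar u\,\bar\Gamma\,u'$ (with $\bar\Gamma \equiv \gamma^0\Gamma^\dagger\gamma^0$):

$$|\mathcal{M}_{fi}|^2 = \bigl[\bar u^{s'}(p')\,\Gamma\,u^{s}(p)\bigr]\bigl[\bar u^{s}(p)\,\bar\Gamma\,u^{s'}(p')\bigr].$$

Summing over the initial- and final-state spins and using the completeness relations

$$\sum_s u^s(p)\,\bar u^s(p) = \not\!p + m, \qquad \sum_s v^s(p)\,\bar v^s(p) = \not\!p - m$$

collapses the explicit $u$-spinors into projectors, and the spinor matrix product becomes a trace over Dirac indices:

$$\sum_{s, s'} |\mathcal{M}_{fi}|^2 = \mathrm{Tr}\bigl[(\not\!p' + m)\,\Gamma\,(\not\!p + m)\,\bar\Gamma\bigr].$$

This is Casimir's trick (also called the Casimir trick / spin-sum-as-trace). The remaining computation is purely algebraic: apply the standard trace identities

$$\mathrm{Tr}[\gamma^\mu\gamma^\nu] = 4\eta^{\mu\nu}, \qquad \mathrm{Tr}[\gamma^\mu\gamma^\nu\gamma^\rho\gamma^\sigma] = 4\bigl(\eta^{\mu\nu}\eta^{\rho\sigma} - \eta^{\mu\rho}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\rho}\bigr),$$

traces of an odd number of $\gamma$'s vanish, etc., to reduce to a function of Lorentz-invariant dot products $p\cdot p'$, $p\cdot k$, ... (i.e. Mandelstam variables $s, t, u$). Photon polarization sums proceed analogously: $\sum_\lambda \epsilon_\mu\epsilon^*_\nu \to -\eta_{\mu\nu}$ (in covariant gauges, with the Ward identity guaranteeing the longitudinal/timelike pieces drop out of physical amplitudes).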
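The trace identities quoted above can be verified by brute force over all index combinations. The sketch below assumes the Dirac representation and signature $(+,-,-,-)$; since traces are invariant under similarity transformations, the result does not depend on that choice. Helper names are illustrative.

```python
from functools import reduce
from itertools import product

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ETA = [1, -1, -1, -1]  # assumed signature (+,-,-,-)

def block(a, b, c, d):
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

def neg(m):
    return [[-x for x in row] for row in m]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

def eta(m, n):
    return ETA[m] if m == n else 0

def trace_of_product(indices):
    """Trace of gamma^{i1} gamma^{i2} ... for the given index tuple."""
    prod = reduce(matmul, (GAMMA[i] for i in indices))
    return sum(prod[i][i] for i in range(4))

two_ok = all(
    trace_of_product((m, n)) == 4 * eta(m, n)
    for m, n in product(range(4), repeat=2)
)
three_ok = all(
    trace_of_product(idx) == 0
    for idx in product(range(4), repeat=3)
)
four_ok = all(
    trace_of_product((m, n, r, s))
    == 4 * (eta(m, n) * eta(r, s) - eta(m, r) * eta(n, s) + eta(m, s) * eta(n, r))
    for m, n, r, s in product(range(4), repeat=4)
)
print(two_ok, three_ok, four_ok)
```

In practice these identities are applied symbolically; the numerical check is only meant to confirm signs and index placement.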
The Klein–Nishina calculation in QED/compton.md is a worked example of this entire pipeline.
5.4.3 What $|\mathcal{M}|^2$ means physically
- It is a probability density per phase-space point. Multiplied by the Lorentz-invariant phase-space measure $d\Pi_n$ and integrated, it produces a probability rate (cross section or decay rate). For a $2 \to n$ process the bare $|\mathcal{M}|^2$ has mass dimension $4 - 2n$ (Lorentz-invariant phase space carries the rest of the dimensions).
- It is Lorentz invariant. Both $\mathcal{M}$ and the relativistic-normalization phase space $d\Pi_n$ transform as Lorentz scalars, so $|\mathcal{M}|^2$ can be quoted in any frame and substituted unchanged.
- It encodes interference. When several diagrams contribute at the same order in $e$, $\mathcal{M} = \sum_k \mathcal{M}_k$ and $|\mathcal{M}|^2 = \sum_k |\mathcal{M}_k|^2 + \sum_{k \neq l}\mathcal{M}_k\mathcal{M}_l^*$. The cross terms are quantum-mechanical interference between Feynman diagrams, the same phenomenon that distinguishes Compton's $s$- and $u$-channel diagrams from an incoherent sum, or Bhabha's $s$- and $t$-channel diagrams.
- It must be gauge-invariant. Although individual diagrams in covariant gauges may depend on the gauge parameter, the sum contributing to $\mathcal{M}$ at fixed external states does not. This is enforced by the Ward identity $k_\mu\mathcal{M}^\mu = 0$ on amplitudes with an external photon of momentum $k$.
- Crossing symmetry. The same analytic function describes processes related by moving particles between initial and final states (e.g. $e^+e^- \to \gamma\gamma$ vs. Compton $e^-\gamma \to e^-\gamma$); $|\mathcal{M}|^2$ inherits this, with kinematic re-labeling.
- Optical theorem. $2\,\mathrm{Im}\,\mathcal{M}(i \to i) = \sum_f\int d\Pi_f\,|\mathcal{M}(i \to f)|^2$ (S-matrix unitarity $S^\dagger S = \mathbb{1}$, sandwiched between $\langle i|$ and $|i\rangle$, after the $S = \mathbb{1} + iT$ split); see QFT/cross-sections.md § Optical Theorem for the derivation.
- Non-relativistic limit. In the kinematic regime where particles are slow and external lines reduce to non-relativistic wavefunctions, $\mathcal{M} \propto \langle f|V|i\rangle$ where $\langle f|V|i\rangle$ is the matrix element of a non-relativistic scattering potential, and the transition rate reduces to Fermi's Golden Rule $\Gamma_{i \to f} = 2\pi\,|\langle f|V|i\rangle|^2\,\rho(E_f)$. The QFT formula and Fermi's Golden Rule are the same Born-rule statement at different levels of relativistic completeness.
5.4.4 Worked-example pipeline
For any QED process, the calculation chain is:
1. Draw all Feynman diagrams at the desired order in $e$.
2. Apply momentum-space Feynman rules (§5.2 table) to write $i\mathcal{M}$ as a sum of diagram contributions.
3. Take $|\mathcal{M}|^2$, expanding interference terms.
4. Sum/average over spins and polarizations using completeness relations → $\overline{|\mathcal{M}|^2}$ as a Dirac trace.
5. Evaluate the trace with $\gamma$-matrix identities, expressing the result in Mandelstam variables.
6. Plug into $d\sigma$ (or $d\Gamma$) and integrate over the desired phase-space region.
Steps 1–2 are diagrammatic; steps 3–5 are algebra; step 6 is kinematic integration. The Klein–Nishina cross section (QED/compton.md) walks through all six steps explicitly.
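For the kinematic bookkeeping in step 5, the Mandelstam variables of a $2 \to 2$ process are $s = (p_1 + p_2)^2$, $t = (p_1 - p_3)^2$, $u = (p_1 - p_4)^2$, and they satisfy $s + t + u = \sum_i m_i^2$. The sketch below checks this for Compton-like kinematics in the centre-of-mass frame; the numerical values of `m`, `k`, and `theta` are arbitrary illustrative inputs, and the function names are hypothetical.

```python
import math

def minkowski_dot(a, b):
    """Four-vector product with signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def add4(a, b):
    return [x + y for x, y in zip(a, b)]

def sub4(a, b):
    return [x - y for x, y in zip(a, b)]

def mandelstam(p1, p2, p3, p4):
    """Return (s, t, u) for the process p1 + p2 -> p3 + p4."""
    s_vec, t_vec, u_vec = add4(p1, p2), sub4(p1, p3), sub4(p1, p4)
    return (minkowski_dot(s_vec, s_vec),
            minkowski_dot(t_vec, t_vec),
            minkowski_dot(u_vec, u_vec))

m = 0.511      # illustrative massive-particle mass
k = 2.0        # centre-of-mass momentum magnitude
theta = 0.7    # scattering angle in radians
energy = math.sqrt(m * m + k * k)

# Centre-of-mass frame: massive particle (p1 -> p3) and massless particle (p2 -> p4).
p1 = [energy, 0.0, 0.0, k]
p2 = [k, 0.0, 0.0, -k]
p3 = [energy, k * math.sin(theta), 0.0, k * math.cos(theta)]
p4 = [k, -k * math.sin(theta), 0.0, -k * math.cos(theta)]

s, t, u = mandelstam(p1, p2, p3, p4)
sum_rule_error = abs((s + t + u) - 2 * m * m)

print(s, t, u, sum_rule_error)
```

Here $t \le 0$ for physical scattering angles, and the sum rule leaves only two independent invariants, which is why the final $\overline{|\mathcal{M}|^2}$ can always be written in terms of two of $s, t, u$.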
5.5 Renormalization (Pragmatic procedure)
Loop integrals diverge; regularize and absorb divergences into multiplicative redefinitions of $m$, $e$, $\psi$, $A_\mu$. The Ward–Takahashi identities ($Z_1 = Z_2$) follow from the gauge invariance derived in §2.3. Renormalization is not strictly a postulate, but it relies on the empirical fact that QED is renormalizable, which was proved only later (BPHZ).
6. Summary: Postulates vs. Derived Results
| Item | Status |
|---|---|
| First-order relativistic wave equation | Postulate H1 |
| Clifford algebra, 4-component spinors | Derived from H1 |
| Free Dirac equation | Derived from H1 |
| Conserved current, plane-wave solutions, non-rel. limit | Derived |
| Minimal coupling | Postulate H2 (heuristic) |
| Covariant derivative | Definition |
| Gauge invariance , | Derived (consequence of H2) |
| Negative-energy spectrum | Derived (and motivates need for new input) |
| as operator field with equal-time canonical anticommutators | Postulate H3 |
| Mode expansion of with ladder anticommutators ; Fock space | Derived from H3 + free Dirac equation |
| Antiparticles, positivity of energy, Pauli exclusion | Derived from H3 |
| Mode expansion of with bosonic commutators | Postulate (same flavor as H3) |
| Interaction Hamiltonian | Derived |
| S-matrix , Dyson series, Feynman rules | Derived (definition + interaction-picture algebra) |
| Invariant amplitude via | Definition (residue after stripping translation ) |
| via Casimir trace technology, Mandelstam variables | Derived (algebra + completeness relations) |
| Cross sections | Derived (modulo Born-rule postulate inherited from QM) |
7. Comparison with the Modern Gauge-Theory Route
| Aspect | Historical (this file) | Modern (QED) |
|---|---|---|
| Foundational postulates | H1 (first-order eq.) + H2 (minimal coupling) + H3 (anticommutator quantization) | gauge invariance + renormalizability + Lorentz/ |
| Role of gauge invariance | Derived consequence of H2 | Postulate |
| Role of minimal coupling | Postulate H2 | Theorem (forced by gauge invariance) |
| Role of anticommutators | Postulate H3 | Theorem (spin–statistics) |
| Photon mass | Implicit, justified after the fact | Forbidden by gauge invariance |
| Positron | Postulated via Dirac sea, then re-derived from H3 | Built into the Fock-space construction from the outset |
| Generalization to other gauge groups | Awkward — no obvious route to Yang–Mills | Direct — replace by to obtain QCD, electroweak, ... |
| Pedagogical accessibility | High — builds on familiar single-particle QM | Lower — requires accepting that local symmetry is the right organizing principle |
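Both routes lead to the same Lagrangian
$$\mathcal{L}_{\text{QED}} = \bar\psi\,(i\gamma^\mu D_\mu - m)\,\psi - \tfrac{1}{4} F_{\mu\nu}F^{\mu\nu},$$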
the same Feynman rules, and the same physical predictions. The choice between them is a matter of pedagogical preference and conceptual framing, not physics.
8. A Brief Word on the Wigner / Weinberg Route
A third, more foundational approach (Weinberg's QFT Vol. 1) starts from the Wigner classification of single-particle states and derives QED from the consistency requirements of a Lorentz-invariant, cluster-decomposable $S$-matrix. In this view the polarization vector of a massless spin-1 particle does not transform as a true four-vector under Lorentz transformations; the residual transformation must be a symmetry of the interaction, which is precisely electromagnetic gauge invariance. From this perspective gauge invariance is neither an independent postulate nor a by-product of minimal coupling, but a theorem about consistent interactions of massless spin-1 particles.
Compton Scattering — The Easiest Real QED Calculation
Compton scattering, $e^-\gamma \to e^-\gamma$, is the simplest physical QED process whose cross section can be computed end-to-end at tree level. It uses every piece of the QED machinery introduced in QED and QED/historical.md (Lagrangian, Feynman rules, LSZ, kinematics) but only at lowest order, with no loops, no renormalization, and no bound-state complications.
It is the right "first concrete application" to compare against the much more involved hydrogen calculation.
1. The Process
A photon of 4-momentum $k$ scatters off a free electron of 4-momentum $p$:
$$\gamma(k) + e^-(p) \;\to\; \gamma(k') + e^-(p').$$
Initial and final states are asymptotic Fock-space vectors (see fock-space-inventory.md §3):
Conservation of 4-momentum: $p + k = p' + k'$.
2. Feynman Diagrams
At lowest order in $e$ (i.e. $\mathcal{M} \propto e^2$), there are exactly two diagrams, related by exchange of the two photon vertices:
s-channel u-channel
γ(k) γ(k') γ(k') γ(k)
\ / \ /
\ / \ /
●═══════● ●═════════●
/ e(p+k) \ / e(p-k') \
/ \ / \
e(p) e(p') e(p) e(p')
- s-channel: electron absorbs $\gamma(k)$, propagates with momentum $p+k$, emits $\gamma(k')$.
- u-channel: electron emits $\gamma(k')$ first, propagates with momentum $p-k'$, then absorbs $\gamma(k)$.
(There is no t-channel diagram because there is no photon self-coupling in QED — the photon doesn't interact directly with another photon at tree level.)
3. The Amplitude
Reading the diagrams off the QED Feynman rules (QED Step 6):
Using $(p+k)^2 - m^2 = 2\,p\cdot k$ and $(p-k')^2 - m^2 = -2\,p\cdot k'$ (since $p^2 = m^2$ and $k^2 = k'^2 = 0$), this simplifies to
This is the complete tree-level amplitude. No regularization, no counterterms, no infinities — it's a finite algebraic expression, written down in two lines from the Lagrangian.
4. The Spin- and Polarization-Averaged Squared Amplitude
For an unpolarized cross section, average over initial electron spins / photon polarizations and sum over final ones:
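Standard trace technology, using $\sum_s u(p,s)\,\bar u(p,s) = \gamma^\mu p_\mu + m$ and $\sum_\lambda \epsilon_\mu(k,\lambda)\,\epsilon^*_\nu(k,\lambda) \to -g_{\mu\nu}$ (in Feynman gauge), gives the classic result
$$\frac{1}{4}\sum_{\text{spins, pol.}} |\mathcal{M}|^2 = 2e^4\left[\frac{p\cdot k'}{p\cdot k} + \frac{p\cdot k}{p\cdot k'} + 2m^2\left(\frac{1}{p\cdot k} - \frac{1}{p\cdot k'}\right) + m^4\left(\frac{1}{p\cdot k} - \frac{1}{p\cdot k'}\right)^2\right].$$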
The trace evaluation is mechanical; it takes maybe two pages of algebra in any QFT textbook (e.g. Peskin–Schroeder §5.5, Schwartz §13.4).
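As a quick numerical cross-check of the trace result, the sketch below assumes the standard textbook form $\tfrac14\sum|\mathcal{M}|^2 = 2e^4\big[\tfrac{p\cdot k'}{p\cdot k} + \tfrac{p\cdot k}{p\cdot k'} + 2m^2 d + m^4 d^2\big]$ with $d = \tfrac{1}{p\cdot k} - \tfrac{1}{p\cdot k'}$, evaluates it in the lab frame ($p\cdot k = m\omega$, $p\cdot k' = m\omega'$), and compares it with the bracket $\omega'/\omega + \omega/\omega' - \sin^2\theta$ that appears in the Klein–Nishina formula. Function names and the choice of units ($m = 1$, $e^2 = 1$) are illustrative assumptions.

```python
import math

def spin_averaged_msq(pk, pkp, m, e2):
    """(1/4) sum |M|^2 for Compton scattering as a function of p.k and p.k'.

    e2 is e^2 = 4*pi*alpha; the overall factor is 2 e^4.
    """
    d = 1.0 / pk - 1.0 / pkp
    return 2.0 * e2 ** 2 * (pkp / pk + pk / pkp + 2.0 * m ** 2 * d + m ** 4 * d ** 2)

def outgoing_energy(omega, theta, m):
    """Compton formula: photon energy after scattering through angle theta."""
    return omega / (1.0 + (omega / m) * (1.0 - math.cos(theta)))

def lab_frame_bracket(omega, theta, m):
    """The combination w'/w + w/w' - sin^2(theta) in the lab frame."""
    wp = outgoing_energy(omega, theta, m)
    return wp / omega + omega / wp - math.sin(theta) ** 2

def relative_mismatch(omega, theta, m=1.0):
    """|msq / (2 e^4) - bracket| / bracket, expected to be at rounding level."""
    wp = outgoing_energy(omega, theta, m)
    lhs = spin_averaged_msq(m * omega, m * wp, m, 1.0) / 2.0
    rhs = lab_frame_bracket(omega, theta, m)
    return abs(lhs - rhs) / abs(rhs)
```

The agreement is exact algebraically: with $1/\omega' - 1/\omega = (1-\cos\theta)/m$ the two mass-dependent terms combine to $-2(1-\cos\theta) + (1-\cos\theta)^2 = -\sin^2\theta$.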
5. The Klein–Nishina Cross Section
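In the lab frame (electron initially at rest), with incoming photon energy $\omega$ and outgoing photon energy $\omega'$ at angle $\theta$, energy-momentum conservation gives the Compton wavelength shift:
$$\frac{1}{\omega'} - \frac{1}{\omega} = \frac{1}{m}\,(1 - \cos\theta).$$
Plugging into $\overline{|\mathcal{M}|^2}$ and folding in the standard phase-space factor (see QFT/cross-sections.md §2.5) yields the Klein–Nishina formula (1929):
$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{2m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega'}{\omega} + \frac{\omega}{\omega'} - \sin^2\theta\right].$$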
This is the central result. It gives the scattering cross section as an explicit closed-form function of incoming photon energy and scattering angle.
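To make the formula concrete, here is a minimal sketch that evaluates it in physical units. It assumes the standard form $d\sigma/d\Omega = \tfrac{r_e^2}{2}(\omega'/\omega)^2\,[\omega'/\omega + \omega/\omega' - \sin^2\theta]$ with $r_e = \alpha\hbar c / (m_e c^2)$, and the constants below are rounded CODATA-style values supplied for illustration. Function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.035999      # fine-structure constant (rounded)
M_E_MEV = 0.51099895          # electron rest energy in MeV
HBARC_MEV_FM = 197.3269804    # hbar * c in MeV fm
R_E_FM = ALPHA * HBARC_MEV_FM / M_E_MEV  # classical electron radius in fm

def compton_energy(omega_mev, theta):
    """Scattered photon energy (MeV) for incoming energy omega_mev at angle theta."""
    return omega_mev / (1.0 + (omega_mev / M_E_MEV) * (1.0 - math.cos(theta)))

def klein_nishina(omega_mev, theta):
    """Unpolarized differential cross section d(sigma)/d(Omega) in fm^2 per steradian."""
    ratio = compton_energy(omega_mev, theta) / omega_mev
    return 0.5 * R_E_FM ** 2 * ratio ** 2 * (ratio + 1.0 / ratio - math.sin(theta) ** 2)

if __name__ == "__main__":
    # Example: the 662 keV line of Cs-137 scattered through 90 degrees.
    omega = 0.662
    print("scattered energy [MeV]:", compton_energy(omega, math.pi / 2))
    print("dsigma/dOmega [fm^2/sr]:", klein_nishina(omega, math.pi / 2))
```

At $\theta = 0$ the photon loses no energy and the expression reduces to $r_e^2$ for any $\omega$, which is a convenient sanity check (1 fm² = 10 mb).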
Limits
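- Low-energy (Thomson) limit, $\omega \ll m$. Then $\omega' \to \omega$ and the formula reduces to
  $$\frac{d\sigma}{d\Omega} \to \frac{r_e^2}{2}\,(1 + \cos^2\theta),$$
  the classical Thomson scattering cross section, with $r_e = \alpha/m$ the classical electron radius. Total cross section $\sigma_T = \frac{8\pi}{3}\, r_e^2$.
- High-energy limit, $\omega \gg m$. The total cross section drops as $\sigma \approx \pi r_e^2\,\frac{m}{\omega}\left[\ln\frac{2\omega}{m} + \frac{1}{2}\right]$, i.e. like $1/\omega$ softened by a logarithm. The angular distribution becomes increasingly forward-peaked.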
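Both limits can be checked by integrating the differential cross section numerically. The sketch below works in units of $r_e^2$ with $x = \omega/m$, uses a plain midpoint rule over $\cos\theta$, and compares against $\sigma_T = 8\pi/3$ at small $x$ and against the leading high-energy form $\sigma \approx (\pi/x)\,[\ln 2x + \tfrac12]$ at large $x$. The integration scheme, step count, and names are illustrative assumptions, not part of the original derivation.

```python
import math

SIGMA_THOMSON = 8.0 * math.pi / 3.0   # Thomson cross section in units of r_e^2

def kn_dsigma_domega(x, cos_t):
    """Klein-Nishina d(sigma)/d(Omega) in units of r_e^2, with x = omega / m."""
    ratio = 1.0 / (1.0 + x * (1.0 - cos_t))          # omega' / omega
    sin2 = 1.0 - cos_t * cos_t
    return 0.5 * ratio ** 2 * (ratio + 1.0 / ratio - sin2)

def kn_total(x, n=20000):
    """Total cross section in units of r_e^2 (midpoint rule over cos(theta))."""
    h = 2.0 / n
    s = sum(kn_dsigma_domega(x, -1.0 + (i + 0.5) * h) for i in range(n))
    return 2.0 * math.pi * s * h

def kn_high_energy(x):
    """Leading large-x behaviour: (pi / x) * (ln(2x) + 1/2), units of r_e^2."""
    return (math.pi / x) * (math.log(2.0 * x) + 0.5)

if __name__ == "__main__":
    for x in (1e-3, 0.1, 1.0, 10.0, 1000.0):
        print(x, kn_total(x) / SIGMA_THOMSON)
```

The step count matters at large $x$, where the integrand is concentrated within $1 - \cos\theta \lesssim 1/x$; for much higher energies than shown, a finer grid or a change of variable would be needed.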
6. What Was — and Wasn't — Used
This is what makes Compton scattering the "easiest real QED application":
| Used | Not used |
|---|---|
| QED Lagrangian and Feynman rules | Loop diagrams |
| Asymptotic Fock states (, ) | Renormalization, counterterms |
| LSZ reduction (implicit in reading off ) | Bound-state machinery (Bethe–Salpeter, NRQED) |
| Standard kinematic phase space | Gauge-fixing beyond Feynman gauge (results are gauge-invariant) |
| Trace identities for | Path integrals |
Compare with QED/hydrogen.md, which needs all of the above plus much more.
7. Why It's a Cleaner First Example Than Hydrogen
| Feature | Compton scattering | Hydrogen atom |
|---|---|---|
| External states | Free particles (one-particle Fock states) | Bound state (non-perturbative two-body) |
| Diagrams | 2 tree-level | All loop orders + non-perturbative summation |
| Closed-form result? | Yes — Klein–Nishina | No (beyond Dirac–Coulomb) |
| Renormalization? | Not needed | Needed for Lamb shift |
| QCD input? | Not needed | Needed for proton structure |
| Where you find it in textbooks | Peskin–Schroeder Ch. 5; Schwartz Ch. 13 | Multiple chapters spread across textbook + monographs |
The scattering amplitude can be derived in a single sitting; the hydrogen spectrum requires the entire NRQED framework.
8. Historical and Experimental Status
- Klein and Nishina (1929) derived the formula immediately after Dirac's equation, before QED was fully formulated. Their derivation used the Dirac equation in an external EM field and a clever cancellation of negative-energy contributions; the modern QFT derivation (above) is conceptually cleaner but gives the same answer.
- Experimental tests: Compton's original 1923 X-ray scattering experiment confirmed the wavelength shift; Klein–Nishina's energy-dependent angular distribution has been verified for $\gamma$-rays from MeV to TeV energies (e.g. by the Compton Gamma Ray Observatory).
- Compton scattering is the dominant photon–matter interaction process in the energy range to , of central importance in medical imaging, -ray astronomy, and radiation shielding.
9. Take-Aways
- Compton scattering is the simplest real QED calculation: two diagrams, no loops, closed-form Klein–Nishina cross section.
- It exercises the full QED Feynman-rule machinery without any of the bound-state, loop, or renormalization complications of hydrogen.
- The Thomson and high-energy limits cleanly recover classical and ultra-relativistic regimes from the same formula.
- For pedagogical purposes, this is the right first place to land after learning the QED postulates; hydrogen is much later.
The Hydrogen Atom in QED
The hydrogen atom is the canonical worked example of QED, and the most precisely tested system in atomic physics. This page walks through how the QED framework of QED and QED/historical.md actually produces the hydrogen spectrum, layer by layer.
The thread is: QED gives the full description of hydrogen as a bound state in Fock space, and the familiar Schrödinger / Dirac / Lamb-shift treatments are successive approximations in $\alpha$ and $m_e/m_p$.
1. Hydrogen as a Bound State in Fock Space
In QED, hydrogen is a bound state of the proton-electron-photon system — a non-perturbative state in (the proton enters either as a fixed external source or as an additional fermion species). Schematically,
where:
- creates an electron of momentum .
- creates a proton of momentum .
- is the two-body relativistic wavefunction, satisfying the Bethe–Salpeter equation (the relativistic generalization of the Schrödinger equation for bound states).
- The "(photon-dressed admixtures)" indicate that the true bound state has nonzero overlap with sectors containing extra photons, virtual pairs, and so on — these are the loop corrections.
This is the row labelled "Bound state" in fock-space-inventory.md §3, with (positron) replaced by (proton). Bound states are flagged there as non-perturbative because cannot be obtained by ordinary perturbation theory in — the Coulomb interaction must be summed to all orders to produce a bound state at all.
The full QED hydrogen state is therefore an object of considerable complexity. In practice one almost never works with directly; instead, one uses an expansion in two small parameters:
- $\alpha \approx 1/137$: fine-structure constant.
- $m_e/m_p \approx 1/1836$: electron-to-proton mass ratio.
Each level of the expansion adds physics. The next sections describe them in turn.
2. Hierarchy of Approximations
Level 0 — Schrödinger Hydrogen
Treat the proton as an infinitely heavy classical Coulomb source $V(r) = -\alpha/r$, and use the non-relativistic Schrödinger equation:
$$\left[-\frac{\nabla^2}{2m_e} - \frac{\alpha}{r}\right]\psi(\mathbf{r}) = E\,\psi(\mathbf{r}).$$
This gives the gross structure of hydrogen: the Bohr energies $E_n = -m_e\alpha^2/(2n^2) \approx -13.6\ \text{eV}/n^2$, the principal quantum number $n$, the orbital quantum numbers $\ell$, $m_\ell$, and (added by hand) the spin quantum number $m_s$. The eigenfunctions are the familiar hydrogen wavefunctions involving Laguerre polynomials and spherical harmonics.
Position-space wavefunctions (closed form). Separation in spherical coordinates gives
with the radial functions
where $a_0 = 1/(m_e\alpha)$ is the Bohr radius and $L^{2\ell+1}_{n-\ell-1}$ are the associated Laguerre polynomials. The first few:
In QED language, this is the leading-order term in a non-relativistic, no-radiation, classical-source expansion — zero of the QED machinery (no Fock space, no virtual photons, no antiparticles, no loops) is actually used. The proton is classical; the electron is first-quantized.
Level 1 — Fine Structure
Replace the Schrödinger equation by the Dirac equation in an external Coulomb field:
This is exactly the minimally coupled Dirac equation of QED/historical.md §2.1, with as a fixed external classical field rather than a quantized one. Expanding the Dirac-Coulomb solutions in powers of around the Schrödinger result adds three corrections:
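- Relativistic kinetic energy: $-\mathbf{p}^4/(8m_e^3)$, from $\sqrt{\mathbf{p}^2 + m_e^2} = m_e + \mathbf{p}^2/(2m_e) - \mathbf{p}^4/(8m_e^3) + \dots$
- Spin-orbit coupling: $\frac{\alpha}{2m_e^2 r^3}\,\mathbf{L}\cdot\mathbf{S}$, from the Dirac magnetic-moment interaction in the non-relativistic reduction. This couples orbital and spin angular momenta into the total $\mathbf{J} = \mathbf{L} + \mathbf{S}$.
- Darwin term: $\frac{\pi\alpha}{2m_e^2}\,\delta^3(\mathbf{r})$, a purely relativistic effect from the Zitterbewegung smearing (see fock-space-inventory.md §0.8.7): the electron's effective position is averaged over a Compton wavelength.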
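The exact Dirac-Coulomb spectrum is
$$E_{nj} = m_e\left[1 + \frac{\alpha^2}{\left(n - j - \tfrac12 + \sqrt{(j+\tfrac12)^2 - \alpha^2}\right)^2}\right]^{-1/2},$$
which expanded to leading correction in $\alpha$ gives the standard fine-structure formula:
$$E_{nj} \approx m_e - \frac{m_e\alpha^2}{2n^2} - \frac{m_e\alpha^4}{2n^4}\left(\frac{n}{j+\tfrac12} - \frac{3}{4}\right).$$
Note that fine-structure energies depend only on $n$ and the total angular momentum $j$, not on $\ell$. So states like $2S_{1/2}$ and $2P_{1/2}$ (different $\ell$, same $j$) are degenerate at this level. This degeneracy is an "accidental" symmetry of the Coulomb potential; it does not survive once QED loop corrections (Level 2) are turned on.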
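The size of these effects is easy to evaluate. The sketch below assumes the standard textbook expressions for a point, infinitely heavy proton with $Z = 1$: the exact Dirac–Coulomb energy $E_{nj} = m_e\,[1 + \alpha^2/(n - \delta_j)^2]^{-1/2}$ with $\delta_j = j + \tfrac12 - \sqrt{(j+\tfrac12)^2 - \alpha^2}$, and its expansion through $\mathcal{O}(\alpha^4)$. Constants are rounded values supplied for illustration, and the function names are hypothetical.

```python
import math

ALPHA = 1.0 / 137.035999     # fine-structure constant (rounded)
M_E_EV = 510998.95           # electron rest energy in eV
H_EV_S = 4.135667696e-15     # Planck constant in eV s

def dirac_energy(n, j, alpha=ALPHA, m=M_E_EV):
    """Exact Dirac-Coulomb energy in eV, rest mass included (Z = 1, static proton)."""
    jp = j + 0.5
    delta = jp - math.sqrt(jp * jp - alpha * alpha)
    return m / math.sqrt(1.0 + (alpha / (n - delta)) ** 2)

def fine_structure_energy(n, j, alpha=ALPHA, m=M_E_EV):
    """Rest mass + Bohr term + O(alpha^4) fine-structure term, in eV."""
    bohr = -m * alpha ** 2 / (2.0 * n ** 2)
    fine = -m * alpha ** 4 / (2.0 * n ** 4) * (n / (j + 0.5) - 0.75)
    return m + bohr + fine

def splitting_2p_ev():
    """2P_{3/2} - 2P_{1/2} splitting in eV from the exact formula."""
    return dirac_energy(2, 1.5) - dirac_energy(2, 0.5)

if __name__ == "__main__":
    print("1S binding energy [eV]:", dirac_energy(1, 0.5) - M_E_EV)
    print("2P splitting [eV]:", splitting_2p_ev())
    print("2P splitting [GHz]:", splitting_2p_ev() / H_EV_S / 1e9)
```

Because the formula takes only $(n, j)$ as arguments, the $2S_{1/2}$ / $2P_{1/2}$ degeneracy is built in: both states call `dirac_energy(2, 0.5)`. The energies are differences of numbers near $5\times10^5$ eV, so double precision limits the result to roughly $10^{-10}$ eV.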
Position-space wavefunctions (closed form — Darwin–Gordon, 1928). With orbital angular momentum no longer commuting with (spin–orbit mixes and ), the angular dependence is carried by the spinor spherical harmonics
and the exact 4-spinor wavefunction is
where for , , and the upper / lower 2-component blocks have opposite parity (so and differ by 1). The radial functions are
with parameters
and confluent hypergeometric functions in the variable . is a normalization constant. Explicit formulas are given in any relativistic-QM textbook (Bjorken–Drell, Greiner, Strange).
In the limit these reduce to the Schrödinger wavefunctions of Level 0 (with the lower component vanishing as ).
This is still first-quantized electron physics in a classical Coulomb field. The full QED machinery — quantized , Fock space, loops — has not been used.
Level 2 — Lamb Shift and True QED
Now promote $A_\mu$ to a quantized field. New effects appear at order $\alpha^5$:
- Electron self-energy. The bound electron emits and reabsorbs virtual photons; the corresponding one-loop diagram shifts its energy. This is the dominant contribution to the Lamb shift. Conceptually: the electron is "dressed" by a cloud of virtual photons whose energy depends on the binding to the nucleus.
- Vacuum polarization (Uehling potential). Virtual $e^+e^-$ pairs in the photon propagator modify the Coulomb potential at short distances:
  $$V(r) = -\frac{\alpha}{r}\left[1 + \frac{2\alpha}{3\pi}\int_1^\infty dx\; e^{-2 m_e r x}\left(1 + \frac{1}{2x^2}\right)\frac{\sqrt{x^2-1}}{x^2}\right].$$
  The correction is exponentially suppressed beyond the Compton wavelength $1/m_e$, so it shifts essentially only $S$-states (which have non-zero density at the origin).
- Anomalous magnetic moment. The electron has $g = 2(1 + a_e)$ with $a_e = \alpha/(2\pi) + \mathcal{O}(\alpha^2)$, Schwinger's celebrated 1948 result. This modifies the spin-orbit and hyperfine couplings.
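The combined effect lifts the Dirac-Coulomb degeneracy between $2S_{1/2}$ and $2P_{1/2}$ by the Lamb shift:
$$E(2S_{1/2}) - E(2P_{1/2}) \approx 1057.8\ \text{MHz} \approx 4.37\ \mu\text{eV}.$$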
Measured by Willis Lamb in 1947 and explained by Bethe within weeks (with the first one-loop QED computation), this was the founding triumph of QED. It is the first level at which QED's full Fock-space machinery — quantized photons, virtual particle loops, renormalization — is genuinely required.
Position-space wavefunctions: no closed form. There is no closed-form expression for the bound-state wavefunction once one-loop QED corrections are included. In practice the Dirac–Coulomb wavefunctions of Level 1 are used as a basis, and the QED corrections are computed perturbatively as matrix elements:
where packages the self-energy, vacuum polarization, and anomalous-moment contributions as effective operators (see §6 on NRQED). The one analytic piece available in closed form is the Uehling potential itself (the integral above), but the eigenvalue problem for the Coulomb + Uehling potential is not analytically solvable.
Level 3 — Hyperfine Structure
Treat the proton as having spin $\tfrac12$ and a magnetic moment
$$\boldsymbol{\mu}_p = g_p\,\frac{e}{2m_p}\,\mathbf{I}, \qquad g_p \approx 5.586.$$
The interaction between proton spin and the electron's magnetic field at the proton splits states with different total angular momentum $\mathbf{F} = \mathbf{J} + \mathbf{I}$. For an $S$-state (no orbital field), the dominant contribution comes from the contact (Fermi) interaction at the origin.
For the $1S$ ground state, this gives the 21 cm line:
$$E(F=1) - E(F=0) \approx 5.87\ \mu\text{eV}, \qquad \nu \approx 1420.4\ \text{MHz}, \qquad \lambda \approx 21.1\ \text{cm}.$$
This is the spectral line astronomers use to map neutral hydrogen across the Milky Way and beyond, since it is detectable by radio telescopes.
Note: $g_p \approx 5.586$ is anomalous (the proton is a composite QCD bound state; its $g$-factor is not 2). The numerical value of the hyperfine splitting therefore requires QCD input, not just QED. The QED part is the form of the interaction; the coefficient requires nuclear physics.
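The 21 cm number follows from a one-line estimate. The sketch below assumes the leading-order Fermi contact formula $\Delta E_{1S} = \tfrac{4}{3}\, g_p\, \tfrac{m_e}{m_p}\, \alpha^4 m_e c^2$ (electron $g = 2$, no reduced-mass or radiative corrections) and takes the measured $g_p$ as an external input, in line with the remark that the coefficient is not predicted by QED. Constants are rounded values supplied for illustration; names are hypothetical.

```python
ALPHA = 1.0 / 137.035999     # fine-structure constant (rounded)
M_E_EV = 510998.95           # electron rest energy in eV
M_P_EV = 938272088.0         # proton rest energy in eV
G_P = 5.5856947              # measured proton g-factor (external input)
H_EV_S = 4.135667696e-15     # Planck constant in eV s
C_M_S = 299792458.0          # speed of light in m/s

def hyperfine_1s_ev():
    """Leading-order 1S hyperfine splitting E(F=1) - E(F=0) in eV."""
    return (4.0 / 3.0) * G_P * (M_E_EV / M_P_EV) * ALPHA ** 4 * M_E_EV

def hyperfine_1s_mhz():
    """Same splitting expressed as a frequency in MHz."""
    return hyperfine_1s_ev() / H_EV_S / 1e6

def hyperfine_1s_wavelength_cm():
    """Corresponding photon wavelength in cm."""
    return C_M_S / (hyperfine_1s_mhz() * 1e6) * 100.0

if __name__ == "__main__":
    print("splitting [micro-eV]:", hyperfine_1s_ev() * 1e6)
    print("frequency [MHz]:", hyperfine_1s_mhz())
    print("wavelength [cm]:", hyperfine_1s_wavelength_cm())
```

The leading-order estimate lands within about 0.1% of the measured 1420.4 MHz; the residual is accounted for by reduced-mass, anomalous-moment, and proton-structure corrections that this sketch omits.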
Position-space wavefunctions. The angular structure is now the doubly-coupled recombined into total via Clebsch–Gordan. The spatial / radial dependence is unchanged from Level 1 (Dirac–Coulomb radial functions); the hyperfine splitting is a small energy correction whose proportionality to at the origin makes the radial wavefunction enter only as an evaluated number, not as a modified function.
Level 4 — Higher-Order Corrections
Modern hydrogen spectroscopy probes effects at order and beyond, requiring:
- Two-loop and higher QED corrections to self-energy and vacuum polarization.
- Nuclear recoil corrections. At leading order, replace $m_e$ everywhere by the reduced mass $\mu = m_e m_p/(m_e + m_p)$. Beyond that, relativistic recoil corrections enter.
- Finite proton size. The proton has an RMS charge radius $r_p \approx 0.84\ \text{fm}$, which modifies the Coulomb potential at very short distances and shifts the $S$ levels by
  $$\Delta E_{nS} = \frac{2\pi\alpha}{3}\, r_p^2\, |\psi_{nS}(0)|^2.$$
  This affects essentially only $S$-states and is the source of the proton radius puzzle (a long-standing discrepancy between the muonic-hydrogen determination of $r_p$ and those from ordinary-hydrogen spectroscopy and electron scattering, partially resolved in recent years).
- Weak-interaction parity violation. Tiny mixing of $S$ and $P$ states from $Z$-boson exchange, which lies beyond QED proper, in the electroweak sector.
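For a sense of scale, the finite-size shift can be evaluated directly. The sketch below assumes the leading-order formula $\Delta E_{nS} = \tfrac{2\pi\alpha}{3}\, r_p^2\, |\psi_{nS}(0)|^2$ with the non-relativistic density $|\psi_{nS}(0)|^2 = (m_e\alpha)^3/(\pi n^3)$, which gives $\Delta E_{nS} = \tfrac{2}{3}\,\alpha^4 m_e\,(m_e r_p)^2 / n^3$ in natural units. The radius $r_p = 0.84$ fm is an illustrative input, and all names are hypothetical.

```python
ALPHA = 1.0 / 137.035999      # fine-structure constant (rounded)
M_E_MEV = 0.51099895          # electron rest energy in MeV
HBARC_MEV_FM = 197.3269804    # hbar * c in MeV fm
H_EV_S = 4.135667696e-15      # Planck constant in eV s

def finite_size_shift_ev(n, r_p_fm=0.84):
    """Leading-order energy shift of the nS level in eV for proton radius r_p_fm."""
    x = M_E_MEV * r_p_fm / HBARC_MEV_FM          # dimensionless m_e * r_p
    return (2.0 / 3.0) * ALPHA ** 4 * (M_E_MEV * 1e6) * x ** 2 / n ** 3

def finite_size_shift_mhz(n, r_p_fm=0.84):
    """Same shift expressed as a frequency in MHz."""
    return finite_size_shift_ev(n, r_p_fm) / H_EV_S / 1e6

if __name__ == "__main__":
    for r in (0.84, 0.88):
        print(r, "fm ->", finite_size_shift_mhz(1, r), "MHz on 1S")
```

The shift scales as $r_p^2/n^3$, so a few-percent change in the radius moves the $1S$ level by roughly 0.1 MHz, far larger than the experimental uncertainty of the $1S$–$2S$ measurement.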
3. Numerical Hierarchy
For the $1S$–$2S$ transition in hydrogen ($\nu \approx 2.466 \times 10^{15}\ \text{Hz}$), the relative size of each contribution is roughly:
| Effect | Order | Relative magnitude |
|---|---|---|
| Bohr (Schrödinger) | ||
| Fine structure (Dirac–Coulomb) | ||
| Lamb shift (one-loop QED) | ||
| Hyperfine | ||
| Two-loop QED, recoil, finite size, ... | , , ... | and smaller |
Modern spectroscopy (MPQ Garching, NIST) measures the $1S$–$2S$ transition to 15 significant figures. Agreement with QED holds at this level, currently limited by uncertainty in the proton charge radius rather than by QED itself.
4. Closed-Form Solutions: What's Available
Position-space wavefunctions of hydrogen exist in closed form only at the first two levels of the hierarchy. Beyond Dirac–Coulomb, the bound state can only be characterized via systematic expansions or numerical methods.
4.1 What Is Closed-Form
| Equation | Bound state in position space? |
|---|---|
| Schrödinger–Coulomb | Yes — Laguerre × (Level 0) |
| Klein–Gordon–Coulomb (spin-0) | Yes — hypergeometric × |
| Dirac–Coulomb (spin-, point classical proton) | Yes — Darwin–Gordon: confluent hypergeometric × (Level 1) |
The Dirac–Coulomb spectrum is exact to all orders in :
4.2 What Is Not Closed-Form
Every additional ingredient breaks closed-form solvability:
- Quantized photons (Lamb shift). Self-energy diagrams involve a logarithm — Bethe's logarithm — that depends on the entire bound-state spectrum. No closed form for the resulting energy shift.
- Quantum proton (recoil). Even the classical two-body relativistic problem has no separable closed form. Bethe–Salpeter restores Lorentz covariance at the cost of an integral equation in 4D.
- Vacuum polarization. The Uehling potential is closed-form (an integral representation), but the eigenvalue problem for Coulomb + Uehling potential is not.
- Asymptoticity of the QED series. The perturbative QED series for hydrogen is expected to be asymptotic, not convergent: beyond some high order the terms start to grow. An "all-orders sum" is therefore not well-defined as a function of $\alpha$.
- Haag's theorem. The interacting QED bound state does not live in the same Hilbert space as the free Fock space (see QFT/remarks.md). There is no rigorous sense in which an "exact wavefunction" exists as a function of $\mathbf{r}$ on a mathematically constructed Hilbert space.
4.3 What Is Used in Practice
For state-of-the-art atomic-physics calculations:
- Dirac–Coulomb wavefunctions are the workhorse. They serve as the basis on which QED corrections are computed perturbatively.
- NRQED (Caswell–Lepage 1986) is an effective field theory in which $v/c \sim \alpha$ is the small parameter. The hydrogen Hamiltonian becomes a power series in $\alpha$ with coefficients computed once and reused; effective-Hamiltonian eigenstates are exact eigenfunctions at given order in $\alpha$.
- Bethe–Salpeter integral equations for the two-body wavefunction are solved numerically.
- Dimensional regularization gives analytical expressions for individual loop contributions, often involving zeta values and polylogarithms.
In a useful but loose sense, the "exact QED wavefunction" used in practice is the Dirac–Coulomb spinor, with QED corrections folded in via perturbation theory and effective Hamiltonians. This is a productive fiction, not a true exact QED state.
4.4 The Bethe–Salpeter Equation Explicitly
Although it has no closed-form solutions, the Bethe–Salpeter (BS) equation is the formally correct relativistic two-body bound-state equation in QFT, and it is fully explicit. For a two-fermion bound state with total 4-momentum and quantum numbers , the Bethe–Salpeter amplitude is the matrix element
where is the electron field and is the second fermion's field (proton for hydrogen, positron for positronium). The homogeneous Bethe–Salpeter equation is
or, schematically, . Here:
- , are the full (renormalized) one-particle propagators of the two fermions.
- is the two-particle irreducible (2PI) kernel — the sum of all Feynman diagrams that cannot be cut into two pieces by removing one electron line and one line.
In momentum space, with relative momentum and total , this becomes the integral eigenvalue equation
with bound-state masses appearing as the discrete values of for which non-trivial solutions exist.
The kernel order by order. is a perturbative sum, not closed-form. The leading non-trivial contribution is the single-photon-exchange ("ladder") kernel:
Diagrammatically the resulting iteration (the ladder approximation) sums all "rung diagrams":
e ─→─┬─→─┬─→─┬─→─ ...
γ γ γ
p ─→─┴─→─┴─→─┴─→─ ...
Higher orders add crossed-ladder diagrams, vertex corrections, self-energy insertions (which dress the propagators via Schwinger–Dyson), and vacuum-polarization insertions on the exchanged photon. Each is a higher power of .
Practical difficulties. Several issues make the raw 4D BS equation unwieldy:
- It is genuinely 4-dimensional — the relative momentum has a relative-energy component as well as , and there is no obvious reduction to a 3D Schrödinger-style problem.
- It has spurious solutions corresponding to anomalous dependence on relative time, which must be projected out.
- The off-shell kernel is gauge-dependent; physical bound-state energies are gauge-invariant only after summing the kernel to all orders.
- Even the ladder approximation has no closed-form solutions for full QED. The Wick–Cutkosky model (two scalar particles exchanging a massless scalar, in ladder approximation) is a rare solvable case, which becomes tractable after Wick rotation to Euclidean relative momentum.
3D reductions. Because the 4D BS equation is hard to use directly, practical calculations use 3D reductions:
- Salpeter equation — instantaneous approximation , integrating out .
- Brezin–Itzykson–Zinn-Justin reduction — a more covariant 3D reduction that retains more relativistic structure than Salpeter.
- NRQED — integrate out high-energy modes () once and for all, leaving an effective non-relativistic Hamiltonian with QED-computed matching coefficients. This is the modern workhorse.
Concrete example: positronium. The cleanest BS / NRQED application is positronium, where there is no QCD complication. Leading-order energies
$$E_n = -\frac{m_e\alpha^2}{4n^2}$$
(half the hydrogen Bohr energy because the reduced mass is $m_e/2$), and the leading hyperfine splitting between para- and ortho-positronium
$$\Delta E_{\text{hfs}} = \frac{7}{12}\, m_e\alpha^4 \approx 204\ \text{GHz}$$
are now known through , with computed and measured values agreeing to -digit precision.
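The leading-order positronium numbers can be reproduced in a few lines. The sketch below assumes the standard lowest-order expressions $E_n = -m_e\alpha^2/(4n^2)$ and $\Delta E_{\text{hfs}} = \tfrac{7}{12}\, m_e\alpha^4$ for the $1^3S_1$ – $1^1S_0$ splitting; constants are rounded values supplied for illustration, and the function names are hypothetical.

```python
ALPHA = 1.0 / 137.035999     # fine-structure constant (rounded)
M_E_EV = 510998.95           # electron rest energy in eV
H_EV_S = 4.135667696e-15     # Planck constant in eV s

def positronium_energy_ev(n):
    """Leading-order positronium level energy in eV (reduced mass m_e / 2)."""
    return -M_E_EV * ALPHA ** 2 / (4.0 * n ** 2)

def hydrogen_bohr_energy_ev(n):
    """Bohr energy for an infinitely heavy proton, for comparison."""
    return -M_E_EV * ALPHA ** 2 / (2.0 * n ** 2)

def positronium_hfs_ghz():
    """Leading-order ortho-para splitting of the ground state in GHz."""
    return (7.0 / 12.0) * M_E_EV * ALPHA ** 4 / H_EV_S / 1e9

if __name__ == "__main__":
    print("Ps ground state [eV]:", positronium_energy_ev(1))
    print("Ps hyperfine splitting [GHz]:", positronium_hfs_ghz())
```

The leading-order splitting comes out near 204.4 GHz, about half a percent above the measured value of roughly 203.4 GHz; the difference is what the higher-order NRQED corrections account for.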
In summary: the BS equation is essential conceptually — it defines what a relativistic bound state in QFT means — but is rarely used in its raw 4D form for production calculations. NRQED has replaced it as the practical tool.
4.5 Where the BS Equation Comes From
The BS equation is not postulated — it is a structural consequence of three ingredients that are already present in any QFT: the four-point Green's function, the 2PR/2PI organization of Feynman diagrams, and the spectral representation of correlation functions. Here is the standard derivation.
Step 1 — The four-point Green's function
For two fermion species , , define the four-point Green's function
This is the propagation amplitude for two fermions from to , summing all interactions to all orders. It is what perturbation theory directly computes (a sum of all Feynman diagrams with two incoming and two outgoing fermion lines).
Step 2 — Two-particle reducibility
Organize diagrams in by two-particle reducibility:
- A diagram is two-particle reducible (2PR) if it can be cut into two pieces by removing one line and one line.
- It is two-particle irreducible (2PI) otherwise.
Define the 2PI kernel as the sum of all 2PI diagrams. Combinatorially, every diagram in is either 2PI or decomposes uniquely as a chain
— i.e., is a sum of "chains" of 2PI kernels glued by full propagators. (This is just the definition of 2PI: cutting at any 2PR location separates the diagram into a smaller chain.)
Step 3 — Dyson's equation for
Translating the chain decomposition into an equation gives Dyson's equation for the four-point function:
$$G = G_0 + G_0\, K\, G,$$
where:
- $G_0 = S_1 S_2$ is the disconnected propagator product (full one-particle propagators, no interaction between the two particles).
- The product $G_0\, K\, G$ is a 4-fold spacetime convolution.
This is exactly analogous to the Dyson equation for the one-particle propagator, but at the two-particle level. Iterating recovers the geometric series
$$G = G_0 + G_0 K G_0 + G_0 K G_0 K G_0 + \cdots,$$
which is the chain decomposition above.
Step 4 — Bound states as poles in
So far this is just a resummation; bound states have not yet appeared. They enter through the spectral content of .
The four-point function, viewed as a function of the total momentum (Fourier-conjugate to the center-of-mass position), has discrete poles at for each bound-state mass . Inserting a complete set of states between and ,
the residue at defines the Bethe–Salpeter amplitude
Step 5 — The BS equation as the residue equation
Plug the pole structure of back into the Dyson equation and match residues at :
- The left-hand side has a simple pole with residue .
- The first term on the right has no pole at (free propagators do not bind).
- The pole on the right comes entirely from the in , with residue .
Matching residues and dropping the common factor ,
This is the homogeneous Bethe–Salpeter equation stated in §4.4. Non-trivial solutions exist only for those values that are bound-state masses — the eigenvalue condition.
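The logic of Steps 3–5 (no pole in the free propagator, no pole in any finite-order term, but a pole in the resummed $G$) can be seen in a toy model. The sketch below is not QED and not the relativistic BS equation: it is the non-relativistic 1D attractive delta potential $V(x) = -g\,\delta(x)$ in units $\hbar = m = 1$, chosen here as an assumed stand-in because everything is a scalar. The free resolvent at the origin is $G_0(E) = -1/\kappa$ with $\kappa = \sqrt{-2E}$, the "kernel" is $K = -g$, and $G = G_0 + G_0 K G$ gives $G = G_0/(1 - K G_0)$, whose pole sits at the known bound-state energy $E = -g^2/2$.

```python
import math

def g0(energy):
    """Free resolvent at the origin for E < 0 (1D, hbar = m = 1)."""
    kappa = math.sqrt(-2.0 * energy)
    return -1.0 / kappa

def g_full(energy, g):
    """Resummed solution of G = G0 + G0 K G with kernel K = -g."""
    return g0(energy) / (1.0 + g * g0(energy))

def g_series(energy, g, order):
    """Partial sum of the geometric series G0 * sum_{n <= order} (K G0)^n."""
    k = -g
    term = g0(energy)
    total = 0.0
    for _ in range(order + 1):
        total += term
        term *= k * g0(energy)
    return total

def bound_state_energy(g, lo=-50.0, hi=-1e-12, iterations=200):
    """Solve 1 - K G0(E) = 0 by bisection; assumes 0 < g < 10 so the root is bracketed."""
    def f(e):
        return 1.0 + g * g0(e)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    g = 1.0
    print("pole position:", bound_state_energy(g), "expected:", -g * g / 2.0)
    print("series vs resummed at E = -2:", g_series(-2.0, g, 60), g_full(-2.0, g))
    print("series at E = -0.3 (diverges):", g_series(-0.3, g, 60))
```

In this toy the eigenvalue condition $1 - K G_0(E) = 0$ plays the role of the homogeneous equation, and the geometric series converges only for $E < -g^2/2$: between the pole and threshold each term is finite but the partial sums blow up, which is the toy version of the statement that a bound state is invisible at any finite order.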
Status: postulated vs. derived
| Object / step | Status |
|---|---|
| Underlying QFT (Lagrangian, Hilbert space, fields, vacuum) | Postulated (the QFT axioms) |
| Four-point Green's function | Defined |
| 2PR / 2PI organization | Combinatorial fact about Feynman diagrams |
| Dyson equation | Derived from chain decomposition |
| Pole structure of at bound-state masses | Spectral assumption (true if a bound state exists) |
| BS equation | Derived by matching residues |
So the BS equation is not an independent postulate — it is a structural consequence of having any QFT with bound states.
Sanity check: non-relativistic limit
In the non-relativistic limit, the BS equation should reduce to the Schrödinger equation. Indeed:
- Take the kernel to be single-photon exchange (ladder).
- Take the instantaneous approximation ( depends only on , not ).
- Take both fermions to be non-relativistic (, lower spinor components small).
- Define (integrate out relative time).
The result is
the Schrödinger–Coulomb equation with reduced mass and the binding energy. So the BS equation contains the Schrödinger hydrogen equation (Level 0) as its leading non-relativistic instantaneous-ladder limit, and Dirac–Coulomb (Level 1) as the next-order correction.
Historical note
The BS equation was derived by Bethe and Salpeter in 1951, motivated by the need to do bound-state calculations in QED beyond the Dirac–Coulomb level. Gell-Mann and Low gave a more rigorous derivation a few months later, formalizing the residue argument used above. It was the first relativistic two-body wave equation in QFT, and remains the canonical definition of a relativistic bound state — even though, as discussed, it is rarely used directly in modern computations (NRQED having largely replaced it).
5. The Spectrum, Schematically
Stacking all the corrections on the $n = 2$ manifold gives the famous picture:
n = 2
┌── 2P₃/₂ ──── ── ── ── ── ── ── ── F=2,1
│ ↑
│ fine structure (~α⁴)
│ ↓
Schrödinger ─────── ┤ 2S₁/₂ ──── ──── ── ── ── ── ── ── F=1,0
(Bohr energy) │ 2P₁/₂ } degenerate in Dirac F=1,0
│ ↑
│ Lamb shift (~α⁵)
│ ↓
└── 2P₁/₂ ──── ── ── ── ── ── ── ── F=1,0
splits 2S₁/₂ from 2P₁/₂
Bohr Dirac fine QED Lamb Hyperfine
~ α² ~ α⁴ ~ α⁵ ~ α⁴ m_e/m_p
The same hierarchy applies to every level $n$, with the magnitude of each splitting falling off as a power of $1/n$.
6. Where This Sits in the QED Framework
| QED ingredient | Role in hydrogen |
|---|---|
| Dirac field , QED Step 1 | The electron field, used to build the bound-state vector |
| Photon field , QED Step 1 | Mediates the Coulomb potential (tree level) and the Lamb shift (one-loop) |
| Local gauge invariance, QED Step 2 | Forces the electron–photon coupling to be exactly |
| Minimal coupling, QED/historical.md §2.1 | What you actually use to write down the Coulomb potential and the spin-orbit term |
| Fock space, fock-space-inventory.md §1 | Where lives |
| Bound state row, fock-space-inventory.md §3 | The structural template for |
| Renormalization, QED Step 5 | Turns the divergent self-energy diagram into the finite Lamb shift |
| LSZ / Feynman rules, QED Step 6 | Used for scattering off hydrogen, not for the bound state itself |
The bound state is a non-perturbative object; ordinary scattering perturbation theory in misses it entirely (a bound state appears as a pole in correlation functions, not as a finite-order Feynman diagram). The standard tools for it are NRQED (non-relativistic QED, an effective theory in which is small) and the Bethe–Salpeter equation, both of which sum the Coulomb interaction to all orders before perturbing in the residual radiative corrections.
7. Take-Aways
- Hydrogen is the cleanest, most precisely tested QED system — agreement at the part-in- level for the – transition.
- The familiar Schrödinger / Dirac / Lamb / hyperfine pictures are successive approximations in the small parameters $\alpha$ and $m_e/m_p$, all derivable from the QED postulates.
- Levels 0 and 1 use no Fock-space machinery at all (classical or external Coulomb field); Level 2 (Lamb shift) is the first point where quantized photons and loop corrections are essential.
- The bound state itself is a non-perturbative vector in Fock space; perturbative Feynman diagrams compute corrections on top of an already-bound state, not the bound state itself.
- Hyperfine structure already takes us outside pure QED: the proton $g$-factor is set by QCD, and the proton charge radius enters as a hadronic input.
Quantum Chromodynamics
Quantum Chromodynamics (QCD) is the relativistic quantum field theory of quarks and gluons — the gauge theory of the strong interaction. Like QED, it is obtained by specializing the general postulates of Quantum Field Theory with three additional inputs: a choice of field content, a gauge symmetry, and the renormalizable Lagrangian uniquely determined by these together with Lorentz and discrete-symmetry invariance.
The single structural change relative to QED, replacing the abelian gauge group $U(1)$ by the non-abelian $SU(3)$, has profound consequences:
- Gluons self-interact (3- and 4-gluon vertices appear automatically from gauge invariance), unlike photons.
- The running coupling $\alpha_s$ decreases at high energy: asymptotic freedom.
- At low energy the coupling becomes strong and quarks/gluons are not asymptotic states: color confinement.
- Faddeev–Popov ghosts do not decouple — they are required for unitarity in covariant gauges.
These four facts make QCD qualitatively different from QED at both the perturbative and non-perturbative level, despite the formal similarity of the Lagrangians.
What QCD Describes Physically: The Strong Force
QCD is the modern theory of the strong force (strong interaction), one of the four fundamental forces alongside electromagnetism, the weak interaction, and gravity. The strong force is responsible for two phenomena that look quite different at first glance:
- Fundamental level: binding quarks into hadrons. Quarks carry color charge (3 values: red, green, blue; these are labels with no relation to optics) and interact by exchanging gluons, the strong-force analogue of the photon. Unlike photons, gluons themselves carry color (8 of them, in the adjoint of $SU(3)$) and so interact with each other. The only physical states are color-singlet combinations: mesons ($q\bar q$), baryons ($qqq$), glueballs, etc. Isolated quarks and gluons have never been observed, a non-perturbative phenomenon called confinement (see §Confinement).
- Residual level: binding nucleons into nuclei. Protons and neutrons are themselves color singlets, so the force holding them together inside a nucleus is a residual effect of QCD, analogous to van der Waals forces between neutral atoms. At the nucleon level it is well described by the Yukawa exchange of mesons (Yukawa, 1935): primarily pions ($\pi$) and heavier mesons ($\rho$, $\omega$). The resulting nucleon–nucleon potential is short-range (range $\sim 1/m_\pi \approx 1.4$ fm), attractive at moderate distances, repulsive at very short ones. This is the original "nuclear force" of pre-quark-era nuclear physics, now understood as an emergent low-energy consequence of the deeper QCD dynamics below.
Why "strong"?
The four fundamental couplings, at low energy:
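| Force | Coupling | Approximate value |
|---|---|---|
| Strong | $\alpha_s = g_s^2/4\pi$ | $\sim 1$ at $\mu \sim 1$ GeV, $\approx 0.118$ at $\mu = M_Z$ |
| Electromagnetic | $\alpha = e^2/4\pi$ | $\approx 1/137$ |
| Weak | $G_F m_p^2$ | $\sim 10^{-5}$ |
| Gravitational | $G_N m_p^2/\hbar c$ | $\sim 6\times 10^{-39}$ |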
So at hadronic energies the strong force is genuinely the strongest: by roughly two orders of magnitude over electromagnetism, about five orders over the weak interaction, and an astronomical 38 orders over gravity between elementary particles. Asymptotic freedom (Step 5) is what tames it at short distances: $\alpha_s$ decreases with energy, so the force gets weaker at higher energies (shorter distances) and does not fade as quarks are pulled apart, the opposite of the electromagnetic and gravitational pattern in which forces weaken with distance.
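The order-of-magnitude comparison can be checked with a few lines of arithmetic. The sketch below assumes the conventional dimensionless measures $\alpha_s \sim 1$, $\alpha$, $G_F m_p^2$, and $G_N m_p^2/\hbar c$; the value $\alpha_s = 1$ is only an order-of-magnitude stand-in for "hadronic scales", and the helper name `orders_below_strong` is illustrative.

```python
import math

# Dimensionless low-energy strengths. ALPHA_S_HADRONIC = 1 is an
# order-of-magnitude assumption, not a measured value.
ALPHA_S_HADRONIC = 1.0
ALPHA_EM = 1.0 / 137.036

G_FERMI = 1.1663788e-5       # Fermi constant, GeV^-2
M_PROTON_GEV = 0.938272      # proton mass, GeV
WEAK = G_FERMI * M_PROTON_GEV**2

G_NEWTON = 6.6743e-11        # m^3 kg^-1 s^-2
M_PROTON_KG = 1.67262e-27    # kg
HBAR = 1.054572e-34          # J s
C_LIGHT = 2.99792458e8       # m / s
GRAVITY = G_NEWTON * M_PROTON_KG**2 / (HBAR * C_LIGHT)


def orders_below_strong(strength):
    """Return log10(strong strength / given strength)."""
    return math.log10(ALPHA_S_HADRONIC / strength)


if __name__ == "__main__":
    for name, value in [("EM", ALPHA_EM), ("weak", WEAK), ("gravity", GRAVITY)]:
        print(f"{name:8s} {value:.3e}  ({orders_below_strong(value):.1f} orders below strong)")
```

The ratios depend on the chosen measure of each force (for instance which mass is used to make $G_F$ and $G_N$ dimensionless), so only the rough hierarchy is meaningful.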
Why most mass is QCD
One of QCD's most striking qualitative predictions, often missed in introductory treatments: only about $1\%$ of the proton's mass comes from the Higgs mechanism. The up-quark and down-quark masses (Higgs-generated) total roughly $2m_u + m_d \approx 9$ MeV, about $1\%$ of the proton mass $m_p \approx 938$ MeV. The remaining $\approx 99\%$ is dynamically generated by QCD: it is the energy stored in the gluon field and in quark kinetic motion confined inside the proton, related to the dimensional-transmutation scale $\Lambda_{\text{QCD}}$ (Step 5).
This is why almost all the mass of ordinary matter (atoms, you, this document) is QCD binding energy, not Higgs-generated rest mass. The Higgs is essential for most particle masses (electron, muon, $W^\pm$, $Z$, individual quarks), but the bulk of baryonic matter mass is the strong force at work. Lattice QCD computations confirm this ab initio: the entire light-hadron spectrum (proton, neutron, pion, kaon, ...) emerges from a Lagrangian whose only inputs are $\alpha_s$ (equivalently $\Lambda_{\text{QCD}}$) and the small quark masses, with no further mass parameter.
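The mass budget quoted above is simple arithmetic. A minimal sketch, assuming approximate current-quark masses ($m_u \approx 2.2$ MeV, $m_d \approx 4.7$ MeV, scheme-dependent values used here only for illustration) and the valence content $uud$:

```python
# Approximate current-quark masses in MeV (scheme- and scale-dependent;
# treated here as illustrative inputs, not precise values).
M_UP = 2.2
M_DOWN = 4.7
M_PROTON = 938.27


def higgs_fraction_of_proton_mass(m_u=M_UP, m_d=M_DOWN, m_p=M_PROTON):
    """Fraction of the proton mass carried by valence-quark rest masses (uud)."""
    return (2 * m_u + m_d) / m_p


if __name__ == "__main__":
    frac = higgs_fraction_of_proton_mass()
    print(f"valence quark rest mass: {2 * M_UP + M_DOWN:.1f} MeV")
    print(f"Higgs-generated fraction: {frac:.2%}")
    print(f"QCD-generated fraction:   {1 - frac:.2%}")
```

This counts only the valence-quark rest masses; a full decomposition of the proton mass (quark mass term, quark and gluon energies, trace anomaly) is a lattice-QCD question.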
Short historical lineage
The strong force was named long before QCD existed:
- Yukawa (1935): postulated a massive scalar mediator of the nuclear force, predicting the pion ($m_\pi \approx 140$ MeV, discovered 1947).
- Particle zoo (1950s–60s) — proliferation of hadrons (mesons, baryons, resonances) at accelerators suggested they were composite.
- Quark model (Gell-Mann, Zweig, 1964): hadrons organized into multiplets of an approximate flavor $SU(3)$ symmetry, postulated to arise from three quark "building blocks" $(u, d, s)$.
- Color (Greenberg, 1964; Han–Nambu, 1965): extra quantum number needed for the $\Delta^{++}$ baryon wavefunction to be antisymmetric under fermion exchange.
- Asymptotic freedom (Gross–Wilczek, Politzer, 1973): $SU(3)$ gauge theory with quarks in the fundamental representation has a negative $\beta$-function. This settled QCD as the theory of the strong force (Nobel Prize 2004).
- Confirmation (late 1960s–present): deep inelastic scattering at SLAC (partons, late 1960s); the 3-jet event at PETRA (gluons, 1979); precision QCD at LEP, Tevatron, and the LHC.
The "strong interaction" label predates the field-theoretic understanding by 40 years; the modern view is that the strong force is the gauge interaction of QCD, and everything else (nuclear binding, mesons, hadron spectroscopy) is an emergent consequence.
Companion presentation. This document picks up where Modern Foundations (the Wigner–Weinberg derivation) ends: the latter shows in §5.2 (and its Generalization callout) that any set of massless spin-1 particles forming a multiplet under an internal symmetry forces the Yang–Mills structure as a theorem of Lorentz consistency. Here we go the opposite direction: postulate $SU(3)$ as the gauge group, derive the resulting massless adjoint multiplet of gluons, and then proceed to the full Lagrangian, quantization, renormalization, and qualitative consequences (asymptotic freedom, confinement). See §Equivalent framings in Step 2 for the precise correspondence.
Following the tag convention of foundations-modern.md, each step below is labelled (Empirical input), (Postulate), (Theorem), or (Standard machinery) so the assumption budget on top of the QFT postulates is transparent.
Derivation of QCD from QFT
QCD inherits all postulates of QFT (see Postulates of QFT). The chain of choices that turns the generic QFT framework into QCD parallels the QED derivation.
Step 1 — Specify the Field Content (Empirical input; specializes QFT Postulate 4)
QCD postulates two fundamental field types:
- Quark fields $q_f^i(x)$: Dirac spinors carrying a color index $i = 1, 2, 3$ (transforming in the fundamental representation $\mathbf{3}$ of $SU(3)$) and a flavor index $f = u, d, s, c, b, t$. Each quark flavor has its own mass $m_f$.
- Gluon fields $A^a_\mu(x)$, $a = 1, \dots, 8$: eight real vector fields, one for each generator of $SU(3)$, transforming in the adjoint representation $\mathbf{8}$.
By the spin–statistics theorem (QFT Postulate 7), quarks are quantized with anticommutators and gluons with commutators.
Color vs. flavor. Color is the gauge symmetry — it is local and is the dynamical content of the theory. Flavor (which distinguishes up/down/strange/...) is a global approximate symmetry, broken explicitly by the different quark masses. The quark flavors are not related by any gauge transformation; they are different species, each with its own mass parameter.
Step 2 — Postulate a Local Gauge Symmetry (Postulate)
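The genuinely new ingredient is local non-abelian gauge invariance:

$$q(x) \to U(x)\,q(x), \qquad A_\mu(x) \to U(x)\,A_\mu(x)\,U^\dagger(x) + \frac{i}{g_s}\,U(x)\,\partial_\mu U^\dagger(x),$$

where $U(x) = \exp\!\big(i\,\theta^a(x)\,T^a\big) \in SU(3)$, the $T^a = \lambda^a/2$ are the eight Hermitian generators of $SU(3)$ in the fundamental representation (often written with the Gell-Mann matrices $\lambda^a$), and $A_\mu = A^a_\mu T^a$ is the matrix-valued gauge field. The generators satisfy the Lie algebra (see group-theory.md):

$$[T^a, T^b] = i f^{abc}\,T^c,$$

with totally antisymmetric structure constants $f^{abc}$.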
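The algebra $[T^a, T^b] = i f^{abc} T^c$ can be verified numerically. The sketch below assumes the standard Gell-Mann basis with $T^a = \lambda^a/2$ and normalization $\mathrm{Tr}(T^a T^b) = \delta^{ab}/2$, so that $f^{abc} = -2i\,\mathrm{Tr}([T^a, T^b]\,T^c)$. Indices are 0-based in the code, and the helper names are illustrative.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


S = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]
# Generators in the fundamental representation: T^a = lambda^a / 2.
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]


def structure_constant(a, b, c):
    """f^{abc} (0-based indices) from f^{abc} = -2i Tr([T^a, T^b] T^c)."""
    return (-2j * trace(matmul(commutator(T[a], T[b]), T[c]))).real


def algebra_closes(tol=1e-12):
    """Check [T^a, T^b] = i f^{abc} T^c entry by entry for all a, b."""
    for a in range(8):
        for b in range(8):
            lhs = commutator(T[a], T[b])
            for i in range(3):
                for j in range(3):
                    rhs = sum(1j * structure_constant(a, b, c) * T[c][i][j] for c in range(8))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True


if __name__ == "__main__":
    print("f^{123} =", structure_constant(0, 1, 2))
    print("f^{147} =", structure_constant(0, 3, 6))
    print("f^{458} =", structure_constant(3, 4, 7))
    print("algebra closes:", algebra_closes())
```

The non-vanishing $f^{abc}$ are exactly what distinguishes this case from $U(1)$, where the single generator commutes with itself.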
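The covariant derivative on a quark field is

$$D_\mu q = \big(\partial_\mu - i g_s\,A^a_\mu T^a\big)\,q,$$

and the non-abelian field strength is

$$G^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g_s f^{abc} A^b_\mu A^c_\nu.$$

The last term is the crucial difference from QED: $G_{\mu\nu} = G^a_{\mu\nu} T^a$ is not gauge-invariant by itself (it transforms covariantly, $G_{\mu\nu} \to U\,G_{\mu\nu}\,U^\dagger$), and the gauge-invariant kinetic term $-\tfrac14 G^a_{\mu\nu} G^{a\mu\nu}$ contains $\mathcal{O}(g_s A^3)$ and $\mathcal{O}(g_s^2 A^4)$ pieces, giving gluon self-interactions.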
Local $SU(3)$ invariance has two immediate consequences:
- Gluons must be massless (a mass term $m^2 A^a_\mu A^{a\mu}$ is not gauge-invariant).
- The way quarks couple to gluons is uniquely fixed by the covariant derivative; in addition, the gauge group itself dictates 3-gluon and 4-gluon vertices.
Equivalent framings. Postulating local $SU(3)$ gauge invariance and deriving an octet of massless gluons (this doc) is logically equivalent to postulating eight massless spin-1 species transforming in the adjoint of $SU(3)$ and deriving the Yang–Mills self-coupling structure from Lorentz consistency, the route sketched in the Generalization callout of foundations-modern.md §5.2. The two are the same theory viewed from opposite ends. As with QED, this document picks the gauge-symmetry-first framing because it generalizes uniformly to any compact Lie group (electroweak $SU(2)_L \times U(1)_Y$, GUTs, etc.) and exposes the non-abelian structure constants $f^{abc}$ as the cause of gluon self-interaction rather than as an accident.
Step 3 — Build the Lagrangian (Theorem; specializes QFT Postulate 9)
The most general Lagrangian density that is
- Lorentz-invariant,
- gauge-invariant under local $SU(3)$,
- $C$-, $P$-, and $T$-invariant (the $\theta$-term below is a parity-odd exception, see caveats),
- renormalizable (operators of mass dimension $\le 4$),
- built only from $q$, $\bar q$, $A^a_\mu$, and their derivatives,
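is uniquely

$$\mathcal{L}_{\text{QCD}} = \sum_f \bar q_f\,\big(i\gamma^\mu D_\mu - m_f\big)\,q_f \;-\; \frac14\,G^a_{\mu\nu} G^{a\mu\nu},$$

with the covariant derivative and field strength given in Step 2. The free parameters are the strong coupling $g_s$ (equivalently $\alpha_s = g_s^2/4\pi$) and the quark masses $m_f$.
Expanding in powers of $g_s$ exposes the new vertices absent in QED:

$$\mathcal{L}_{\text{QCD}} \supset g_s\,\bar q\,\gamma^\mu T^a q\,A^a_\mu \;-\; g_s f^{abc}\,(\partial_\mu A^a_\nu)\,A^{b\mu} A^{c\nu} \;-\; \frac{g_s^2}{4}\, f^{abe} f^{cde}\,A^a_\mu A^b_\nu A^{c\mu} A^{d\nu}.$$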
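Renormalizability rules out higher-dimension operators such as $(\bar q q)^2$ (dim 6) and $\bar q\,\sigma^{\mu\nu} T^a q\,G^a_{\mu\nu}$ (chromomagnetic Pauli term, dim 5). The hypothetical $CP$-violating term

$$\mathcal{L}_\theta = \theta\,\frac{g_s^2}{32\pi^2}\,G^a_{\mu\nu}\tilde G^{a\mu\nu}, \qquad \tilde G^{a\mu\nu} = \tfrac12\,\epsilon^{\mu\nu\rho\sigma} G^a_{\rho\sigma},$$

is allowed by gauge invariance, Lorentz invariance, and renormalizability (it violates only $P$ and $T$). Like the analogous term in QED it is a total derivative, but unlike in QED it cannot be discarded in the non-abelian case: topologically non-trivial gauge configurations (instantons) give its spacetime integral a non-zero value, so it contributes to physical amplitudes. The experimental smallness of the neutron electric dipole moment forces $|\bar\theta| \lesssim 10^{-10}$, the unexplained strong CP problem (see Caveats).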
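The classical equations of motion are the Dirac equation in a color background and the non-abelian Maxwell (Yang–Mills) equations with a color current:

$$\big(i\gamma^\mu D_\mu - m_f\big)\,q_f = 0, \qquad (D^\mu G_{\mu\nu})^a = -g_s \sum_f \bar q_f\,\gamma_\nu T^a q_f,$$

where $(D^\mu G_{\mu\nu})^a = \partial^\mu G^a_{\mu\nu} + g_s f^{abc} A^{b\mu} G^c_{\mu\nu}$ is the gauge-covariant divergence.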
Step 4 — Quantize (Standard machinery)
The fields are promoted to operators following the standard QFT machinery, with one important wrinkle absent in QED.
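Canonical quantization proceeds as for QED, but resolving the primary constraint $\Pi^0_a = 0$ in covariant gauges requires the Faddeev–Popov procedure. The most useful presentation is the path integral:

$$Z = \int \mathcal{D}A\,\mathcal{D}\bar q\,\mathcal{D}q\,\mathcal{D}\bar c\,\mathcal{D}c\;\exp\!\left(i\int d^4x\,\big[\mathcal{L}_{\text{QCD}} + \mathcal{L}_{\text{gf}} + \mathcal{L}_{\text{ghost}}\big]\right),$$

with covariant gauge-fixing and ghost terms

$$\mathcal{L}_{\text{gf}} = -\frac{1}{2\xi}\,\big(\partial^\mu A^a_\mu\big)^2, \qquad \mathcal{L}_{\text{ghost}} = \bar c^a\,\big(-\partial^\mu D^{ab}_\mu\big)\,c^b, \qquad D^{ab}_\mu = \delta^{ab}\,\partial_\mu + g_s f^{acb} A^c_\mu.$$

The Grassmann-valued scalar fields $c^a$, $\bar c^a$ are the Faddeev–Popov ghosts. The non-abelian piece $g_s f^{acb} A^c_\mu$ of the ghost-gluon coupling means ghosts run in loops: they do not decouple as they do in QED, and are required to cancel unphysical gluon polarizations and preserve unitarity. (Ghosts only appear inside loops; they are never external states.)
The full quantum theory has a residual BRST symmetry that replaces the broken classical gauge symmetry; this is what guarantees gauge-independence of physical $S$-matrix elements after gauge-fixing.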
Step 5 — Renormalize: Asymptotic Freedom (Standard machinery + key theorem)
QCD is renormalizable (proved by 't Hooft, 1971): all UV divergences can be absorbed into multiplicative redefinitions of the fields, masses, and the coupling $g_s$.
The defining feature of QCD is the sign of the beta function. To one loop,

$$\beta(g_s) \equiv \mu\,\frac{d g_s}{d\mu} = -\frac{b_0}{16\pi^2}\,g_s^3,$$

with $b_0 = 11 - \tfrac{2}{3} n_f$ for $SU(3)$ and $n_f$ the number of active quark flavors. For $n_f \le 16$ (in particular the physical $n_f \le 6$), $b_0 > 0$, so $\beta < 0$ and the coupling decreases as $\mu$ increases. Equivalently,

$$\alpha_s(\mu) = \frac{4\pi}{b_0\,\ln\!\big(\mu^2/\Lambda_{\text{QCD}}^2\big)},$$

where $\Lambda_{\text{QCD}}$ is the dimensional-transmutation scale at which $\alpha_s$ formally diverges. This is asymptotic freedom (Gross–Wilczek and Politzer, 1973), the historical justification for taking QCD seriously as the theory of the strong interaction:
- High energy / short distance ($\mu \gg \Lambda_{\text{QCD}}$): $\alpha_s$ is small, perturbation theory works, quarks behave as nearly free particles. This is the regime probed by deep inelastic scattering ($Q^2 \gg \Lambda_{\text{QCD}}^2$) and high-energy jets.
- Low energy / long distance ($\mu \lesssim \Lambda_{\text{QCD}}$): $\alpha_s$ is large, perturbation theory breaks down, and the spectrum is dominated by hadrons (mesons, baryons), not quarks and gluons. This is the regime of confinement and chiral symmetry breaking.
The crossover scale $\Lambda_{\text{QCD}}$ (a few hundred MeV) is generated dynamically; it is the only intrinsic mass scale of the massless-quark theory. The hadron mass spectrum (proton, pion, ...) is set by $\Lambda_{\text{QCD}}$, with quark masses providing only quantitative corrections.
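A short numerical sketch of one-loop running, assuming the standard one-loop coefficient $b_0 = \tfrac{11}{3}N_c - \tfrac{2}{3}n_f$ and the solution $1/\alpha_s(\mu) = 1/\alpha_s(\mu_0) + \tfrac{b_0}{4\pi}\ln(\mu^2/\mu_0^2)$. The reference value $\alpha_s(M_Z) \approx 0.118$, the fixed $n_f = 5$, and the function names are illustrative assumptions; flavor thresholds and higher loops are ignored.

```python
import math


def b0(n_f, n_c=3):
    """One-loop beta-function coefficient for SU(n_c) with n_f quark flavors."""
    return 11.0 * n_c / 3.0 - 2.0 * n_f / 3.0


def alpha_s_one_loop(mu, alpha_ref=0.118, mu_ref=91.19, n_f=5):
    """alpha_s(mu) evolved at one loop from alpha_ref at mu_ref (GeV), fixed n_f."""
    inverse = 1.0 / alpha_ref + b0(n_f) / (4.0 * math.pi) * math.log(mu**2 / mu_ref**2)
    if inverse <= 0.0:
        raise ValueError("mu is at or below the one-loop Landau pole")
    return 1.0 / inverse


def lambda_qcd_one_loop(alpha_ref=0.118, mu_ref=91.19, n_f=5):
    """Scale (GeV) where the one-loop coupling diverges: 1/alpha_s(Lambda) = 0."""
    return mu_ref * math.exp(-2.0 * math.pi / (b0(n_f) * alpha_ref))


if __name__ == "__main__":
    for mu in (2.0, 10.0, 91.19, 1000.0):
        print(f"alpha_s({mu:7.2f} GeV) = {alpha_s_one_loop(mu):.4f}")
    print(f"one-loop Lambda (n_f = 5): {1000 * lambda_qcd_one_loop():.0f} MeV")
    print("b0 changes sign between n_f = 16 and 17:", b0(16) > 0 > b0(17))
```

The one-loop, fixed-$n_f$ value of $\Lambda$ differs from the scheme-dependent values quoted in the literature, so it should be read only as showing how a mass scale emerges from a dimensionless coupling.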
Step 6 — Predictions (Standard machinery; specializes QFT Postulate 10)
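Time-ordered correlators are computed using Feynman rules read off from $\mathcal{L}_{\text{QCD}} + \mathcal{L}_{\text{gf}} + \mathcal{L}_{\text{ghost}}$ (all momenta incoming at the gluon vertices):
| Object | Feynman rule (Feynman gauge $\xi = 1$) |
|---|---|
| Quark propagator (flavor $f$, color $i \to j$) | $\dfrac{i\,\delta_{ij}\,(\gamma^\mu p_\mu + m_f)}{p^2 - m_f^2 + i\epsilon}$ |
| Gluon propagator (color $a \to b$) | $\dfrac{-i\,g_{\mu\nu}\,\delta^{ab}}{k^2 + i\epsilon}$ |
| Ghost propagator (color $a \to b$) | $\dfrac{i\,\delta^{ab}}{k^2 + i\epsilon}$ |
| Quark–gluon vertex | $i g_s\,\gamma^\mu\,T^a_{ji}$ |
| 3-gluon vertex | $g_s f^{abc}\,\big[g^{\mu\nu}(k - p)^\rho + g^{\nu\rho}(p - q)^\mu + g^{\rho\mu}(q - k)^\nu\big]$ |
| 4-gluon vertex | $-i g_s^2\,\big[f^{abe} f^{cde}(g^{\mu\rho} g^{\nu\sigma} - g^{\mu\sigma} g^{\nu\rho}) + f^{ace} f^{bde}(g^{\mu\nu} g^{\rho\sigma} - g^{\mu\sigma} g^{\nu\rho}) + f^{ade} f^{bce}(g^{\mu\nu} g^{\rho\sigma} - g^{\mu\rho} g^{\nu\sigma})\big]$ |
| Ghost–gluon vertex | $-g_s f^{abc}\,p^\mu$ (with $p$ the outgoing ghost momentum) |
| External quark / antiquark | $u(p)$ / $\bar v(p)$ for incoming lines ($\bar u(p)$ / $v(p)$ for outgoing), with color index $i$ |
| External gluon | $\epsilon_\mu(k)$ (incoming) or $\epsilon^*_\mu(k)$ (outgoing), with color index $a$ |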
LSZ reduction and the cross-section / decay-rate master formulas of cross-sections.md / decay-rates.md apply unchanged for partonic processes (where quarks and gluons can be treated as asymptotic states at high energy via the factorization theorems below). For low-energy observables, the partonic $S$-matrix is not directly measurable; see Caveats.
Confinement and the Asymptotic-State Problem
QCD violates the version of QFT Postulate 10 that identifies asymptotic states with single-particle quark/gluon excitations: free quarks and gluons have never been observed. Instead the asymptotic Hilbert space is built from color-singlet hadrons: mesons ($q\bar q$), baryons ($qqq$), glueballs, etc.
Confinement is a non-perturbative phenomenon, invisible at any finite order in $\alpha_s$. The standard pieces of evidence and tools:
- Lattice QCD: Wilson's discretization of the Euclidean path integral on a spacetime lattice, evaluated by Monte Carlo. Confirms a linearly rising quark–antiquark potential $V(r) \approx \sigma r$ at large $r$ (string tension $\sqrt{\sigma} \approx 440$ MeV), reproduces hadron masses ab initio, and is currently the only first-principles method for QCD at low energy.
- 't Hooft large-$N_c$ limit: at $N_c \to \infty$ with $g_s^2 N_c$ fixed, planar diagrams dominate; hadronic resonance physics organizes into a $1/N_c$ expansion that explains qualitative facts (OZI suppression, narrow widths, meson dominance).
- Wilson loops: $\langle W(C)\rangle \sim e^{-\sigma\,\text{Area}(C)}$ (area law) signals confinement, vs. $\langle W(C)\rangle \sim e^{-\mu\,\text{Perimeter}(C)}$ (perimeter law) for an unconfined Coulomb phase.
- The Yang–Mills mass gap (existence of a non-zero lightest glueball mass in pure $SU(3)$ gauge theory) is supported numerically by lattice computations and is one of the Clay Millennium Prize Problems.
Chiral Symmetry and Its Spontaneous Breaking
In the limit $m_u, m_d, m_s \to 0$, the QCD Lagrangian has a global chiral symmetry $SU(3)_L \times SU(3)_R$ acting on left- and right-handed quark flavors independently. The QCD vacuum spontaneously breaks this to the diagonal $SU(3)_V$ via the chiral quark condensate

$$\langle \bar q q \rangle = \langle \bar q_L q_R + \bar q_R q_L \rangle \neq 0.$$

Goldstone's theorem then predicts 8 massless pseudoscalar bosons, identified with the pseudoscalar meson octet $(\pi^\pm, \pi^0, K^\pm, K^0, \bar K^0, \eta)$. Small explicit quark masses ($m_q \ll \Lambda_{\text{QCD}}$) lift these to small but non-zero pseudo-Goldstone masses: the pions are unusually light because the explicit breaking of chiral symmetry by $m_u$ and $m_d$ is weak. The low-energy effective theory built around this picture is chiral perturbation theory ($\chi$PT).
The $U(1)_A$ piece of the classical chiral symmetry is broken by the chiral anomaly (not spontaneously): this resolves the historical $U(1)_A$ ($\eta'$ mass) puzzle and is intimately tied to instantons and the $\theta$-term.
Factorization and Practical Calculations
High-energy hadron-collider observables (e.g. proton–proton scattering at the LHC) cannot be computed directly because the asymptotic states (protons) are non-perturbative bound states. The bridge is factorization theorems: schematically,

$$\sigma(pp \to X) = \sum_{i,j} \int_0^1 dx_1\,dx_2\; f_{i/p}(x_1, \mu^2)\, f_{j/p}(x_2, \mu^2)\; \hat\sigma_{ij \to X}(x_1 x_2 s, \mu^2),$$

separating the calculation into:
- Parton distribution functions $f_{i/p}(x, \mu^2)$: probability of finding parton $i$ carrying momentum fraction $x$ inside a proton; non-perturbative, extracted from data.
- Partonic cross section $\hat\sigma_{ij \to X}$: computed perturbatively in $\alpha_s$ using the Feynman rules above and the standard cross-section machinery.
- DGLAP evolution: the $\mu^2$-dependence of $f_{i/p}$ is governed by perturbative QCD via the Dokshitzer–Gribov–Lipatov–Altarelli–Parisi equations.
This is what makes QCD predictive at colliders despite confinement.
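The structure of a factorized cross section, a double convolution of parton densities with a partonic cross section, can be illustrated numerically. Everything in the sketch below is hypothetical: `toy_pdf` is an invented shape $x^a(1-x)^b$ (not a fitted distribution), there is a single parton species, scale dependence is ignored, and `sigma_hat` is any user-supplied function of the partonic invariant mass squared $\hat s = x_1 x_2 s$.

```python
N_POINTS = 200   # midpoint-rule resolution per momentum fraction
X_MIN = 1e-3     # lower cutoff on the momentum fraction (toy regulator)


def toy_pdf(x, a=-0.5, b=3.0):
    """Hypothetical, unnormalized parton density x^a (1 - x)^b. NOT real data."""
    return x**a * (1.0 - x) ** b


def _grid():
    h = (1.0 - X_MIN) / N_POINTS
    xs = [X_MIN + (i + 0.5) * h for i in range(N_POINTS)]
    return h, xs


def pdf_integral():
    """Midpoint estimate of the integral of toy_pdf over [X_MIN, 1]."""
    h, xs = _grid()
    return h * sum(toy_pdf(x) for x in xs)


def hadronic_cross_section(sigma_hat, s):
    """Midpoint estimate of int dx1 dx2 f(x1) f(x2) sigma_hat(x1 * x2 * s)."""
    h, xs = _grid()
    fs = [toy_pdf(x) for x in xs]
    total = 0.0
    for x1, f1 in zip(xs, fs):
        for x2, f2 in zip(xs, fs):
            total += f1 * f2 * sigma_hat(x1 * x2 * s)
    return total * h * h


if __name__ == "__main__":
    # A partonic cross section that switches on above a production threshold.
    threshold = lambda s_hat: 1.0 if s_hat > 0.1 else 0.0
    print("flat sigma_hat:     ", hadronic_cross_section(lambda s_hat: 1.0, s=1.0))
    print("threshold sigma_hat:", hadronic_cross_section(threshold, s=1.0))
```

With a constant `sigma_hat` the double integral factorizes into the square of the parton-density integral, which is a convenient sanity check; a threshold in $\hat s$ selects only the large-$x$ tail of the densities, which is why heavy-particle production rates are so sensitive to the PDFs.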
Summary: What QCD Adds to QFT and Differs from QED
| Aspect | QED | QCD |
|---|---|---|
| Gauge group | $U(1)$, abelian | $SU(3)$, non-abelian |
| Matter fields | charged fermions $\psi$ in one-dimensional (charge $Q$) representations | quarks in the $\mathbf{3}$, $n_f = 6$ flavors |
| Gauge bosons | 1 photon | 8 gluons |
| Gauge-boson self-interactions | none | 3- and 4-gluon vertices |
| Ghosts in covariant gauges | decouple | required (run in loops) |
| Beta function sign (1-loop) | $\beta > 0$ (Landau pole at high $\mu$) | $\beta < 0$ (asymptotic freedom) |
| Coupling at low energy | weak ($\alpha \approx 1/137$) | strong, perturbation theory fails |
| Asymptotic states | electrons, photons | hadrons (color singlets) |
| Spectrum determined by | parameters $\alpha$, $m_e$ | $\Lambda_{\text{QCD}}$ + quark masses (mostly $\Lambda_{\text{QCD}}$) |
| Bound states | hydrogenic, weakly coupled | hadrons, intrinsically strong-coupling |
| Discrete symmetries | $C$, $P$, $T$ separately conserved | $C$, $P$, $T$ separately conserved (modulo $\bar\theta$) |
| Rigorous construction in $d = 4$ | open | open (Millennium Prize) |
So the "derivation" of QCD from QFT amounts to the same three choices as QED (field content, gauge group, renormalizability) with $G = SU(3)$. The dynamics that follows is qualitatively different.
Successes and Tested Predictions
- Deep inelastic scattering (DIS) and Bjorken scaling: high-energy electron–proton scattering revealed point-like constituents (partons) inside the proton, and the logarithmic violations of exact Bjorken scaling match the DGLAP evolution predicted by perturbative QCD.
- Asymptotic freedom: the measured running of $\alpha_s$ from $\mu \sim 1$ GeV (where $\alpha_s \approx 0.5$) to $\mu = M_Z$ (where $\alpha_s \approx 0.118$) agrees with QCD predictions across two decades in energy.
- Jet physics at colliders: 2-jet, 3-jet, and multi-jet rates at $e^+e^-$ machines (LEP, SLC) and hadron colliders (Tevatron, LHC) confirm both the gluon's existence (3-jet events at PETRA, 1979) and the QCD prediction for jet cross sections.
- Lattice QCD ab initio computations of the hadron spectrum reproduce the proton, neutron, pion, kaon, ... masses to a few percent using only $\alpha_s$ and the quark masses as inputs.
- Heavy-quark physics: the spectroscopy and decays of charmonium ($c\bar c$), bottomonium ($b\bar b$), and $B$-mesons are computed with NRQCD and HQET, effective theories built from QCD by integrating out the heavy-quark scale.
Caveats and Open Issues
- No rigorous construction of Yang–Mills in $d = 4$ with a mass gap, one of the Clay Millennium Prize Problems. Confinement is universally believed but has never been proved analytically.
- The strong CP problem. The $\theta$-term is allowed by gauge invariance and renormalizability and would generate a neutron EDM at the level $d_n \sim 10^{-16}\,\bar\theta\ e\cdot\text{cm}$; the experimental bound $|d_n| \lesssim 10^{-26}\ e\cdot\text{cm}$ forces $|\bar\theta| \lesssim 10^{-10}$. Why this parameter is so tiny is unexplained within QCD; the most popular resolution is the Peccei–Quinn mechanism, which predicts a new pseudoscalar particle, the axion.
- Sign problem in lattice QCD at finite density. Monte Carlo importance sampling fails when the Euclidean action becomes complex (finite chemical potential, real-time dynamics); this obstructs first-principles study of dense QCD (neutron-star interiors, the QCD phase diagram).
- Confinement mechanism. Several pictures (dual superconductor / monopole condensation, center-vortex condensation, Gribov–Zwanziger horizon) capture aspects of confinement, but no single mechanism is established as the answer.
- Quark masses are inputs. The 6 quark masses + the strong coupling are free parameters of QCD; their hierarchical pattern (about five orders of magnitude, from $m_u \approx 2$ MeV to $m_t \approx 173$ GeV) is unexplained within QCD and must be supplied by physics beyond it.
See Also
- QED — the abelian gauge theory, structurally identical at the Lagrangian level but qualitatively different in dynamics.
- Modern Foundations — the Wigner–Weinberg derivation framework into which QCD fits as a specialization (with the asymptotic-completeness caveat above).
- Postulates of QFT — the general framework QCD specializes.
- Group Theory: for $SU(3)$, its Lie algebra $\mathfrak{su}(3)$, structure constants, and representation theory.
- Remarks and Open Issues — for the Yang–Mills existence problem and related foundational gaps.
Electroweak Theory
The electroweak theory is the relativistic quantum field theory of leptons, quarks, the photon, the $W^\pm$ and $Z$ gauge bosons, and the Higgs boson. It unifies the electromagnetic interaction (QED) with the weak interaction in a single gauge theory based on the group

$$SU(2)_L \times U(1)_Y,$$

spontaneously broken by the vacuum expectation value of a Higgs scalar doublet down to the residual electromagnetic $U(1)_{\text{em}}$. The full unbroken theory has four massless gauge bosons; the broken theory has one massless gauge boson (photon) and three massive ones ($W^+$, $W^-$, $Z$), with masses generated by the Higgs mechanism.
It is the second pillar of the Standard Model (alongside QCD), the canonical example of:
- spontaneous symmetry breaking (SSB) in a gauge theory (the Higgs mechanism),
- chiral fermions (left-handed and right-handed components transform differently under the gauge group),
- explicit parity violation ($P$, $C$ separately violated; $CP$ violated through the CKM phase),
- flavor mixing (the CKM and PMNS matrices),
and historically the first instance of a unification of two of the four fundamental interactions.
What Electroweak Theory Describes Physically: Electromagnetism and the Weak Force
Electroweak theory subsumes two of the four fundamental forces of nature. The electromagnetic force is treated in detail in QED (the unbroken $U(1)_{\text{em}}$ that remains after symmetry breaking; see §3.1). This section unpacks the second, less familiar half: the weak force.
What the weak force is
The weak force (weak interaction) is the fundamental force responsible for processes that change one type of fermion into another. Unlike electromagnetism (which preserves species and only shifts energies/momenta) or the strong force (which binds quarks of fixed flavor), the weak force can turn a down quark into an up quark, an electron into a neutrino, a muon into an electron, and so on.
Canonical processes:
- Nuclear $\beta$-decay: $n \to p + e^- + \bar\nu_e$, the historical entry point (radioactivity, discovered 1896; understood as a weak process in the 1930s). Microscopically, a down quark inside the neutron emits a virtual $W^-$ and turns into an up quark; the $W^-$ then decays into an electron and an electron antineutrino.
- Muon decay: $\mu^- \to e^- + \bar\nu_e + \nu_\mu$, the cleanest weak process, used to define the Fermi constant $G_F$.
- Pion decay $\pi^+ \to \mu^+ + \nu_\mu$, kaon decay, hyperon decay, and the entire menagerie of "slow" hadron decays whose lifetimes ($\sim 10^{-10}$ s or longer) signal a weak-force origin.
- Hydrogen fusion in the Sun: $p + p \to d + e^+ + \nu_e$. A proton must turn into a neutron for the deuteron to form, and only the weak force allows that. The slowness of this process (a mean wait of $\sim 10^{10}$ yr per proton) is why the Sun burns for billions of years rather than seconds.
- Neutrino interactions: neutrinos feel only the weak force (and gravity), which is why a typical solar neutrino traverses a light-year of lead before scattering.
Distinguishing features
| Property | Weak force | (For comparison) |
|---|---|---|
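| Mediators | $W^\pm$, $Z$: massive vector bosons | photon: massless; gluons: massless |
| $W$ mass | $M_W \approx 80.4$ GeV | photon: $0$ |
| $Z$ mass | $M_Z \approx 91.2$ GeV | (same) |
| Range | $\sim 1/M_W \approx 2.5\times 10^{-18}$ m | EM: infinite; strong: $\sim 1$ fm |
| Strength at low energy | $G_F m_p^2 \sim 10^{-5}$: very weak | $\alpha \approx 1/137$; $\alpha_s \sim 1$ |
| Strength at $E \sim M_W$ | $\alpha_W = g^2/4\pi \approx 1/30$, comparable to EM | $\alpha \approx 1/128$ |
| Parity ($P$) | Maximally violated: only $\psi_L$ feels charged currents | conserved in QED/QCD |
| Charge conjugation ($C$) | maximally violated | conserved in QED/QCD |
| $CP$ | violated (CKM phase) | conserved in QED/QCD (modulo QCD $\bar\theta$) |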
| Flavor-changing | The only force that changes quark/lepton flavor | EM/strong are flavor-diagonal |
| Acts on | All fermions (including neutrinos) | EM: charged; strong: colored |
The name "weak" is a low-energy accident: at energies $E \gtrsim M_W$ the weak coupling is comparable to the electromagnetic one. The apparent weakness at everyday energies comes from the propagator suppression $1/(q^2 - M_W^2) \approx -1/M_W^2$ for $|q^2| \ll M_W^2$, which pulls a factor of $G_F \sim g^2/M_W^2$ into every weak amplitude. Strip away that propagator factor and the underlying coupling $\alpha_W = g^2/4\pi \approx 1/30$ is stronger than the electromagnetic $\alpha \approx 1/137$. This is exactly the hint that led to electroweak unification: the two forces look qualitatively different at low energy, but their underlying gauge couplings are of the same order.
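The claim that the weak force is only apparently weak can be checked numerically. The sketch below assumes the standard tree-level matching relation $G_F/\sqrt{2} = g^2/(8 M_W^2)$ and uses measured values of $G_F$ and $M_W$ as inputs; the function names are illustrative, and radiative corrections are ignored.

```python
import math

G_FERMI = 1.1663788e-5   # Fermi constant, GeV^-2
M_W = 80.38              # W boson mass, GeV (approximate)
HBAR_C = 0.1973270       # GeV * fm
ALPHA_EM = 1.0 / 137.036


def su2_coupling(g_fermi=G_FERMI, m_w=M_W):
    """Gauge coupling g from the tree-level relation G_F / sqrt(2) = g^2 / (8 M_W^2)."""
    return math.sqrt(8.0 * m_w**2 * g_fermi / math.sqrt(2.0))


def alpha_weak(g):
    """Weak analogue of the fine-structure constant, g^2 / (4 pi)."""
    return g**2 / (4.0 * math.pi)


def weak_range_fm(m_w=M_W):
    """Range of W exchange, hbar c / M_W, in femtometres."""
    return HBAR_C / m_w


if __name__ == "__main__":
    g = su2_coupling()
    print(f"g          = {g:.3f}")
    print(f"alpha_W    = 1/{1 / alpha_weak(g):.1f}")
    print(f"alpha_EM   = 1/{1 / ALPHA_EM:.1f}")
    print(f"weak range = {weak_range_fm():.2e} fm")
```

The underlying coupling comes out larger than $\alpha$, while the range is about three orders of magnitude shorter than a nucleon radius: the suppression at low energy is a mass effect, not a small-coupling effect.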
Two types of weak interactions
Empirically the weak force splits into two distinct classes:
- Charged-current (CC) interactions, mediated by $W^\pm$. They change electric charge by one unit and the flavor of one fermion. Vertex (in the doublet $(\nu_e, e)_L$, similarly for $(u, d)_L$): $\tfrac{g}{\sqrt{2}}\,\bar\nu_{eL}\,\gamma^\mu e_L\,W^+_\mu + \text{h.c.}$ All known $\beta$-decays, hyperon decays, and most flavor physics happen via CC.
- Neutral-current (NC) interactions, mediated by $Z$. They preserve flavor and electric charge but provide a new weak force between like and unlike species, including neutrino–electron scattering. The $Z$ couples to a mixture of vector and axial-vector currents and was predicted by electroweak unification before direct observation (Gargamelle, 1973: the first hard confirmation of the GSW theory).
V−A structure and parity violation
The single most surprising experimental fact about the weak force is maximal parity violation: only left-handed fermions couple to the $W^\pm$. A right-handed electron does not feel the charged-current weak force at all. This was discovered in the Wu experiment (1956–57): polarized $^{60}\text{Co}$ nuclei were found to emit electrons preferentially in one direction relative to the nuclear spin, a result that cannot happen in a parity-conserving theory. Lee and Yang shared the Nobel Prize in 1957 for predicting it.
The compact statement is that the charged-current interaction has V−A (vector-minus-axial) structure: it couples through $\gamma^\mu(1 - \gamma^5)$ rather than the parity-symmetric $\gamma^\mu$ of QED. This is encoded in the modern theory by putting $\psi_L$ in $SU(2)_L$ doublets and $\psi_R$ in singlets. That is a structural choice (Step 1 below), but the empirical input it formalizes is the V−A structure that Wu's experiment exposed.
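The algebra behind the V−A statement is that $P_L = (1 - \gamma^5)/2$ is a projector and that $\gamma^\mu P_L = P_R \gamma^\mu$, so a current built with $\gamma^\mu(1-\gamma^5)$ involves only left-handed fields. The sketch below checks these identities with explicit matrices; it assumes the Dirac representation of the gamma matrices and $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$ (a convention choice), and the helper names are illustrative.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def blocks(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Dirac representation: gamma^0 = diag(1, 1, -1, -1), gamma^i = [[0, s_i], [-s_i, 0]].
GAMMA = [blocks(I2, Z2, Z2, scale(-1, I2))]
GAMMA += [blocks(Z2, s, scale(-1, s), Z2) for s in SIGMA]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))
P_L = scale(0.5, add(I4, scale(-1, GAMMA5)))
P_R = scale(0.5, add(I4, GAMMA5))

if __name__ == "__main__":
    print("P_L is a projector:  ", close(matmul(P_L, P_L), P_L))
    print("P_L P_R = 0:         ", close(matmul(P_L, P_R), ZERO4))
    print("gamma^mu P_L = P_R gamma^mu:",
          all(close(matmul(g, P_L), matmul(P_R, g)) for g in GAMMA))
```

Because $\bar\psi\,\gamma^\mu P_L\,\psi = \bar\psi_L \gamma^\mu \psi_L$, assigning only $\psi_L$ to $SU(2)_L$ doublets reproduces the V−A current automatically.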
The terminology mismatch: is there a "QFD"? By analogy with QED (the standalone gauge theory of EM, $U(1)_{\text{em}}$) and QCD (the standalone gauge theory of the strong force, $SU(3)$), one might expect a standalone theory of the weak force, sometimes called QFD (Quantum Flavordynamics) in older literature. There is no such theory in nature. The weak interaction is inseparably unified with electromagnetism in electroweak theory: the $W^\pm$, $Z$, $\gamma$ are linear combinations of the original $W^{1,2,3}_\mu$, $B_\mu$ gauge bosons (with $Z$ and $\gamma$ obtained from $W^3$ and $B$ by a rotation through the Weinberg angle $\theta_W$), and they cannot be cleanly disentangled into a "weak-only" and "EM-only" sector at the Lagrangian level. The historical predecessor, Fermi's effective theory of $\beta$-decay (1933) with a four-fermion contact interaction, was a standalone theory of the weak force, but it is non-renormalizable and breaks down at energies $\sqrt{s} \sim G_F^{-1/2} \approx 300$ GeV. EW theory replaces it with a renormalizable gauge theory in which Fermi's $G_F$ emerges as the low-energy limit, $G_F/\sqrt{2} = g^2/(8 M_W^2)$. So "QFD" never caught on, and the modern view is that the weak force is one half of electroweak.
Short historical lineage
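- Becquerel 1896: discovery of radioactivity; later identified as $\beta$-decay = weak.
- Fermi 1933: first quantitative theory, a four-fermion contact interaction $\tfrac{G_F}{\sqrt{2}}\,(\bar p\,\gamma^\mu n)(\bar e\,\gamma_\mu \nu)$, with $G_F \approx 1.17\times 10^{-5}\ \text{GeV}^{-2}$. Predicts $\beta$ spectra correctly. Non-renormalizable.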
- Lee–Yang 1956, Wu 1957 — parity violation. The weak interaction is the only one in nature that distinguishes left from right at the fundamental level.
- Feynman–Gell-Mann / Sudarshan–Marshak 1958 — V−A structure: the weak interaction couples only to left-handed currents.
- Glashow 1961, Weinberg 1967, Salam 1968: electroweak unification, $SU(2)_L \times U(1)_Y$ with the Higgs mechanism. Predicts $W^\pm$, $Z$, neutral currents, and the photon all from the same gauge structure. Nobel Prize 1979.
- 't Hooft–Veltman 1971–72 — proof that the spontaneously broken non-abelian gauge theory is renormalizable. Settles EW as a viable QFT. Nobel Prize 1999.
- Gargamelle 1973: discovery of weak neutral currents (the first confirmation of $Z$ exchange as predicted by EW).
- UA1/UA2 1983: direct discovery of the $W^\pm$ and $Z$ bosons at CERN. Nobel Prize 1984.
- LHC ATLAS/CMS 2012: discovery of the Higgs boson at $m_h \approx 125$ GeV. Nobel Prize 2013.
The "weak interaction" label predates the field-theoretic understanding by 40+ years; the modern view is that the weak force and electromagnetism are two low-energy faces of a single gauge interaction, broken by the Higgs VEV to expose the difference between them.
Companion presentations.
- Modern Foundations (the Wigner–Weinberg derivation) shows in §5.2 that Lorentz consistency of any set of massless spin-1 species forces a Yang–Mills structure (the "Generalization" callout). It also flags (in the "What this story does not derive" callout) that massive gauge bosons are not derivable from M1+M2+M3 alone: their longitudinal polarizations break perturbative unitarity unless the gauge symmetry is spontaneously broken by a scalar VEV. This document is where that gap is filled: the Higgs mechanism is the empirical input that makes massive $W^\pm$, $Z$ consistent with Lorentz + unitarity.
- QED and QCD are the two simpler gauge theories built on the same template (gauge symmetry → covariant derivative → renormalizable Lagrangian). Electroweak adds chirality, symmetry breaking, and mixing. See §Comparison with QED and QCD for a side-by-side.
- Standard Model combines this document with QCD into the full theory, with the additional cross-sector ingredients (anomaly cancellation, three generations, Yukawa hierarchy, CP problems).
Following the tag convention of foundations-modern.md, each step below is labelled (Empirical input), (Postulate), (Theorem), or (Standard machinery) so the assumption budget on top of the QFT postulates is transparent.
Derivation of Electroweak Theory from QFT
Electroweak inherits all postulates of QFT (see Postulates of QFT). The chain of choices parallels the QED / QCD derivations but with three new structural ingredients: chirality in the field content (Step 1), spontaneous symmetry breaking in the Lagrangian (Step 3), and mass mixing at the quantization stage (Step 4).
Step 1 — Specify the Field Content (Empirical input; specializes QFT Postulate 4)
The electroweak field content has four irreducible pieces, each playing a distinct structural role.
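Gauge fields. Four real vector fields, one for each generator of $SU(2)_L \times U(1)_Y$:
- $W^a_\mu$, $a = 1, 2, 3$: three real vector fields in the adjoint of $SU(2)_L$.
- $B_\mu$: one real vector field, neutral under $SU(2)_L$, the gauge field of $U(1)_Y$; it couples to hypercharge via the term $g' Y B_\mu$ in the covariant derivative.
At the level of the unbroken Lagrangian all four are massless. The physical $W^\pm$, $Z$, $\gamma$ emerge after symmetry breaking (Step 3) as linear combinations.
Matter fields: chiral fermions. The genuinely novel feature relative to QED/QCD is that left-handed and right-handed components transform differently: they belong to inequivalent representations of $SU(2)_L \times U(1)_Y$. Writing $\psi_L = \tfrac12(1 - \gamma^5)\psi$ and $\psi_R = \tfrac12(1 + \gamma^5)\psi$: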
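| Field | Symbol | Multiplet under $SU(2)_L$ | $Y$ (convention $Q = T^3 + Y$) |
|---|---|---|---|
| Left-handed lepton doublet | $L = (\nu_{eL}, e_L)^T$ | $\mathbf{2}$ (doublet) | $-1/2$ |
| Right-handed electron | $e_R$ | $\mathbf{1}$ (singlet) | $-1$ |
| Left-handed quark doublet | $Q = (u_L, d_L)^T$ | $\mathbf{2}$ | $+1/6$ |
| Right-handed up-quark | $u_R$ | $\mathbf{1}$ | $+2/3$ |
| Right-handed down-quark | $d_R$ | $\mathbf{1}$ | $-1/3$ |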
(Right-handed neutrinos are not present in the original SM but are added in extensions to give neutrinos Dirac masses; see Caveats.) The full theory has three generations of this pattern (electron / muon / tau, with up/down, charm/strange, top/bottom); a single generation is shown above for clarity. Hypercharge assignments are not free: they are fixed by anomaly cancellation (see standard-model/from-postulates.md).
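The statement that hypercharges are tied together by anomaly cancellation can be checked with exact rational arithmetic. The sketch below assumes the convention $Q = T^3 + Y$ and the standard one-generation assignments $Y(L) = -\tfrac12$, $Y(e_R) = -1$, $Y(Q) = +\tfrac16$, $Y(u_R) = +\tfrac23$, $Y(d_R) = -\tfrac13$; left-handed fields enter the anomaly sums with a plus sign and right-handed fields with a minus sign. The data layout and function names are illustrative.

```python
from fractions import Fraction as F

# One generation. Each entry: (name, chirality, SU(2) dimension, colors, Y),
# with hypercharges in the assumed convention Q = T3 + Y.
FIELDS = [
    ("L",   "L", 2, 1, F(-1, 2)),
    ("e_R", "R", 1, 1, F(-1)),
    ("Q",   "L", 2, 3, F(1, 6)),
    ("u_R", "R", 1, 3, F(2, 3)),
    ("d_R", "R", 1, 3, F(-1, 3)),
]
SIGN = {"L": 1, "R": -1}


def electric_charges(su2_dim, y):
    """Electric charges Q = T3 + Y of the components of a multiplet."""
    t3_values = [F(1, 2), F(-1, 2)] if su2_dim == 2 else [F(0)]
    return [t3 + y for t3 in t3_values]


def anomaly_sums():
    """Coefficients of the four gauge anomalies that involve hypercharge."""
    return {
        "U(1)^3": sum(SIGN[ch] * d * nc * y**3 for _, ch, d, nc, y in FIELDS),
        "grav^2 U(1)": sum(SIGN[ch] * d * nc * y for _, ch, d, nc, y in FIELDS),
        "SU(2)^2 U(1)": sum(nc * y for _, ch, d, nc, y in FIELDS if d == 2),
        "SU(3)^2 U(1)": sum(SIGN[ch] * d * y for _, ch, d, nc, y in FIELDS if nc == 3),
    }


if __name__ == "__main__":
    for name, _, dim, _, y in FIELDS:
        print(f"{name:4s} Y = {str(y):5s} Q = {[str(q) for q in electric_charges(dim, y)]}")
    for label, value in anomaly_sums().items():
        print(f"{label:13s} -> {value}")
```

All four sums vanish only because the quark and lepton contributions cancel against each other, which is one reason a generation must contain both.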
Why chirality? The empirical input is the V−A (vector-minus-axial) structure of charged-current weak interactions, first systematized by Feynman–Gell-Mann (1958) and Sudarshan–Marshak (1958) and traced to parity violation in $\beta$-decay (Wu, 1957). Putting $\psi_L$ and $\psi_R$ in inequivalent gauge multiplets is the modern way of encoding this: a Dirac mass term $m\,\bar\psi\psi = m\,(\bar\psi_L\psi_R + \bar\psi_R\psi_L)$ mixes the two chiralities and is therefore forbidden by gauge invariance until the Higgs gives the symmetry-breaking VEV that supplies the missing quantum numbers.
Scalar field: the Higgs. A complex scalar doublet under $SU(2)_L$ with hypercharge $Y = +\tfrac12$ (in the convention $Q = T^3 + Y$):

$$\Phi = \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}.$$

This is the single field whose presence is novel relative to QED/QCD. The four real degrees of freedom in $\Phi$ become, after symmetry breaking, three longitudinal modes of $W^\pm$, $Z$ (eaten by the gauge bosons, Goldstone-style) and one physical scalar: the Higgs boson $h$, mass $m_h \approx 125$ GeV (discovered at the LHC, 2012).
By the spin–statistics theorem (QFT Postulate 7), all fermions are quantized with anticommutators, all bosons with commutators.
Step 2 — Postulate Local Gauge Symmetry (Postulate)
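The next ingredient is local non-abelian × abelian gauge invariance. With $T^a = \sigma^a/2$ the $SU(2)_L$ generators (half the Pauli matrices) and $Y$ the hypercharge generator:
- A field $\psi$ in a representation of $SU(2)_L$ with hypercharge $Y$ transforms as

$$\psi(x) \to \exp\!\big(i\,\alpha^a(x)\,T^a + i\,\beta(x)\,Y\big)\,\psi(x).$$

- The gauge fields transform inhomogeneously (with $W_\mu = W^a_\mu T^a$ and $U = e^{i\alpha^a T^a}$):

$$W_\mu \to U\,W_\mu\,U^\dagger + \frac{i}{g}\,U\,\partial_\mu U^\dagger, \qquad B_\mu \to B_\mu + \frac{1}{g'}\,\partial_\mu \beta.$$

The covariant derivative acting on $\psi$ is

$$D_\mu \psi = \big(\partial_\mu - i g\,W^a_\mu T^a - i g'\,Y B_\mu\big)\,\psi,$$

with two independent gauge couplings: $g$ for $SU(2)_L$, $g'$ for $U(1)_Y$ (on $SU(2)_L$ singlets the $T^a$ term is absent). The field strengths are

$$W^a_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu + g\,\epsilon^{abc} W^b_\mu W^c_\nu, \qquad B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu.$$

As in QCD, the kinetic term built from $W^a_{\mu\nu}$ contains 3- and 4-gauge-boson self-couplings; $B_{\mu\nu}$ is abelian like the photon field strength and has no self-couplings of its own.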
Local $SU(2)_L \times U(1)_Y$ invariance has the following immediate consequences:
- All four gauge bosons ($W^{1,2,3}_\mu$, $B_\mu$) must be massless at the Lagrangian level.
- A Dirac fermion mass term $m\,\bar\psi\psi$ is forbidden because $\psi_L$ and $\psi_R$ live in different representations.
- The fermion–gauge-boson couplings are uniquely fixed by the covariant derivative; the gauge boson self-couplings are uniquely fixed by the structure constants.
Equivalent framings. As in QED and QCD, postulating local gauge invariance and deriving the masslessness and self-couplings of the gauge bosons is equivalent to postulating four massless spin-1 species (with the right Lorentz quantum numbers and an internal multiplet structure) and deriving the Yang–Mills + abelian gauge structure via foundations-modern.md §5.2. The gauge-first framing is taken here, as in QCD, because it makes the symmetry-breaking pattern (Step 3) and the residual unbroken $U(1)_{\text{em}}$ transparent.
Step 3 — Build the Lagrangian: Higgs Mechanism (Theorem + Postulate; specializes QFT Postulate 9)
The most general Lagrangian density that is
- Lorentz-invariant,
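- gauge-invariant under local $SU(2)_L \times U(1)_Y$,
- $CPT$-invariant (with $P$ and $C$ separately, and $CP$, all violated, as we will see),
- renormalizable (operators of mass dimension $\le 4$),
built from the fields of Step 1, splits naturally into four pieces:
$$\mathcal{L}_{\text{EW}} = \mathcal{L}_{\text{gauge}} + \mathcal{L}_{\text{fermion}} + \mathcal{L}_{\text{Higgs}} + \mathcal{L}_{\text{Yukawa}},$$
with:
$$\mathcal{L}_{\text{gauge}} = -\tfrac14 W^a_{\mu\nu}W^{a\,\mu\nu} - \tfrac14 B_{\mu\nu}B^{\mu\nu}, \qquad \mathcal{L}_{\text{fermion}} = \sum_{\psi} \bar\psi\, i\gamma^\mu D_\mu \psi,$$
where the sum runs over all chiral multiplets $\psi \in \{Q_L, u_R, d_R, L_L, e_R\}$ (per generation), each with its own covariant derivative determined by its representation and hypercharge, and
$$\mathcal{L}_{\text{Higgs}} = (D_\mu\phi)^\dagger(D^\mu\phi) - V(\phi), \qquad V(\phi) = -\mu^2\,\phi^\dagger\phi + \lambda\,(\phi^\dagger\phi)^2,$$
$$\mathcal{L}_{\text{Yukawa}} = -\bar L_L\, Y_e\, \phi\, e_R - \bar Q_L\, Y_d\, \phi\, d_R - \bar Q_L\, Y_u\, \tilde\phi\, u_R + \text{h.c.},$$
with $\tilde\phi = i\sigma^2\phi^*$ the charge-conjugate doublet (needed to give up-type quarks mass while preserving hypercharge), $Y_e$, $Y_d$, $Y_u$ generic complex Yukawa matrices in generation space, and h.c. denoting Hermitian conjugate.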
The Higgs mechanism is what makes this Lagrangian describe massive gauge bosons and massive fermions consistently with gauge invariance.
3.1 Spontaneous symmetry breaking (Theorem)
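When $\mu^2 > 0$ in the Higgs potential $V(\phi) = -\mu^2\,\phi^\dagger\phi + \lambda\,(\phi^\dagger\phi)^2$, the minimum is not at $\phi = 0$ but on a sphere with
$$\langle \phi^\dagger\phi \rangle = \frac{v^2}{2}, \qquad v = \sqrt{\mu^2/\lambda} \approx 246\,\text{GeV},$$
the electroweak scale, fixed empirically by measured $W$ and $Z$ masses. Picking the gauge
$$\langle\phi\rangle = \frac{1}{\sqrt2}\begin{pmatrix} 0 \\ v \end{pmatrix},$$
three of the four generators of $SU(2)_L \times U(1)_Y$ are broken by this VEV; one combination remains unbroken:
$$Q = T^3 + Y.$$
This is the electric charge generator. The unbroken subgroup is $U(1)_{\text{em}}$, exactly the gauge group of QED. So electroweak theory is the minimal extension of QED to $SU(2)_L \times U(1)_Y$ with a renormalizable scalar potential that breaks the larger group back down to $U(1)_{\text{em}}$.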
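The breaking pattern can be checked numerically. The sketch below assumes the convention $Q = T^3 + Y$ with Higgs hypercharge $Y = 1/2$ and the VEV placed in the lower component of the doublet; the variable names are illustrative. A generator is unbroken exactly when it annihilates the VEV:

```python
import numpy as np

# SU(2)_L generators in the doublet representation, T^a = sigma^a / 2.
T = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex) / 2

Y = 0.5 * np.eye(2)   # Higgs hypercharge (assumed convention Q = T^3 + Y)
v = 246.0             # electroweak scale in GeV (approximate)
vev = np.array([0.0, v]) / np.sqrt(2)

# Candidate unbroken generator: the electric charge.
Q = T[2] + Y
q_annihilates_vev = np.allclose(Q @ vev, 0)

# The three remaining independent combinations all move the VEV.
broken = {"T1": T[0], "T2": T[1], "T3 - Y": T[2] - Y}
broken_norms = {name: np.linalg.norm(gen @ vev) for name, gen in broken.items()}

print("Q annihilates the VEV:", q_annihilates_vev)
for name, norm in broken_norms.items():
    print(f"|{name} <phi>| = {norm:.1f} GeV")
```

Each non-zero norm corresponds to one would-be Goldstone direction, so the count (three broken, one unbroken) matches the three massive vector bosons and the massless photon.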
3.2 Gauge-boson masses (Theorem)
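Expanding $(D_\mu\phi)^\dagger(D^\mu\phi)$ about the VEV and reading off the quadratic terms in $W^a_\mu$, $B_\mu$ produces gauge-boson mass terms. Diagonalizing them defines the physical gauge bosons:
$$W^\pm_\mu = \frac{1}{\sqrt2}\left(W^1_\mu \mp i W^2_\mu\right), \qquad Z_\mu = \cos\theta_W\, W^3_\mu - \sin\theta_W\, B_\mu, \qquad A_\mu = \sin\theta_W\, W^3_\mu + \cos\theta_W\, B_\mu,$$
$$m_W = \frac{g v}{2}, \qquad m_Z = \frac{\sqrt{g^2 + g'^2}\; v}{2}, \qquad m_A = 0,$$
where the weak mixing angle $\theta_W$ (also called the Weinberg angle) is fixed by
$$\tan\theta_W = \frac{g'}{g}, \qquad \cos\theta_W = \frac{m_W}{m_Z}.$$
So the $W^\pm$ bosons get mass from $g$ (and $v$) alone; the $Z$ is a $\theta_W$-mixed combination; and the photon is the unbroken combination of $W^3_\mu$ and $B_\mu$, remaining massless to enforce $U(1)_{\text{em}}$ on the physical states. The electric charge is
$$e = g\sin\theta_W = g'\cos\theta_W,$$
connecting the electroweak couplings to the QED coupling $\alpha = e^2/4\pi$.
The three "eaten" Goldstone bosons (the directions generated by $T^1$, $T^2$, and the broken combination $T^3 - Y$ acting on $\langle\phi\rangle$) become the longitudinal polarizations of $W^\pm$ and $Z$, restoring the count of degrees of freedom: $\phi$ has 4 real components, of which 3 are eaten and 1 (the physical Higgs $h$) remains.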
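To see the tree-level relations $m_W = gv/2$, $m_Z = \sqrt{g^2 + g'^2}\,v/2$, $\tan\theta_W = g'/g$ and $e = g\sin\theta_W$ at work, here is a small sketch. The numerical values of `g`, `g_prime` and `v` are illustrative assumptions (rough values near the $Z$ pole), not fitted inputs, and radiative corrections are ignored:

```python
import math

# Illustrative inputs (assumptions): approximate couplings and VEV.
g = 0.652        # SU(2)_L gauge coupling
g_prime = 0.357  # U(1)_Y gauge coupling
v = 246.22       # Higgs VEV in GeV

theta_w = math.atan2(g_prime, g)       # tan(theta_W) = g'/g
m_w = g * v / 2                        # W mass depends on g and v only
m_z = math.hypot(g, g_prime) * v / 2   # Z mass from the mixed combination
e = g * math.sin(theta_w)              # electric charge
alpha = e**2 / (4 * math.pi)           # QED coupling at this scale

print(f"sin^2(theta_W) = {math.sin(theta_w)**2:.4f}")
print(f"m_W = {m_w:.2f} GeV, m_Z = {m_z:.2f} GeV")
print(f"1/alpha = {1 / alpha:.1f}")
```

The ratio $m_W/m_Z = \cos\theta_W$ holds identically here because both masses come from the same doublet VEV; this is the tree-level relation that the doublet choice enforces.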
3.3 Higgs boson mass (Empirical input)
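The fluctuation around the VEV, $\phi(x) = \tfrac{1}{\sqrt2}\begin{pmatrix} 0 \\ v + h(x) \end{pmatrix}$ (in unitary gauge), is a physical real scalar field $h$ with mass
$$m_h^2 = 2\lambda v^2 = 2\mu^2$$
for the potential $V(\phi) = -\mu^2\,\phi^\dagger\phi + \lambda\,(\phi^\dagger\phi)^2$.
Empirically $m_h \approx 125\,\text{GeV}$ (LHC, 2012), which fixes $\lambda \approx 0.13$ for the measured $v$. The Higgs self-couplings (cubic $\lambda v\,h^3$ and quartic $\tfrac{\lambda}{4}h^4$) are then predictions, currently being measured at the LHC.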
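The arithmetic behind the quartic coupling is short enough to spell out. This sketch assumes the normalization $V(\phi) = -\mu^2\phi^\dagger\phi + \lambda(\phi^\dagger\phi)^2$, so that $m_h^2 = 2\lambda v^2$; other normalizations of $\lambda$ differ by factors of 2 or 4. The input values are approximate:

```python
import math

m_h = 125.0   # Higgs mass in GeV (approximate measured value)
v = 246.22    # Higgs VEV in GeV

# Tree-level relations for V = -mu^2 |phi|^2 + lambda |phi|^4.
lam = m_h**2 / (2 * v**2)       # quartic coupling
mu = m_h / math.sqrt(2)         # mass parameter, from m_h^2 = 2 mu^2

# Coefficients of h^3 and h^4 in the expanded potential.
cubic_coefficient = lam * v     # multiplies h^3, in GeV
quartic_coefficient = lam / 4   # multiplies h^4, dimensionless

print(f"lambda = {lam:.3f}, mu = {mu:.1f} GeV")
print(f"h^3 coefficient = {cubic_coefficient:.1f} GeV, h^4 coefficient = {quartic_coefficient:.4f}")
```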
3.4 Fermion masses and CKM mixing (Theorem)
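The Yukawa terms become fermion mass matrices after EW symmetry breaking. For example,
$$-\bar L_L\, Y_e\, \phi\, e_R + \text{h.c.} \;\longrightarrow\; -\frac{v}{\sqrt2}\,\bar e_L\, Y_e\, e_R + \text{h.c.},$$
a general complex $3\times3$ mass matrix $M_e = Y_e\, v/\sqrt2$. Singular-value decomposition $M_e = U_L\,\mathrm{diag}(m_e, m_\mu, m_\tau)\,U_R^\dagger$ produces three non-negative singular values (the physical electron, muon, and tau masses) and rotates the flavor fields into mass eigenstates.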
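As an illustration of the diagonalization step, the sketch below applies NumPy's singular-value decomposition to a randomly generated complex matrix standing in for $Y_e$. The Yukawa entries are hypothetical random numbers, not physical values; only the structure $M = U_L\,\mathrm{diag}(m_i)\,U_R^\dagger$ is the point:

```python
import numpy as np

rng = np.random.default_rng(0)
v = 246.22  # Higgs VEV in GeV

# Hypothetical generic complex Yukawa matrix in generation space.
Y_e = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
M = Y_e * v / np.sqrt(2)

# numpy returns M = U_L @ diag(masses) @ U_R_dagger, masses sorted descending.
U_L, masses, U_R_dagger = np.linalg.svd(M)
U_R = U_R_dagger.conj().T

# Rotating left- and right-handed fields independently diagonalizes M.
M_diag = U_L.conj().T @ M @ U_R

print("singular values (masses):", np.round(masses, 2))
print("off-diagonal residual:", np.abs(M_diag - np.diag(masses)).max())
```

The singular values are real and non-negative by construction, which is why a complex, non-Hermitian mass matrix still yields real physical masses.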
In the quark sector, doing this independently for up-type and down-type yields the Cabibbo–Kobayashi–Maskawa (CKM) matrix
$$V_{\text{CKM}} = U_L^{u\,\dagger}\, U_L^{d},$$
a unitary $3\times3$ matrix that cannot be removed by field redefinitions and appears explicitly in the charged-current interactions $\tfrac{g}{\sqrt2}\,\bar u_L \gamma^\mu V_{\text{CKM}}\, d_L\, W^+_\mu + \text{h.c.}$ The CKM matrix has three real angles and one physical CP-violating phase, the sole source of CP violation in the Standard Model (modulo the QCD $\theta$-angle).
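The count "three angles and one phase" follows from a standard parameter-counting argument, sketched here for a general number of generations $n$ (the function name `ckm_parameters` is illustrative). An $n\times n$ unitary matrix has $n^2$ real parameters, of which $n(n-1)/2$ are rotation angles; of the remaining phases, $2n-1$ can be absorbed by rephasing the quark fields:

```python
def ckm_parameters(n_generations: int) -> tuple[int, int]:
    """Return (mixing angles, physical CP phases) of an n x n quark mixing matrix."""
    n = n_generations
    total_real = n * n                 # real parameters of a unitary matrix
    angles = n * (n - 1) // 2          # parameters of the orthogonal subgroup
    phases = total_real - angles       # all remaining parameters are phases
    removable = 2 * n - 1              # quark rephasings, minus one overall phase
    return angles, phases - removable

for n in (2, 3, 4):
    angles, cp_phases = ckm_parameters(n)
    print(f"{n} generations: {angles} angles, {cp_phases} CP phases")
```

With two generations there is no physical phase, which is the counting behind the Kobayashi–Maskawa argument that CP violation through quark mixing needs at least three generations.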
For leptons, an analogous Pontecorvo–Maki–Nakagawa–Sakata (PMNS) matrix governs neutrino oscillations, but it requires neutrino masses (and hence either right-handed neutrinos or a Majorana mass term) — see Caveats.
Where is the postulate vs. theorem boundary? Steps 1 and 2 (field content + gauge group) are empirical / postulate; the existence of a Higgs mechanism solving the mass problem is a theorem (it is the only known renormalizable way to give masses to the gauge bosons of a non-abelian gauge theory while preserving unitarity). What is postulated in §3 is (a) the specific scalar representation (an $SU(2)_L$ doublet with $Y = \tfrac12$, the minimal choice consistent with the $W$ / $Z$ mass relation $m_W = m_Z\cos\theta_W$) and (b) the Yukawa couplings (whose values are empirical and unexplained). The hierarchical pattern of fermion masses ($m_t \approx 173\,\text{GeV}$ down to $m_e \approx 0.5\,\text{MeV}$) is one of the major open puzzles; see Caveats.
Step 4 — Quantize (Standard machinery)
The fields are promoted to operators following the standard QFT path-integral machinery. Two wrinkles relative to QED:
- Non-abelian gauge fixing requires Faddeev–Popov ghosts in covariant gauges, as in QCD.
- $R_\xi$ gauges for spontaneously broken theories ('t Hooft, 1971) add a gauge-fixing term that cancels the quadratic mixing between the gauge fields and the would-be Goldstone bosons, which makes renormalizability manifest by power counting. The would-be Goldstones acquire $\xi$-dependent masses and ghosts appear for both the $SU(2)_L$ and $U(1)_Y$ factors.
In unitary gauge ($\xi \to \infty$), the would-be Goldstones decouple and the propagators of $V = W^\pm, Z$ become
$$\frac{-i}{k^2 - m_V^2 + i\epsilon}\left(g_{\mu\nu} - \frac{k_\mu k_\nu}{m_V^2}\right),$$
manifestly massive. In Feynman–'t Hooft gauge ($\xi = 1$) the propagators are simpler:
$$\frac{-i\,g_{\mu\nu}}{k^2 - m_V^2 + i\epsilon},$$
at the cost of carrying around explicit Goldstone bosons and ghost fields.
The renormalizability of the spontaneously broken non-abelian gauge theory was proved by 't Hooft and Veltman (1971–72), Nobel Prize 1999. The proof requires the BRST formalism and the unitarity of the $S$-matrix on the physical Hilbert space (Goldstones + ghosts cancel against unphysical gauge-boson polarizations).
Step 5 — Renormalize (Standard machinery)
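The electroweak theory is renormalizable in any gauge. Its $\beta$-functions for the two gauge couplings are, at one loop:
$$\beta(g) = -\frac{g^3}{16\pi^2}\left(\frac{22}{3} - \frac{4}{3}n_g - \frac{1}{6}n_H\right), \qquad \beta(g') = +\frac{g'^3}{16\pi^2}\left(\frac{20}{9}n_g + \frac{1}{6}n_H\right),$$
with $n_g$ the number of generations (quarks and leptons) and $n_H$ the number of Higgs doublets. For the SM ($n_g = 3$, $n_H = 1$), $\beta(g) = -\tfrac{19}{6}\,\tfrac{g^3}{16\pi^2} < 0$ (asymptotic freedom for $SU(2)_L$, like QCD) and $\beta(g') = +\tfrac{41}{6}\,\tfrac{g'^3}{16\pi^2} > 0$ (a Landau pole at very high energies for $U(1)_Y$, like QED, though far above the Planck scale and so phenomenologically irrelevant).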
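The one-loop coefficients can be recomputed from the field content. The sketch below assumes the normalization $\beta(g) = b\,g^3/(16\pi^2)$, hypercharges in the convention $Q = T^3 + Y$, and the standard one-loop weights ($-\tfrac{11}{3}C_2(G)$ for gauge bosons, $\tfrac23$ per Weyl fermion, $\tfrac13$ per complex scalar); the table `WEYL_FERMIONS` and the function names are illustrative:

```python
from fractions import Fraction as F

# One generation of Weyl fermions: (name, color dimension, SU(2) dimension, hypercharge).
WEYL_FERMIONS = [
    ("Q_L", 3, 2, F(1, 6)),
    ("u_R", 3, 1, F(2, 3)),
    ("d_R", 3, 1, F(-1, 3)),
    ("L_L", 1, 2, F(-1, 2)),
    ("e_R", 1, 1, F(-1)),
]
HIGGS = ("phi", 1, 2, F(1, 2))  # complex scalar doublet
T_DOUBLET = F(1, 2)             # Dynkin index of the SU(2) doublet

def b_su2(n_g: int, n_h: int) -> F:
    """One-loop coefficient for SU(2)_L, with C2(G) = 2."""
    weyl_doublets = sum(color for _, color, su2, _ in WEYL_FERMIONS if su2 == 2)
    gauge = F(-11, 3) * 2
    fermions = n_g * F(2, 3) * T_DOUBLET * weyl_doublets
    scalars = n_h * F(1, 3) * T_DOUBLET * HIGGS[1]
    return gauge + fermions + scalars

def b_u1(n_g: int, n_h: int) -> F:
    """One-loop coefficient for U(1)_Y (no gauge self-interaction term)."""
    fermion_y2 = sum(color * su2 * y * y for _, color, su2, y in WEYL_FERMIONS)
    scalar_y2 = HIGGS[1] * HIGGS[2] * HIGGS[3] ** 2
    return n_g * F(2, 3) * fermion_y2 + n_h * F(1, 3) * scalar_y2

print("b(SU(2)_L) =", b_su2(3, 1))
print("b(U(1)_Y)  =", b_u1(3, 1))
```

The sign difference is structural: only the non-abelian factor has the negative gauge-boson contribution, so $U(1)_Y$ can never be asymptotically free with charged matter present.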
The renormalized parameters are usually traded for measured observables: the most common modern scheme is
$$\{\,G_F,\ \alpha(m_Z),\ m_Z\,\},$$
with $G_F$ the Fermi constant defined from muon decay and $\alpha(m_Z)$ the running fine-structure constant at the $Z$-pole.
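As a rough illustration of how this input set determines the Lagrangian parameters, the sketch below uses the tree-level relations $v = (\sqrt2\,G_F)^{-1/2}$ and $\sin^2\theta_W\cos^2\theta_W = \pi\alpha/(\sqrt2\,G_F m_Z^2)$. The numerical inputs are approximate values quoted from memory and should be treated as assumptions; loop corrections, which shift $m_W$ by several hundred MeV, are ignored:

```python
import math

# Approximate inputs (assumptions for illustration).
G_F = 1.1663787e-5      # Fermi constant in GeV^-2
ALPHA_MZ = 1 / 127.95   # running fine-structure constant at the Z pole
M_Z = 91.1876           # Z mass in GeV

# Higgs VEV from muon decay: G_F / sqrt(2) = 1 / (2 v^2).
v = (math.sqrt(2) * G_F) ** -0.5

# Tree-level mixing angle: solve s^2 (1 - s^2) = a for the smaller root.
a = math.pi * ALPHA_MZ / (math.sqrt(2) * G_F * M_Z**2)
sin2_theta_w = (1 - math.sqrt(1 - 4 * a)) / 2

# Derived tree-level quantities.
m_w = M_Z * math.sqrt(1 - sin2_theta_w)
g = 2 * m_w / v
g_prime = g * math.sqrt(sin2_theta_w / (1 - sin2_theta_w))

print(f"v = {v:.2f} GeV, sin^2(theta_W) = {sin2_theta_w:.4f}")
print(f"m_W (tree) = {m_w:.2f} GeV, g = {g:.3f}, g' = {g_prime:.3f}")
```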
Step 6 — Predictions (Standard machinery; specializes QFT Postulate 10)
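The Feynman rules follow from $\mathcal{L}_{\text{EW}}$ in any gauge. The new ingredients relative to QED/QCD:
| Object | Feynman rule / comment |
|---|---|
| $W^\pm$ propagator | massive vector, as above; accompanied by would-be Goldstone and ghost propagators in $R_\xi$ gauges |
| $Z$ propagator | same |
| Higgs propagator | $\dfrac{i}{k^2 - m_h^2 + i\epsilon}$ |
| Charged-current vertex | $\dfrac{ig}{\sqrt2}\,\gamma^\mu P_L$ with $P_L = \tfrac12(1 - \gamma^5)$ (times $V_{\text{CKM}}$ for quarks) |
| Neutral-current vertex | $\dfrac{ig}{2\cos\theta_W}\,\gamma^\mu\left(g_V^f - g_A^f\gamma^5\right)$ with $f$-dependent $g_V^f = T^3_f - 2Q_f\sin^2\theta_W$, $g_A^f = T^3_f$ |
| Higgs–fermion vertex | $-i\,m_f/v$ (coupling proportional to fermion mass, the SM "smoking gun") |
| Triple-gauge $WW\gamma$, $WWZ$ | from $-\tfrac14 W^a_{\mu\nu}W^{a\,\mu\nu}$ |
| Quartic gauge $WWWW$, $WWZZ$, $WWZ\gamma$, $WW\gamma\gamma$ | from $-\tfrac14 W^a_{\mu\nu}W^{a\,\mu\nu}$ |
| Higgs self-couplings $h^3$, $h^4$ | from $V(\phi)$ |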
Cross sections and decay rates follow from the standard machinery of cross-sections.md and decay-rates.md.
Comparison with QED and QCD
| Aspect | QED | QCD | Electroweak (this doc) |
|---|---|---|---|
| Gauge group | $U(1)_{\text{em}}$, abelian | $SU(3)_c$, non-abelian | $SU(2)_L \times U(1)_Y$, broken to $U(1)_{\text{em}}$ |
| Matter | Dirac (vector-like) | Dirac (vector-like in color) | Chiral: $\psi_L$, $\psi_R$ in different reps |
| Gauge bosons | 1 massless photon | 8 massless gluons | 4 bosons: $W^\pm$, $Z$ (massive), $\gamma$ (massless) |
| Gauge-boson self-interaction | none | 3g, 4g vertices | $WW\gamma$, $WWZ$, etc. |
| Mass generation | direct $m\bar\psi\psi$ allowed | direct $m\bar\psi\psi$ allowed | forbidden by gauge invariance; Yukawa + Higgs VEV required |
| Higgs sector | none | none | one complex doublet $\phi$, $v \approx 246$ GeV |
| Parity, $CP$ | conserved | conserved | $P$ and $C$ violated maximally; $CP$ violated by CKM phase |
| Asymptotic states | electrons, photons | hadrons (P10b modified) | leptons, hadrons, $W^\pm$, $Z$, $h$ (massive; finite-lifetime resonances), $\gamma$ |
| Beta function (gauge) | $\beta > 0$ (Landau pole) | $\beta < 0$ (asymptotic freedom) | $\beta < 0$ ($SU(2)_L$); $\beta > 0$ ($U(1)_Y$) |
| Free parameters | $e$, $m_e$ | $g_s$, quark masses, $\theta$ | $g$, $g'$, $v$, $\lambda$, plus 9 fermion masses + 4 CKM parameters for three generations (PMNS adds more for $m_\nu \neq 0$) |
So electroweak adds qualitatively new content to the QED/QCD template:
- Chiral fermions: the first place left and right components live in inequivalent gauge representations.
- Spontaneous symmetry breaking: the empirical input that foundations-modern §5.2 flagged as not derivable from M1+M2+M3.
- Mass mixing (CKM, PMNS): the source of all flavor-physics phenomenology.
- Discrete-symmetry violation: $P$, $C$, and $CP$ are all violated.
Successes and Tested Predictions
Full inventory of electroweak observables — what is measured, where, and how it overlaps with QED/QCD/SM — lives in observables/electroweak.md. This section lists historical and structural highlights only.
- Predicted the $W^\pm$ and $Z$ bosons with masses $m_W \approx 80$ GeV, $m_Z \approx 91$ GeV before they were directly discovered (UA1/UA2 at CERN, 1983; Nobel Prize 1984).
- Predicted neutral-current weak interactions (Gargamelle, CERN, 1973) before any direct detection of weak neutral currents.
- Predicted the Higgs boson as a necessary consequence of the renormalizable Higgs mechanism (Higgs / Englert / Brout, 1964); discovered by ATLAS and CMS at the LHC in 2012 at $m_h \approx 125$ GeV, Nobel Prize 2013.
- Precision tests at LEP, SLC, and the Tevatron: $Z$-pole observables (line-shape, asymmetries, partial widths) measured to per-mille accuracy and consistent with electroweak global fits, severely constraining new-physics scenarios.
- CKM unitarity tests ($|V_{ud}|^2 + |V_{us}|^2 + |V_{ub}|^2 = 1$ etc.) all consistent at roughly the $10^{-3}$ level.
- Anomalous magnetic moment of the muon receives an electroweak contribution from $W$ and $Z$ loops, computed to high order and required for agreement with experiment.
- The CP-violating phase in the CKM matrix correctly accounts for the observed CP violation in $K$- and $B$-meson systems.
Caveats and Open Issues
- No rigorous mathematical construction in $d = 4$. As with QED and QCD, the renormalizable perturbative theory is well-defined formally but lacks a rigorous Wightman / Haag–Kastler construction.
- The hierarchy / naturalness problem. The Higgs mass receives quadratically divergent radiative corrections, $\delta m_h^2 \sim \Lambda^2/(16\pi^2)$. Why $m_h$ remains so small compared to any plausible UV cutoff $\Lambda$ (Planck scale, GUT scale) is unexplained. The standard proposals (supersymmetry, composite Higgs, large extra dimensions, anthropic / multiverse) are all so far disfavored or unconfirmed by LHC data.
- Neutrino masses. The SM as stated above has massless neutrinos, but solar / atmospheric / reactor oscillation experiments (Super-K, SNO, KamLAND, T2K, NOvA, ...) show neutrinos have small but non-zero masses ($m_\nu \lesssim 1\,\text{eV}$). Generating them requires either adding right-handed neutrinos (Dirac masses, with anomalously small Yukawas) or a dim-5 Weinberg operator (Majorana masses, signalling new physics at a scale $\Lambda \sim v^2/m_\nu$). The SM as a renormalizable theory does not incorporate either; this is the cleanest evidence for physics beyond the SM at any scale.
- The flavor / Yukawa hierarchy. The 13 flavor parameters (9 charged-fermion masses + 4 CKM parameters) have no SM explanation, and the masses alone span nearly six orders of magnitude. Proposals include Froggatt–Nielsen mechanisms, family symmetries, and extra dimensions; none is confirmed.
- The strong CP problem in QCD (mentioned in QCD/from-postulates.md) is not solved by electroweak unification — it remains an outstanding puzzle.
- Anomaly cancellation appears miraculous within EW alone. The hypercharge assignments are tuned so that the $[SU(2)_L]^2\,U(1)_Y$, $[U(1)_Y]^3$, and mixed gravitational-gauge anomalies all cancel per generation. This is a strong hint that quarks and leptons should be unified in larger representations, which is the motivation for grand unified theories. The full anomaly bookkeeping is collected in the Standard Model doc.
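For the neutrino-mass caveat above, the scale suggested by the Weinberg operator is a one-line estimate, $\Lambda \sim v^2/m_\nu$. The sketch below assumes an order-one Wilson coefficient and an illustrative mass $m_\nu = 0.05\,\text{eV}$ (roughly the square root of the atmospheric mass-squared splitting); both are assumptions, and the result is an order-of-magnitude statement only:

```python
v = 246.22          # Higgs VEV in GeV
m_nu_ev = 0.05      # illustrative neutrino mass in eV (assumption)
EV_PER_GEV = 1e9    # unit conversion

m_nu_gev = m_nu_ev / EV_PER_GEV

# Dimension-5 operator (L phi)(L phi) / Lambda gives m_nu ~ v^2 / Lambda
# for an order-one coefficient.
seesaw_scale_gev = v**2 / m_nu_gev

print(f"Lambda ~ {seesaw_scale_gev:.1e} GeV")
```

The estimate lands far above the electroweak scale and within a few orders of magnitude of typical grand-unification scales, which is why the Majorana option is usually read as a hint of very heavy new physics.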
See Also
- QED: the abelian $U(1)_{\text{em}}$ that survives after electroweak symmetry breaking.
- QCD: the non-abelian $SU(3)_c$ gauge theory of the strong interaction, the third pillar of the SM.
- Standard Model: combines EW with QCD into the full theory.
- Modern Foundations: for the derivation of the QFT framework EW specializes, including the §5.2 Generalization callout pointing here for the Higgs mechanism.
- Group Theory: for $SU(2)$, $U(1)$, Lie algebras, and spontaneous symmetry breaking.
The Standard Model
The Standard Model (SM) of particle physics is the relativistic quantum field theory that combines the gauge theories of the previous documents, QCD and electroweak theory (the latter unifying QED with the weak interaction), into a single $SU(3)_c \times SU(2)_L \times U(1)_Y$ gauge theory. It is the most precisely tested theory of nature ever constructed, accounting for essentially every laboratory-scale particle-physics observation made to date, with neutrino masses the main exception (see Caveats).
The SM is the union of QCD and EW plus a small number of genuinely new ingredients that only make sense across the two sectors:
- Three generations of fermions with identical gauge quantum numbers.
- Anomaly cancellation across quark and lepton sectors — the central reason hypercharge assignments take the specific values they do.
- Asymmetric coupling of the same Higgs to all charged fermions via the Yukawa sector, with masses spanning six orders of magnitude.
- Two distinct CP problems (CKM phase vs. QCD $\theta$-angle) that the unification does not relate.
This document collects the cross-sector content. For the individual gauge factors, see the parent docs.
Companion presentations.
- Modern Foundations — derives the QFT framework all SM constituents specialize.
- QED, QCD, Electroweak — the three constituent gauge theories.
Following the tag convention of foundations-modern.md, each step below is labelled (Empirical input), (Postulate), (Theorem), or (Standard machinery).
Derivation of the SM from QFT
The SM inherits all QFT postulates and is obtained by combining the field content and gauge symmetries of QCD and EW. The "new" ingredients are constraints that link the two sectors.
Step 1 — Specify the Field Content (Empirical input; specializes QFT Postulate 4)
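The SM postulates a single Higgs doublet and three generations of chiral fermions, each generation a copy of the EW pattern with QCD color attached. Writing $(SU(3)_c, SU(2)_L)_Y$ representations, with hypercharge in the convention $Q = T^3 + Y$:
| Field (one generation) | Multiplet | Generations |
|---|---|---|
| Left-handed quark doublet $Q_L = (u_L, d_L)$ | $(\mathbf{3}, \mathbf{2})_{+1/6}$ | 3 |
| Right-handed up-quark $u_R$ | $(\mathbf{3}, \mathbf{1})_{+2/3}$ | 3 |
| Right-handed down-quark $d_R$ | $(\mathbf{3}, \mathbf{1})_{-1/3}$ | 3 |
| Left-handed lepton doublet $L_L = (\nu_L, e_L)$ | $(\mathbf{1}, \mathbf{2})_{-1/2}$ | 3 |
| Right-handed charged lepton $e_R$ | $(\mathbf{1}, \mathbf{1})_{-1}$ | 3 |
| Higgs doublet $\phi$ | $(\mathbf{1}, \mathbf{2})_{+1/2}$ | one |
| Gluons $G^A_\mu$, $A = 1, \dots, 8$ | $(\mathbf{8}, \mathbf{1})_{0}$ | one set |
| EW gauge bosons $W^a_\mu$, $a = 1, 2, 3$ | $(\mathbf{1}, \mathbf{3})_{0}$ | one set |
| $U(1)_Y$ gauge boson $B_\mu$ | $(\mathbf{1}, \mathbf{1})_{0}$ | one set |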
(Right-handed neutrinos are not part of the original SM; they are added in extensions to give neutrinos Dirac masses — see Caveats.)
The three-generation pattern is an empirical input. The SM neither predicts the number of generations nor explains why it is exactly three; the constraint from the $Z$-boson invisible width at LEP rules out a fourth generation with a light neutrino, but says nothing about why there should be three rather than one.
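The invisible-width argument is simple arithmetic: divide the measured invisible width of the $Z$ by the predicted partial width into one neutrino species. The sketch below uses the tree-level formula $\Gamma(Z \to \nu\bar\nu) = G_F m_Z^3/(12\sqrt2\,\pi)$ and an invisible width of about $499\,\text{MeV}$; both numerical inputs are approximate values quoted from memory and should be treated as assumptions, and the real LEP analysis uses ratios of widths with radiative corrections:

```python
import math

G_F = 1.1663787e-5        # Fermi constant in GeV^-2 (approximate)
M_Z = 91.1876             # Z mass in GeV (approximate)
GAMMA_INVISIBLE = 0.4990  # measured invisible Z width in GeV (assumed input)

# Tree-level partial width into a single light neutrino species.
gamma_nu = G_F * M_Z**3 / (12 * math.sqrt(2) * math.pi)

# Effective number of light neutrino species.
n_nu = GAMMA_INVISIBLE / gamma_nu

print(f"Gamma(Z -> nu nubar) = {1000 * gamma_nu:.1f} MeV per species")
print(f"N_nu = {n_nu:.2f}")
```

Only neutrinos lighter than $m_Z/2$ contribute, which is why the constraint applies to light generations rather than to heavy sequential fermions in general.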
Step 2 — Gauge Symmetry (Postulate)
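The SM gauge group is
$$G_{\text{SM}} = SU(3)_c \times SU(2)_L \times U(1)_Y,$$
with three independent gauge couplings $g_s$, $g$, $g'$. The covariant derivative on any field in representation $(R_3, R_2)_Y$ is
$$D_\mu = \partial_\mu - i g_s\, G^A_\mu T^A_{R_3} - i g\, W^a_\mu T^a_{R_2} - i g'\, Y B_\mu,$$
where the generators $T^A_{R_3}$ and $T^a_{R_2}$ vanish on singlets.
After electroweak symmetry breaking the residual gauge group is $SU(3)_c \times U(1)_{\text{em}}$.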
Step 3 — The SM Lagrangian (Theorem; specializes QFT Postulate 9)
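The most general renormalizable, $G_{\text{SM}}$-gauge-invariant, Lorentz-invariant, $CPT$-invariant Lagrangian built from the Step 1 fields is
$$\mathcal{L}_{\text{SM}} = \mathcal{L}_{\text{gauge}} + \mathcal{L}_{\text{fermion}} + \mathcal{L}_{\text{Higgs}} + \mathcal{L}_{\text{Yukawa}} + \mathcal{L}_{\theta},$$
where
- $\mathcal{L}_{\text{gauge}} = -\tfrac14 G^A_{\mu\nu}G^{A\,\mu\nu} - \tfrac14 W^a_{\mu\nu}W^{a\,\mu\nu} - \tfrac14 B_{\mu\nu}B^{\mu\nu}$,
- $\mathcal{L}_{\text{fermion}} = \sum_\psi \bar\psi\, i\gamma^\mu D_\mu \psi$, summed over all chiral multiplets and all three generations,
- $\mathcal{L}_{\text{Higgs}} = (D_\mu\phi)^\dagger(D^\mu\phi) - V(\phi)$ with $V(\phi)$ the EW potential,
- $\mathcal{L}_{\text{Yukawa}}$ is the generation-mixing Yukawa structure from EW Step 3.4,
- $\mathcal{L}_{\theta} = \dfrac{\theta\, g_s^2}{32\pi^2}\, G^A_{\mu\nu}\tilde G^{A\,\mu\nu}$ (the QCD $\theta$-term; see QCD Step 3).
A direct $U(1)_Y$ counterpart $B_{\mu\nu}\tilde B^{\mu\nu}$ exists but, for an abelian theory in 4D, is a total derivative with no instanton sectors and so does not contribute. The non-abelian $SU(2)_L$ term $W^a_{\mu\nu}\tilde W^{a\,\mu\nu}$ would be physical, but the $SU(2)_L$ vacuum angle is unobservable because of the chiral nature of the SM fermions (it can be rotated away by a baryon-number redefinition of the fermion fields, which is anomalous under $SU(2)_L$; see Reviews).
The SM has 19 free parameters that must be measured (assuming massless neutrinos): 3 gauge couplings, 2 Higgs parameters ($\mu^2$, $\lambda$, or equivalently $v$, $m_h$), 9 fermion masses, 4 CKM parameters (3 angles + 1 phase), and the QCD $\theta$-angle. With non-zero neutrino masses one adds at least 7 more (3 masses + 4 PMNS parameters, with possible extra Majorana phases).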
Step 4 — Quantize (Standard machinery)
The quantization machinery is the union of QCD's and EW's: path integral with Faddeev–Popov ghosts for both $SU(3)_c$ and $SU(2)_L$, $R_\xi$ gauges for the spontaneously broken EW sector. Renormalizability of the combined theory was established by the same 't Hooft–Veltman analysis that handled EW alone.
Step 5 — Renormalize (Standard machinery)
All three gauge couplings run. Their measured low-energy values and the SM $\beta$-functions extrapolate so that the couplings nearly, but not exactly, meet somewhere between about $10^{13}$ and $10^{17}\,\text{GeV}$. This almost-unification is one of the main historical motivations for grand unified theories and for supersymmetry (which makes them meet much more precisely).
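The near-miss can be made concrete with one-loop running, $\alpha_i^{-1}(\mu) = \alpha_i^{-1}(m_Z) - \tfrac{b_i}{2\pi}\ln(\mu/m_Z)$. The sketch below assumes the SM one-loop coefficients $(b_1, b_2, b_3) = (41/10, -19/6, -7)$ with the GUT normalization $\alpha_1 = \tfrac53\alpha_Y$, and rough input values for $\alpha_i^{-1}(m_Z)$; the inputs are approximate assumptions, and thresholds and higher loops are ignored:

```python
import math
from itertools import combinations

M_Z = 91.19  # GeV

# Approximate inverse couplings at m_Z (assumed inputs, GUT-normalized U(1)).
INV_ALPHA_MZ = {"U(1)_Y": 59.0, "SU(2)_L": 29.6, "SU(3)_c": 8.5}

# One-loop SM coefficients, beta(g) = b g^3 / (16 pi^2).
B = {"U(1)_Y": 41 / 10, "SU(2)_L": -19 / 6, "SU(3)_c": -7.0}

def inv_alpha(group: str, mu: float) -> float:
    """One-loop inverse coupling of `group` at scale mu (GeV)."""
    return INV_ALPHA_MZ[group] - B[group] / (2 * math.pi) * math.log(mu / M_Z)

def crossing_scale(a: str, b: str) -> float:
    """Scale (GeV) at which the one-loop couplings of groups a and b are equal."""
    t = 2 * math.pi * (INV_ALPHA_MZ[a] - INV_ALPHA_MZ[b]) / (B[a] - B[b])
    return M_Z * math.exp(t)

crossings = {pair: crossing_scale(*pair) for pair in combinations(B, 2)}
for (a, b), mu in crossings.items():
    print(f"{a} meets {b} at mu ~ {mu:.1e} GeV")
```

If the three couplings unified exactly, the three pairwise crossing scales would coincide; the spread between them is a direct measure of how far the SM alone misses.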
Step 6 — Predictions (Standard machinery; specializes QFT Postulate 10)
The Feynman rules are the union of QED/QCD/EW; all cross-sector predictions (e.g. flavor-changing neutral currents constrained by the GIM mechanism, electroweak corrections to QCD observables and vice versa) follow from the standard machinery.
Cross-Sector Content (What Combining QCD + EW Buys Us)
The substantive content of the SM as a separate document, beyond its constituents, is the following.
A. Anomaly cancellation (Theorem)
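Gauge invariance must survive quantization. A chiral gauge theory generically suffers triangle anomalies: fermion-loop diagrams with three gauge currents on the legs produce $\partial_\mu J^\mu \neq 0$, violating the gauge symmetry at the quantum level. For the SM to be consistent the anomaly coefficient $\mathrm{tr}\,[T^a\{T^b, T^c\}]$ (left-handed minus right-handed fermions) must cancel when summed over all chiral fermions in the theory.
The relevant anomalies are:
| Anomaly | Contribution from one generation | Comment |
|---|---|---|
| $[SU(3)_c]^3$ | 0 | Vector-like in color (left and right quarks both in $\mathbf{3}$) |
| $[SU(3)_c]^2\,U(1)_Y$ | $2\cdot\tfrac16 - \tfrac23 + \tfrac13 = 0$ | Cancels |
| $[SU(2)_L]^3$ | 0 | $SU(2)$ reps are self-conjugate |
| $[SU(2)_L]^2\,U(1)_Y$ | $3\cdot\tfrac16 - \tfrac12 = 0$ | Quark contribution exactly cancels lepton contribution |
| $[U(1)_Y]^3$ | $6\left(\tfrac16\right)^3 + 2\left(-\tfrac12\right)^3 - 3\left(\tfrac23\right)^3 - 3\left(-\tfrac13\right)^3 - (-1)^3 = 0$ | Cancels (long computation; depends on factor of 3 from color) |
| $[\text{grav}]^2\,U(1)_Y$ | $6\cdot\tfrac16 + 2\left(-\tfrac12\right) - 3\cdot\tfrac23 - 3\left(-\tfrac13\right) - (-1) = 0$ | Mixed gauge-gravitational anomaly |
The factor of 3 in $[SU(2)_L]^2\,U(1)_Y$ from color combined with the matching hypercharges is what makes quarks and leptons "fit together". This is the strongest hint inside the SM that quarks and leptons are not independent: they belong in larger multiplets of some unified gauge group (the original motivation for $SU(5)$ and $SO(10)$ GUTs, where one generation fits into a $\bar{\mathbf{5}} \oplus \mathbf{10}$ of $SU(5)$ or a single $\mathbf{16}$ of $SO(10)$).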
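The cancellations in the table can be verified with exact rational arithmetic. The sketch below assumes hypercharges in the convention $Q = T^3 + Y$ and counts left-handed fields with sign $+1$ and right-handed fields with sign $-1$; the table `FIELDS` and the `n_colors` parameter are illustrative, the latter included to show what happens if the color factor is changed:

```python
from fractions import Fraction as F

# One generation: (name, is_colored, SU(2) dimension, hypercharge, chirality sign).
FIELDS = [
    ("Q_L", True, 2, F(1, 6), +1),
    ("u_R", True, 1, F(2, 3), -1),
    ("d_R", True, 1, F(-1, 3), -1),
    ("L_L", False, 2, F(-1, 2), +1),
    ("e_R", False, 1, F(-1), -1),
]

def anomalies(n_colors: int = 3) -> dict[str, F]:
    """Hypercharge-dependent anomaly coefficients for one generation."""
    result = {"SU(3)^2 U(1)": F(0), "SU(2)^2 U(1)": F(0), "U(1)^3": F(0), "grav^2 U(1)": F(0)}
    for _, colored, su2_dim, y, chirality in FIELDS:
        color_dim = n_colors if colored else 1
        if colored:
            result["SU(3)^2 U(1)"] += chirality * su2_dim * y
        if su2_dim == 2:
            result["SU(2)^2 U(1)"] += chirality * color_dim * y
        result["U(1)^3"] += chirality * color_dim * su2_dim * y**3
        result["grav^2 U(1)"] += chirality * color_dim * su2_dim * y
    return result

print("3 colors:", anomalies(3))
print("1 color: ", anomalies(1))
```

With a single color the $[SU(2)_L]^2\,U(1)_Y$ and $[U(1)_Y]^3$ sums no longer vanish, which makes the dependence on the color factor explicit.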
Why this matters. Anomaly cancellation, together with gauge-invariant Yukawa couplings, tightly constrains the hypercharge assignments listed in EW Step 1; they are not free phenomenological inputs. The deeper question (why exactly this charge assignment, and why quarks and leptons complement each other so precisely) is the central mystery the SM hands forward to BSM physics.
B. Generations and GIM (Theorem)
With three generations and a single Higgs doublet, the Yukawa matrices are complex $3\times3$. After diagonalization (EW Step 3.4) the charged-current weak interactions carry the CKM matrix; the neutral-current weak interactions and the Higgs couplings, in contrast, remain diagonal in flavor at tree level. This is the GIM mechanism (Glashow–Iliopoulos–Maiani, 1970), which is why flavor-changing neutral currents (FCNCs), e.g. $K_L \to \mu^+\mu^-$, are dramatically suppressed in nature.
The GIM mechanism required the existence of the charm quark to cancel the FCNC contribution from the up quark; this was a successful pre-discovery prediction (charm found at SLAC and Brookhaven, 1974). The same logic applied to the down sector, together with the measured $CP$ violation in $K$ decays, led Kobayashi and Maskawa (1973) to predict a third generation; the bottom (1977) and top (1995) quarks confirmed it.
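The tree-level statement is pure linear algebra: a flavor-universal neutral current is proportional to the identity in generation space, so a unitary change of basis leaves it diagonal, while the charged current connects two independently rotated sectors. The sketch below uses hypothetical random unitary matrices for the left-handed rotations `U_u` and `U_d`; nothing here is a physical CKM value:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_unitary(n: int) -> np.ndarray:
    """Unitary matrix from the QR decomposition of a random complex matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

# Hypothetical left-handed rotations to the mass basis.
U_u = random_unitary(3)
U_d = random_unitary(3)

# Neutral current: d_L-bar gamma^mu d_L -> d_L-bar (U_d^dagger U_d) gamma^mu d_L.
neutral_current = U_d.conj().T @ U_d

# Charged current: u_L-bar gamma^mu d_L -> u_L-bar (U_u^dagger U_d) gamma^mu d_L.
V = U_u.conj().T @ U_d

print("neutral current flavor-diagonal:", np.allclose(neutral_current, np.eye(3)))
print("charged-current mixing |V|:\n", np.round(np.abs(V), 3))
```

The cancellation relies on every generation having the same gauge quantum numbers; a fermion with a different $Z$ coupling would make the neutral-current matrix non-universal and reintroduce tree-level FCNCs.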
C. Two CP problems, one solved, one not (Postulate)
The SM has two independent sources of $CP$ violation:
| Source | Status |
|---|---|
| CKM phase $\delta$ | $\delta \sim \mathcal{O}(1)$, large. Accounts for the observed $CP$ violation in $K$ and $B$ mesons. |
| QCD $\theta$-angle | $\lvert\theta\rvert \lesssim 10^{-10}$ from neutron EDM bound. Unexplained smallness: the strong CP problem (see QCD Caveats). |
The two are independent — the CKM phase does not "solve" the strong CP problem, and they cannot be rotated into each other without violating gauge invariance. This is one of the cleanest indications that the SM is incomplete.
D. Accidental symmetries (Theorem)
Renormalizable $G_{\text{SM}}$-invariance happens to automatically preserve four global $U(1)$ symmetries:
- $U(1)_B$: baryon number (each quark $B = \tfrac13$, leptons $B = 0$).
- $U(1)_{L_e}$, $U(1)_{L_\mu}$, $U(1)_{L_\tau}$: three separate lepton-flavor numbers.
These are accidental: they were not postulated; they emerge as consequences of $G_{\text{SM}}$-invariance + renormalizability + the absence of $\nu_R$. They are broken by:
- Non-perturbative electroweak instantons ('t Hooft, 1976): $B + L$ is anomalously broken; $B - L$ is exactly conserved. Practically irrelevant at low energy (rate $\propto e^{-16\pi^2/g^2}$), but important at very high temperatures and central to baryogenesis scenarios.
- Neutrino masses (Majorana): break $L$ explicitly; lepton flavor mixed via PMNS.
- Beyond-SM physics (proton decay): predicted by GUTs at rate $\Gamma_p \sim \alpha_{\text{GUT}}^2\, m_p^5 / M_{\text{GUT}}^4$, not observed.
So conservation of baryon number is not a SM postulate — it is a derived approximate symmetry. Lepton-flavor universality (electrons / muons / taus interact with the same gauge couplings) is a postulate, embedded in the choice of identical gauge multiplets across generations.
Comparison: SM as a Whole vs. its Constituents
| Aspect | Combined SM (this doc) | Sum of QED + QCD + EW alone |
|---|---|---|
| Gauge group | $SU(3)_c \times SU(2)_L \times U(1)_Y$ | Same |
| Free parameters (massless $\nu$) | 19 | Same |
| Hypercharge assignments | Fixed by anomaly cancellation | Would appear free |
| FCNCs | Suppressed by GIM | Would be unconstrained |
| 3rd generation | Required for $CP$ violation by KM | Two would suffice for masses |
| $B$, $L$ conservation | Accidental and exact in perturbation theory; non-perturbatively broken | Would need to be postulated |
| CP-violation sources | CKM phase ✓; strong $\theta$ ✗ (unexplained) | Same; not unified |
So the cross-sector content the SM doc captures, beyond its parts, is exactly the four items in §Cross-Sector Content above: anomaly cancellation, GIM/generations, the two CP problems, and accidental symmetries.
Successes and Tested Predictions
Full inventory of SM cross-sector observables — CKM unitarity triangle, global EW fit, GIM mechanism, lepton-universality tests, cosmological constraints — lives in observables/standard-model.md. Sector-specific observables live in observables/electroweak.md, with QED/QCD per-theory inventories noted as a future gap in observables/README.md. This section lists historical and structural highlights only.
The SM is the most precisely tested theory in physics. Highlights:
- Anomalous magnetic moment of the electron $a_e$: agrees with theory to better than one part in $10^{9}$ (when combined with the most precise $\alpha$ measurement). Includes electroweak loop corrections.
- $Z$-pole observables at LEP/SLC: line-shape, partial widths, forward-backward asymmetries all match SM predictions at the per-mille level.
- CKM-matrix unitarity tests: $\sum_j |V_{ij}|^2 = 1$ at roughly the $10^{-3}$ level; CP violation in $B$-mesons consistent with a single KM phase.
- Higgs discovery at $m_h \approx 125$ GeV (LHC, 2012); production and decay rates match SM predictions in every measured channel to within roughly 10–20%.
- Asymptotic freedom of QCD confirmed across measurements spanning momentum transfers from $\sim 1\,\text{GeV}$ to beyond $1\,\text{TeV}$.
- Lattice QCD computations of the light hadron spectrum agree with experiment at the few-percent level.
- Neutral-current discovery (Gargamelle, 1973): predicted by EW before observation.
- Numerous particles predicted in advance of their discovery: $W^\pm$, $Z$, the charm and top quarks, the Higgs boson, the gluon (from 3-jet events at PETRA).
Caveats and Open Issues
The SM, despite its empirical success, leaves known gaps:
- Gravity. The SM contains no graviton and no quantization of general relativity. The fundamental obstruction is non-renormalizability of perturbatively quantized Einstein gravity in $d = 4$. The SM is an effective theory below the Planck scale $M_{\text{Pl}} \sim 10^{19}\,\text{GeV}$.
- Neutrino masses. Already covered in EW Caveats. The cleanest evidence for BSM physics at any scale.
- Dark matter. Astronomical evidence (rotation curves, cluster dynamics, CMB anisotropies, large-scale structure) requires $\sim 27\%$ of cosmic energy density to be in a non-luminous, non-baryonic, cold component. No SM particle fits; candidates (WIMPs, axions, sterile neutrinos, primordial black holes) are all BSM.
- Dark energy. $\sim 68\%$ of cosmic energy density behaves like a cosmological constant $\Lambda$. The SM vacuum energy is many orders of magnitude larger than the observed $\rho_\Lambda$ (the cosmological constant problem), one of the largest fine-tuning puzzles in physics.
- Baryogenesis. The universe is matter-dominated; the SM has the three Sakharov conditions ($B$ violation, $C$ and $CP$ violation, out-of-equilibrium dynamics) only in principle: the CKM CP violation is too small and the SM electroweak phase transition is too smooth to generate the observed baryon asymmetry. BSM CP-violation sources are required.
- Hierarchy problem. Why $m_h \ll M_{\text{Pl}}$? See EW Caveats.
- Strong CP problem. Why $|\theta| \lesssim 10^{-10}$? See QCD Caveats. The most popular resolution (Peccei–Quinn axion) is BSM.
- Flavor puzzle. Why three generations? Why the hierarchical Yukawa pattern? Why CKM mixing angles small but PMNS mixing angles large?
- No rigorous construction in $d = 4$. As with QED/QCD/EW individually, a Wightman / Haag–Kastler construction of the SM is an open problem.
See Also
- QED, QCD, Electroweak: the three gauge-theory pillars combined here.
- Modern Foundations: the derivation of the QFT framework all SM constituents specialize.
- Group Theory: for $SU(3)$, $SU(2)$, $U(1)$ and the broader Lie-group machinery.
- (Forthcoming) GUTs, supersymmetric extensions, and effective-theory framings (SMEFT, HEFT).