The Geometry of General Relativity: Part II — The Metric

Introduction

Part 1 developed the metric-free story: manifolds, tensors, differential forms, the exterior derivative, and the generalized Stokes theorem. Everything there required only a smooth structure and an orientation. A metric was conspicuously (and deliberately) absent. Working without one makes its role visible rather than assumed; the reader arrives here having felt the absence of distances, angles, and a canonical way to compare vectors.

Part 2 adds exactly one object: a metric \(g_{\mu\nu}\). From this single addition the geometry of the manifold follows without further arbitrary choices: the metric uniquely determines a connection (forced by metric compatibility and torsion-freeness), which determines covariant derivatives, parallel transport, geodesics, and curvature. A Lorentzian metric gives spacetime its causal structure. Part 3 will add matter and make the metric itself dynamical.

The Metric

Definition

In Part 1, a manifold came equipped with only a smooth structure: you could do calculus on it, but you could not measure distances or angles, and there was no way to compare the lengths of tangent vectors. The metric supplies this structure and is therefore the central ingredient for building physical models in curved space.

A metric on a manifold \(M\) is a smooth, symmetric, nondegenerate \((0,2)\) tensor field \(g_{\mu\nu}\). At each point \(p \in M\) it is a symmetric bilinear form on \(T_pM\): a map \(g_p: T_pM \times T_pM \to \mathbb{R}\) that is linear in each argument and satisfies \(g_p(v,w) = g_p(w,v)\). Nondegeneracy means that if \(g_p(v,w) = 0\) for all \(w \in T_pM\), then \(v = 0\). Equivalently, the matrix of components \(g_{\mu\nu}\) is invertible at every point.

In coordinates the metric is specified by its components \(g_{\mu\nu} = g(\partial_\mu, \partial_\nu)\), a symmetric matrix of smooth functions. We define the following (abuses of) notation for the metric:

\[ g = g_{\mu\nu}\,dx^\mu \otimes dx^\nu =: g_{\mu\nu}\,dx^\mu\,dx^\nu =: ds^2, \]

where the so-called invariant line element \(ds^2\) notation hand-wavily alludes to "squared infinitesimal length." This is just notation: \(s\) is not a function on the manifold and \(d\) here is not the exterior derivative of Part 1. It is also conventional in physics to omit the tensor product symbol as shown above and we will do this when the context is clear. Note also that the metric determines a canonical volume form \(\sqrt{|g|}\,dx^1\wedge\cdots\wedge dx^n\) on any oriented manifold, where \(|g| = |\det(g_{\mu\nu})|\).

Signature and Causality

A nondegenerate symmetric bilinear form can be diagonalized by a linear change of basis, producing a standard form with \(+1\)s and \(-1\)s on the diagonal. The numbers of positive and negative entries are a property of the form itself, not of the choice of basis (Sylvester's law of inertia): this is the signature. A Riemannian metric has all positive entries, written \((+,\ldots,+)\). A Lorentzian metric has exactly one negative entry, written \((-,+,+,+)\) in four dimensions. A Riemannian manifold is a pair \((M, g)\) where \(M\) is a smooth manifold and \(g\) is a Riemannian metric on it; a Lorentzian manifold is the same with a Lorentzian \(g\). The metric is not intrinsic to \(M\): the same smooth manifold can carry many different metrics, and specifying one is an additional choice. The term pseudo-Riemannian covers metrics of any signature, Riemannian and Lorentzian included, and will be used when statements apply to both cases.

General relativity uses a Lorentzian metric because special relativity already tells us spacetime has this structure. The invariant interval of special relativity,

\[ ds^2 = -dt^2 + dx^2 + dy^2 + dz^2, \]

is precisely the Minkowski metric \(\eta_{\mu\nu} = \mathrm{diag}(-1,+1,+1,+1)\). At each point of a Lorentzian manifold, tangent vectors split into three classes according to the sign of \(g(v,v) = g_{\mu \nu}v^\mu v^\nu \). Vectors with \(g(v,v) < 0\) are timelike and represent directions in which massive particles can travel; vectors with \(g(v,v) = 0\) are null and represent directions along which light travels; vectors with \(g(v,v) > 0\) are spacelike. The null vectors at \(p\) form the light cone, which separates the region \(p\) can causally influence from the region it cannot. The metric (specifically the relative minus sign) thus gives the manifold its causal structure.
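The classification above can be sketched in a few lines of Python. This is a minimal illustration for the Minkowski metric only; the names `norm_squared` and `causal_class` and the tolerance `tol` are illustrative choices, not notation from the text.

```python
# Minkowski metric eta = diag(-1, +1, +1, +1) in coordinates (t, x, y, z).
ETA = [-1.0, 1.0, 1.0, 1.0]

def norm_squared(v):
    """Return g(v, v) = eta_{mu nu} v^mu v^nu for a 4-component vector."""
    return sum(e * c * c for e, c in zip(ETA, v))

def causal_class(v, tol=1e-12):
    """Classify a nonzero vector by the sign of g(v, v).

    tol is an assumed floating-point tolerance for deciding g(v, v) == 0.
    """
    s = norm_squared(v)
    if abs(s) <= tol:
        return "null"
    return "timelike" if s < 0 else "spacelike"

print(causal_class([1.0, 0.5, 0.0, 0.0]))  # timelike: slower than light
print(causal_class([1.0, 1.0, 0.0, 0.0]))  # null: along a light ray
print(causal_class([0.0, 1.0, 0.0, 0.0]))  # spacelike: purely spatial
```

On a general Lorentzian manifold the same test applies pointwise, with `ETA` replaced by the components \(g_{\mu\nu}(p)\).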

Examples

Metric	Line element	Description
Euclidean \(\mathbb{R}^3\)	\(dx^2 + dy^2 + dz^2\)	Flat 3D space; \(g_{\mu\nu} = \delta_{\mu\nu}\).
Minkowski \(\mathbb{R}^{3,1}\)	\(-dt^2 + dx^2 + dy^2 + dz^2\)	Flat spacetime of special relativity; \(g_{\mu\nu} = \eta_{\mu\nu}\). The local approximation to any Lorentzian manifold at small scales.
Round \(S^2\)	\(d\theta^2 + \sin^2\theta\,d\varphi^2\)	The unit 2-sphere in spherical coordinates. Note \(g_{\varphi\varphi} = \sin^2\theta\) vanishes at the poles: a coordinate singularity, not a geometric one.
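
As a quick check of the volume form introduced in the Definition subsection, consider the round \(S^2\) from the table, assuming the orientation in which \(d\theta\wedge d\varphi\) is positive. The determinant of \(g_{\mu\nu} = \mathrm{diag}(1,\sin^2\theta)\) is \(\sin^2\theta\), so

\[ \sqrt{|g|} = \sin\theta \quad (0 < \theta < \pi), \qquad \int_{S^2}\sqrt{|g|}\,d\theta\wedge d\varphi = \int_0^{2\pi}\!\!\int_0^{\pi}\sin\theta\,d\theta\,d\varphi = 4\pi, \]

which is the familiar area of the unit sphere.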

Raising and Lowering Indices

In Part 1, tangent vectors and covectors lived in separate worlds. A vector \(V^\mu \in T_pM\) and a covector \(\omega_\mu \in T_p^*M\) are dual to each other (one acts on the other to return a number), but there was no canonical way to convert between them. Any such conversion would require choosing a basis, making it coordinate-dependent. The metric provides this identification intrinsically.

Since \(g_{\mu\nu}\) is nondegenerate, it has an inverse \(g^{\mu\nu}\), defined by \( g^{\mu\rho}\,g_{\rho\nu} = \delta^\mu_{\ \nu} \). Given a tangent vector \(V^\mu\), the metric produces a covector \(V_\mu = g_{\mu\nu}V^\nu\). Given a covector \(\omega_\mu\), the inverse metric produces a tangent vector \(\omega^\mu = g^{\mu\nu}\omega_\nu\). These operations are called lowering and raising indices ("index gymnastics") for obvious reasons. The operations are inverse to each other and extend to arbitrary tensors: one factor of \(g_{\mu\nu}\) per upper index lowered, one factor of \(g^{\mu\nu}\) per lower index raised.

The geometric meaning is this: if \(V^\mu\) is a tangent vector, then \(V_\mu\) is not a different object but the same vector re-expressed as a linear functional on other vectors. Specifically, \(V_\mu W^\mu = g_{\mu\nu}V^\nu W^\mu = g(V, W)\): acting \(V_\mu\) on any vector \(W\) returns the inner product of \(V\) with \(W\). The covector \(V_\mu\) is the operation "take the inner product with \(V\)." Raising and lowering is therefore the metric's way of saying that vectors and covectors, while technically distinct, are two representations of the same underlying geometric object.
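
The identity \(V_\mu W^\mu = g(V,W)\) can be checked numerically. The sketch below uses the round \(S^2\) metric at an arbitrarily chosen point; the helper names (`lower`, `raise_index`, `inner`) and the sample vectors are illustrative assumptions.

```python
import math

def sphere_metric(theta):
    """Components g_{mu nu} of the round unit 2-sphere in (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix, giving g^{mu nu} from g_{mu nu}."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def lower(g, v):
    """V_mu = g_{mu nu} V^nu."""
    return [sum(g[m][n] * v[n] for n in range(2)) for m in range(2)]

def raise_index(g_inv, w):
    """omega^mu = g^{mu nu} omega_nu."""
    return [sum(g_inv[m][n] * w[n] for n in range(2)) for m in range(2)]

def inner(g, v, w):
    """g(V, W) = g_{mu nu} V^mu W^nu."""
    return sum(g[m][n] * v[m] * w[n] for m in range(2) for n in range(2))

theta = math.pi / 3            # sample point, away from the poles
g = sphere_metric(theta)
V = [1.0, 2.0]                 # sample components (V^theta, V^phi)
W = [-0.5, 3.0]

V_low = lower(g, V)
pairing = sum(V_low[m] * W[m] for m in range(2))   # V_mu W^mu
V_back = raise_index(inverse_2x2(g), V_low)        # raise again

print(pairing, inner(g, V, W))   # the two numbers agree
print(V_back)                    # recovers V up to rounding
```

Lowering and then raising returns the original components, which is the statement that the two operations are inverse to each other.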

Why the Metric Is the Right Object

It is worth reflecting on why the metric, among all \((0,2)\) tensor fields, occupies such a special role. A generic \((0,2)\) tensor is just a bilinear map: it eats two tangent vectors and returns a number. The metric does this too, but its symmetry and nondegeneracy give it three properties that no other structure provides without additional arbitrary choices:

1. Nondegeneracy and symmetry \(\Rightarrow\) \(g\) is a nondegenerate inner product on each \(T_pM\) (positive-definite in the Riemannian case, indefinite in the Lorentzian case), supplying the manifold with notions of length and angle (hence the name "metric"). The length of a tangent vector \(v\) is \(|v| = \sqrt{|g(v,v)|}\), and in the Riemannian case the angle between \(v\) and another vector \(w\) at the same point is given by \(\cos\theta = g(v,w)\,/\,(|v|\,|w|)\). A generic \((0,2)\) tensor cannot do this because it may be degenerate or asymmetric.

2. Nondegeneracy gives a canonical isomorphism between \(T_pM\) and \(T_p^*M\), as discussed above ("raising and lowering"). Without a metric, no canonical identification exists.

3. Nondegeneracy and symmetry uniquely determine a connection out of infinitely many possibilities on \(M\). Since the connection, as we will see, determines parallel transport, geodesics, and curvature, the metric is not just one tensor among many but the root of the entire geometric edifice of GR.

The Covariant Derivative and the Levi-Civita Connection

The Problem

Physical laws in curved spacetime must be coordinate-independent: a law that holds in one coordinate system must hold in all of them, or it is not a law but a coordinate artifact. This forces us to write physics as tensor equations, which in turn requires a way to differentiate tensor fields and get tensors back. The exterior derivative of Part 1 gave a coordinate-independent derivative for differential forms, but the naive generalization to arbitrary tensor fields fails. We need something more general. Indeed, even taking partial derivatives of a simple vector field, \(\partial_\mu V^\nu\), does not produce a tensor. Under a coordinate change \(x \to x'\):

\[ \partial'_\mu V'^\nu = \left(\frac{\partial x^\sigma}{\partial x'^\mu}\partial_\sigma\right)\!\left(\frac{\partial x'^\nu}{\partial x^\rho}V^\rho\right) = \frac{\partial x^\sigma}{\partial x'^\mu}\frac{\partial x'^\nu}{\partial x^\rho}\,\partial_\sigma V^\rho + \frac{\partial x^\sigma}{\partial x'^\mu}\frac{\partial^2 x'^\nu}{\partial x^\sigma \partial x^\rho}\,V^\rho. \]

The first term is correct tensor behavior; the second is the obstruction. It vanishes only for affine coordinate changes (linear maps plus constant shifts), which is why partial derivatives work in flat space with Cartesian coordinates. On a curved manifold with general coordinates it does not vanish, and \(\partial_\mu V^\nu\) is coordinate-dependent.
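
A worked example, offered as an illustrative check rather than as part of the argument above: take the constant vector field \(V = \partial_x\) on the flat plane. In Cartesian coordinates its components are \((V^x, V^y) = (1, 0)\), so every \(\partial_\mu V^\nu\) vanishes. In polar coordinates \((r,\theta)\) the same field has components

\[ V^r = \frac{\partial r}{\partial x} = \cos\theta, \qquad V^\theta = \frac{\partial \theta}{\partial x} = -\frac{\sin\theta}{r}, \]

and its partial derivatives are

\[ \partial_r V^r = 0, \qquad \partial_\theta V^r = -\sin\theta, \qquad \partial_r V^\theta = \frac{\sin\theta}{r^2}, \qquad \partial_\theta V^\theta = -\frac{\cos\theta}{r}. \]

A tensor whose components all vanish in one chart must vanish in every chart, so \(\partial_\mu V^\nu\) cannot be a tensor: the nonzero polar entries come entirely from the second-derivative term.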

The Covariant Derivative

We define a covariant derivative \(\nabla\) to be an operator mapping \((p,q)\) tensor fields to \((p,q+1)\) tensor fields satisfying:

Linearity for tensors \(S,T\) and constants \(\alpha,\beta\): \(\nabla(\alpha S + \beta T) = \alpha \nabla S + \beta \nabla T\)
Leibniz rule: \(\nabla(S \otimes T) = (\nabla S)\otimes T + S \otimes (\nabla T)\)
Reduces to exterior derivative for any scalar field \(f\): \(\nabla f = df\)
Commutes with contraction: \(\nabla(\text{trace}\, T) = \text{trace}(\nabla T)\)

These axioms should seem plausible for differential operators, but they do not uniquely determine \(\nabla\). Specifying a covariant derivative is additional structure on the manifold; many valid choices exist. We seek to specify a particular covariant derivative that resolves the problem identified in the previous subsection.

Since \(\nabla\) is linear and satisfies Leibniz, its action on any tensor field is completely determined by its action on basis vector fields \(\partial_\nu\). The result \(\nabla(\partial_\nu)\) is a \((1,1)\) tensor, and we define the connection coefficients (or Christoffel symbols) \(\Gamma^\rho_{\mu\nu}\) by its components in the \(\mu\)-direction:

\[ [\nabla(\partial_\nu)]_\mu =: \Gamma^\rho_{\mu\nu}\,\partial_\rho. \]

We henceforth adopt the notation \(\nabla_\mu(\cdot) \equiv [\nabla(\cdot)]_\mu\). This is an unfortunate abuse of notation since it incorrectly seems to suggest that \(\nabla_\mu\) is a covector. It is calculationally useful, though, and therefore ubiquitous, so we are stuck with it. The previous expression then reads:

\[ \nabla_\mu \partial_\nu =: \Gamma^\rho_{\mu\nu}\,\partial_\rho. \]

The Covariant Derivative is Coordinate-Free

To see how the covariant derivative resolves the problem identified above, we derive its component expression for a vector field and verify tensorial transformation. Write \(V = V^\nu \partial_\nu\) and apply \(\nabla\) via Leibniz:

\[ \begin{aligned} (\nabla V)_\mu &= [\nabla(V^\nu \partial_\nu)]_\mu \\ &= (\nabla V^\nu)_\mu\,\partial_\nu + V^\nu\,[\nabla(\partial_\nu)]_\mu \\ &= (\partial_\mu V^\nu)\,\partial_\nu + V^\nu\,\Gamma^\rho_{\mu\nu}\,\partial_\rho \\ &= (\partial_\mu V^\nu)\,\partial_\nu + V^\rho\,\Gamma^\nu_{\mu\rho}\,\partial_\nu \\ &= \left(\partial_\mu V^\nu + \Gamma^\nu_{\mu\rho}\,V^\rho\right)\partial_\nu, \end{aligned} \]

where we've used the fact that \( (\nabla V^\nu )_\mu = \partial_\mu V^\nu \) for the scalar coefficients \( V^\nu \), and we also relabeled the dummy index \(\rho \leftrightarrow \nu\) in the penultimate expression. Reading off the \(\nu\)-component and further desecrating notation (in an unfortunately standard way) so that \( (\nabla V)_\mu^\nu \equiv \nabla_\mu V^\nu \), we obtain the following classic coordinate-space expression for the covariant derivative of a vector field:

\[ \nabla_\mu V^\nu = \partial_\mu V^\nu + \Gamma^\nu_{\mu\rho}\,V^\rho. \]

For this to be a genuine \((1,1)\) tensor it must satisfy:

\[ \begin{aligned}\nabla'_\mu V'^\nu &= \partial'_\mu V'^\nu + \Gamma'^\nu_{\mu\rho}V'^\rho \\&= \frac{\partial x^\sigma}{\partial x'^\mu}\frac{\partial x'^\nu}{\partial x^\lambda}\,\nabla_\sigma V^\lambda.\end{aligned} \]

Using the transformation of \(\partial'_\mu V'^\nu\) derived above, one can verify that this is satisfied if and only if \(\Gamma\) transforms as:

\[ \Gamma'^\nu_{\mu\rho} = \frac{\partial x^\sigma}{\partial x'^\mu}\frac{\partial x'^\nu}{\partial x^\lambda}\frac{\partial x^\tau}{\partial x'^\rho}\,\Gamma^\lambda_{\sigma\tau} - \frac{\partial x^\sigma}{\partial x'^\mu}\frac{\partial x^\tau}{\partial x'^\rho}\frac{\partial^2 x'^\nu}{\partial x^\sigma\partial x^\tau}. \]

The second term is inhomogeneous: a pure second-derivative piece with no \(\Gamma\) in it. It cancels precisely the obstruction that appeared in \(\partial'_\mu V'^\nu\). The non-tensorial behavior of \(\Gamma\) is not a defect; it is a requirement: \(\Gamma\) must transform this way to kill the obstruction.

The action of \(\nabla\) extends to arbitrary tensors by the Leibniz rule: each upper index contributes a \(+\Gamma\) term, each lower index a \(-\Gamma\) term. Two common cases: for a covector and a \((2,0)\) tensor,

\[ \begin{aligned} \nabla_\rho\,\omega_\mu &= \partial_\rho\,\omega_\mu - \Gamma^\lambda_{\rho\mu}\,\omega_\lambda, \\ \nabla_\rho\,T^{\mu\nu} &= \partial_\rho\,T^{\mu\nu} + \Gamma^\mu_{\rho\lambda}\,T^{\lambda\nu} + \Gamma^\nu_{\rho\lambda}\,T^{\mu\lambda}, \end{aligned} \]

where the signs are fixed by the Leibniz rule applied to contractions with other tensors.

Interpreting the Connection Coefficients

The name connection points at something specific. Tangent spaces at different points \(T_pM\) and \(T_qM\) are separate vector spaces with no built-in relationship; there is no canonical sense in which a vector at \(p\) "points the same way" as one at \(q\). The connection is the extra structure that provides this: it connects the tangent spaces, supplying a notion of what it means for a vector to vary (or remain constant) as you move. The \(\Gamma^\rho_{\mu\nu}\) are its coordinate expression, encoding the rate at which the tangent spaces twist relative to each other. This is what makes parallel transport possible — the subject of the next section.

To see this concretely, notice that the two terms in the component formula \( \nabla_\mu V^\rho = \partial_\mu V^\rho + \Gamma^\rho_{\mu\nu}V^\nu \) have different origins: \(\partial_\mu V^\rho\) is the change in the components; \(\Gamma^\rho_{\mu\nu}V^\nu\) corrects for the basis vectors themselves changing. A plain \(\partial_\mu\) sees only the first, which is why it fails as a tensor derivative.

Recall that the expression underlying the component formula is our definition \( \nabla_\mu \partial_\nu = \Gamma^\rho_{\mu\nu}\,\partial_\rho \). This says that as you move in the \(\partial_\mu\) direction, the change in \(\partial_\nu\) is itself a vector with \(\rho\)-th component \(\Gamma^\rho_{\mu\nu}\). In flat space with Cartesian coordinates the basis vectors are constant everywhere, so \(\Gamma = 0\). In polar coordinates on flat space they tilt as you move (\(\partial_r\) and \(\partial_\theta\) rotate) so that \(\Gamma \neq 0\) without any curvature. Nonzero Christoffel symbols can therefore reflect a curved coordinate system rather than a curved space; curvature lives in the derivatives of \(\Gamma\), which is the content of the Riemann tensor section.
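
The polar-coordinate claim can be made explicit. As a sketch, assume the flat connection on the plane for which the Cartesian basis vectors \(\partial_x, \partial_y\) are constant, and write the polar basis in terms of them:

\[ \partial_r = \cos\theta\,\partial_x + \sin\theta\,\partial_y, \qquad \partial_\theta = -r\sin\theta\,\partial_x + r\cos\theta\,\partial_y. \]

Differentiating these expressions gives

\[ \nabla_r\partial_r = 0, \qquad \nabla_r\partial_\theta = \nabla_\theta\partial_r = \frac{1}{r}\,\partial_\theta, \qquad \nabla_\theta\partial_\theta = -r\,\partial_r, \]

so the only nonzero connection coefficients are

\[ \Gamma^r_{\theta\theta} = -r, \qquad \Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = \frac{1}{r}. \]

For the constant field \(V = \partial_x\), with polar components \(V^r = \cos\theta\) and \(V^\theta = -\sin\theta/r\), the correction terms cancel the spurious partial derivatives exactly:

\[ \nabla_\theta V^r = \partial_\theta V^r + \Gamma^r_{\theta\theta}V^\theta = -\sin\theta + (-r)\!\left(-\frac{\sin\theta}{r}\right) = 0, \qquad \nabla_r V^\theta = \partial_r V^\theta + \Gamma^\theta_{r\theta}V^\theta = \frac{\sin\theta}{r^2} - \frac{\sin\theta}{r^2} = 0. \]

The field is constant, and \(\nabla V = 0\) says so in any chart.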

The Levi-Civita Connection

On a bare manifold there are infinitely many valid choices of \(\Gamma^\rho_{\mu\nu}\). The metric singles out a unique one by imposing two conditions:

Metric compatibility: \(\nabla_\rho\,g_{\mu\nu} = 0\). The metric is covariantly constant; equivalently, raising and lowering indices commutes with \(\nabla\).
Torsion-free: \(\Gamma^\rho_{\mu\nu} = \Gamma^\rho_{\nu\mu}\). The lower two indices are symmetric.

Together these uniquely determine \(\Gamma\). Expanding metric compatibility using the covariant derivative formula for a \((0,2)\) tensor:

\[ \nabla_\rho\,g_{\mu\nu} = \partial_\rho g_{\mu\nu} - \Gamma^\lambda_{\rho\mu}\,g_{\lambda\nu} - \Gamma^\lambda_{\rho\nu}\,g_{\mu\lambda} = 0. \]

Write this out for three cyclic permutations of \((\rho,\mu,\nu)\):

\[ \begin{aligned} \partial_\mu g_{\nu\rho} &= \Gamma^\lambda_{\mu\nu}\,g_{\lambda\rho} + \Gamma^\lambda_{\mu\rho}\,g_{\nu\lambda}, \\ \partial_\nu g_{\rho\mu} &= \Gamma^\lambda_{\nu\rho}\,g_{\lambda\mu} + \Gamma^\lambda_{\nu\mu}\,g_{\rho\lambda}, \\ \partial_\rho g_{\mu\nu} &= \Gamma^\lambda_{\rho\mu}\,g_{\lambda\nu} + \Gamma^\lambda_{\rho\nu}\,g_{\mu\lambda}. \end{aligned} \]

Add the first two and subtract the third. Applying torsion-freeness \(\Gamma^\lambda_{\alpha\beta} = \Gamma^\lambda_{\beta\alpha}\) and symmetry of \(g\), four of the six \(\Gamma\) terms cancel in pairs; only \(2\Gamma^\lambda_{\mu\nu}g_{\lambda\rho}\) survives:

\[ \partial_\mu g_{\nu\rho} + \partial_\nu g_{\rho\mu} - \partial_\rho g_{\mu\nu} = 2\,\Gamma^\lambda_{\mu\nu}\,g_{\lambda\rho}. \]

Contracting both sides with \(g^{\rho\sigma}\), using \(g_{\lambda\rho}\,g^{\rho\sigma} = \delta^\sigma_\lambda\), and then relabeling \(\rho \leftrightarrow \sigma\) gives the coefficients of the famous Levi-Civita connection, which will be our default connection for the remainder of this text:

\[ \Gamma^\rho_{\mu\nu} = \frac{1}{2}\,g^{\rho\sigma}\!\left(\partial_\mu g_{\nu\sigma} + \partial_\nu g_{\mu\sigma} - \partial_\sigma g_{\mu\nu}\right). \]
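
The formula is straightforward to evaluate numerically. The following sketch applies it to the round \(S^2\) metric, approximating \(\partial_\sigma g_{\mu\nu}\) by central differences; the step size `h` and all function names are assumptions made for illustration. For this metric the nonzero coefficients are \(\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta\) and \(\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta\), which the code reproduces up to discretization error.

```python
import math

def sphere_metric(x):
    """g_{mu nu} of the round unit 2-sphere at x = (theta, phi)."""
    theta, _phi = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def metric_derivative(metric, x, sigma, h=1e-5):
    """Central-difference estimate of d g_{mu nu} / d x^sigma at x."""
    xp, xm = list(x), list(x)
    xp[sigma] += h
    xm[sigma] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[m][n] - gm[m][n]) / (2 * h) for n in range(2)]
            for m in range(2)]

def christoffel(metric, x):
    """gamma[rho][mu][nu] = Gamma^rho_{mu nu} from the Levi-Civita formula."""
    g_inv = inverse_2x2(metric(x))
    # dg[s][m][n] approximates d_s g_{m n}
    dg = [metric_derivative(metric, x, s) for s in range(2)]
    gamma = [[[0.0] * 2 for _ in range(2)] for _ in range(2)]
    for rho in range(2):
        for mu in range(2):
            for nu in range(2):
                gamma[rho][mu][nu] = 0.5 * sum(
                    g_inv[rho][s]
                    * (dg[mu][nu][s] + dg[nu][mu][s] - dg[s][mu][nu])
                    for s in range(2)
                )
    return gamma

theta = 1.0
G = christoffel(sphere_metric, [theta, 0.3])
print(G[0][1][1], -math.sin(theta) * math.cos(theta))   # Gamma^theta_{phi phi}
print(G[1][0][1], math.cos(theta) / math.sin(theta))    # Gamma^phi_{theta phi}
```

Only the metric function is specific to the sphere; the same `christoffel` routine works for any two-dimensional metric supplied in the same form.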

Parallel Transport

The connection defines what it means for a vector to remain constant as you move along a curve \(\gamma: \mathbb{R} \to M\). A vector field along a curve is a map \(V\) assigning to each \(\lambda\in\mathbb{R}\) a tangent vector \(V(\lambda) \in T_{\gamma(\lambda)}M\). Note this is defined only along \(\gamma\), not on a neighborhood of it. Such a field is parallel-transported along \(\gamma\) if its covariant derivative in the direction of \(\gamma'(\lambda)\) vanishes:

\[ \frac{DV^\mu}{d\lambda} := \frac{dx^\nu}{d\lambda}\,\nabla_\nu V^\mu = \frac{dx^\nu}{d\lambda}\!\left(\partial_\nu V^\mu + \Gamma^\mu_{\nu\rho}\,V^\rho\right) = 0. \]

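The notation \(D/d\lambda\) is the covariant derivative along the curve: the component of \(\nabla\) in the direction of \(\gamma'(\lambda)\). Because \(V\) is defined only along \(\gamma\), the combination \((dx^\nu/d\lambda)\,\partial_\nu V^\mu\) is to be read as the ordinary derivative \(dV^\mu/d\lambda\), so the condition is \(dV^\mu/d\lambda + \Gamma^\mu_{\nu\rho}\,(dx^\nu/d\lambda)\,V^\rho = 0\). In flat space with Cartesian coordinates \(\Gamma = 0\) and this reduces to ordinary constancy of components. In curved space or curved coordinates, the \(\Gamma\) terms compensate for the tilting basis, keeping the vector geometrically constant even as its components change.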

Parallel transport is path-dependent: moving a vector from \(p\) to \(q\) along different curves generally produces different results. This path-dependence is the geometric signature of curvature, made precise by the Riemann tensor in a later section.

Geodesics

A geodesic is a curve that parallel-transports its own tangent vector. This is the natural generalization of a straight line to curved space. Setting \(V^\mu = dx^\mu/d\lambda\) in the parallel transport equation:

\[ \begin{aligned} 0 &= \frac{dx^\nu}{d\lambda}\,\partial_\nu\!\left(\frac{dx^\mu}{d\lambda}\right) + \Gamma^\mu_{\nu\rho}\frac{dx^\nu}{d\lambda}\frac{dx^\rho}{d\lambda} \\ &= \frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu_{\nu\rho}\frac{dx^\nu}{d\lambda}\frac{dx^\rho}{d\lambda}, \end{aligned} \]

where the last step uses \((dx^\nu/d\lambda)\,\partial_\nu = d/d\lambda \) along the curve. This second-order ODE is the geodesic equation, whose solutions are geodesics:

\[ \frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu_{\rho\sigma}\frac{dx^\rho}{d\lambda}\frac{dx^\sigma}{d\lambda} = 0. \]

Writing \(u^\mu = dx^\mu/d\lambda\) for the tangent vector, the quantity \(g_{\mu\nu}u^\mu u^\nu\) is constant along any geodesic. This can be seen by noticing that, since \(D/d\lambda\) agrees with \(d/d\lambda\) on scalars, the Leibniz rule gives:

\[ \frac{d}{d\lambda}(g_{\mu\nu}u^\mu u^\nu) = \frac{D g_{\mu\nu}}{d\lambda}\,u^\mu u^\nu + g_{\mu\nu}\frac{Du^\mu}{d\lambda}\,u^\nu + g_{\mu\nu}\,u^\mu\frac{Du^\nu}{d\lambda} = 0, \]

where the first term vanishes by metric compatibility and each of the last two vanish by the parallel transport condition applied to the tangent vector \(u^\mu\).

The sign of \(g_{\mu\nu}u^\mu u^\nu\), the causal character of the tangent vector already established in § The Metric, is therefore conserved along any geodesic. This connects the geometry to the physics directly: the equivalence principle states that freely falling massive particles follow timelike geodesics and light follows null geodesics. Gravity is not a force; it is the curvature of spacetime, and free particles simply follow its geodesics.
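
The conservation of \(g_{\mu\nu}u^\mu u^\nu\) can be observed numerically. The sketch below integrates the geodesic equation on the round \(S^2\) (a Riemannian stand-in, chosen for simplicity) with a classical fourth-order Runge-Kutta step, using the sphere's Christoffel symbols \(\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta\) and \(\Gamma^\varphi_{\theta\varphi} = \cot\theta\). The integrator, step size, and initial data are illustrative assumptions.

```python
import math

def geodesic_rhs(state):
    """First-order form of the geodesic equation on the unit 2-sphere.

    state = [theta, phi, u_theta, u_phi], with u = dx/dlambda.
    """
    theta, _phi, u_theta, u_phi = state
    # -Gamma^theta_{phi phi} u^phi u^phi
    a_theta = math.sin(theta) * math.cos(theta) * u_phi ** 2
    # -2 Gamma^phi_{theta phi} u^theta u^phi
    a_phi = -2.0 * (math.cos(theta) / math.sin(theta)) * u_theta * u_phi
    return [u_theta, u_phi, a_theta, a_phi]

def rk4_step(f, state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f([s + 0.5 * h * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * h * k for s, k in zip(state, k2)])
    k4 = f([s + h * k for s, k in zip(state, k3)])
    return [s + h * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def speed_squared(state):
    """g_{mu nu} u^mu u^nu for the round metric."""
    theta, _phi, u_theta, u_phi = state
    return u_theta ** 2 + math.sin(theta) ** 2 * u_phi ** 2

def integrate(state, h=1e-3, steps=2000):
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, h)
    return state

start = [math.pi / 2, 0.0, 0.3, 1.0]   # on the equator, tilted initial velocity
end = integrate(start)
print(speed_squared(start), speed_squared(end))   # equal up to integration error
```

Starting on the equator with \(u^\theta = 0\) keeps \(\theta = \pi/2\) throughout, consistent with the equator being a geodesic.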

Parallel Transport on the Sphere

The conservation argument above is not specific to tangent vectors. For any two parallel-transported vectors \(V\) and \(W\) along the same curve, the Leibniz rule and metric compatibility give

\[ \frac{d}{d\lambda}g(V,W) = \frac{Dg_{\mu\nu}}{d\lambda}\,V^\mu W^\nu + g_{\mu\nu}\frac{DV^\mu}{d\lambda}\,W^\nu + g_{\mu\nu}\,V^\mu\frac{DW^\nu}{d\lambda} = 0. \]

For example, if \(\gamma\) is a geodesic with tangent \(u\), then \(u\) is itself parallel-transported by definition, so \(g(V,u)\) is constant for any parallel-transported \(V\): the angle between \(V\) and the geodesic's tangent is preserved throughout the arc.

Figure: Parallel transport around a spherical triangle. V at A points south, is carried around the triangle A to B to N to A, and returns pointing east as V'.

The meridians and equator of a round sphere are geodesics. Take the North Pole \(N\) and two equatorial points \(A,B\) \(90°\) apart, with \(B\) east of \(A\) (see figure), and parallel-transport a vector \(V\), initially pointing south at \(A\), around the triangle they form. First leg \(A\rightarrow B\) along the equator: \(V\) starts perpendicular to the arc and this is preserved throughout; \(V\) arrives at \(B\) still pointing south. Second leg \(B\rightarrow N\) along the meridian through \(B\): \(V\) is now antiparallel to the arc's tangent and stays so, arriving at \(N\) pointing back down that meridian toward \(B\). Third leg \(N\rightarrow A\) along the meridian through \(A\): the two meridians meet at a right angle at \(N\), so \(V\) is perpendicular to this arc, remains perpendicular, and arrives back at \(A\) pointing due east, which we label \(V'\) in the figure. The loop has rotated the vector by \(90°\), in contrast to flat space, where transport around any closed path returns every vector unchanged.
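
A loop that avoids the coordinate singularity at the pole makes the same point numerically. The sketch below parallel-transports a vector once around the circle of latitude \(\theta = \theta_0\) on the unit sphere by integrating \(dV^\mu/d\lambda = -\Gamma^\mu_{\varphi\rho}V^\rho\) with \(\lambda = \varphi\), using \(\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta\) and \(\Gamma^\varphi_{\varphi\theta} = \cot\theta\). Solving the equations by hand shows that the components in the orthonormal frame \((\partial_\theta,\ \partial_\varphi/\sin\theta_0)\) rotate by \(2\pi\cos\theta_0\). The Runge-Kutta integrator, step count, and function names are assumptions made for illustration.

```python
import math

def transport_rhs(theta0, v):
    """dV/dphi for parallel transport along the latitude circle theta = theta0."""
    v_theta, v_phi = v
    d_v_theta = math.sin(theta0) * math.cos(theta0) * v_phi       # -Gamma^theta_{phi phi} V^phi
    d_v_phi = -(math.cos(theta0) / math.sin(theta0)) * v_theta    # -Gamma^phi_{phi theta} V^theta
    return [d_v_theta, d_v_phi]

def transport_around_latitude(theta0, v0, steps=4000):
    """Integrate once around the circle (phi from 0 to 2*pi) with RK4."""
    h = 2.0 * math.pi / steps
    v = list(v0)
    for _ in range(steps):
        k1 = transport_rhs(theta0, v)
        k2 = transport_rhs(theta0, [a + 0.5 * h * b for a, b in zip(v, k1)])
        k3 = transport_rhs(theta0, [a + 0.5 * h * b for a, b in zip(v, k2)])
        k4 = transport_rhs(theta0, [a + h * b for a, b in zip(v, k3)])
        v = [a + h * (p + 2 * q + 2 * r + s) / 6.0
             for a, p, q, r, s in zip(v, k1, k2, k3, k4)]
    return v

def orthonormal_components(theta0, v):
    """Components of V in the unit frame (d_theta, d_phi / sin(theta0))."""
    return [v[0], math.sin(theta0) * v[1]]

theta0 = math.pi / 3                      # cos(theta0) = 1/2
v_final = transport_around_latitude(theta0, [1.0, 0.0])
print(orthonormal_components(theta0, v_final))   # approximately [-1, 0]
```

At \(\theta_0 = \pi/3\) the vector comes back reversed, a rotation by \(\pi\). Up to sign conventions and multiples of \(2\pi\), this matches the area \(2\pi(1-\cos\theta_0) = \pi\) of the enclosed polar cap, just as the \(90°\) rotation in the triangle example matches the area \(\pi/2\) of one octant of the unit sphere. On the equator, \(\cos\theta_0 = 0\) and the vector returns unchanged.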

The Riemann Tensor

The closing observation of § Parallel Transport was that path-dependence is the geometric signature of curvature. Take a vector \(V\) at a point \(p\) and parallel-transport it around a closed loop, returning to \(p\). In flat space, you recover \(V\) exactly. In curved space, you generally recover a rotated vector. The Riemann tensor makes this precise.

The Definition

Partial derivatives commute. Covariant derivatives need not; whether they do depends on the geometry. To understand this more deeply, we must compute the commutator \( [\nabla_\mu,\nabla_\nu] \equiv \nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu\). Applying this operator to a vector \(V\) gives:

\[ \begin{aligned} \left[\nabla_\mu,\nabla_\nu\right]\,V^\rho &= (\nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu)V^\rho = \nabla_\mu(\nabla_\nu V^\rho) - (\mu\leftrightarrow\nu)\\ &= \nabla_\mu(\partial_\nu V^\rho + \Gamma^\rho_{\nu\sigma}V^\sigma) - (\mu\leftrightarrow\nu)\\ &= \partial_\mu(\partial_\nu V^\rho + \Gamma^\rho_{\nu\sigma}V^\sigma) + \Gamma^\rho_{\mu\lambda}(\partial_\nu V^\lambda + \Gamma^\lambda_{\nu\sigma}V^\sigma) - \Gamma^\lambda_{\mu\nu}(\partial_\lambda V^\rho + \Gamma^\rho_{\lambda\sigma}V^\sigma) - (\mu\leftrightarrow\nu) \\ &= \partial_\mu\partial_\nu V^\rho + (\partial_\mu\Gamma^\rho_{\nu\sigma})V^\sigma + \Gamma^\rho_{\nu\sigma}\partial_\mu V^\sigma + \Gamma^\rho_{\mu\lambda}\partial_\nu V^\lambda + \Gamma^\rho_{\mu\lambda}\Gamma^\lambda_{\nu\sigma}V^\sigma - \Gamma^\lambda_{\mu\nu}\partial_\lambda V^\rho - \Gamma^\lambda_{\mu\nu}\Gamma^\rho_{\lambda\sigma}V^\sigma - (\mu\leftrightarrow\nu), \end{aligned} \]

where we used the definition of the covariant derivative for (1,0) and (1,1) tensors in the second and third lines, respectively. Performing the antisymmetrization, three classes of terms cancel: \(\partial_\mu\partial_\nu V^\rho - \partial_\nu\partial_\mu V^\rho = 0\) by symmetry of partial derivatives; the first-derivative-of-\(V\) terms cancel pairwise after relabeling dummy indices; and the \((\Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu})\) terms vanish by torsion-freeness. We gather the remaining terms so that

\[ [\nabla_\mu,\nabla_\nu]\,V^\rho =: R^\rho{}_{\sigma\mu\nu}\,V^\sigma, \]

where we have defined the Riemann tensor:

\[ R^\rho{}_{\sigma\mu\nu} := \partial_\mu\Gamma^\rho_{\nu\sigma} - \partial_\nu\Gamma^\rho_{\mu\sigma} + \Gamma^\rho_{\mu\lambda}\Gamma^\lambda_{\nu\sigma} - \Gamma^\rho_{\nu\lambda}\Gamma^\lambda_{\mu\sigma}. \]

This object is, perhaps surprisingly given its origins as a commutator of differential operators, algebraic in \(V\) with no derivatives. Covariant derivatives generally fail to commute, and their failure (the Riemann tensor) acts directly on \(V\) as a \((1,3)\) tensor field. The commutator \([\nabla_\mu,\nabla_\nu]\) is the difference between transporting in the \(\mu\)-then-\(\nu\) order versus \(\nu\)-then-\(\mu\): the infinitesimal version of going around a loop, and the Riemann tensor is what that loop does to a vector, which is an intuitive way to quantify curvature. Note that \(R^\rho{}_{\sigma\mu\nu}\) is built from Christoffel symbols: two derivative-of-\(\Gamma\) terms and two \(\Gamma^2\) terms. We therefore see that the connection coefficients and their derivatives and self-interactions encode curvature.

For a general (non-Levi-Civita) connection, note that the \((\Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu})\) terms do not vanish; they combine into \(-T^\lambda{}_{\mu\nu}\nabla_\lambda V^\rho\), where \(T^\lambda{}_{\mu\nu} = \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu}\) is the torsion tensor. That term involves a derivative of \(V\), making the commutator a differential operator rather than a tensor acting pointwise on \(V\). Torsion-freeness is what makes the commutator algebraic in \(V\) and hence a genuine tensor.
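
A worked example may help fix the index placement. For the round \(S^2\), the Levi-Civita formula gives the nonzero coefficients \(\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta\) and \(\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta\) (quoted here as input to the sketch). Inserting them into the definition of the Riemann tensor with \(\rho = \theta\), \(\sigma = \varphi\), \(\mu = \theta\), \(\nu = \varphi\):

\[ \begin{aligned} R^\theta{}_{\varphi\theta\varphi} &= \partial_\theta\Gamma^\theta_{\varphi\varphi} - \partial_\varphi\Gamma^\theta_{\theta\varphi} + \Gamma^\theta_{\theta\lambda}\Gamma^\lambda_{\varphi\varphi} - \Gamma^\theta_{\varphi\lambda}\Gamma^\lambda_{\theta\varphi} \\ &= (\sin^2\theta - \cos^2\theta) - 0 + 0 - (-\sin\theta\cos\theta)(\cot\theta) \\ &= \sin^2\theta. \end{aligned} \]

The same substitution with the roles of \(\theta\) and \(\varphi\) exchanged gives \(R^\varphi{}_{\theta\varphi\theta} = -\partial_\theta\cot\theta - \cot^2\theta = 1\). Neither component can be removed by a change of coordinates, in contrast to the polar-coordinate Christoffel symbols on the flat plane, for which the Riemann tensor vanishes.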

The Bianchi Identity

At any point \(p\) there is a coordinate system, called Riemann normal coordinates, in which \(\Gamma^\rho_{\mu\nu}(p) = 0\). To construct these coordinates, begin by fixing a basis \(\{e_\mu\}\) for \(T_pM\). Every vector \(v = v^\mu e_\mu \in T_pM\), by standard existence and uniqueness of ODE solutions applied to the geodesic equation, determines a unique geodesic \(\gamma_v\) with \(\gamma_v(0) = p\) and \(\dot\gamma_v(0) = v\). Define the coordinates of a nearby point \(q\) to be the components \(v^\mu\) of the vector \(v\) for which \(\gamma_v(1) = q\). For \(q\) close enough to \(p\) such a \(v\) exists and is unique among small vectors, because the map \(v \mapsto \gamma_v(1)\) (the exponential map) is a local diffeomorphism near \(v = 0\).

In these coordinates (by construction), \(\gamma_v\) has coordinate representation \(x^\mu(\lambda) = \lambda v^\mu\). That is, the point \(\gamma_v(\lambda)\) is reached from \(p\) by traveling along a linear ray from the origin within the coordinate chart. Substituting into the geodesic equation,

\[ \frac{d^2x^\mu}{d\lambda^2} + \Gamma^\mu_{\nu\rho}(x(\lambda))\,\frac{dx^\nu}{d\lambda}\frac{dx^\rho}{d\lambda} = \Gamma^\mu_{\nu\rho}(\lambda v)\,v^\nu v^\rho = 0. \]

Setting \(\lambda = 0\): \(\Gamma^\mu_{\nu\rho}(p)\,v^\nu v^\rho = 0\) for every \(v \in T_pM\). Replace \(v\) by \(v + w\) and subtract the pure-\(v\) and pure-\(w\) cases; what remains is \(\Gamma^\mu_{\nu\rho}(p)(v^\nu w^\rho + w^\nu v^\rho) = 0\) for all \(v, w\). Since \(\Gamma^\mu_{\nu\rho} = \Gamma^\mu_{\rho\nu}\) by torsion-freeness, this is \(2\Gamma^\mu_{\nu\rho}(p)\,v^\nu w^\rho = 0\) for all \(v, w\), hence \(\Gamma(p) = 0\), as we claimed. At \(p\), the Riemann tensor then reduces to \(R^\rho{}_{\sigma\mu\nu} = \partial_\mu\Gamma^\rho_{\nu\sigma} - \partial_\nu\Gamma^\rho_{\mu\sigma}\), and we also see \(\nabla = \partial\) at \(p\). This construction is a coordinate trick, not a geometric fact: \(\Gamma\) is not a tensor, so its vanishing at \(p\) is always achievable by a suitable chart and says nothing about the curvature.

These coordinates are the tool for proving both Bianchi identities. We work at an arbitrary \(p\) in normal coordinates, derive each identity by pure partial-derivative algebra, and then promote the result to all coordinates everywhere by tensoriality. For the first, antisymmetrize \(R^\rho{}_{\sigma\mu\nu}\) over its three lower indices and regroup by which \(\Gamma\) is being differentiated:

\[ \begin{aligned} R^\rho{}_{[\sigma\mu\nu]} &:= \tfrac{1}{6}\bigl(R^\rho{}_{\sigma\mu\nu} + R^\rho{}_{\mu\nu\sigma} + R^\rho{}_{\nu\sigma\mu} - R^\rho{}_{\mu\sigma\nu} - R^\rho{}_{\nu\mu\sigma} - R^\rho{}_{\sigma\nu\mu}\bigr) \\ &= \tfrac{1}{3}\bigl(R^\rho{}_{\sigma\mu\nu} + R^\rho{}_{\mu\nu\sigma} + R^\rho{}_{\nu\sigma\mu}\bigr) \\ &= \tfrac{1}{3}\bigl[\partial_\mu(\Gamma^\rho_{\nu\sigma} - \Gamma^\rho_{\sigma\nu}) + \partial_\nu(\Gamma^\rho_{\sigma\mu} - \Gamma^\rho_{\mu\sigma}) + \partial_\sigma(\Gamma^\rho_{\mu\nu} - \Gamma^\rho_{\nu\mu})\bigr]\big|_p = 0, \end{aligned} \]

where the first line defines the antisymmetrization, the second uses antisymmetry of \(R\) in its last pair (easy to see from the definition, especially in normal coordinates) to collapse the six terms to the cyclic sum, and each bracket on the third line vanishes by torsion-freeness. Generalizing from \(p\) to everywhere by tensoriality then gives the algebraic Bianchi identity:

\[ R^\rho{}_{[\sigma\mu\nu]} = 0. \]

Now antisymmetrize \(\nabla_\lambda R^\rho{}_{\sigma\mu\nu}\) over \(\{\lambda,\mu,\nu\}\) and regroup by which \(\Gamma\) is being doubly differentiated:

\[ [\nabla_\lambda R^\rho{}_{\sigma\mu\nu} + \nabla_\mu R^\rho{}_{\sigma\nu\lambda} + \nabla_\nu R^\rho{}_{\sigma\lambda\mu}]\big|_p = [\partial_\lambda,\partial_\mu]\Gamma^\rho_{\nu\sigma} + [\partial_\mu,\partial_\nu]\Gamma^\rho_{\lambda\sigma} + [\partial_\nu,\partial_\lambda]\Gamma^\rho_{\mu\sigma} = 0, \]

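with each commutator vanishing due to equality of mixed partials. The derivatives of the \(\Gamma\Gamma\) terms in \(R^\rho{}_{\sigma\mu\nu}\) do not appear because, by the product rule, each retains an undifferentiated factor of \(\Gamma(p) = 0\). Promoting the result from \(p\) to everywhere by tensoriality, the second Bianchi identity is then: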

\[ \nabla_\lambda R^\rho{}_{\sigma\mu\nu} + \nabla_\mu R^\rho{}_{\sigma\nu\lambda} + \nabla_\nu R^\rho{}_{\sigma\lambda\mu} = 0. \]

Symmetries of the Riemann Tensor

Normal coordinates give a clean proof of the algebraic symmetries of \(R_{\rho\sigma\mu\nu} = g_{\rho\lambda}R^\lambda{}_{\sigma\mu\nu}\). Since \(\nabla g = 0\), the covariant derivative component formula says \(\partial_\mu g_{\rho\sigma} = \Gamma^\lambda_{\mu\rho}g_{\lambda\sigma} + \Gamma^\lambda_{\mu\sigma}g_{\rho\lambda}\) so that \(\partial_\mu g_{\rho\sigma}\big|_p = 0\) when \(\Gamma(p) = 0\). Lowering the first index of \(R^\lambda{}_{\sigma\mu\nu}\big|_p = \partial_\mu\Gamma^\lambda_{\nu\sigma} - \partial_\nu\Gamma^\lambda_{\mu\sigma}\) and expanding the Levi-Civita connection coefficients:

\[ \begin{aligned} R_{\rho\sigma\mu\nu}\big|_p &= \partial_\mu\Gamma_{\rho\nu\sigma} - \partial_\nu\Gamma_{\rho\mu\sigma} \\ &= \tfrac{1}{2}\bigl(\partial_\mu\partial_\nu g_{\rho\sigma} + \partial_\mu\partial_\sigma g_{\rho\nu} - \partial_\mu\partial_\rho g_{\nu\sigma}\bigr) - \tfrac{1}{2}\bigl(\partial_\nu\partial_\mu g_{\rho\sigma} + \partial_\nu\partial_\sigma g_{\rho\mu} - \partial_\nu\partial_\rho g_{\mu\sigma}\bigr) \\ &= \tfrac{1}{2}\bigl(\partial_\sigma\partial_\mu g_{\rho\nu} - \partial_\rho\partial_\mu g_{\sigma\nu} - \partial_\sigma\partial_\nu g_{\rho\mu} + \partial_\rho\partial_\nu g_{\sigma\mu}\bigr), \end{aligned} \]

where the \(\partial_\mu\partial_\nu g_{\rho\sigma}\) second derivative terms cancel by symmetry of mixed partials. Two symmetries are immediate, both of which hold everywhere by tensoriality. Swapping \(\rho \leftrightarrow \sigma\) negates the expression (by \(g_{\rho\sigma} = g_{\sigma\rho}\)), giving first-pair antisymmetry:

\[ R_{\rho\sigma\mu\nu} = -R_{\sigma\rho\mu\nu}. \]

Swapping both pairs simultaneously, \((\rho\sigma) \leftrightarrow (\mu\nu)\), leaves it unchanged, giving pair symmetry:

\[ R_{\rho\sigma\mu\nu} = R_{\mu\nu\rho\sigma}. \]

Contractions and the Einstein Tensor

Several contractions of the Riemann tensor will prove essential in what follows.

The Ricci tensor is the trace on the first and third indices:

\[ R_{\mu\nu} = R^\rho{}_{\mu\rho\nu}. \]

It is symmetric and, in four dimensions, has 10 independent components. The Ricci scalar is its metric trace: \(R = g^{\mu\nu}R_{\mu\nu}\). These combine into the Einstein tensor, which is likewise symmetric with 10 independent components in four dimensions:

\[ G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R. \]
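
As a small illustration of these contractions, consider the round \(S^2\), whose nonzero curvature components with this index placement are \(R^\theta{}_{\varphi\theta\varphi} = \sin^2\theta\) and \(R^\varphi{}_{\theta\varphi\theta} = 1\) (taken as given for this sketch). Tracing on the first and third indices:

\[ R_{\theta\theta} = R^\varphi{}_{\theta\varphi\theta} = 1, \qquad R_{\varphi\varphi} = R^\theta{}_{\varphi\theta\varphi} = \sin^2\theta, \qquad R_{\theta\varphi} = 0, \]

so \(R_{\mu\nu} = g_{\mu\nu}\) and

\[ R = g^{\theta\theta}R_{\theta\theta} + g^{\varphi\varphi}R_{\varphi\varphi} = 1 + \frac{\sin^2\theta}{\sin^2\theta} = 2, \qquad G_{\mu\nu} = g_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}\cdot 2 = 0. \]

The vanishing of \(G_{\mu\nu}\) here is a feature of two dimensions, where the Einstein tensor is identically zero; in four dimensions it generally is not.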

The algebraic Bianchi identity is the cyclic sum \(R^\rho{}_{\sigma\mu\nu} + R^\rho{}_{\mu\nu\sigma} + R^\rho{}_{\nu\sigma\mu} = 0\). Isolating the first term and contracting with \(g^{\sigma\nu}\) gives:

\[ g^{\sigma\nu}R^\rho{}_{\sigma\mu\nu} = -g^{\sigma\nu}R^\rho{}_{\mu\nu\sigma} - g^{\sigma\nu}R^\rho{}_{\nu\sigma\mu}. \]

The RHS first term vanishes since \(g^{\sigma\nu}\) is symmetric in \(\sigma\nu\) while \(R^\rho{}_{\mu\nu\sigma}\) is antisymmetric in \(\nu\sigma\). For the second, lower \(\rho\), apply antisymmetry in the first pair \((R_{abcd} = -R_{bacd})\), then pair symmetry \((R_{abcd} = R_{cdab})\):

\[ \begin{aligned} g^{\sigma\nu}R^\rho{}_{\nu\sigma\mu} &= g^{\rho\tau}g^{\sigma\nu}R_{\tau\nu\sigma\mu} \\ &= -g^{\rho\tau}g^{\sigma\nu}R_{\nu\tau\sigma\mu} \\ &= -g^{\rho\tau}g^{\sigma\nu}R_{\sigma\mu\nu\tau} \\ &= -g^{\rho\tau}R^\nu{}_{\mu\nu\tau} = -g^{\rho\tau}R_{\mu\tau} = -R^\rho{}_\mu. \end{aligned} \]

Therefore \(g^{\sigma\nu}R^\rho{}_{\sigma\mu\nu} = R^\rho{}_\mu\). Contracting \(\rho\) with \(\lambda\) in the second Bianchi identity gives \(\nabla_\rho R^\rho{}_{\sigma\mu\nu} - \nabla_\mu R_{\sigma\nu} + \nabla_\nu R_{\sigma\mu}=0\). Contracting this result with \(g^{\sigma\nu}\) and inserting \(g^{\sigma\nu}R^\rho{}_{\sigma\mu\nu} = R^\rho{}_\mu\) then yields:

\[ \begin{aligned} 0 &=g^{\sigma\nu}\bigl(\nabla_\rho R^\rho{}_{\sigma\mu\nu} - \nabla_\mu R_{\sigma\nu} + \nabla_\nu R_{\sigma\mu}\bigr) \\ &= \nabla^\sigma R_{\sigma\mu} - \nabla_\mu R + \nabla^\sigma R_{\sigma\mu} \\ &= 2\nabla^\sigma R_{\sigma\mu} - \nabla_\mu R. \end{aligned} \]

Relabeling indices gives \(2\nabla^\mu R_{\mu\nu} = \nabla_\nu R\). Then, since \(\nabla^\mu(g_{\mu\nu}R) = \nabla_\nu R\), this is \(\nabla^\mu(R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R) = 0\):

\[ \nabla^\mu G_{\mu\nu} = 0. \]

This statement about the Einstein tensor is so far purely geometric: no physics assumed. Its significance becomes clear in the Einstein field equations.

Outlook: Part 3

Part 2 began with a metric and derived from it a unique connection, covariant derivatives, parallel transport, geodesics, and the Riemann tensor. The Einstein tensor \(G_{\mu\nu}\), built from contractions of the Riemann tensor, carries the curvature information in exactly the form needed to state a field equation, and the Bianchi identity establishes \(\nabla^\mu G_{\mu\nu} = 0\) as a purely geometric fact. The geometry is ready.

What sources it, and what it predicts, is the subject of Part 3.