The chain rule is a fundamental result in differential calculus describing how to differentiate a composite expression. When one function is applied to the output of another, the derivative of the combined map can be obtained by combining the derivatives of the pieces. This idea makes it possible to compute rates of change for complicated expressions built from simpler building blocks.

Basic statement

Suppose F(x) = f(g(x)), where g is differentiable at x and f is differentiable at the point g(x). Then the derivative F'(x) exists and equals the product of the derivative of f evaluated at g(x) and the derivative of g at x. In symbolic form:

  • F(x) = f(g(x))
  • F'(x) = f'(g(x)) · g'(x)

An intuitive mnemonic in Leibniz notation, writing u = g(x) and y = f(u), is dy/dx = (dy/du)·(du/dx), which highlights how the rate of change of the inner function scales the rate of change of the outer one. The rule applies only when both derivatives exist at the relevant points: g'(x) at x, and f'(u) at u = g(x).
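
The statement can be checked numerically. The following is a minimal sketch, not a proof: the functions f(u) = u^3 and g(x) = sin(x), the helper name numerical_derivative, and the test point 0.7 are illustrative assumptions, not part of the statement above. It compares the chain-rule product with a central-difference estimate of F'(x).

```python
import math

def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x) (an approximation, not exact)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Illustrative choices (assumptions for this sketch): f(u) = u**3, g(x) = sin(x)
def g(x):
    return math.sin(x)

def g_prime(x):
    return math.cos(x)

def f(u):
    return u ** 3

def f_prime(u):
    return 3 * u ** 2

def F(x):
    """The composite F(x) = f(g(x))."""
    return f(g(x))

def chain_rule_derivative(x):
    """F'(x) = f'(g(x)) * g'(x): outer derivative at g(x), times inner derivative at x."""
    return f_prime(g(x)) * g_prime(x)

x0 = 0.7
print("finite difference:", numerical_derivative(F, x0))
print("chain rule:       ", chain_rule_derivative(x0))
```

The two printed values should agree to many decimal places; the small remaining gap comes from the finite-difference approximation, not from the chain rule.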

Sketch of the justification

A short proof uses the definition of the derivative as a limit. One writes the difference quotient [f(g(x + h)) - f(g(x))] / h and multiplies and divides by the inner increment g(x + h) - g(x), which factors the quotient into a product: change in the outer function per change in the inner function, times change in the inner function per change in x. As h tends to 0 the first factor tends to f'(g(x)) and the second to g'(x). This step requires g(x + h) - g(x) to be nonzero, so a rigorous proof treats the case where the inner increment vanishes separately. Alternatively, when f and g are differentiable, their linear approximations combine by composition, yielding the stated product. This reasoning is why the chain rule is often described as expressing how linear approximations compose.
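
The factoring of the difference quotient can be observed numerically. The sketch below is illustrative only: f(u) = sin(u), g(x) = x^2, the function name quotient_factors, and the step sizes are assumptions chosen for this example, and the code assumes the inner increment is nonzero at the chosen point.

```python
import math

# Illustrative choices (assumptions for this sketch): f(u) = sin(u), g(x) = x**2
def g(x):
    return x ** 2

def f(u):
    return math.sin(u)

def quotient_factors(x, h):
    """Return (outer quotient, inner quotient, full quotient) for step h.

    Assumes g(x + h) != g(x); a rigorous proof handles that case separately.
    """
    du = g(x + h) - g(x)                       # change in the inner function
    df = f(g(x + h)) - f(g(x))                 # change in the composite
    outer = df / du                            # change in f per change in g
    inner = du / h                             # change in g per change in x
    full = df / h                              # difference quotient of f(g(x))
    return outer, inner, full

x0 = 1.0
target = math.cos(g(x0)) * 2 * x0              # f'(g(x0)) * g'(x0)
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    outer, inner, full = quotient_factors(x0, h)
    print(h, outer * inner, full, target)
```

For each h the product of the two factors matches the full difference quotient up to rounding, and both approach the chain-rule value as h shrinks.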

Multivariable version and matrices

The chain rule extends to functions between Euclidean spaces. If G maps R^n to R^m and F maps R^m to R^p, with G differentiable at x and F differentiable at G(x), then the derivative of the composition F∘G at x is the matrix product of their Jacobians: D(F∘G)(x) = DF(G(x)) · DG(x), a p×m matrix times an m×n matrix. In words, the linear map approximating the composition is the composition of the linear approximations. This matrix form is central in differential geometry, optimization, and machine learning, where layered transformations are common.
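
The matrix form can be illustrated with a small numerical check. In this sketch the maps G: R^2 -> R^3 and F: R^3 -> R^2, their hand-derived Jacobians, and the helper names matmul and numerical_jacobian are all hypothetical choices made for the example. It uses plain Python lists to stay self-contained.

```python
import math

def G(x):
    """Hypothetical inner map R^2 -> R^3."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def DG(x):
    """Hand-derived 3x2 Jacobian of G."""
    x1, x2 = x
    return [[x2, x1],
            [math.cos(x1), 0.0],
            [0.0, 2 * x2]]

def F(u):
    """Hypothetical outer map R^3 -> R^2."""
    u1, u2, u3 = u
    return [u1 + u2 * u3, math.exp(u1)]

def DF(u):
    """Hand-derived 2x3 Jacobian of F."""
    u1, u2, u3 = u
    return [[1.0, u3, u2],
            [math.exp(u1), 0.0, 0.0]]

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def numerical_jacobian(func, x, h=1e-6):
    """Central-difference Jacobian; column j approximates d func / d x_j."""
    rows = len(func(x))
    J = [[0.0] * len(x) for _ in range(rows)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(rows):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

x0 = [0.5, -1.2]
chain = matmul(DF(G(x0)), DG(x0))                       # (2x3) times (3x2) gives 2x2
numeric = numerical_jacobian(lambda x: F(G(x)), x0)     # direct estimate of D(F o G)(x0)
print("chain rule:", chain)
print("numeric:   ", numeric)
```

Note the order of the factors: the Jacobian of the outer map, evaluated at G(x), stands on the left, so that the matrix dimensions are compatible.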

Uses, examples and remarks

The chain rule is used constantly: differentiating sin(x^2), e^{3x+1}, or nested radical expressions; performing implicit differentiation; deriving formulas in physics for related rates; and implementing backpropagation in neural networks. Simple examples:

  • If h(x) = sin(x^2), then h'(x) = cos(x^2) · 2x.
  • If y = e^{3x+1}, then y' = e^{3x+1} · 3.

The rule also generalizes to higher derivatives; the combinatorial structure of the higher derivatives of a composite function is captured by Faà di Bruno's formula. Care is required: the outer function must be differentiable at the point g(x), so the inner function has to take its value in the region where the outer derivative exists.
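
One way to see how the chain rule is applied mechanically, as in automatic differentiation, is forward-mode differentiation with dual numbers. The sketch below is a minimal illustration, not a production implementation and not backpropagation itself (which applies the same rule in reverse order). The Dual class and the helpers sin, exp, and derivative are names invented for this example. Each elementary operation propagates a value together with its derivative, and sin and exp apply the chain rule explicitly.

```python
import math

class Dual:
    """A pair (value, deriv) representing value + deriv * eps with eps**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule for the derivative part
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def sin(u):
    """Chain rule: (sin u)' = cos(u) * u'."""
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)

def exp(u):
    """Chain rule: (exp u)' = exp(u) * u'."""
    return Dual(math.exp(u.value), math.exp(u.value) * u.deriv)

def derivative(func, x):
    """Evaluate func at x with seed derivative 1 and read off d func / d x."""
    return func(Dual(x, 1.0)).deriv

def h(x):
    return sin(x * x)          # h(x) = sin(x^2)

def y(x):
    return exp(3 * x + 1)      # y(x) = e^{3x+1}

x0 = 1.3
print(derivative(h, x0), math.cos(x0 ** 2) * 2 * x0)
print(derivative(y, x0), math.exp(3 * x0 + 1) * 3)
```

Each printed pair compares the propagated derivative with the closed-form result from the two examples above; the two numbers in a pair should agree up to rounding.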

Historically, ideas behind the chain rule appeared in the 17th century in the work of the early developers of calculus, and its clear algebraic form is closely tied to Leibniz's notation for differentials. For further reading on foundational topics and examples, see articles on the derivative, on functions, and on composite functions, as well as the differentiation chapters of standard calculus texts.