# Chain rule

In calculus, the chain rule describes the derivative of a "function of a function": the composition of two functions, where the output z is a given function of an intermediate variable y, which is in turn a given function of the input variable x.

Suppose that y is given as a function ${\displaystyle \,y=g(x)}$ and that z is given as a function ${\displaystyle \,z=f(y)}$. The rate at which z varies with respect to y is given by the derivative ${\displaystyle \,f'(y)}$, and the rate at which y varies with respect to x is given by the derivative ${\displaystyle \,g'(x)}$. So the rate at which z varies with respect to x is the product ${\displaystyle \,f'(y)\cdot g'(x)}$, and substituting ${\displaystyle \,y=g(x)}$ we have the chain rule

${\displaystyle (f\circ g)'=(f'\circ g)\cdot g'.\,}$
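As a worked illustration (the functions here are chosen only for the example and are not part of the general statement), take ${\displaystyle \,g(x)=2x+1}$ and ${\displaystyle \,f(y)=y^{3}}$, so that ${\displaystyle \,f'(y)=3y^{2}}$ and ${\displaystyle \,g'(x)=2}$. The chain rule then gives

${\displaystyle (f\circ g)'(x)=f'(g(x))\cdot g'(x)=3(2x+1)^{2}\cdot 2=6(2x+1)^{2}.\,}$

This agrees with expanding first: ${\displaystyle \,(2x+1)^{3}=8x^{3}+12x^{2}+6x+1}$ has derivative ${\displaystyle \,24x^{2}+24x+6=6(2x+1)^{2}}$.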

To convert the chain rule to the traditional (Leibniz) notation, we notice

${\displaystyle z(y(x))\quad \Longleftrightarrow \quad (z\circ y)(x)}$

and

${\displaystyle (z\circ y)'=(z'\circ y)\cdot y'\quad \Longleftrightarrow \quad {\frac {\mathrm {d} z(y(x))}{\mathrm {d} x}}={\frac {\mathrm {d} z(y)}{\mathrm {d} y}}\,{\frac {\mathrm {d} y(x)}{\mathrm {d} x}}.\,}$

In mnemonic form the latter expression is

${\displaystyle {\frac {\mathrm {d} z}{\mathrm {d} x}}={\frac {\mathrm {d} z}{\mathrm {d} y}}\,{\frac {\mathrm {d} y}{\mathrm {d} x}},\,}$

which is easy to remember because it looks as if the dy in the numerator and the denominator on the right-hand side cancel.
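The single-variable rule can also be checked numerically. The following is a minimal sketch, not a proof: the particular functions `f` and `g`, the evaluation point `x0`, and the helper `central_difference` are all assumptions chosen for illustration. It compares the product given by the chain rule against a finite-difference estimate of the derivative of the composite.

```python
import math


def g(x):
    """Inner function y = g(x) (illustrative choice)."""
    return x * x


def g_prime(x):
    """Derivative of g."""
    return 2.0 * x


def f(y):
    """Outer function z = f(y) (illustrative choice)."""
    return math.sin(y)


def f_prime(y):
    """Derivative of f."""
    return math.cos(y)


def chain_rule_derivative(x):
    """Evaluate (f o g)'(x) = f'(g(x)) * g'(x)."""
    return f_prime(g(x)) * g_prime(x)


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x0 = 1.3  # arbitrary evaluation point
analytic = chain_rule_derivative(x0)
numeric = central_difference(lambda x: f(g(x)), x0)

# The two values should agree up to the finite-difference error.
print(abs(analytic - numeric) < 1e-6)
```

The two values differ only by the error of the difference quotient, which depends on the step size `h`.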

## Multivariable calculus

The extension of the chain rule to multivariable functions may be achieved by considering the derivative as a linear approximation to a differentiable function.

Now let ${\displaystyle F:\mathbf {R} ^{n}\rightarrow \mathbf {R} ^{m}}$ and ${\displaystyle G:\mathbf {R} ^{m}\rightarrow \mathbf {R} ^{p}}$ be functions with F having derivative ${\displaystyle \mathrm {D} F}$ at ${\displaystyle a\in \mathbf {R} ^{n}}$ and G having derivative ${\displaystyle \mathrm {D} G}$ at ${\displaystyle F(a)\in \mathbf {R} ^{m}}$. Thus ${\displaystyle \mathrm {D} F}$ is a linear map ${\displaystyle \mathbf {R} ^{n}\rightarrow \mathbf {R} ^{m}}$ and ${\displaystyle \mathrm {D} G}$ is a linear map ${\displaystyle \mathbf {R} ^{m}\rightarrow \mathbf {R} ^{p}}$. Since F is applied first, the composite is ${\displaystyle G\circ F:\mathbf {R} ^{n}\rightarrow \mathbf {R} ^{p}}$, and it is differentiable at ${\displaystyle a\in \mathbf {R} ^{n}}$ with derivative

${\displaystyle \mathrm {D} (G\circ F)=\mathrm {D} G\circ \mathrm {D} F.\,}$
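In coordinates, a composition of linear maps is a product of Jacobian matrices: when F is applied first and G second, the composite is x ↦ G(F(x)), and its Jacobian at a is the Jacobian of G at F(a) multiplied by the Jacobian of F at a. The sketch below checks this numerically in plain Python. The maps `F` and `G`, the point `a`, and the helpers `matmul` and `numerical_jacobian` are hypothetical choices made only for this illustration.

```python
import math


def F(x):
    """Illustrative map F: R^2 -> R^2."""
    x1, x2 = x
    return [x1 * x2, x1 + x2 ** 2]


def DF(x):
    """Jacobian of F at x (2 x 2 matrix as nested lists)."""
    x1, x2 = x
    return [[x2, x1],
            [1.0, 2.0 * x2]]


def G(y):
    """Illustrative map G: R^2 -> R^1."""
    u, v = y
    return [math.sin(u) + u * v]


def DG(y):
    """Jacobian of G at y (1 x 2 matrix)."""
    u, v = y
    return [[math.cos(u) + v, u]]


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def numerical_jacobian(func, x, h=1e-6):
    """Approximate the Jacobian of func at x by central differences."""
    m = len(func(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J


a = [0.7, -1.2]  # arbitrary evaluation point

# Chain rule: Jacobian of G at F(a) times Jacobian of F at a.
chain = matmul(DG(F(a)), DF(a))

# Direct finite-difference Jacobian of the composite x -> G(F(x)).
numeric = numerical_jacobian(lambda x: G(F(x)), a)

max_err = max(abs(chain[i][j] - numeric[i][j])
              for i in range(len(chain))
              for j in range(len(chain[0])))
print(max_err < 1e-6)
```

Note the order of the factors: the matrix shapes (here 1 × 2 times 2 × 2) only match when the Jacobian of the outer map stands on the left.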