The Chain Rule
The theorems in Section \secref{1.7} were concerned with finding the derivatives of functions that were constructed from other functions using the algebraic operations of addition, multiplication by a constant, multiplication, and division. In this section we shall derive a similar formula, called the Chain Rule, for the derivative of the composition [math]f(g)[/math] of a differentiable function [math]g[/math] with a differentiable function [math]f[/math]. Before giving the theorem, we remark that an alternative way of writing the definition of the derivative of a function [math]f[/math] is
[math]f'(a) = \lim_{x \rightarrow a} \frac{f(x) - f(a)}{x - a}. \qquad (1)[/math]
The substitution [math]x = a + t[/math] will transform (1) into the expression that we have heretofore used for the derivative. An equation equivalent to (1) is
[math]\lim_{x \rightarrow a} \left[ \frac{f(x) - f(a)}{x - a} - f'(a) \right] = 0. \qquad (2)[/math]
We next define a function [math]r[/math] (dependent on both [math]f[/math] and [math]a[/math]) by
[math]r(x) = \begin{cases} \frac{f(x) - f(a)}{x - a} - f'(a), & \text{if } x \neq a \text{ and } x \text{ is in the domain of } f, \\ 0, & \text{if } x = a. \end{cases}[/math]
Note that the two functions [math]f[/math] and [math]r[/math] have the same domain. Furthermore, as a result of (2), we have
[math]\lim_{x \rightarrow a} r(x) = 0 = r(a),[/math]
i.e., the function [math]r[/math] is continuous at [math]a[/math]. From the definition of [math]r[/math], we obtain the equation
[math]f(x) - f(a) = (x - a)[f'(a) + r(x)], \qquad (3)[/math]
which is true for every [math]x[/math] in the domain of [math]f[/math]. We now prove:
If [math]f[/math] and [math]g[/math] are differentiable functions, then so is the composite function [math]f(g)[/math]. Moreover, [math][f(g)]' = f'(g)g'[/math].
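Let [math]a[/math] be a number in the domain of [math]g[/math] such that [math]g(a)[/math] is in the domain of [math]f[/math]. By definition
[math][f(g)]'(a) = \lim_{x \rightarrow a} \frac{f(g(x)) - f(g(a))}{x - a}.[/math]
Set [math]b = g(a)[/math], and let [math]r[/math] be the function constructed from [math]f[/math] and the point [math]b[/math] as above, so that [math]f(y) - f(b) = (y - b)[f'(b) + r(y)][/math] for every [math]y[/math] in the domain of [math]f[/math], where [math]r[/math] is continuous at [math]b[/math] and [math]r(b) = 0[/math]. Substituting [math]y = g(x)[/math] and dividing by [math]x - a[/math], we get
[math]\frac{f(g(x)) - f(g(a))}{x - a} = \frac{g(x) - g(a)}{x - a} [f'(g(a)) + r(g(x))].[/math]
Since [math]g[/math] is differentiable at [math]a[/math], it is continuous there, and so [math]\lim_{x \rightarrow a} r(g(x)) = r(g(a)) = 0[/math]. Hence
[math][f(g)]'(a) = \lim_{x \rightarrow a} \frac{g(x) - g(a)}{x - a} \cdot \lim_{x \rightarrow a} [f'(g(a)) + r(g(x))] = g'(a) f'(g(a)),[/math]
which is the value of [math]f'(g)g'[/math] at [math]a[/math]. This completes the proof.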
Example
\label{exam 1.8.1} If [math]F(x) = (x^2 + 2)^3[/math], compute [math]F'(x)[/math]. One way to do this problem is to expand [math](x^2 + 2)^3[/math] and use the differentiation formulas developed in Section \secref{1.7}:
[math]F(x) = x^6 + 6x^4 + 12x^2 + 8, \qquad F'(x) = 6x^5 + 24x^3 + 24x.[/math]
Another method uses the Chain Rule. Let [math]g[/math] and [math]f[/math] be the functions defined, respectively, by [math]g(x) = x^2 + 2[/math] and [math]f(y) = y^3[/math]. Then
[math]f(g(x)) = (x^2 + 2)^3 = F(x),[/math]
and, according to the Chain Rule,
[math]F'(x) = [f(g)]'(x) = f'(g(x))g'(x).[/math]
Since [math]g'(x) = 2x[/math] and [math]f'(y) = 3y^2[/math], we get [math]f'(g(x)) = 3(x^2 + 2)^2[/math] and
[math]F'(x) = 3(x^2 + 2)^2 (2x) = 6x(x^4 + 4x^2 + 4) = 6x^5 + 24x^3 + 24x,[/math]
which agrees with the alternative solution above.
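The agreement between the two methods can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper names (`central_difference`, `F_prime_chain`, `F_prime_expanded`) are hypothetical, and the central difference only approximates the derivative.

```python
def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric difference quotient (an estimate only)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def F(x):
    return (x**2 + 2) ** 3


def F_prime_chain(x):
    # Chain Rule: f'(g(x)) * g'(x) with g(x) = x^2 + 2 and f(y) = y^3
    return 3 * (x**2 + 2) ** 2 * (2 * x)


def F_prime_expanded(x):
    # Term-by-term derivative of the expansion x^6 + 6x^4 + 12x^2 + 8
    return 6 * x**5 + 24 * x**3 + 24 * x


for x in (-1.5, 0.0, 0.5, 2.0):
    # The three columns should agree up to the error of the numerical estimate.
    print(x, central_difference(F, x), F_prime_chain(x), F_prime_expanded(x))
```

The two closed forms agree as polynomials, while the difference quotient matches them only approximately.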
Example
\label{exam 1.8.2} Find the derivative of the function [math](3x^7 + 2x)^{128}[/math]. In principle, we could expand by the binomial theorem, but with the Chain Rule at our disposal that would be absurd. Let [math]g(x) = 3x^7 + 2x[/math] and [math]f(y) = y^{128}[/math]. Then [math]g'(x) = 21x^6 + 2[/math] and [math]f'(y) = 128y^{127}[/math]. Setting [math]y = 3x^7 + 2x[/math], we get
[math][(3x^7 + 2x)^{128}]' = f'(g(x))g'(x) = 128(3x^7 + 2x)^{127}(21x^6 + 2).[/math]
The above two examples are instances of the following corollary of the Chain Rule:
If [math]f[/math] is a differentiable function and [math]n[/math] is an integer, then
[math](f^n)' = nf^{n-1}f'.[/math]
To prove it, let [math]F(y) = y^n[/math]. Then [math]F(f) = f^n[/math], and we know that [math]F'(y) = ny^{n-1}[/math]. Consequently, [math](f^n)' = [F(f)]' = F'(f) f' = nf^{n-1}f'[/math]. A significant generalization of this result is
If [math]f[/math] is a positive differentiable function and [math]r[/math] is any rational number, then [math](f^r)' = rf^{r-1}f'[/math]. The requirement that [math]f[/math] is positive assures that [math]f^r[/math] is defined. A nonpositive number cannot be raised to an arbitrary rational power. However, as we shall show later, the requirement that [math]r[/math] be a rational number is unnecessary. The theorem is actually true for any real number [math]r[/math].
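Let [math]r = \frac{m}{n}[/math], where [math]m[/math] and [math]n[/math] are integers, and set [math]h = f^r = f^{m/n}[/math]. Then [math]h^n = (f^{m/n})^n = f^m[/math], which implies that [math](h^n)' = (f^m)'[/math]. Using the above formula for the derivative of an integral power of a function, we get
[math]nh^{n-1}h' = mf^{m-1}f'.[/math]
Solving for [math]h'[/math] and substituting [math]h = f^{m/n}[/math], we obtain
[math]h' = \frac{m}{n} \frac{f^{m-1}}{h^{n-1}} f' = \frac{m}{n} f^{m-1} f^{-m + m/n} f' = \frac{m}{n} f^{m/n - 1} f' = rf^{r-1}f'.[/math]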
This completes the proof---almost.
Note that we have in the argument tacitly assumed that [math]h[/math],
the function whose derivative we are seeking, is differentiable.
Is it?
If it is, how do we know it?
The answer to the first question is yes,
but the answer to the second is not so easy.
The problem can be reduced to a simpler one:
If [math]n[/math] is a positive integer
and [math]g[/math] is the function defined by
[math]g(x) = x^{1/n}[/math], for [math]x \gt 0[/math],
then [math]g[/math] is differentiable.
If we know this fact,
we are out of the difficulty
because the Chain Rule tells us that
the composition of two differentiable functions is differentiable.
Hence [math]g(f)[/math] is differentiable, and [math]g(f) = f^{1/n}[/math].
From this it follows that [math](f^{1/n})^m[/math] is differentiable,
and [math](f^{1/n})^m = f^{m/n}[/math].
(When we express [math]r[/math] as a ratio [math]\frac{m}{n}[/math],
we can certainly take [math]n[/math] to be positive.)
A proof that [math]x^{1/n}[/math] is differentiable, if [math]x \gt 0[/math],
is most easily given as an application of the
Inverse Function Theorem (see Chapter \ref{chp 5}).
However, the intuitive reason is simple:
If [math]y = x^{1/n}[/math] and [math]x \gt 0[/math],
then [math]y^n = x[/math],
and by interchanging [math]x[/math] and [math]y[/math] we obtain the equation [math]x^n = y[/math].
The latter equation defines a smooth curve
whose slope at every point is given by the derivative
[math]\frac{dy}{dx} = nx^{n-1}[/math].
Interchanging [math]x[/math] and [math]y[/math] amounts geometrically to
a reflection about the line [math]y = x[/math].
We conclude that the original curve [math]y = x^{1/n}, x \gt 0[/math],
has the same intrinsic shape and smoothness
as that defined by [math]y = x^n, y \gt 0[/math].
It therefore must have a tangent line at every point,
which means that [math]x^{1/n}[/math] is differentiable.
Example
\label{exam 1.8.3} If [math]y = x^{1/n}[/math], then
[math]\frac{dy}{dx} = \frac{1}{n} x^{(1/n) - 1}.[/math]
Example
\label{exam 1.8.4} Find the derivative of the function [math]F(x) = (3x^2 + 5x + 1)^{5/3}[/math]. If we let [math]f(x) = 3x^2 + 5x + 1[/math], then Theorem (8.2) implies that
[math]F'(x) = \frac{5}{3} f(x)^{2/3} f'(x) = \frac{5}{3}(3x^2 + 5x + 1)^{2/3}(6x + 5).[/math]
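As a rough check of the rational power rule [math](f^r)' = rf^{r-1}f'[/math] on this example, one can compare it with a difference quotient at points where [math]f[/math] is positive. This is only a sketch: the helper names and sample points are hypothetical choices, not from the original text.

```python
def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient (an estimate only)."""
    return (func(x + h) - func(x - h)) / (2 * h)


R = 5 / 3  # the rational exponent r = 5/3, stored as a float


def f(x):
    return 3 * x**2 + 5 * x + 1


def f_prime(x):
    return 6 * x + 5


def F(x):
    return f(x) ** R


def F_prime_rule(x):
    # Rational power rule: (f^r)' = r * f^(r-1) * f'
    return R * f(x) ** (R - 1) * f_prime(x)


for x in (0.0, 1.0, 2.5):
    assert f(x) > 0  # the rule is stated for positive f only
    print(x, central_difference(F, x), F_prime_rule(x))
```

The sample points are all chosen where [math]3x^2 + 5x + 1 \gt 0[/math], since the theorem assumes a positive function.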
With the [math]\frac{d}{dx}[/math] notation for the derivative,
the Chain Rule can be written in a form that is impossible to forget.
Let [math]f[/math] and [math]g[/math] be two differentiable functions.
The formation of the composite function [math]f(g)[/math]
is suggested by writing [math]u = g(x)[/math] and [math]y = f(u)[/math].
Thus [math]x[/math] is transformed by [math]g[/math] into [math]u[/math],
and the resulting [math]u[/math] is then transformed by
[math]f[/math] into [math]y = f(u) = f(g(x))[/math].
We have
[math]\frac{du}{dx} = g'(x), \qquad \frac{dy}{du} = f'(u), \qquad \frac{dy}{dx} = [f(g(x))]'.[/math]
By the Chain Rule, [math][f(g(x))]' = f'(g(x))g'(x) = f'(u)g'(x)[/math], and so
[math]\frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}.[/math]
The idea that one can simply cancel out [math]du[/math] in this formula is very appealing and accounts for the popularity of the notation. It is important to realize that the cancellation is valid because the Chain Rule is true, and not vice versa. Thus far, [math]du[/math] is simply a part of the notation for the derivative and means nothing by itself. Note also that the formula is incomplete in the sense that it does not say explicitly at what points to evaluate the derivatives. We can add this information by writing
[math]\frac{dy}{dx}(a) = \frac{dy}{du}(g(a)) \frac{du}{dx}(a).[/math]
Example
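\label{exam 1.8.5} If [math]w = z^2 + 2z + 3[/math] and [math]z = \frac{1}{x}[/math], find [math]\frac{dw}{dx}(2)[/math]. By the Chain Rule,
[math]\frac{dw}{dx} = \frac{dw}{dz} \frac{dz}{dx} = (2z + 2)\left(-\frac{1}{x^2}\right).[/math]
When [math]x = 2[/math], we have [math]z = \frac {1}{2}[/math]. Hence
[math]\frac{dw}{dx}(2) = \left(2 \cdot \frac{1}{2} + 2\right)\left(-\frac{1}{4}\right) = -\frac{3}{4}.[/math]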
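The point of evaluation matters here: [math]\frac{dw}{dz}[/math] is taken at [math]z = \frac{1}{2}[/math], while [math]\frac{dz}{dx}[/math] is taken at [math]x = 2[/math]. A small sketch in exact rational arithmetic makes this explicit. The function names are hypothetical, and the comparison formula comes from substituting [math]z = \frac{1}{x}[/math] directly, which is an added check rather than part of the original example.

```python
from fractions import Fraction


def z_of_x(x):
    return 1 / x  # z = 1/x


def dw_dz(z):
    return 2 * z + 2  # derivative of w = z^2 + 2z + 3 with respect to z


def dz_dx(x):
    return -1 / x**2  # derivative of z = 1/x with respect to x


def dw_dx(x):
    # Chain Rule: dw/dz evaluated at z = z(x), times dz/dx evaluated at x
    return dw_dz(z_of_x(x)) * dz_dx(x)


def dw_dx_direct(x):
    # Substituting first gives w = 1/x^2 + 2/x + 3, differentiated term by term
    return -2 / x**3 - 2 / x**2


x = Fraction(2)
print(dw_dx(x), dw_dx_direct(x))  # both print -3/4
```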
Example
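\label{exam 1.8.6} Two functions, which we shall define in Chapter \ref{chp 11}, are the hyperbolic sine and the hyperbolic cosine, denoted by [math]\sinh x[/math] and [math]\cosh x[/math] respectively. These functions are differentiable and have the interesting property that
[math]\frac{d}{dx} \sinh x = \cosh x, \qquad \frac{d}{dx} \cosh x = \sinh x.[/math]
Furthermore, [math]\sinh (0) = 0[/math] and [math]\cosh (0) = 1[/math]. Compute the derivatives at [math]x= 0[/math] of (a) [math](\cosh x)^2[/math], (b) the composite function [math]\sinh (\sinh x)[/math]. By the formula [math](f^n)' = nf^{n-1}f'[/math], we obtain for (a)
[math]\frac{d}{dx} (\cosh x)^2 = 2 \cosh x \frac{d}{dx} \cosh x = 2 \cosh x \sinh x,[/math]
and so
[math]\frac{d}{dx} (\cosh x)^2 (0) = 2 \cosh (0) \sinh (0) = 2 \cdot 1 \cdot 0 = 0.[/math]
Part (b) requires the full force of the Chain Rule: Setting [math]u = \sinh x[/math], we obtain
[math]\frac{d}{dx} \sinh u = \frac{d}{du} \sinh u \frac{du}{dx} = \cosh u \cosh x,[/math]
or
[math]\frac{d}{dx} \sinh (\sinh x) = \cosh (\sinh x) \cosh x.[/math]
Hence
[math]\frac{d}{dx} \sinh (\sinh x)(0) = \cosh (\sinh (0)) \cosh (0) = \cosh (0) \cdot 1 = 1.[/math]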
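Both parts can be checked numerically with the standard library's `math.sinh` and `math.cosh`. The sketch below assumes, as in the example, that the derivative of [math]\sinh[/math] is [math]\cosh[/math] and the derivative of [math]\cosh[/math] is [math]\sinh[/math]. The helper names are hypothetical, and the difference quotient is only an approximation.

```python
import math


def central_difference(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient (an estimate only)."""
    return (func(x + h) - func(x - h)) / (2 * h)


def part_a(x):
    return math.cosh(x) ** 2


def part_a_prime(x):
    # Power corollary: (f^2)' = 2 f f' with f = cosh and f' = sinh
    return 2 * math.cosh(x) * math.sinh(x)


def part_b(x):
    return math.sinh(math.sinh(x))


def part_b_prime(x):
    # Chain Rule with u = sinh x: cosh(u) * cosh(x)
    return math.cosh(math.sinh(x)) * math.cosh(x)


# Closed-form values at x = 0 next to the numerical estimates
print(part_a_prime(0.0), central_difference(part_a, 0.0))
print(part_b_prime(0.0), central_difference(part_b, 0.0))
```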
General references
Doyle, Peter G. (2008). "Crowell and Slesnick's Calculus with Analytic Geometry" (PDF). Retrieved Oct 29, 2024.