
Linear Differential Operators

[math] \newcommand{\ex}[1]{\item } \newcommand{\sx}{\item} \newcommand{\x}{\sx} \newcommand{\sxlab}[1]{} \newcommand{\xlab}{\sxlab} \newcommand{\prov}[1] {\quad #1} \newcommand{\provx}[1] {\quad \mbox{#1}} \newcommand{\intext}[1]{\quad \mbox{#1} \quad} \newcommand{\R}{\mathrm{\bf R}} \newcommand{\Q}{\mathrm{\bf Q}} \newcommand{\Z}{\mathrm{\bf Z}} \newcommand{\C}{\mathrm{\bf C}} \newcommand{\dt}{\textbf} \newcommand{\goesto}{\rightarrow} \newcommand{\ddxof}[1]{\frac{d #1}{d x}} \newcommand{\ddx}{\frac{d}{dx}} \newcommand{\ddt}{\frac{d}{dt}} \newcommand{\dydx}{\ddxof y} \newcommand{\nxder}[3]{\frac{d^{#1}{#2}}{d{#3}^{#1}}} \newcommand{\deriv}[2]{\frac{d^{#1}{#2}}{dx^{#1}}} \newcommand{\dist}{\mathrm{distance}} \newcommand{\arccot}{\mathrm{arccot\:}} \newcommand{\arccsc}{\mathrm{arccsc\:}} \newcommand{\arcsec}{\mathrm{arcsec\:}} \newcommand{\arctanh}{\mathrm{arctanh\:}} \newcommand{\arcsinh}{\mathrm{arcsinh\:}} \newcommand{\arccosh}{\mathrm{arccosh\:}} \newcommand{\sech}{\mathrm{sech\:}} \newcommand{\csch}{\mathrm{csch\:}} \newcommand{\conj}[1]{\overline{#1}} \newcommand{\mathds}{\mathbb} [/math]

This section is divided into three parts. In the first, we shall systematically develop and extend the differential operators [math]D^2 + aD + b[/math] which were introduced in Section 1. In the second part we shall use these operators to obtain directly the general solutions of certain linear differential equations with constant coefficients. Finally, we shall show how these methods can be used to solve any linear differential equation with constant coefficients (whether homogeneous or not) provided we extend our range of functions to include those whose values may be complex numbers. By a linear operator we shall mean any function [math]L[/math] whose domain and range are sets of numerical-valued functions and which satisfies the equations

[[math]] \begin{equation} L(y_1 + y_2) = L(y_1) + L(y_2), \label{eq11.3.1} \end{equation} [[/math]]

[[math]] \begin{equation} L(ky) = kL(y), \label{eq11.3.2} \end{equation} [[/math]]


for every real number [math]k[/math] and every [math]y, y_1[/math], and [math]y_2[/math] in the domain of [math]L[/math]. [The function [math]L(y)[/math] is frequently written simply [math]Ly[/math].] An important example is the function [math]D[/math], which, to every differentiable function [math]y[/math], assigns its derivative [math]Dy = \frac{dy}{dx}[/math]. Another example is the operation of multiplication by a real number. That is, for any real number [math]a[/math], the function [math]L[/math] defined by

[[math]] Ly= ay [[/math]]

obviously satisfies (1) and (2) and hence is a linear operator. If [math]L_1[/math] and [math]L_2[/math] are linear operators, then their sum is the function [math]L_1 + L_2[/math] defined by

[[math]] \begin{equation} (L_1 + L_2)y = L_1y + L_2y, \label{eq11.3.3} \end{equation} [[/math]]

for every [math]y[/math] which is in the domains of both [math]L_1[/math] and [math]L_2[/math]. It is easy to show that

Theorem

If [math]L_1[/math] and [math]L_2[/math] are linear operators, then the sum [math]L_1 + L_2[/math] is also a linear operator.


Show Proof

We shall show that [math]L_1 + L_2[/math] satisfies equation (1) by using successively the definition of [math]L_1 + L_2[/math], the linearity of [math]L_1[/math] and [math]L_2[/math] separately, the commutative law of addition for functions, and finally the definition again. Thus

[[math]] \begin{eqnarray*} (L_1 + L_2)(y_1 + y_2) &=& L_1(y_1 + y_2) + L_2(y_1 + y_2)\\ &=& L_1y_1 + L_1y_2 + L_2y_1 + L_2y_2 \\ &=& (L_1y_1 + L_2y_1) + (L_1y_2 + L_2y_2)\\ &=& (L_1 + L_2)y_1 + (L_1 + L_2)y_2. \end{eqnarray*} [[/math]]


The proof that [math]L_1 + L_2[/math] satisfies (2) is similar:

[[math]] \begin{eqnarray*} (L_1 + L_2)(ky) &=& L_1(ky) + L_2(ky) \\ &=& kL_1y + kL_2y \\ &=& k(L_1y + L_2y)\\ &=& k(L_1 + L_2)y , \end{eqnarray*} [[/math]]
and this completes the proof.
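The linearity equations (1) and (2) are also easy to confirm symbolically. The following is a minimal Python (sympy) sketch, not part of the original text; the sample functions and the constant [math]a[/math] are arbitrary choices made for illustration.

```python
# A minimal sympy sketch (not from the text): check equations (1) and (2)
# for the sum of the operator D and multiplication by a constant a.
import sympy as sp

x, k, a = sp.symbols('x k a')
y1, y2 = sp.sin(x), sp.exp(2*x)   # arbitrary differentiable sample functions

D = lambda y: sp.diff(y, x)       # the differentiation operator D
M = lambda y: a*y                 # multiplication by the number a
S = lambda y: D(y) + M(y)         # the sum D + a

assert sp.simplify(S(y1 + y2) - (S(y1) + S(y2))) == 0   # equation (1)
assert sp.simplify(S(k*y1) - k*S(y1)) == 0              # equation (2)
```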

If [math]L_1[/math] and [math]L_2[/math] are linear operators, then the composition of [math]L_2[/math], followed by [math]L_1[/math] is the function denoted by [math]L_1L_2[/math] and defined by

[[math]] \begin{equation} (L_1L_2)y = L_1(L_2y), \label{eq11.3.4} \end{equation} [[/math]]

for every [math]y[/math] for which the right side is defined. The proof of the following proposition is entirely analogous to that of (3.1) and is left to the reader as an exercise.

Theorem

If [math]L_1[/math] and [math]L_2[/math] are linear operators, then the composition [math]L_1L_2[/math] is also a linear operator.

The composition [math]L_1L_2[/math] is also called the product of [math]L_1[/math] and [math]L_2[/math]. There is no reason to suppose from the definition that the commutative law of multiplication holds, and, for linear operators in general, [math]L_1L_2 \neq L_2L_1[/math] (an example is sketched after the proof below). However, the distributive laws hold:

Theorem


[[math]] \left \{ \begin{array}{l} L_1(L_2 + L_3) = L_1L_2 + L_1L_3,\\ (L_1 + L_2)L_3 = L_1L_3 + L_2L_3. \end{array} \right . [[/math]]


Show Proof

The first of these is proved as follows:

[[math]] \begin{eqnarray*} (L_1(L_2 + L_3))y &=& L_1((L_2 + L_3)y) \\ &=& L_1(L_2y + L_3y) \\ &=& L_1(L_2y) + L_1(L_3y)\\ &=& (L_1L_2)y + (L_1L_3)y \\ &=& (L_1L_2 + L_1L_3)y. \end{eqnarray*} [[/math]]


The proof of the second is similar and is left as an exercise.
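As for the failure of the commutative law noted above, the text gives no example, but a standard one (our choice, not the book's) is [math]L_1 = D[/math] and [math]L_2 = [/math] multiplication by the function [math]x[/math], which also satisfies (1) and (2). The sympy sketch below shows that the two products differ.

```python
# A sympy sketch of noncommutativity: differentiation followed by
# multiplication by x, versus the reverse order. Example ours, not the text's.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')(x)

D = lambda f: sp.diff(f, x)   # differentiation
X = lambda f: x*f             # multiplication by the function x (also linear)

print(sp.simplify(D(X(y)) - X(D(y))))   # prints y(x): (DX - XD)y = y, not 0
```

In fact [math]DX - XD[/math] is the identity operator, so here [math]L_1L_2 \neq L_2L_1[/math].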

An important example of the product of linear operators is the composition of a linear operator [math]L[/math] followed by the operation of multiplication by a real number [math]a[/math]. This product, denoted [math]aL[/math], assigns to every [math]y[/math] in the domain of [math]L[/math] the value [math](aL)y[/math], which is equal to the product of [math]a[/math] with the function [math]Ly[/math]. That is,

Theorem

[[math]](aL)y = a(Ly).[[/math]]

The composition in the other order is the product [math]La[/math]. Here we have [math](La)y = L(ay)[/math], and the latter quantity, by the linearity of [math]L[/math], is equal to [math]a(Ly)[/math]. Combining this with (3.4), we obtain the equation [math](La)y = (aL)y[/math]. Thus the operators [math]La[/math] and [math]aL[/math] are equal, and we have proved the following special case of the commutative law:

Theorem

[[math]]aL = La.[[/math]]

Another example of the product, already encountered, is the operator [math]D^2[/math], which is the composition [math]D^2 = DD[/math] of [math]D[/math] with itself. More generally, for every integer [math]n \gt 1[/math], we define the operator [math]D^n[/math] inductively by

[[math]] D^n = DD^{n-1}. [[/math]]

The domain of [math]D^n[/math] is the set of all [math]n[/math]-times differentiable functions, and, for each such function [math]y[/math], we have

[[math]] D^ny = \frac{d^ny}{dx^n} . [[/math]]

By repeated applications of (3.1) and (3.2), we may conclude that any function formed in a finite number of steps by taking sums and products of linear operators is itself a linear operator. As an example, consider a polynomial [math]p(t)[/math] of degree [math]n[/math]; i.e.,

[[math]] p(t) = a_n t^n + a_{n-1} t^{n-1} + \cdots + a_1t + a_0, [[/math]]

where [math]a_0, . . ., a_n[/math] are real numbers and [math]a_n \neq 0[/math]. Then the function

[[math]] p(D) = a_nD^n + a_{n-1} D^{n-1} + \cdots + a_1D + a_0 [[/math]]

is a linear operator. To every [math]n[/math]-times differentiable function [math]y[/math], it assigns as value the function

[[math]] \begin{eqnarray*} p(D)y &=& a_nD^n y + a_{n-1}D^{n-1}y + \cdots + a_1Dy + a_0y \\ &=& a_n \frac{d^ny}{dx^n} + a_{n-1}\frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_1 \frac{dy}{dx} + a_0y . \end{eqnarray*} [[/math]]
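As a concrete check (ours, using [math]p(t) = t^2 - 2t - 3[/math], the polynomial that appears later in this section), the sympy sketch below applies [math]p(D)[/math] to sample functions.

```python
# Applying p(D) = D^2 - 2D - 3 to sample functions with sympy; a sketch,
# not part of the text. diff(y, x, m) computes the m-th derivative.
import sympy as sp

x = sp.symbols('x')

def p_of_D(y, coeffs=(1, -2, -3)):
    """Apply a_n D^n + ... + a_1 D + a_0 to y, with coeffs = (a_n, ..., a_0)."""
    n = len(coeffs) - 1
    return sum(a*sp.diff(y, x, n - i) for i, a in enumerate(coeffs))

print(sp.simplify(p_of_D(sp.exp(3*x))))   # 0: e^{3x} is annihilated by p(D)
print(sp.simplify(p_of_D(sp.exp(x))))     # -4*exp(x)
```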

We call [math]p(D)[/math] a linear differential operator of order [math]n[/math]. It is the natural generalization of the differential operators of order 2, of the form [math]D^2 + aD + b[/math], which were discussed in Section 1. [Linear differential operators of types more general than [math]p(D)[/math] certainly exist; e.g., see Problem 9. They are of importance in more advanced treatments of differential equations, but we shall not study them here.] The polynomial differential operators [math]p(D)[/math] can be added and multiplied just like ordinary polynomials. In particular, the following theorem follows from the distributive laws (3.3) and the commutative law (3.5):

Theorem

If [math]p(t)[/math] and [math]q(t)[/math] are polynomials and if [math]p(t)q(t) = r(t)[/math], then

[[math]] p(D)q(D)= r(D) . [[/math]]

As an illustration, observe how (3.3) and (3.5) are used to prove the special case of this theorem in which [math]p(t) = at + b[/math] and [math]q(t) = ct + d[/math]. First of all, we have

[[math]] \begin{eqnarray*} r(t) = p(t)q(t) &=& (at + b)(ct + d) \\ &=& act^2 + bct + adt + bd. \end{eqnarray*} [[/math]]

Then

[[math]] \begin{eqnarray*} p(D)q(D) &=& (aD + b)(cD + d) \\ &=& (aD + b)cD + (aD + b)d \\ &=& aDcD + bcD + aDd + bd \\ &=& acD^2 + bcD + adD + bd \\ &=& r(D) . \end{eqnarray*} [[/math]]

The proof is the same in principle for arbitrary polynomials [math]p(t)[/math] and [math]q(t)[/math]. It is a corollary of (3.6) that polynomial differential operators satisfy the commutative law of multiplication. Thus

Theorem

[[math]]p(D)q(D) = q(D)p(D).[[/math]]

For, since [math]p(t)q(t) = q(t)p(t) = r(t)[/math], both sides of (3.7) are equal to [math]r(D)[/math].
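These identities are easy to test symbolically. The following sketch (ours, not the book's) applies [math](aD + b)(cD + d)[/math], [math]r(D)[/math], and [math](cD + d)(aD + b)[/math] to an arbitrary test function and confirms that all three agree.

```python
# Checking (3.6) and (3.7) on the illustration above; a sketch, not from
# the text. The test function is an arbitrary smooth choice.
import sympy as sp

x, a, b, c, d = sp.symbols('x a b c d')
y = sp.sin(x)*sp.exp(x)

p = lambda f: a*sp.diff(f, x) + b*f                      # p(D) = aD + b
q = lambda f: c*sp.diff(f, x) + d*f                      # q(D) = cD + d
r = lambda f: (a*c*sp.diff(f, x, 2)
               + (b*c + a*d)*sp.diff(f, x) + b*d*f)      # r(D)

assert sp.simplify(p(q(y)) - r(y)) == 0    # p(D)q(D) = r(D)
assert sp.simplify(q(p(y)) - r(y)) == 0    # q(D)p(D) = r(D)
```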

We begin the second part of the section by considering the differential equation

[[math]] \frac{d^2y}{dx^2} - 2 \frac{dy}{dx} - 3y = e^{-x}, [[/math]]

which, with the notation of differential operators, can be written

[[math]] \begin{equation} (D^2 - 2D - 3)y= e^{-x}. \label{eq11.3.5} \end{equation} [[/math]]

We have thus far defined the characteristic equation only for homogeneous, second-order, linear differential equations with constant coefficients. The generalization to nonhomogeneous and higher-order equations is: For any polynomial [math]p(t)[/math] and function [math]F(x)[/math], the characteristic equation of the differential equation

[[math]] p(D)y = F(x) [[/math]]

is the equation [math]p(t) = 0[/math], and the polynomial [math]p(t)[/math] is its characteristic polynomial. Returning to (5), we see that the characteristic polynomial, which is [math]t^2 - 2t - 3[/math], factors into the product [math](t - 3)(t + 1)[/math]. It follows from (3.6) that [math]D^2 - 2D - 3 = (D - 3)(D + 1)[/math], and (5) can therefore be written

[[math]] (D - 3)(D + 1)y = e^{-x}. [[/math]]

Let us define the function [math]u[/math] by setting [math](D + 1)y = u[/math]. Then (5) becomes equivalent to the pair of first-order linear equations

[[math]] \left \{ \begin{array}{l} (D - 3)u = e^{-x} , \mbox{ (6)} \\ (D+ 1)y = u. \mbox{ (7)} \end{array} \right . [[/math]]

To solve (6), we use the technique developed in Section 2. For this equation, [math]P(x) = -3[/math] and [math]Q(x) = e^{-x}[/math]. Hence an integrating factor is [math]e^{\int P(x)dx} = e^{-3x} [/math], and therefore

[[math]] \frac{d}{dx} (e^{-3x}u)= e ^{-3x} e^{-x} = e ^{-4x}. [[/math]]

Integrating, we obtain

[[math]] e^{-3x} u = \int e^{-4x} dx + c_1 = - \frac{1}{4}e^{-4x} + c_1, [[/math]]

whence

[[math]] u = e^{3x}(-\frac{1}{4}e^{-4x} + c_1) = -\frac{1}{4}e^{-x} + c_1e^{3x}. [[/math]]

We now substitute this value for [math]u[/math] in equation (7) to obtain the first-order linear equation

[[math]] (D + 1)y = -\frac{1}{4}e^{-x} + c_1e^{3x} . [[/math]]

Here, [math]P(x) = 1[/math] and the integrating factor is [math]e^{x}[/math]. Accordingly, we have

[[math]] \begin{eqnarray*} \frac{d}{dx} (e^{x} y) &=& e^{x} ( -\frac{1}{4}e^{-x} + c_1e^{3x}) \\ &=& -\frac{1}{4} + c_1e^{4x}. \end{eqnarray*} [[/math]]

Integration yields

[[math]] e^x y = - \frac{1}{4} x + \frac{c_1}{4} e^{4x} + c_2. [[/math]]

Replacing [math]\frac{c_1}{4}[/math] by [math]c_1[/math], and multiplying both sides by [math]e^{-x}[/math], we get finally

[[math]] y = -\frac{1}{4}xe^{-x} + c_1e^{3x} + c_2e^{-x}. [[/math]]

This, where [math]c_1[/math] and [math]c_2[/math] are arbitrary real constants, is the general solution to the differential equation

[[math]] \frac{d^2y}{dx^2} - 2 \frac{dy}{dx} - 3y = e^{-x}. [[/math]]
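As an independent check (sympy's built-in solver, not the operator method of the text), dsolve recovers the same general solution up to a renaming of the constants.

```python
# An independent check of the worked example with sympy's dsolve;
# the constants C1, C2 correspond to c2, c1 in the text.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

eq = sp.Eq(y(x).diff(x, 2) - 2*y(x).diff(x) - 3*y(x), sp.exp(-x))
print(sp.dsolve(eq, y(x)))
# expected (up to the form of the constants):
# Eq(y(x), C2*exp(3*x) + (C1 - x/4)*exp(-x))
```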

This example illustrates the fact that we can in principle solve any second-order, linear differential equation with constant coefficients provided the characteristic polynomial is the product of linear factors. Thus, if we are given

[[math]] (D^2 + aD + b)y = F(x), [[/math]]

and if [math]t^2+ at + b = (t - r_1)(t - r_2)[/math], then the differential equation can be written

[[math]] (D - r_1)(D - r_2)y = F(x). [[/math]]

If [math]u[/math] is defined by setting [math](D - r_2)y= u[/math], then the original second-order equation is equivalent to the two first-order linear differential equations

[[math]] \left \{ \begin{array}{l} (D - r_1)u = F(x), \\ (D - r_2)y= u, \end{array} \right . [[/math]]

and these can be solved successively to find first [math]u[/math] and then [math]y[/math]. The same technique can be applied to higher-order equations. Consider an arbitrary polynomial

[[math]] p(t) = t^n + a_{n-1} t^{n-1} + \cdots + a_1t + a_0, [[/math]]

where [math]n \gt 1[/math] and [math]a_0, . . ., a_{n-1}[/math] are real constants. In addition, we assume that [math]p(t)[/math] is the product of linear factors; i.e.,

[[math]] p(t)= (t - r_1)(t - r_2) \cdots (t - r_n). [[/math]]

Let [math]F(x)[/math] be given and consider the differential equation

[[math]] \begin{equation} p(D)y = F(x), \label{eq11.3.8} \end{equation} [[/math]]

which is the same as

[[math]] \frac{d^ny}{dx^n} + a_{n-1} \frac{d^{n-1}y}{dx^{n-1}} + \cdots + a_1 \frac{dy}{dx} + a_0y = F(x) . [[/math]]

Since the factorization of [math]p(t)[/math] is assumed, the differential equation can also be written

[[math]] (D - r_1)(D - r_2) \cdots (D - r_n)y = F(x). [[/math]]

The functions [math]u_1, . . ., u_{n-1}[/math] are defined by

[[math]] \begin{eqnarray*} u_1 &=& (D - r_2) \cdots (D - r_n)y, \\ u_2 &=& (D - r_3) \cdots (D - r_n)y, \\ \vdots & & \\ u_{n-1} &=& (D - r_n)y . \end{eqnarray*} [[/math]]

Then (8) is equivalent to the following set of first-order linear differential equations

[[math]] \left \{ \begin{array}{l} (D - r_1)u_1 = F(x),\\ (D - r_2)u_2 = u_1, \\ \vdots \\ (D - r_n)y = u_{n-1}, \\ \end{array} \right . [[/math]]

which can be solved successively to finally obtain [math]y[/math].

In Section 4 of Chapter 7 use was made of the fact that any polynomial with real coefficients and degree at least 1 can be written as the product of linear and irreducible quadratic factors (see page 386). Suppose [math]ct^2 + dt + e[/math] is irreducible. This is equivalent to the assertion that the discriminant [math]d^2 - 4ce[/math] is negative. According to the quadratic formula, the two roots of the equation [math]ct^2 + dt + e = 0[/math] are equal to [math]r_1 = \alpha + i\beta[/math] and [math]r_2 = \alpha - i\beta[/math], where [math]\alpha = - \frac{d}{2c}[/math] and [math]\beta = \frac{\sqrt{4ce - d^2}}{2c}[/math]. By multiplying and substituting these values, one can then easily verify the equation

[[math]] c(t - r_1)(t - r_2) = ct^2 + dt + e. [[/math]]

Thus any irreducible quadratic polynomial with real coefficients is the product of two linear factors with complex coefficients. It follows that, for any polynomial

[[math]] p(t) = a_nt^n + a_{n-1}t^{n-1} + \cdots + a_1t + a_0, [[/math]]

with real coefficients [math]a_i[/math], [math]n \geq 1[/math] and [math]a_n \neq 0[/math], we have

[[math]] p(t) = a_n(t - r_1)(t - r_2) \cdots (t - r_n), [[/math]]

where roots which are complex occur in conjugate pairs. It is this fact which introduces the third part of this section. It is very natural to ask the following: If the class of possible solutions is enlarged to include complex-valued functions of a real variable, can we proceed to solve linear differential equations with constant coefficients just as before, but with the added knowledge that now the characteristic polynomial can always be factored into linear factors? The answer is yes! To justify this answer, we must of course know the definition of the derivative. Let [math]f[/math] be a function whose domain [math]Q[/math] is a subset of the real numbers and whose range is a subset of the complex numbers. Then two real-valued functions [math]f_1[/math] and [math]f_2[/math] with domain [math]Q[/math] are defined by

[[math]] \begin{eqnarray*} f_1(x) &=& \mbox{real part of}\; f(x), \\ f_2(x) &=& \mbox{imaginary part of}\; f(x). \end{eqnarray*} [[/math]]

That is, we have [math]f(x) = f_1(x) + if_2(x)[/math], for every [math]x[/math] in [math]Q[/math]. The derivative [math]f'[/math] is defined simply by the equation

[[math]] f'(x) = f'_1(x) + if'_2(x), [[/math]]

for every [math]x[/math] for which both [math]f'_1(x)[/math] and [math]f'_2(x)[/math] exist. Alternatively, if we write [math]y = f(x), u = f_1(x)[/math], and [math]v = f_2(x)[/math], then [math]y = u + iv[/math], and we also use the notations

[[math]] \begin{eqnarray*} f'(x) &=& \frac{dy}{dx} = \frac{du}{dx} + i \frac{dv}{dx} \\ &=& Dy = Du + i Dv. \end{eqnarray*} [[/math]]
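A quick sympy illustration of this definition (our example: [math]f(x) = e^{ix}[/math], so that [math]f_1(x) = \cos x[/math] and [math]f_2(x) = \sin x[/math]) confirms that differentiating the real and imaginary parts separately gives the expected [math]f'(x) = ie^{ix}[/math].

```python
# Differentiating a complex-valued function of a real variable by parts:
# f = cos x + i sin x = e^{ix}; f' = f1' + i f2' should equal i e^{ix}.
import sympy as sp

x = sp.symbols('x', real=True)
f1, f2 = sp.cos(x), sp.sin(x)        # real and imaginary parts of e^{ix}
f = f1 + sp.I*f2

fprime = sp.diff(f1, x) + sp.I*sp.diff(f2, x)
assert sp.expand(fprime - sp.I*f) == 0
```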

Logically, we must now go back and check that all the formal rules for differentiation and finding antiderivatives are still true for complex-valued functions, and the same applies to several theorems (see, for example, Problems 10 and 11). Much of this work is purely routine, and, to avoid an interruption of our study of differential equations, we shall omit it. It now follows, by factoring the operator [math]p(D)[/math] into linear factors, that any linear differential equation

[[math]] p(D)y= F(x) [[/math]]

with constant coefficients can be solved. That is, it can first be replaced by an equivalent set of first-order linear differential equations. For each of these an explicit integrating factor [math]e^{\int P(x)dx}[/math] exists, and by solving them successively, we can eventually obtain the general solution [math]y[/math].
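The whole procedure can be sketched in a few lines of sympy (a sketch of ours, not the book's notation). Constants of integration are dropped at each stage, so it yields one particular solution; the homogeneous terms [math]c_ke^{r_kx}[/math] can be added afterward.

```python
# A sketch of the iterative scheme: solve (D - r1)...(D - rn) y = F by
# applying the integrating factor e^{-r x} for one factor at a time.
# Integration constants are omitted, giving a particular solution only.
import sympy as sp

x = sp.symbols('x')

def solve_cascade(roots, F):
    u = F
    for r in roots:
        # (D - r)w = u: d/dx (e^{-r x} w) = e^{-r x} u
        u = sp.exp(r*x)*sp.integrate(sp.exp(-r*x)*u, x)
    return sp.simplify(u)

# the section's first example, (D - 3)(D + 1)y = e^{-x}:
print(solve_cascade([3, -1], sp.exp(-x)))   # -x*exp(-x)/4
```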

Example Solve the differential equation [math](D^2 + 1)y = 2x[/math]. Since [math]t^2 + 1 = (t + i)(t - i)[/math], we have

[[math]] (D + i)(D - i)y = 2x. [[/math]]

Let [math](D - i)y = u[/math], and consider the first-order equation

[[math]] (D + i)u = 2x. [[/math]]

Since [math]P(x) = i[/math], an integrating factor is [math]e^{ix}[/math], and we obtain

[[math]] \frac{d}{dx} (e^{ix} u) = e^{ix} 2x , [[/math]]

from which it follows by integrating that

[[math]] e^{ix} u = 2 \int x e^{ix} dx + c_1. [[/math]]

By integration by parts it can be verified that

[[math]] \begin{equation} \int xe^{ax} dx = \frac{xe^{ax}}{a} - \frac{e^{ax}}{a^2} . \label{eq11.3.9} \end{equation} [[/math]]
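Formula (9) itself can be confirmed with sympy (declaring [math]a[/math] nonzero so the integral takes the generic form); a small check of ours, not part of the text.

```python
# Verifying formula (9): an antiderivative of x e^{ax} for nonzero a.
import sympy as sp

x = sp.symbols('x')
a = sp.symbols('a', nonzero=True)

lhs = sp.integrate(x*sp.exp(a*x), x)
rhs = x*sp.exp(a*x)/a - sp.exp(a*x)/a**2
assert sp.simplify(lhs - rhs) == 0
```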

In this case, [math]a = i[/math] and we know that [math]\frac{1}{i} = -i[/math] and that [math]i^2 = - 1[/math]. Hence

[[math]] e^{ix}u = - 2ixe^{ix} + 2e^{ix} + c_1, [[/math]]

and so

[[math]] u = - 2ix + 2 + c_1e^{-ix} . [[/math]]

It therefore remains to solve the differential equation

[[math]] (D - i)y = - 2ix + 2 + c_1e^{-ix}. [[/math]]

This time, an integrating factor is [math]e^{-ix}[/math]. Hence

[[math]] \frac{d}{dx} (e^{-ix} y) = - 2ixe^{-ix} + 2e^{-ix} + c_1 e^{-2ix}. [[/math]]

Integration [with a second application of (9)] yields

[[math]] e^{-ix}y = 2xe^{-ix} - \frac{c_1}{2i}e^{-2ix} + c_2. [[/math]]

Replacing the constant [math]-\frac{c_1}{2i}[/math] by simply [math]c_1[/math], and multiplying both sides by [math]e^{ix}[/math], we obtain

[[math]] y = 2x + c_1 e^{-ix} + c_2 e^{ix} . [[/math]]

If the function [math]y[/math] is real-valued, then it is easy to prove that [math]c_1[/math] and [math]c_2[/math] are complex conjugates [see (4.3), page 644]. In this case [math]c_1e^{-ix} + c_2e^{ix}[/math] may be replaced by [math]c_1 \cos x + c_2 \sin x[/math], where now the constants [math]c_1[/math] and [math]c_2[/math] denote arbitrary real numbers. We conclude that

[[math]] y = 2x + c_1 \cos x + c_2 \sin x [[/math]]

is the general real-valued solution to the original differential equation

[[math]] \frac{d^2y}{dx^2} + y = 2x. [[/math]]
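A last check (ours): substituting the claimed general solution back into the equation.

```python
# Substituting y = 2x + c1 cos x + c2 sin x into y'' + y = 2x.
import sympy as sp

x, c1, c2 = sp.symbols('x c1 c2')
y = 2*x + c1*sp.cos(x) + c2*sp.sin(x)

assert sp.simplify(sp.diff(y, x, 2) + y - 2*x) == 0
```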

The computations in this section were long and involved. The important fact we have shown is that the equations can be solved by an iteration of routine steps. As a practical matter, however, it is clear that some general computationally simple techniques are badly needed. These will be developed in the next two sections by breaking the problem into a homogeneous part and a nonhomogeneous part and attacking each one separately.

General references

Doyle, Peter G. (2008). "Crowell and Slesnick's Calculus with Analytic Geometry" (PDF). Retrieved Oct 29, 2024.