1. Introduction

We first met the mathematical operation of differentiation in post 17.4. But more information on this technique has appeared in many other posts. The purpose of this post is to bring all this information together in one place and to add some more to provide a single post on differentiation.

In post 17.4, we saw that if we could describe the position of an object by a mathematical expression in which time, *t*, was the only variable, then differentiating with respect to *t* gives the velocity of the object. Differentiating a second time gives its acceleration. In this example we say that the position of the object is a function of time only. Newton developed differentiation by time because he was interested in the motion of objects. His contemporary, Leibnitz, independently developed the concept of differentiation for any variable.

The first twelve topics appear in most elementary calculus textbooks, except for the appearance of the hyperbolic functions – sinh, cosh and tanh. So, you may wish to stop reading this post after section 12 and to ignore all mention of the hyperbolic functions. Also, the proof given in appendix 1 doesn’t appear in books; I developed it to avoid introducing the binomial theorem. You may want to read post 18.2, on powers of numbers, and remember, for example, that, by definition, cos^{2}*x* = (cos*x*)^{2}.

2. Functions

If the value of *f* can be calculated from the value of zero or more constants and a single variable, *x*, we say that *f* is a *function* of *x* only. Some examples of functions of *x* only are listed below.

*f*_{1} = 3 + 2*x*

*f*_{2} = 5*x*^{3} + 7 – 2*x*^{2}

*f*_{3} = cos*x* + 3*x*.

In the final example, cos*x* is the cosine of the angle *x*.

When *x* = 0, the value of *f*_{1} is 3 + (2 × 0) = 3 + 0 = 3.

We can write this as *f*_{1}(0) = 3.

Using these ideas, we can write *f*_{2}(1) = 10 and *f*_{1}(3) = 9. We can also write our examples of functions as *f*_{1}(*x*), *f*_{2}(*x*) and *f*_{3}(*x*), to show that these functions are functions of *x* only. But we need to be careful. If we write the expression

*f*_{4}(*x*) = *x*(1 + *e*^{x})

the brackets on the left-hand side don’t mean the same thing as the brackets on the right-hand side. On the left they tell us that *f*_{4} depends on *x* only; on the right they tell us that 1 + *e*^{x} is multiplied by *x*.
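The function notation of this section maps directly onto functions in a programming language. The sketch below is only an illustration: the Python names `f1` to `f4` are my own labels for the examples above, and I assume that the angle *x* is measured in radians.

```python
import math

def f1(x):
    return 3 + 2 * x

def f2(x):
    return 5 * x**3 + 7 - 2 * x**2

def f3(x):
    # x is assumed to be an angle in radians
    return math.cos(x) + 3 * x

def f4(x):
    # the brackets after f4 mean "depends on x";
    # the brackets on the right mean "multiply by x"
    return x * (1 + math.exp(x))

print(f1(0), f2(1), f1(3))  # the three values quoted in the text
print(f3(0), f4(0))
```

Calling `f1(0)` plays the same role as writing *f*_{1}(0) in the text.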

3. Graphs of functions

We can calculate the value of one of our functions for many different values of *x* and then plot a graph of that function against *x*, as shown in the two examples below.

The graph of *f*_{1} appears to be a straight line. In section 4, we see that any function of *x* only that has the form

*f*(*x*) = *a* + *bx*

where *a* and *b* are constants, is represented by a straight line. In the pictures above, *f*_{1} and *f*_{2} have been plotted in the *y*-axis direction. So we can say that

*y* = *a* + *bx*

is the equation of a straight line.

In post 21.3 we saw that, for a circle of radius *a*,

*x*^{2} + *y*^{2} = *a*^{2}

so the equation of a circle is

*y* = ±(*a*^{2} – *x*^{2})^{1/2}.

In post 22.6, we used the equation

*y* = *ax*^{2}

to define a parabola.

In post 22.8, we saw that the equation of a catenary is

*y* = *a*cosh(*x*/*a*)

where cosh is a hyperbolic function.

So, we can see that a function of *x* can be represented on a graph and so defines a given type of curve. When we write *y* = *f*(*x*), the form of *f*(*x*) defines a curve and we say that the equation is the *equation of a curve*.

In polar coordinates, we define two-dimensional shapes by a radius, *r*, that is a function of an angle *θ*, as in posts 21.3 and 21.5.

4. Slope of a straight line

Let’s think about the equation *y* = *a* + *bx *(1).

If we increase the value of *x* by an increment Δ*x*, then *y* will change by some amount Δ*y*. Then equation 1 becomes

*y* + Δ*y* = *a* + *b*(*x* + Δ*x*). (2)

Subtracting equation 1 from equation 2 gives

Δ*y* = *b*Δ*x* or Δ*y*/Δ*x* = *b*. (3)

The picture below shows two points, P and Q, on a graph of equation 1. P has Cartesian coordinates (*x*, *y*); Q has Cartesian coordinates (*x* + Δ*x*, *y* + Δ*y*).

You can see that Δ*y*/Δ*x* is the average slope of PQ. Since Δ*y*/Δ*x* is a constant, whose value is *b*, the slope is the same everywhere on the graph. A graph with a constant slope is the shortest route between P and Q because any curve would take a longer route, as exemplified by the red and green curves in the picture.

So, equation 1 is the equation of a straight line whose slope is *b*.
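A quick numerical check of this section: whatever starting point and increment we choose, the average slope Δ*y*/Δ*x* of equation 1 comes out as the coefficient that multiplies *x*. The values 3 and 2 for the constants, and the function names, are arbitrary choices made for this sketch.

```python
def line(x, a=3.0, b=2.0):
    """Equation 1, y = a + b*x, with illustrative constants."""
    return a + b * x

def average_slope(f, x, dx):
    """Change in f divided by the change in x."""
    return (f(x + dx) - f(x)) / dx

# different starting points and different increments
for x, dx in [(0.0, 1.0), (1.0, 0.5), (-2.0, 4.0)]:
    print(x, dx, average_slope(line, x, dx))
```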

5. Division of zero by zero

By definition, dividing any number by itself gives the answer 1. For example, 5/5 = 1.

So you might expect that 0/0 = 1. To check whether 5/5 = 1, we can multiply both sides of this equation by 5 to give (5/5) × 5 = 1 × 5 or 5 = 5. This result is true, so we can be confident that 5/5 = 1. Let’s do the same thing with 0/0 = 1. Multiplying both sides by zero gives 0 = 0 which is true. But if we write 0/0 = *n*, where *n* is any number, we still get 0 = 0. So 0/0 can have any value.

Now let’s think about the value of the function (sin*x*)/*x*, where sin*x* is the sine of the angle *x*. When *x* = 0, sin*x* = 0 (post 16.50). So (sin0)/0 can have any value. But what happens when we make *x* very small and then keep making it smaller? The results are shown below (with a precision of six significant figures).

We see that, as *x* approaches zero, the value of our function approaches 1. We say that the *limiting value* of (sin*x*)/*x*, as *x* tends to zero, is 1. We sometimes write this statement as

lim_{*x*→0} (sin*x*)/*x* = 1.

You might like to think about Zeno’s paradox (post 16.6) at this point.
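The behaviour of (sin*x*)/*x* for ever smaller values of *x* can be explored with a few lines of code. This is a minimal sketch, assuming that *x* is in radians; the starting value of 1 and the factor of 10 between steps are arbitrary choices.

```python
import math

def sin_x_over_x(x):
    # undefined at x = 0, so we only approach zero
    return math.sin(x) / x

x = 1.0
for _ in range(6):
    # six significant figures, as in the text
    print(f"{x:<8g} {sin_x_over_x(x):.6g}")
    x /= 10
```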

6. Slope of a curve

The picture above shows the graph of a function that does not represent a straight line. The average slope of the blue curve, between the points P and Q, is Δ*y*/Δ*x*. But the slope is not the same at every point between P and Q, so this result is not very useful. If we make Δ*x* very small, then Δ*y* will be very small and the segment of the curve is almost a straight line. In the picture, we can see that the slope of the curve at R is approximately δ*y*/δ*x*. I am using the symbol δ, instead of Δ, to show that the increment δ*x* is very small. The smaller the value of δ*x*, the better the approximation.

In the limit δ*x* → 0, this approximation becomes exact. We then write δ*y*/δ*x* as d*y*/d*x*. The process of calculating d*y*/d*x* is called *differentiation* and d*y*/d*x* is called the *derivative* of *y*. When we calculate d*y*/d*x* we say that we are *differentiating* *y*.

I have used Leibnitz’s nomenclature for representing the limiting value of δ*y*/δ*x*. This is the most commonly used. But sometimes you will see the limiting value written as *y*’. This could be confusing because it doesn’t explicitly state that the variable is *x*. Newton’s own nomenclature, a dot placed above the *y*, has the same drawback, but this didn’t matter to Newton because, in all his calculations, the variable was time.

It might be helpful to define the derivative of *f*, a function of *x*, by the equation below.

d*f*/d*x* = lim_{δ*x*→0} {*f*(*x* + δ*x*) – *f*(*x*)}/δ*x*

This equation defines differentiation more concisely than the explanation given above but is equivalent to it.
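The idea of this section, that δ*f*/δ*x* approaches the derivative as δ*x* shrinks, can be watched numerically. The sketch below uses *f* = *x*^{2} at *x* = 3 as an arbitrary example; the function names are my own.

```python
def difference_quotient(f, x, dx):
    """The ratio (f(x + dx) - f(x)) / dx for a finite increment dx."""
    return (f(x + dx) - f(x)) / dx

def f(x):
    return x**2

# the quotient approaches the slope of x**2 at x = 3 as dx shrinks
for dx in [1.0, 0.1, 0.01, 0.001]:
    print(dx, difference_quotient(f, 3.0, dx))
```

A computer cannot take the limit δ*x* → 0 itself, so this only suggests the limiting value; it does not prove it.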

7. Derivatives of some simple functions

8. Some useful theorems

8.1 *Sum of two functions*

If *u* and *v* are functions of *x* only and

*f* = *u* + *v*

then

d*f*/d*x* = d*u*/d*x* + d*v*/d*x*.

This result is proved in appendix 3.

8.2 *Chain rule*

If *f* is a function of *u*, and *u* is a function of *x*, the *chain rule* states that

d*f*/d*x* = (d*f*/d*u*)(d*u*/d*x*).

This result is justified in post 17.13.

Here is an example of how we can use the chain rule. Suppose

*f* = sin(*x*^{2}) = sin*u* when *u* = *x*^{2}.

Then d*f*/d*u* = cos*u* and d*u*/d*x* = 2*x*.

Substituting these results into the equation that states the chain rule gives

d*f*/d*x* = (cos*u*)(2*x*) = 2*x*cos(*x*^{2}).
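If you want reassurance that the chain rule gave the right answer, you can compare it with a numerical estimate of the slope. The helper `central_difference` and its step size are assumptions made for this sketch; it estimates the slope from two points either side of *x*.

```python
import math

def central_difference(f, x, h=1e-6):
    """Estimate df/dx from the points x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def f(x):
    return math.sin(x**2)

def chain_rule_result(x):
    # 2x cos(x^2), the result obtained with the chain rule
    return 2 * x * math.cos(x**2)

for x in [0.0, 0.5, 1.0, 2.0]:
    print(x, central_difference(f, x), chain_rule_result(x))
```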

8.3 *Product rule*

If *u* and *v* are functions of *x*, the *product rule* states that

d(*uv*)/d*x* = *u*(d*v*/d*x*) + *v*(d*u*/d*x*).

The product rule is proved in appendix 4. How do we use it? Suppose that

*f* = *x*sin*x* = *uv* where *u* = *x* and *v* = sin*x*.

Then d*u*/d*x* = 1 and d*v*/d*x* = cos*x*.

Substituting these results into the equation that states the product rule gives

d(*uv*)/d*x* = *x*cos*x* + sin*x*.

8.4 *Quotient rule*

Most textbooks on calculus give the impression that we need to know the *quotient rule*. This isn’t true but I’ll mention it anyway.

If *u* and *v* are functions of *x*, the *quotient rule* states that

d(*u*/*v*)/d*x* = (1/*v*^{2}){*v*(d*u*/d*x*) – *u*(d*v*/d*x*)}.

We don’t need this rule; it is difficult to remember and tedious to keep looking it up. If you want to prove it, define *f* = *uv*^{-1} and *w* = *v*^{-1}, then apply the product rule to *f* = *uw*, using the chain rule to find d*w*/d*x*.

Books may tell you that you need the quotient rule to differentiate functions like

*f* = (sin*x*)/*x* = *uv* where *u* = sin*x* and *v* = *x*^{-1}.

Then d*u*/d*x* = cos*x* and d*v*/d*x* = –*x*^{-2}.

We can then substitute these results into the product rule to give

d*f*/d*x* = (sin*x*)(-*x*^{-2}) + (*x*^{-1})(cos*x*) = (1/*x*^{2})(*x*cos*x* – sin*x*).

We didn’t need to use the quotient rule.
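The two results obtained with the product rule, for *x*sin*x* and for (sin*x*)/*x*, can be compared with numerical estimates of the slope. As before, `central_difference` is an illustrative helper of my own and *x* is assumed to be in radians.

```python
import math

def central_difference(f, x, h=1e-6):
    """Estimate df/dx from the points x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def derivative_of_x_sin_x(x):
    # product rule with u = x and v = sin x
    return x * math.cos(x) + math.sin(x)

def derivative_of_sin_x_over_x(x):
    # product rule with u = sin x and v = 1/x
    return (x * math.cos(x) - math.sin(x)) / x**2

for x in [0.5, 1.0, 2.0]:
    print(x,
          central_difference(lambda t: t * math.sin(t), x),
          derivative_of_x_sin_x(x),
          central_difference(lambda t: math.sin(t) / t, x),
          derivative_of_sin_x_over_x(x))
```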

9. Derivatives of some more functions

In the table below I have made the functions of section 7 a bit more complicated. In this table *n* and *m* are constants. You can differentiate even more complicated functions using these results and the theorems in sections 8.1, 8.2 and 8.3.

10. Integration

As explained in post 17.19, integration reverses the process of differentiation, as shown in the diagram below.

We can write that

δ*f* = (δ*f*/δ*x*).δ*x*.

So, an approximation to recovering *f* from (δ*f*/δ*x*) is given by adding together all the terms like (δ*f*/δ*x*).δ*x*. We write this as

*f* ≈ Σ (δ*f*/δ*x*).δ*x*.

This approximation becomes exact in the limit δ*x* → 0 and we write the result as

*f* = ∫ (d*f*/d*x*) d*x* = ∫ *f’* d*x*.

We say that *f* is the *integral* of *f’*. For example, since d(*x*^{n} + *C*)/d*x* = *nx*^{n-1}, where *C* is any constant, we can write that

∫ *nx*^{n-1} d*x* = *x*^{n} + *C*.

So when we perform the operation of integration a constant appears whose value is unknown. This constant, *C*, is called a *constant of integration*. When we use integration to solve physical problems, we can use *boundary conditions* to evaluate *C*, as explained in post 18.15.

So far, we have considered only *indefinite integrals* (see post 17.19). A *definite integral* is evaluated in a range of *x* values, for example *a* ≤ *x* ≤ *b*. Then there is no constant of integration and we can write

∫_{*a*}^{*b*} *f’* d*x* = *f*(*b*) – *f*(*a*)

as explained in post 17.19.

The concept of a *line integral* is explained in post 17.36.

Examples of the application of integration are given in posts 17.19, 17.23, 17.27 and 17.36.
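The idea of recovering *f* by adding up many small terms (δ*f*/δ*x*).δ*x* can be tried numerically. In this sketch the derivative is 3*x*^{2}, so that *f* = *x*^{3}, and the sum between *x* = 1 and *x* = 2 should approach 2^{3} – 1^{3} = 7. Evaluating the derivative at the middle of each small step is my own choice; the function names are illustrative.

```python
def sum_of_small_steps(df_dx, a, b, steps):
    """Add up (df/dx) * dx over equal steps between a and b."""
    dx = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x_mid = a + (i + 0.5) * dx  # middle of the i-th step
        total += df_dx(x_mid) * dx
    return total

def df_dx(x):
    # derivative of f = x**3
    return 3 * x**2

# more steps means a smaller dx and a better approximation
for steps in [10, 100, 1000]:
    print(steps, sum_of_small_steps(df_dx, 1.0, 2.0, steps))
```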

11. Repeated differentiation

Let’s suppose that *f* = *x*^{4}.

Then we can write *f’* = d*f*/d*x* = 4*x*^{3}.

Then we define d^{2}*f*/d*x*^{2} = d*f’*/d*x* = 12*x*^{2} = *f’’*

and d^{3}*f*/d*x*^{3} = d*f’’*/d*x* = 24*x*.

Finally, d^{4}*f*/d*x*^{4} = 24 and d^{5}*f*/d*x*^{5} = 0.
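Repeated differentiation of a polynomial is easy to automate if we store only its coefficients. In the sketch below, which is an illustration and not a general-purpose tool, a list such as `[0, 0, 0, 0, 1]` stands for *x*^{4}: the entry at position *i* is the coefficient of *x*^{i}.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as a list of coefficients.

    coeffs[i] multiplies x**i; each term c*x**i becomes i*c*x**(i-1).
    """
    derivative = [i * c for i, c in enumerate(coeffs)][1:]
    return derivative or [0]  # the derivative of a constant is 0

f = [0, 0, 0, 0, 1]  # x**4
for order in range(1, 6):
    f = differentiate(f)
    print(order, f)
```

The lists printed for orders 1 to 5 correspond to 4*x*^{3}, 12*x*^{2}, 24*x*, 24 and 0.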

12. Some applications of differentiation

Differentiation can be used to find the maximum and minimum values of a function, as described in appendix 2 of post 20.37.

The picture above shows that a *maximum*, *minimum* and *saddle point* (point of inflexion) are defined by d*f*/d*x* = 0. The three types of points are distinguished by the value of d^{2}*f*/d*x*^{2}, which is negative for a maximum, positive for a minimum and zero for a saddle point. A zero value does not, by itself, identify a saddle point: *f* = *x*^{4} has a minimum at *x* = 0, where d^{2}*f*/d*x*^{2} = 12*x*^{2} = 0.
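As a small worked example of this test, consider *f* = *x*^{3} – 3*x*, a function chosen only for illustration. Its first derivative, 3*x*^{2} – 3, is zero at *x* = –1 and *x* = 1, and the sign of the second derivative, 6*x*, tells us which kind of point each one is.

```python
def f(x):
    return x**3 - 3 * x

def first_derivative(x):
    return 3 * x**2 - 3

def second_derivative(x):
    return 6 * x

def classify(second_derivative_value):
    """Apply the sign test to a point where df/dx = 0."""
    if second_derivative_value < 0:
        return "maximum"
    if second_derivative_value > 0:
        return "minimum"
    return "second derivative is zero"

for x in (-1.0, 1.0):
    print(x, f(x), first_derivative(x), classify(second_derivative(x)))
```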

Taylor’s theorem allows us to use the derivatives of a function to express that function as an infinite series, as described in appendix 1 of post 20.3. According to this theorem

*f*(*x*) = *f*(0) + *xf*’ + (*x*^{2}/2!)*f*’’ + (*x*^{3}/3!)*f*’’’ + … + (*x*^{n}/*n*!)*f*^{n} + …

where the derivatives are evaluated at *x* = 0, *f*^{n} is the result of differentiating *f* *n* times, and *n*! = *n* × (*n* – 1) × (*n* – 2) × (*n* – 3) × … × 3 × 2 × 1, as described in post 18.15. For example

4! = 4 × 3 × 2 × 1 = 24.

Why is Taylor’s theorem useful? Because, for example, it allows us to derive series that represent functions like cosine (see appendix 1 of post 18.6 to see why this is useful). Also, expanding a function as a series enables us to calculate its value, for a given value of *x*, in the same way as we calculated the values of π (post 17.11) and *e* (post 18.15).
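To see a series of this kind at work, we can sum the first few terms of the series for cos*x*. The derivatives of cos*x* at *x* = 0 are 1, 0, –1, 0, 1, … so only even powers of *x* survive and their signs alternate. The function name and the number of terms are choices made for this sketch, and *x* is assumed to be in radians.

```python
import math

def cos_series(x, terms):
    """Sum the first `terms` non-zero terms of the series for cos x."""
    total = 0.0
    for k in range(terms):
        total += (-1)**k * x**(2 * k) / math.factorial(2 * k)
    return total

# compare partial sums with the library value of cos(1)
for terms in [1, 2, 3, 5, 10]:
    print(terms, cos_series(1.0, terms), math.cos(1.0))
```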

13. Partial differentiation

Suppose *f* is a function of more than one variable. For example, *f*(*x*, *z*) represents a function of *x* and *z*. An example of such a function is

*f* = *x*^{2} + *z*^{2}.

We can then differentiate *f* with respect to *x* alone if we treat *z* as a constant. To show that we are making this assumption we write the derivative as

∂*f*/∂*x* = 2*x*

because the derivative of the constant *z*^{2} is zero. If we want it to be clear that the variable being made constant is *z*, we can write this as

(∂*f*/∂*x*)_{z} = 2*x*.

Similarly

∂*f*/∂*z* = 2*z*.

We can differentiate the derivatives above a second time, to give

∂(∂*f*/∂*x*)/∂*x* = ∂^{2}*f*/∂*x*^{2} = 2

and

∂(∂*f*/∂*z*)/∂*z* = ∂^{2}*f*/∂*z*^{2} = 2.

We can also differentiate a second time with respect to a different variable. For example

∂(∂*f*/∂*x*)/∂*z* = ∂^{2}*f*/∂*z*∂*x* = 0.

Similarly

∂(∂*f*/∂*z*)/∂*x* = ∂^{2}*f*/∂*x*∂*z* = 0.

Notice that the result of differentiating twice with respect to different variables is independent of the order of differentiation. This is a general result.
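Partial derivatives can be estimated numerically by changing one variable while holding the other fixed. The sketch below does this for *f* = *x*^{2} + *z*^{2} and also for a second function, *g* = *x*^{2}*z*^{2}, which I have introduced only because its mixed second derivatives are not zero (both are 4*xz*). The helper names, the step size and the point (1.5, –0.5) are all assumptions made for illustration.

```python
def partial_x(f, x, z, h=1e-4):
    """Estimate the partial derivative with respect to x, holding z constant."""
    return (f(x + h, z) - f(x - h, z)) / (2 * h)

def partial_z(f, x, z, h=1e-4):
    """Estimate the partial derivative with respect to z, holding x constant."""
    return (f(x, z + h) - f(x, z - h)) / (2 * h)

def f(x, z):
    return x**2 + z**2

def g(x, z):
    # hypothetical extra example, not taken from the text
    return x**2 * z**2

x0, z0 = 1.5, -0.5
print(partial_x(f, x0, z0), partial_z(f, x0, z0))  # compare with 2*x0 and 2*z0

# differentiate g twice, in the two possible orders
mixed_zx = partial_z(lambda x, z: partial_x(g, x, z), x0, z0)
mixed_xz = partial_x(lambda x, z: partial_z(g, x, z), x0, z0)
print(mixed_zx, mixed_xz, 4 * x0 * z0)
```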

Further information on partial differentiation is given in post 19.11.

14. The operator del

In post 20.34 we defined the operator del (also called nabla), in an orthogonal Cartesian coordinate system, by

**∇** = ***i***∂/∂*x* + ***j***∂/∂*y* + ***k***∂/∂*z*

where ***i***, ***j*** and ***k*** are unit vectors defining the directions of the axes of the coordinate system.

**∇** can operate on a scalar or a vector either by forming a dot product or a cross product, as explained in post 20.34. It appears in the Navier-Stokes equation that describes fluid flow, as described in post 20.36. In post 20.37, we saw that in polar coordinates it is given, in two dimensions, by

**∇** = ***e***_{r}∂/∂*r* + ***e***_{θ}(1/*r*)∂/∂*θ*

where ***e***_{r} and ***e***_{θ} are unit vectors in the directions of increasing *r* and *θ*.

We also saw, in post 20.34, that the scalar operator ∇^{2} is given by

∇^{2} = ∂^{2}/∂*x*^{2} + ∂^{2}/∂*y*^{2} + ∂^{2}/∂*z*^{2}

and that in polar coordinates this becomes, in two dimensions

∇^{2} = ∂^{2}/∂*r*^{2} + (1/*r*)∂/∂*r* + (1/*r*^{2})∂^{2}/∂*θ*^{2}

(see post 20.37). The operator ∇^{2} appears in the wave equation (post 19.12), the diffusion equation (post 19.15) and in Schrödinger’s equation (post 19.27) which is the equation of a particle wave (see post 19.25).

15. Differential equations

Differential equations are equations that contain one or more derivatives, as described in post. An ordinary differential equation (ODE) is a differential equation that contains no partial derivatives. Examples of differential equations are the equation of a simple harmonic oscillator

d^{2}*x*/d*t*^{2} =-*ω*^{2}*x*

(see post 18.11) and the equation describing exponential growth

d*n*/d*t* = *kn*

(see post 18.15).

A partial differential equation (PDE) contains at least one partial derivative. Examples of PDEs are the wave equation (post 19.12), the diffusion equation (post 19.15) and Schrödinger’s equation (post 19.27).
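A differential equation tells us how a quantity changes, so one simple way to use it is to take many small steps. The sketch below applies this idea to the exponential growth equation d*n*/d*t* = *kn*, replacing d*n* by *kn*δ*t* at each step, and compares the result with the known solution *n* = *n*_{0}*e*^{kt}. The step-by-step method (often called Euler's method), the function name and the numbers used are my own choices for illustration.

```python
import math

def grow_in_small_steps(n0, k, t_end, steps):
    """Follow dn/dt = k*n from t = 0 to t_end in equal small steps."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n += k * n * dt  # change in n during one small step
    return n

n0, k, t_end = 1.0, 0.5, 2.0
for steps in [10, 100, 1000, 100000]:
    print(steps, grow_in_small_steps(n0, k, t_end, steps), n0 * math.exp(k * t_end))
```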

Appendix 1

The purpose of this appendix is to show that d(*x*^{n})/d*x* = *nx*^{n-1}.

Since *x*^{0} = 1 (post 18.2) is constant it does not change when *x* changes so

d(*x*^{0})/d*x* = 0. (1)

If *f* = *x*^{1} = *x* then *f* + δ*f* = *x* + δ*x*. Subtracting the first equation from the second gives

δ*f* = δ*x* so that δ*f*/δ*x* = 1.

In the limit δ*x* → 0 this result does not change, so that

d(*x*^{1})/d*x* = 1 = *x*^{0} = 1*x*^{0}. (2)

If *f* = *x*^{2} then

*f* + δ*f* = (*x* + δ*x*)^{2} = *x*^{2} + (δ*x*)^{2} + 2*x*δ*x*.

The final step is explained in appendix 4 of post 17.4. Subtracting the first equation from the second gives

δ*f* = (δ*x*)^{2} + 2*x*δ*x* so that δ*f*/δ*x* = δ*x* + 2*x*.

In the limit δ*x* → 0 this result becomes

d(*x*^{2})/d*x* = 2*x* = 2*x*^{1}. (3)

If *f* = *x*^{3} then

*f* + δ*f* = (*x* + δ*x*)^{3} = (*x* + δ*x*)(*x* + δ*x*)^{2} = (*x* + δ*x*){*x*^{2} + (δ*x*)^{2} + 2*x*δ*x*}.

We can write the final result as

*f* + δ*f* = *x*{*x*^{2} + (δ*x*)^{2} + 2*x*δ*x*} + (δ*x*){*x*^{2} + (δ*x*)^{2} + 2*x*δ*x*} = *x*^{3} + 3*x*^{2}(δ*x*) + 3*x*(δ*x*)^{2} + (δ*x*)^{3}.

Subtracting *f* from the left-hand side and *x*^{3} from the right-hand side gives

δ*f* = 3*x*^{2}(δ*x*) + 3*x*(δ*x*)^{2} +(δ*x*)^{3}

so that δ*f*/δ*x* = 3*x*^{2} + 3*x*δ*x* +(δ*x*)^{2}.

In the limit δ*x* → 0 this result becomes

d(*x*^{3})/d*x* = 3*x*^{2}. (4)

We could go on to show that

d(*x*^{4})/d*x* = 4*x*^{3}

d(*x*^{5})/d*x* = 5*x*^{4}

and so on. If you want to try, it may be helpful to know that

(*a* + *b*)^{4} = *a*^{4} + 4*a*^{3}*b* + 6*a*^{2}*b*^{2} +4*ab*^{3} + *b*^{4}

and

(*a* + *b*)^{5} = *a*^{5} + 5*a*^{4}*b* + 10*a*^{3}*b*^{2} + 10*a*^{2}*b*^{3} + 5*ab*^{4} + *b*^{5}.

We can already see a pattern by comparing equations 1, 2, 3 and 4. It seems that

d(*x*^{n})/d*x* = *nx*^{n-1}. (5)

The proof that follows is complicated and you may wish to trust that equation 5 is true.

If equation 5 is true for every value of *n*, it is also true when *n* is replaced by *n* + 1, so that

d(*x*^{n+1})/d*x* = (*n* + 1)*x*^{n}. (6)

Let’s assume that equation 5 is true and see if equation 6 follows from this assumption. If it does, we can add 1 to *n* when *n* = 1, 2, 3… and so on until we arrive at a general expression for any value of *n*. Then equation 5 must be true.

If *n* = *k*, we assume that

d(*x*^{k})/d*x* = *kx*^{k-1}.

If equation 5 is true then

d(*x*^{k+1})/d*x* = (*k* + 1)*x*^{k}.

Note that *x*^{k+1} = *x*(*x*^{k}) and then differentiate this result using the product rule, the theorem stated in section 8.3 and proved in appendix 4 (below). Let *u* = *x* and *v* = *x*^{k}. Then

d*u*/d*x* = 1 and d*v*/d*x* = *kx*^{k-1}.

The result for d*v*/d*x* rests on the assumption that equation 5 is true. The product rule states that

d(*uv*)/d*x* = *u*(d*v*/d*x*) + *v*(d*u*/d*x*).

Substituting the results we have just obtained into this equation gives

d(*x*^{k+1})/d*x* = *x*(*kx*^{k-1}) + *x*^{k} = *kx*^{k} + *x*^{k} = (*k* + 1)*x*^{k}.

We have shown that, if equation 5 is true when *n* = *k*, then it is also true when *n* = *k* + 1 (equation 6). Since equations 1 to 4 show that equation 5 is true when *n* = 0, 1, 2 and 3, it must be true for every natural number *n*.

Usually we prove theorems in logical steps, starting from an axiom; this is called *proof by deduction*. But here we’ve used a different approach called *proof by induction*. In this method we first prove that a statement is true when a natural number, *n* = 0, 1, 2, 3…, has a low value. The *inductive step* is to show that, if this statement is true when *n* = *k*, where *k* is any natural number, then it is also true for *n* = *k* + 1. Together, these steps show that the statement is true for any natural number.

A proof by deduction that d(*x*^{n})/d*x* = *nx*^{n-1} is given at

https://socratic.org/questions/differentiate-y-x-n-using-first-principle.

To understand this proof, you need to understand the binomial theorem

https://en.wikipedia.org/wiki/Binomial_theorem.

What happens when *n* is a fraction? Suppose *n* = 1/*p*, where *p* is a positive integer. Then we assume that

d(*x*^{1/p})/d*x* = (1/*p*)*x*^{(1/p)-1}.

If this is true then

d(*x*^{(1/p)+1})/d*x* = (1 + 1/*p*)*x*^{1/p}.

Note that *x*^{(1/p)+1} = *x*.*x*^{1/p} and continue as we did when *n* = *k*, to show that our assumption is true.

What happens when *n* is negative? Suppose *n* = –*p*. Then we assume that

d(*x*^{-p})/d*x* = (–*p*)*x*^{-p-1}.

If this is true then

d(*x*^{1-p})/d*x* = (1 – *p*)*x*^{-p}.

Note that *x*^{1-p} = *x*.*x*^{-p} and continue as we did when *n* = *k*, to show that our assumption is true.
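The rule d(*x*^{n})/d*x* = *nx*^{n-1} can also be checked numerically for whole-number, fractional and negative values of *n*. This check is not a proof; it only compares the rule with an estimate of the slope at one arbitrarily chosen point, *x* = 2. The helper names are my own.

```python
def central_difference(f, x, h=1e-6):
    """Estimate df/dx from the points x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def power_rule(x, n):
    # n * x**(n - 1), the result this appendix sets out to show
    return n * x**(n - 1)

x = 2.0
for n in [0, 1, 2, 3, 4, 5, 0.5, -2]:
    estimate = central_difference(lambda t: t**n, x)
    print(n, estimate, power_rule(x, n))
```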

Appendix 2

The purpose of this appendix is to show that d(tan*x*)/d*x* = 1/cos^{2}*x * and d(tanh*x*)/d*x* = 1/cosh^{2}*x*.

In post 16.50, we defined

tan*x* = (sin*x*)/(cos*x*) = (sin*x*)(cos*x*)^{-1} = *uv*

where *u* = sin*x* and *v* = (cos*x*)^{-1},

so that d*u*/d*x* = cos*x* and d*v*/d*x* = (sin*x*)/(cos*x*)^{2}.

To obtain the result for d*v*/d*x*, we write *v* = (cos*x*)^{-1} = *w*^{-1}, where *w* = cos*x*.

We then use the chain rule (section 8.2) to write

d*v*/d*x* = (d*v*/d*w*)(d*w*/d*x*) = (-1/*w*^{2})(-sin*x*) = (sin*x*)/(cos*x*)^{2}.

Putting these results into the product rule (section 8.3 with proof in appendix 4) gives

d(tan*x*)/d*x* = (sin*x*).(sin*x*)/(cos*x*)^{2} + (cos*x*)^{-1}.(cos*x*) = (1/cos^{2}*x*)(sin^{2}*x* + cos^{2}*x*) = 1/cos^{2}*x*.

The final step is true because sin^{2}*x* + cos^{2}*x* = 1 (post 16.50).

The same steps applied to tanh*x* = (sinh*x*)/(cosh*x*), using d(sinh*x*)/d*x* = cosh*x*, d(cosh*x*)/d*x* = sinh*x* and cosh^{2}*x* – sinh^{2}*x* = 1, give d(tanh*x*)/d*x* = 1/cosh^{2}*x*.

You will usually see this result written as sec^{2}*x*. The trigonometric *secant* is defined by sec*x* = 1/cos*x*. Similarly, *cosecant* is defined by cosec*x* = 1/sin*x*. I never use them because I can’t remember which is which. I don’t believe anyone really needs them.
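Both results of this appendix can be compared with numerical estimates of the slope. The sketch below assumes that *x* is in radians and uses an illustrative helper, `central_difference`, which is not part of any library.

```python
import math

def central_difference(f, x, h=1e-6):
    """Estimate df/dx from the points x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def derivative_of_tan(x):
    return 1 / math.cos(x)**2

def derivative_of_tanh(x):
    return 1 / math.cosh(x)**2

for x in [0.0, 0.3, 1.0]:
    print(x,
          central_difference(math.tan, x), derivative_of_tan(x),
          central_difference(math.tanh, x), derivative_of_tanh(x))
```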

Appendix 3

The purpose of this appendix is to prove the theorem of section 8.1. If *f* = *u* + *v*,

where *f*, *u* and *v* are functions of *x*,

then *f* + δ*f* = *u*(*x* + δ*x*) + *v*(*x* + δ*x*).

Subtracting the first equation from the second gives

δ*f* = *u*(*x* + δ*x*) + *v*(*x* + δ*x*) – *u* – *v*.

Dividing by δ*x* gives

δ*f*/δ*x* = {*u*(*x* + δ*x*) – *u*}/δ*x* + {*v*(*x* + δ*x*) – *v*}/δ*x*.

Notice that, in the limit δ*x* → 0, the right-hand side of this equation becomes d*u*/d*x* + d*v*/d*x* and the left-hand side is d*f*/d*x* (see section 6), which proves the theorem.

Appendix 4

The purpose of this appendix is to prove the theorem of section 8.3.

If *f* = *uv*,

where *f*, *u* and *v* are functions of *x*,

then *f* + δ*f* = (*u* + δ*u*)(*v* + δ*v*) = *uv* + *u*δ*v* + *v*δ*u* + δ*u*.δ*v*.

Subtracting the first equation from the second gives

δ*f* = *u*δ*v* + *v*δ*u* + δ*u*.δ*v*.

Dividing by δ*x* gives

δ*f*/δ*x* = *u*(δ*v*/δ*x*) + *v*(δ*u*/δ*x*) + (δ*u*.δ*v*)/δ*x*.

In the limit δ*x* → 0 this becomes

d*f*/d*x* = *u*(d*v*/d*x*) + *v*(d*u*/d*x*)

since the final term can be written as δ*u*(δ*v*/δ*x*), which tends to 0 × d*v*/d*x* = 0 because δ*u* → 0 when δ*x* → 0.
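The disappearance of the term (δ*u*.δ*v*)/δ*x* can be watched numerically. In this sketch *u* = *x*^{2} and *v* = sin*x*, evaluated at *x* = 1; these are arbitrary choices of mine, made only to have something definite to compute.

```python
import math

def cross_term(x, dx):
    """(du * dv) / dx for the illustrative choice u = x**2, v = sin x."""
    du = (x + dx)**2 - x**2
    dv = math.sin(x + dx) - math.sin(x)
    return du * dv / dx

# the term shrinks roughly in proportion to dx
for dx in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(dx, cross_term(1.0, dx))
```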