Lecture 16: Section 14.6 Directional Derivatives and the Gradient

Welcome back! So far, we've explored partial derivatives, $f_x$ and $f_y$, which tell us the rate of change of a function in the specific directions of the coordinate axes. But what if we're standing on a mountain and want to know the slope in the exact direction we're facing, say, northeast? Today, we develop the tools to answer that question. We will generalize the derivative to find the rate of change in any direction.

Topic 1: The Directional Derivative 🧭

Analogy: The Hiker on the Mountain

Imagine you are a hiker standing on a mountainside. The surface of the mountain is the graph of the function $z=f(x,y)$.

Your location on the map is the point $(x,y)$.
Your actual position on the mountainside is the point $(x,y,f(x,y))$.
The direction vector $\mathbf{u}$ is the direction you point your compass on the map (e.g., due north, southwest, etc.).
The directional derivative, $D_{\mathbf{u}}f(x,y)$, is the steepness of the path at your current location if you take a step in that compass direction. A positive value means you are heading uphill, negative means downhill, and zero means you are on a level path.

In the graph below, move the sliders for $a$ and $b$ to change the direction vector $\mathbf{u}=\langle a,b \rangle$ and see how the steepness of the tangent line on the surface changes.

Why a Unit Vector?

The directional derivative measures the rate of change of $f$ with respect to distance traveled in the $xy$-plane. To make this a standardized measure, we must use a unit vector $\mathbf{u}$. A unit vector has a magnitude of 1, so it represents a pure direction. If we used a vector of length 2 pointing the same way, the computed value would be twice as large even though neither the surface nor the direction has changed, so it would no longer be the instantaneous rate of change "per unit distance" at the point.

Conceptually, the directional derivative is defined by a limit that mirrors the definition from Calculus I. It measures the change in $f$ as we move an infinitesimally small distance $h$ in the direction of a unit vector $\mathbf{u} = \langle a, b \rangle$.

$$D_{\mathbf{u}}f(x,y) = \lim_{h\to 0} \frac{f(x+ha, y+hb) - f(x,y)}{h}$$
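To see the limit and the role of the unit vector in action, here is a minimal Python sketch. The function $f(x,y) = x^2 + 3xy$, the point $(1,2)$, and the direction $\langle 0.6, 0.8 \rangle$ are assumptions chosen only for illustration; they do not come from the lecture's examples. For this particular $f$, the difference quotient works out to $7.2 + 1.8h$ for the unit vector, so the printed values should settle toward 7.2, while the length-2 vector settles toward twice that.

```python
def f(x, y):
    # Hypothetical example function, chosen only for this illustration.
    return x**2 + 3*x*y

def difference_quotient(f, x, y, a, b, h):
    """Return (f(x + h*a, y + h*b) - f(x, y)) / h for the vector <a, b>."""
    return (f(x + h*a, y + h*b) - f(x, y)) / h

x0, y0 = 1.0, 2.0
u = (0.6, 0.8)   # unit vector, since 0.6**2 + 0.8**2 == 1
v = (1.2, 1.6)   # same direction as u, but length 2

for h in (0.1, 0.01, 0.001):
    with_unit = difference_quotient(f, x0, y0, u[0], u[1], h)
    with_long = difference_quotient(f, x0, y0, v[0], v[1], h)
    # with_unit tends toward 7.2; with_long tends toward 14.4
    print(h, with_unit, with_long)
```

The second column is the directional derivative; the third shows how skipping the normalization step inflates the answer.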

While the limit definition is the formal foundation, we can derive a simpler computational formula. Fix the point $(x,y)$ and define the single-variable function $g(h) = f(x+ha, y+hb)$. Comparing with the limit above, the directional derivative is exactly $g'(0)$. Applying the multivariable chain rule from our last lecture to $g(h)$, where $x(h)=x+ha$ and $y(h)=y+hb$, we get $g'(h) = f_x \cdot \frac{dx}{dh} + f_y \cdot \frac{dy}{dh} = f_x \cdot a + f_y \cdot b$, with the partial derivatives evaluated at $(x+ha, y+hb)$. Setting $h=0$ evaluates them at $(x,y)$ and leads directly to our computational formula.

Formula for the Directional Derivative

If $f$ is a differentiable function of $x$ and $y$, then the directional derivative of $f$ in the direction of the unit vector $\mathbf{u} = \langle a, b \rangle$ is:

$$D_{\mathbf{u}}f(x,y) = f_x(x,y)a + f_y(x,y)b$$

Example 1: Calculating a Directional Derivative

Find the directional derivative of $f(x,y) = x^2 y^3 - 4y$ at the point $(2, -1)$ in the direction of the vector $\mathbf{v} = \langle 2, 5 \rangle$.

Solution:

Step 1: Normalize the direction vector. The vector $\mathbf{v}$ is not a unit vector. We find its magnitude: $|\mathbf{v}| = \sqrt{2^2 + 5^2} = \sqrt{4+25} = \sqrt{29}$.

The unit vector $\mathbf{u}$ is $\mathbf{u} = \frac{\mathbf{v}}{|\mathbf{v}|} = \left\langle \frac{2}{\sqrt{29}}, \frac{5}{\sqrt{29}} \right\rangle$. So, $a = \frac{2}{\sqrt{29}}$ and $b = \frac{5}{\sqrt{29}}$.

Step 2: Find the partial derivatives.

$f_x(x,y) = 2xy^3$

$f_y(x,y) = 3x^2y^2 - 4$

Step 3: Evaluate the partial derivatives at the point $(2, -1)$.

$f_x(2,-1) = 2(2)(-1)^3 = -4$

$f_y(2,-1) = 3(2)^2(-1)^2 - 4 = 3(4)(1) - 4 = 12 - 4 = 8$

Step 4: Use the formula.

$$D_{\mathbf{u}}f(2,-1) = f_x(2,-1)a + f_y(2,-1)b = (-4)\left(\frac{2}{\sqrt{29}}\right) + (8)\left(\frac{5}{\sqrt{29}}\right)$$

$$= \frac{-8}{\sqrt{29}} + \frac{40}{\sqrt{29}} = \frac{32}{\sqrt{29}}$$
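As a sanity check on Example 1, the short Python sketch below evaluates the formula with the partial derivatives found in Step 2 and compares it against a central difference quotient taken directly from $f$. The step size `h` and the variable names are assumptions made for this check, not part of the example.

```python
import math

def f(x, y):
    return x**2 * y**3 - 4*y

def f_x(x, y):
    return 2*x*y**3

def f_y(x, y):
    return 3*x**2*y**2 - 4

# Step 1: normalize v = <2, 5>
vx, vy = 2.0, 5.0
norm = math.hypot(vx, vy)
a, b = vx / norm, vy / norm

# Steps 2 to 4: evaluate the partials at (2, -1) and apply the formula
formula_value = f_x(2, -1) * a + f_y(2, -1) * b
exact_value = 32 / math.sqrt(29)   # the answer from Step 4, about 5.94

# Independent check: central difference along u, using f only
h = 1e-6
numeric_value = (f(2 + h*a, -1 + h*b) - f(2 - h*a, -1 - h*b)) / (2*h)

print(formula_value, exact_value, numeric_value)
```

All three printed numbers should agree to several decimal places, which confirms both the partial derivatives and the arithmetic.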

Check Your Understanding #1

Find the directional derivative of $g(x,y) = xe^y$ at the point $(2,0)$ in the direction of $\mathbf{v} = \langle 3, -4 \rangle$.

Topic 2: The Gradient Vector (∇f)

The formula for the directional derivative, $f_x a + f_y b$, looks like a dot product. This is no coincidence! We can "package" the partial derivatives of a function into a special new vector called the gradient.

The gradient of a function $f$, denoted $\nabla f$ (pronounced "del f"), is a vector-valued function (a vector field) whose components are the first-order partial derivatives of $f$.

The Gradient Vector

For $f(x,y)$, the gradient is a two-dimensional vector: $\nabla f(x,y) = \langle f_x(x,y), f_y(x,y) \rangle$.

For $f(x,y,z)$, the gradient is: $\nabla f(x,y,z) = \langle f_x(x,y,z), f_y(x,y,z), f_z(x,y,z) \rangle$.

Directional Derivative using the Gradient

The directional derivative can now be written concisely as the dot product of the gradient and the unit direction vector:

$$D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}$$

This dot product has a beautiful geometric interpretation. Recall from our study of vectors that the scalar projection of a vector $\mathbf{a}$ onto a vector $\mathbf{b}$ is given by $\frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{b}|}$. This number tells us the signed magnitude of the "shadow" that $\mathbf{a}$ casts on $\mathbf{b}$.

In our case, the directional derivative is precisely the scalar projection of the gradient vector $\nabla f$ onto the direction vector $\mathbf{u}$. Since $\mathbf{u}$ is a unit vector, its magnitude $|\mathbf{u}|$ is 1. The formula thus simplifies beautifully:

Scalar Projection of $\nabla f$ onto $\mathbf{u} = \frac{\nabla f \cdot \mathbf{u}}{|\mathbf{u}|} = \frac{\nabla f \cdot \mathbf{u}}{1} = \nabla f \cdot \mathbf{u}$.

This scalar (a number) represents the "steepness" of the mountain heading in the direction $\mathbf{u}$, or more formally, the instantaneous rate of change of the function in the direction of $\mathbf{u}$.

A Step-by-Step Guide to Finding the Directional Derivative

Step 1: Find the Gradient. Calculate the partial derivatives and assemble the gradient vector, $\nabla f = \langle f_x, f_y \rangle$.
Step 2: Find the Unit Vector. Identify the direction vector $\mathbf{v}$. If it's not a unit vector, normalize it: $\mathbf{u} = \frac{\mathbf{v}}{|\mathbf{v}|}$.
Step 3: Evaluate the Gradient. Plug the coordinates of the given point $P$ into your gradient vector to get a vector of constants, $\nabla f(P)$.
Step 4: Calculate the Dot Product. Compute the directional derivative by taking the dot product of the evaluated gradient and the unit vector: $D_{\mathbf{u}}f(P) = \nabla f(P) \cdot \mathbf{u}$.
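The guide above can be sketched as a small Python routine. One assumption to flag loudly: instead of differentiating by hand, this sketch approximates the gradient with central differences, so it is a numerical stand-in for the first and third items in the guide rather than the symbolic method used in lecture. The helper names and the test function $f(x,y) = x^2 + y^2$ are hypothetical.

```python
import math

def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at point using central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(*forward) - f(*backward)) / (2*h))
    return grad

def directional_derivative(f, point, v):
    """Directional derivative of f at point in the direction of v."""
    grad = numerical_gradient(f, point)          # gradient, already evaluated at P
    length = math.sqrt(sum(c*c for c in v))      # normalize the direction vector
    if length == 0:
        raise ValueError("the zero vector has no direction")
    u = [c / length for c in v]
    return sum(g*c for g, c in zip(grad, u))     # dot product

# Hypothetical usage: f(x, y) = x**2 + y**2 at (1, 1), heading along <1, 1>
result = directional_derivative(lambda x, y: x**2 + y**2, (1.0, 1.0), (1.0, 1.0))
print(result)   # close to 2*sqrt(2), since grad f = <2, 2> there
```

Because the routine loops over however many coordinates `point` has, the same sketch works for functions of three variables.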

Example 2: Calculating the Gradient

Find the gradient of $f(x,y,z) = x\sin(yz)$ at the point $(1, 3, 0)$.

Solution:

First, we find the partial derivatives:

$f_x = \sin(yz)$

$f_y = xz\cos(yz)$

$f_z = xy\cos(yz)$

Now, we evaluate each partial derivative at the point $(1,3,0)$:

$f_x(1,3,0) = \sin(3 \cdot 0) = \sin(0) = 0$

$f_y(1,3,0) = (1)(0)\cos(0) = 0$

$f_z(1,3,0) = (1)(3)\cos(0) = 3$

The gradient vector at this point is $\nabla f(1,3,0) = \langle 0, 0, 3 \rangle$.

Check Your Understanding #2

Find the gradient of $g(x,y,z) = z^2 e^{xy}$ at the point $(0,1,2)$.

Topic 3: The Geometric Significance of the Gradient ⛰️

The gradient is far more than a notational shortcut. Using the alternate formula for the dot product, $D_{\mathbf{u}}f = |\nabla f| |\mathbf{u}| \cos\theta$, and knowing $|\mathbf{u}|=1$, we get:

$$D_{\mathbf{u}}f = |\nabla f| \cos\theta$$

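where $\theta$ is the angle between the gradient vector $\nabla f$ and the direction vector $\mathbf{u}$. Assuming $\nabla f \neq \mathbf{0}$ (otherwise every directional derivative is zero and $\theta$ is undefined), this simple equation reveals three crucial properties: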

The directional derivative is maximized when $\cos\theta=1$ (so $\theta=0$). This means the direction of steepest ascent is the direction of the gradient vector, $\nabla f$. The maximum rate of change is $|\nabla f|$.
The directional derivative is minimized (most negative) when $\cos\theta=-1$ (so $\theta=\pi$). The direction of steepest descent is $-\nabla f$.
The directional derivative is zero when $\cos\theta=0$ (so $\theta=\pi/2$). This happens when the direction $\mathbf{u}$ is orthogonal (perpendicular) to the gradient.

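That last point is huge: if you move in a direction perpendicular to the gradient, the instantaneous rate of change of the function is zero. Conversely, $f$ is constant along a level curve, so the directional derivative in the direction tangent to the level curve is zero. This means the gradient vector at a point (when it is nonzero) is always normal (perpendicular) to the level curve (or surface) passing through that point.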
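These properties can be checked by brute force. The Python sketch below assumes a hypothetical gradient value $\nabla f = \langle 3, 4 \rangle$ at some point (so $|\nabla f| = 5$), sweeps the unit vector $\mathbf{u}$ through every whole-degree compass angle, and records where $\nabla f \cdot \mathbf{u}$ is largest and smallest.

```python
import math

grad = (3.0, 4.0)   # hypothetical gradient value at a point; its magnitude is 5

def directional_derivative(grad, angle):
    """Dot product of grad with the unit vector <cos(angle), sin(angle)>."""
    return grad[0]*math.cos(angle) + grad[1]*math.sin(angle)

# Evaluate the directional derivative at 0, 1, 2, ..., 359 degrees
values = [directional_derivative(grad, math.radians(d)) for d in range(360)]

best = max(range(360), key=lambda d: values[d])    # angle of steepest ascent
worst = min(range(360), key=lambda d: values[d])   # angle of steepest descent
grad_angle = math.degrees(math.atan2(grad[1], grad[0]))   # about 53.13 degrees

print(best, values[best])      # 53, just under 5
print(worst, values[worst])    # 233, just above -5

# A direction perpendicular to the gradient gives a value that is zero up to rounding
print(directional_derivative(grad, math.radians(grad_angle + 90)))
```

The best whole-degree angle lands next to the gradient's own angle, the worst is 180 degrees away, and the extreme values match $\pm|\nabla f|$ up to the one-degree resolution of the sweep.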

It is important to clarify what "moving in the direction of the gradient" means. For a function $z=f(x,y)$, the gradient $\nabla f$ is a 2D vector that lies in the xy-plane (like a compass direction on a map). "Going in the direction of the gradient" means walking on the 3D surface (the mountain) in such a way that your shadow on the xy-plane (the map) moves in the direction of $\nabla f$.

In the graph below, unclick the first equation for the surface to see the relationship in the xy-plane. The gradient (red) points in the direction of steepest ascent, and it is orthogonal to the tangent vector (black) of the level curve (blue).

Connection to Machine Learning: Gradient Descent

The concept of "steepest descent" is the foundation of many modern optimization algorithms. In machine learning, a "cost function" measures how inaccurate a model's predictions are. The graph of this function can be pictured as a high-dimensional surface, and the goal is to find a low point on it. The gradient descent algorithm does this by starting at an initial point (often chosen at random) and repeatedly taking a small step in the direction of the negative gradient, $-\nabla f$. By always moving in the direction of steepest descent, with a suitably small step size, it "walks down the hill" toward a local minimum of the cost function, thereby improving the model.
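A minimal Python sketch of this idea follows. Everything specific in it is an assumption for illustration: the "cost function" is the simple bowl $f(x,y) = x^2 + 4y^2$, the starting point is fixed rather than random, and the step size (named `learning_rate` here) and number of steps are arbitrary choices. Real machine learning libraries add many refinements that this sketch omits.

```python
def grad_f(x, y):
    # Gradient of the illustrative cost surface f(x, y) = x**2 + 4*y**2
    return (2*x, 8*y)

def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step in the direction of the negative gradient."""
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= learning_rate * gx   # move against the gradient
        y -= learning_rate * gy
        path.append((x, y))
    return path

path = gradient_descent(grad_f, start=(2.0, -1.0))
final_x, final_y = path[-1]
print(final_x, final_y)   # both very close to 0, the minimum of this bowl
```

With this step size each update shrinks $x$ by a factor of 0.8 and $y$ by a factor of 0.2, so the path converges to the origin. A step size that is too large can overshoot and diverge instead, which is why the step is kept small.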

Here is a visualization of the gradient descent algorithm we just discussed, which repeatedly takes a step in the direction of the negative gradient to descend to a minimum.

Example 3: Finding the Direction of Steepest Ascent

Let $f(x,y) = x^2 + 4y^2$. Find the direction of maximum increase and the value of this maximum rate of change at the point $P(2, -1)$.

Solution:

The direction of maximum increase is simply the direction of the gradient vector. First, we compute the gradient:

$\nabla f = \langle f_x, f_y \rangle = \langle 2x, 8y \rangle$.

Next, evaluate the gradient at the point $P(2, -1)$:

$\nabla f(2, -1) = \langle 2(2), 8(-1) \rangle = \langle 4, -8 \rangle$.

The direction of maximum increase is $\langle 4, -8 \rangle$, or, written as a unit vector, $\frac{1}{\sqrt{5}}\langle 1, -2 \rangle$.

The maximum rate of change is the magnitude of this gradient vector:

$|\nabla f(2,-1)| = \sqrt{4^2 + (-8)^2} = \sqrt{16 + 64} = \sqrt{80} = 4\sqrt{5}$.
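The numbers in Example 3 can be reproduced in a few lines of Python. This is only a check of the arithmetic above; the variable names are assumptions.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x**2 + 4*y**2 from Example 3
    return (2*x, 8*y)

gx, gy = grad_f(2, -1)
max_rate = math.hypot(gx, gy)                    # magnitude of the gradient
unit_direction = (gx / max_rate, gy / max_rate)  # direction of steepest ascent

print((gx, gy))         # (4, -8)
print(max_rate)         # about 8.944, which is 4*sqrt(5)
print(unit_direction)   # about (0.447, -0.894)
```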

Check Your Understanding #3

For the function $f(x,y) = \sin(xy)$, find the direction of steepest descent at the point $(\pi, 1/2)$.

Topic 4: Tangent Planes to Level Surfaces 🏛️

Recall that for a function $F(x,y,z)$, a level surface is the set of all points $(x,y,z)$ where the function has a constant value, i.e., $F(x,y,z) = k$. For example, the level surfaces of $F=x^2+y^2+z^2$ are concentric spheres.

Just as the 2D gradient is normal to level curves, the 3D gradient $\nabla F$ at a point $P(x_0,y_0,z_0)$ is normal (perpendicular) to the level surface that passes through $P$. This gives us an easy way to find the tangent plane to a surface.

Equation of the Tangent Plane and Normal Line

The equation of the tangent plane to the level surface $F(x,y,z)=k$ at the point $P(x_0,y_0,z_0)$ is:

$$F_x(P)(x-x_0) + F_y(P)(y-y_0) + F_z(P)(z-z_0) = 0$$

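The normal vector to this plane is the gradient $\nabla F(P)$, which we assume is not the zero vector. The normal line is the line that passes through $P$ in the direction of this normal vector; it can be written parametrically as $\langle x, y, z \rangle = \langle x_0, y_0, z_0 \rangle + t\,\nabla F(P)$.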
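The recipe translates almost line for line into code. The Python sketch below uses hypothetical helper names (`tangent_plane`, `normal_line`) and, as an assumed test case, the sphere $x^2+y^2+z^2=9$ at the point $(1,2,2)$. It returns the plane in the rearranged form $Ax + By + Cz = D$, which is the boxed equation with the constants moved to the right side.

```python
def tangent_plane(grad_at_p, p):
    """Return (A, B, C, D) for the plane A*x + B*y + C*z = D through p."""
    A, B, C = grad_at_p
    D = A*p[0] + B*p[1] + C*p[2]
    return (A, B, C, D)

def normal_line(grad_at_p, p, t):
    """Return the point p + t * grad_at_p on the normal line."""
    return tuple(pi + t*gi for pi, gi in zip(p, grad_at_p))

def grad_F(x, y, z):
    # Gradient of F(x, y, z) = x**2 + y**2 + z**2 (assumed example)
    return (2*x, 2*y, 2*z)

p = (1.0, 2.0, 2.0)            # lies on the sphere, since 1 + 4 + 4 = 9
n = grad_F(*p)                 # normal vector (2.0, 4.0, 4.0)
plane = tangent_plane(n, p)
print(plane)                   # (2.0, 4.0, 4.0, 18.0), i.e. 2x + 4y + 4z = 18
print(normal_line(n, p, 1.0))  # (3.0, 6.0, 6.0)
```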

Example 4: Finding a Tangent Plane

Find the equation of the tangent plane to the ellipsoid $x^2 + 4y^2 + z^2 = 18$ at the point $(1, 2, 1)$.

Solution:

The ellipsoid is a level surface of the function $F(x,y,z) = x^2 + 4y^2 + z^2$ for the constant $k=18$. The normal vector to the tangent plane is the gradient of $F$.

$\nabla F = \langle 2x, 8y, 2z \rangle$.

Evaluate the gradient at the point $(1,2,1)$ to get the specific normal vector:

$\nabla F(1,2,1) = \langle 2(1), 8(2), 2(1) \rangle = \langle 2, 16, 2 \rangle$.

Using the point $(x_0,y_0,z_0)=(1,2,1)$ and the normal vector $\langle 2, 16, 2 \rangle$, the equation of the plane is:

$2(x-1) + 16(y-2) + 2(z-1) = 0$

$2x - 2 + 16y - 32 + 2z - 2 = 0$

$2x + 16y + 2z = 36$, or more simply, $x + 8y + z = 18$.
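One way to build confidence in this answer is to check numerically that the plane really hugs the ellipsoid near $(1,2,1)$. The Python sketch below is an informal check under assumed names: it slides a short distance `d` along the ellipsoid (changing $x$, holding $y=2$, and solving for $z$) and measures how far that nearby surface point misses the plane $x + 8y + z = 18$.

```python
import math

def plane_residual(x, y, z):
    """Left side minus right side of the tangent plane x + 8y + z = 18."""
    return x + 8*y + z - 18

def nearby_point(d):
    """Point on the ellipsoid x**2 + 4*y**2 + z**2 = 18 with x = 1 + d, y = 2."""
    x, y = 1 + d, 2
    z = math.sqrt(18 - x**2 - 4*y**2)   # upper branch, valid for small d
    return (x, y, z)

print(plane_residual(1, 2, 1))   # 0: the point of tangency lies on the plane

for d in (0.1, 0.01, 0.001):
    # The miss distance shrinks roughly like d**2, much faster than d itself
    print(d, plane_residual(*nearby_point(d)))
```

A residual that shrinks like the square of the step is the signature of tangency; a plane that merely crossed the surface at the point would show a residual proportional to `d`.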

Check Your Understanding #4

Find the equation of the tangent plane to the paraboloid $z = x^2 + y^2$ at the point $(1, 2, 5)$.

Topic 5: Summary of the Gradient Vector

The gradient is one of the most important concepts in multivariable calculus. This table summarizes its key properties and formulas.

Property/Concept	Formula / Description
Definition	$\nabla f = \langle f_x, f_y, f_z \rangle$
Directional Derivative	$D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}$
Direction of Max Increase	The direction of the gradient vector, $\nabla f$.
Max Rate of Increase	The magnitude of the gradient vector, $|\nabla f|$.
Direction of Max Decrease	The direction opposite the gradient vector, $-\nabla f$.
Orthogonality	$\nabla f$ is perpendicular to the level curves/surfaces of $f$.
Tangent Plane Normal	$\nabla F$ is the normal vector to the level surface $F(x,y,z)=k$.

Lecture Conclusion & Practice Problems

Today we moved beyond the limitations of partial derivatives. The directional derivative lets us find the rate of change in any direction, and the gradient vector is the key to calculating it. More importantly, the gradient itself tells us the direction of steepest ascent and is always normal to level curves, making it a powerful tool for understanding the geometry of multivariable functions.

Final Practice #1

The temperature at a point $(x,y)$ on a metal plate is given by $T(x,y) = 400 e^{-(x^2+y)/2}$. An ant at $(1,1)$ wants to walk in the direction in which it will cool off the fastest. In what direction should it walk?

Final Practice #2

Find the equations of the tangent plane and the normal line to the hyperboloid $x^2 + y^2 - z^2 = 1$ at the point $P(1, 1, 1)$.