Chapter 14: Partial Derivatives

14.1 Functions of Several Variables

The first several pages of this chapter are just boring explanations of two-variable functions...

Level Curves

The level curves of a function $f$ of two variables are the curves with equations $f(x,y)=k$ , where $k$ is a constant (in the range of $f$ ).

14.2 Limits and Continuity

Limits of Multivariate Functions

For a two-variable function $f$, we write

\lim_{ (x,y) \to (a,b) } f(x,y)=L

The definition extends similarly to higher dimensions.

Formally, in $\epsilon$-$\delta$ terms:

Limit Definition

Let $f$ be a function of two variables whose domain $D$ includes points arbitrarily close to $(a,b)$ . The limit of $f(x,y)$ as $(x,y)$ approaches $(a,b)$ is $L$ if for every number $\epsilon>0$ there is a corresponding number $\delta>0$ such that if $(x,y)\in D$ and $0<\sqrt{ (x-a)^{2}+(y-b)^{2} }<\delta$ then $\lvert f(x,y)-L \rvert<\epsilon$ .

In other words, the distance between $f(x,y)$ and $L$ can be made arbitrarily small by making the distance from $(x,y)$ to $(a,b)$ sufficiently small (but not zero).

There is one complication with limits of multivariate functions that does not appear for univariate functions, though. Namely, $(x,y)$ can approach $(a,b)$ along infinitely many paths, not just from the left or the right. Thus, if $f(x,y)$ approaches different values along two different paths into $(a,b)$, the limit of $f(x,y)$ at $(a,b)$ does not exist. Formally,

Limit Existence

If $f(x,y)\to L_{1}$ as $(x,y)\to(a,b)$ along a path $C_{1}$ and $f(x,y)\to L_{2}$ as $(x,y)\to(a,b)$ along a path $C_{2}$ , where $L_{1}\neq L_{2}$ , then $\lim_{ (x,y) \to (a,b) }f(x,y)$ does not exist.
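As a rough numerical illustration of the two-path test, the sketch below uses the function $f(x,y)=\frac{xy}{x^{2}+y^{2}}$ (an example chosen here, not taken from the notes) and approaches $(0,0)$ along the $x$-axis and along the line $y=x$. The names `along_x_axis` and `along_diagonal` are just illustrative.

```python
def f(x, y):
    # Not defined at the origin; we only evaluate it near (0, 0).
    return x * y / (x**2 + y**2)

# Step sizes shrinking toward 0.
steps = [10.0**-k for k in range(1, 8)]

# Path C1: the x-axis (y = 0). Path C2: the line y = x.
along_x_axis = [f(t, 0.0) for t in steps]
along_diagonal = [f(t, t) for t in steps]

print(along_x_axis[-1])    # values along C1 stay at 0
print(along_diagonal[-1])  # values along C2 stay at 1/2
```

Since the values along the two paths settle at different numbers ($0$ and $\frac{1}{2}$), the limit at $(0,0)$ does not exist for this function. A numerical check like this only suggests the answer; the actual argument is the algebra along each path.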

Continuity

A two-variable function $f$ is continuous at $(a,b)$ if

\lim_{ (x,y) \to (a,b) } f(x,y)=f(a,b)

Also, if $f$ is continuous at every point $(a,b)$ in its domain $D$ , then $f$ is continuous on $D$ .

Using the properties of limits, it is important to note the following property of continuous functions:

If $f$ and $g$ are continuous at $(a,b)$, then the function $h=f*g$, where $*$ is one of $+,-,\times,\div$, is also continuous at $(a,b)$ (for $\div$, provided $g(a,b)\neq 0$). The converse does not hold in general.

As a result, every multivariate polynomial is continuous everywhere, and every rational function is continuous on its domain (wherever its denominator is nonzero).

Continuity for Proving Limits

If a function $f$ is continuous at $(a,b)$, then the limit at $(a,b)$ exists and equals $f(a,b)$. Moreover, a composition of continuous functions is continuous, so its limit at $(a,b)$ also exists and can be found by direct substitution. This is often helpful for proving that a limit exists at $(a,b)$ (so you don't have to use an $\epsilon$-$\delta$ proof).

14.3 Partial Derivatives

Partial Derivatives

A partial derivative of a multivariate function $f$ is essentially just a univariate derivative that considers all other variables constant. Consider a two-variable function $f(x,y)$ . Then, the partial derivative with respect to $x$ at $(a,b)$ is denoted $f_{x}(a,b)$ and

f_{x}(a,b)=g'(a)

where $g(x)=f(x,b)$.

Similarly, the partial derivative with respect to $y$ at $(a,b)$ is

f_{y}(a,b)=h'(b)

where $h(y)=f(a,y)$.

We can also write this in limit definition form.

\begin{align*} f_{x}(a,b) &=\lim_{ h \to 0 } \frac{f(a+h,b)-f(a,b)}{h} \\ f_{y}(a,b) &=\lim_{ h \to 0 } \frac{f(a,b+h)-f(a,b)}{h} \end{align*}

So, TL;DR, when finding the partial derivative of a multivariate function $f$ with respect to some variable, e.g. $x$, treat all other variables as constants when differentiating.

With regard to notation, the partial derivative of a multivariate function $f$ with respect to $x$ is also written

\frac{ \partial f }{ \partial x }
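To make the "hold the other variable constant" idea concrete, here is a minimal numerical sketch. It assumes the example function $f(x,y)=x^{3}+x^{2}y^{3}-2y^{2}$ and uses a central difference quotient (a symmetric variant of the limit definition above, swapped in because it is more accurate numerically). The helper names `partial_x` and `partial_y` are hypothetical.

```python
def f(x, y):
    return x**3 + x**2 * y**3 - 2 * y**2

def partial_x(f, a, b, h=1e-5):
    # Vary x only; y is held fixed at b.
    return (f(a + h, b) - f(a - h, b)) / (2 * h)

def partial_y(f, a, b, h=1e-5):
    # Vary y only; x is held fixed at a.
    return (f(a, b + h) - f(a, b - h)) / (2 * h)

fx = partial_x(f, 2.0, 1.0)  # by hand: f_x = 3x^2 + 2xy^3, so f_x(2, 1) = 16
fy = partial_y(f, 2.0, 1.0)  # by hand: f_y = 3x^2 y^2 - 4y, so f_y(2, 1) = 8
print(fx, fy)
```

The printed values should agree with the hand-computed $16$ and $8$ up to a small numerical error.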

Higher-Order Derivatives

The following are all second partial derivatives of $f$ :

\begin{align*} f_{xx}&=\frac{ \partial^{2}f }{ \partial x^{2} } = \frac{ \partial }{ \partial x } \left( \frac{ \partial f }{ \partial x } \right) \\ f_{xy} &= \frac{ \partial^{2}f }{ \partial y \, \partial x } = \frac{ \partial }{ \partial y } \left( \frac{ \partial f }{ \partial x } \right) \\ f_{yx} &=\frac{ \partial^{2}f }{ \partial x \, \partial y } = \frac{ \partial }{ \partial x } \left( \frac{ \partial f }{ \partial y } \right) \\ f_{yy} &= \frac{ \partial^{2} f }{ \partial y^{2} } = \frac{ \partial }{ \partial y } \left( \frac{ \partial f }{ \partial y } \right) \end{align*}

Third, fourth, etc. partial derivatives of $f$ are defined similarly.

Most functions we will see will actually have $f_{yx}=f_{xy}$ . The following theorem specifies when this occurs:

Clairaut's Theorem

Suppose $f$ is defined on a disk $D$ that contains the point $(a,b)$. If the functions $f_{xy}$ and $f_{yx}$ are both continuous on $D$, then $f_{xy}(a,b)=f_{yx}(a,b)$.

14.4 Tangent Planes and Linear Approximations

Tangent Planes

Suppose a surface $S$ has equation $z=f(x,y)$, where $f$ has continuous first partial derivatives, and let $P(x_{0},y_{0},z_{0})$ be a point on $S$. Let $C_{1},C_{2}$ be the curves that result from intersecting the surface $S$ with the planes $y=y_{0}$ and $x=x_{0}$, respectively. Note that $P$ lies on both $C_{1}$ and $C_{2}$. Let $T_{1}$ and $T_{2}$ be the tangent lines to curves $C_{1}$ and $C_{2}$ at $P$. Then the tangent plane to the surface $S$ at the point $P$ is the plane that contains both tangent lines $T_{1}$ and $T_{2}$.

Note that if $C$ is any other curve that lies on the surface $S$ and passes through $P$, then its tangent line at $P$ also lies in the tangent plane. In other words, the tangent plane at $P$ consists of all possible tangent lines at $P$ to curves on $S$ through $P$, and it is the plane that best approximates the surface $S$ near $P$. This will be covered in more detail in 14.6.

Tangent Plane Derivation

A plane passing through $P(x_{0},y_{0},z_{0})$ has the form

A(x-x_{0})+B(y-y_{0})+C(z-z_{0})=0

Assuming $C\neq 0$ (i.e. the plane is not vertical), divide by $C$ and let $a=-\frac{A}{C}$ and $b=-\frac{B}{C}$. Then,

z-z_{0}=a(x-x_{0})+b(y-y_{0})

Now intersect this plane with the vertical plane $y=y_{0}$, which cuts $S$ along $C_{1}$. Setting $y=y_{0}$ in the equation above, we get

z-z_{0}=a(x-x_{0})

Remember that $T_{1}$ is the tangent line to the curve $C_{1}$ at $P$, and lies in the tangent plane. Therefore, the equation above (together with $y=y_{0}$) represents $T_{1}$, whose slope is $a$. We know that the slope of the tangent $T_{1}$ is $f_{x}(x_{0},y_{0})$, i.e. $\frac{ \partial f }{ \partial x }$ evaluated at $(x_{0},y_{0})$. Thus, $a=f_{x}(x_{0},y_{0})$. A similar argument with the plane $x=x_{0}$ gives $b=f_{y}(x_{0},y_{0})$. Hence:

Tangent Plane Equation

Suppose $f$ has continuous partial derivatives. The equation of the tangent plane to the surface $z=f(x,y)$ at $P(x_{0},y_{0},z_{0})$ is

z=z_{0}+f_{x}(x_{0},y_{0})(x-x_{0})+f_{y}(x_{0},y_{0})(y-y_{0})

Linear Approximations

Note that the tangent plane equation defines a linear function that approximates $f$ near $(x_{0},y_{0})$, namely $L(x,y)=f(x_{0},y_{0})+f_{x}(x_{0},y_{0})(x-x_{0})+f_{y}(x_{0},y_{0})(y-y_{0})$. This function $L$ is called the linearization of $f$ at $(x_{0},y_{0})$, and the approximation $f(x,y)\approx L(x,y)$ is the linear approximation (or tangent plane approximation) of $f$ at $(x_{0},y_{0})$.
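A small sketch may help show how good the tangent plane approximation is near the base point. It assumes the example $f(x,y)=2x^{2}+y^{2}$ at $(1,1)$, with the partial derivatives $f_{x}=4x$ and $f_{y}=2y$ computed by hand. The function name `linearization` is illustrative only.

```python
def f(x, y):
    return 2 * x**2 + y**2

def linearization(x, y, x0=1.0, y0=1.0):
    # Hand-computed partial derivatives of f at the base point.
    fx = 4 * x0  # f_x = 4x
    fy = 2 * y0  # f_y = 2y
    # z = z0 + f_x (x - x0) + f_y (y - y0)
    return f(x0, y0) + fx * (x - x0) + fy * (y - y0)

approx = linearization(1.1, 0.95)  # 3 + 4(0.1) + 2(-0.05) = 3.3
exact = f(1.1, 0.95)               # 2(1.21) + 0.9025 = 3.3225
print(approx, exact)
```

Near $(1,1)$ the two values differ by only about $0.02$. Farther from the base point the approximation would be expected to get worse.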

Differentiability

Formally,

Differentiability

If $z=f(x,y)$, then $f$ is differentiable at $(a,b)$ if the increment $\Delta z=f(a+\Delta x,b+\Delta y)-f(a,b)$ can be expressed in the form

\Delta z=f_{x}(a,b)\Delta x+f_{y}(a,b)\Delta y+\epsilon_{1}\Delta x+\epsilon_{2}\Delta y

where $\epsilon_{1},\epsilon_{2}\to 0$ as $(\Delta x,\Delta y)\to(0,0)$.

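This definition can be hard to check directly, so the following sufficient condition is often more convenient: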

Differentiability

If the partial derivatives $f_{x}$ and $f_{y}$ exist near $(a,b)$ and are continuous at $(a,b)$ , then $f$ is differentiable at $(a,b)$ .

Equivalently, we may write the definition as a limit:

Differentiability (Limit Definition)

$f$ is differentiable at $(a,b)$ if

\lim_{ (x,y) \to (a,b) } \frac{f(x,y)-h(x,y)}{\lvert (x,y)-(a,b) \rvert }=0

where $h(x,y)=f(a,b)+f_{x}(a,b)(x-a)+f_{y}(a,b)(y-b)$, i.e. $h$ is the linearization of $f$ at $(a,b)$.

Differentials

For a univariate function $y=f(x)$ , we can write

f'(x)=\frac{\textrm{d} y }{\textrm{d} x } \implies \mathrm{d}y=f'(x)\mathrm{d}x

For a two-variable function $z=f(x,y)$ , we can instead write

\mathrm{d}z=f_{x}(x,y)\mathrm{d}x+f_{y}(x,y)\mathrm{d}y=\frac{ \partial z }{ \partial x } \mathrm{d}x + \frac{ \partial z }{ \partial y } \mathrm{d}y

For a general multivariate function $f(x_{1},x_{2},\dots,x_{n})$ ,

\mathrm{d}f=\sum_{i=1}^{n} \frac{ \partial f }{ \partial x_{i} } \mathrm{d}x_{i}
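The differential $\mathrm{d}z$ is meant to approximate the actual change $\Delta z$. The sketch below compares the two for an assumed example, $z=x^{2}+3xy-y^{2}$, as $(x,y)$ moves from $(2,3)$ to $(2.05,2.96)$. The partial derivatives $2x+3y$ and $3x-2y$ are computed by hand.

```python
def f(x, y):
    return x**2 + 3 * x * y - y**2

x, y = 2.0, 3.0
dx, dy = 0.05, -0.04

# dz = f_x dx + f_y dy, with f_x = 2x + 3y and f_y = 3x - 2y.
dz = (2 * x + 3 * y) * dx + (3 * x - 2 * y) * dy

# Actual change in z over the same step.
delta_z = f(x + dx, y + dy) - f(x, y)

print(dz, delta_z)  # roughly 0.65 versus 0.6449
```

The differential is easier to compute than $\Delta z$ and, for small $\mathrm{d}x,\mathrm{d}y$, lands close to it.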

14.5 Chain Rule

Chain Rule

Recall the chain rule applied to univariate functions $y=f(x)$ and $x=g(t)$ :

\frac{\mathrm{d} y }{\mathrm{d} t } =\frac{\mathrm{d} y }{\mathrm{d} x } \cdot \frac{\mathrm{d} x }{\mathrm{d} t }

For multivariate functions, there actually exist several versions of the chain rule. First, we consider the case where $x$ and $y$ are univariate functions.

The Chain Rule (1)

Suppose that $z=f(x,y)$ is a differentiable function of $x$ and $y$ , where $x=g(t)$ and $y=h(t)$ are both differentiable functions in terms of $t$ . Then $z$ is a differentiable function of $t$ and

\frac{\mathrm{d} z }{\mathrm{d} t } = \frac{ \partial f }{ \partial x } \frac{\mathrm{d} x }{\mathrm{d} t } +\frac{ \partial f }{ \partial y } \frac{\mathrm{d} y }{\mathrm{d} t }
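As a sanity check of this first version of the chain rule, the sketch below assumes the example $z=x^{2}y+3xy^{4}$ with $x=\sin 2t$ and $y=\cos t$, and compares the chain rule value of $\frac{\mathrm{d}z}{\mathrm{d}t}$ at $t=0$ with a numerical derivative of the composite function. The names `z_of_t` and `dz_dt_chain` are made up for this example.

```python
import math

def z_of_t(t):
    # The composite function z(t) = f(x(t), y(t)).
    x, y = math.sin(2 * t), math.cos(t)
    return x**2 * y + 3 * x * y**4

def dz_dt_chain(t):
    x, y = math.sin(2 * t), math.cos(t)
    dz_dx = 2 * x * y + 3 * y**4    # partial of z with respect to x
    dz_dy = x**2 + 12 * x * y**3    # partial of z with respect to y
    dx_dt = 2 * math.cos(2 * t)
    dy_dt = -math.sin(t)
    return dz_dx * dx_dt + dz_dy * dy_dt

chain = dz_dt_chain(0.0)  # (0 + 3)(2) + (0)(0) = 6

# Central difference of the composite function at t = 0.
h = 1e-6
numeric = (z_of_t(h) - z_of_t(-h)) / (2 * h)
print(chain, numeric)
```

Both numbers should be (approximately) $6$, which is what the formula predicts.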

Now, we consider the case where $x$ and $y$ are multivariate functions.

The Chain Rule (2)

Suppose that $z=f(x,y)$ is a differentiable function of $x$ and $y$ , where $x=g(s,t)$ and $y=h(s,t)$ are differentiable functions of $s$ and $t$ . Then

\frac{ \partial z }{ \partial s } = \frac{ \partial z }{ \partial x } \frac{ \partial x }{ \partial s } + \frac{ \partial z }{ \partial y } \frac{ \partial y }{ \partial s } \qquad \qquad \frac{ \partial z }{ \partial t } = \frac{ \partial z }{ \partial x } \frac{ \partial x }{ \partial t } + \frac{ \partial z }{ \partial y } \frac{ \partial y }{ \partial t }

In other words, just apply the case 1 chain rule separately to $s$ and $t$ .

The Chain Rule (General)

Suppose $u$ is a differentiable function of the $n$ variables $x_{1},x_{2},\dots,x_{n}$ and each $x_{j}$ is a differentiable function of the $m$ variables $t_{1},t_{2},\dots,t_{m}$. Then $u$ is a function of $t_{1},t_{2},\dots,t_{m}$ and, for each $i=1,2,\dots,m$,

\frac{ \partial u }{ \partial t_{i} } = \sum_{j=1}^{n} \frac{ \partial u }{ \partial x_{j} } \frac{ \partial x_{j} }{ \partial t_{i} }

Implicit Differentiation

Suppose that an equation of the form $F(x,y)=0$ defines $y$ implicitly as a differentiable function of $x$, i.e. $y=f(x)$ where $F(x,f(x))=0,\forall x \in \mathrm{Domain}(f)$. If $F$ is differentiable and $\frac{ \partial F }{ \partial y }\neq 0$, then we can apply the Chain Rule to differentiate $F(x,y)=0$ with respect to $x$:

\begin{align*} \frac{ \partial F }{ \partial x } \cancelto{ 1 }{ \frac{\mathrm{d} x }{\mathrm{d} x } } +\frac{ \partial F }{ \partial y } \frac{\mathrm{d} y }{\mathrm{d} x } &= 0 \\ \frac{ \partial F }{ \partial x } +\frac{ \partial F }{ \partial y } \frac{\mathrm{d} y }{\mathrm{d} x } &= 0 \\ \frac{ \partial F }{ \partial y } \frac{\mathrm{d} y }{\mathrm{d} x } &= -\frac{ \partial F }{ \partial x } \\ \frac{\mathrm{d} y }{\mathrm{d} x } &= \boxed{ -\frac{\frac{ \partial F }{ \partial x }}{\frac{ \partial F }{ \partial y }} } \\ \end{align*}
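To see the boxed formula in action, the sketch below assumes the curve $x^{3}+y^{3}=6xy$, written as $F(x,y)=x^{3}+y^{3}-6xy=0$, and evaluates the slope at the point $(3,3)$, which lies on the curve. The partial derivatives of $F$ are computed by hand.

```python
def F(x, y):
    return x**3 + y**3 - 6 * x * y

def dy_dx(x, y):
    F_x = 3 * x**2 - 6 * y
    F_y = 3 * y**2 - 6 * x
    # The formula only makes sense where F_y != 0.
    return -F_x / F_y

on_curve = F(3.0, 3.0)   # 27 + 27 - 54 = 0, so (3, 3) is on the curve
slope = dy_dx(3.0, 3.0)  # -(27 - 18) / (27 - 18) = -1
print(on_curve, slope)
```

No explicit formula for $y$ in terms of $x$ was needed, which is the main appeal of the method.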

Now suppose that $z$ is given implicitly as a function $z=f(x,y)$ by an equation of the form $F(x,y,z)=0$ , i.e. $F(x,y,f(x,y))=0,\forall(x,y)\in \mathrm{Domain}(f)$ . If $F$ and $f$ are differentiable, then we can apply Chain Rule to differentiate the equation $F(x,y,z)=0$ with respect to $x$ :

\begin{align*} \frac{ \partial F }{ \partial x } \cancelto{ 1 }{ \frac{ \partial x }{ \partial x } }+\frac{ \partial F }{ \partial y } \cancelto{ 0 }{ \frac{ \partial y }{ \partial x } } +\frac{ \partial F }{ \partial z } \frac{ \partial z }{ \partial x } &= 0 \\ \frac{ \partial F }{ \partial z } \frac{ \partial z }{ \partial x } &= -\frac{ \partial F }{ \partial x } \\ \frac{ \partial z }{ \partial x } &= \boxed{ -\frac{\frac{ \partial F }{ \partial x } }{\frac{ \partial F }{ \partial z } } } \end{align*}

Similarly, by differentiating with respect to $y$ , we derive

\frac{ \partial z }{ \partial y } =\boxed{ -\frac{\frac{ \partial F }{ \partial y } }{\frac{ \partial F }{ \partial z } } }

As a sidenote, the textbook mentions the Implicit Function Theorem, which stipulates conditions under which these formulas are valid. These conditions are not included in the notes because they seemed rather irrelevant to the necessary knowledge for Math 53.

14.6 Directional Derivatives and the Gradient Vector

Directional Derivative

The following diagram may be useful to reference as you read this section:

Consider $z=f(x,y)$ . We aim to find the rate of change of $z$ at $(x_{0},y_{0})$ in the direction of the arbitrary unit vector $\mathbf{u}=\langle a,b \rangle$ .

Let $S$ be the surface with equation $z=f(x,y)$, and let $z_{0}=f(x_{0},y_{0})$. Then $P(x_{0},y_{0},z_{0})$ lies on $S$. The vertical plane that passes through $P$ in the direction of $\mathbf{u}$ intersects $S$ in a curve $C$. The slope of the tangent line $T$ to $C$ at the point $P$ is the rate of change of $z$ in the direction of $\mathbf{u}$.

Let $Q(x,y,z)$ be another point on $C$ . Let $P',Q'$ be the projections of $P,Q$ onto the $xy$ -plane. Then, the vector $\overrightarrow{P'Q'} \parallel \mathbf{u}$ , and thus

\overrightarrow{P'Q'}=h\mathbf{u}=\langle ha,hb \rangle

for some scalar $h$ . In other words, $x-x_{0}=ha$ and $y-y_{0}=hb$ , i.e. $x=x_{0}+ha$ and $y=y_{0}+hb$ . Therefore,

\frac{\Delta z}{h}=\frac{z-z_{0}}{h}=\frac{f(x_{0}+ha,y_{0}+hb)-f(x_{0},y_{0})}{h}

Taking the limit as $h\to 0$ , we get

Directional Derivative (Limit Definition)

The directional derivative of $f$ at $(x_{0},y_{0})$ in the direction of the unit vector $\mathbf{u}=\langle a,b \rangle$ is

D_{\mathbf{u}}f(x_{0},y_{0})=\lim_{ h \to 0 } \frac{f(x_{0}+ha,y_{0}+hb)-f(x_{0},y_{0})}{h}

(if the limit exists)

We can rewrite this in a more useful form:

Directional Derivative (Derivative Definition)

If $f$ is a differentiable function of $x,y$, then $f$ has a directional derivative in the direction of any unit vector $\mathbf{u}=\langle a,b \rangle$ and

D_{\mathbf{u}}f(x,y)=f_{x}(x,y)a+f_{y}(x,y)b

The above follows directly from the Chain Rule, and its proof is left as an exercise to the reader. (Hint: consider differentiating the function $g(h)=f(x_{0}+ha,y_{0}+hb)$ with respect to $h$ and evaluating at $h=0$).

Also, if $\mathbf{u}$ makes an angle $\theta$ with the positive $x$-axis, we can write $\mathbf{u}=\langle \cos\theta,\sin\theta \rangle$. Then, the formula becomes

D_{\mathbf{u}}f(x,y)=f_{x}(x,y)\cos\theta+f_{y}(x,y)\sin\theta

Gradient Vector

Note that the directional derivative of $f(x,y)$ can actually be written as a dot product:

\begin{align*} D_{\mathbf{u}}f(x,y) &= f_{x}(x,y)a + f_{y}(x,y)b \\ &= \langle f_{x}(x,y),f_{y}(x,y) \rangle \cdot \langle a,b \rangle \\ &= \langle f_{x}(x,y),f_{y}(x,y) \rangle \cdot \mathbf{u} \end{align*}

The first vector in the dot product appears frequently in many contexts, and so is specially denoted as follows:

Gradient

If $f$ is a function of two variables $x,y$ , then the gradient of $f$ is the vector function $\nabla f$ defined by

\nabla f(x,y)=\langle f_{x}(x,y),f_{y}(x,y) \rangle =\frac{ \partial f }{ \partial x } \mathbf{i}+\frac{ \partial f }{ \partial y } \mathbf{j}

We may then rewrite the directional derivative equation again.

Directional Derivative (Gradient Definition)

D_{\mathbf{u}}f(x,y)=\nabla f(x,y)\cdot \mathbf{u}

In other words, the directional derivative in the direction of $\mathbf{u}$ is the scalar projection of the gradient vector onto $\mathbf{u}$ . (Recall $\mathrm{comp}_{a}b=\frac{a\cdot b}{\lvert a \rvert}$ , and that $\mathbf{u}$ is a unit vector).
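The gradient form is easy to turn into a small computation. The sketch below assumes the example $f(x,y)=x^{2}y^{3}-4y$ at the point $(2,-1)$ in the direction of $\mathbf{v}=\langle 2,5 \rangle$, which is not a unit vector and so is normalized first. It also compares the result with the limit definition using a small $h$. The function names are illustrative.

```python
import math

def f(x, y):
    return x**2 * y**3 - 4 * y

def grad_f(x, y):
    # Hand-computed gradient <f_x, f_y>.
    return (2 * x * y**3, 3 * x**2 * y**2 - 4)

def directional_derivative(x, y, v):
    # Normalize v to get the unit vector u = <a, b>.
    norm = math.hypot(v[0], v[1])
    a, b = v[0] / norm, v[1] / norm
    gx, gy = grad_f(x, y)
    return gx * a + gy * b  # dot product of the gradient with u

d_grad = directional_derivative(2.0, -1.0, (2.0, 5.0))  # 32 / sqrt(29)

# Difference quotient from the limit definition, with a small h.
h = 1e-6
a, b = 2 / math.sqrt(29), 5 / math.sqrt(29)
d_limit = (f(2.0 + h * a, -1.0 + h * b) - f(2.0, -1.0)) / h
print(d_grad, d_limit)
```

Forgetting to normalize $\mathbf{v}$ is a common slip: the formula $\nabla f\cdot \mathbf{u}$ assumes $\mathbf{u}$ is a unit vector.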

Maximizing the Directional Derivative

We aim to find the maximal directional derivative of $f$ at a given point, i.e. the direction in which $f$ changes the fastest at a given point and the corresponding magnitude of rate of change.

Note that

\begin{align*} D_{\mathbf{u}}f &= \nabla f\cdot \mathbf{u} \\ &= \lvert \nabla f \rvert \lvert \mathbf{u} \rvert \cos\theta \\ &= \lvert \nabla f \rvert \cos\theta \end{align*}

where $\theta$ denotes the angle between $\nabla f$ and $\mathbf{u}$. The maximum value of $\cos\theta$ is $1$, and occurs when $\theta=0$. Therefore, the maximum value of $D_{\mathbf{u}}f$ is $\lvert \nabla f \rvert$, and it occurs when $\theta=0$, i.e. when $\mathbf{u}$ points in the same direction as $\nabla f$ (assuming $\nabla f\neq \mathbf{0}$). Thus,

Maximal Directional Derivative

The maximum value of the directional derivative $D_{\mathbf{u}}f$ is $\lvert \nabla f \rvert$, and it occurs when $\mathbf{u}$ has the same direction as $\nabla f$.
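One way to convince yourself of this result is a brute-force scan over directions. The sketch below assumes the example $f(x,y)=xe^{y}$ at $(2,0)$, where the hand-computed gradient is $\langle e^{y},xe^{y} \rangle=\langle 1,2 \rangle$, and tries $3600$ evenly spaced unit vectors $\mathbf{u}=\langle \cos\theta,\sin\theta \rangle$.

```python
import math

# Gradient of f(x, y) = x * e^y at (2, 0), computed by hand.
grad = (1.0, 2.0)

def d_u(theta):
    # Directional derivative in the direction <cos(theta), sin(theta)>.
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Scan directions in steps of 0.1 degrees.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_theta = max(angles, key=d_u)
best_rate = d_u(best_theta)

grad_norm = math.hypot(*grad)              # |grad f| = sqrt(5)
grad_angle = math.atan2(grad[1], grad[0])  # direction of grad f
print(best_rate, grad_norm)
print(best_theta, grad_angle)
```

Up to the resolution of the scan, the best rate matches $\lvert \nabla f \rvert=\sqrt{ 5 }$ and the best direction matches the direction of $\nabla f$.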

Tangent Planes to a Level Surface

Suppose $S$ is a surface with equation $F(x,y,z)=k$, i.e. it is a level surface of a three-variable function $F$. Let $P(x_{0},y_{0},z_{0})$ lie on $S$ and let $C$ be an arbitrary curve that lies on $S$ and passes through $P$. Let $C$ be given by the vector function $\mathbf{r}(t)=\langle x(t),y(t),z(t) \rangle$, and let $t_{0}$ be the parameter value such that $\mathbf{r}(t_{0})=\langle x_{0},y_{0},z_{0} \rangle$. Since $C$ lies on $S$, and assuming $F$ and $\mathbf{r}$ are differentiable,

\begin{align*} F(x(t),y(t),z(t)) &= k \\ \frac{\mathrm{d} }{\mathrm{d} t } F(x(t),y(t),z(t)) &= \frac{\mathrm{d} }{\mathrm{d} t } k \\ \frac{ \partial F }{ \partial x } \frac{\mathrm{d} x }{\mathrm{d} t } +\frac{ \partial F }{ \partial y } \frac{\mathrm{d} y }{\mathrm{d} t } + \frac{ \partial F }{ \partial z } \frac{\mathrm{d} z }{\mathrm{d} t } &= 0 \\ \langle F_{x},F_{y},F_{z} \rangle \cdot \langle x'(t),y'(t),z'(t) \rangle &= 0 \\ \nabla F\cdot \mathbf{r}'(t) &= 0 \end{align*}

Now, substituting $t=t_{0}$ ,

\nabla F(x_{0},y_{0},z_{0})\cdot \mathbf{r'}(t_{0})=0

In other words,

Theorem

The gradient vector at $P$ of $F(x,y,z)$ is perpendicular to the tangent vector $\mathbf{r'}(t_{0})$ to any curve $C$ on $S$ that passes through $P$ .

If $\nabla F(x_{0},y_{0},z_{0})\neq \mathbf{0}$ , we can then define the tangent plane to the level surface at $P(x_{0},y_{0},z_{0})$ as the plane that passes through $P$ and has normal vector $\nabla F(x_{0},y_{0},z_{0})$ . Thus,

Tangent Plane to a Level Surface

F_{x}(x_{0},y_{0},z_{0})(x-x_{0})+F_{y}(x_{0},y_{0},z_{0})(y-y_{0})+F_{z}(x_{0},y_{0},z_{0})(z-z_{0})=0

We can also define the normal line to $S$ at $P$ as the line that passes through $P$ and is perpendicular to the tangent plane. Its direction is therefore given by the gradient vector $\nabla F(x_{0},y_{0},z_{0})$, so (when all three components of the gradient are nonzero) its symmetric equations are

Normal Line

\frac{x-x_{0}}{F_{x}(x_{0},y_{0},z_{0})}=\frac{y-y_{0}}{F_{y}(x_{0},y_{0},z_{0})}=\frac{z-z_{0}}{F_{z}(x_{0},y_{0},z_{0})}
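As an illustration, the sketch below assumes the ellipsoid $\frac{x^{2}}{4}+y^{2}+\frac{z^{2}}{9}=3$ and the point $P(-2,1,-3)$ on it. It builds the tangent plane from the hand-computed gradient and, instead of the symmetric equations, uses the equivalent parametric form $P+t\,\nabla F$ for the normal line. The helper names are hypothetical.

```python
def grad_F(x, y, z):
    # F(x, y, z) = x^2/4 + y^2 + z^2/9, so grad F = <x/2, 2y, 2z/9>.
    return (x / 2, 2 * y, 2 * z / 9)

P = (-2.0, 1.0, -3.0)
n = grad_F(*P)  # normal vector <-1, 2, -2/3>

def tangent_plane_residual(x, y, z):
    # Zero exactly when (x, y, z) lies on the tangent plane at P.
    return n[0] * (x - P[0]) + n[1] * (y - P[1]) + n[2] * (z - P[2])

def normal_line(t):
    # Parametric form of the normal line through P with direction grad F.
    return tuple(p + t * c for p, c in zip(P, n))

print(n)
print(tangent_plane_residual(0.0, 3.0, 0.0))  # (0, 3, 0) satisfies 3x - 6y + 2z + 18 = 0
print(normal_line(1.0))
```

Multiplying the plane equation through by $-3$ gives the tidier form $3x-6y+2z+18=0$, which is the same plane.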

14.7 Maximum and Minimum Values

A two-variable function $f$ has a local maximum at $(a,b)$ if $f(x,y)\leq f(a,b)$ when $(x,y)$ is near $(a,b)$ . A local minimum is defined analogously.

If $f(x,y)\leq f(a,b)$ for all $(x,y)$ in the domain $D$ of $f$, then $f(a,b)$ is an absolute maximum. An absolute minimum is defined analogously.

There exist the following tests for finding the local extrema of $f(x,y)$ , which are analogous to their univariate counterparts.

First Derivative Test

If $f$ has a local extremum at $(a,b)$ and the first-order partial derivatives exist there, then $f_{x}(a,b)=f_{y}(a,b)=0$. A point $(a,b)$ where $f_{x}(a,b)=f_{y}(a,b)=0$, or where one of these partial derivatives does not exist, is called a critical point.

Second Derivative Test

Suppose the second partial derivatives of $f$ are continuous on a disk with center $(a,b)$ , and $(a,b)$ is a critical point of $f$ . Let

D(a,b)=f_{xx}(a,b)f_{yy}(a,b)-[f_{xy}(a,b)]^{2}=\begin{vmatrix*} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix*}

Then,
(1) $D>0$ and $f_{xx}(a,b)>0$ $\implies$ $f(a,b)$ is a local minimum.
(2) $D>0$ and $f_{xx}(a,b)<0$ $\implies$ $f(a,b)$ is a local maximum.
(3) $D<0$ $\implies$ $f(a,b)$ is not a local extremum. Instead, $(a,b)$ is a saddle point of $f$.
(4) $D=0$ $\implies$ the test is inconclusive: $f(a,b)$ could be a local maximum, a local minimum, or neither.
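The four cases translate directly into a small classifier. The sketch below assumes the example $f(x,y)=x^{4}+y^{4}-4xy+1$, whose critical points $(0,0)$, $(1,1)$ and $(-1,-1)$ come from solving $4x^{3}-4y=0$ and $4y^{3}-4x=0$ by hand. The second partial derivatives are also hand-computed, and the function name `classify` is illustrative.

```python
def classify(x, y):
    # Second partials of f(x, y) = x^4 + y^4 - 4xy + 1.
    f_xx = 12 * x**2
    f_yy = 12 * y**2
    f_xy = -4
    D = f_xx * f_yy - f_xy**2
    if D > 0:
        # D > 0 forces f_xx != 0, so only the sign of f_xx matters.
        return "local min" if f_xx > 0 else "local max"
    if D < 0:
        return "saddle"
    return "inconclusive"

critical_points = [(0, 0), (1, 1), (-1, -1)]
results = {p: classify(*p) for p in critical_points}
print(results)
```

Here the origin is a saddle point ($D=-16$), while $(1,1)$ and $(-1,-1)$ are local minima ($D=128$, $f_{xx}=12$).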

Meanwhile, we also consider the absolute extrema of $f(x,y)$ over a closed, bounded set, the two-dimensional analogue of a closed interval. In $\mathbb{R}^{2}$, a bounded set is one that is contained in some disk, and a closed set is one that contains all its boundary points. With these definitions, we can state the Extreme Value Theorem for two-variable functions.

Extreme Value Theorem

If $f$ is continuous on a closed, bounded set $D$ in $\mathbb{R}^{2}$ , then $f$ attains an absolute maximum value $f(x_{1},y_{1})$ and an absolute minimum value $f(x_{2},y_{2})$ at some points $(x_{1},y_{1})$ and $(x_{2},y_{2})$ in $D$ .

To find the absolute extrema, we perform the following process.

Finding Absolute Extrema

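(1) Find the values of $f$ at the critical points in $D$.
(2) Find the extreme values of $f$ on the boundary of $D$.
(3) The largest of the values from steps (1) and (2) is the absolute maximum value, and the smallest is the absolute minimum value.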

Notice the similarity to the univariate analogue! However, one major difference is that the boundary of $D$ contains infinitely many points, so we cannot simply evaluate $f$ at each of them. Instead, describe the boundary by equations in $x,y$ (for example, $x=0$ with $0\leq y\leq 2$ for one side of a rectangle), substitute to reduce $f(x,y)$ to a univariate function on each piece, and find its extreme values by the usual single-variable methods. It may be necessary to split the boundary into several pieces, each with its own constraint. See the textbook for some helpful examples!

14.8 Lagrange Multipliers

One Constraint

Lagrange's method is a way of maximizing/minimizing a general function $f(x,y,z)$ when a constraint of the form $g(x,y,z)=k$ is considered.

Consider the function $f(x,y,z)$ subject to the constraint $g(x,y,z)=k$ . In other words, $(x,y,z)$ is restricted to lie on the level surface $S$ with equation $g(x,y,z)=k$ .

Suppose $f$ has an extreme value at a point $P(x_{0},y_{0},z_{0})$ on the surface $S$ and let $C$ be a curve with vector equation $\mathbf{r}(t)=\langle x(t),y(t),z(t) \rangle$ that lies on $S$ and passes through $P$. If $t_{0}$ is the parameter value corresponding to the point $P$, then $\mathbf{r}(t_{0})=\langle x_{0},y_{0},z_{0} \rangle$. The composite function $h(t)=f(x(t),y(t),z(t))$ represents the values that $f$ takes on the curve $C$. Since $f$ has an extreme value at $(x_{0},y_{0},z_{0})$, which corresponds to $t_{0}$ for $\mathbf{r}(t)$, $h$ has an extreme value at $t_{0}$, so $h'(t_{0})=0$. Assuming $f$ and $\mathbf{r}$ are differentiable, we can apply the Chain Rule as follows:

\begin{align*} 0 &= h'(t_{0}) \\ &= f_{x}(x_{0},y_{0},z_{0})x'(t_{0})+f_{y}(x_{0},y_{0},z_{0})y'(t_{0})+f_{z}(x_{0},y_{0},z_{0})z'(t_{0}) \\ &= \nabla f(x_{0},y_{0},z_{0})\cdot \mathbf{r}'(t_{0}) \end{align*}

In other words, the gradient vector $\nabla f(x_{0},y_{0},z_{0})$ is orthogonal to the tangent vector $\mathbf{r}'(t_{0})$ to every such curve $C$. From section 14.6 (Tangent Planes to a Level Surface), we know that the gradient vector of $g$, $\nabla g(x_{0},y_{0},z_{0})$, is also orthogonal to $\mathbf{r}'(t_{0})$ for every such curve $C$, since $g(x,y,z)=k$ describes the level surface $S$. Both gradients are therefore normal to the tangent plane of $S$ at $P$, so $\nabla f(x_{0},y_{0},z_{0})\parallel \nabla g(x_{0},y_{0},z_{0})$. Further, if $\nabla g(x_{0},y_{0},z_{0}) \neq \mathbf{0}$, then $\exists \lambda$ such that

\nabla f(x_{0},y_{0},z_{0})=\lambda \; \nabla g(x_{0},y_{0},z_{0})

$\lambda$ is called the Lagrange multiplier.

Method of Lagrange Multipliers

To find the extreme values of $f(x,y,z)$ subject to the constraint $g(x,y,z)=k$ (assuming these extreme values exist and $\nabla g \neq \mathbf{0}$ on the surface $g(x,y,z)=k$):

(1) Find all values of $(x,y,z)$ and $\lambda$ such that $\nabla f(x,y,z)=\lambda \; \nabla g(x,y,z)$ and $g(x,y,z)=k$.
(2) Evaluate $f$ at all points $(x,y,z)$ from step (1). The largest of these values is the maximum value of $f$, and analogously for the minimum value of $f$.

For a 3-variable function, we can write the vector equation $\nabla f=\lambda \; \nabla g$ in terms of its components to transform the equation in step 1 into

f_{x}=\lambda g_{x} \qquad f_{y}=\lambda g_{y} \qquad f_{z}=\lambda g_{z} \qquad g(x,y,z)=k

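This is a system of four equations in the four unknowns $x,y,z,\lambda$, so we can usually expect to solve it (though the equations are generally nonlinear, and it is not always necessary to find $\lambda$ explicitly).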
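Here is a small sketch of the method on a two-variable problem, where the same idea applies with one fewer equation: find the extreme values of $f(x,y)=x^{2}+2y^{2}$ on the circle $x^{2}+y^{2}=1$. This example is an assumption made for illustration. The candidate points and multipliers below were found by hand from $2x=2\lambda x$, $4y=2\lambda y$ and $x^{2}+y^{2}=1$; the code only verifies them and then compares the values of $f$.

```python
def f(x, y):
    return x**2 + 2 * y**2

def g(x, y):
    return x**2 + y**2

# Hand-solved candidates (x, y, lambda): either x = 0 (so lambda = 2)
# or lambda = 1 (so y = 0).
candidates = [(1, 0, 1), (-1, 0, 1), (0, 1, 2), (0, -1, 2)]

for x, y, lam in candidates:
    # Step 1: each candidate satisfies grad f = lambda * grad g and g = 1.
    assert 2 * x == lam * 2 * x
    assert 4 * y == lam * 2 * y
    assert g(x, y) == 1

# Step 2: compare f at all candidates.
values = [f(x, y) for x, y, _ in candidates]
maximum, minimum = max(values), min(values)
print(maximum, minimum)
```

The maximum value $2$ occurs at $(0,\pm 1)$ and the minimum value $1$ at $(\pm 1,0)$. Note that $\lambda$ itself is only a means to an end.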

Two Constraints

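We can also find the extreme values of $f(x,y,z)$ subject to two constraints $g(x,y,z)=k$ and $h(x,y,z)=c$. If $f$ has such an extreme value at $(x_{0},y_{0},z_{0})$, and $\nabla g$ and $\nabla h$ are nonzero and not parallel there, then there exist two multipliers $\lambda$ and $\mu$ such that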

\nabla f(x_{0},y_{0},z_{0})=\lambda\;\nabla g(x_{0},y_{0},z_{0})+\mu\;\nabla h(x_{0},y_{0},z_{0})