First several pages of this chapter are just boring explanations of two-variable functions...
Level Curves
The level curves of a function f of two variables are the curves with equations f(x,y)=k, where k is a constant (in the range of f).
14.2 Limits and Continuity
Limits of Multivariate Functions
For a two-variable function,
$$\lim_{(x,y)\to(a,b)} f(x,y) = L$$
The definition extends similarly to higher dimensions.
Formally, (ϵ−δ),
Limit Definition
Let f be a function of two variables whose domain D includes points arbitrarily close to (a,b). The limit of f(x,y) as (x,y) approaches (a,b) is L if for every number ϵ>0 there is a corresponding number δ>0 such that if (x,y)∈D and 0<√((x−a)²+(y−b)²)<δ then ∣f(x,y)−L∣<ϵ.
In other words, the distance between f(x,y) and L can be made arbitrarily small by making the distance from (x,y) to (a,b) sufficiently small (but nonzero).
There is one complication with limits of multivariate functions that does not appear for univariate functions, though. Namely, (x,y) can approach (a,b) along infinitely many paths, not just from the left or from the right. Thus, if f(x,y) approaches different values along two different paths to (a,b), the limit of f(x,y) at (a,b) does not exist. Formally,
Limit Existence
If f(x,y)→L1 as (x,y)→(a,b) along a path C1 and f(x,y)→L2 as (x,y)→(a,b) along a path C2, where L1≠L2, then the limit of f(x,y) as (x,y)→(a,b) does not exist.
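The two-path test can be illustrated numerically. The sketch below is only an illustration: the function f(x,y)=(x²−y²)/(x²+y²) is an example chosen here (it does not come from these notes), and the helper name `values_along_path` is hypothetical. Approaching (0,0) along the x-axis and along the y-axis gives two different values, so the limit at the origin does not exist.

```python
def f(x, y):
    """Example function chosen for illustration; undefined at the origin."""
    return (x**2 - y**2) / (x**2 + y**2)

def values_along_path(path, steps=(1e-1, 1e-3, 1e-6)):
    """Evaluate f at path(t) for shrinking t, where path(t) -> (0, 0) as t -> 0."""
    return [f(*path(t)) for t in steps]

along_x_axis = values_along_path(lambda t: (t, 0.0))  # path C1: y = 0
along_y_axis = values_along_path(lambda t: (0.0, t))  # path C2: x = 0

print(along_x_axis)  # every entry is 1.0
print(along_y_axis)  # every entry is -1.0
```

Note that agreement along a few sampled paths would not prove that a limit exists; the test only works in the negative direction.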
Continuity
A two-variable function f is continuous at (a,b) if
$$\lim_{(x,y)\to(a,b)} f(x,y) = f(a,b)$$
Also, if f is continuous at every point (a,b) in its domain D, then f is continuous on D.
The limit laws give the following property of continuous functions:
If f and g are continuous at (a,b), then so is h=f∗g, where ∗ is one of +,−,×,÷ (for ÷, provided g(a,b)≠0).
As a result of this fact, all multivariate polynomials are continuous everywhere, and rational functions are continuous on their domains (i.e. wherever the denominator is nonzero).
Continuity for Proving Limits
If a function f is continuous at (a,b), then its limit at (a,b) exists and equals f(a,b). Since a composition of continuous functions is itself continuous, the same holds for such compositions. This is often helpful for showing that a limit exists at (a,b), and for evaluating it by direct substitution! (So you don't have to use an ϵ−δ proof).
14.3 Partial Derivatives
Partial Derivatives
A partial derivative of a multivariate function f is essentially just a univariate derivative that considers all other variables constant. Consider a two-variable function f(x,y). Then, the partial derivative with respect to x at (a,b) is denoted fx(a,b) and
$$f_x(a,b) = g'(a)$$
where g(x)=f(x,b).
Similarly, the partial derivative with respect to y at (a,b) is
$$f_y(a,b) = h'(b)$$
where h(y)=f(a,y).
So, TL;DR, when finding the partial derivative of a multivariate function f with respect to some variable, e.g. x, treat all other variables as constants when differentiating.
With regard to notation, the partial derivative of a multivariate function f with respect to x is also written
$$\frac{\partial f}{\partial x}$$
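The "hold the other variable constant" idea can be checked numerically. The following is a minimal sketch, not a method from these notes: the polynomial f is an example chosen here, and `partial_x` / `partial_y` are hypothetical helper names that estimate the partial derivatives with central differences.

```python
def f(x, y):
    """Example function; by hand, f_x = 3x^2 + 2xy^3 and f_y = 3x^2 y^2 - 4y."""
    return x**3 + x**2 * y**3 - 2 * y**2

def partial_x(func, a, b, h=1e-6):
    """Central-difference estimate of f_x(a, b): vary x, hold y = b fixed."""
    return (func(a + h, b) - func(a - h, b)) / (2 * h)

def partial_y(func, a, b, h=1e-6):
    """Central-difference estimate of f_y(a, b): vary y, hold x = a fixed."""
    return (func(a, b + h) - func(a, b - h)) / (2 * h)

fx = partial_x(f, 2.0, 1.0)
fy = partial_y(f, 2.0, 1.0)
print(fx, fy)  # close to the exact values f_x(2, 1) = 16 and f_y(2, 1) = 8
```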
Higher-Order Derivatives
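The following are all second partial derivatives of f:
$$f_{xx} = \frac{\partial^2 f}{\partial x^2}, \qquad f_{xy} = (f_x)_y = \frac{\partial^2 f}{\partial y\,\partial x}, \qquad f_{yx} = (f_y)_x = \frac{\partial^2 f}{\partial x\,\partial y}, \qquad f_{yy} = \frac{\partial^2 f}{\partial y^2}$$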
Third, fourth, etc. partial derivatives of f are defined similarly.
Most functions we will see will actually have fyx=fxy. The following theorem specifies when this occurs:
Clairaut's Theorem
Suppose f is defined on a disk D that contains the point (a,b). If the functions fxy and fyx are both continuous on D, then fxy(a,b)=fyx(a,b).
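As a quick check of the theorem, here is a worked example. The function below is chosen purely for illustration; it is a polynomial, so all of its partial derivatives are continuous everywhere and the two mixed partials should agree.

```latex
\begin{aligned}
f(x,y) &= x^3 + x^2 y^3 - 2y^2 \\
f_x &= 3x^2 + 2xy^3, & f_y &= 3x^2 y^2 - 4y \\
f_{xy} &= \frac{\partial}{\partial y}\left(3x^2 + 2xy^3\right) = 6xy^2, &
f_{yx} &= \frac{\partial}{\partial x}\left(3x^2 y^2 - 4y\right) = 6xy^2
\end{aligned}
```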
14.4 Tangent Planes and Linear Approximations
Tangent Planes
Suppose a surface S has equation z=f(x,y), where f has continuous first partial derivatives, and let P(x0,y0,z0) be a point on S. Let C1,C2 be the curves that result from intersecting the surface S with the planes y=y0 and x=x0. Note that P lies on both C1 and C2. Let T1 and T2 be the tangent lines to curves C1 and C2 at P. Then the tangent plane to the surface S at the point P is the plane that contains both tangent lines T1 and T2.
Note that if C is any other curve that lies on the surface S and passes through P, then its tangent line at P also lies in the tangent plane. In other words, the tangent plane at P consists of all possible tangent lines at P to curves on S through P, i.e. the tangent plane at P is the plane that best approximates the surface S near P. This will be covered in more detail in 14.6.
Tangent Plane Derivation
A plane passing through P(x0,y0,z0) has the form
A(x−x0)+B(y−y0)+C(z−z0)=0
Let a=−A/C and b=−B/C (assuming C≠0, i.e. the plane is not vertical). Then, dividing by C and rearranging,
z−z0=a(x−x0)+b(y−y0)
Consider the plane y=y0, which intersects with S to form C1. Substituting into this equation, we get
z−z0=a(x−x0)
Remember that T1 is the tangent line to the curve C1 and lies in the tangent plane, so the equation above (together with y=y0) describes T1. The slope of T1 is the partial derivative fx(x0,y0), i.e. ∂f/∂x evaluated at (x0,y0). Thus, a=fx(x0,y0). A similar argument for the plane x=x0 gives b=fy(x0,y0). Hence:
Tangent Plane Equation
Suppose f has continuous partial derivatives. The equation of the tangent plane to the surface z=f(x,y) at P(x0,y0,z0) is
z=z0+fx(x0,y0)(x−x0)+fy(x0,y0)(y−y0)
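A worked example may help here. The surface z=2x²+y² and the point (1,1,3) are chosen for illustration and are not taken from the notes above.

```latex
\begin{aligned}
f(x,y) &= 2x^2 + y^2, \qquad (x_0, y_0, z_0) = (1, 1, 3) \\
f_x(x,y) &= 4x \implies f_x(1,1) = 4 \\
f_y(x,y) &= 2y \implies f_y(1,1) = 2 \\
z &= 3 + 4(x - 1) + 2(y - 1) = 4x + 2y - 3
\end{aligned}
```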
Linear Approximations
Note that the tangent plane equation defines a linear function that approximates f near P. This function, L(x,y)=f(x0,y0)+fx(x0,y0)(x−x0)+fy(x0,y0)(y−y0), is called the linearization of f at P, and the approximation f≈L is the linear approximation (or tangent plane approximation) of f at P.
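To see how good the linear approximation is, the sketch below compares a function with its linearization near a point. Everything specific here is an assumption made for illustration: the function f(x,y)=x·e^(xy), the base point (1,0), and the helper names. At (1,0) the partial derivatives worked out by hand are fx=1 and fy=1, so the linearization is 1+(x−1)+y.

```python
import math

def f(x, y):
    """Example function chosen for illustration."""
    return x * math.exp(x * y)

def linearization(x, y):
    """L(x, y) = f(1, 0) + f_x(1, 0)(x - 1) + f_y(1, 0)(y - 0), with f_x = f_y = 1."""
    return 1.0 + 1.0 * (x - 1.0) + 1.0 * (y - 0.0)

def approx_error(d):
    """Absolute error |f - L| at the point (1 + d, -d), a distance of order d from (1, 0)."""
    x, y = 1.0 + d, -d
    return abs(f(x, y) - linearization(x, y))

for d in (0.1, 0.01, 0.001):
    print(d, approx_error(d))  # the error shrinks much faster than d itself
```

Shrinking the step by a factor of 10 shrinks the error by roughly a factor of 100 in this example, which is the sense in which the tangent plane is a good approximation near the point.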
Differentiability
Formally,
Differentiability
If z=f(x,y) and Δz=f(a+Δx,b+Δy)−f(a,b) is the increment of z, then f is differentiable at (a,b) if Δz can be expressed in the form
Δz=fx(a,b)Δx+fy(a,b)Δy+ϵ1Δx+ϵ2Δy
where ϵ1,ϵ2→0 as (Δx,Δy)→(0,0)
In practice, the following sufficient condition is usually easier to check:
Differentiability
If the partial derivatives fx and fy exist near (a,b) and are continuous at (a,b), then f is differentiable at (a,b).
We may also write the following
Differentiability (Limit Definition)
f is differentiable at (a,b) if
$$\lim_{(x,y)\to(a,b)} \frac{f(x,y) - h(x,y)}{\left|(x,y) - (a,b)\right|} = 0$$
where h(x,y)=f(a,b)+fx(a,b)(x−a)+fy(a,b)(y−b), i.e. h is the linearization of f at (a,b).
Differentials
For a univariate function y=f(x), we can write
$$f'(x) = \frac{dy}{dx} \implies dy = f'(x)\,dx$$
For a two-variable function z=f(x,y), we can instead write
$$dz = f_x(x,y)\,dx + f_y(x,y)\,dy = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy$$
For a general multivariate function f(x1,x2,…,xn),
$$df = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i$$
14.5 Chain Rule
Chain Rule
Recall the chain rule applied to univariate functions y=f(x) and x=g(t):
$$\frac{dy}{dt} = \frac{dy}{dx} \cdot \frac{dx}{dt}$$
For multivariate functions, there actually exist several versions of the chain rule. First, we consider the case where x and y are univariate functions.
The Chain Rule (1)
Suppose that z=f(x,y) is a differentiable function of x and y, where x=g(t) and y=h(t) are both differentiable functions in terms of t. Then z is a differentiable function of t and
$$\frac{dz}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}$$
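The first version of the chain rule can be sanity-checked numerically. This is a sketch under stated assumptions: the functions z=x²y+3xy⁴, x=sin(2t), y=cos(t) are an example chosen here, the partial derivatives are computed by hand, and the function names are hypothetical.

```python
import math

def z_of_t(t):
    """z as a function of t alone, obtained by substituting x(t) and y(t)."""
    x, y = math.sin(2 * t), math.cos(t)
    return x**2 * y + 3 * x * y**4

def dz_dt_chain_rule(t):
    """dz/dt = f_x * dx/dt + f_y * dy/dt, with all pieces differentiated by hand."""
    x, y = math.sin(2 * t), math.cos(t)
    fx = 2 * x * y + 3 * y**4       # partial of f with respect to x
    fy = x**2 + 12 * x * y**3       # partial of f with respect to y
    dx_dt = 2 * math.cos(2 * t)
    dy_dt = -math.sin(t)
    return fx * dx_dt + fy * dy_dt

def dz_dt_numeric(t, h=1e-6):
    """Central-difference estimate of dz/dt, used only as an independent check."""
    return (z_of_t(t + h) - z_of_t(t - h)) / (2 * h)

print(dz_dt_chain_rule(0.0))                     # 6.0
print(dz_dt_chain_rule(0.7), dz_dt_numeric(0.7)) # the two values agree closely
```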
Now, we consider the case where x and y are multivariate functions.
The Chain Rule (2)
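Suppose that z=f(x,y) is a differentiable function of x and y, where x=g(s,t) and y=h(s,t) are differentiable functions of s and t. Then
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}, \qquad \frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}$$
In other words, just apply the case 1 chain rule separately to s and t, holding the other variable fixed.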
The Chain Rule (General)
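Suppose u is a differentiable function of the n variables x1,x2,…,xn and each xj is a differentiable function of the m variables t1,t2,…,tm. Then u is a differentiable function of t1,t2,…,tm and, for each i=1,2,…,m,
$$\frac{\partial u}{\partial t_i} = \sum_{j=1}^{n} \frac{\partial u}{\partial x_j}\frac{\partial x_j}{\partial t_i}$$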
Implicit Differentiation
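Suppose that an equation of the form F(x,y)=0 defines y implicitly as a differentiable function of x, i.e. y=f(x) where F(x,f(x))=0 for all x in the domain of f. If F is differentiable, then we can apply Chain Rule to differentiate F(x,y)=0 with respect to x:
$$\frac{\partial F}{\partial x}\frac{dx}{dx} + \frac{\partial F}{\partial y}\frac{dy}{dx} = 0 \implies \frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y} = -\frac{F_x}{F_y}$$
provided ∂F/∂y≠0.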
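Now suppose that z is given implicitly as a function z=f(x,y) by an equation of the form F(x,y,z)=0, i.e. F(x,y,f(x,y))=0 for all (x,y) in the domain of f. If F and f are differentiable, then we can apply Chain Rule to differentiate the equation F(x,y,z)=0 with respect to x:
$$\frac{\partial F}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = 0$$
Since ∂x/∂x=1 and ∂y/∂x=0, this gives (provided ∂F/∂z≠0)
$$\frac{\partial z}{\partial x} = -\frac{\partial F/\partial x}{\partial F/\partial z}$$
Similarly, by differentiating with respect to y, we derive
$$\frac{\partial z}{\partial y} = -\frac{\partial F/\partial y}{\partial F/\partial z}$$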
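Here is a worked example of these formulas. The equation below is chosen for illustration only, and the results are valid wherever the denominator z²+2xy is nonzero.

```latex
\begin{aligned}
F(x,y,z) &= x^3 + y^3 + z^3 + 6xyz - 1 = 0 \\
\frac{\partial z}{\partial x} &= -\frac{F_x}{F_z} = -\frac{3x^2 + 6yz}{3z^2 + 6xy} = -\frac{x^2 + 2yz}{z^2 + 2xy} \\
\frac{\partial z}{\partial y} &= -\frac{F_y}{F_z} = -\frac{3y^2 + 6xz}{3z^2 + 6xy} = -\frac{y^2 + 2xz}{z^2 + 2xy}
\end{aligned}
```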
As a side note, the textbook mentions that the Implicit Function Theorem stipulates conditions under which these formulas are valid. Those conditions are not included in these notes because they seemed rather irrelevant to the necessary knowledge for Math 53.
14.6 Directional Derivatives and the Gradient Vector
Directional Derivative
The following diagram may be useful to reference as you read this section:
Consider z=f(x,y). We aim to find the rate of change of z at (x0,y0) in the direction of the arbitrary unit vector u=⟨a,b⟩.
Consider the surface S with equation z=f(x,y), and let z0=f(x0,y0). Then P(x0,y0,z0) lies on S. The vertical plane that passes through P in the direction of u intersects S in a curve C. The slope of the tangent line T to C at the point P is the rate of change of z in the direction of u.
Let Q(x,y,z) be another point on C. Let P′,Q′ be the projections of P,Q onto the xy-plane. Then, the vector P′Q′∥u, and thus
P′Q′=hu=⟨ha,hb⟩
for some scalar h. In other words, x−x0=ha and y−y0=hb, i.e. x=x0+ha and y=y0+hb. Therefore,
$$\frac{\Delta z}{h} = \frac{z - z_0}{h} = \frac{f(x_0 + ha,\, y_0 + hb) - f(x_0, y_0)}{h}$$
Taking the limit as h→0, we get
Directional Derivative (Limit Definition)
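The directional derivative of f at (x0,y0) in the direction of the unit vector u=⟨a,b⟩ is
$$D_{\mathbf{u}} f(x_0, y_0) = \lim_{h \to 0} \frac{f(x_0 + ha,\, y_0 + hb) - f(x_0, y_0)}{h}$$
if this limit exists.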
If f is a differentiable function of x,y, then f has a directional derivative in the direction of any unit vector u=⟨a,b⟩ and
Duf(x,y)=fx(x,y)a+fy(x,y)b
The above follows directly from Chain Rule, and its proof is left as an exercise to the reader. (Hint: consider differentiating the function g(h)=f(x0+ha,y0+hb) with respect to h and evaluating at h=0.)
Also, if u makes an angle θ with the positive x-axis, we can write u=⟨cosθ,sinθ⟩. Then, the formula becomes
Duf(x,y)=fx(x,y)cosθ+fy(x,y)sinθ
Gradient Vector
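Note that the directional derivative of f(x,y) can actually be written as a dot product:
$$D_{\mathbf{u}} f(x,y) = f_x(x,y)\,a + f_y(x,y)\,b = \langle f_x(x,y),\, f_y(x,y) \rangle \cdot \langle a, b \rangle = \langle f_x(x,y),\, f_y(x,y) \rangle \cdot \mathbf{u}$$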
The first vector in the dot product appears frequently in many contexts, and so is specially denoted as follows:
Gradient
If f is a function of two variables x,y, then the gradient of f is the vector function ∇f defined by
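$$\nabla f(x,y) = \langle f_x(x,y),\, f_y(x,y) \rangle = \frac{\partial f}{\partial x}\,\mathbf{i} + \frac{\partial f}{\partial y}\,\mathbf{j}$$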
We may then rewrite the directional derivative equation again.
Directional Derivative (Gradient Definition)
Duf(x,y)=∇f(x,y)⋅u
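The gradient formula can be compared against the limit definition numerically. This is a minimal sketch under assumptions: the function f(x,y)=x²y³−4y, the point (2,−1) and the direction ⟨2,5⟩ are examples chosen here, the gradient is computed by hand, and all helper names are hypothetical. Note that the direction vector has to be normalized to a unit vector first.

```python
import math

def f(x, y):
    """Example function chosen for illustration."""
    return x**2 * y**3 - 4 * y

def grad_f(x, y):
    """Gradient computed by hand: <2xy^3, 3x^2 y^2 - 4>."""
    return (2 * x * y**3, 3 * x**2 * y**2 - 4)

def unit(v):
    """Scale a nonzero 2D vector to length 1."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def directional_derivative(x, y, v):
    """D_u f = grad f . u, where u is the unit vector in the direction of v."""
    a, b = unit(v)
    fx, fy = grad_f(x, y)
    return fx * a + fy * b

def directional_derivative_limit(x, y, v, h=1e-6):
    """Symmetric difference quotient along u, approximating the limit definition."""
    a, b = unit(v)
    return (f(x + h * a, y + h * b) - f(x - h * a, y - h * b)) / (2 * h)

print(directional_derivative(2.0, -1.0, (2.0, 5.0)))        # equals 32 / sqrt(29)
print(directional_derivative_limit(2.0, -1.0, (2.0, 5.0)))  # agrees to several digits
```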
In other words, the directional derivative in the direction of u is the scalar projection of the gradient vector onto u. (Recall that comp_a b = (a⋅b)/∣a∣, and that u is a unit vector, so ∣u∣=1.)
Maximizing the Directional Derivative
We aim to find the maximal directional derivative of f at a given point, i.e. the direction in which f changes the fastest at a given point and the corresponding magnitude of rate of change.
Note that
Duf=∇f⋅u=∣∇f∣∣u∣cosθ=∣∇f∣cosθ
where θ denotes the angle between ∇f and u. The maximum value of cosθ is 1, and occurs when θ=0. Therefore, the maximum value of Duf is ∣∇f∣, and it occurs when θ=0⟹u∥∇f, i.e. the two vectors have the same direction. Thus,
Maximal Directional Derivative
The maximum value of the directional derivative Duf is ∣∇f∣ and occurs when u has the same direction as ∇f.
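A brute-force check of this statement is sketched below. The setup is an assumption made for illustration: for f(x,y)=x·e^y the gradient is ⟨e^y, x·e^y⟩, so at the point (2,0) it equals ⟨1,2⟩. The code scans unit vectors u=⟨cosθ,sinθ⟩ on a grid of angles and compares the best one found with ∣∇f∣ and with the direction of ∇f.

```python
import math

grad = (1.0, 2.0)  # gradient of f(x, y) = x * e^y at (2, 0), computed by hand

def d_u(theta):
    """Directional derivative in the direction u = <cos(theta), sin(theta)>."""
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)

# Scan directions in steps of 0.1 degree and keep the one with the largest D_u f.
thetas = [2 * math.pi * k / 3600 for k in range(3600)]
best_theta = max(thetas, key=d_u)
best_value = d_u(best_theta)

grad_norm = math.hypot(*grad)             # |grad f| = sqrt(5)
grad_angle = math.atan2(grad[1], grad[0]) # angle of grad f with the positive x-axis

print(best_value, grad_norm)   # nearly equal
print(best_theta, grad_angle)  # equal up to the grid spacing
```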
Tangent Planes to a Level Surface
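Suppose S is a surface with equation F(x,y,z)=k, i.e. it is a level surface of a three-variable function F. Let P(x0,y0,z0) lie on S and let C be an arbitrary curve that lies on S and passes through P. Let C be given by the vector function r(t)=⟨x(t),y(t),z(t)⟩, and let t0∈R be such that P=r(t0). Since C lies on S, every point of C satisfies F(x(t),y(t),z(t))=k. If x, y, z are differentiable functions of t and F is differentiable, differentiating both sides with respect to t using Chain Rule gives
$$\frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt} + \frac{\partial F}{\partial z}\frac{dz}{dt} = 0, \quad \text{i.e.} \quad \nabla F \cdot \mathbf{r}'(t) = 0$$
In particular, at t=t0 we get ∇F(x0,y0,z0)⋅r′(t0)=0. In other words: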
The gradient vector at P of F(x,y,z) is perpendicular to the tangent vector r′(t0) to any curve C on S that passes through P.
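If ∇F(x0,y0,z0)≠0, we can then define the tangent plane to the level surface at P(x0,y0,z0) as the plane that passes through P and has normal vector ∇F(x0,y0,z0). Thus, the tangent plane has equation
$$F_x(x_0,y_0,z_0)(x - x_0) + F_y(x_0,y_0,z_0)(y - y_0) + F_z(x_0,y_0,z_0)(z - z_0) = 0$$
We can also define a line called the normal line to S at P such that it passes through P and is perpendicular to the tangent plane. The direction of this normal line is given by the gradient vector, so its symmetric equations are
$$\frac{x - x_0}{F_x(x_0,y_0,z_0)} = \frac{y - y_0}{F_y(x_0,y_0,z_0)} = \frac{z - z_0}{F_z(x_0,y_0,z_0)}$$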
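As a worked example of using the gradient as a normal vector, consider the ellipsoid below and the point P=(−2,1,−3) on it. Both are chosen for illustration and are not taken from the notes above.

```latex
\begin{aligned}
F(x,y,z) &= \frac{x^2}{4} + y^2 + \frac{z^2}{9} = 3, \qquad P = (-2, 1, -3) \\
\nabla F &= \left\langle \frac{x}{2},\ 2y,\ \frac{2z}{9} \right\rangle
  \implies \nabla F(-2,1,-3) = \left\langle -1,\ 2,\ -\tfrac{2}{3} \right\rangle \\
\text{tangent plane:}\quad & -1(x+2) + 2(y-1) - \tfrac{2}{3}(z+3) = 0
  \iff 3x - 6y + 2z + 18 = 0 \\
\text{normal line:}\quad & \frac{x+2}{-1} = \frac{y-1}{2} = \frac{z+3}{-2/3}
\end{aligned}
```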
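Second Derivatives Test
Suppose the second partial derivatives of f are continuous on a disk with center (a,b), and suppose fx(a,b)=0 and fy(a,b)=0, i.e. (a,b) is a critical point of f. Let D=D(a,b)=fxx(a,b)fyy(a,b)−[fxy(a,b)]². Then,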
(1) D>0 and fxx(a,b)>0⟹f(a,b) is a local minimum.
(2) D>0 and fxx(a,b)<0⟹f(a,b) is a local maximum.
(3) D<0⟹f(a,b) is neither a local maximum nor a local minimum. Instead, (a,b) is a saddle point of f.
(4) D=0⟹ the test is inconclusive: f could have a local maximum, a local minimum, or a saddle point at (a,b).
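The four cases can be written as a small classifier. This is a sketch under stated assumptions: D is taken to be the usual discriminant fxx·fyy−(fxy)² evaluated at a critical point, the function `classify_critical_point` is a hypothetical helper, and the example f(x,y)=x⁴+y⁴−4xy+1 with critical points (0,0), (1,1), (−1,−1) is chosen here for illustration.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the four cases of the test to second partials taken at a critical point."""
    D = fxx * fyy - fxy**2  # assumed definition of D (the standard discriminant)
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test gives no information

def second_partials(x, y):
    """For f(x, y) = x^4 + y^4 - 4xy + 1: f_xx = 12x^2, f_yy = 12y^2, f_xy = -4."""
    return 12 * x**2, 12 * y**2, -4

for point in [(0, 0), (1, 1), (-1, -1)]:
    print(point, classify_critical_point(*second_partials(*point)))
# (0, 0) is a saddle point; (1, 1) and (-1, -1) are local minima
```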
Meanwhile, we also consider the absolute extrema of f(x,y) over some closed, bounded set. In R2, a bounded set is one that is contained in some disk, and a closed set is one that contains all its boundary points. With these definitions, we can state the Extreme Value Theorem for two-variable functions.
Extreme Value Theorem
If f is continuous on a closed, bounded set D in R2, then f attains an absolute maximum value f(x1,y1) and an absolute minimum value f(x2,y2) at some points (x1,y1) and (x2,y2) in D.
To find the absolute extrema, we perform the following process.
Finding Absolute Extrema
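(1) Find the values of f at the critical points in D.
(2) Find the extreme values of f on the boundary of D.
(3) The largest of the values from steps 1 and 2 is the absolute maximum value; the smallest of these values is the absolute minimum value.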
Notice the similarity to the univariate analogue! However, one major difference is that, when determining the extreme values on the boundary of D, it is not possible to evaluate f at every point on the boundary of D (since there are infinitely many!). Instead, describe the boundary by constraints on x,y (for example, fixing x or y along an edge, or parametrizing a curve), which turns f into a univariate function on each piece, and find the extreme values of those univariate functions, typically by taking their derivatives and checking endpoints. It may be necessary to split the boundary up into several pieces. See the textbook for some helpful examples!
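The procedure can be sketched in code for one concrete case. The example is an assumption chosen for illustration: f(x,y)=x²−2xy+2y on the rectangle 0≤x≤3, 0≤y≤2. The candidate points are found by hand (as described in the comments) and the code only does the final comparison.

```python
def f(x, y):
    """Example function on the rectangle D = [0, 3] x [0, 2]."""
    return x**2 - 2 * x * y + 2 * y

# Critical points inside D: solve f_x = 2x - 2y = 0 and f_y = -2x + 2 = 0.
critical_points = [(1, 1)]

# Boundary: restrict f to each edge to get a univariate function, then take
# the edge endpoints (the corners) plus any interior zero of its derivative.
#   y = 0: f = x^2          -> derivative zero only at the corner x = 0
#   x = 3: f = 9 - 4y       -> no zero of the derivative
#   y = 2: f = (x - 2)^2    -> derivative zero at x = 2
#   x = 0: f = 2y           -> no zero of the derivative
boundary_candidates = [(0, 0), (3, 0), (3, 2), (0, 2), (2, 2)]

# Compare f over all candidates.
values = {p: f(*p) for p in critical_points + boundary_candidates}
abs_max = max(values.values())
abs_min = min(values.values())

print(values)
print("absolute max:", abs_max, "absolute min:", abs_min)  # 9 and 0
```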
14.8 Lagrange Multipliers
One Constraint
Lagrange's method is a way of maximizing/minimizing a general function f(x,y,z) when a constraint of the form g(x,y,z)=k is considered.
Consider the function f(x,y,z) subject to the constraint g(x,y,z)=k. In other words, (x,y,z) is restricted to lie on the level surface S with equation g(x,y,z)=k.
Suppose f has an extreme value at a point P(x0,y0,z0) on the surface S and let C be a curve with vector equation r(t)=⟨x(t),y(t),z(t)⟩ that lies on S and passes through P. If t0 is the parameter value corresponding to the point P, then r(t0)=⟨x0,y0,z0⟩. The composite function h(t)=f(x(t),y(t),z(t)) represents the values that f takes on the curve C. Since f has an extreme value at (x0,y0,z0), which corresponds to t0 for r(t), h has an extreme value at t0, so h′(t0)=0. If f is differentiable, we can apply Chain Rule as follows:
$$0 = h'(t_0) = f_x(x_0,y_0,z_0)\,x'(t_0) + f_y(x_0,y_0,z_0)\,y'(t_0) + f_z(x_0,y_0,z_0)\,z'(t_0) = \nabla f(x_0,y_0,z_0) \cdot \mathbf{r}'(t_0)$$
In other words, the gradient vector ∇f(x0,y0,z0) is orthogonal to the tangent vector r′(t0) to every such curve C. From section 14.6 (Tangent Planes to a Level Surface), we know that the gradient vector of g, ∇g(x0,y0,z0), is also orthogonal to r′(t0) for every such curve C, since g(x,y,z)=k describes the level surface S. Both gradients are therefore normal to S at P, so ∇f(x0,y0,z0)∥∇g(x0,y0,z0). Hence, if ∇g(x0,y0,z0)≠0, then ∃λ such that
∇f(x0,y0,z0)=λ∇g(x0,y0,z0)
λ is called the Lagrange multiplier.
Method of Lagrange Multipliers
To find the extreme values of f(x,y,z) subject to the constraint g(x,y,z)=k (assuming these extreme values exist and ∇g≠0 on the surface g(x,y,z)=k):
(1) Find all values of (x,y,z) and λ such that ∇f(x,y,z)=λ∇g(x,y,z) and g(x,y,z)=k.
(2) Evaluate f at all points (x,y,z) from step 1. The largest of these values is the maximum value of f, and the smallest is the minimum value of f.
For a 3-variable function, we can write the vector equation ∇f=λ∇g in terms of its components to transform the equation in step 1 into
$$f_x = \lambda g_x, \qquad f_y = \lambda g_y, \qquad f_z = \lambda g_z, \qquad g(x,y,z) = k$$
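This is a system of four equations in the four unknowns x,y,z,λ, so we can usually solve it! (The equations are generally nonlinear, though, so some ingenuity may be needed, and λ itself does not always have to be found explicitly.)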
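A small worked example of the method, using two variables to keep the algebra short. The function and constraint are chosen for illustration only; the constraint curve is a circle, which is closed and bounded, so the extreme values exist, and ∇g=⟨2x,2y⟩ is nonzero everywhere on it.

```latex
\begin{aligned}
&f(x,y) = x^2 + 2y^2, \qquad g(x,y) = x^2 + y^2 = 1 \\
&f_x = \lambda g_x:\ 2x = 2\lambda x, \qquad f_y = \lambda g_y:\ 4y = 2\lambda y, \qquad x^2 + y^2 = 1 \\
&2x = 2\lambda x \implies x = 0 \ \text{or}\ \lambda = 1 \\
&x = 0 \implies y = \pm 1 \ (\lambda = 2); \qquad \lambda = 1 \implies y = 0 \implies x = \pm 1 \\
&f(0, \pm 1) = 2 \ \text{(maximum)}, \qquad f(\pm 1, 0) = 1 \ \text{(minimum)}
\end{aligned}
```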
Two Constraints
We can also find the extreme values of f(x,y,z) subject to two constraints g(x,y,z)=k and h(x,y,z)=c. Then, we can write