flowchart LR
A("2x = λ 2x <br> 4y = λ 2y <br> x² + y² = 1")
A --> B("2x (1-λ) = 0 <br> 2y (2-λ) = 0 <br> x² + y² = 1")
B -->|"x=0"| C("2y (2-λ) = 0 <br> y² = 1")
B -->|"λ=1"| D("2y = 0 <br> x² + y² = 1")
C -->|"y=0"| E("0 = 1"):::contradiction
C -->|"λ=2"| F("y² = 1")
F -->|"y=±1"| G("(x,y,λ) = (0,1,2), (0,-1,2)"):::solution
D -->|"y=0"| H("x² = 1")
H -->|"x=±1"| I("(x,y,λ) = (1,0,1), (-1,0,1)"):::solution
19 Lagrange multipliers I
Introduction
The second derivative test allows us to identify local maxima and minima of a function \(f(x,y)\) (this is often called optimization). Today's main goal is to solve a different but related problem:
We want to optimize a function of two variables, \(f(x,y)\) (or even a function of more variables, such as \(f(x,y,z)\) or \(f(x_1,\dots,x_n)\)), subject to a constraint \(g(x,y)=c\). We call \(f\) the objective function and \(g\) the constraint function.
In other words, we want to figure out how to make \(f(x,y)\) as large as possible (or as small as possible), but we are only allowed to plug in values where \(g(x,y)=c\).
We want to find the maximum and minimum values of \(f(x,y)\) subject to the constraint \(g(x,y) = c\):
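Show[
  (* Filled contours of the objective f(x,y) = sin(pi x) cos(pi y) *)
  ContourPlot[
    Sin[Pi x] Cos[Pi y],
    {x, -2, 2}, {y, -2, 2},
    ColorFunction -> "Pastel"
  ],
  (* Constraint curve 4x^2 + 4y^2 - 4x = 3, drawn in black *)
  ContourPlot[
    4 x^2 + 4 y^2 - 4 x == 3,
    {x, -2, 2}, {y, -2, 2},
    ContourStyle -> Black
  ]
]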
We want to understand the extrema of the function whose contours are plotted above, but only along the constraint curve \(g(x,y) = 4x^2 + 4y^2 - 4x = 3\) (in black). Switching to the \(z = f(x,y)\) surface plot perspective, we have:
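ContourPlot3D[
  {
    z == Sin[Pi x] Cos[Pi y],   (* the surface z = f(x,y) *)
    4 x^2 + 4 y^2 - 4 x == 3    (* the constraint, a vertical cylinder *)
  },
  {x, -2, 2}, {y, -2, 2}, {z, -2, 2},
  (* Semi-transparent surface; the cylinder itself is invisible *)
  ContourStyle -> {
    Opacity[0.6],
    Opacity[0]
  },
  (* Draw only the curve where the two surfaces intersect *)
  BoundaryStyle -> {
    {1, 2} -> Thick,
    2 -> None
  },
  Mesh -> None
]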
Using these plots, how can we visually identify the desired extrema?
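One way to complement the visual inspection is to walk around the constraint curve and plot the height of \(f\) along it. The sketch below assumes the constraint from the plots, \(4x^2 + 4y^2 - 4x = 3\), rewritten as the circle \((x - \tfrac{1}{2})^2 + y^2 = 1\); the names `f`, `curve`, and the angle parameter `t` are illustrative and not part of the original notes.

```wolfram
(* Objective from the plots above *)
f[x_, y_] := Sin[Pi x] Cos[Pi y]

(* The constraint 4x^2 + 4y^2 - 4x == 3 is the circle (x - 1/2)^2 + y^2 == 1.
   curve[t] is a hypothetical helper that parametrizes it by the angle t. *)
curve[t_] := {1/2 + Cos[t], Sin[t]}

(* Height of the surface above each point of the constraint curve *)
Plot[f @@ curve[t], {t, 0, 2 Pi}, AxesLabel -> {"t", "f"}]
```

The peaks and valleys of this one-variable plot are the constrained maxima and minima; they can be matched against the intersection curve in the surface plot.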
Main result
The function \(f(x,y)\) constrained to \(g(x,y) = c\) can have an extreme value at a point \((x_0,y_0)\) on the constraint curve only if:
- \(f(x,y)\) has an extreme value in its own right, or
- The level curves of \(f\) and \(g\) are tangent at \((x_0,y_0)\).
Remember that level curves of a function are orthogonal to gradient vectors! In particular, the above conditions mean that
- \(\nabla f(x_0,y_0) = 0\), or
- \(\nabla f(x_0,y_0)\) and \(\nabla g(x_0,y_0)\) are parallel.
We can wrap these into a single equality (the Lagrange equations):
\[\nabla f = \lambda \nabla g \text{ for some } \lambda \in \mathbb{R}.\]
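Values \((x_0,y_0,\lambda_0)\) satisfying this equation together with the constraint \(g(x_0,y_0) = c\) are called stationary points of the Lagrangian \(\mathcal{L}(x,y,\lambda) = f(x,y) - \lambda\,(g(x,y) - c)\).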
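Written out in components and paired with the constraint, the Lagrange equations become a system of three equations in the three unknowns \(x\), \(y\), and \(\lambda\). This is the form that gets solved in practice:
\[
\frac{\partial f}{\partial x}(x,y) = \lambda\, \frac{\partial g}{\partial x}(x,y), \qquad
\frac{\partial f}{\partial y}(x,y) = \lambda\, \frac{\partial g}{\partial y}(x,y), \qquad
g(x,y) = c.
\]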
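For example, take the objective \(f(x,y) = x^2 + 2y^2\) with the constraint \(x^2 + y^2 = 1\). This is the problem behind the system \(2x = \lambda 2x\), \(4y = \lambda 2y\), \(x^2 + y^2 = 1\) in the flowchart that opens this section. The stationary points \((x,y,\lambda)\) of the Lagrangian are \((-1, 0, 1), (1, 0, 1), (0,-1, 2), (0, 1, 2)\). The first two give minima of the system (\(f=1\)) and the last two give maxima (\(f=2\)). That flowchart shows one way to approach the algebra.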
Notice how we rely on these forks in logic to pin down solutions, and some of them lead to contradictions!
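The case analysis can be cross-checked by handing the whole system to `Solve`. This is a minimal sketch, assuming the objective \(f(x,y) = x^2 + 2y^2\) and the constraint \(x^2 + y^2 = 1\), both inferred from the system \(2x = \lambda 2x\), \(4y = \lambda 2y\), \(x^2 + y^2 = 1\); the names `f`, `g`, `eqs`, and `sols` are illustrative.

```wolfram
(* Assumed objective and constraint function, inferred from the system *)
f[x_, y_] := x^2 + 2 y^2
g[x_, y_] := x^2 + y^2

(* Lagrange equations together with the constraint g == 1 *)
eqs = {
  D[f[x, y], x] == \[Lambda] D[g[x, y], x],
  D[f[x, y], y] == \[Lambda] D[g[x, y], y],
  g[x, y] == 1
};

(* All real stationary points (x, y, lambda) *)
sols = Solve[eqs, {x, y, \[Lambda]}, Reals];

(* Each stationary point followed by the value of f there *)
{x, y, \[Lambda], f[x, y]} /. sols
```

Comparing the last entry of each row identifies which stationary points give the constrained minima and which give the maxima.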
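For a second example, take \(f(x,y) = x^4 + y^4 - 4xy\) with the constraint \(x^2 + y^2 = 2\). The stationary points \((x,y,\lambda)\) of the Lagrangian are \((-1,-1, 0), (1, 1, 0), (-1, 1, 4), (1, -1, 4)\). The first two give minima of the system (\(f=-2\)) and the last two give maxima (\(f=6\)).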
In general, a system like this can be quite difficult to solve; the algebra happens to work out nicely in this case. A good trick is to multiply the two equations from \(\nabla f = \lambda \nabla g\) by \(y\) and \(x\), respectively, so that both right-hand sides become \(2 \lambda x y\) and the left-hand sides can be equated:
\[4x^3y - 4y^2 = 2\lambda xy = 4xy^3 - 4x^2 \stackrel{\text{rearrange}}{\implies} x^3 y - x y^3 - y^2 + x^2 = 0 \stackrel{\text{factor}}{\implies} (x-y)(x+y)(xy+1) = 0. \]
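The factoring step is easier to follow when the terms are grouped in pairs. As a worked sketch:
\[
x^3 y - x y^3 + x^2 - y^2 = xy\,(x^2 - y^2) + (x^2 - y^2) = (x^2 - y^2)(xy + 1) = (x-y)(x+y)(xy+1).
\]
Every solution therefore lies on one of three branches: \(y = x\), \(y = -x\), or \(xy = -1\) (that is, \(y = -1/x\)).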
Solving the system looks like this:
flowchart LR
A("4x³-4y=λ2x <br> 4y³-4x=λ2y <br> x²+y²=2")
A --> B("4x³-4y-2λx=0 <br> (x-y)(x+y)(xy+1)=0 <br> x²+y²=2")
B -->|"y=x"| C("2x(2x²-2-λ)=0 <br> 2x²=2")
B -->|"y=-x"| D("2x(2x²+2-λ)=0 <br> 2x²=2")
B -->|"y=-1/x"| E("2(2x⁴+2-λx²)/x=0 <br> x²+1/x²=2")
C -->|"x²=1"| F("(x,y,λ) = (-1,-1,0), (1,1,0)"):::solution
D -->|"x²=1"| G("(x,y,λ) = (-1,1,4), (1,-1,4)"):::solution
E -->|"x²=1"| G
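As a cross-check on the branches, the full system can be solved directly. This is a minimal sketch, assuming the objective \(f(x,y) = x^4 + y^4 - 4xy\) (inferred from the system \(4x^3 - 4y = \lambda 2x\), \(4y^3 - 4x = \lambda 2y\)) and the constraint \(x^2 + y^2 = 2\); the names `f`, `g`, `eqs`, and `sols` are illustrative.

```wolfram
(* Assumed objective and constraint function, inferred from the system *)
f[x_, y_] := x^4 + y^4 - 4 x y
g[x_, y_] := x^2 + y^2

(* Lagrange equations together with the constraint g == 2 *)
eqs = {
  D[f[x, y], x] == \[Lambda] D[g[x, y], x],
  D[f[x, y], y] == \[Lambda] D[g[x, y], y],
  g[x, y] == 2
};

(* All real stationary points (x, y, lambda) *)
sols = Solve[eqs, {x, y, \[Lambda]}, Reals];

(* Each stationary point followed by the value of f there *)
{x, y, \[Lambda], f[x, y]} /. sols
```

Each row of the output can be matched to a branch of the case analysis by checking whether it satisfies \(y = x\), \(y = -x\), or \(xy = -1\).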
What is the significance of the stationary points where \(\lambda = 0\)?