19  Lagrange multipliers I

Introduction

The second derivative test lets us classify the critical points of a function \(f(x,y)\) as local maxima, local minima, or saddle points (finding such extrema is often called optimization). Today's main goal is to solve a different but related problem:

Goal

We want to optimize a function of two variables \(f(x,y)\) (or even a function of more variables, like \(f(x,y,z)\) or \(f(x_1,\dots,x_n)\)), subject to a constraint \(g(x,y)=c\). We call \(f\) the objective function and \(g\) the constraint function.

In other words, we want to figure out how to make \(f(x,y)\) as large as possible (or as small as possible), but we are only allowed to plug in values where \(g(x,y)=c\).

Example: extrema of \(f(x,y) = \sin(\pi x) \cos(\pi y)\), subject to \(g(x,y) = 4x^2 + 4y^2 - 4x\) and \(c = 3\).

We want to find the maximum and minimum values of \(f(x,y)\) subject to the constraint \(g(x,y) = c\):

Show[
  ContourPlot[
    Sin[Pi x] Cos[Pi y],
    {x, -2, 2}, {y, -2, 2},
    ColorFunction -> "Pastel"
  ],
  ContourPlot[
    4 x^2 + 4 y^2 - 4 x == 3,
    {x, -2, 2}, {y, -2, 2},
    ContourStyle -> Black
  ]
]

We want to understand the extrema of the function whose contours are plotted above, but only along the curve \(g(x,y) = 3\) (in black). Switching to the \(z = f(x,y)\) surface plot perspective, we have:

ContourPlot3D[
  {
    z == Sin[Pi x] Cos[Pi y],
    4 x^2 + 4 y^2 - 4 x == 3
  },
  {x, -2, 2}, {y, -2, 2}, {z, -2, 2},
  ContourStyle -> {
    Opacity[0.6],
    Opacity[0]
  },
  BoundaryStyle -> {
    {1, 2} -> Thick,
    2 -> None
  },
  Mesh -> None
]

Using these plots, how can we visually identify the desired extrema?

Main result

Idea of Lagrange Multipliers

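The function \(f(x,y)\) constrained to \(g(x,y) = c\) can have an extreme value at a point \((x_0,y_0)\) on the constraint curve only if one of the following holds (assuming \(\nabla g(x_0,y_0) \neq 0\)):

  • \(f(x,y)\) has a critical point at \((x_0,y_0)\) in its own right, or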
  • The level curves of \(f\) and \(g\) are tangent at \((x_0,y_0)\).

Remember that level curves of a function are orthogonal to gradient vectors! In particular, the above conditions mean that

  • \(\nabla f(x_0,y_0) = 0\), or
  • \(\nabla f(x_0,y_0)\) and \(\nabla g(x_0,y_0)\) are parallel.
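
One way to see why a constrained extremum forces this, sketched under the assumption (not stated above) that the constraint curve can be parametrized near \((x_0,y_0)\) by a smooth path \(\mathbf{r}(t)\) with \(\mathbf{r}(t_0) = (x_0,y_0)\) and \(\mathbf{r}'(t_0) \neq 0\):

\[
\begin{aligned}
g(\mathbf{r}(t)) = c \text{ for all } t &\implies \nabla g(x_0,y_0) \cdot \mathbf{r}'(t_0) = 0, \\
f(\mathbf{r}(t)) \text{ has an extremum at } t_0 &\implies \nabla f(x_0,y_0) \cdot \mathbf{r}'(t_0) = 0.
\end{aligned}
\]

Both gradients are orthogonal to the same nonzero vector in the plane, so they lie along a common line: either \(\nabla f(x_0,y_0) = 0\) or, when \(\nabla g(x_0,y_0) \neq 0\), the two gradients are parallel.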

We can wrap these into a single equality (the Lagrange equations):

\[\nabla f = \lambda \nabla g \text{ for some } \lambda \in \mathbb{R}.\]

Triples \((x_0,y_0,\lambda_0)\) satisfying this equation together with the constraint \(g(x_0,y_0) = c\) are called stationary points of the Lagrangian.
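
The name refers to the Lagrangian function, which is not written out above. A sketch using one common sign convention (the symbol \(\mathcal{L}\) and the minus sign in front of \(\lambda\) are assumptions made here for illustration):

\[
\mathcal{L}(x,y,\lambda) = f(x,y) - \lambda\,\bigl(g(x,y) - c\bigr),
\qquad
\nabla \mathcal{L} = 0
\iff
\begin{cases}
f_x = \lambda g_x, \\
f_y = \lambda g_y, \\
g(x,y) = c.
\end{cases}
\]

Setting the partial derivatives of \(\mathcal{L}\) with respect to \(x\), \(y\), and \(\lambda\) to zero recovers the two Lagrange equations and the constraint, so the unknowns are \(x\), \(y\), and \(\lambda\) all at once.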

Example: extrema of \(f(x,y) = x^2 + 2 y^2\), subject to \(g(x,y) = x^2 + y^2\) and \(c = 1\).

The stationary points of the Lagrangian are: \((-1, 0, 1), (1, 0, 1), (0,-1, 2), (0, 1, 2)\). The first two give minima of the system (\(f=1\)) and the last two give maxima (\(f=2\)). One way we can approach the algebra is:

flowchart LR
  A("2x = λ 2x <br> 4y = λ 2y <br> x² + y² = 1")
  A --> B("2x (1-λ) = 0 <br> 2y (2-λ) = 0 <br> x² + y² = 1")
  B -->|"x=0"| C("2y (2-λ) = 0 <br> y² = 1")
  B -->|"λ=1"| D("2y = 0 <br> x² + y² = 1")
  C -->|"y=0"| E("0 = 1"):::contradiction
  C -->|"λ=2"| F("y² = 1")
  F -->|"y=±1"| G("(x,y,λ) = (0,1,2), (0,-1,2)"):::solution
  D -->|"y=0"| H("x² = 1")
  H -->|"x=±1"| I("(x,y,λ) = (1,0,1), (-1,0,1)"):::solution

Notice how we rely on these forks in logic to pin down solutions, and some of them lead to contradictions!
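
As a cross-check on the hand algebra, the same system can be handed to Mathematica's `Solve`. This is only a minimal sketch: the names `f`, `g`, `eqs`, and `sols` are illustrative choices, and it assumes `x`, `y`, and `λ` have no values assigned in the current session.

```mathematica
(* Objective and constraint functions from the example above *)
f = x^2 + 2 y^2;
g = x^2 + y^2;

(* Lagrange equations grad f == λ grad g, threaded componentwise,
   together with the constraint g == 1 *)
eqs = Append[Thread[Grad[f, {x, y}] == λ Grad[g, {x, y}]], g == 1];

(* Real solutions (x, y, λ): the stationary points of the Lagrangian *)
sols = Solve[eqs, {x, y, λ}, Reals];

(* List each stationary point alongside the value of f there *)
{x, y, λ, f} /. sols
```

The result should contain the four stationary points listed above, each followed by its value of \(f\), so the minima and maxima can be read off by comparing the last entries.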

Example: extrema of \(f(x,y) = x^4 + y^4 - 4 x y\), subject to \(g(x,y) = x^2 + y^2\) and \(c = 2\).

The stationary points of the Lagrangian are: \((-1,-1, 0), (1, 1, 0), (-1, 1, 4), (1, -1, 4)\). The first two give minima of the system (\(f=-2\)) and the last two give maxima (\(f=6\)).

In general, a system like this can be quite difficult to solve; the algebra happens to work out nicely in this case. A good trick is to multiply the two equations from \(\nabla f = \lambda \nabla g\) by \(y\) and \(x\), respectively, so that both right-hand sides become \(2 \lambda x y\) and the left-hand sides can be equated:

\[4x^3y - 4y^2 = 2\lambda xy = 4xy^3 - 4x^2 \stackrel{\text{rearrange}}{\implies} x^3 y - x y^3 - y^2 + x^2 = 0 \stackrel{\text{factor}}{\implies} (x-y)(x+y)(xy+1) = 0. \]
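
The factoring step can be spelled out by grouping terms. This is one possible grouping, written for the expression \(x^3 y - x y^3 - y^2 + x^2\) obtained by multiplying the first Lagrange equation by \(y\) and the second by \(x\):

\[
\begin{aligned}
x^3 y - x y^3 + x^2 - y^2 &= xy\,(x^2 - y^2) + (x^2 - y^2) \\
&= (x^2 - y^2)(xy + 1) \\
&= (x-y)(x+y)(xy+1).
\end{aligned}
\]

So every stationary point satisfies \(y = x\), \(y = -x\), or \(xy = -1\), and each of these three branches still has to be checked against the constraint \(x^2 + y^2 = 2\).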

Solving the system looks like this:

flowchart LR
  A("4x³-4y=λ2x <br> 4y³-4x=λ2y <br> x²+y²=2")
  A --> B("4x³-4y-2λx=0 <br> (x-y)(x+y)(xy+1)=0 <br> x²+y²=2")
  B -->|"y=x"| C("2x(2x²-2-λ)=0 <br> 2x²=2")
  B -->|"y=-x"| D("2x(2x²+2-λ)=0 <br> 2x²=2")
  B -->|"y=-1/x"| E("2(2x⁴+2-λx²)/x=0 <br> x²+1/x²=2")
  C -->|"x²=1"| F("(x,y,λ) = (-1,-1,0), (1,1,0)"):::solution
  D -->|"x²=1"| G("(x,y,λ) = (-1,1,4), (1,-1,4)"):::solution
  E -->|"x²=1"| G
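
The branches above can be cross-checked in Mathematica. This is a hedged sketch rather than part of the derivation: the names `f`, `g`, `eqs`, and `sols` are illustrative, and `x`, `y`, and `λ` are assumed to be unassigned symbols.

```mathematica
(* Objective and constraint functions from the example above *)
f = x^4 + y^4 - 4 x y;
g = x^2 + y^2;

(* Check the factorization used to split the problem into branches *)
Factor[x^3 y - x y^3 - y^2 + x^2]

(* Lagrange equations grad f == λ grad g plus the constraint g == 2 *)
eqs = Append[Thread[Grad[f, {x, y}] == λ Grad[g, {x, y}]], g == 2];

(* Real stationary points (x, y, λ), each listed with its value of f *)
sols = Solve[eqs, {x, y, λ}, Reals];
{x, y, λ, f} /. sols
```

Restricting `Solve` to `Reals` discards any complex solutions of the polynomial system, which have no meaning for this constrained optimization problem.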

What is the significance of the stationary points where \(\lambda = 0\)?

Homework

Stewart, James, Daniel K. Clegg, and Saleem Watson. 2020. Calculus: Early Transcendentals. 9th ed. Cengage Learning.