# Linear Least Squares with Equality constraints, the Lagrangian method.

c-plus-plus eigen least-squares numerical-analysis optimization

This post is an annotated version a Linear Least Squares with Equality constraints (LSE for short) solution from Matrix Computations, Gene H. Golub and Charles F. Van Loan. 4th edition, 2013 ISBN 9781421407944.

I wasn’t able to find this solution on the web, so I decided to write a post where I’ll provide some extra detail and implementation snippets for the Golub and Van Loan solution in section 6.2.4, p. 316, “LSE solution using the augmented system,” a Lagrange multiplier method. (Note, that text provides other ways to solve this problem, but I am familiar with the method of Lagrange multipliers, so that’s where I started.)

## Problem definition

First, the LSE problem is a constrained least squares problem in the following form:

$\min_{Bx = d} \| Ax - b \|_2$

where

• $$A \in \mathbb{R}^{m_1 \times n_1}$$, $$m_1 \geq n_1$$,
• $$B \in \mathbb{R}^{m_2 \times n_1}$$, $$m_2 < n_1$$,
• $$b \in \mathbb{R}^{m_1}$$,
• $$d \in \mathbb{R}^{m_2}$$.

The 6.2.4 solution to this problem in Golub and Van Loan uses Lagrange multipliers. Form the Lagrangian function,

$f(x, \lambda) = \frac{1}{2} \| Ax - b \|^2_2 + \lambda^T(d - Bx) \label{l-function}$

Note: from the method of Lagrange multipliers, the function to minimize is $$\| Ax - b \|_2$$ and the constraint function $$g(x) = Bx - d$$ or equivalently, $$g(x) = d - Bx$$. $$g(x) = 0$$ in the traditional setup using Lagrange multipliers.

Expanding Eq. \eqref{l-function}, we have

$f(x, \lambda) = \frac{1}{2} x^{T} A^TAx - x^TA^Tb + b^Tb - \lambda^TBx + \lambda^Td \label{l-function-expanded}$

The next step is to differentiate $$f(x, \lambda)$$ with respect to $$x$$ and set it to zero:

$f(x, \lambda) = A^TAx - A^Tb - B^T\lambda = 0 \label{dl-function}$

In the method of Lagrange multipliers, the constraint in Eq. \eqref{dl-function} is also a constraint with the $$g(x) = 0$$ constraint for a extremizer (minimizer or maximizer). Our constraints are:

• $A^TAx - A^Tb - B^T\lambda = 0 \label{list-constraint0}$
• $Bx = d \label{list-constraint1}$

The Golub and Van Loan approach goes from there, to stating that we can combine these two constraints with the equation $$r=b-Ax$$ to get a linear system,

$\begin{bmatrix} 0 & A^T & B^T\\ A & I & 0\\ B & 0 & 0 \end{bmatrix} \begin{bmatrix} x\\ r\\ \lambda \end{bmatrix} = \begin{bmatrix} 0\\ b\\ d \end{bmatrix} \label{linear-sys1}$

An important note: Eq. \eqref{linear-sys1} is nonsingular if both $$A$$ and $$B$$ have full rank.

## Annotation of constraint step to the linear system of equations step

If you’re happy with that jump from Eq.s \eqref{list-constraint0}-\eqref{list-constraint1} to Eq. \eqref{linear-sys1}, no need to read any further. It took me a minute, so constraints are:

$A^TAx - A^Tb - B^T\lambda = 0 \label{con1}$ $Bx = d \label{con2}$ $r = b - Ax \label{con3}$

First, multiply the first constraint by -1 on both sides:

$-A^TAx + A^Tb + B^T\lambda = 0 \label{con1-neg}$

Eq. \eqref{con1-neg} can be simplified using Eq. \eqref{con3}:

\begin{align} A^T(b - Ax) + B^T\lambda = 0 \\ A^Tr + B^T\lambda = 0 \label{line1} \end{align}

The second line of the linear system in Eq \eqref{linear-sys1} comes from the $$r = b - Ax$$, which can be rearranged to $$Ax + Ir = b$$ (where $$I$$ is an identity matrix of $$m_1 \times m_1$$).

The third line of Eq. \ref{linear-sys1} is the $$Bx = d$$ constraint.

## Implementation details

First, I’ll mark up the linear system of equations with the dimensions on the top and side:

$\begin{array}{c c c c c} & \begin{array}{c c c c} n & m_1 & m_2 \\ \end{array} \\ \begin{array}{c} n \\ m_1\\ m_2 \end{array} & \left[ \begin{array}{c c c} 0 & A^T & B^T \\ A & I & 0 \\ B & 0 & 0 \end{array} \right] & \left[ \begin{array}{c} x \\ r\\ \lambda \end{array} \right] & = & \left[ \begin{array}{c} 0\\ b\\ d \end{array} \right] \end{array}$

Given this, the following C++ snippet gives an outline of the code to create the constraint matrix and solve (using Eigen, and Eigen’s dynamic submatrix access, described here).


MatrixXd A(m1, n);
MatrixXd b(m1, 1);

MatrixXd B(m2, n);
MatrixXd d(m2, 1);

A.setZero(); B.setZero();

// fill in A, b, B, d values from data
...
...

// construct the constraint matrix -- LHS is 'left-hand side'.
MatrixXd LHS(n + m1 + m2, n + m1 + m2);
LHS.setZero();

// RHS is 'right-hand side'
MatrixXd RHS(n + m1 + m2, 1); RHS.setZero();

LHS.block(0, n, n, m1) = A.transpose();

LHS.block(0, n + m1, n, m2) = B.transpose();

LHS.block(n, 0, m1, n) = A;

LHS.block(n, n, m1, m1) = MatrixXd::Identity(m1, m1);

LHS.block(n + m1, 0, m2, n) = B;

RHS.block(n, 0, m1, 1) = b;

RHS.block(n + m1, 0, m2, 1) = d;

JacobiSVD<MatrixXd> svd(LHS, ComputeThinU | ComputeThinV);

MatrixXd resultVector = svd.solve(RHS);