Linear Least Squares with Equality Constraints: the Lagrangian Method
27 Feb 2022

This post is an annotated version of a Linear Least Squares with Equality constraints (LSE for short) solution from Matrix Computations, Gene H. Golub and Charles F. Van Loan, 4th edition, 2013, ISBN 9781421407944.
I wasn’t able to find this solution on the web, so I decided to write a post where I’ll provide some extra detail and implementation snippets for the Golub and Van Loan solution in section 6.2.4, p. 316, “LSE solution using the augmented system,” a Lagrange multiplier method. (Note that the text provides other ways to solve this problem, but I am familiar with the method of Lagrange multipliers, so that’s where I started.)
Problem definition
First, the LSE problem is a constrained least squares problem in the following form:
\[\min_{Bx = d} \| Ax - b \|_2\]
where
- \(A \in \mathbb{R}^{m_1 \times n}\), \(m_1 \geq n\),
- \(B \in \mathbb{R}^{m_2 \times n}\), \(m_2 < n\),
- \(b \in \mathbb{R}^{m_1}\),
- \(d \in \mathbb{R}^{m_2}\).
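To make the shapes concrete, here is a small example of my own (not from the text): fit a line \(y(t) = x_1 + x_2 t\) to \(m_1\) data points \((t_i, y_i)\) in the least squares sense, while forcing the line to pass exactly through a fixed point \((t_0, y_0)\). This is an LSE problem with \(n = 2\) and \(m_2 = 1\):
\[A = \begin{bmatrix} 1 & t_1\\ \vdots & \vdots\\ 1 & t_{m_1} \end{bmatrix}, \quad b = \begin{bmatrix} y_1\\ \vdots\\ y_{m_1} \end{bmatrix}, \quad B = \begin{bmatrix} 1 & t_0 \end{bmatrix}, \quad d = \begin{bmatrix} y_0 \end{bmatrix}\]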
The solution to this problem in section 6.2.4 of Golub and Van Loan uses Lagrange multipliers. Form the Lagrangian function,
\[f(x, \lambda) = \frac{1}{2} \| Ax - b \|^2_2 + \lambda^T(d - Bx) \label{l-function}\]
Note: in the method of Lagrange multipliers, the function to minimize is \(\| Ax - b \|_2\), and the constraint function is \(g(x) = Bx - d\) or, equivalently, \(g(x) = d - Bx\), with \(g(x) = 0\) in the traditional setup. The Lagrangian uses \(\frac{1}{2} \| Ax - b \|^2_2\) instead, which has the same minimizer and is easier to differentiate.
Expanding Eq. \eqref{l-function}, we have
\[f(x, \lambda) = \frac{1}{2} x^{T} A^TAx - x^TA^Tb + \frac{1}{2} b^Tb - \lambda^TBx + \lambda^Td \label{l-function-expanded}\]
The next step is to differentiate \(f(x, \lambda)\) with respect to \(x\) and set it to zero. Using the identities \(\nabla_x (\frac{1}{2} x^T M x) = Mx\) for symmetric \(M\) and \(\nabla_x (c^T x) = c\), this gives
\[\frac{\partial f}{\partial x} = A^TAx - A^Tb - B^T\lambda = 0 \label{dl-function}\]
Differentiating with respect to \(\lambda\) and setting the result to zero recovers the constraint \(d - Bx = 0\), i.e. \(g(x) = 0\). So at an extremizer (minimizer or maximizer), the stationarity condition Eq. \eqref{dl-function} holds together with the constraint. Our constraints are:
- \[A^TAx - A^Tb - B^T\lambda = 0 \label{list-constraint0}\]
- \[Bx = d \label{list-constraint1}\]
From there, Golub and Van Loan introduce the residual \(r = b - Ax\) and combine it with the two constraints above to get a linear system,
\[\begin{bmatrix} 0 & A^T & B^T\\ A & I & 0\\ B & 0 & 0 \end{bmatrix} \begin{bmatrix} x\\ r\\ \lambda \end{bmatrix} = \begin{bmatrix} 0\\ b\\ d \end{bmatrix} \label{linear-sys1}\]
An important note: the coefficient matrix in Eq. \eqref{linear-sys1} is nonsingular when \(A\) has full column rank and \(B\) has full row rank.
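A side note of my own, not from this part of the text: if we use \(r = b - Ax\) to eliminate \(r\) from the system above, we recover the smaller KKT system familiar from constrained optimization (cf. [NW06]):
\[\begin{bmatrix} A^TA & -B^T\\ B & 0 \end{bmatrix} \begin{bmatrix} x\\ \lambda \end{bmatrix} = \begin{bmatrix} A^Tb\\ d \end{bmatrix}\]
The augmented form in Eq. \eqref{linear-sys1} avoids forming \(A^TA\) explicitly, which matters numerically when \(A\) is ill-conditioned.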
Annotating the step from the constraints to the linear system
If you’re happy with that jump from Eqs. \eqref{list-constraint0}–\eqref{list-constraint1} to Eq. \eqref{linear-sys1}, no need to read any further. It took me a minute, so here it is in detail. The three equations are:
\[A^TAx - A^Tb - B^T\lambda = 0 \label{con1}\]
\[Bx = d \label{con2}\]
\[r = b - Ax \label{con3}\]
First, multiply the first constraint by \(-1\) on both sides:
\[-A^TAx + A^Tb + B^T\lambda = 0 \label{con1-neg}\]
Eq. \eqref{con1-neg} can be simplified using Eq. \eqref{con3}:
\[\begin{align} A^T(b - Ax) + B^T\lambda &= 0 \\ A^Tr + B^T\lambda &= 0 \label{line1} \end{align}\]
The first line of the linear system in Eq. \eqref{linear-sys1} is exactly Eq. \eqref{line1}. The second line comes from \(r = b - Ax\), which can be rearranged to \(Ax + Ir = b\) (where \(I\) is the \(m_1 \times m_1\) identity matrix).
The third line of Eq. \eqref{linear-sys1} is the \(Bx = d\) constraint, Eq. \eqref{con2}.
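Putting the three rows together, multiplying out the block system reproduces all three equations at once:
\[\begin{bmatrix} 0 & A^T & B^T\\ A & I & 0\\ B & 0 & 0 \end{bmatrix} \begin{bmatrix} x\\ r\\ \lambda \end{bmatrix} = \begin{bmatrix} A^Tr + B^T\lambda\\ Ax + r\\ Bx \end{bmatrix} = \begin{bmatrix} 0\\ b\\ d \end{bmatrix}\]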
Implementation details
First, I’ll mark up the linear system of equations with the dimensions on the top and side:
\[\begin{array}{c c c c c} & \begin{array}{c c c} n & m_1 & m_2 \end{array} \\ \begin{array}{c} n \\ m_1\\ m_2 \end{array} & \left[ \begin{array}{c c c} 0 & A^T & B^T \\ A & I & 0 \\ B & 0 & 0 \end{array} \right] & \left[ \begin{array}{c} x \\ r\\ \lambda \end{array} \right] & = & \left[ \begin{array}{c} 0\\ b\\ d \end{array} \right] \end{array}\]
Given this, the following C++ snippet gives an outline of the code to create the constraint matrix and solve the system (using Eigen, and Eigen’s dynamic submatrix access via block(), described here).
#include <Eigen/Dense>
using namespace Eigen;

// Dimensions m1, n, m2 are known from the data (m1 >= n, m2 < n).
MatrixXd A(m1, n);
MatrixXd b(m1, 1);
MatrixXd B(m2, n);
MatrixXd d(m2, 1);
A.setZero(); B.setZero();
// fill in A, b, B, d values from data
...
...
// construct the (n + m1 + m2) x (n + m1 + m2) constraint matrix -- LHS is 'left-hand side'.
MatrixXd LHS(n + m1 + m2, n + m1 + m2);
LHS.setZero();
// RHS is 'right-hand side'
MatrixXd RHS(n + m1 + m2, 1); RHS.setZero();
// first block row: [0  A^T  B^T]
LHS.block(0, n, n, m1) = A.transpose();
LHS.block(0, n + m1, n, m2) = B.transpose();
// second block row: [A  I  0]
LHS.block(n, 0, m1, n) = A;
LHS.block(n, n, m1, m1) = MatrixXd::Identity(m1, m1);
// third block row: [B  0  0]
LHS.block(n + m1, 0, m2, n) = B;
RHS.block(n, 0, m1, 1) = b;
RHS.block(n + m1, 0, m2, 1) = d;
// solve the square system; the solution stacks x (n), r (m1), lambda (m2)
JacobiSVD<MatrixXd> svd(LHS, ComputeThinU | ComputeThinV);
MatrixXd resultVector = svd.solve(RHS);
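The solution \(x\) is the first \(n\) entries of resultVector, followed by \(r\) (\(m_1\) entries) and \(\lambda\) (\(m_2\) entries). A note on the solver: since the augmented matrix is square and (under the rank assumptions above) nonsingular, a direct factorization such as Eigen’s FullPivLU would also work and is typically cheaper than the SVD. Here is a minimal self-contained sketch, using made-up data from the line-through-a-point example above (my example, not from the text):
#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;

int main() {
  const int m1 = 4, n = 2, m2 = 1;
  // Fit y = x1 + x2 * t to four points, constrained to pass through (t0, y0) = (0, 1).
  MatrixXd A(m1, n);
  VectorXd b(m1);
  A << 1, 0,
       1, 1,
       1, 2,
       1, 3;
  b << 1.1, 1.9, 3.2, 3.9;
  MatrixXd B(m2, n);
  VectorXd d(m2);
  B << 1, 0;  // row [1, t0] with t0 = 0
  d << 1.0;   // y0 = 1
  // Assemble the augmented system exactly as in the snippet above.
  MatrixXd LHS = MatrixXd::Zero(n + m1 + m2, n + m1 + m2);
  VectorXd RHS = VectorXd::Zero(n + m1 + m2);
  LHS.block(0, n, n, m1) = A.transpose();
  LHS.block(0, n + m1, n, m2) = B.transpose();
  LHS.block(n, 0, m1, n) = A;
  LHS.block(n, n, m1, m1) = MatrixXd::Identity(m1, m1);
  LHS.block(n + m1, 0, m2, n) = B;
  RHS.segment(n, m1) = b;
  RHS.segment(n + m1, m2) = d;
  VectorXd sol = FullPivLU<MatrixXd>(LHS).solve(RHS);
  VectorXd x = sol.head(n);
  std::cout << "x = " << x.transpose() << "\n";
  std::cout << "constraint residual ||Bx - d|| = " << (B * x - d).norm() << "\n";
  return 0;
}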
More reading.
[GVL13] Gene H. Golub and Charles F. Van Loan. 2013. Matrix Computations, 4th edition. ISBN 9781421407944.
[NW06] Jorge Nocedal and Stephen J. Wright. 2006. Numerical Optimization. ISBN 9780387303031; section 17.3 covers Lagrange multipliers.
[Wiki] Wikipedia: Lagrange multipliers.
Listing of Numerical Linear Algebra posts.