# N-view triangulation: nonlinear least squares reprojection error (4).

multiple-view-geometry triangulation

We’re going to jump now from the previous two posts on the DLT methods (DLT 1, DLT 2,3 ) to minimizing nonlinear least squares reprojection error in this post. This type of error is what is commonly used in Structure from Motion (SfM) pipelines. The cost function is

$\min_{\mathbf{X}} \sum_i^n d(\mathbf{x}_i, \mathbf{P}_i\mathbf{X})$

where

• $$\mathbf{P}_i$$: $$3 \times 4$$ camera matrix from the $$i$$th camera.
• $$\mathbf{X}$$ (world point): column vector, size $$4$$.
• $$\mathbf{x}_i$$ (image point): column vector, size $$3$$, from the $$i$$th camera.

as in previous posts, and

• $$d(\cdot, \cdot)$$ is a distance function.

Now, if you read the DLT posts where the formulation was $$\mathbf{x} = \mathbf{PX}$$, you might be thinking, “just use $$\|\mathbf{x} -\mathbf{PX}\|^2$$ as an appropriate distance function, what’s going on here”?

And you would be right, if we assumed that there was no radial distortion and the coordinates were not homogenous (requiring a divide). In the DLT methods, we sort of hand-waved our way around radial distortion or ignored it (I ignore it, in my cases). With the method described in this post, we’ll take into account the camera model, including radial distortion. That’s why this method uses nonlinear least squares to find an approximate solution. It also means we’ ll need an initial solution for $$\mathbf{X}$$ to start the minimization, and we can use a DLT method to compute the initial solution $$\mathbf{X}_0$$. So, another term,

Let

• $$v_{i, distortion}$$: a vector representing the distortion parameters for the $$i$$th camera. I’m using the distortion model of radial and tangential factors used in OpenCV, of 8 elements, represented as $$v_{i, distortion}=(k_1, k_2, p_1, p_2, k_3, k_4, k_5, k_6)$$. Note that one does not have to use all of these parameters – more on that later.

## Radial distortion, a quick explanation

Let

• $$\begin{bmatrix} \mathbf{R} & \rvert & \mathbf{t} \end{bmatrix}$$ be a extrinsic parameter matrix,
• $$\mathbf{K}$$ be a intrinsic camera parameter matrix,
• $$\mathbf{x}_{ud}$$ be undistorted image coordinates in 3D camera coordinates,
• $$\mathbf{x}_d$$ be distorted image coordinates in 3D camera coordinates,
• $$\mathbf{x}$$ be the undistorted point in pixel coordinates (in other words the goal!).

Then

$\mathbf{x}_{ud} = \begin{bmatrix} \mathbf{R} & \rvert & \mathbf{t} \end{bmatrix}\mathbf{X}$

Apply the radial distortion to $$\mathbf{x}_{ud}$$ to get $$\mathbf{x}_d$$. This is where the nonlinear part starts:

$r^2 = \mathbf{x}_{u, ud}^2 + \mathbf{x}_{v, ud}^2$

(here, $$\mathbf{x}_{u, ud}$$ is the first element of $$\mathbf{x}_{ud}$$, $$\mathbf{x}_{v, ud}$$ is the second.)

$\mathbf{x}_{u, d} = \mathbf{x}_{u, ud}\frac{1 + k_1 r^2 + k_2r^4 + k_3r^6}{1 + k_4 r^2 + k_5r^4 + k_6r^6} + 2p_1\mathbf{x}_{u, ud}\mathbf{x}_{v, ud} + p_2(r^2 + 2\mathbf{x}_{u, ud}^2)$ $\mathbf{x}_{v, d} = \mathbf{x}_{v, ud}\frac{1 + k_1 r^2 + k_2r^4 + k_3r^6}{1 + k_4 r^2 + k_5r^4 + k_6r^6} + 2p_2\mathbf{x}_{u, ud}\mathbf{x}_{v, ud} + p_1(r^2 + 2\mathbf{x}_{v, ud}^2)$

Mercy sakes, and then after all of that, convert these 3D camera coordinates $$\mathbf{x}_{d}$$ to image pixel coordinates $$\mathbf{x}$$:

$\mathbf{x} = \mathbf{K}\mathbf{x_d}$

Ok, resources for radial distortion are MVG2004 p. 189, OpenCV tutorial, and Ceres’ bundle adjustment example. The model listed and linked here is not the only radial distortion model, and while this paper, BabelCalib does not have as its focus exploring all models, you can read about them in the paper.

Now, in practice the use of eight distortion parameters can be problematic; my experiences are that using 5, 4, or 2 parameters produce more stable results. Modern OpenCV even supports 14 parameters now! You can set any of the parameters to zero, see the documentation from OpenCV’s calibrateCamera() function.

distCoeffs	Input/output vector of distortion coefficients (k1,k2,p1,p2[,k3[,k4,k5,k6[,s1,s2,s3,s4[,τx,τy]]]]) of 4, 5, 8, 12 or 14 elements.


Note that when you set the extra parameters to zero, they disappear quite nicely from the formulation. For example, suppose only $$k_1, k_2$$ are non-zero:

$\mathbf{x}_{u, d} = \mathbf{x}_{u, ud}\frac{1 + k_1 r^2 + k_2r^4}{1}$ $\mathbf{x}_{v, d} = \mathbf{x}_{v, ud}\frac{1 + k_1 r^2 + k_2r^4}{1}$

## Solving the nonlinear minimization problem

To solve $$\min_{\mathbf{X}} \sum_i^n d(\mathbf{x}_i, \mathbf{P}_i\mathbf{X})$$, you’ll need to use a solver, and I use the Ceres Solver for their implementation of the Levenberg-Marquardt method for nonlinear least squares. The great thing about Ceres – no derivatives. You can write in your function, and the solver will generate analytical Jacobians. I have worked with other solvers where I needed to provide functions for the Jacobian, and whew, that was something.

Compute an initial solution for world point $$\mathbf{X}$$ using another method, likely a DLT method, and plug it into your solver.

Here’s an example from some of my code. T(1) converts the literal 1 into Ceres’ data type.


// this is a function I wrote -- the extrinsic parameter matrix is
// multiplied by world X,
// and result saved in xp (read xp as x prime)
MatrixMultiply(Rt, X, xp, 3, 4, 4, 1);

//normalize
xp = xp/xp;
xp = xp/xp;

// compute r-squared for radial distortion
r_sqr= xp*xp + xp*xp;

// this is the numerator & denominator for the fraction
T numer = (T(1) + k1*r_sqr + k2*r_sqr*r_sqr + k3*r_sqr*r_sqr*r_sqr);
T denom = (T(1) + k4*r_sqr + k5*r_sqr*r_sqr + k6*r_sqr*r_sqr*r_sqr);

// converting undistorted 3D camera points to distorted 3d camera points
xpp = xp*numer/denom + T(2)*p1*xp*xp + p2*(r_sqr + T(2)*xp*xp);
xpp = xp*numer/denom + T(2)*p2*xp*xp + p1*(r_sqr + T(2)*xp*xp);

// multiplying distorted 3d camera points by non-zero entries
//in camera calibration matrix
T predicted_x = (xpp*K + K);
T predicted_y = (xpp*K + K);


[OpenCV Tutorial] OpenCV Camera Calibration Tutorial, 4.5.3. link.

[OpenCV CalibrateCamera] OpenCV calibrateCamera() documentation. link.