
Local Registration Algorithms

In the literature, there are two approaches to local registration: the Forward and the Inverse approaches (9). The former directly evaluates the warp which aligns the texture image $ \mathcal{I}_0$ with the warped image $ \mathcal{I}_W$. The latter computes the warp which aligns the warped image with the texture image and then inverts it. Both are compatible with approximations of the cost function such as Gauss-Newton, ESM, learning-based, etc. We describe in detail the Inverse Gauss-Newton and the Forward Learning-based local registration steps.


Local Registration with Gauss-Newton

Combining an Inverse local registration with a Gauss-Newton approximation of the cost function is efficient: with this combination, the approximated Hessian matrix used in the normal equations solved at each iteration is constant. We cast this approach in the Feature-Driven framework, making it possible to extend Inverse Compositional registration to the TPS and the FFD warps.

In the Inverse Compositional framework, local registration is achieved by minimizing the local discrepancy error:

$\displaystyle \mathcal{C}_l(\tilde{\mathbf{u}}) = \sum_{\mathbf{q} \in \mathfrak{R}} \Vert \mathcal{I}_0(\mathcal{W}(\mathbf{q} ; \tilde{\mathbf{u}})) - \mathcal{I}_W(\mathbf{q}) \Vert^2.$ (A.16)

Using Gauss-Newton as the local registration engine, the gradient vector is the product of the texture image gradient vector and the constant Jacobian matrix  $ \mathsf{K}$ of the warp: $ \mathbf{g} = \nabla\mathcal{I}_0^\mathsf{T}\mathsf{K}$. Matrix  $ \mathsf{K}$ is given in section A.5. The Jacobian matrix of this least squares cost is thus constant, so the Hessian matrix  $ \mathsf{H}$ and its inverse are computed off-line. However, the driving features  $ \tilde{\mathbf{u}}$ are located on the reference image  $ \mathcal{I}_0$. They must be located on the warped image  $ \mathcal{I}_W$ to be used in the update. We use our warp reversion process to find the driving features  $ \mathbf{u}$ on the warped image, i.e. $ \mathbf{u}$ such that  $ \mathcal{W}(\tilde{\mathbf{u}};\mathbf{u}) = \mathbf{u}_0$. An overview of Feature-Driven Inverse Gauss-Newton registration is shown in table A.1.


Table A.1: Overview of our Feature-Driven Inverse Compositional Gauss-Newton registration.
\begin{table}
\begin{center}
\fbox{\begin{minipage}{0.95\columnwidth}
\begin{...
...}\:; \mathbf{u})$
\end{itemize}
\end{minipage}}
\end{center}
\end{table}
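
The benefit of a constant Jacobian can be illustrated with a minimal sketch. It assumes the Jacobian of the cost is available as a dense array `J` with one row per pixel of $ \mathfrak{R}$ (in the text, each row is $ \nabla\mathcal{I}_0^\mathsf{T}\mathsf{K}$ evaluated at one pixel). The function names, the synthetic data and the sign convention of the residual are illustrative assumptions, not taken from the dissertation. Image warping and the warp reversion step are not shown.

```python
import numpy as np

def precompute_gauss_newton(J):
    """Off-line step. J is the constant (n_pixels x n_params) Jacobian."""
    H = J.T @ J                        # Gauss-Newton approximation of the Hessian
    # Solve H X = J^T once rather than forming the inverse of H explicitly.
    return np.linalg.solve(H, J.T)     # shape (n_params x n_pixels)

def local_step(E, residual):
    """On-line step: increment minimizing ||residual + J delta||^2."""
    return -E @ residual

rng = np.random.default_rng(0)
J = rng.standard_normal((200, 6))      # synthetic Jacobian, for illustration only
E = precompute_gauss_newton(J)

delta_true = rng.standard_normal(6)
residual = -J @ delta_true             # residual that delta_true cancels exactly
delta = local_step(E, residual)        # recovers delta_true on this linear problem
```

Since `E` is computed once, each iteration only costs a matrix-vector product, which is the efficiency argument made above.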



Learning-Based Local Registration

Learning-based methods model the relationship between the local increment  $ \mathbold{\delta}$ and the intensity discrepancy  $ \mathbf{d}$ with an interaction function $ f$:

$\displaystyle \mathbold{\delta}= f(\mathbf{d}).$ (A.17)

The interaction function is often approximated using a linear model, i.e. $ f(\mathbf{d}) = \mathsf{F} \mathbf{d}$ where  $ \mathsf{F}$ is called the interaction matrix. This relationship is valid locally around the texture image parameters  $ \mathbf{u}_0$. Compositional algorithms are thus required, as in (104) for homographic warps. The Feature-Driven framework naturally extends this approach to non-groupwise warps. In (49), however, the assumption is made that the domain where the linear relationship is valid covers the whole set of registrations. The authors thus apply their interaction function around the current parameters, avoiding the warping and the composition steps. This does not appear to be a valid choice in practice.

The interaction function is learned from artificially perturbed texture images  $ \mathcal{A}_j$. They are obtained through random perturbations of the reference parameter  $ \mathbf{u}_0$. In the literature, both linear and nonlinear interaction functions are used. They are learned with different regression algorithms such as Least Squares (LS) (104,49), Support Vector Machines (SVM) or Relevance Vector Machines (RVM) (1). Details are given below for a linear interaction function, i.e. an interaction matrix, learned through Least Squares regression. Table A.2 summarizes the steps of learning-based local registration.


Table A.2: Overview of our Learning-based registration.
\begin{table}
\begin{center}
\fbox{\begin{minipage}{0.95\columnwidth}
\begin{...
...}\:; \mathbf{u})$
\end{itemize}
\end{minipage}}
\end{center}
\end{table}


Generating training data with a Feature-Driven Warp.

The driving features in the texture image are disturbed from their rest position  $ \mathbf{u}_0$ with randomly chosen directions  $ \mathbold{\theta}_j$ and magnitudes  $ \mathbf{r}_j$:

$\displaystyle \mathbf{u}_j = \mathbf{u}_0 + \mathbold{\delta}_j \qquad \textrm{with} \qquad \mathbold{\delta}_j = \begin{pmatrix} \mathbf{r}_j \odot \cos(\mathbold{\theta}_j) \\ \mathbf{r}_j \odot \sin(\mathbold{\theta}_j) \end{pmatrix},$ (A.18)

where  $ \cos(\mathbold{\theta}_j)$ and  $ \sin(\mathbold{\theta}_j)$ are meant to be applied to all the elements of  $ \mathbold{\theta}_j$ and $ \odot$ denotes the element-wise product. The magnitude is clamped between a lower and an upper bound, determining the area of validity of the interaction matrix to be learned. For a Feature-Driven warp, fixing this magnitude is straightforward since the driving features are expressed in pixels. It can be much more complex when the parameters are difficult to interpret such as the usual coefficients of the TPS and the FFD warps. There are two ways to synthesize images:
    $\displaystyle \mathcal{A}_j(\mathbf{q}) \leftarrow \mathcal{I}_0(\mathcal{W}(\mathbf{q} ; \mathbf{u}_j)^\diamond)$ (A.19)
    or  
    $\displaystyle \mathcal{A}_j(\mathbf{q}) \leftarrow \mathcal{I}_0(\arg \min_{\ma...
...}} \left\Vert \mathcal{W}(\mathbf{q} ; \mathbf{u}_j) - \mathbf{q} \right\Vert).$ (A.20)

The former requires warp inversion whereas the latter requires a cost optimization, per-pixel. In our experiments, we use equation (A.19). Our Feature-Driven warp reversion process is thus used to warp the texture image. Training data generation with a Feature-Driven warp is illustrated in figure A.7.
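
The perturbation of equation (A.18) can be sketched as follows. This is only an illustration under stated assumptions: directions and magnitudes are drawn uniformly, the parameter vector stacks the x-coordinates of the driving features followed by their y-coordinates, and the function name `perturb_features` and the bounds `r_min` and `r_max` are hypothetical. The image synthesis step of equation (A.19) is not shown.

```python
import numpy as np

def perturb_features(u0, r_min, r_max, m, rng):
    """Draw m perturbed feature vectors u_j = u0 + delta_j.

    u0 stacks the x-coordinates then the y-coordinates of the driving
    features (an assumed layout matching the cos/sin stacking of (A.18)).
    """
    n_features = u0.size // 2
    theta = rng.uniform(0.0, 2.0 * np.pi, size=(m, n_features))  # directions
    r = rng.uniform(r_min, r_max, size=(m, n_features))          # magnitudes, in pixels
    deltas = np.hstack([r * np.cos(theta), r * np.sin(theta)])   # element-wise products
    return u0 + deltas, deltas

rng = np.random.default_rng(0)
u0 = np.array([10.0, 50.0, 90.0, 20.0, 20.0, 80.0])  # three features: x's then y's
U, deltas = perturb_features(u0, r_min=1.0, r_max=5.0, m=100, rng=rng)
```

Because the parameters are pixel positions, the bounds directly give the smallest and largest displacement of each driving feature.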

Figure A.7: Generating training data with a Feature-Driven warp.

Learning.

The residual vector is computed for the pixels of interest in  $ \mathfrak{R}$:

$\displaystyle \mathbf{d}_j = {\boldsymbol{\xi}}_\mathfrak{R} \left( \mathcal{I}_0 - \mathcal{A}_j \right).$ (A.21)

The training data are gathered in matrices $ \mathsf{D} = \left( \mathbold{\delta}_1 \; \ldots \; \mathbold{\delta}_m \right) \in \mathbb{R}^{p \times m}$ and $ \mathsf{L} = \left( \mathbf{d}_1 \; \ldots \; \mathbf{d}_m \right) \in \mathbb{R}^{\vert \mathfrak{R} \vert \times m}$, where $ p$ is the length of the parameter vector  $ \mathbf{u}$. The interaction matrix  $ \mathsf{F} \in \mathbb{R}^{p \times \vert \mathfrak{R} \vert}$ is computed by minimizing a Linear Least Squares error in the image space, expressed in pixel value units, giving:

$\displaystyle \mathsf{F} = \left(\mathsf{L} \mathsf{D}^\mathsf{T}(\mathsf{D} \mathsf{D}^\mathsf{T})^{-1}\right)^\dagger .$ (A.22)

This is one of the two possibilities for learning the interaction matrix. The other possibility is dual: it minimizes an error in the parameter space, i.e. expressed in pixels. The two approaches have been experimentally compared. Learning the interaction matrix in the image space gives the best results, so we use this option in the remainder.
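
As a hedged illustration of equation (A.22), the sketch below learns an interaction matrix from synthetic, noise-free training data and applies it as in equation (A.17). It assumes each column of `D` is one increment $ \mathbold{\delta}_j$ and each column of `L` the matching residual $ \mathbf{d}_j$. The linear image model `G_true`, the sizes and the function name are made up for the example and stand in for residuals measured on real perturbed images.

```python
import numpy as np

def learn_interaction_matrix(D, L):
    """D: (n_params x m) increments, L: (n_pixels x m) residuals, one column per sample."""
    # Image-space least squares: G = L D^T (D D^T)^{-1} minimizes ||G D - L||^2.
    G = np.linalg.solve(D @ D.T, D @ L.T).T
    # The interaction matrix is the pseudo-inverse of G, as in (A.22).
    return np.linalg.pinv(G)

rng = np.random.default_rng(0)
n_params, n_pixels, m = 6, 50, 200
G_true = rng.standard_normal((n_pixels, n_params))  # synthetic linear image model
D = rng.uniform(-3.0, 3.0, size=(n_params, m))      # training increments
L = G_true @ D                                      # noise-free training residuals

F = learn_interaction_matrix(D, L)                  # shape (n_params x n_pixels)
delta_new = rng.uniform(-3.0, 3.0, size=n_params)
delta_est = F @ (G_true @ delta_new)                # delta = F d, equation (A.17)
```

On this exactly linear problem the estimate matches the true increment. With real images the relationship only holds locally, which motivates the piecewise linear model discussed next.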

A piecewise linear interaction function.

Experiments show that a linear approximation of the relationship between the local increment  $ \mathbold{\delta}$ and the intensity discrepancy  $ \mathbf{d}$, though computationally efficient, does not always give satisfactory results. The drawback is that if the interaction matrix covers a large domain of deformation magnitudes, the registration accuracy is degraded. On the other hand, if the matrix is learned for small deformations only, the convergence basin is dramatically reduced. Using a nonlinear interaction function learned through RVM or SVM partially solves this issue. We instead use a simple piecewise linear relationship as the interaction function. This means that we learn not one but a series $ \mathsf{F}_1, \ldots, \mathsf{F}_{\kappa}$ of interaction matrices, each of them covering a different range of displacement magnitudes. The interaction function is thus of the form  $ f(\mathbf{d}) = \sum_{i=1}^\kappa a_i \mathsf{F}_i \mathbf{d}$. More details are given in appendix A.7.


Contributions to Parametric Image Registration and 3D Surface Reconstruction (Ph.D. dissertation, November 2010) - Florent Brunet