Multivariate Optimization

If the independent parameters of the spatial transformation model did not interact with one another, multivariate optimization could be conducted as a set of simultaneous univariate optimizations. Unfortunately, the parameters of a rigid-body model become extensively intertwined even in the course of converting rotations and translations into the coefficients of the linear equations that perform rigid-body transformations (see the chapter "Spatial Transformation Models''). Fortunately, there is a multidimensional version of the univariate procedure described above that takes these interactions between parameters into account. Analogous to the univariate case, the cost function is modeled as a parabolic surface centered on the desired minimum (Fig. 3). Given a vector b of first partial derivatives of the cost function with respect to each of the spatial transformation parameters and a Hessian matrix A consisting of the second partial derivatives with respect to each pair of parameters, the vector x, which im^rac

im^rac

FIGURE 3 Multivariate minimization methods model the cost function as a parabolic surface. The right top panel shows the modeled cost function on the vertical axis and two parameters on the horizontal axes. The objective is to identify the parameter values that correspond to the lowest point on the parabolic surface. As illustrated on the left in the top panel, the second partial derivatives of the cost function indicate that the two parameters are interwined. As a result, many univariate minimizations of these two parameters would be required to reach the minimum. Full Newton-type minimization procedures effectively define a new pair of hybrid parameters that are linear combinations of the original parameters, but have the special property that their second derivatives are diagonalized. The bottom panel shows the geometric consequences of this diagonalization. Univariate minimizations can be applied to both of the hybrid parameters to find the location of the minimum in a single step. The optimum pair of hybrid parameters can then be projected back onto the original pair of parameters to compute the values required for the next iteration. See also Plate 76.

represents the adjustment of each spatial transformation parameter needed to reach the minimum of this parabolic surface, can be computed by solving the matrix equation

Note that this multivariate equation is analogous to the univariate result. This equivalence can be emphasized by noting that the Hessian matrix A is symmetric by definition and consequently that all ofits eigenvalues and eigenvectors are real. If V is a matrix with columns that correspond to all of the unit-length eigenvectors of A, and D is a matrix with the corresponding eigenvalues along the diagonal and zeros elsewhere, the matrix to be solved can be rewritten as

From a geometric standpoint, the vector b of first derivatives is rotated by the orthonormal matrix V' into a space where each newly defined first derivative is divided by the corresponding eigenvalue derived from the Hessian matrix of second derivatives. The resulting vector is then rotated back into the original space where it defines the modification of the original parameters that is required. In essence, the eigenvectors of A define a set of hybrid parameters that are uncorrelated with one another. Each of these hybrid parameters effectively undergoes univariate normalization. These newly defined hybrid spatial transformation parameters lie along the principal axes of the multidimensional parabolic surface defined by A and b (see Fig. 3). Fortunately, it is not necessary to actually compute eigenvectors and eigenvalues to determine the required translation. This multivariate formulation has been implemented for the ratio image uniformity, least squares, and scaled least squares cost functions [21] and generally performs well. One difficulty that can arise is that one or more of the eigenvalues of the Hessian matrix A can turn out to be negative (indicating proximity of a maximum instead of the desired minimum) or zero (indicating a saddle point). In this case, the equation A*x =—b does not have a real solution and the Hessian matrix is said to be non-positive definite. One strategy for dealing with this problem is discussed in the next section.

0 0

Post a comment