%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% softposit demonstration code
%
% date: january 2001 --> april 2003
%
% authors:
% daniel dementhon
% center for automation research
% university of maryland
% college park, md 20742-3275
%
% philip david
% army research laboratory university of maryland
% attn: amsrl-ci-cb dept. of computer science
% adelphi, md 20783 college park, md 20742
%
% copyright 2003, university of maryland, army research lab, daniel dementhon and philip david
%
% this program is available under the terms of the gnu general public license
% as published by the free software foundation.
% see the gnu general public license for more details: http://www.gnu.org/copyleft/gpl.html
%
% you are free to use this program for non-commercial purposes.
% if you plan to use this code in commercial applications,
% you need additional licensing:
% please contact daniel@cfar.umd.edu or phild@arl.army.mil
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% softposit compute pose and correspondences using the softposit algorithm.
%
% [rot, trans, assignmat, projworldpts, foundpose, stats] = ...
% softposit(imagepts, imageadj, worldpts, worldadj, beta0, noisestd, ...
% initrot, inittrans, focallength, displevel, kickout, center)
%
%
% given a set of 3d world points and a set of 2d image points, determine the
% rotation and translation (the pose) of the world points that best aligns
% the projected world points with the image points. correspondences between
% the world and image points are not known. the set of image points may
% include spurious and missing data (due to clutter and occlusion).
%
% the registration is computed using the softposit algorithm, which is
% an iterative algorithm combining the posit and softassign algorithms.
% for a technical description of the algorithm, see [d. dementhon and
% p. david, "softposit: an algorithm for registration of 3d models to noisy
% perspective images combining softassign and posit," univ. of maryland
% center for automation research technical report car-tr-970, may 2001].
%
% inputs:
% imagepts is an m x 2 matrix of the x and y coordinates of a set of
% m (m >= 3) image points.
% imageadj is an m x m adjacency matrix for the set of image points.
% this is used for the purpose of drawing pictures, not for computing
% the pose of the world points. if imageadj(i,j) is nonzero, then image
% point imagepts(i) will be connected via a line segment to image point
% imagepts(j) in any images displayed by this code. if you don't want
% to see any connecting lines, then set imageadj = zeros(m,m).
% worldpts is an n x 3 matrix of the x, y, and z coordinates of a set of
% n (n >= 4) world points.
% worldadj is an n x n adjacency matrix for the set of world points.
% this is used for the purpose of drawing pictures, not for computing the
% pose of the world points. if worldadj(i,j) is nonzero, then the image of
% world point worldpts(i) will be connected via a line segment to the image
% of world point worldpts(j) in any images displayed by this code. if you
% don't want to see any connecting lines, then set worldadj = zeros(n,n).
% beta0 defines the initial fuzziness of the correspondence matrix.
% if we know that our initial pose is close to the actual pose, then
% beta0 should be close to 0.1 so that the correspondence matrix is not
% completely fuzzy; otherwise, if the pose is completely unknown, beta0
% should be close to 0.0001 so that all correspondences remain possible
% in the beginning.
% noisestd is the standard deviation of the normally distributed noise in
% the x and y coordinates of image points.
% initrot is a 3 x 3 matrix giving the initial guess for the rotation of
% the world points. the transformation from world coordinates (xw) to camera
% coordinates (xc) is assumed to be xc = r*xw + t where r is a rotation matrix
% and t is a translation vector.
% inittrans is a 3-vector giving the initial guess for the translation of
% the world points.
% focallength is the focal length of the camera.
% displevel is a value in {0,1,2,3,4,5,6} that determines what amount of
% information this routine produces as it runs. the following table describes
% what output is produced for various values of displevel.
%   -----------------------------------------------------------------
%   function                                      display levels
%   -----------------------------------------------------------------
%   run silently (display nothing).               0, 1, 2
%   show short text messages at all iterations.   3, 4, 5, 6
%   show long text messages at all iterations.    5, 6
%   show images at all iterations.                4, 5, 6
%   wait for cr press after each iteration.       6
%   -----------------------------------------------------------------
% kickout contains information needed to perform an early termination
% test. that is, the routine uses the information in kickout to
% determine if finding a correct pose is so unlikely that it should
% just terminate so that it can be restarted from different initial
% conditions. kickout is a structure with two fields: rthldfbeta and
% nummatchable. kickout.nummatchable is an estimate of the number of
% world points that can be matched to image points. kickout.rthldfbeta
% is an array of length maxiterations such that, if on iteration k,
% nmp(k)/kickout.nummatchable is less than kickout.rthldfbeta(k), then the
% iteration terminates and returns with foundpose == 0. nmp(k) is the number
% of world points matched to image points on iteration k of softposit.
% if you don't want to use this kickout test, then set kickout.rthldfbeta
% = zeros(1,maxiterations). maxiterations is the maximum number of
% iterations that softposit() will perform, and can be calculated from
% beta0, betaupdate, and betafinal.
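% (a sketch of that calculation, based on the annealing loop below, which
% multiplies beta by betaupdate on each iteration until it reaches betafinal:
% maxiterations = ceil(log(betafinal/beta0)/log(betaupdate)). for the example
% values beta0 = 2.0e-4, betaupdate = 1.05, betafinal = 0.5, this gives about
% 161 iterations, so the zeros(1,200) used in the example below is ample.)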
% center is a 2-vector giving the x and y coordinate in the image
% of the optical axis. this argument is optional. if given, then all
% image points in imagepts are normalized with respect to center. if not
% given, then this routine assumes that the image points have already been
% normalized with respect to this center point.
%
% outputs:
% rot returns the 3 x 3 rotation matrix of the object. the computed pose
% of the object is defined by rot and trans.
% trans returns the 3-vector translation of the object.
% assignmat returns the assignments of image points to world points.
% assignmat is an (m+1) x (n+1) matrix such that image point j corresponds
% to world point k if assignmat(j,k) is maximal in row j and column k.
% m and n are as defined above in the description of the inputs.
% projworldpts returns an n x 2 matrix which gives the image (x and y
% coordinates) of all world points for the pose defined by rot and trans.
% if this pose is correct, then most of the rows of projworldpts should
% be close to some row of imagepts.
% foundpose returns nonzero if the softposit iteration converged
% to a pose; otherwise, foundpose returns zero. note that the fact that
% softposit converged to some pose does not mean that this pose is correct.
% the calling routine should check the assignments returned in assignmat
% to decide if this pose provides for a sufficient number of matches.
% stats returns a numiters x 5 matrix containing various information
% about the state of softposit on each iteration. numiters is the number
% of iterations for which softposit ran. see the code for the exact values
% returned in this matrix. this information is probably only useful for
% learning the thresholds to use in the early termination test.
%
% example usage:
% this is an example of softposit registering an image of a cube (in
% which one cube vertex is not visible) to a 3d model of the cube.
% in this example, the internal parameter betaupdate was set to 1.05.
% the pose of the cube that was used to generate the image points is:
% rotation = [ -0.0567 -0.9692 0.2397;
% -0.7991 0.1879 0.5710;
% -0.5985 -0.1592 -0.7852];
% translation = [0.5; -0.1; 7];
% the rotation matrix computed by softposit (see below) does not match the
% above rotation matrix because, for a cube, any of a number of different
% rotations register the image to the model equally well.
% inputs:
% imagepts = [ 172.3829 -15.4229;
% 174.9147 -183.8248;
% -28.3942 -147.8052;
% 243.2142 105.4463;
% 252.6934 -72.3310;
% 25.7430 -28.9218;
% 35.9377 149.1948];
% imageadj = [ 1 1 0 1 0 0 0;
% 1 1 1 0 1 0 0;
% 0 1 1 0 0 1 0;
% 1 0 0 1 1 0 1;
% 0 1 0 1 1 1 0;
% 0 0 1 0 1 1 1;
% 0 0 0 1 0 1 1];
% worldpts = [ -0.5000 -0.5000 -0.5000;
% 0.5000 -0.5000 -0.5000;
% 0.5000 0.5000 -0.5000;
% -0.5000 0.5000 -0.5000;
% -0.5000 -0.5000 0.5000;
% 0.5000 -0.5000 0.5000;
% 0.5000 0.5000 0.5000;
% -0.5000 0.5000 0.5000];
% worldadj = [ 1 1 0 1 1 0 0 0;
% 1 1 1 0 0 1 0 0;
% 0 1 1 1 0 0 1 0;
% 1 0 1 1 0 0 0 1;
% 1 0 0 0 1 1 0 1;
% 0 1 0 0 1 1 1 0;
% 0 0 1 0 0 1 1 1;
% 0 0 0 1 1 0 1 1];
% beta0 = 2.0e-04
% noisestd = 0
% initrot = [ 0.9149 0.1910 -0.3558;
% -0.2254 0.9726 -0.0577;
% 0.3350 0.1330 0.9328];
% inittrans = [0; 0; 50];
% focallength = 1500;
% displevel = 5;
% kickout.nummatchable = 7;
% kickout.rthldfbeta = zeros(1,200);
%
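% call (a sketch; assumes the variables listed above are in the workspace.
% center is omitted, so the image points are assumed to be pre-centered, as
% described in the inputs section above):
% [rot, trans, assignmat, projworldpts, foundpose, stats] = ...
% softposit(imagepts, imageadj, worldpts, worldadj, beta0, noisestd, ...
% initrot, inittrans, focallength, displevel, kickout);
%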
% execution:
% softposit runs for 42 iterations. the message displayed on the
% final iteration is the following:
% betacount = 42, beta = 0.0015523, delta = 0.86674,
% poseconverged = 1, nummatchpts = 7, sumnonslack = 5.0256
% on the final iteration, the projected world points exactly overlay
% the image points.
%
% outputs:
% rot = [ 0.9692 0.0567 -0.2395;
% -0.1879 0.7990 -0.5712;
% 0.1589 0.5987 0.7851];
% trans = [ 0.5002; -0.1001; 7.0011];
% assignmat = [ 0 0 0 0 0 0 0.72 0 0.28;
% 0 0 0 0 0 0.72 0 0 0.28;
% 0 0 0 0 0.72 0 0 0 0.28;
% 0 0 0.72 0 0 0 0 0 0.28;
% 0 0.72 0 0 0 0 0 0 0.28;
% 0.72 0 0 0 0 0 0 0 0.28;
% 0 0 0 0.72 0 0 0 0 0.28;
% 0.28 0.28 0.28 0.28 0.28 0.28 0.28 1.0 0.11];
% projworldpts = [ 25.7389 -28.9020;
% 252.6681 -72.2992;
% 243.1986 105.4171;
% 35.9445 149.1457;
% -28.3467 -147.8163;
% 174.9461 -183.8300;
% 172.4196 -15.4739;
% -14.9407 21.2222];
% foundpose = 1;
% stats = [0.00020 53.3 0 0.24 0.00108;
% 0.00021 55.1 0 0.30 0.00112;
% 0.00022 55.3 0 0.36 0.00105;
% 0.00023 55.1 0.12 0.44 0.00106;
% 0.00024 51.2 0.37 0.50 0.00128;
% ...];
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [rot, trans, assignmat, projworldpts, foundpose, stats] = ...
softposit(imagepts, imageadj, worldpts, worldadj, beta0, noisestd, ...
initrot, inittrans, focallength, displevel, kickout, center)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% check inputs.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
msg = nargchk(11, 12, nargin);
if (~isempty(msg))
error(strcat('error: ', msg));
end
clear msg;
if nargin == 11 % image coordinates are already centered, center is not given
center = [0, 0];
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% initialize.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
stats = [];
% alpha should be set to the maximum allowed squared distance between an
% image point and the matching projected model point. this should be a
% function of the noise level in the image. with normally distributed
% x and y noise of mean zero and standard deviation noisestd, the squared
% distance between a true 2d point and the measured 2d point has a
% chi-squared distribution with 2 degrees of freedom. thus, to ensure
% with probability 0.99 that a measured point is allowed to match to a
% true point, we should take alpha = 9.21*noisestd^2. see hartley &
% zisserman, multiple view geometry, p. 103 & p. 550. a value of 1 is
% added to this to account for possible roundoff errors.
alpha = 9.21*noisestd^2 + 1;
maxdelta = sqrt(alpha)/2; % max allowed error per world point.
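% for example (illustrative numbers only): with noisestd = 2 pixels,
% alpha = 9.21*4 + 1 = 37.84 and maxdelta = sqrt(37.84)/2, about 3.1 pixels.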
betafinal = 0.5; % terminate iteration when beta == betafinal.
betaupdate = 1.05; % update rate on beta.
epsilon0 = 0.01; % used to initialize the assignment matrix.
maxcount = 2;
minbetacount = 20;
nbimagepts = size(imagepts, 1); % number of image points (m below).
nbworldpts = size(worldpts, 1); % number of world points (n below).
minnbpts = min([nbimagepts;nbworldpts]);
maxnbpts = max([nbimagepts;nbworldpts]);
scale = 1/(maxnbpts + 1);
% convert to normalized image coordinates. with normalized coordinates, (0,0)
% is the point where the optic axis penetrates the image, and the focal length
% is 1.
centeredimage(:,1) = (imagepts(:,1) - center(1))/focallength;
centeredimage(:,2) = (imagepts(:,2) - center(2))/focallength;
imageones = ones(nbimagepts, 1); % column m-vector.
% the homogeneous world points -- append a 1 to each 3-vector. an nx4 matrix.
worldones = ones(nbworldpts, 1); % column n-vector.
homogeneousworldpts = [worldpts, worldones];
% initial rotation and translation as passed into this function.
rot = initrot;
trans = inittrans;
if ismember(displevel,[5,6])
disp(' '); disp('initial transformation:'); rot, trans
end
% initialize the depths of all world points based on initial pose.
wk = homogeneousworldpts * [rot(3,:)/trans(3), 1]';
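% i.e., wk(i) = (rot(3,:)*worldpts(i,:)' + trans(3))/trans(3), the depth of
% point i in the camera frame divided by tz (so wk = 1 for points at depth tz).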
% draw a picture of the model on the original image plane.
projworldpts = proj3dto2d(worldpts, rot, trans, focallength, 1, center);
if ismember(displevel,[4,5,6])
% plotimages(imagepts, imageadj, projworldpts, worldadj);
plot2dpts(imagepts, 'r.-', 'fignum', 1, 'adjmat', imageadj, ...
'name', 'original image', 'axis', 300);
plot2dpts(projworldpts, 'b.-', 'overlay', 'adjmat', worldadj, ...
'name', 'projected model', 'axis', 'freeze');
end
% first two rows of the camera matrices (for both perspective and sop). note:
% the scale factor is s = f/tz = 1/tz since f = 1. these are column 4-vectors.
r1t = [rot(1,:)/trans(3), trans(1)/trans(3)]';
r2t = [rot(2,:)/trans(3), trans(2)/trans(3)]';
betacount = 0;
poseconverged = 0;
assignconverged = 0;
foundpose = 0;
beta = beta0;
% the assignment matrix includes a slack row and slack column.
assignmat = ones(nbimagepts+1,nbworldpts+1) + epsilon0;
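% every entry starts at the same value (1 + epsilon0), i.e. a completely
% fuzzy assignment; the softassign step below sharpens it as beta increases.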
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% deterministic annealing loop.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
while beta < betafinal & ~assignconverged
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% given the current pose and w[i], compute the distance matrix, d[j,k].
% d[j,k] is the squared distance between the j-th corrected sop image
% point and the sop projection of the k-th scene point.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% sop (weak perspective projection) of world points.
projectedu = homogeneousworldpts * r1t; % column n-vector.
projectedv = homogeneousworldpts * r2t; % column n-vector.
% mxn matrices with identical rows equal to the x (replicatedprojectedu)
% and y (replicatedprojectedv) sop projections of the world points. the
% number of rows m is equal to the number of image points.
replicatedprojectedu = imageones * projectedu';
replicatedprojectedv = imageones * projectedv';
% for all possible pairings of image to world points, map the
% perspective image point into the corrected sop image point using
% the world point's current estimate of w[i]. here, j is the index of
% an image point and k is the index of a world point. wkxj is an
% mxn matrix where the j-th,k-th entry is w[k]*x[j]/f. wkyj is an
% mxn matrix where the j-th,k-th entry is w[k]*y[j]/f.
wkxj = centeredimage(:,1) * wk';
wkyj = centeredimage(:,2) * wk';
% distmat[j,k] = d[j,k]^2. the focallength^2 term here maps the distances
% back to the original (unnormalized) image, so distances are in terms
% of original pixels.
distmat = focallength*focallength*((replicatedprojectedu - wkxj).^2 + ...
(replicatedprojectedv - wkyj).^2);
if ismember(displevel,[5,6])
disp(' '); disp('distance matrix ='); disp(distmat);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% use the softassign algorithm to compute the best assignment of image to
% world points based on the computed distances. the use of alpha
% determines when to favor assignments instead of use of slack.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
assignmat(1:nbimagepts, 1:nbworldpts) = scale*exp(-beta*(distmat - alpha));
assignmat(1:nbimagepts+1,nbworldpts+1) = scale;
assignmat(nbimagepts+1,1:nbworldpts+1) = scale;
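% note: entries with distmat < alpha map to values above scale (the slack
% value) and so favor a match, while entries with distmat > alpha fall below
% slack; increasing beta makes this preference sharper.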
% assignmat = sinkhornslack(assignmat); % don't normalize slack row and col.
assignmat = sinkhornimp(assignmat); % my "improved" sinkhorn.
if ismember(displevel,[5,6])
disp(' '); disp('processed assignment matrix ='); disp(assignmat);
end
% about how many matching model points do we currently have?
nummatchpts = nummatches(assignmat);
sumnonslack = sum(sum(assignmat(1:nbimagepts,1:nbworldpts)));
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% use posit to calculate the pose that minimizes the objective function
% (equation (10), the weighted sum of the differences between corrected
% image points and sop's of the corresponding world points), given the
% current assignment. the pose parameters and the w[i] are iteratively
% updated until some convergence criterion is satisfied.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% compute sum{k=1,n}(m'[k]s[k]s[k]^t) -- this is used in equations (11) and
% (12) in the paper, and below in the posit loop, to compute an optimal
% pose. in the paper, l == sumskskt.
% first, compute the sum of the elements w/in each column,
% ignoring the slack row and column. this is a row n-vector.
summedbycolassign = sum(assignmat(1:nbimagepts, 1:nbworldpts), 1);
% now, compute the 4x4 matrix l.
sumskskt = zeros(4, 4);
for k = 1:nbworldpts
sumskskt = sumskskt + summedbycolassign(k) * ...
homogeneousworldpts(k,:)' * homogeneousworldpts(k,:);
end
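% (equivalently, in vectorized form: sumskskt = homogeneousworldpts' *
% diag(summedbycolassign) * homogeneousworldpts.)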
if cond(sumskskt) > 1e10
disp('sumskskt is ill-conditioned, terminating search.');
return;
end
objectmat = inv(sumskskt); % inv(l), a 4x4 matrix.
poseconverged = 0; % initialize for posit loop.
count = 0;
% save the previous pose vectors for convergence checks.
r1tprev = r1t;
r2tprev = r2t;
% posit loop. we put a cap on the number of steps here, so at first it does
% not converge. we currently do just one iteration of posit. when the
% annealing schedule is slow enough, one iteration of posit is
% sufficient since the assignments and pose will converge simultaneously.
% while ~poseconverged & count < maxcount
% compute the term in the equation for the optimal pose vectors (m and
% n in equations (11) and (12) in the paper) that depends on the current
% assignment and the w[i]. these are sum{all j,k}(m[j,k]w[k]x[j]s[k]) and
% sum{all j,k}(m[j,k]w[k]y[j]s[k]). here, (w[k]x[j],w[k]y[j]) is our
% best estimate of the sop of the k-th scene point whose homogeneous
% perspective image coordinate is (x[j],y[j]).
weightedui = zeros(4,1); % column vector
weightedvi = zeros(4,1); % column vector
for j = 1:nbimagepts
for k = 1:nbworldpts
weightedui = weightedui + assignmat(j,k) * wk(k) * ...
centeredimage(j,1) * homogeneousworldpts(k,:)';
weightedvi = weightedvi + assignmat(j,k) * wk(k) * ...
centeredimage(j,2) * homogeneousworldpts(k,:)';
end % for k
end % for j
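% (equivalently, in vectorized form, with m = assignmat(1:nbimagepts,1:nbworldpts):
% weightedui = homogeneousworldpts' * (wk .* (m' * centeredimage(:,1))),
% and similarly for weightedvi with centeredimage(:,2).)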
% compute the pose vectors. m = s(r1,tx) and n = s(r2,ty) where the
% scale factor is s = f/tz, and f = 1. these are column 4-vectors.
r1t= objectmat * weightedui; % m
r2t = objectmat * weightedvi; % n
% compute the rotation matrix and translation vector corresponding to
% the computed pose vectors.
if 1 % chang & tsai calculation of r and t.
[u, s, v] = svd([r1t(1:3)'; r2t(1:3)']');
a = u * [1 0; 0 1; 0 0] * v';
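% a = u(:,1:2)*v' is the nearest 3x2 matrix with orthonormal columns (in the
% frobenius norm) to [r1t(1:3), r2t(1:3)], so r1 and r2 below form a valid
% rotation. the pose-vector norms are roughly 1/tz, so the mean singular
% value estimates 1/tz, giving tz = 2/(s(1,1)+s(2,2)) below.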
r1 = a(:,1);
r2 = a(:,2);
r3 = cross(r1,r2);
tz = 2 / (s(1,1) + s(2,2));
tx = r1t(4) * tz;
ty = r2t(4) * tz;
r3t= [r3; tz];
else
% standard calculation of r and t. the rotation matrix may not be
% orthonormal; when it is not, the recovered object is effectively
% distorted.
r1tsquared = r1t(1)*r1t(1) + r1t(2)*r1t(2)+ r1t(3)*r1t(3);
r2tsquared = r2t(1)*r2t(1) + r2t(2)*r2t(2)+ r2t(3)*r2t(3);
tz = sqrt(2/(r1tsquared+r2tsquared)); % chang & tsai's suggestion.
r1n = r1t*tz; % column 4-vectors: (r1,tx).
r2n = r2t*tz; % (r2,ty).
r1 = r1n(1:3); % three rows of the rotation matrix.
r2 = r2n(1:3);
r3 = cross(r1,r2);
r3t= [r3; tz]; % column 4-vector: (r3,tz).
tx = r1n(4);
ty = r2n(4);
end
% make the pose vectors consistent with the new rotation matrix.
% these are column 4-vectors.
r1t = [r1; tx]/tz;
r2t = [r2; ty]/tz;
% compute updated estimates for the w[i] (the third homogeneous
% coordinate of the perspective image points). the improved w[i] are
% used above to map (correct) the perspective image coordinates into
% the corresponding scaled orthographic projection image coordinates.
wk = homogeneousworldpts * r3t /tz;
% determine if the pose as computed by posit has converged.
% the pose is considered to have converged when the sum of the
% weighted distances between the corrected sop image points and
% the sop's of the world points is less than some threshold.
% this distance is in terms of pixels in the original image
% coordinate frame.
delta = sqrt(sum(sum(assignmat(1:nbimagepts,1:nbworldpts) ...
.* distmat))/nbworldpts);
poseconverged = delta < maxdelta;
if ismember(displevel,[3,4,5,6])
% display some information about the state of the iteration.
disp(['betacount = ' num2str(betacount) ...
', beta = ' num2str(beta) ...
', delta = ' num2str(delta) ...
', poseconverged = ' num2str(poseconverged) ...
', nummatchpts = ' num2str(nummatchpts) ...
', sumnonslack = ' num2str(sumnonslack) ]);
end
if ismember(displevel,[4,5,6])
pause;
end
% save some information for use by the calling routine.
stats(betacount+1,1) = beta;
stats(betacount+1,2) = delta;
stats(betacount+1,3) = nummatchpts/nbworldpts;
stats(betacount+1,4) = sumnonslack/nbworldpts;
stats(betacount+1,5) = sum([(r1t-r1tprev)' (r2t-r2tprev)'].^2);
count = count + 1; % number of posit iterations.
% end % of posit loop
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% updates for deterministic annealing and checks for convergence.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% update the "annealing temperature" and determine if the assignments
% have converged.
beta = betaupdate * beta;
betacount = betacount + 1; % number of deterministic annealing iterations.
assignconverged = poseconverged & betacount > minbetacount;
% form the best estimates for the translation vector and rotation matrix.
trans = [tx; ty; tz];
rot = [r1'; r2'; r3'];
if ismember(displevel,[5,6])
disp(' '); disp('current transformation:'); rot, trans
end
% has the pose converged?
foundpose = (delta < maxdelta & betacount > minbetacount);
% project the model onto the original (unnormalized) image plane.
projworldpts = proj3dto2d(worldpts, rot, trans, focallength, 1, center);
if ismember(displevel,[4,5,6])
% plotimages(imagepts, imageadj, projworldpts, worldadj);
plot2dpts(imagepts, 'r.-', 'fignum', 1, 'adjmat', imageadj, ...
'name', 'original image');
plot2dpts(projworldpts, 'b.-', 'overlay', 'adjmat', worldadj, ...
'name', 'projected model');
end
% perform the "early restart" test.
r = nummatchpts/kickout.nummatchable;
if r < kickout.rthldfbeta(betacount)
if ismember(displevel,[3,4,5,6])
disp(['early restart: r = ' num2str(r) ' < r*(' ...
num2str(betacount) ') = ' ...
num2str(kickout.rthldfbeta(betacount))]);
end
return;
end
end % of deterministic annealing loop
if poseconverged
% draw a picture showing the initial and final pose of the model.
initprojworldpts = proj3dto2d(worldpts, initrot, inittrans, ...
focallength, 1, center);
finalprojworldpts = proj3dto2d(worldpts,rot,trans,focallength,1,center);
plot2dpts(initprojworldpts, 'g.-', 'fignum', 5, 'adjmat', worldadj, ...
'axis', 'auto');
plot2dpts(finalprojworldpts, 'b.-', 'overlay', 'adjmat', worldadj);
plot2dpts(imagepts, 'r.', 'overlay', 'adjmat', imageadj);
axis(axis+[-50 50 -50 50]); % expand the axis a little.
set(gca,'xtick',[])
set(gca,'ytick',[])
print('-depsc','-f5',['poses' num2str(fix(rand*1000)) '.ps']);
disp('a picture of the initial and final poses has been saved.');
end
return
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% proj3dto2d project a set of 3d points onto a 2d image.
% pts2d = proj3dto2d(pts3d, rot, trans, flength, objdim, [center])
% returns an nx2 (or 2xn, see below) matrix of the projections of the
% 3d object points pts3d onto the image plane at focal length flength.
% the 3d object points are mapped into the camera coordinate system
% according to the transformation "xcamera = rot*xobject + trans".
% objdim gives the dimension (1 or 2) in pts3d along which object
% points are indexed. thus, the image matrix pts2d will be nx2
% if objdim == 1 and 2xn if objdim == 2. the optional argument
% center is a 2-vector giving the principal point of the image plane;
% if not specified, the center is assumed to be (0,0).
%
% author:
% philip david
% army research laboratory university of maryland
% attn: amsrl-ci-cb dept. of computer science
% adelphi, md 20783 college park, md 20742
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function pts2d = proj3dto2d(pts3d, rot, trans, flength, objdim, varargin)
% check the inputs.
if objdim ~= 1 & objdim ~= 2
error(['objdim must be 1 or 2. you gave ' num2str(objdim)]);
elseif size(pts3d,bitcmp(objdim,2)) ~= 3
error(['pts3d does not have size 3 along dimension ' ...
num2str(bitcmp(objdim,2))]);
end
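% note: bitcmp(objdim,2) maps 1 to 2 and 2 to 1, i.e. it selects the other
% matrix dimension (the one that must have size 3).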
% get the principal point of the image.
if length(varargin) >= 1
center = varargin{1};
if size(center,2) ~= 1
center = center'; % make sure center is a column vector.
end
else
center = [0;0];
end
% make sure the 3d point matrix is 3xn.
if objdim == 1
pts3d = pts3d';
end
numpts = size(pts3d,2); % number of 3d points.
% map the world points into camera points.
campts = rot*pts3d + trans(:,ones(1,numpts)); % 3d camera points.
campts(3,find(campts(3,:) < 1e-20)) = 1e-20; % don't divide by zero!
% project the camera points into the image.
pts2d = flength * campts(1:2,:) ./ campts([3;3],:); % 2d image points.
pts2d = pts2d + center(:,ones(1,numpts)); % shift by the principal point.
% put the 2d point matrix into the same form as the 3d point matrix.
if objdim == 1
pts2d = pts2d';
end
return
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% plot2dpts plot a set of 2d points in a figure.
%
% plot2dpts(pntpos, linespec, [optional_args]) plots the 2d points
% in pntpos using the line specification linespec. pntpos is an nx2
% or a 2xn matrix of the x and y coordinates of the points to plot.
% (if n == 2, then each row of pntpos is assumed to be a different
% 2d point.) linespec is a character string of the type passed to
% plot() that specifies the line color, marker type, and line type.
% most optional arguments are remembered from one call of this routine
% to the next (and are associated with particular figure numbers), and
% therefore don't need to be repeated from one call to the next.
%
% various optional arguments may also appear in any order:
% 'title', title -- string giving title of the figure.
% 'name', name -- string giving name of the current data set.
% 'adjmat', adjmat -- nxn adjacency matrix for the data.
% 'fignum', fignum -- figure number. fignum may be a single integer
% or may be [fn, nr, nc, pos]. in the latter case,
% fn is the figure number, the figure has
% nrxnc axes, and pos is the position for
% the current plot.
% 'new' -- start a new figure?
% 'overlay' -- overlay current points on top of previous points?
% 'linewidth', lwidth -- line width.
% 'labelpts' -- label the points?
% 'axis', axis -- axis may be a single number, the vector
% [xmin xmax ymin ymax], the string 'auto',
% or the string 'freeze'. for the case
% of a single number, each axis displays
% the range -axis to +axis. for the
% case of a vector, the axes are set to
% [xmin xmax ymin ymax]. when axis is
% 'auto', the axes are set automatically.
% when axis is 'freeze', the axes are frozen
% after plotting the current data.
%
% author:
% philip david
% army research laboratory university of maryland
% attn: amsrl-ci-cb dept. of computer science
% adelphi, md 20783 college park, md 20742
%
% history:
% 14 july 2001 - p. david - created.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function plot2dpts(pntpos, linespec, varargin)
persistent firstcall;
persistent fignum;
persistent figpos;
persistent numpntsets;
persistent pntnames;
persistent numlabeledsets;
persistent figtitle;
persistent fignrows;
persistent figncols;
persistent linewidth;
persistent axismode; % 0 = no change, 1 = auto, 2 = manual
persistent axisxmin;
persistent axisxmax;
persistent axisymin;
persistent axisymax;
persistent showlegend;
% initialize the default parameters.
if isempty(firstcall)
firstcall = 0;
fignum = 1;
figpos = 1;
for k = 1:100
pntnames{k} = [];
for j = 1:20
numlabeledsets{k,j} = 0;
numpntsets{k,j} = 0;
showlegend{k,j} = 0;
end
figtitle{k} = [];
fignrows{k} = 1;
figncols{k} = 1;
linewidth{k} = 1;
axismode{k} = 1;
axisxmin{k} = -500;
axisxmax{k} = 500;
axisymin{k} = -500;
axisymax{k} = 500;
end
end
% make sure the data matrix pntpos is an nx2 matrix.
[nrows, ncols] = size(pntpos);
if nrows == 0 & ncols == 0
% we allow an empty set of points.
elseif nrows ~= 2 & ncols ~= 2
error('pntpos must be nx2 or 2xn');
elseif nrows == 2 & ncols == 2
disp('plot2dpts: 2x2 point matrix, assuming each row is a 2d point.');
pause;
elseif nrows == 2
% make sure the 2d point matrix is nx2.
pntpos = pntpos';
tmp = ncols;
ncols = nrows;
nrows = tmp;
end
% set default arguments.
curname = ''; % data set is not named.
pntadj = eye(nrows); % no point is connected to any other.
newfig = 0; % use previous figure number.
overlayfig = 0; % erase previous plot in this figure.
labelpts = 0; % do not label the current set of points.
% get optional arguments.
argnum = 1;
while argnum <= length(varargin)
   % note: the body of this loop, which parsed the optional arguments
   % documented in the header above ('title', 'name', 'adjmat', 'fignum',
   % 'new', 'overlay', 'linewidth', 'labelpts', 'axis') and selected the
   % target figure and axes, is missing from this copy of the file.
   argnum = argnum + 1;
end
if overlayfig & numpntsets{fignum,figpos} > 0  % condition reconstructed.
% if there was a previous plot for this axis, then overlay the
% current plot on top of it.
numpntsets{fignum,figpos} = numpntsets{fignum,figpos} + 1;
hold on;
else
% draw the first plot for this axis.
pntnames{fignum,figpos} = [];
numpntsets{fignum,figpos} = 1;
numlabeledsets{fignum,figpos} = 0;
showlegend{fignum,figpos} = 0;
hold off;
end
subplot(fignrows{fignum}, figncols{fignum}, figpos);
set(fignum,'doublebuffer','on');
% plot the data if there is any data to plot.
if nrows > 0, gplot(pntadj, pntpos, linespec); end
% set the range of data to be displayed.
axis equal;
if axismode{fignum} == 1
axis auto;
elseif axismode{fignum} == 2
axis([axisxmin{fignum} axisxmax{fignum} axisymin{fignum} axisymax{fignum}]);
end
% update the legend.
% in the past, putting a legend on the image caused some of the
% markers to disappear! see below for the makeshift legend.
pntnames{fignum,figpos,numpntsets{fignum,figpos}} = curname;
if ~isempty(curname) | showlegend{fignum,figpos}
showlegend{fignum,figpos} = 1;
legend(pntnames{fignum,figpos,1:numpntsets{fignum,figpos}},0);
end
% tag the current data according to its fignum, figpos, and pntsetnum.
h = findobj('type','line','color',linespec(1),'tag','');
set(h,'tag',mat2str([fignum figpos numpntsets{fignum,figpos}]));
% set the size of all markers and lines that are the current color.
markersize = 8*ceil(linewidth{fignum});
set(h, 'markersize', markersize);
set(h, 'linewidth', ceil(linewidth{fignum}));
% get the width and height of the plot.
axisrange = axis; % [xmin xmax ymin ymax]
width = axisrange(2) - axisrange(1);
height = axisrange(4) - axisrange(3);
set(gca,'units','points');
axsize = get(gca,'position'); % [left bottom width height] in points.
markratio = markersize./axsize(3:4); % marker-size-to-axis-size ratio.
% label the points.
if labelpts
switch mod(numlabeledsets{fignum,figpos},4)
case 0
% halign = 'left'; valign = 'top';
% dx = 1; dy = -1;
halign = 'left'; valign = 'middle';
dx = 1; dy = 0;
case 1
% halign = 'right'; valign = 'bottom';
% dx = -1; dy = 1;
halign = 'right'; valign = 'middle';
dx = -1; dy = 0;
case 2
% halign = 'right'; valign = 'top';
% dx = -1; dy = -1;
halign = 'center'; valign = 'top';
dx = 0; dy = -1;
case 3
% halign = 'left'; valign = 'bottom';
% dx = 1; dy = -1;
halign = 'center'; valign = 'bottom';
dx = 0; dy = 1;
end
dx = 0.5*dx*markratio(1)*width;
dy = 0; % 0.4*dy*markratio(2)*height;
for k = 1:nrows
h = text(pntpos(k,1)+dx, pntpos(k,2)+dy, int2str(k), ...
'verticalalignment', valign, 'horizontalalignment', halign, ...
'fontsize', 10, 'color', linespec(1));
end
numlabeledsets{fignum,figpos} = numlabeledsets{fignum,figpos} + 1;
end
% create a makeshift legend until the regular legend works.
% if ~isempty(curname)
% text(axisrange(1)+0.05*width, ...
% axisrange(4)-(numpntsets{fignum,figpos})*0.05*height, ...
% curname, 'fontsize', 10, 'color', linespec(1));
% end
% display a figure title.
if ~isempty(figtitle{fignum})
title(figtitle{fignum});
end
drawnow;
if axismode{fignum} == 0
% freeze the axes at their current settings.
axisxmin{fignum} = axisrange(1);
axisxmax{fignum} = axisrange(2);
axisymin{fignum} = axisrange(3);
axisymax{fignum} = axisrange(4);
axismode{fignum} = 2;
end
return;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% normalize across rows and columns to find an assignment matrix (a
% doubly stochastic matrix).
%
% this version treats the slack row and column differently from other rows
% and columns: the slack values are not normalized with respect to other
% slack values, only with respect to the nonslack values. this may work
% better than the original sinkhorn algorithm which treats all rows and
% columns identically. this is true primarily when there needs to be more
% than one assignment to a slack row or column. i.e., when there are two
% or more missing image points or model points.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function normalizedmat = sinkhornslack(m)
imaxitersinkhorn=60; % in pami paper
fepsilon2 = 0.001; % used in termination of sinkhorn loop.
inumsinkiter = 0;
[nbrows, nbcols] = size(m);
fmdiffsum = fepsilon2 + 1; % set "difference" from last m matrix above
% the loop termination threshold.
while(abs(fmdiffsum) > fepsilon2 & inumsinkiter < imaxitersinkhorn)
mprev = m; % save m from previous iteration to test for loop termination
% col normalization (except outlier row - do not normalize col slacks
% against each other)
mcolsums = sum(m, 1); % row vector.
mcolsums(nbcols) = 1; % don't normalize slack col terms against each other.
mcolsumsrep = ones(nbrows,1) * mcolsums ;
m = m ./ mcolsumsrep;
% row normalization (except outlier row - do not normalize col slacks
% against each other)
mrowsums = sum(m, 2); % column vector.
mrowsums(nbrows) = 1; % don't normalize slack row terms against each other.
mrowsumsrep = mrowsums * ones(1, nbcols);
m = m ./ mrowsumsrep;
inumsinkiter=inumsinkiter+1;
fmdiffsum=sum(abs(m(:)-mprev(:)));
end
normalizedmat = m;
return
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% normalizedmat = sinkhornimp(m)
%
% apply an improved sinkhorn algorithm to map matrix m to a doubly
% stochastic matrix.
%
% the sinkhorn algorithm modified for slack rows and columns treats the
% slack row and column differently from other rows and columns: the slack
% values are not normalized with respect to other slack values, only with
% respect to the nonslack values. this may work better than the original
% sinkhorn algorithm which treats all rows and columns identically.
% this is true primarily when there needs to be more than one assignment
% to a slack row or column. i.e., when there are two or more missing
% image points or model points.
%
% a problem with this modified sinkhorn algorithm is the following.
% suppose all rows except the slack row are normalized. it is possible for
% a nonslack value that was previously maximum in its row and column to end
% up with a value that is less than the slack value for that column. (this
% nonslack value will still be greater than the slack value for that
% row.) the same sort of thing can happen when columns are normalized.
% intuitively, this seems like a bad thing: nonslack values that start
% off as maximum in their row and column should remain maximum in their
% row and column throughout this iteration. the current algorithm
% attempts to prevent this from happening as follows. after performing
% row normalizations, the values in the slack row are set so that their
% ratio to the nonslack value in that column which was previously maximum
% is the same as this ratio was prior to row normalization. a similar
% thing is done after column normalizations.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function normalizedmat = sinkhornimp(m)
imaxitersinkhorn=60; % in pami paper
fepsilon2 = 0.001; % used in termination of sinkhorn loop.
inumsinkiter = 0;
[nbrows, nbcols] = size(m);
fmdiffsum = fepsilon2 + 1; % set "difference" from last m matrix above
% the loop termination threshold
% get the positions and ratios to slack of the nonslack elements that
% are maximal in their row and column.
[posmax, ratios] = maxposratio(m);
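% posmax(i,:) gives the [row, col] of the i-th nonslack element that is
% maximal in both its row and column; ratios(i,:) gives that element's
% [row-slack, col-slack] ratios, which the loop below restores after each
% normalization (see maxposratio below).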
while(abs(fmdiffsum) > fepsilon2 & inumsinkiter < imaxitersinkhorn)
mprev = m; % save m from previous iteration to test for loop termination
% col normalization (except outlier row - do not normalize col slacks
% against each other)
mcolsums = sum(m, 1); % row vector.
mcolsums(nbcols) = 1; % don't normalize slack col terms against each other.
mcolsumsrep = ones(nbrows,1) * mcolsums ;
m = m ./ mcolsumsrep;
% fix values in the slack column.
for i = 1:size(posmax,1)
m(posmax(i,1),nbcols) = ratios(i,1)*m(posmax(i,1),posmax(i,2));
end
% row normalization (except outlier row - do not normalize col slacks
% against each other)
mrowsums = sum(m, 2); % column vector.
mrowsums(nbrows) = 1; % don't normalize slack row terms against each other.
mrowsumsrep = mrowsums * ones(1, nbcols);
m = m ./ mrowsumsrep;
% fix values in the slack row.
for i = 1:size(posmax,1)
m(nbrows,posmax(i,2)) = ratios(i,2)*m(posmax(i,1),posmax(i,2));
end
inumsinkiter=inumsinkiter+1;
fmdiffsum=sum(abs(m(:)-mprev(:)));
end
normalizedmat = m;
return;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% maxposratio get the positions and ratios of the maximum nonslack values.
% [pos, ratios] = maxposratio(assignmat)
% assignmat is an image to model assignment matrix. the last
% row and column of assignmat are slack, and are used to indicate
% no match for that row or column. pos returns an mx2 matrix of the
% positions in assignmat that are maximal within the respective
% row and column. m is the number of elements in assignmat that are
% maximum in a row and column, and may equal zero. the kth row
% of pos gives [rowpos colpos], the row and column position of the
% kth maximal element. ratios returns an mx2 matrix of the ratios
% of these maximal values to the slack values in those rows and
% columns. the kth row of ratios is [rratio, cratio] where rratio
% (cratio) is the respective row (column) slack value for the kth
% maximal element divided by the value of the kth maximal element.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [pos, ratios] = maxposratio(assignmat)
pos = [];
ratios = [];
nrows = size(assignmat,1);
ncols = size(assignmat,2);
nimgpnts = nrows - 1;
nmodpnts = ncols - 1;
% iterate over all columns of assignmat.
for k = 1 : nmodpnts
[vmax imax] = max(assignmat(:,k)); % max value in column k.
if imax == nrows
continue; % slack value is maximum in this column.
end;
% check if the max value in the column is maximum within its row.
if all(vmax > assignmat(imax,[1:k-1,k+1:ncols]))
pos = [pos; [imax, k]]; % this value is maximal in its row & column.
% compute the ratios to row and column slack values.
rr = assignmat(imax,ncols)/assignmat(imax,k);
cr = assignmat(nrows,k)/assignmat(imax,k);
ratios = [ratios; [rr cr]];
end
end
return;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% nummatches how many image points have found a model point match?
% num = nummatches(assignmat)
% assignmat is an image to model assignment matrix. the last
% row and column of assignmat are slack, and are used to indicate
% no match for that row or column. image point number i matches
% model point number m if assignmat(i,m) is strictly greater
% than all other entries in row i and column m of assignmat.
% num returns the number of matching points.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function num = nummatches(assignmat)
num = 0;
nrows = size(assignmat,1);
ncols = size(assignmat,2);
nimgpnts = nrows - 1;
nmodpnts = ncols - 1;
for k = 1 : nmodpnts
[vmax imax] = max(assignmat(:,k)); % max value in column k.
if imax == nrows
continue; % slack value is maximum in this column.
end;
if all(vmax > assignmat(imax,[1:k-1,k+1:ncols]))
num = num + 1; % this value is maximal in its row & column.
end
end
return;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%