Code for estimating the absolute pose of a multi-camera system from a set of 2D-3D matches.
This project has the following dependencies:
RansacLib and PoseLib are included as submodules. After cloning the repository, run
git submodule update --init --recursive
To compile the project (under Linux), simply type
mkdir build
cd build/
cmake -DCMAKE_BUILD_TYPE=Release ../
make
There are two executables: fixed_rig_camera_pose and multi_camera_pose.
fixed_rig_camera_pose assumes that the absolute scale of the transformation between the images is known.
multi_camera_pose does not require the scale to be known (e.g., when the poses are estimated by SLAM) and instead
estimates it. Both executables define multi-camera rigs from sequences of images and expect a list of 2D-3D
matches to be given for each image in the multi-camera system. Both executables share the same command line parameters:
- images_with_intrinsics: the file name of a text file that contains image names, camera intrinsics, and camera extrinsics. Each line consists of the following information:
  - image_name: the name of the image.
  - intrinsics: the intrinsics of the image, consisting of a camera type and its parameters. We use COLMAP's camera definitions; please see them for the available camera types and their parameters. An example for this part is SIMPLE_RADIAL 1024 1024 640.0 512 512 0.2.
  - The camera extrinsics in the form qw qx qy qz cx cy cz, where qw qx qy qz is a unit quaternion defining a rotation from world to camera coordinates for this image and cx cy cz is the position of the image in world coordinates. I.e., a point Xw in world coordinates is transformed into the local camera coordinate system of the image as Xc = R * (Xw - c), where R is the rotation matrix defined by the quaternion and c is the position of the image in the world coordinate system. These poses can be defined in an arbitrary coordinate frame. The executables will automatically extract relative poses between the images.
- outfile: the file name of a text file into which the estimated poses will be written. For each image in images_with_intrinsics, a pose will be written (if a corresponding pose can be estimated) in a line of the output file. The format of that line is image_name qw qx qy qz tx ty tz. Here, image_name is the name of the image (as specified in images_with_intrinsics), qw qx qy qz again is a unit quaternion defining the rotation from the coordinate system of the 3D points (see below) to the local camera coordinate system, and tx ty tz is the corresponding translation. If R is the rotation matrix corresponding to the quaternion, then a point Xw in world coordinates is transformed into the local camera coordinate system as Xc = R * Xw + t, where t is the translation vector given by tx ty tz.
- inlier_threshold: the inlier threshold to be used in RANSAC, given in pixels.
- num_lo_steps: the number of local optimization steps performed in RANSAC whenever a new best minimal pose is found.
- invert_Y_Z: the 2D-3D matches for each image are read from text files, where each line has the format x y X Y Z. Here, x y defines a 2D keypoint. Set this variable to 1 if x y is given in a coordinate system where the y-axis is pointing upwards.
- points_centered: set to 1 if the 2D keypoint coordinates x y are already centered around the principal point. If set to 0, the executables will center the keypoints before using them.
- undistortion_needed: set to 1 if the 2D keypoint coordinates need to be undistorted and to 0 if the keypoints that are read from the text files are already undistorted.
- sequence_length: both executables assume that the images specified in images_with_intrinsics are given in sequential order. If set to k, e.g., 3, fixed_rig_camera_pose will use the first k images to define a multi-camera system and attempt to localize them jointly. It will then use the next k images to define the next multi-camera rig, etc. multi_camera_pose will use the first k images to define the first multi-camera system. It will then use images 2, ..., k + 1 to define the next multi-camera system, then 3, ..., k + 2, etc.
- [match-file postfix] (optional): the postfix of the match files, set to .individual_datasets.matches.txt by default. For a given image name a.jpg in images_with_intrinsics, both executables will attempt to load 2D-3D matches from a text file called a.jpg.individual_datasets.matches.txt (for the default value). The text file stores each 2D-3D match in a single line, with the format x y X Y Z. Here, x y is the 2D coordinate of the matching 2D point and X Y Z is the corresponding 3D point.
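The two grouping schemes described for sequence_length can be illustrated with a short Python sketch. It is not taken from the project's code: the function names are hypothetical, and the handling of a trailing group with fewer than k images (dropped here) is an assumption, since the description above does not say what the executables do in that case.

```python
def fixed_rig_groups(image_names, k):
    """Non-overlapping groups of k consecutive images (fixed_rig_camera_pose style).

    Assumption: an incomplete trailing group is dropped.
    """
    return [image_names[i:i + k] for i in range(0, len(image_names) - k + 1, k)]


def sliding_groups(image_names, k):
    """Overlapping groups 1..k, 2..k+1, 3..k+2, ... (multi_camera_pose style)."""
    return [image_names[i:i + k] for i in range(0, len(image_names) - k + 1)]


if __name__ == "__main__":
    names = ["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg", "f.jpg", "g.jpg"]
    for group in fixed_rig_groups(names, 3):
        print("fixed rig:", group)
    for group in sliding_groups(names, 3):
        print("sliding:  ", group)
```

With seven images and k = 3, the first function yields two rigs and the second yields five, which shows why multi_camera_pose considers each image in several rigs.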
Simply calling one of the programs without parameters will give you a list of command line arguments.
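Note that the input file stores camera positions (Xc = R * (Xw - c)) while the output file stores translations (Xc = R * Xw + t). The two forms are related by t = -R * c. The following Python sketch shows this relation; it is only an illustration, the helper names are hypothetical, and it assumes the Hamilton quaternion convention (as used by Eigen and COLMAP). It converts between the two notations only. It does not map poses from the arbitrary input frame into the frame of the 3D points.

```python
import math


def quat_to_rotation(qw, qx, qy, qz):
    """Rotation matrix for a quaternion (Hamilton convention, assumed)."""
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / n, qx / n, qy / n, qz / n  # guard against drift
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]


def mat_vec(R, v):
    """Multiply a 3x3 matrix with a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def center_to_translation(R, c):
    """t = -R * c, so that R * (Xw - c) == R * Xw + t."""
    return [-x for x in mat_vec(R, c)]


def to_camera_from_center(R, c, Xw):
    """Input-file convention: Xc = R * (Xw - c)."""
    return mat_vec(R, [Xw[i] - c[i] for i in range(3)])


def to_camera_from_translation(R, t, Xw):
    """Output-file convention: Xc = R * Xw + t."""
    RX = mat_vec(R, Xw)
    return [RX[i] + t[i] for i in range(3)]


if __name__ == "__main__":
    # 90 degree rotation about the z-axis, camera positioned at (1, 2, 3).
    R = quat_to_rotation(math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
    c = [1.0, 2.0, 3.0]
    t = center_to_translation(R, c)
    Xw = [2.0, 2.0, 3.0]
    print(to_camera_from_center(R, c, Xw))
    print(to_camera_from_translation(R, t, Xw))
```

Both print statements should give the same point up to floating-point rounding, since the two conventions describe the same transformation.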
When using this software for publications, please cite
@inproceedings{wald2020,
title={{Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes}},
author={Wald, Johanna and Sattler, Torsten and Golodetz, Stuart and Cavallari, Tommaso and Tombari, Federico},
booktitle = {Proceedings IEEE European Conference on Computer Vision (ECCV)},
year = {2020}
}
When using the pose estimators based on the GP3P or GP4Ps solvers, please cite
@InProceedings{Kukelova2016CVPR,
author = {Kukelova, Zuzana and Heller, Jan and Fitzgibbon, Andrew},
title = {Efficient Intersection of Three Quadrics and Applications in Computer Vision},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2016}
}