| .. _output-format: | |
| Output Format | |
| ============= | |
| ================== | |
| Binary File Format | |
| ================== | |
| Note that all binary data is stored using little-endian byte ordering. All x86 | |
| processors are little-endian, so no special care has to be taken when | |
| reading COLMAP binary data on most platforms. | |
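On a platform that might not be little-endian, the byte order can be stated explicitly when decoding. The following is a minimal sketch using Python's ``struct`` module. The payload layout (one unsigned 64-bit integer followed by one 64-bit float) is a made-up example and not the layout of any particular COLMAP file.

```python
import struct

# Hypothetical payload: one uint64 followed by one float64, both little-endian.
payload = struct.pack("<Qd", 3, 2559.81)

# The "<" prefix forces little-endian decoding (and no padding),
# regardless of the byte order of the host machine.
count, focal = struct.unpack("<Qd", payload)
print(count, focal)
```

Using an explicit ``<`` prefix rather than the native default keeps the reader portable to big-endian hosts.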
| ======================= | |
| Indices and Identifiers | |
| ======================= | |
| Any variable name ending with ``*_idx`` should be considered as an ordered, | |
| contiguous zero-based index. In general, any variable name ending with ``*_id`` | |
| should be considered as an unordered, non-contiguous identifier. | |
| For example, the unique identifiers of cameras (`CAMERA_ID`), images | |
| (`IMAGE_ID`), and 3D points (`POINT3D_ID`) are unordered and are most likely not | |
| contiguous. This also means that the maximum `POINT3D_ID` does not necessarily | |
| correspond to the number of 3D points, since some `POINT3D_ID` values are missing | |
| due to filtering during the reconstruction, etc. | |
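The practical consequence is that identifiers are best used as dictionary keys, while indices can address a list directly. A small sketch of that distinction follows. The variable names are illustrative only, and the values are borrowed from the text format examples on this page.

```python
# Hypothetical subset of a reconstruction, keyed by POINT3D_ID.
points3D = {
    63390: (1.67241, 0.292931, 0.609726),
    63376: (2.01848, 0.108877, -0.0260841),
    63371: (1.71102, 0.28566, 0.53475),
}

num_points = len(points3D)    # number of 3D points
max_point_id = max(points3D)  # largest identifier, unrelated to the count

# An *_idx value, in contrast, is zero-based and contiguous,
# so it can index a list directly.
keypoints = [(2362.39, 248.498), (1784.7, 268.254)]
point2D_idx = 1
x, y = keypoints[point2D_idx]
```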
| ===================== | |
| Sparse Reconstruction | |
| ===================== | |
| By default, COLMAP uses a binary file format (machine-readable, fast) for | |
| storing sparse models. In addition, COLMAP provides the option to store the | |
| sparse models as text files (human-readable, slow). In both cases, the | |
| information is split into three files describing the `cameras`, | |
| `images`, and `points3D`. Any directory containing those three files constitutes a | |
| sparse model. The binary files have the file extension `.bin` and the text files | |
| the file extension `.txt`. Note that when loading a model from a directory which | |
| contains both binary and text files, COLMAP prefers the binary format. | |
| To export the currently selected model in the GUI, choose ``File > Export | |
| model``. To export all reconstructed models in the current dataset, choose | |
| ``File > Export all``. The selected folder then contains the three files, and | |
| for convenience, the current project configuration for importing the model to | |
| COLMAP. To import the exported models, e.g., for visualization or to resume the | |
| reconstruction, choose ``File > Import model`` and select the folder containing | |
| the `cameras`, `images`, and `points3D` files. | |
| To convert between the binary and text format in the GUI, you can load the model | |
| using ``File > Import model`` and then export the model in the desired output | |
| format using ``File > Export model`` (binary) or ``File > Export model as text`` | |
| (text). In addition, you can export sparse models to other formats, such as | |
| VisualSfM's NVM, Bundler files, PLY, VRML, etc., using ``File > Export as...``. | |
| To convert between various formats from the CLI, use the ``model_converter`` | |
| executable. | |
| There are two source files to conveniently read the sparse reconstructions using | |
| Python (``scripts/python/read_write_model.py`` supporting binary and text) and Matlab | |
| (``scripts/matlab/read_model.m`` supporting text). | |
| ----------- | |
| Text Format | |
| ----------- | |
| COLMAP exports the following three text files for every reconstructed model: | |
| `cameras.txt`, `images.txt`, and `points3D.txt`. Comments start with a leading | |
| "#" character and are ignored. The first comment lines briefly describe the | |
| format of the text files, which is described in more detail on this page. | |
| cameras.txt | |
| ----------- | |
| This file contains the intrinsic parameters of all reconstructed cameras in the | |
| dataset using one line per camera, e.g.:: | |
| # Camera list with one line of data per camera: | |
| # CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[] | |
| # Number of cameras: 3 | |
| 1 SIMPLE_PINHOLE 3072 2304 2559.81 1536 1152 | |
| 2 PINHOLE 3072 2304 2560.56 2560.56 1536 1152 | |
| 3 SIMPLE_RADIAL 3072 2304 2559.69 1536 1152 -0.0218531 | |
| Here, the dataset contains 3 cameras that use different camera models but | |
| share the same sensor dimensions (width: 3072, height: 2304). The number of | |
| parameters is variable and depends on the camera model. For the first camera, | |
| there are 3 parameters: a single focal length of 2559.81 pixels and a | |
| principal point at pixel location `(1536, 1152)`. The intrinsic parameters of a | |
| camera can be shared by multiple images, which refer to cameras using the unique | |
| identifier `CAMERA_ID`. | |
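As an illustration of the layout described above, here is a minimal parsing sketch. The function name ``parse_cameras_txt`` and the dictionary layout are assumptions made for this example and are not the interface of ``scripts/python/read_write_model.py``.

```python
def parse_cameras_txt(text):
    """Parse the contents of a cameras.txt file into a dict keyed by CAMERA_ID."""
    cameras = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        cameras[int(fields[0])] = {
            "model": fields[1],
            "width": int(fields[2]),
            "height": int(fields[3]),
            # The number of parameters depends on the camera model.
            "params": [float(v) for v in fields[4:]],
        }
    return cameras


example = """\
# Camera list with one line of data per camera:
#   CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]
# Number of cameras: 3
1 SIMPLE_PINHOLE 3072 2304 2559.81 1536 1152
2 PINHOLE 3072 2304 2560.56 2560.56 1536 1152
3 SIMPLE_RADIAL 3072 2304 2559.69 1536 1152 -0.0218531
"""
cameras = parse_cameras_txt(example)
print(cameras[1]["params"])
```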
| images.txt | |
| ---------- | |
| This file contains the pose and keypoints of all reconstructed images in the | |
| dataset using two lines per image, e.g.:: | |
| # Image list with two lines of data per image: | |
| # IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME | |
| # POINTS2D[] as (X, Y, POINT3D_ID) | |
| # Number of images: 2, mean observations per image: 2 | |
| 1 0.851773 0.0165051 0.503764 -0.142941 -0.737434 1.02973 3.74354 1 P1180141.JPG | |
| 2362.39 248.498 58396 1784.7 268.254 59027 1784.7 268.254 -1 | |
| 2 0.851773 0.0165051 0.503764 -0.142941 -0.737434 1.02973 3.74354 1 P1180142.JPG | |
| 1190.83 663.957 23056 1258.77 640.354 59070 | |
| Here, the first two lines define the information of the first image, and so on. | |
| The reconstructed pose of an image is specified as the projection from world to | |
| the camera coordinate system of an image using a quaternion `(QW, QX, QY, QZ)` | |
| and a translation vector `(TX, TY, TZ)`. The quaternion is defined using the | |
| Hamilton convention, which is, for example, also used by the Eigen library. The | |
| coordinates of the projection/camera center are given by ``-R^t * T``, where | |
| ``R^t`` is the inverse/transpose of the 3x3 rotation matrix composed from the | |
| quaternion and ``T`` is the translation vector. The local camera coordinate | |
| system of an image is defined in a way that the X axis points to the right, the | |
| Y axis to the bottom, and the Z axis to the front as seen from the image. | |
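The camera center formula ``-R^t * T`` can be sketched in plain Python as follows. The helper names ``qvec_to_rotmat`` and ``camera_center`` are chosen for this example only. The rotation matrix uses the standard Hamilton-convention expansion, and the quaternion is normalized first because values read from a text file are rounded.

```python
import math


def qvec_to_rotmat(qw, qx, qy, qz):
    """Return the 3x3 rotation matrix (nested lists) of a Hamilton quaternion."""
    norm = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    qw, qx, qy, qz = qw / norm, qx / norm, qy / norm, qz / norm
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]


def camera_center(qvec, tvec):
    """Compute the camera center -R^t * T in world coordinates."""
    R = qvec_to_rotmat(*qvec)
    # Element i of R^t * T is the dot product of column i of R with T.
    return tuple(-sum(R[j][i] * tvec[j] for j in range(3)) for i in range(3))


# Pose of the first image in the example above.
qvec = (0.851773, 0.0165051, 0.503764, -0.142941)
tvec = (-0.737434, 1.02973, 3.74354)
center = camera_center(qvec, tvec)
print(center)
```

A quick sanity check is that projecting the center back with ``R * C + T`` yields the zero vector, since the camera center is the origin of the camera coordinate system.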
| Both images in the example above use the same camera model and share intrinsics | |
| (`CAMERA_ID = 1`). The image name is relative to the selected base image folder | |
| of the project. The first image has 3 keypoints and the second image has 2 | |
| keypoints, and the keypoint locations are specified in pixel coordinates. | |
| Both images observe 2 3D points. Note that the last keypoint of the first | |
| image does not observe a 3D point in the reconstruction, because its 3D | |
| point identifier is -1. | |
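The two-lines-per-image layout can be read with a short loop. The sketch below is one possible reader. The name ``parse_images_txt`` and the returned dictionary layout are assumptions of this example. It also assumes that the keypoint line directly follows the pose line and may be empty for an image without keypoints.

```python
def parse_images_txt(text):
    """Parse the contents of an images.txt file into a dict keyed by IMAGE_ID."""
    images = {}
    lines = iter(text.splitlines())
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments before a pose line
        # The image name is the last field and is kept intact.
        fields = stripped.split(None, 9)
        # The line after the pose line holds the (X, Y, POINT3D_ID) triplets.
        values = next(lines, "").split()
        images[int(fields[0])] = {
            "qvec": tuple(float(v) for v in fields[1:5]),
            "tvec": tuple(float(v) for v in fields[5:8]),
            "camera_id": int(fields[8]),
            "name": fields[9],
            "points2D": [
                (float(values[i]), float(values[i + 1]), int(values[i + 2]))
                for i in range(0, len(values), 3)
            ],
        }
    return images


example = """\
# Image list with two lines of data per image:
#   IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME
#   POINTS2D[] as (X, Y, POINT3D_ID)
1 0.851773 0.0165051 0.503764 -0.142941 -0.737434 1.02973 3.74354 1 P1180141.JPG
2362.39 248.498 58396 1784.7 268.254 59027 1784.7 268.254 -1
2 0.851773 0.0165051 0.503764 -0.142941 -0.737434 1.02973 3.74354 1 P1180142.JPG
1190.83 663.957 23056 1258.77 640.354 59070
"""
images = parse_images_txt(example)
# Keypoints with POINT3D_ID == -1 do not observe a 3D point.
observed = [p for p in images[1]["points2D"] if p[2] != -1]
print(len(images[1]["points2D"]), len(observed))
```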
| points3D.txt | |
| ------------ | |
| This file contains the information of all reconstructed 3D points in the | |
| dataset using one line per point, e.g.:: | |
| # 3D point list with one line of data per point: | |
| # POINT3D_ID, X, Y, Z, R, G, B, ERROR, TRACK[] as (IMAGE_ID, POINT2D_IDX) | |
| # Number of points: 3, mean track length: 3.3334 | |
| 63390 1.67241 0.292931 0.609726 115 121 122 1.33927 16 6542 15 7345 6 6714 14 7227 | |
| 63376 2.01848 0.108877 -0.0260841 102 209 250 1.73449 16 6519 15 7322 14 7212 8 3991 | |
| 63371 1.71102 0.28566 0.53475 245 251 249 0.612829 118 4140 117 4473 | |
| Here, there are three reconstructed 3D points, where `POINT2D_IDX` defines the | |
| zero-based index of the keypoint in the `images.txt` file. The error is given in | |
| pixels of reprojection error and is only updated after global bundle adjustment. | |
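A reader for this file only needs to split the fixed fields from the variable-length track. The following is a minimal sketch. The function name ``parse_points3D_txt`` and the dictionary layout are assumptions of this example, not the reference reader's interface.

```python
def parse_points3D_txt(text):
    """Parse the contents of a points3D.txt file into a dict keyed by POINT3D_ID."""
    points = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        f = line.split()
        points[int(f[0])] = {
            "xyz": tuple(float(v) for v in f[1:4]),
            "rgb": tuple(int(v) for v in f[4:7]),
            "error": float(f[7]),
            # The track is a flat list of (IMAGE_ID, POINT2D_IDX) pairs.
            "track": [(int(f[i]), int(f[i + 1])) for i in range(8, len(f), 2)],
        }
    return points


example = """\
# 3D point list with one line of data per point:
#   POINT3D_ID, X, Y, Z, R, G, B, ERROR, TRACK[] as (IMAGE_ID, POINT2D_IDX)
63390 1.67241 0.292931 0.609726 115 121 122 1.33927 16 6542 15 7345 6 6714 14 7227
63376 2.01848 0.108877 -0.0260841 102 209 250 1.73449 16 6519 15 7322 14 7212 8 3991
63371 1.71102 0.28566 0.53475 245 251 249 0.612829 118 4140 117 4473
"""
points = parse_points3D_txt(example)
track_lengths = [len(p["track"]) for p in points.values()]
print(track_lengths)
```

Each ``POINT2D_IDX`` in a track can then be used to index the keypoint list of the image with the matching ``IMAGE_ID``.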
| ==================== | |
| Dense Reconstruction | |
| ==================== | |
| COLMAP uses the following workspace folder structure:: | |
| +── images | |
| │   +── image1.jpg | |
| │   +── image2.jpg | |
| │   +── ... | |
| +── sparse | |
| │   +── cameras.txt | |
| │   +── images.txt | |
| │   +── points3D.txt | |
| +── stereo | |
| │   +── consistency_graphs | |
| │   │   +── image1.jpg.photometric.bin | |
| │   │   +── image2.jpg.photometric.bin | |
| │   │   +── ... | |
| │   +── depth_maps | |
| │   │   +── image1.jpg.photometric.bin | |
| │   │   +── image2.jpg.photometric.bin | |
| │   │   +── ... | |
| │   +── normal_maps | |
| │   │   +── image1.jpg.photometric.bin | |
| │   │   +── image2.jpg.photometric.bin | |
| │   │   +── ... | |
| │   +── patch-match.cfg | |
| │   +── fusion.cfg | |
| +── fused.ply | |
| +── meshed-poisson.ply | |
| +── meshed-delaunay.ply | |
| +── run-colmap-geometric.sh | |
| +── run-colmap-photometric.sh | |
| Here, the `images` folder contains the undistorted images, the `sparse` folder | |
| contains the sparse reconstruction with undistorted cameras, the `stereo` folder | |
| contains the stereo reconstruction results, `fused.ply` is the result of the | |
| fusion procedure, `meshed-poisson.ply` and `meshed-delaunay.ply` are the results | |
| of the meshing procedure, and `run-colmap-geometric.sh` and | |
| `run-colmap-photometric.sh` contain example command-line usage to perform | |
| the dense reconstruction. | |
| --------------------- | |
| Depth and Normal Maps | |
| --------------------- | |
| The depth and normal maps are stored as mixed text and binary files. The text | |
| header defines the dimensions of the image in the format | |
| ``width&height&channels&`` followed by row-major `float32` binary data. For | |
| depth maps ``channels=1`` and for normal maps ``channels=3``. The depth and | |
| normal maps can be conveniently read with Python using the functions in | |
| ``scripts/python/read_dense.py`` and with Matlab using the functions in | |
| ``scripts/matlab/read_depth_map.m`` and ``scripts/matlab/read_normal_map.m``. | |
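To make the mixed text and binary layout concrete, the sketch below parses the header and the ``float32`` payload using only the Python standard library. The function name ``read_dense_array`` and the in-memory sample are assumptions of this example. Only a single-channel depth map is reshaped here, and the arrangement of multi-channel data is left to the reference scripts mentioned above.

```python
import io
import struct


def read_dense_array(stream):
    """Read a 'width&height&channels&' header followed by float32 values."""
    header = b""
    num_delimiters = 0
    # Consume bytes up to and including the third '&' delimiter.
    while num_delimiters < 3:
        byte = stream.read(1)
        if not byte:
            raise ValueError("unexpected end of data in header")
        if byte == b"&":
            num_delimiters += 1
        header += byte
    width, height, channels = (int(v) for v in header.split(b"&")[:3])
    count = width * height * channels
    data = stream.read(4 * count)
    if len(data) != 4 * count:
        raise ValueError("binary payload is shorter than the header implies")
    # "<" selects little-endian, "f" is a 32-bit float.
    values = struct.unpack("<%df" % count, data)
    return width, height, channels, values


# Hypothetical 3x2 depth map built in memory instead of read from disk.
sample = b"3&2&1&" + struct.pack("<6f", 1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
width, height, channels, values = read_dense_array(io.BytesIO(sample))
# For a single channel, row-major data splits into rows of `width` values.
rows = [values[r * width:(r + 1) * width] for r in range(height)]
print(rows)
```

For a file on disk, the same function can be applied to a file object opened in binary mode (``open(path, "rb")``).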
| ------------------ | |
| Consistency Graphs | |
| ------------------ | |
| The consistency graph defines, for all pixels in an image, the source images a | |
| pixel is consistent with. The graph is stored as a mixed text and binary file, | |
| where the text part is equivalent to that of the depth and normal maps and the | |
| binary part is a continuous list of `int32` values in the format | |
| ``<row><col><N><image_idx1>...<image_idxN>``. Here, ``(row, col)`` defines the | |
| location of the pixel in the image, followed by a list of ``N`` image indices. | |
| The indices are specified w.r.t. the ordering in the ``images.txt`` file. | |
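The variable-length records can be decoded by walking the ``int32`` list and reading ``N`` before each group of image indices. The sketch below follows the record layout exactly as described above. The function name ``read_consistency_graph``, the returned dictionary, and the in-memory sample (including its header values) are assumptions of this example.

```python
import io
import struct


def read_consistency_graph(stream):
    """Read the text header, then decode <row><col><N><idx...> int32 records."""
    header = b""
    # Consume the 'width&height&channels&' text header.
    while header.count(b"&") < 3:
        byte = stream.read(1)
        if not byte:
            raise ValueError("unexpected end of data in header")
        header += byte
    width, height, _ = (int(v) for v in header.split(b"&")[:3])

    payload = stream.read()
    count = len(payload) // 4
    ints = struct.unpack("<%di" % count, payload[:4 * count])

    graph = {}
    pos = 0
    while pos + 3 <= len(ints):
        row, col, n = ints[pos:pos + 3]
        # The next n values are the indices of the consistent source images.
        graph[(row, col)] = list(ints[pos + 3:pos + 3 + n])
        pos += 3 + n
    return width, height, graph


# Hypothetical sample: pixel (0, 1) is consistent with images 5 and 7,
# pixel (2, 3) with image 4.
sample = b"4&3&1&" + struct.pack("<9i", 0, 1, 2, 5, 7, 2, 3, 1, 4)
width, height, graph = read_consistency_graph(io.BytesIO(sample))
print(graph)
```

Pixels without an entry in the returned dictionary have no consistent source images in this representation.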