update readme
README.md
CHANGED

---
license: apache-2.0
---

<div align="center">
<h1>AnyCalib:<br>
On-Manifold Learning for Model-Agnostic Single-View Camera Calibration</h1>
<p>Javier Tirado-Garín    Javier Civera<br>
I3A, University of Zaragoza</p>
<img width="99%" src="https://github.com/javrtg/AnyCalib/blob/main/assets/method_dark.png?raw=true">
<p><strong>Camera calibration from a single perspective/edited/distorted image using a freely chosen camera model</strong></p>

[](https://github.com/javrtg/AnyCalib)
[](https://arxiv.org/abs/2503.12701)

</div>

## Usage (pretrained models)

The only requirements are Python (≥3.10) and PyTorch.
The project, in development mode, can be installed with:
```shell
git clone https://github.com/javrtg/AnyCalib.git && cd AnyCalib
pip install -e .
```
Optionally, a compatible version of [`xformers`](https://github.com/facebookresearch/xformers) can be installed for better efficiency by running the following instead of `pip install -e .`:
```shell
pip install -e .[eff]
```


### Minimal usage example
```python
import numpy as np
import torch
from PIL import Image  # the library of choice to load images

from anycalib import AnyCalib


dev = torch.device("cuda")

# load input image and convert it to a (3, H, W) tensor with RGB values in [0, 1]
image = np.array(Image.open("path/to/image.jpg").convert("RGB"))
image = torch.tensor(image, dtype=torch.float32, device=dev).permute(2, 0, 1) / 255

# instantiate AnyCalib according to the desired model_id. Options:
# "anycalib_pinhole": model trained with *only* perspective (pinhole) images,
# "anycalib_gen": trained with perspective, distorted and strongly distorted images,
# "anycalib_dist": trained with distorted and strongly distorted images,
# "anycalib_edit": trained on edited (stretched and cropped) perspective images.
model = AnyCalib(model_id="anycalib_pinhole").to(dev)

# Alternatively, the weights can be loaded from the huggingface hub as follows:
# NOTE: huggingface_hub (https://pypi.org/project/huggingface-hub/) needs to be installed
# model = AnyCalib().from_pretrained(model_id=<model_id>).to(dev)

# predict according to the desired camera model. Implemented camera models are detailed further below.
output = model.predict(image, cam_id="pinhole")
# output is a dictionary with the following key-value pairs:
# {
#   "intrinsics": (D,) tensor with the estimated intrinsics for the selected camera model,
#   "fov_field": (N, 2) tensor with the FoV field regressed by the network. N≈320^2 (resolution close to the one seen during training),
#   "tangent_coords": alias for "fov_field",
#   "rays": (N, 3) tensor with the corresponding (via the exponential map) ray directions in the camera frame (x right, y down, z forward),
#   "pred_size": (H, W) tuple with the image size used by the network. It can be used e.g. for resizing the FoV/ray fields to the original image size.
# }
```
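As a reference, the dense outputs can be mapped back to the input resolution with standard PyTorch ops. The sketch below is illustrative only: it assumes the `N` predictions are laid out row-major over the `pred_size` grid and reuses `image` and `output` from the example above:
```python
import torch.nn.functional as F

# resolution at which the network produced its N = h * w predictions
h, w = output["pred_size"]
rays = output["rays"].reshape(h, w, 3)  # assumes a row-major layout of the N rays

# resample to the original image resolution (H, W)
H, W = image.shape[-2:]
rays_full = F.interpolate(
    rays.permute(2, 0, 1)[None],  # (1, 3, h, w)
    size=(H, W),
    mode="bilinear",
    align_corners=False,
)[0].permute(1, 2, 0)  # (H, W, 3)
rays_full = F.normalize(rays_full, dim=-1)  # restore unit length after interpolation
```
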
The weights of the selected `model_id`, if not already present, will be automatically downloaded to the:
* torch hub cache directory (`torch.hub.get_dir()`) if `AnyCalib(model_id=<model_id>)` is used, or
* huggingface cache directory if `AnyCalib().from_pretrained(model_id=<model_id>)` is used.

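If the checkpoints should live in a specific location, the torch hub cache can be inspected or redirected before instantiating the model (the path below is just an example):
```python
import torch

print(torch.hub.get_dir())  # where AnyCalib(model_id=...) will place the weights
torch.hub.set_dir("/path/to/my/checkpoints")  # example: redirect the torch hub cache
```
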
Additional configuration options are indicated in the docstring of `AnyCalib`:
<details>
<summary> <code>help(AnyCalib)</code> </summary>

```python
"""AnyCalib class.

Args for instantiation:
    model_id: one of {'anycalib_pinhole', 'anycalib_gen', 'anycalib_dist', 'anycalib_edit'}.
        Each model differs in the type of images seen during training:
        * 'anycalib_pinhole': Perspective (pinhole) images,
        * 'anycalib_gen': General images, including perspective, distorted and
            strongly distorted images,
        * 'anycalib_dist': Distorted images, using the Brown-Conrady camera model,
            and strongly distorted images, using the EUCM camera model,
        * 'anycalib_edit': Edited (stretched and cropped) perspective images.
        Default: 'anycalib_pinhole'.
    nonlin_opt_method: nonlinear optimization method: 'gauss_newton' or 'lev_mar'.
        Default: 'gauss_newton'.
    nonlin_opt_conf: nonlinear optimization configuration.
        This config can be used to control the number of iterations and the space
        where the residuals are minimized. See the classes `GaussNewtonCalib` or
        `LevMarCalib` under anycalib/optim for details. Default: None.
    init_with_sac: use RANSAC instead of the nonminimal fit for initializing the
        intrinsics. Default: False.
    fallback_to_sac: use RANSAC if the nonminimal fit fails. Default: True.
    ransac_conf: RANSAC configuration. This config can be used to control e.g. the
        inlier threshold or the number of minimal samples to try. See the class
        `RANSAC` in anycalib/ransac.py for details. Default: None.
    rm_borders: border size of the dense FoV fields to ignore during fitting.
        Default: 0.
    sample_size: approximate number of 2D-3D correspondences to use for fitting the
        intrinsics. Negative value -> no subsampling. Default: -1.
"""
```
</details>
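As an illustration (the values are arbitrary examples, not tuned recommendations), the fitting behaviour can be configured at construction time via the arguments documented above:
```python
model = AnyCalib(
    model_id="anycalib_gen",      # general model: perspective + distorted images
    nonlin_opt_method="lev_mar",  # Levenberg-Marquardt instead of Gauss-Newton
    init_with_sac=False,          # initialize with the nonminimal fit...
    fallback_to_sac=True,         # ...but fall back to RANSAC if it fails
    rm_borders=8,                 # ignore an 8-pixel border of the dense FoV field
    sample_size=50_000,           # subsample ~50k 2D-3D correspondences for fitting
).to(dev)
```
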

### Minimal batched example
AnyCalib can also run on batched images, optionally with a different camera model for each image. For example:
```python
images = ...  # (B, 3, H, W)
# NOTE: if cam_ids is a list, then len(cam_ids) must be equal to B
cam_ids = ["pinhole", "radial:1", "kb:4"]  # different camera models for each image
cam_ids = "pinhole"  # same camera model across images
output = model.predict(images, cam_id=cam_ids)
# corresponding batched output dictionary:
# {
#   "intrinsics": List[(D_i,) tensors] for each camera model "i",
#   "fov_field": (B, N, 2) tensor,
#   "tangent_coords": alias for "fov_field",
#   "rays": (B, N, 3) tensor,
#   "pred_size": (H, W).
# }
```
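For illustration, a batch can be built by stacking equally-sized images prepared as in the minimal example; the tensor names below are placeholders:
```python
import torch

# image_a, image_b, image_c: hypothetical (3, H, W) tensors with the same H and W,
# prepared exactly as `image` in the minimal example
images = torch.stack([image_a, image_b, image_c])  # (3, 3, H, W)
output = model.predict(images, cam_id=["pinhole", "radial:1", "kb:4"])
for intrins in output["intrinsics"]:  # one (D_i,) tensor per image / camera model
    print(intrins)
```
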

### Currently implemented camera models
* `cam_id` represents the camera model identifier(s) that can be used in the `predict` method. <br>
* `D` corresponds to the number of intrinsics of the camera model. It determines the length of each `intrinsics` tensor in the output dictionary.

| `cam_id` | Description | `D` | Intrinsics |
|:--|:--|:-:|:--|
| `pinhole` | Pinhole camera model | 4 | $f_x,~f_y,~c_x,~c_y$ |
| `simple_pinhole` | `pinhole` with one focal length | 3 | $f,~c_x,~c_y$ |
| `radial:k` | Radial (Brown-Conrady) [[1]](#1) camera model with `k` $\in$ [1, 4] distortion coefficients | 4+`k` | $f_x,~f_y,~c_x,~c_y$ <br> $k_1[,~k_2[,~k_3[,~k_4]]]$ |
| `simple_radial:k` | `radial:k` with one focal length | 3+`k` | $f,~c_x,~c_y$ <br> $k_1[,~k_2[,~k_3[,~k_4]]]$ |
| `kb:k` | Kannala-Brandt [[2]](#2) camera model with `k` $\in$ [1, 4] distortion coefficients | 4+`k` | $f_x,~f_y,~c_x,~c_y$ <br> $k_1[,~k_2[,~k_3[,~k_4]]]$ |
| `simple_kb:k` | `kb:k` with one focal length | 3+`k` | $f,~c_x,~c_y$ <br> $k_1[,~k_2[,~k_3[,~k_4]]]$ |
| `ucm` | Unified Camera Model [[3]](#3) | 5 | $f_x,~f_y,~c_x,~c_y$ <br> $k$ |
| `simple_ucm` | `ucm` with one focal length | 4 | $f,~c_x,~c_y$ <br> $k$ |
| `eucm` | Enhanced Unified Camera Model [[4]](#4) | 6 | $f_x,~f_y,~c_x,~c_y$ <br> $k_1,~k_2$ |
| `simple_eucm` | `eucm` with one focal length | 5 | $f,~c_x,~c_y$ <br> $k_1,~k_2$ |
| `division:k` | Division camera model [[5]](#5) with `k` $\in$ [1, 4] distortion coefficients | 4+`k` | $f_x,~f_y,~c_x,~c_y$ <br> $k_1[,~k_2[,~k_3[,~k_4]]]$ |
| `simple_division:k` | `division:k` with one focal length | 3+`k` | $f,~c_x,~c_y$ <br> $k_1[,~k_2[,~k_3[,~k_4]]]$ |

In addition to the original works, we recommend the works of Usenko et al. [[6]](#6) and Lochman et al. [[7]](#7) for a comprehensive comparison of the different camera models.

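As a quick sanity check of the intrinsics ordering listed above (shown here for `pinhole`, with a made-up 3D point and reusing `output` and `dev` from the minimal example):
```python
# intrinsics order for "pinhole": fx, fy, cx, cy (see table above)
fx, fy, cx, cy = output["intrinsics"]  # assumes predict(..., cam_id="pinhole") was used
X = torch.tensor([0.1, -0.2, 2.0], device=dev)  # hypothetical point in the camera frame (x right, y down, z forward)
u = fx * X[0] / X[2] + cx  # pixel column
v = fy * X[1] / X[2] + cy  # pixel row
```
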

## Evaluation
The evaluation and training code is built upon the [`siclib`](siclib) library from [GeoCalib](https://github.com/cvg/GeoCalib), which can be installed as:
```shell
pip install -e siclib
```
Running the evaluation commands will write the results to `outputs/results/`.

### LaMAR
Running the evaluation commands will download the dataset to `data/lamar2k`, which will take around 400 MB of disk space.

AnyCalib trained on $\mathrm{OP_{p}}$:
```shell
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
```
AnyCalib trained on $\mathrm{OP_{g}}$:
```shell
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
```

### MegaDepth (pinhole)
Running the evaluation commands will download the dataset to `data/megadepth2k`, which will take around 2 GB of disk space.

AnyCalib trained on $\mathrm{OP_{p}}$:
```shell
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
```
AnyCalib trained on $\mathrm{OP_{g}}$:
```shell
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
```

### TartanAir
Running the evaluation commands will download the dataset to `data/tartanair`, which will take around 1.7 GB of disk space.

AnyCalib trained on $\mathrm{OP_{p}}$:
```shell
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
```
AnyCalib trained on $\mathrm{OP_{g}}$:
```shell
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
```

### Stanford2D3D
Running the evaluation commands will download the dataset to `data/stanford2d3d`, which will take around 844 MB of disk space.

AnyCalib trained on $\mathrm{OP_{p}}$:
```shell
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
```
AnyCalib trained on $\mathrm{OP_{g}}$:
```shell
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
```

### MegaDepth (radial)
Running the evaluation commands will download the dataset to `data/megadepth2k-radial`, which will take around 1.4 GB of disk space.

AnyCalib trained on $\mathrm{OP_{g}}$:
```shell
python -m siclib.eval.megadepth2k_radial_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
```

### Mono
Running the evaluation commands will download the dataset to `data/monovo2k`, which will take around 445 MB of disk space.

AnyCalib trained on $\mathrm{OP_{d}}$:
```shell
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist data.cam_id=ucm
```
AnyCalib trained on $\mathrm{OP_{g}}$:
```shell
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen data.cam_id=ucm
```

### ScanNet++
To comply with the ScanNet++ license, we cannot directly share its data.
Please download the ScanNet++ dataset following the [official instructions](https://kaldir.vc.in.tum.de/scannetpp/#:~:text=the%20data%20now.-,Download%20the%20data,-To%20download%20the) and indicate the path to the root of the dataset in the following evaluation command. <br>
The path needs to be provided only the first time the evaluation is run. On this first run, the command will automatically copy the evaluation images to `data/scannetpp2k`, which will take around 760 MB of disk space.

AnyCalib trained on $\mathrm{OP_{d}}$:
```shell
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist scannetpp_root=<path_to_scannetpp>
```
AnyCalib trained on $\mathrm{OP_{g}}$:
```shell
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen scannetpp_root=<path_to_scannetpp>
```

### LaMAR (edited)
Running the evaluation commands will download the dataset to `data/lamar2k_edit`, which will take around 224 MB of disk space.

AnyCalib trained following the WildCam [[8]](#8) training protocol:
```shell
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
```

### TartanAir (edited)
Running the evaluation commands will download the dataset to `data/tartanair_edit`, which will take around 488 MB of disk space.

AnyCalib trained following the WildCam [[8]](#8) training protocol:
```shell
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
```

### Stanford2D3D (edited)
Running the evaluation commands will download the dataset to `data/stanford2d3d_edit`, which will take around 420 MB of disk space.

AnyCalib trained on $\mathrm{OP_{p}}$, following the WildCam [[8]](#8) training protocol:
```shell
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
```

## Extended OpenPano Dataset
We extend the OpenPano dataset from [GeoCalib](https://github.com/cvg/GeoCalib?tab=readme-ov-file#openpano-dataset) with panoramas that do not need to be aligned with the gravity direction. This extended version consists of tonemapped panoramas from [The Laval Photometric Indoor HDR Dataset](http://hdrdb.com/indoor-hdr-photometric/), [PolyHaven](https://polyhaven.com/hdris), [HDRMaps](https://hdrmaps.com/freebies/free-hdris/), [AmbientCG](https://ambientcg.com/list?type=hdri&sort=popular) and [BlenderKit](https://www.blenderkit.com/asset-gallery?query=category_subtree:hdr).

Before sampling images from the panoramas, first download the Laval dataset following the instructions on the [corresponding project page](http://hdrdb.com/indoor-hdr-photometric/#:~:text=HDR%20Dataset.-,Download,-To%20obtain%20the) and place the panoramas in `data/indoorDatasetCalibrated`. Then, tonemap the HDR images using the following command:
```shell
python -m siclib.datasets.utils.tonemapping --hdr_dir data/indoorDatasetCalibrated --out_dir data/laval-tonemap
```

To download the rest of the panoramas and organize all of them in their corresponding splits `data/openpano_v2/panoramas/{split}`, execute:
```shell
python -m siclib.datasets.utils.download_openpano --name openpano_v2 --laval_dir data/laval-tonemap
```
Alternatively, the panoramas from PolyHaven, HDRMaps, AmbientCG and BlenderKit can be manually downloaded from [here](https://drive.google.com/drive/folders/1HSXKNrleJKas4cRLd1C8SqR9J1nU1-Z_?usp=sharing).

Afterwards, the different training datasets mentioned in the paper, $\mathrm{OP_{p}}$, $\mathrm{OP_{g}}$, $\mathrm{OP_{r}}$ and $\mathrm{OP_{d}}$, can be created by running the following commands. We recommend running them with the flag `device=cuda` as this significantly speeds up the creation of the datasets, but if no GPU is available, the flag can be omitted.

$\mathrm{OP_{p}}$ (will be stored under `data/openpano_v2/openpano_v2`):
```shell
python -m siclib.datasets.create_dataset_from_pano --config-name openpano_v2 device=cuda
```
$\mathrm{OP_{g}}$ (will be stored under `data/openpano_v2/openpano_v2_gen`):
```shell
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_gen device=cuda
```
$\mathrm{OP_{r}}$ (will be stored under `data/openpano_v2/openpano_v2_radial`):
```shell
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_radial device=cuda
```
$\mathrm{OP_{d}}$ (will be stored under `data/openpano_v2/openpano_v2_dist`):
```shell
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_dist device=cuda
```

## Training
As with the evaluation, the training code is built upon the [`siclib`](siclib) library from [GeoCalib](https://github.com/cvg/GeoCalib). Here we adapt their instructions to AnyCalib. `siclib` can be installed by executing:
```shell
pip install -e siclib
```
Once at least one variant of the [extended OpenPano Dataset](#Extended-OpenPano-Dataset) (`openpano_v2`) has been downloaded and prepared, we can train AnyCalib on it.

For training with $\mathrm{OP_{p}}$ (default):
```shell
python -m siclib.train anycalib_op_p --conf anycalib --distributed
```
Feel free to use any other experiment name. By default, the checkpoints will be written to `outputs/training/`. The default batch size is 24, which requires at least one NVIDIA Tesla V100 GPU with 32 GB of VRAM. If only one GPU is used, the flag `--distributed` can be omitted. Configurations are managed by [Hydra](https://hydra.cc/) and can be overwritten from the command line.

For example, for training with $\mathrm{OP_{g}}$:
```shell
python -m siclib.train anycalib_op_g --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_gen'
```

For training with $\mathrm{OP_{d}}$:
```shell
python -m siclib.train anycalib_op_d --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_dist'
```

For training with $\mathrm{OP_{r}}$:
```shell
python -m siclib.train anycalib_op_r --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_radial'
```

For training with $\mathrm{OP_{p}}$ on edited (stretched and cropped) images, following the training protocol of WildCam [[8]](#8):
```shell
python -m siclib.train anycalib_op_e --conf anycalib --distributed \
    data.dataset_dir='data/openpano_v2/openpano_v2' \
    data.im_geom_transform.change_pixel_ar=true \
    data.im_geom_transform.crop=0.5
```

After training, the model can be evaluated using its experiment name:
```shell
python -m siclib.eval.<benchmark> --checkpoint <experiment_name> --tag <experiment_tag> --conf anycalib
```

## Acknowledgements
Thanks to the authors of [GeoCalib](https://github.com/cvg/GeoCalib) for open-sourcing the comprehensive and easy-to-use [`siclib`](https://github.com/cvg/GeoCalib/tree/main/siclib), which we use as the base of our evaluation and training code. <br>
Thanks to the authors of [The Laval Photometric Indoor HDR Dataset](http://hdrdb.com/indoor-hdr-photometric/) for allowing us to release the weights of AnyCalib under a permissive license. <br>
Thanks also to the authors of [The Laval Photometric Indoor HDR Dataset](http://hdrdb.com/indoor-hdr-photometric/), [PolyHaven](https://polyhaven.com/hdris), [HDRMaps](https://hdrmaps.com/freebies/free-hdris/), [AmbientCG](https://ambientcg.com/list?type=hdri&sort=popular) and [BlenderKit](https://www.blenderkit.com/asset-gallery?query=category_subtree:hdr) for providing high-quality, freely available panoramas that made the training of AnyCalib possible.

## BibTeX citation
If you use any ideas from the paper or code from this repo, please consider citing:
```bibtex
@InProceedings{tirado2025anycalib,
    author={Javier Tirado-Gar{\'\i}n and Javier Civera},
    title={{AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration}},
    booktitle={ICCV},
    year={2025}
}
```

## License
Code and weights are provided under the [Apache 2.0 license](LICENSE).


## References
<a id="1">[1]</a>
Close-Range Camera Calibration. D.C. Brown, 1971.

<a id="2">[2]</a>
A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses. J. Kannala and S.S. Brandt, TPAMI, 2006.

<a id="3">[3]</a>
Single View Point Omnidirectional Camera Calibration from Planar Grids. C. Mei and P. Rives, ICRA, 2007.

<a id="4">[4]</a>
An Enhanced Unified Camera Model. B. Khomutenko et al., IEEE RA-L, 2016.

<a id="5">[5]</a>
Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion. A.W. Fitzgibbon, CVPR, 2001.

<a id="6">[6]</a>
The Double Sphere Camera Model. V. Usenko et al., 3DV, 2018.

<a id="7">[7]</a>
BabelCalib: A Universal Approach to Calibrating Central Cameras. Y. Lochman et al., ICCV, 2021.

<a id="8">[8]</a>
Tame a Wild Camera: In-the-Wild Monocular Camera Calibration. S. Zhu et al., NeurIPS, 2023.