AnyCalib:
On-Manifold Learning for Model-Agnostic Single-View Camera Calibration
Javier Tirado-Garín Javier Civera
I3A, University of Zaragoza
Camera calibration from a single perspective/edited/distorted image using a freely chosen camera model
Usage (pretrained models)
The only requirements are Python (≥3.10) and PyTorch. The project can be installed in development mode with:
git clone https://github.com/javrtg/AnyCalib.git && cd AnyCalib
pip install -e .
Optionally, a compatible version of xformers can be installed for better efficiency by running the following instead of pip install -e .:
pip install -e .[eff]
Minimal usage example
import numpy as np
import torch
from PIL import Image # the library of choice to load images
from anycalib import AnyCalib
dev = torch.device("cuda")
# load input image and convert it to a (3, H, W) tensor with RGB values in [0, 1]
image = np.array(Image.open("path/to/image.jpg").convert("RGB"))
image = torch.tensor(image, dtype=torch.float32, device=dev).permute(2, 0, 1) / 255
# instantiate AnyCalib according to the desired model_id. Options:
# "anycalib_pinhole": trained with *only* perspective (pinhole) images,
# "anycalib_gen": trained with perspective, distorted and strongly distorted images,
# "anycalib_dist": trained with distorted and strongly distorted images,
# "anycalib_edit": trained with edited (stretched and cropped) perspective images.
model = AnyCalib(model_id="anycalib_pinhole").to(dev)
# Alternatively, the weights can be loaded from the huggingface hub as follows:
# NOTE: huggingface_hub (https://pypi.org/project/huggingface-hub/) needs to be installed
# model = AnyCalib().from_pretrained(model_id=<model_id>).to(dev)
# predict according to the desired camera model. Implemented camera models are detailed further below.
output = model.predict(image, cam_id="pinhole")
# output is a dictionary with the following key-value pairs:
# {
# "intrinsics": (D,) tensor with the estimated intrinsics for the selected camera model,
# "fov_field": (N, 2) tensor with the regressed FoV field by the network. N≈320^2 (resolution close to the one seen during training),
# "tangent_coords": alias for "fov_field",
# "rays": (N, 3) tensor with the corresponding (via the exponential map) ray directions in the camera frame (x right, y down, z forward),
# "pred_size": (H, W) tuple with the image size used by the network. It can be used e.g. for resizing the FoV/ray fields to the original image size.
# }
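For example, "pred_size" can be used to resize the ray field back to the original image resolution. Below is a minimal sketch, assuming the (N, 3) rays are ordered row-major at the pred_size resolution; this post-processing is an illustration, not part of the AnyCalib API:
import torch.nn.functional as F
H_orig, W_orig = image.shape[-2:]  # original image size
h, w = output["pred_size"]         # size used by the network
rays = output["rays"].reshape(1, h, w, 3).permute(0, 3, 1, 2)  # (1, 3, h, w)
rays_up = F.interpolate(rays, size=(H_orig, W_orig), mode="bilinear", align_corners=False)
rays_up = F.normalize(rays_up, dim=1)                 # re-normalize to unit ray directions
rays_up = rays_up.permute(0, 2, 3, 1).reshape(-1, 3)  # back to (H_orig*W_orig, 3)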
The weights of the selected `model_id`, if not already downloaded, will be automatically downloaded to:
- the torch hub cache directory (`torch.hub.get_dir()`) if `AnyCalib(model_id=<model_id>)` is used, or
- the huggingface cache directory if `AnyCalib().from_pretrained(model_id=<model_id>)` is used.
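For reference, the torch hub cache location can be checked with:
import torch
print(torch.hub.get_dir())  # defaults to ~/.cache/torch/hub unless TORCH_HOME is set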
Additional configuration options are indicated in the docstring of AnyCalib:
help(AnyCalib)
"""AnyCalib class.
Args for instantiation:
model_id: one of {'anycalib_pinhole', 'anycalib_gen', 'anycalib_dist', 'anycalib_edit'}.
Each model differs in the type of images it has seen during training:
* 'anycalib_pinhole': Perspective (pinhole) images,
* 'anycalib_gen': General images, including perspective, distorted and
strongly distorted images,
* 'anycalib_dist': Distorted images, using the Brown-Conrady camera model,
and strongly distorted images, using the EUCM camera model, and
* 'anycalib_edit': Edited (stretched and cropped) perspective
images.
Default: 'anycalib_pinhole'.
nonlin_opt_method: nonlinear optimization method: 'gauss_newton' or 'lev_mar'.
Default: 'gauss_newton'
nonlin_opt_conf: nonlinear optimization configuration.
This config can be used to control the number of iterations and the space
where the residuals are minimized. See the classes `GaussNewtonCalib` or
`LevMarCalib` under anycalib/optim for details. Default: None.
init_with_sac: use RANSAC instead of nonminimal fit for initializing the
intrinsics. Default: False.
fallback_to_sac: use RANSAC if nonminimal fit fails. Default: True.
ransac_conf: RANSAC configuration. This config can be used to control e.g. the
inlier threshold or the number of minimal samples to try. See the class
`RANSAC` in anycalib/ransac.py for details. Default: None.
rm_borders: border size of the dense FoV fields to ignore during fitting.
Default: 0.
sample_size: approximate number of 2D-3D correspondences to use for fitting the
intrinsics. Negative value -> no subsampling. Default: -1.
"""
Minimal batched example
AnyCalib can also be run on batched images, optionally using a different camera model for each image. For example:
images = ...  # (B, 3, H, W)
# NOTE: if cam_ids is a list, then len(cam_ids) must be equal to B
cam_ids = ["pinhole", "radial:1", "kb:4"]  # different camera model for each image
# cam_ids = "pinhole"  # alternatively, the same camera model across all images
output = model.predict(images, cam_id=cam_ids)
# corresponding batched output dictionary:
# {
# "intrinsics": List[(D_i,) tensors] for each camera model "i",
# "fov_field": (B, N, 2) tensor,
# "tangent_coords": alias for "fov_field",
# "rays": (B, N, 3) tensor,
# "pred_size": (H, W).
# }
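Since each camera model can have a different number of intrinsics, the batched intrinsics are returned as a list of tensors, e.g.:
for cam_id_i, intrins_i in zip(["pinhole", "radial:1", "kb:4"], output["intrinsics"]):
    print(cam_id_i, intrins_i.shape)  # (4,), (5,) and (8,) respectively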
Currently implemented camera models
`cam_id` represents the camera model identifier(s) that can be used in the `predict` method.
`D` corresponds to the number of intrinsics of the camera model. It determines the length of each `intrinsics` tensor in the output dictionary.

| `cam_id` | Description | D | Intrinsics |
|---|---|---|---|
| `pinhole` | Pinhole camera model | 4 | $f_x,~f_y,~c_x,~c_y$ |
| `simple_pinhole` | `pinhole` with one focal length | 3 | $f,~c_x,~c_y$ |
| `radial:k` | Radial (Brown-Conrady) [1] camera model with k $\in$ [1, 4] distortion coefficients | 4+k | $f_x,~f_y,~c_x,~c_y,~k_1[,~\ldots,~k_4]$ |
| `simple_radial:k` | `radial:k` with one focal length | 3+k | $f,~c_x,~c_y,~k_1[,~\ldots,~k_4]$ |
| `kb:k` | Kannala-Brandt [2] camera model with k $\in$ [1, 4] distortion coefficients | 4+k | $f_x,~f_y,~c_x,~c_y,~k_1[,~\ldots,~k_4]$ |
| `simple_kb:k` | `kb:k` with one focal length | 3+k | $f,~c_x,~c_y,~k_1[,~\ldots,~k_4]$ |
| `ucm` | Unified Camera Model [3] | 5 | $f_x,~f_y,~c_x,~c_y,~k$ |
| `simple_ucm` | `ucm` with one focal length | 4 | $f,~c_x,~c_y,~k$ |
| `eucm` | Enhanced Unified Camera Model [4] | 6 | $f_x,~f_y,~c_x,~c_y,~k_1,~k_2$ |
| `simple_eucm` | `eucm` with one focal length | 5 | $f,~c_x,~c_y,~k_1,~k_2$ |
| `division:k` | Division camera model [5] with k $\in$ [1, 4] distortion coefficients | 4+k | $f_x,~f_y,~c_x,~c_y,~k_1[,~\ldots,~k_4]$ |
| `simple_division:k` | `division:k` with one focal length | 3+k | $f,~c_x,~c_y,~k_1[,~\ldots,~k_4]$ |
In addition to the original works, we recommend the works of Usenko et al. [6] and Lochman et al. [7] for a comprehensive comparison of the different camera models.
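As a small usage sketch (a hypothetical helper, assuming the intrinsics ordering shown in the table above), the "pinhole" intrinsics vector can be turned into a 3x3 calibration matrix:
import torch

def pinhole_K(intrinsics: torch.Tensor) -> torch.Tensor:
    """Build a 3x3 calibration matrix from a (4,) tensor (f_x, f_y, c_x, c_y)."""
    fx, fy, cx, cy = intrinsics
    K = torch.eye(3, dtype=intrinsics.dtype, device=intrinsics.device)
    K[0, 0], K[1, 1], K[0, 2], K[1, 2] = fx, fy, cx, cy
    return K

# e.g. K = pinhole_K(output["intrinsics"]) after model.predict(image, cam_id="pinhole")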
Evaluation
The evaluation and training code is built upon the siclib library from GeoCalib, which can be installed as:
pip install -e siclib
Running the evaluation commands will write the results to outputs/results/.
LaMAR
Running the evaluation commands will download the dataset to data/lamar2k which will take around 400 MB of disk space.
AnyCalib trained on $\mathrm{OP_{p}}$:
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on $\mathrm{OP_{g}}$:
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
MegaDepth (pinhole)
Running the evaluation commands will download the dataset to data/megadepth2k which will take around 2 GB of disk space.
AnyCalib trained on $\mathrm{OP_{p}}$:
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on $\mathrm{OP_{g}}$:
python -m siclib.eval.megadepth2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
TartanAir
Running the evaluation commands will download the dataset to data/tartanair which will take around 1.7 GB of disk space.
AnyCalib trained on $\mathrm{OP_{p}}$:
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on $\mathrm{OP_{g}}$:
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Stanford2D3D
Running the evaluation commands will download the dataset to data/stanford2d3d which will take around 844 MB of disk space.
AnyCalib trained on $\mathrm{OP_{p}}$:
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_p --overwrite
AnyCalib trained on $\mathrm{OP_{g}}$:
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
MegaDepth (radial)
Running the evaluation commands will download the dataset to data/megadepth2k-radial which will take around 1.4 GB of disk space.
AnyCalib trained on $\mathrm{OP_{g}}$:
python -m siclib.eval.megadepth2k_radial_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen
Mono
Running the evaluation commands will download the dataset to data/monovo2k which will take around 445 MB of disk space.
AnyCalib trained on $\mathrm{OP_{d}}$:
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist data.cam_id=ucm
AnyCalib trained on $\mathrm{OP_{g}}$:
python -m siclib.eval.monovo2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen data.cam_id=ucm
ScanNet++
To comply with the ScanNet++ license, we cannot directly share its data.
Please download the ScanNet++ dataset following the official instructions and indicate the path to the root of the dataset in the following evaluation commands.
The path needs to be provided only the first time the evaluation is run; on this first run, the command will automatically copy the evaluation images to data/scannetpp2k, which will take around 760 MB of disk space.
AnyCalib trained on $\mathrm{OP_{d}}$:
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_d --overwrite model.model_id=anycalib_dist scannetpp_root=<path_to_scannetpp>
AnyCalib trained on $\mathrm{OP_{g}}$:
python -m siclib.eval.scannetpp2k_rays --conf anycalib_pretrained --tag anycalib_g --overwrite model.model_id=anycalib_gen scannetpp_root=<path_to_scannetpp>
LaMAR (edited)
Running the evaluation commands will download the dataset to data/lamar2k_edit which will take around 224 MB of disk space.
AnyCalib trained following the WildCam [8] training protocol:
python -m siclib.eval.lamar2k_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
TartanAir (edited)
Running the evaluation commands will download the dataset to data/tartanair_edit which will take around 488 MB of disk space.
AnyCalib trained following the WildCam [8] training protocol:
python -m siclib.eval.tartanair_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
Stanford2D3D (edited)
Running the evaluation commands will download the dataset to data/stanford2d3d_edit which will take around 420 MB of disk space.
AnyCalib trained on $\mathrm{OP_{p}}$, following the WildCam [8] training protocol:
python -m siclib.eval.stanford2d3d_rays --conf anycalib_pretrained --tag anycalib_e --overwrite model.model_id=anycalib_edit eval.eval_on_edit=True
Extended OpenPano Dataset
We extend the OpenPano dataset from GeoCalib with panoramas that need not be aligned with the gravity direction. This extended version consists of tonemapped panoramas from The Laval Photometric Indoor HDR Dataset, PolyHaven, HDRMaps, AmbientCG and BlenderKit.
Before sampling images from the panoramas, first download the Laval dataset following the instructions on the corresponding project page and place the panoramas in data/indoorDatasetCalibrated. Then, tonemap the HDR images using the following command:
python -m siclib.datasets.utils.tonemapping --hdr_dir data/indoorDatasetCalibrated --out_dir data/laval-tonemap
To download the remaining panoramas and organize all of them into their corresponding splits under data/openpano_v2/panoramas/{split}, execute:
python -m siclib.datasets.utils.download_openpano --name openpano_v2 --laval_dir data/laval-tonemap
Alternatively, the panoramas from PolyHaven, HDRMaps, AmbientCG and BlenderKit can be downloaded manually from here.
Afterwards, the different training datasets mentioned in the paper ($\mathrm{OP_{p}}$, $\mathrm{OP_{g}}$, $\mathrm{OP_{r}}$ and $\mathrm{OP_{d}}$) can be created by running the following commands. We recommend running them with the flag device=cuda, as this significantly speeds up dataset creation; if no GPU is available, the flag can be omitted.
$\mathrm{OP_{p}}$ (will be stored under data/openpano_v2/openpano_v2):
python -m siclib.datasets.create_dataset_from_pano --config-name openpano_v2 device=cuda
$\mathrm{OP_{g}}$ (will be stored under data/openpano_v2/openpano_v2_gen):
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_gen device=cuda
$\mathrm{OP_{r}}$ (will be stored under data/openpano_v2/openpano_v2_radial):
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_radial device=cuda
$\mathrm{OP_{d}}$ (will be stored under data/openpano_v2/openpano_v2_dist):
python -m siclib.datasets.create_dataset_from_pano_rays --config-name openpano_v2_dist device=cuda
Training
As with the evaluation, the training code is built upon the siclib library from GeoCalib. Here we adapt their instructions to AnyCalib. siclib can be installed by executing:
pip install -e siclib
Once at least one variant of the extended OpenPano dataset (openpano_v2) has been downloaded and prepared, AnyCalib can be trained with it.
For training with $\mathrm{OP_{p}}$ (default):
python -m siclib.train anycalib_op_p --conf anycalib --distributed
Feel free to use any other experiment name. By default, checkpoints will be written to outputs/training/. The default batch size is 24, which requires at least one NVIDIA Tesla V100 GPU with 32 GB of VRAM. If only one GPU is used, the --distributed flag can be omitted. Configurations are managed by Hydra and can be overridden from the command line.
For example, for training with $\mathrm{OP_{g}}$:
python -m siclib.train anycalib_op_g --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_gen'
For training with $\mathrm{OP_{d}}$:
python -m siclib.train anycalib_op_d --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_dist'
For training with $\mathrm{OP_{r}}$:
python -m siclib.train anycalib_op_r --conf anycalib --distributed data.dataset_dir='data/openpano_v2/openpano_v2_radial'
For training with $\mathrm{OP_{p}}$ on edited (stretched and cropped) images, following the training protocol of WildCam [8]:
python -m siclib.train anycalib_op_e --conf anycalib --distributed \
data.dataset_dir='data/openpano_v2/openpano_v2' \
data.im_geom_transform.change_pixel_ar=true \
data.im_geom_transform.crop=0.5
After training, the model can be evaluated using its experiment name:
python -m siclib.eval.<benchmark> --checkpoint <experiment_name> --tag <experiment_tag> --conf anycalib
Acknowledgements
Thanks to the authors of GeoCalib for open-sourcing the comprehensive and easy-to-use siclib, which we use as the base of our evaluation and training code.
Thanks to the authors of The Laval Photometric Indoor HDR Dataset for allowing us to release the weights of AnyCalib under a permissive license.
Thanks also to the authors of The Laval Photometric Indoor HDR Dataset, PolyHaven, HDRMaps, AmbientCG and BlenderKit for providing high-quality, freely available panoramas that made the training of AnyCalib possible.
BibTex citation
If you use any ideas from the paper or code from this repo, please consider citing:
@InProceedings{tirado2025anycalib,
author={Javier Tirado-Gar{\'\i}n and Javier Civera},
title={{AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration}},
booktitle={ICCV},
year={2025}
}
License
Code and weights are provided under the Apache 2.0 license.
References
[1] Close-Range Camera Calibration. D.C. Brown, 1971.
[2] A Generic Camera Model and Calibration Method for Conventional, Wide-Angle, and Fish-Eye Lenses. J. Kannala, S.S. Brandt, TPAMI 2006.
[3] Single View Point Omnidirectional Camera Calibration from Planar Grids. C. Mei, P. Rives, ICRA, 2007.
[4] An Enhanced Unified Camera Model. B. Khomutenko, et al., IEEE RA-L, 2016.
[5] Simultaneous Linear Estimation of Multiple View Geometry and Lens Distortion. A.W. Fitzgibbon, CVPR, 2001.
[6] The Double Sphere Camera Model. V. Usenko, et al., 3DV, 2018.
[7] BabelCalib: A Universal Approach to Calibrating Central Cameras. Y. Lochman, et al., ICCV, 2021.
[8] Tame a Wild Camera: In-the-Wild Monocular Camera Calibration. S. Zhu, et al., NeurIPS, 2023.