---
license: cc-by-4.0
pipeline_tag: robotics
tags:
- visual-navigation
- sim-to-real
- topological-navigation
---

# FAINT

Fast, Appearance-Invariant Navigation Transformer (FAINT) is a learned policy for vision-based topological navigation.

This model is presented in the paper [Synthetic vs. Real Training Data for Visual Navigation](https://huggingface.co/papers/2509.11791).

[**Project Page**](https://lasuomela.github.io/faint/) | [**Code**](https://github.com/lasuomela/faint)

## Model Details

The `FAINT-Sim` model uses [`Theia-Tiny-CDDSV`](https://theia.theaiinstitute.com/) as its backbone and was trained for 10 rounds of DAgger with ~12M samples from the Habitat simulator.
Although trained only in simulation, it is capable of zero-shot transfer to navigation on real robots.

This repo contains two versions of the trained model weights:
- `model_pytorch.pt`: Weights-only state dict of the PyTorch model.
- `model_torchscript.pt`: A standalone TorchScript model for deployment.

## Usage

See the main GitHub [repo](https://github.com/lasuomela/FAINT) for details such as input preprocessing.

### TorchScript

The only dependency is PyTorch.

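```python
import torch

ckpt_path = 'FAINT-Sim/model_torchscript.pt'
# Load onto the CPU first; move the model to another device afterwards if needed
model = torch.jit.load(ckpt_path, map_location='cpu')
model.eval()  # Standard practice before running inference
```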

### PyTorch

Requires the FAINT library to be installed.

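```python
import torch
from faint.common.models.faint import FAINT

ckpt_path = 'FAINT-Sim/model_pytorch.pt'
# The file is a weights-only state dict, so restrict unpickling to tensors
state_dict = torch.load(ckpt_path, map_location='cpu', weights_only=True)

# The weights in this repo correspond to FAINT initialized with the default arguments
model = FAINT()
model.load_state_dict(state_dict)
model.eval()  # Standard practice before running inference
```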
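The two loading paths above differ only in their dependencies, so a deployment script can pick the file to load based on what is installed. The following is a minimal sketch, not part of the FAINT codebase: the helper name `choose_checkpoint` and its `has_faint` parameter are hypothetical, and it assumes the weights were downloaded to a local `FAINT-Sim` directory as in the snippets above.

```python
from importlib.util import find_spec
from pathlib import Path


def choose_checkpoint(repo_dir, has_faint=None):
    """Return the path of the checkpoint that matches the installed packages.

    Hypothetical helper: prefers the state dict when the `faint` package is
    importable and falls back to the standalone TorchScript file otherwise.
    """
    if has_faint is None:
        # find_spec returns None when no top-level package named `faint` exists
        has_faint = find_spec("faint") is not None
    name = "model_pytorch.pt" if has_faint else "model_torchscript.pt"
    return Path(repo_dir) / name


ckpt_path = choose_checkpoint("FAINT-Sim")
print(ckpt_path)
```

Passing `has_faint` explicitly overrides the automatic check, which can be useful on a robot where the library is installed but the TorchScript file is still preferred.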

## Citation

If you use FAINT in your research, please use the following BibTeX entry:
```bibtex
@article{suomela2025synthetic, 
  title={Synthetic vs. Real Training Data for Visual Navigation},
  author={Suomela, Lauri and Kuruppu Arachchige, Sasanka and Torres, German F. and Edelman, Harry and Kämäräinen, Joni-Kristian},
  journal={arXiv:2509.11791},
  year={2025}
}
```