---
license: cc-by-4.0
tags:
- self-supervised-learning
- vit
- latent-dynamics
- motion
- recognition
- video
- latent-action
---
# Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics
[Paper](https://arxiv.org/abs/2510.05558) | [Code](https://github.com/agentic-learning-ai-lab/midway-network) | [Website](https://agenticlearning.ai/midway-network)

These are trained models instantiating the Midway Network (ICLR 2026) architecture, which learns visual representations for recognition and motion from videos via self-supervised learning.
> [**Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics**](https://arxiv.org/abs/2510.05558)
> [Christopher Hoang](https://www.chrishoang.com), [Mengye Ren](https://mengyeren.com)
> International Conference on Learning Representations 2026
> *arXiv ([2510.05558](https://arxiv.org/abs/2510.05558))*
The models are trained on [BDD100K](https://bair.berkeley.edu/blog/2018/05/30/bdd) or [WT-Venice](https://huggingface.co/datasets/shawshankvkt/Walking_Tours).
## Citation
If you find this repository useful in your research, please consider leaving a like and citing the paper:
```bibtex
@inproceedings{hoang:2026:midway-network,
  title={Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics},
  author={Chris Hoang and Mengye Ren},
  booktitle={International Conference on Learning Representations},
  year={2026}
}
```