---
license: cc-by-4.0
tags:
- self-supervised-learning
- vit
- latent-dynamics
- motion
- recognition
- video
- latent-action
---
# Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics
[Paper](https://arxiv.org/abs/2510.05558) | [Code](https://github.com/agentic-learning-ai-lab/midway-network) | [Website](https://agenticlearning.ai/midway-network)

These are trained models instantiating the Midway Network (ICLR 2026) architecture, which learns visual representations for recognition and motion from videos via self-supervised learning.
> [**Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics**](https://arxiv.org/abs/2510.05558)
> [Christopher Hoang](https://www.chrishoang.com), [Mengye Ren](https://mengyeren.com)
> International Conference on Learning Representations 2026
> *arXiv ([2510.05558](https://arxiv.org/abs/2510.05558))*
The models are trained on [BDD100K](https://bair.berkeley.edu/blog/2018/05/30/bdd) or [WT-Venice](https://huggingface.co/datasets/shawshankvkt/Walking_Tours).
## Citation
If you find this repository useful in your research, please consider leaving a like and citing the paper:
```bibtex
@inproceedings{hoang:2026:midway-network,
  title={Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics},
  author={Chris Hoang and Mengye Ren},
  booktitle={International Conference on Learning Representations},
  year={2026}
}
```