---
license: cc-by-4.0
tags:
- self-supervised-learning
- vit
- latent-dynamics
- motion
- recognition
- video
- latent-action
---

# Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics

[Paper](https://arxiv.org/abs/2510.05558) | [Code](https://github.com/agentic-learning-ai-lab/midway-network) | [Website](https://agenticlearning.ai/midway-network)

These are trained models instantiating the Midway Network (ICLR 2026) architecture for self-supervised learning of visual representations for recognition and motion from videos.

> [**Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics**](https://arxiv.org/abs/2510.05558)<br>
> [Christopher Hoang](https://www.chrishoang.com), [Mengye Ren](https://mengyeren.com)<br>
> International Conference on Learning Representations 2026<br>
> *arXiv preprint [arXiv:2510.05558](https://arxiv.org/abs/2510.05558)*

The models are trained on [BDD100K](https://bair.berkeley.edu/blog/2018/05/30/bdd) or [WT-Venice](https://huggingface.co/datasets/shawshankvkt/Walking_Tours).

## Citation

If you find this repository useful in your research, please consider giving it a like and a citation:

```bibtex
@inproceedings{hoang:2026:midway-network,
  title={Midway Network: Learning Representations for Recognition and Motion from Latent Dynamics},
  author={Chris Hoang and Mengye Ren},
  booktitle={International Conference on Learning Representations},
  year={2026}
}
```