---
license: mit
---


# Micron-Flow: Real-Time Optical Flow Model

## Model Overview
**Micron-Flow** is a lightweight optical flow model built for real-time inference, reaching **80+ FPS** on a high-end GPU (RTX 4090). It is trained by knowledge distillation from RAFT-Large, aiming to retain much of the teacher's accuracy in a model of only **522K parameters**.

## Model Details
- **Architecture**: Modified U-Net with MobileNetV2-based Siamese encoder, residual blocks, and a flow refinement module.
- **Parameters**: 522K
- **Input Resolution**: (152, 240)
- **Training Dataset**: 200K video frame pairs from the **Moments of Time** dataset, with target flow generated by RAFT-Large.
- **Distillation Approach**: 
  - Mean squared error (MSE) loss in tanh-space
  - Edge-aware smoothness loss
- **Optimization**: Trained with **CosineAnnealing** scheduler and progressive encoder unfreezing.
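
The two distillation terms listed above can be illustrated with a toy sketch. This is not the training code from the repository: it works on 1-D Python lists rather than image tensors, and the `scale` and `alpha` parameters are hypothetical names introduced here. It assumes "tanh-space" means both the student and teacher flow are squashed with `tanh` before the squared error is taken, and it uses the common form of edge-aware smoothness in which flow differences between neighbours are down-weighted where the image itself changes sharply.

```python
import math


def tanh_mse(student, teacher, scale=1.0):
    """MSE between tanh-squashed flow values.

    `scale` is a hypothetical normalizer (not from the model card) that
    controls how quickly large flow magnitudes saturate.
    """
    diffs = [
        (math.tanh(s / scale) - math.tanh(t / scale)) ** 2
        for s, t in zip(student, teacher)
    ]
    return sum(diffs) / len(diffs)


def edge_aware_smoothness(flow, image, alpha=1.0):
    """Mean absolute flow difference between neighbours,
    weighted by exp(-alpha * |image difference|) so that flow
    discontinuities at image edges are penalized less.
    """
    terms = [
        abs(flow[i + 1] - flow[i]) * math.exp(-alpha * abs(image[i + 1] - image[i]))
        for i in range(len(flow) - 1)
    ]
    return sum(terms) / len(terms)
```

In a real training loop the same expressions would be applied to flow and image tensors along both spatial axes, and the two terms would be combined with a weighting that the model card does not specify.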

## Performance
| Device       | Inference Time | FPS |
|--------------|----------------|-----|
| **RTX 4090** | 0.012 sec      | 83  |
| **GTX 1650** | 0.013 sec      | 76  |
| **CPU-Only** | 0.07 sec       | 14  |
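
The FPS column follows from the inference time as frames per second = 1 / seconds per frame. The short check below reproduces the table values; it assumes the reported FPS figures are truncated rather than rounded, which is what matches the GTX 1650 row.

```python
def fps_from_latency(seconds_per_frame):
    """Convert a per-frame inference time in seconds to frames per second."""
    return 1.0 / seconds_per_frame


# Inference times copied from the table above.
latencies = {"RTX 4090": 0.012, "GTX 1650": 0.013, "CPU-Only": 0.07}

for device, latency in latencies.items():
    # int() truncates, e.g. 76.9 becomes 76.
    print(device, int(fps_from_latency(latency)))
```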

## Key Features
- **Real-time processing**: 80+ FPS on RTX 4090
- **Small model size**: Only 2.1MB on disk
- **Efficient architecture**: 
  - Depthwise convolutions for reduced parameters
  - Inverted residual blocks for better efficiency
  - Flow refiner for enhanced motion consistency
- **Optimized training pipeline**: GPU caching and JPEG decoding acceleration
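
As a rough consistency check, the on-disk size lines up with the parameter count if the weights are stored as 32-bit floats. That storage format is an assumption here, not something the model card states, and 522K is itself a rounded figure.

```python
def model_size_mb(num_params, bytes_per_param=4):
    """Approximate checkpoint size in megabytes (1 MB = 10**6 bytes).

    bytes_per_param=4 assumes float32 weights.
    """
    return num_params * bytes_per_param / 1e6


print(round(model_size_mb(522_000), 1))
```

Checkpoint metadata adds a small overhead on top of the raw weights, so the real file can be slightly larger than this estimate.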

## Limitations
- Trained on flow predicted by RAFT-Large rather than on ground-truth flow, so the model may inherit the teacher's errors and biases.
- Input resolution is fixed at (152, 240); frames of any other size must be resized before inference.
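
When frames are downscaled to the model resolution, the predicted flow has to be mapped back to the original frame size. The sketch below shows the vector rescaling step under two assumptions that the model card does not confirm: that (152, 240) is in (height, width) order, and that the model outputs flow in pixel units at its own resolution. The helper names are hypothetical.

```python
MODEL_H, MODEL_W = 152, 240  # assumed (height, width) order


def flow_scale_factors(orig_h, orig_w, model_h=MODEL_H, model_w=MODEL_W):
    """Return (sx, sy), the multipliers for the horizontal and
    vertical flow components when going back to the original size."""
    return orig_w / model_w, orig_h / model_h


def rescale_flow_vector(u, v, orig_h, orig_w):
    """Scale one flow vector (u, v) from model resolution to the original size."""
    sx, sy = flow_scale_factors(orig_h, orig_w)
    return u * sx, v * sy


# Example: a 304x480 frame is exactly twice the model resolution on both axes.
print(rescale_flow_vector(1.0, -2.0, 304, 480))
```

The flow map itself also needs to be resized spatially (for example with bilinear interpolation) before or after this per-vector scaling. If the model instead outputs normalized flow, the factors would differ.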

## Model Usage
```python
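import torch
from torchvision.transforms.functional import to_tensor

# MicronFlow is the model class from the GitHub repository linked below;
# import it from wherever you placed that code.
# "micron_flow.pth" is a placeholder path, and this assumes the .pth file
# holds a state_dict rather than a whole pickled model.
model = MicronFlow()
model.load_state_dict(torch.load("micron_flow.pth", map_location="cpu"))
model.eval()

# image1 and image2 are consecutive frames, already resized to (152, 240).
frame1 = to_tensor(image1).unsqueeze(0)  # add a batch dimension
frame2 = to_tensor(image2).unsqueeze(0)
with torch.no_grad():  # inference only, no gradients needed
    flow = model(frame1, frame2)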
```

## License
MIT License.


## Links
- **Code**: [GitHub](https://github.com/krasnoteh/micron-flow)