krasnoteh committed · Commit c7351af · verified · 1 Parent(s): 66aef14

Update README.md

  1. README.md +57 -3
README.md CHANGED
@@ -1,3 +1,57 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+
+ # Micron-Flow: Real-Time Optical Flow Model
+
+ ## Model Overview
+ **Micron-Flow** is a lightweight optical flow model built for real-time inference, reaching **80+ FPS** on a high-end GPU (RTX 4090). Knowledge distillation from RAFT-Large is used to retain **high accuracy** at an extremely small size of **522K parameters**.
+
+ ## Model Details
+ - **Architecture**: Modified U-Net with a MobileNetV2-based Siamese encoder, residual blocks, and a flow refinement module.
+ - **Parameters**: 522K
+ - **Input Resolution**: (152, 240)
+ - **Training Dataset**: 200K video frame pairs from the **Moments of Time** dataset, with target flow generated by RAFT-Large.
+ - **Distillation Approach**:
+   - Mean squared error (MSE) loss in tanh-space
+   - Edge-aware smoothness loss
+ - **Optimization**: Trained with a **CosineAnnealing** learning-rate scheduler and progressive encoder unfreezing.
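The two distillation terms listed above can be sketched in PyTorch as follows. This is a minimal sketch under stated assumptions, not the training code behind this model: "tanh-space" is read here as squashing both the student and the teacher flow with `tanh` before taking the MSE, and `scale`, `alpha`, and `smooth_weight` are hypothetical values chosen only for illustration.

```python
import torch
import torch.nn.functional as F


def tanh_space_mse(pred_flow, teacher_flow, scale=20.0):
    """MSE between tanh-squashed flows. `scale` (in pixels) is a hypothetical normalizer."""
    return F.mse_loss(torch.tanh(pred_flow / scale), torch.tanh(teacher_flow / scale))


def edge_aware_smoothness(flow, image, alpha=10.0):
    """First-order flow smoothness, down-weighted where the image has strong edges."""
    # Horizontal and vertical finite differences of the flow field (N, 2, H, W).
    flow_dx = (flow[:, :, :, 1:] - flow[:, :, :, :-1]).abs()
    flow_dy = (flow[:, :, 1:, :] - flow[:, :, :-1, :]).abs()
    # Image gradients, averaged over colour channels so they broadcast over flow.
    img_dx = (image[:, :, :, 1:] - image[:, :, :, :-1]).abs().mean(dim=1, keepdim=True)
    img_dy = (image[:, :, 1:, :] - image[:, :, :-1, :]).abs().mean(dim=1, keepdim=True)
    # Weights approach 0 at strong image edges and 1 in flat regions.
    weight_x = torch.exp(-alpha * img_dx)
    weight_y = torch.exp(-alpha * img_dy)
    return (flow_dx * weight_x).mean() + (flow_dy * weight_y).mean()


def distillation_loss(pred_flow, teacher_flow, image, smooth_weight=0.1):
    """Weighted sum of both terms. `smooth_weight` is an assumed value."""
    mse = tanh_space_mse(pred_flow, teacher_flow)
    smooth = edge_aware_smoothness(pred_flow, image)
    return mse + smooth_weight * smooth
```

Squashing with `tanh` bounds each per-pixel error, so a few very large teacher displacements cannot dominate the loss. The smoothness term lets the flow change sharply at image edges, where motion boundaries are most likely.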
+
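The optimization recipe can be illustrated without any framework. The sketch below assumes a single cosine half-period with no restarts (the schedule PyTorch's `CosineAnnealingLR` follows) and a hypothetical unfreezing plan in which one more encoder stage becomes trainable every few epochs, deepest stage first. The learning rates, stage count, and interval are illustrative assumptions, not values taken from this model's training run.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=0.0):
    """Learning rate at `step` on one cosine half-period from lr_max down to lr_min."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def trainable_encoder_stages(epoch, num_stages=4, unfreeze_every=5):
    """Indices of encoder stages trained at `epoch` (hypothetical plan, deepest first)."""
    num_unfrozen = min(epoch // unfreeze_every, num_stages)
    return list(range(num_stages - num_unfrozen, num_stages))


for epoch in (0, 5, 12, 30):
    lr = cosine_annealing_lr(epoch, total_steps=30)
    print(epoch, f"{lr:.6f}", trainable_encoder_stages(epoch))
```

Keeping the pretrained encoder frozen at first lets the randomly initialized decoder settle before encoder weights start to move, which is the usual motivation for progressive unfreezing.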
+ ## Performance
+ | Device          | Inference Time (s) | FPS |
+ |-----------------|--------------------|-----|
+ | **RTX 4090**    | 0.012              | 83  |
+ | **RTX 3070 Ti** | TBD                | TBD |
+ | **CPU-Only**    | 0.07               | 14  |
+
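The FPS column is the reciprocal of the per-inference time. The sketch below reproduces that arithmetic and adds a generic timing helper. The helper's name, warm-up count, and iteration count are illustrative assumptions and not the procedure used to produce the table.

```python
import time


def fps_from_latency(seconds_per_inference):
    """Frames per second implied by a per-inference latency in seconds."""
    return 1.0 / seconds_per_inference


def measure_latency(fn, warmup=3, iters=20):
    """Mean wall-clock seconds per call of `fn`, after a few warm-up calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


print(round(fps_from_latency(0.012)))  # 83 (RTX 4090 row)
print(round(fps_from_latency(0.07)))   # 14 (CPU-only row)
```

When timing on a GPU, CUDA kernels run asynchronously, so `fn` should call `torch.cuda.synchronize()` before returning. Otherwise the clock measures only kernel launch time.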
+ ## Key Features
+ - **Real-time processing**: 80+ FPS on RTX 4090
+ - **Small model size**: Only 2.1 MB on disk
+ - **Efficient architecture**:
+   - Depthwise convolutions for reduced parameters
+   - Inverted residual blocks for better efficiency
+   - Flow refiner for enhanced motion consistency
+ - **Optimized training pipeline**: GPU caching and accelerated JPEG decoding
+
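Two of the claims above can be sanity-checked with simple arithmetic: how much a depthwise separable convolution saves over a standard one, and whether 522K parameters are consistent with a 2.1 MB file. The layer sizes (64 to 128 channels, 3x3 kernel) are hypothetical and not taken from the model. The size estimate assumes float32 weights and ignores any extra buffers or metadata in the checkpoint.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return c_in * k * k + c_in * c_out


def float32_megabytes(num_params):
    """Storage for `num_params` float32 values, in decimal megabytes."""
    return num_params * 4 / 1e6


print(conv_params(64, 128, 3))                 # 73728
print(depthwise_separable_params(64, 128, 3))  # 8768
print(float32_megabytes(522_000))              # 2.088
```

For this illustrative layer the separable form needs roughly 8x fewer weights, and 522K float32 parameters come to about 2.1 MB, which matches the stated on-disk size.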
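+ ## Limitations
+ - Trained on flow predicted by RAFT-Large rather than on ground-truth flow, so the model may inherit the teacher's errors and biases.
+ - Input resolution is fixed at (152, 240); frames of other sizes must be resized to it, and the predicted flow rescaled back to the original size.
+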
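One way to work around the fixed input resolution is to resize the frames, run the model, and then resize and rescale the flow. The sketch below rests on several assumptions that are not confirmed by this README: that (152, 240) means (height, width), that the model returns a tensor of shape (N, 2, H, W) in pixel units, and that channel 0 is the horizontal component. The function name `predict_flow_any_size` is hypothetical.

```python
import torch
import torch.nn.functional as F

MODEL_SIZE = (152, 240)  # assumed to be (height, width)


def predict_flow_any_size(model, frame1, frame2, model_size=MODEL_SIZE):
    """Run `model` at its native resolution and return flow at the input resolution."""
    _, _, in_h, in_w = frame1.shape
    net_h, net_w = model_size
    # Resize both frames to the resolution the network expects.
    small1 = F.interpolate(frame1, size=model_size, mode="bilinear", align_corners=False)
    small2 = F.interpolate(frame2, size=model_size, mode="bilinear", align_corners=False)
    with torch.no_grad():
        flow = model(small1, small2)  # assumed shape (N, 2, net_h, net_w), in pixels
    # Resize the flow field back to the input size, then rescale the vectors:
    # one pixel at network resolution spans in_w / net_w input pixels horizontally
    # and in_h / net_h input pixels vertically.
    flow = F.interpolate(flow, size=(in_h, in_w), mode="bilinear", align_corners=False)
    scale = torch.tensor([in_w / net_w, in_h / net_h], dtype=flow.dtype, device=flow.device)
    return flow * scale.view(1, 2, 1, 1)
```

Resizing the flow field alone is not enough, because the vector magnitudes are expressed in pixels of the network's resolution. If the model emits flow in a normalized range instead of pixels, the scaling step would need to change accordingly.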
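+ ## Model Usage
+ ```python
+ import torch
+ from PIL import Image
+ from torchvision.transforms.functional import to_tensor
+ from model import MicronFlow  # Assuming model implementation is available
+
+ model = MicronFlow().eval()
+ image1 = Image.open("frame1.png").convert("RGB")  # Placeholder paths for two consecutive frames
+ image2 = Image.open("frame2.png").convert("RGB")
+ frame1 = to_tensor(image1).unsqueeze(0)  # Convert images to (1, 3, H, W) tensors in [0, 1]
+ frame2 = to_tensor(image2).unsqueeze(0)
+ with torch.no_grad():  # Inference only, no gradients needed
+     flow = model(frame1, frame2)  # Optical flow prediction
+ ```
+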
+ ## License
+ MIT License.
+
+
+ ## Links
+ - **Code**: [GitHub](https://github.com/your-repo)