Fix README YAML
README.md
CHANGED
@@ -8,28 +8,20 @@ sdk_version: 5.31.0
 python_version: '3.11'
 app_file: app.py
 pinned: false
-short_description:
+short_description: "Video-LLM - OLMo 2 + LoRA + VQ-VAE text-to-video"
 ---
 
-# Zeeb - Video-LLM
+# Zeeb - Video-LLM
 
-
+Text-to-Video generation using **OLMo 2 1B Instruct** + **LoRA** + **VQ-VAE**.
 
 ## Pipeline
 ```
-Text Prompt → LLM (
+Text Prompt → LLM (constrained decoding) → Visual Tokens → VQ-VAE Decoder → Video
 ```
 
-##
-1.
-2.
-3.
-4.
-5. Trains for 3 epochs on tokenized video data
-6. Merges LoRA weights and pushes to [EeshaAI/zeeb](https://huggingface.co/EeshaAI/zeeb)
-
-## Files
-- `app.py` - Gradio training interface
-- `train_on_hf_spaces.py` - Training logic (OLMo 2 1B + LoRA)
-- `tokenized_dataset.json` - Tokenized video-text training data
-- `requirements.txt` - Python dependencies
+## Training Pipeline
+1. Train VQ-VAE on 50K COCO images (real photos)
+2. Tokenize 10K OpenVid-1M clips through VQ-VAE
+3. Fine-tune OLMo 2 1B + LoRA on tokenized data
+4. Push trained model to EeshaAI/zeeb
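
The "constrained decoding" stage named in the pipeline is not shown in this diff, so the following is only a minimal sketch of one common way to do it: mask every logit that does not belong to a visual token, then pick from what remains. The vocabulary layout (`TEXT_VOCAB_SIZE`, `NUM_VISUAL_TOKENS`), the helper names, and the toy logits are all hypothetical and are not taken from the repository.

```python
import math

# Hypothetical vocabulary layout (an assumption, not the repository's):
# ids 0..TEXT_VOCAB_SIZE-1 are text tokens, and the next NUM_VISUAL_TOKENS
# ids stand for VQ-VAE codebook entries.
TEXT_VOCAB_SIZE = 8
NUM_VISUAL_TOKENS = 4
VISUAL_IDS = range(TEXT_VOCAB_SIZE, TEXT_VOCAB_SIZE + NUM_VISUAL_TOKENS)


def mask_logits(logits, allowed_ids):
    """Return a copy of logits with every disallowed id set to -inf."""
    allowed = set(allowed_ids)
    if not allowed:
        raise ValueError("allowed_ids must not be empty")
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def greedy_constrained_step(logits, allowed_ids):
    """Pick the highest-scoring token id among the allowed ids only."""
    masked = mask_logits(logits, allowed_ids)
    return max(range(len(masked)), key=masked.__getitem__)


def to_codebook_index(token_id):
    """Convert an LLM token id into a VQ-VAE codebook index."""
    if token_id not in VISUAL_IDS:
        raise ValueError(f"{token_id} is not a visual token id")
    return token_id - TEXT_VOCAB_SIZE


# Toy logits in which a text token (id 0) scores highest overall.
logits = [9.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 1.5, 3.2, 2.8, 0.9]
unconstrained = max(range(len(logits)), key=logits.__getitem__)
constrained = greedy_constrained_step(logits, VISUAL_IDS)
print(unconstrained, constrained, to_codebook_index(constrained))  # 0 9 1
```

Without the mask the model would emit text token 0. With it, the step can only return a visual token, which the VQ-VAE decoder can then look up in its codebook. In a `transformers`-based setup the same idea would typically be wired in through a custom `LogitsProcessor` or the `prefix_allowed_tokens_fn` argument of `generate`. Whether this project does so is an assumption.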