---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-parsing
- scene-understanding
- physics-simulation
- smollm2
- lora
- fine-tuned
base_model: HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---
# PhysicsGIF-135M
**Natural Language to Scene Parser**: A fine-tuned 135M-parameter model that converts text descriptions into structured JSON scene specifications.
> ⚠️ **Note**: This model is the **text parsing component** of a larger physics-based GIF generation pipeline. It does NOT generate GIFs directly. It outputs structured JSON, which a separate physics engine and renderer then process.
![Training Loss](01_training_loss.png)
## What This Model Does
```
"a red ball bouncing to the right"
               │
               ▼
    ┌─────────────────────┐
    │  PhysicsGIF-135M    │  ← THIS MODEL
    │  (Text → JSON)      │
    └──────────┬──────────┘
               │
               ▼
{
  "objects": [{"type": "ball", "color": "#FF0000"}],
  "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
  "canvas": {"size": 128, "frames": 40}
}
```
The JSON output is then processed by **separate Python code** (physics engine + renderer) to create the actual GIF.
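
To make the hand-off concrete, here is a minimal sketch of how a physics step could consume the JSON above. It is not the repository's `physics.py`: the `simulate` function, the starting position at the canvas centre, the `radius` value, and the wall-reflection rule are all assumptions made for illustration.

```python
import json


def simulate(scene: dict, radius: float = 6.0) -> list[tuple[float, float]]:
    """Return one (x, y) position per frame for a single object (hypothetical helper)."""
    size = scene["canvas"]["size"]
    frames = scene["canvas"]["frames"]
    vx, vy = scene["motion"]["velocity"]
    gravity = scene["motion"]["gravity"]
    bounce = scene["motion"]["bounce"]

    # Assumption: the object starts at the centre of the canvas.
    x, y = size / 2, size / 2
    positions = []
    for _ in range(frames):
        positions.append((x, y))
        vy += gravity  # image coordinates: y grows downward
        x += vx
        y += vy
        # Assumption: walls reflect the object and scale its speed by `bounce`.
        if x < radius or x > size - radius:
            x = min(max(x, radius), size - radius)
            vx = -vx * bounce
        if y < radius or y > size - radius:
            y = min(max(y, radius), size - radius)
            vy = -vy * bounce
    return positions


scene = json.loads("""
{
  "objects": [{"type": "ball", "color": "#FF0000"}],
  "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
  "canvas": {"size": 128, "frames": 40}
}
""")
positions = simulate(scene)
print(len(positions), positions[0])
```

The list of positions is what a renderer would then draw, one frame per entry.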
## 🎬 Example Outputs
| Prompt | Generated GIF |
|--------|---------------|
| "two triangles colliding with each other and exploding" | ![Triangles Exploding](output.gif) |
| "a pink ball dropping slowly from up" | ![Pink Ball Falling](output_1.gif) |
## 📊 Training Results
| Metric | Value |
|--------|-------|
| Base Model | SmolLM2-135M-Instruct |
| Training Examples | 500 |
| Epochs | 20 |
| Final Loss | 0.092 |
| Loss Reduction | 95.9% |
| Training Time | 42 minutes |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
<details>
<summary>📈 Training Visualizations</summary>
#### Training Loss Curve
![Training Loss](01_training_loss.png)
#### Learning Rate Schedule
![Learning Rate](02_learning_rate.png)
#### Gradient Norms
![Gradient Norms](03_gradient_norms.png)
#### Per-Epoch Loss
![Epoch Losses](04_epoch_losses.png)
#### Dataset Distribution
![Object Distribution](05_object_distribution.png)
![Motion Distribution](06_motion_distribution.png)
#### Convergence Analysis
![Convergence](08_convergence_analysis.png)
</details>
## 🚀 Usage
### With the Full Pipeline (Recommended)
To generate actual GIFs, you need the complete pipeline code:
```bash
git clone https://github.com/vikramlingam/PhysicsGIF-135M
cd PhysicsGIF-135M
pip install torch transformers peft pillow numpy tqdm
# Interactive mode - generates real GIFs
python generate.py
```
```
🎬 PhysicsGIF Text-to-GIF Generator
Enter prompt: a red ball bouncing
Generating...
βœ“ Generated: output_1.gif
```
### Using This Model Directly (Text → JSON only)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("vikramlingam/PhysicsGIF-135M")
tokenizer = AutoTokenizer.from_pretrained("vikramlingam/PhysicsGIF-135M")

# ChatML-style prompt: system instruction, user request, open assistant turn
prompt = '''<|im_start|>system
You are a scene description parser. Convert text to JSON scene specification.<|im_end|>
<|im_start|>user
Convert to scene JSON: a red ball bouncing to the right<|im_end|>
<|im_start|>assistant
'''

inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding (do_sample=False) keeps the output deterministic
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)

# Decode only the newly generated tokens, dropping the prompt and special tokens
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
result = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(result)

# `result` holds the JSON scene specification.
# You need physics.py and renderer.py to convert it to a GIF.
```
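
The decoded text can contain more than the JSON object itself, for example the echoed prompt or trailing special tokens. The sketch below shows one way to pull out the first parseable JSON object using only the standard library. The helper name `extract_scene_json` is hypothetical and is not part of the repository.

```python
import json


def extract_scene_json(text: str) -> dict:
    """Return the first JSON object that parses cleanly from `text` (hypothetical helper)."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


# Example with text shaped like a raw decode of the model output
raw = (
    "<|im_start|>assistant\n"
    '{"objects": [{"type": "ball", "color": "#FF0000"}], '
    '"motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9}, '
    '"canvas": {"size": 128, "frames": 40}}<|im_end|>'
)
scene = extract_scene_json(raw)
print(scene["objects"][0]["type"])
```

Raising `ValueError` when nothing parses lets the caller decide whether to retry generation or fall back to a default scene.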
## 🏗️ Full Pipeline Architecture
```
┌─────────────────────────────────────────────────────────────┐
│              Complete GIF Generation Pipeline               │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  User Input: "a red ball bouncing"                          │
│                     │                                       │
│                     ▼                                       │
│  ┌─────────────────────────────────────┐                    │
│  │  PhysicsGIF-135M (THIS MODEL)       │                    │
│  │  Fine-tuned LLM                     │                    │
│  │  Converts text → JSON DSL           │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  ┌─────────────────────────────────────┐                    │
│  │  physics.py (Python code)           │                    │
│  │  Newtonian physics simulation       │                    │
│  │  Calculates positions per frame     │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  ┌─────────────────────────────────────┐                    │
│  │  renderer.py (Python code)          │                    │
│  │  PIL-based frame rendering          │                    │
│  │  Saves as animated GIF              │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│                 output.gif                                  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
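
The final stage of the diagram, PIL-based rendering, could look roughly like the sketch below. It is an illustration under stated assumptions, not the repository's `renderer.py`: the function names, black background, ball radius, and 50 ms frame duration are invented for the example. Pillow is imported inside `render_gif` so the bounding-box helper can be used without it.

```python
def ball_bbox(x: float, y: float, radius: float) -> tuple[float, float, float, float]:
    """Bounding box (left, top, right, bottom) of a circle centred at (x, y)."""
    return (x - radius, y - radius, x + radius, y + radius)


def render_gif(positions, path, size=128, color="#FF0000", radius=6, frame_ms=50):
    """Draw one ball per frame and save the frames as a looping GIF (hypothetical helper)."""
    from PIL import Image, ImageDraw  # requires the `pillow` package

    frames = []
    for x, y in positions:
        img = Image.new("RGB", (size, size), "black")  # assumed background colour
        ImageDraw.Draw(img).ellipse(ball_bbox(x, y, radius), fill=color)
        frames.append(img)

    # save_all + append_images writes every frame; loop=0 repeats forever
    frames[0].save(
        path,
        save_all=True,
        append_images=frames[1:],
        duration=frame_ms,
        loop=0,
    )
```

With Pillow installed, a call such as `render_gif([(64, 64), (67, 64.3)], "sketch.gif")` would write a two-frame animation.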
## 🎯 What This Model Understands
### Objects
`ball`, `square`, `triangle`
### Colors
`red`, `blue`, `green`, `yellow`, `orange`, `purple`, `pink`, `cyan`, `white`
### Motion Patterns
- `bouncing`: gravity plus elastic bounce
- `falling` / `dropping`: falls from the top
- `floating`: no gravity
- `colliding`: objects collide
- `exploding`: triggers particle effects
### Multi-Object
`two balls`, `three triangles`
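
Because the vocabulary is small, it can help to check a prompt against it before calling the model. The sketch below does that with the terms listed above. The `supported_terms` helper and its naive plural handling are assumptions for illustration and are not part of the pipeline.

```python
import re

# Vocabulary copied from the lists above
OBJECTS = {"ball", "square", "triangle"}
COLORS = {"red", "blue", "green", "yellow", "orange", "purple", "pink", "cyan", "white"}
MOTIONS = {"bouncing", "falling", "dropping", "floating", "colliding", "exploding"}


def supported_terms(prompt: str) -> dict[str, list[str]]:
    """Group the words of `prompt` that appear in the model's vocabulary (hypothetical helper)."""
    words = re.findall(r"[a-z]+", prompt.lower())
    # Naive plural handling: "triangles" -> "triangle" when the singular is a known object
    words = [w[:-1] if w.endswith("s") and w[:-1] in OBJECTS else w for w in words]
    return {
        "objects": [w for w in words if w in OBJECTS],
        "colors": [w for w in words if w in COLORS],
        "motions": [w for w in words if w in MOTIONS],
    }


print(supported_terms("a pink ball dropping slowly from up"))
```

A prompt that yields no objects or motions is likely outside what the model was trained to parse.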
## 📁 Required Files for GIF Generation
This model alone cannot generate GIFs. You need:
| File | Purpose |
|------|---------|
| `src/parser.py` | Integrates this model |
| `src/physics.py` | Physics simulation |
| `src/renderer.py` | GIF rendering |
| `src/pipeline.py` | Combines all components |
| `generate.py` | CLI interface |
## 🔬 Training Details
- **Method**: LoRA fine-tuning
- **Target Modules**: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
- **Hardware**: CPU only (MacBook Pro)
- **Dataset**: 500 text-to-JSON examples
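
For readers who want to reproduce a similar setup, the hyperparameters above map onto a `peft` configuration roughly as sketched below. Only the rank, alpha, and target modules come from this card. The `task_type` value, the `build_peft_model` helper, and every unlisted setting (dropout, learning rate, and so on, left at library defaults) are assumptions, so treat this as a starting point rather than the exact training script.

```python
# Values taken from this model card
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: causal language modelling task
}

# In standard LoRA the low-rank update is scaled by lora_alpha / r
LORA_SCALE = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]


def build_peft_model(base_model_name: str = "HuggingFaceTB/SmolLM2-135M-Instruct"):
    """Wrap the base model with LoRA adapters (hypothetical helper)."""
    from peft import LoraConfig, get_peft_model  # requires `peft`
    from transformers import AutoModelForCausalLM  # requires `transformers`

    base = AutoModelForCausalLM.from_pretrained(base_model_name)
    return get_peft_model(base, LoraConfig(**LORA_KWARGS))


print(LORA_SCALE)
```

With rank 16 and alpha 32, the adapter update is scaled by a factor of 2.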
## 📜 License
Apache 2.0