---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-parsing
- scene-understanding
- physics-simulation
- smollm2
- lora
- fine-tuned
base_model: HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---

# PhysicsGIF-135M

**Natural Language to Scene Parser**: A fine-tuned 135M-parameter model that converts text descriptions into structured JSON scene specifications.

> ⚠️ **Note**: This model is the **text parsing component** of a larger physics-based GIF generation pipeline. It does NOT generate GIFs directly; it outputs structured JSON that is then processed by a separate physics engine and renderer.

![Demo](examples/demo.gif)

## What This Model Does

```
"a red ball bouncing to the right"
           │
           ▼
┌─────────────────────┐
│   PhysicsGIF-135M   │  ← THIS MODEL
│    (Text → JSON)    │
└──────────┬──────────┘
           │
           ▼
{
  "objects": [{"type": "ball", "color": "#FF0000"}],
  "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
  "canvas": {"size": 128, "frames": 40}
}
```

The JSON output is then processed by **separate Python code** (physics engine + renderer) to create the actual GIF.

## Example Outputs

| Prompt | Generated GIF |
|--------|---------------|
| "two triangles colliding with each other and exploding" | ![Triangles](examples/triangles_exploding.gif) |
| "a pink ball dropping slowly from up" | ![Pink Ball](examples/pink_ball_dropping.gif) |

## Training Results

| Metric | Value |
|--------|-------|
| Base Model | SmolLM2-135M-Instruct |
| Training Examples | 500 |
| Epochs | 20 |
| Final Loss | 0.092 |
| Loss Reduction | 95.9% |
| Training Time | 42 minutes |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |

<details>
<summary>Training Visualizations</summary>

#### Training Loss Curve
![Training Loss](plots/training_loss.png)

#### Learning Rate Schedule
![Learning Rate](plots/learning_rate.png)

#### Gradient Norms
![Gradient Norms](plots/gradient_norms.png)

#### Per-Epoch Loss
![Epoch Loss](plots/epoch_loss.png)

#### Dataset Distribution
![Shape Distribution](plots/shape_distribution.png)
![Color Distribution](plots/color_distribution.png)

#### Convergence Analysis
![Convergence](plots/convergence.png)

</details>

## Usage

### With the Full Pipeline (Recommended)

To generate actual GIFs, you need the complete pipeline code:

```bash
git clone https://github.com/vikramlingam/PhysicsGIF-135M
cd PhysicsGIF-135M
pip install torch transformers peft pillow numpy tqdm

# Interactive mode - generates real GIFs
python generate.py
```

```
PhysicsGIF Text-to-GIF Generator
Enter prompt: a red ball bouncing
Generating...
Generated: output_1.gif
```

### Using This Model Directly (Text → JSON only)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("vikramlingam/PhysicsGIF-135M")
tokenizer = AutoTokenizer.from_pretrained("vikramlingam/PhysicsGIF-135M")

# ChatML-style prompt: system instruction, user request, then an open assistant turn
prompt = '''<|im_start|>system
You are a scene description parser. Convert text to JSON scene specification.<|im_end|>
<|im_start|>user
Convert to scene JSON: a red ball bouncing to the right<|im_end|>
<|im_start|>assistant
'''

inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding keeps the JSON output deterministic
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)

# Decode only the newly generated tokens so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
result = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(result)

# Output: JSON scene specification
# You need physics.py and renderer.py to convert this to a GIF
```
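
Generated text may carry trailing tokens or stray characters around the JSON, so it can help to extract and check the scene before handing it to the physics step. The following is a minimal sketch using only the standard library. The helper name `extract_scene` and the `REQUIRED_KEYS` check are illustrative assumptions based on the example output shown in this card, not part of the repository's `src/parser.py`.

```python
import json

# Top-level keys seen in this card's example output (assumed to be required)
REQUIRED_KEYS = ("objects", "motion", "canvas")


def extract_scene(text: str) -> dict:
    """Return the first JSON object found in `text` (hypothetical helper)."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in model output")
    # raw_decode parses one JSON value and ignores any text that follows it
    scene, _end = json.JSONDecoder().raw_decode(text, start)
    missing = [key for key in REQUIRED_KEYS if key not in scene]
    if missing:
        raise ValueError(f"scene is missing keys: {missing}")
    return scene


# Sample string shaped like the model output, with a trailing special token
sample = (
    '{"objects": [{"type": "ball", "color": "#FF0000"}], '
    '"motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9}, '
    '"canvas": {"size": 128, "frames": 40}}<|im_end|>'
)
scene = extract_scene(sample)
print(scene["motion"]["velocity"])
```

Malformed JSON raises `json.JSONDecodeError`, which is a subclass of `ValueError`, so a single `except ValueError` covers both failure modes.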

## Full Pipeline Architecture

```
┌───────────────────────────────────────────────────────────┐
│             Complete GIF Generation Pipeline              │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  User Input: "a red ball bouncing"                        │
│                     │                                     │
│                     ▼                                     │
│  ┌─────────────────────────────────────┐                  │
│  │  PhysicsGIF-135M (THIS MODEL)       │                  │
│  │  Fine-tuned LLM                     │                  │
│  │  Converts text → JSON DSL           │                  │
│  └──────────────────┬──────────────────┘                  │
│                     │                                     │
│                     ▼                                     │
│  ┌─────────────────────────────────────┐                  │
│  │  physics.py (Python code)           │                  │
│  │  Newtonian physics simulation       │                  │
│  │  Calculates positions per frame     │                  │
│  └──────────────────┬──────────────────┘                  │
│                     │                                     │
│                     ▼                                     │
│  ┌─────────────────────────────────────┐                  │
│  │  renderer.py (Python code)          │                  │
│  │  PIL-based frame rendering          │                  │
│  │  Saves as animated GIF              │                  │
│  └──────────────────┬──────────────────┘                  │
│                     │                                     │
│                     ▼                                     │
│                output.gif                                 │
│                                                           │
└───────────────────────────────────────────────────────────┘
```
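
To make the physics stage more concrete, here is a rough sketch of how per-frame positions could be computed from the scene JSON. It is not the repository's `physics.py`. The function name `simulate`, the start position, the object radius, and the wall-reflection rule are all assumptions chosen for illustration; only the `motion` and `canvas` fields come from the example output in this card.

```python
def simulate(scene, start=(20.0, 20.0), radius=8.0):
    """Return one (x, y) position per frame for a single object.

    Hypothetical helper: `start` and `radius` are assumed values.
    """
    size = scene["canvas"]["size"]
    frames = scene["canvas"]["frames"]
    motion = scene["motion"]
    vx, vy = motion["velocity"]
    x, y = start
    low, high = radius, size - radius  # keep the whole object on the canvas
    positions = []
    for _ in range(frames):
        vy += motion["gravity"]  # y grows downward, as in image coordinates
        x += vx
        y += vy
        # Reflect off the canvas edges, scaling speed by the bounce factor
        if x < low or x > high:
            x = min(max(x, low), high)
            vx = -vx * motion["bounce"]
        if y < low or y > high:
            y = min(max(y, low), high)
            vy = -vy * motion["bounce"]
        positions.append((x, y))
    return positions


scene = {
    "objects": [{"type": "ball", "color": "#FF0000"}],
    "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
    "canvas": {"size": 128, "frames": 40},
}
positions = simulate(scene)
print(len(positions), positions[:3])
```

A renderer would then draw the object at each position on a `size` × `size` frame and save the frames as an animated GIF.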

## What This Model Understands

### Objects
`ball`, `square`, `triangle`

### Colors
`red`, `blue`, `green`, `yellow`, `orange`, `purple`, `pink`, `cyan`, `white`

### Motion Patterns
- `bouncing`: Gravity + elastic bounce
- `falling` / `dropping`: Falls from top
- `floating`: No gravity
- `colliding`: Objects collide
- `exploding`: Triggers particle effects

### Multi-Object
`two balls`, `three triangles`
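
Because the supported vocabulary is small, a quick pre-check can show which words in a prompt fall inside it before the model is called. The sketch below is an illustrative assumption and not part of the pipeline. The function name `describe_coverage` is hypothetical, the word sets are copied from the lists above, and the plural handling is deliberately naive.

```python
import re

# Vocabulary copied from the lists in this section
OBJECTS = {"ball", "square", "triangle"}
COLORS = {"red", "blue", "green", "yellow", "orange", "purple", "pink", "cyan", "white"}
MOTIONS = {"bouncing", "falling", "dropping", "floating", "colliding", "exploding"}


def describe_coverage(prompt: str) -> dict:
    """Report which supported words appear in `prompt` (hypothetical helper)."""
    words = re.findall(r"[a-z]+", prompt.lower())
    # Naive singularization so "triangles" matches "triangle"
    singular = {w[:-1] if w.endswith("s") else w for w in words}
    tokens = set(words) | singular
    return {
        "objects": sorted(tokens & OBJECTS),
        "colors": sorted(tokens & COLORS),
        "motions": sorted(tokens & MOTIONS),
    }


coverage = describe_coverage("two triangles colliding with each other and exploding")
print(coverage)
```

An empty `objects` or `motions` list suggests the prompt uses words the model was probably not trained on.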

## Required Files for GIF Generation

This model alone cannot generate GIFs. You need:

| File | Purpose |
|------|---------|
| `src/parser.py` | Integrates this model |
| `src/physics.py` | Physics simulation |
| `src/renderer.py` | GIF rendering |
| `src/pipeline.py` | Combines all components |
| `generate.py` | CLI interface |

## Training Details

- **Method**: LoRA fine-tuning
- **Target Modules**: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
- **Hardware**: CPU only (MacBook Pro)
- **Dataset**: 500 text-to-JSON examples
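
For readers who want to reproduce a similar setup, the reported LoRA hyperparameters can be collected into keyword arguments for `peft.LoraConfig`. This is a sketch rather than the author's training script. The rank, alpha, and target modules come from this card; `task_type` is an assumption, and settings the card does not report (dropout, learning rate, batch size) are left out rather than guessed.

```python
# LoRA settings reported in this card, collected as keyword arguments
lora_kwargs = {
    "r": 16,            # LoRA rank
    "lora_alpha": 32,   # LoRA alpha
    "target_modules": [
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

# Standard LoRA scales the low-rank update by lora_alpha / r
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)  # 2.0

# With peft installed, the same settings could be passed on as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)
```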

## License

Apache 2.0