PhysicsGIF-135M
Natural Language to Scene Parser: A fine-tuned 135M-parameter model that converts text descriptions into structured JSON scene specifications.
Note: This model is the text-parsing component of a larger physics-based GIF generation pipeline. It does NOT generate GIFs directly; it outputs structured JSON that is then processed by a separate physics engine and renderer.
What This Model Does
"a red ball bouncing to the right"
           │
           ▼
┌─────────────────────┐
│   PhysicsGIF-135M   │ ← THIS MODEL
│    (Text → JSON)    │
└──────────┬──────────┘
           │
           ▼
{
"objects": [{"type": "ball", "color": "#FF0000"}],
"motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
"canvas": {"size": 128, "frames": 40}
}
The JSON output is then processed by separate Python code (physics engine + renderer) to create the actual GIF.
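Before the JSON is handed to the physics engine, it can be useful to check that it has the expected shape. The following is a minimal sketch, assuming the three top-level keys from the example above (`objects`, `motion`, `canvas`) are always present. `validate_scene` is a hypothetical helper for illustration and is not part of the repository.

```python
import json

# Top-level keys seen in the example output above (assumed to be required).
REQUIRED_KEYS = ("objects", "motion", "canvas")


def validate_scene(raw: str) -> dict:
    """Parse a scene JSON string and check its basic structure (hypothetical helper)."""
    scene = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(scene, dict):
        raise ValueError("scene must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in scene]
    if missing:
        raise ValueError(f"scene is missing keys: {missing}")
    if not isinstance(scene["objects"], list) or not scene["objects"]:
        raise ValueError("'objects' must be a non-empty list")
    velocity = scene["motion"].get("velocity")
    if not (isinstance(velocity, list) and len(velocity) == 2):
        raise ValueError("'motion.velocity' must be a two-element list")
    return scene


example = """
{
  "objects": [{"type": "ball", "color": "#FF0000"}],
  "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
  "canvas": {"size": 128, "frames": 40}
}
"""
scene = validate_scene(example)
print(scene["objects"][0]["type"], scene["canvas"]["frames"])
```

Rejecting malformed output early keeps parsing errors separate from simulation errors further down the pipeline.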
Example Outputs
| Prompt | Generated GIF |
|---|---|
| "two triangles colliding with each other and exploding" | (GIF not shown) |
| "a pink ball dropping slowly from up" | (GIF not shown) |
Training Results
| Metric | Value |
|---|---|
| Base Model | SmolLM2-135M-Instruct |
| Training Examples | 500 |
| Epochs | 20 |
| Final Loss | 0.092 |
| Loss Reduction | 95.9% |
| Training Time | 42 minutes |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
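A few derived quantities follow from the table with simple arithmetic. The sketch below assumes that "Loss Reduction" means (initial loss − final loss) / initial loss and that the 42 minutes cover all 20 epochs. Neither assumption is stated in the table, so the results are rough estimates only.

```python
# Values copied from the Training Results table.
final_loss = 0.092
loss_reduction = 0.959      # 95.9%
examples, epochs = 500, 20
training_minutes = 42

# Assumption: reduction = (initial - final) / initial, so initial = final / (1 - reduction).
initial_loss = final_loss / (1.0 - loss_reduction)

# Total number of example passes over the whole run.
example_passes = examples * epochs

# Rough wall-clock cost per example pass (includes any overhead in the 42 minutes).
seconds_per_example = training_minutes * 60 / example_passes

print(f"implied initial loss: {initial_loss:.2f}")          # 2.24
print(f"example passes:       {example_passes}")            # 10000
print(f"seconds per example:  {seconds_per_example:.2f}")   # 0.25
```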
Training Visualizations
[Figures: Training Loss Curve, Learning Rate Schedule, Gradient Norms, Per-Epoch Loss, Dataset Distribution, Convergence Analysis]
Usage
With the Full Pipeline (Recommended)
To generate actual GIFs, you need the complete pipeline code:
git clone https://github.com/vikramlingam/PhysicsGIF-135M
cd PhysicsGIF-135M
pip install torch transformers peft pillow numpy tqdm
# Interactive mode - generates real GIFs
python generate.py
PhysicsGIF Text-to-GIF Generator
Enter prompt: a red ball bouncing
Generating...
Generated: output_1.gif
Using This Model Directly (Text → JSON only)
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("vikramlingam/PhysicsGIF-135M")
tokenizer = AutoTokenizer.from_pretrained("vikramlingam/PhysicsGIF-135M")

# ChatML-style prompt: system instruction, user request, then an open assistant turn
prompt = '''<|im_start|>system
You are a scene description parser. Convert text to JSON scene specification.<|im_end|>
<|im_start|>user
Convert to scene JSON: a red ball bouncing to the right<|im_end|>
<|im_start|>assistant
'''
inputs = tokenizer(prompt, return_tensors="pt")
# Greedy decoding (do_sample=False) keeps the output deterministic
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens, dropping the prompt and special tokens
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
result = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(result)  # JSON scene specification
# You need physics.py and renderer.py to convert this to a GIF
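The decoded text can contain more than the JSON object itself, for example the prompt, chat-template tokens, or trailing text. The sketch below shows one way to pull out the first valid JSON object using only the standard library. `extract_first_json` is a hypothetical helper, and the `decoded` string is an illustrative stand-in for real model output.

```python
import json


def extract_first_json(text):
    """Return the first JSON object found in text, or None (hypothetical helper)."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


# Illustrative stand-in for decoded model output (not a captured log).
decoded = (
    "<|im_start|>assistant\n"
    '{"objects": [{"type": "ball", "color": "#FF0000"}], '
    '"motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9}, '
    '"canvas": {"size": 128, "frames": 40}}<|im_end|>'
)
scene = extract_first_json(decoded)
print(scene["motion"]["velocity"])
```

Scanning from each opening brace tolerates leading tokens without relying on a regular expression, which would struggle with nested braces.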
Full Pipeline Architecture
┌───────────────────────────────────────────────────────────┐
│             Complete GIF Generation Pipeline              │
├───────────────────────────────────────────────────────────┤
│                                                           │
│  User Input: "a red ball bouncing"                        │
│                             │                             │
│                             ▼                             │
│          ┌─────────────────────────────────────┐          │
│          │  PhysicsGIF-135M (THIS MODEL)       │          │
│          │  Fine-tuned LLM                     │          │
│          │  Converts text → JSON DSL           │          │
│          └──────────────────┬──────────────────┘          │
│                             │                             │
│                             ▼                             │
│          ┌─────────────────────────────────────┐          │
│          │  physics.py (Python code)           │          │
│          │  Newtonian physics simulation       │          │
│          │  Calculates positions per frame     │          │
│          └──────────────────┬──────────────────┘          │
│                             │                             │
│                             ▼                             │
│          ┌─────────────────────────────────────┐          │
│          │  renderer.py (Python code)          │          │
│          │  PIL-based frame rendering          │          │
│          │  Saves as animated GIF              │          │
│          └──────────────────┬──────────────────┘          │
│                             │                             │
│                             ▼                             │
│                         output.gif                        │
│                                                           │
└───────────────────────────────────────────────────────────┘
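To make the last two stages concrete, here is a minimal sketch of a per-frame simulation and a PIL renderer for a single ball. It is not the repository's `physics.py` or `renderer.py`. The interpretation of the JSON fields (velocity in pixels per frame, gravity added to the vertical velocity each frame, `bounce` as a restitution factor at the canvas edges), the start position, the radius, the background color, and the frame duration are all assumptions.

```python
# Scene copied from the example output earlier in this card.
SCENE = {
    "objects": [{"type": "ball", "color": "#FF0000"}],
    "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
    "canvas": {"size": 128, "frames": 40},
}


def simulate(scene, radius=8.0):
    """Return one (x, y) position per frame for a single ball (assumed semantics)."""
    size = scene["canvas"]["size"]
    vx, vy = (float(v) for v in scene["motion"]["velocity"])
    gravity = scene["motion"]["gravity"]
    bounce = scene["motion"]["bounce"]
    x = y = size / 2  # assumption: start at the canvas center
    positions = []
    for _ in range(scene["canvas"]["frames"]):
        positions.append((x, y))
        vy += gravity  # gravity accelerates the ball downward (y grows downward)
        x += vx
        y += vy
        # Reflect off the canvas edges, losing energy according to `bounce`.
        if x < radius:
            x, vx = radius, -vx * bounce
        elif x > size - radius:
            x, vx = size - radius, -vx * bounce
        if y < radius:
            y, vy = radius, -vy * bounce
        elif y > size - radius:
            y, vy = size - radius, -vy * bounce
    return positions


def render_gif(scene, positions, path="output.gif", radius=8):
    """Draw each position as a filled circle and save an animated GIF with Pillow."""
    from PIL import Image, ImageDraw  # imported lazily so simulate() has no dependencies

    size = scene["canvas"]["size"]
    color = scene["objects"][0]["color"]
    frames = []
    for x, y in positions:
        frame = Image.new("RGB", (size, size), "black")  # assumed background color
        box = [x - radius, y - radius, x + radius, y + radius]
        ImageDraw.Draw(frame).ellipse(box, fill=color)
        frames.append(frame)
    # duration is milliseconds per frame; loop=0 repeats forever
    frames[0].save(path, save_all=True, append_images=frames[1:], duration=50, loop=0)


positions = simulate(SCENE)
# render_gif(SCENE, positions)  # uncomment to write output.gif (requires Pillow)
```

Keeping the simulation free of drawing code means the positions can be inspected or tested without an image library installed.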
What This Model Understands
Objects
ball, square, triangle
Colors
red, blue, green, yellow, orange, purple, pink, cyan, white
Motion Patterns
- bouncing → Gravity + elastic bounce
- falling/dropping → Falls from top
- floating → No gravity
- colliding → Objects collide
- exploding → Triggers particle effects
Multi-Object
two balls, three triangles
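The vocabulary above is small enough to enumerate, which gives a simple way to spot-check the model across its supported inputs. The sketch below builds one prompt per color, object, and single-object motion, using the "Convert to scene JSON:" prefix from the usage example. `build_prompts` is a hypothetical helper, and leaving out `colliding` and `exploding` (which presumably involve several objects or effects) is a choice made for this illustration.

```python
from itertools import product

# Vocabulary copied from the lists above.
OBJECTS = ["ball", "square", "triangle"]
COLORS = ["red", "blue", "green", "yellow", "orange", "purple", "pink", "cyan", "white"]
SINGLE_OBJECT_MOTIONS = ["bouncing", "falling", "dropping", "floating"]


def build_prompts():
    """Return one user prompt per (color, object, motion) combination."""
    prompts = []
    for color, obj, motion in product(COLORS, OBJECTS, SINGLE_OBJECT_MOTIONS):
        article = "an" if color[0] in "aeiou" else "a"  # "an orange ball"
        prompts.append(f"Convert to scene JSON: {article} {color} {obj} {motion}")
    return prompts


prompts = build_prompts()
print(len(prompts))   # 108 combinations (9 colors x 3 objects x 4 motions)
print(prompts[0])     # Convert to scene JSON: a red ball bouncing
```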
Required Files for GIF Generation
This model alone cannot generate GIFs. You need:
| File | Purpose |
|---|---|
| src/parser.py | Integrates this model |
| src/physics.py | Physics simulation |
| src/renderer.py | GIF rendering |
| src/pipeline.py | Combines all components |
| generate.py | CLI interface |
Training Details
- Method: LoRA fine-tuning
- Target Modules: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
- Hardware: CPU only (MacBook Pro)
- Dataset: 500 text-to-JSON examples
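The LoRA settings reported in this card (rank 16, alpha 32, and the seven target modules above) could be expressed with the `peft` library roughly as follows. This is a sketch, not the author's training script: `task_type` is an assumption, and dropout, learning rate, and other hyperparameters are not stated in this card, so they are left at library defaults.

```python
# LoRA hyperparameters as reported in this model card.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

# With peft's default scaling, the low-rank update is multiplied by lora_alpha / r.
scaling = LORA_SETTINGS["lora_alpha"] / LORA_SETTINGS["r"]


def build_lora_config():
    """Create a peft LoraConfig from the settings above (requires `pip install peft`)."""
    from peft import LoraConfig  # imported lazily so the settings can be read without peft

    return LoraConfig(**LORA_SETTINGS)


print(f"LoRA scaling factor: {scaling}")  # 2.0
```

Targeting all attention and MLP projections adapts every linear layer in each transformer block, which is a common choice when the base model is small.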
License
Apache 2.0