vikramlingam committed
Commit 3a656f2 · verified · 1 Parent(s): daeac25

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,9 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+01_training_loss.png filter=lfs diff=lfs merge=lfs -text
+02_learning_rate.png filter=lfs diff=lfs merge=lfs -text
+03_gradient_norms.png filter=lfs diff=lfs merge=lfs -text
+05_object_distribution.png filter=lfs diff=lfs merge=lfs -text
+09_training_summary.png filter=lfs diff=lfs merge=lfs -text
+10_loss_zoom.png filter=lfs diff=lfs merge=lfs -text
01_training_loss.png ADDED

Git LFS Details

  • SHA256: 261c21283ee83b7e0a4dbfd5d13d361f27a7e9411da702ec44059e815c2b1036
  • Pointer size: 131 Bytes
  • Size of remote file: 134 kB
02_learning_rate.png ADDED

Git LFS Details

  • SHA256: 061bdda9cd8038aeb097877494c30d52b3290cc9507140141adedbec03067670
  • Pointer size: 131 Bytes
  • Size of remote file: 105 kB
03_gradient_norms.png ADDED

Git LFS Details

  • SHA256: da8784d5560ba898397b58c168960d0c6aaa13a2369249625c24dd0e8ad90dcd
  • Pointer size: 131 Bytes
  • Size of remote file: 231 kB
04_epoch_losses.png ADDED
05_object_distribution.png ADDED

Git LFS Details

  • SHA256: 639091cf925274d9db17a30301c3137db8b591ca5d19a33351c621347a1f3f20
  • Pointer size: 131 Bytes
  • Size of remote file: 119 kB
06_motion_distribution.png ADDED
07_epoch_times.png ADDED
08_convergence_analysis.png ADDED
09_training_summary.png ADDED

Git LFS Details

  • SHA256: e6643f14d94a6f89234a381b2ca82fda6ba268c0eed8c90ddcfe4018c48ba6b7
  • Pointer size: 131 Bytes
  • Size of remote file: 160 kB
10_loss_zoom.png ADDED

Git LFS Details

  • SHA256: ebb21fbd834661a28de9ff03e9ca086e8d2befe09f10cb24af01cd77cf1381d9
  • Pointer size: 131 Bytes
  • Size of remote file: 243 kB
11_physics_features.png ADDED
12_prompt_lengths.png ADDED
README.md CHANGED
---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-parsing
- scene-understanding
- physics-simulation
- smollm2
- lora
- fine-tuned
base_model: HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---

# PhysicsGIF-135M

**Natural Language to Scene Parser** — a fine-tuned 135M-parameter model that converts text descriptions into structured JSON scene specifications.

> ⚠️ **Note**: This model is the **text parsing component** of a larger physics-based GIF generation pipeline. It does NOT generate GIFs directly — it outputs structured JSON that is then processed by a separate physics engine and renderer.

![Training Loss](01_training_loss.png)

## What This Model Does

```
"a red ball bouncing to the right"


┌─────────────────────┐
│  PhysicsGIF-135M    │ ← THIS MODEL
│  (Text → JSON)      │
└──────────┬──────────┘


{
  "objects": [{"type": "ball", "color": "#FF0000"}],
  "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
  "canvas": {"size": 128, "frames": 40}
}
```

The JSON output is then processed by **separate Python code** (physics engine + renderer) to create the actual GIF.
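Downstream code has to trust this JSON before simulating it, so a small validation step is useful. A minimal sketch — the key names follow the example above, but the defaults and the helper itself are illustrative assumptions, not part of the released pipeline:

```python
import json

REQUIRED_TOP_LEVEL = ("objects", "motion", "canvas")

def load_scene(raw: str) -> dict:
    """Parse a scene specification and fail fast on missing sections."""
    scene = json.loads(raw)
    for key in REQUIRED_TOP_LEVEL:
        if key not in scene:
            raise ValueError(f"scene spec missing '{key}' section")
    # Fill in illustrative defaults for optional motion parameters.
    motion = scene["motion"]
    motion.setdefault("gravity", 0.3)
    motion.setdefault("bounce", 0.9)
    return scene

raw = '{"objects": [{"type": "ball", "color": "#FF0000"}], "motion": {"velocity": [3, 0]}, "canvas": {"size": 128, "frames": 40}}'
scene = load_scene(raw)
print(scene["motion"]["gravity"])  # → 0.3
```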

## 🎬 Example Outputs

| Prompt | Generated GIF |
|--------|---------------|
| "two triangles colliding with each other and exploding" | ![Triangles Exploding](output.gif) |
| "a pink ball dropping slowly from up" | ![Pink Ball Falling](output_1.gif) |

## 📊 Training Results

| Metric | Value |
|--------|-------|
| Base Model | SmolLM2-135M-Instruct |
| Training Examples | 500 |
| Epochs | 20 |
| Final Loss | 0.092 |
| Loss Reduction | 95.9% |
| Training Time | 42 minutes |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |

<details>
<summary>📈 Training Visualizations</summary>

#### Training Loss Curve
![Training Loss](01_training_loss.png)

#### Learning Rate Schedule
![Learning Rate](02_learning_rate.png)

#### Gradient Norms
![Gradient Norms](03_gradient_norms.png)

#### Per-Epoch Loss
![Epoch Losses](04_epoch_losses.png)

#### Dataset Distribution
![Object Distribution](05_object_distribution.png)
![Motion Distribution](06_motion_distribution.png)

#### Convergence Analysis
![Convergence](08_convergence_analysis.png)

</details>

## 🚀 Usage

### With the Full Pipeline (Recommended)

To generate actual GIFs, you need the complete pipeline code:

```bash
git clone https://github.com/your-username/PhysicsGIF
cd PhysicsGIF
pip install torch transformers peft pillow numpy tqdm

# Interactive mode - generates real GIFs
python generate.py
```

```
🎬 TRM-V2 Text-to-GIF Generator
Enter prompt: a red ball bouncing
Generating...
✓ Generated: output_1.gif
```

### Using This Model Directly (Text → JSON only)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-username/PhysicsGIF-135M")
tokenizer = AutoTokenizer.from_pretrained("your-username/PhysicsGIF-135M")

prompt = '''<|im_start|>system
You are a scene description parser. Convert text to JSON scene specification.<|im_end|>
<|im_start|>user
Convert to scene JSON: a red ball bouncing to the right<|im_end|>
<|im_start|>assistant
'''

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
result = tokenizer.decode(outputs[0])

# Output: JSON scene specification
# You need physics.py and renderer.py to convert this to a GIF
```
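Because `generate` echoes the prompt, the decoded string contains the whole chat transcript. A small helper can pull out and parse just the assistant's JSON reply — this is an illustrative sketch, not a function from the released pipeline:

```python
import json

def extract_scene_json(decoded: str) -> dict:
    """Pull the assistant's reply out of a ChatML-style transcript
    and parse it as a JSON scene specification."""
    # Take the text after the last assistant marker...
    reply = decoded.rsplit("<|im_start|>assistant", 1)[-1]
    # ...and stop at the end-of-turn marker if present.
    reply = reply.split("<|im_end|>", 1)[0]
    return json.loads(reply.strip())

transcript = """<|im_start|>assistant
{"objects": [{"type": "ball", "color": "#FF0000"}],
 "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
 "canvas": {"size": 128, "frames": 40}}<|im_end|>"""
scene = extract_scene_json(transcript)
print(scene["motion"]["bounce"])  # → 0.9
```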

## 🏗️ Full Pipeline Architecture

```
┌─────────────────────────────────────────────────────────────┐
│              Complete GIF Generation Pipeline               │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  User Input: "a red ball bouncing"                          │
│       │                                                     │
│       ▼                                                     │
│  ┌─────────────────────────────────────┐                    │
│  │  PhysicsGIF-135M (THIS MODEL)       │                    │
│  │  Fine-tuned LLM                     │                    │
│  │  Converts text → JSON DSL           │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  ┌─────────────────────────────────────┐                    │
│  │  physics.py (Python code)           │                    │
│  │  Newtonian physics simulation       │                    │
│  │  Calculates positions per frame     │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  ┌─────────────────────────────────────┐                    │
│  │  renderer.py (Python code)          │                    │
│  │  PIL-based frame rendering          │                    │
│  │  Saves as animated GIF              │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  output.gif                                                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
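The actual `physics.py` is not included in this model repo. As a rough illustration of the kind of per-frame Newtonian update it performs on the JSON parameters (constant gravity plus a damped floor bounce — the function and its constants are assumptions for illustration only):

```python
def simulate(x, y, vx, vy, gravity=0.3, bounce=0.9, size=128, frames=40):
    """Toy per-frame position update: gravity plus an elastic floor bounce.
    Returns one (x, y) pair per frame."""
    positions = []
    for _ in range(frames):
        vy += gravity           # gravity accelerates the object downward
        x, y = x + vx, y + vy   # integrate velocity into position
        if y > size - 1:        # hit the floor: reflect and damp velocity
            y = size - 1
            vy = -vy * bounce
        positions.append((round(x, 2), round(y, 2)))
    return positions

frames = simulate(x=10, y=10, vx=3, vy=0)
```

A renderer would then draw the object at each `(x, y)` and save the frames as an animated GIF.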

## 🎯 What This Model Understands

### Objects
`ball`, `square`, `triangle`

### Colors
`red`, `blue`, `green`, `yellow`, `orange`, `purple`, `pink`, `cyan`, `white`

### Motion Patterns
- `bouncing` — Gravity + elastic bounce
- `falling` / `dropping` — Falls from top
- `floating` — No gravity
- `colliding` — Objects collide
- `exploding` — Triggers particle effects

### Multi-Object
`two balls`, `three triangles`

## 📁 Required Files for GIF Generation

This model alone cannot generate GIFs. You need:

| File | Purpose |
|------|---------|
| `src/parser.py` | Integrates this model |
| `src/physics.py` | Physics simulation |
| `src/renderer.py` | GIF rendering |
| `src/pipeline.py` | Combines all components |
| `generate.py` | CLI interface |

## 🔬 Training Details

- **Method**: LoRA fine-tuning
- **Target Modules**: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
- **Hardware**: CPU only (MacBook Pro)
- **Dataset**: 500 text-to-JSON examples

## 📜 License

Apache 2.0

## Citation

```bibtex
@misc{physicsgif2024,
  title={PhysicsGIF-135M: Text-to-Scene Parser for Physics-Based Animation},
  author={Your Name},
  year={2024},
  publisher={Hugging Face}
}
```
chat_template.jinja ADDED
{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
' }}{% endif %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
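In pure Python, the string this ChatML template renders for a one-turn conversation looks like the following — a re-implementation for illustration; in practice `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)` produces it for you:

```python
DEFAULT_SYSTEM = ("<|im_start|>system\n"
                  "You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>\n")

def render_chatml(messages, add_generation_prompt=True):
    """Mirror the Jinja template above: inject the default system turn when
    none is given, wrap each message in <|im_start|>/<|im_end|>, and
    optionally open the assistant turn."""
    out = "" if messages and messages[0]["role"] == "system" else DEFAULT_SYSTEM
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"
    return out

text = render_chatml([{"role": "user", "content": "Convert to scene JSON: a red ball bouncing"}])
print(text)
```

This matches the hand-written prompt in the README's usage example.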
config.json ADDED
{
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "dtype": "float32",
  "eos_token_id": 2,
  "head_dim": 64,
  "hidden_act": "silu",
  "hidden_size": 576,
  "initializer_range": 0.041666666666666664,
  "intermediate_size": 1536,
  "is_llama_config": true,
  "max_position_embeddings": 8192,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 9,
  "num_hidden_layers": 30,
  "num_key_value_heads": 3,
  "pad_token_id": 2,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_interleaved": false,
  "rope_scaling": null,
  "rope_theta": 100000,
  "tie_word_embeddings": true,
  "transformers.js_config": {
    "kv_cache_dtype": {
      "fp16": "float16",
      "q4f16": "float16"
    }
  },
  "transformers_version": "4.57.3",
  "use_cache": true,
  "vocab_size": 49152
}
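As a sanity check, these dimensions do account for the "135M" in the model name. Rough arithmetic from the config (layer norms are negligible and the output head is tied to the embeddings):

```python
hidden, inter, layers, vocab = 576, 1536, 30, 49152
heads, kv_heads, head_dim = 9, 3, 64

embed = vocab * hidden                               # token embeddings (tied LM head)
attn = (hidden * heads * head_dim                    # q_proj
        + hidden * kv_heads * head_dim * 2           # k_proj + v_proj (grouped-query)
        + heads * head_dim * hidden)                 # o_proj
mlp = 3 * hidden * inter                             # gate, up, down projections
total = embed + layers * (attn + mlp)
print(f"{total / 1e6:.1f}M parameters")  # → 134.5M parameters
```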
generation_config.json ADDED
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 2,
  "transformers_version": "4.57.3"
}
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:cff55b21f2bccf07f13d7bcfd60458c795f341f435ea83ea71d501e5434a1310
size 538090408
output.gif ADDED
output_1.gif ADDED
special_tokens_map.json ADDED
{
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "bos_token": {
    "content": "<|im_start|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": { "content": "<|endoftext|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "1": { "content": "<|im_start|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "2": { "content": "<|im_end|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "3": { "content": "<repo_name>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "4": { "content": "<reponame>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "5": { "content": "<file_sep>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "6": { "content": "<filename>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "7": { "content": "<gh_stars>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "8": { "content": "<issue_start>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "9": { "content": "<issue_comment>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "10": { "content": "<issue_closed>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "11": { "content": "<jupyter_start>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "12": { "content": "<jupyter_text>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "13": { "content": "<jupyter_code>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "14": { "content": "<jupyter_output>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "15": { "content": "<jupyter_script>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "16": { "content": "<empty_output>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "bos_token": "<|im_start|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "extra_special_tokens": {},
  "model_max_length": 8192,
  "pad_token": "<|im_end|>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>",
  "vocab_size": 49152
}
training_config.json ADDED
{
  "base_model_name_or_path": "HuggingFaceTB/SmolLM2-135M-Instruct",
  "lora_r": 16,
  "lora_alpha": 32,
  "epochs": 20,
  "batch_size": 2,
  "learning_rate": 0.0002,
  "grad_accum": 4,
  "num_examples": 500,
  "training_time_seconds": 2538.535966873169,
  "seed": 42
}
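These numbers pin down the optimizer schedule: with `batch_size` 2 and `grad_accum` 4 the effective batch is 8, so 500 examples give about 63 optimizer steps per epoch and roughly 1,260 steps over 20 epochs (exact counts depend on how the trainer handles the last partial batch):

```python
import math

num_examples, batch_size, grad_accum, epochs = 500, 2, 4, 20

effective_batch = batch_size * grad_accum          # examples per optimizer step
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs
print(effective_batch, steps_per_epoch, total_steps)  # → 8 63 1260
```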
vocab.json ADDED
The diff for this file is too large to render. See raw diff