vikramlingam committed
Commit 3a656f2 · verified · 1 Parent(s): daeac25

Upload folder using huggingface_hub
.gitattributes CHANGED
@@ -33,3 +33,9 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+01_training_loss.png filter=lfs diff=lfs merge=lfs -text
+02_learning_rate.png filter=lfs diff=lfs merge=lfs -text
+03_gradient_norms.png filter=lfs diff=lfs merge=lfs -text
+05_object_distribution.png filter=lfs diff=lfs merge=lfs -text
+09_training_summary.png filter=lfs diff=lfs merge=lfs -text
+10_loss_zoom.png filter=lfs diff=lfs merge=lfs -text
01_training_loss.png ADDED

Git LFS Details

  • SHA256: 261c21283ee83b7e0a4dbfd5d13d361f27a7e9411da702ec44059e815c2b1036
  • Pointer size: 131 Bytes
  • Size of remote file: 134 kB
02_learning_rate.png ADDED

Git LFS Details

  • SHA256: 061bdda9cd8038aeb097877494c30d52b3290cc9507140141adedbec03067670
  • Pointer size: 131 Bytes
  • Size of remote file: 105 kB
03_gradient_norms.png ADDED

Git LFS Details

  • SHA256: da8784d5560ba898397b58c168960d0c6aaa13a2369249625c24dd0e8ad90dcd
  • Pointer size: 131 Bytes
  • Size of remote file: 231 kB
04_epoch_losses.png ADDED
05_object_distribution.png ADDED

Git LFS Details

  • SHA256: 639091cf925274d9db17a30301c3137db8b591ca5d19a33351c621347a1f3f20
  • Pointer size: 131 Bytes
  • Size of remote file: 119 kB
06_motion_distribution.png ADDED
07_epoch_times.png ADDED
08_convergence_analysis.png ADDED
09_training_summary.png ADDED

Git LFS Details

  • SHA256: e6643f14d94a6f89234a381b2ca82fda6ba268c0eed8c90ddcfe4018c48ba6b7
  • Pointer size: 131 Bytes
  • Size of remote file: 160 kB
10_loss_zoom.png ADDED

Git LFS Details

  • SHA256: ebb21fbd834661a28de9ff03e9ca086e8d2befe09f10cb24af01cd77cf1381d9
  • Pointer size: 131 Bytes
  • Size of remote file: 243 kB
11_physics_features.png ADDED
12_prompt_lengths.png ADDED
README.md CHANGED
---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-parsing
- scene-understanding
- physics-simulation
- smollm2
- lora
- fine-tuned
base_model: HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---

# PhysicsGIF-135M

**Natural Language to Scene Parser** — a fine-tuned 135M-parameter model that converts text descriptions into structured JSON scene specifications.

> ⚠️ **Note**: This model is the **text parsing component** of a larger physics-based GIF generation pipeline. It does NOT generate GIFs directly — it outputs structured JSON that is then processed by a separate physics engine and renderer.

![Training Loss](01_training_loss.png)

## What This Model Does

```
"a red ball bouncing to the right"


┌─────────────────────┐
│  PhysicsGIF-135M    │ ← THIS MODEL
│  (Text → JSON)      │
└──────────┬──────────┘


{
  "objects": [{"type": "ball", "color": "#FF0000"}],
  "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
  "canvas": {"size": 128, "frames": 40}
}
```

The JSON output is then processed by **separate Python code** (physics engine + renderer) to create the actual GIF.
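Downstream code has to trust this JSON before simulating it, so a small validation step is useful. A minimal sketch — the key names follow the example above, but the defaults and the helper itself are illustrative assumptions, not part of the released pipeline:

```python
import json

REQUIRED_TOP_LEVEL = ("objects", "motion", "canvas")

def load_scene(raw: str) -> dict:
    """Parse a scene specification and fail fast on missing sections."""
    scene = json.loads(raw)
    for key in REQUIRED_TOP_LEVEL:
        if key not in scene:
            raise ValueError(f"scene spec missing '{key}' section")
    # Fill in illustrative defaults for optional motion parameters.
    motion = scene["motion"]
    motion.setdefault("gravity", 0.3)
    motion.setdefault("bounce", 0.9)
    return scene

raw = '{"objects": [{"type": "ball", "color": "#FF0000"}], "motion": {"velocity": [3, 0]}, "canvas": {"size": 128, "frames": 40}}'
scene = load_scene(raw)
print(scene["motion"]["gravity"])  # → 0.3
```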

## 🎬 Example Outputs

| Prompt | Generated GIF |
|--------|---------------|
| "two triangles colliding with each other and exploding" | ![Triangles Exploding](output.gif) |
| "a pink ball dropping slowly from up" | ![Pink Ball Falling](output_1.gif) |

## 📊 Training Results

| Metric | Value |
|--------|-------|
| Base Model | SmolLM2-135M-Instruct |
| Training Examples | 500 |
| Epochs | 20 |
| Final Loss | 0.092 |
| Loss Reduction | 95.9% |
| Training Time | 42 minutes |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |

<details>
<summary>📈 Training Visualizations</summary>

#### Training Loss Curve
![Training Loss](01_training_loss.png)

#### Learning Rate Schedule
![Learning Rate](02_learning_rate.png)

#### Gradient Norms
![Gradient Norms](03_gradient_norms.png)

#### Per-Epoch Loss
![Epoch Losses](04_epoch_losses.png)

#### Dataset Distribution
![Object Distribution](05_object_distribution.png)
![Motion Distribution](06_motion_distribution.png)

#### Convergence Analysis
![Convergence](08_convergence_analysis.png)

</details>

## 🚀 Usage

### With the Full Pipeline (Recommended)

To generate actual GIFs, you need the complete pipeline code:

```bash
git clone https://github.com/your-username/PhysicsGIF
cd PhysicsGIF
pip install torch transformers peft pillow numpy tqdm

# Interactive mode - generates real GIFs
python generate.py
```

```
🎬 TRM-V2 Text-to-GIF Generator
Enter prompt: a red ball bouncing
Generating...
✓ Generated: output_1.gif
```

### Using This Model Directly (Text → JSON only)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-username/PhysicsGIF-135M")
tokenizer = AutoTokenizer.from_pretrained("your-username/PhysicsGIF-135M")

prompt = '''<|im_start|>system
You are a scene description parser. Convert text to JSON scene specification.<|im_end|>
<|im_start|>user
Convert to scene JSON: a red ball bouncing to the right<|im_end|>
<|im_start|>assistant
'''

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
result = tokenizer.decode(outputs[0])

# Output: JSON scene specification
# You need physics.py and renderer.py to convert this to a GIF
```
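Because `generate` echoes the prompt, the decoded string contains the whole chat transcript. A small helper can pull out and parse just the assistant's JSON reply — this is an illustrative sketch, not a function from the released pipeline:

```python
import json

def extract_scene_json(decoded: str) -> dict:
    """Pull the assistant's reply out of a ChatML-style transcript
    and parse it as a JSON scene specification."""
    # Take the text after the last assistant marker...
    reply = decoded.rsplit("<|im_start|>assistant", 1)[-1]
    # ...and stop at the end-of-turn marker if present.
    reply = reply.split("<|im_end|>", 1)[0]
    return json.loads(reply.strip())

transcript = """<|im_start|>assistant
{"objects": [{"type": "ball", "color": "#FF0000"}],
 "motion": {"velocity": [3, 0], "gravity": 0.3, "bounce": 0.9},
 "canvas": {"size": 128, "frames": 40}}<|im_end|>"""
scene = extract_scene_json(transcript)
print(scene["motion"]["bounce"])  # → 0.9
```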

## 🏗️ Full Pipeline Architecture

```
┌─────────────────────────────────────────────────────────────┐
│              Complete GIF Generation Pipeline               │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  User Input: "a red ball bouncing"                          │
│       │                                                     │
│       ▼                                                     │
│  ┌─────────────────────────────────────┐                    │
│  │  PhysicsGIF-135M (THIS MODEL)       │                    │
│  │  Fine-tuned LLM                     │                    │
│  │  Converts text → JSON DSL           │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  ┌─────────────────────────────────────┐                    │
│  │  physics.py (Python code)           │                    │
│  │  Newtonian physics simulation       │                    │
│  │  Calculates positions per frame     │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  ┌─────────────────────────────────────┐                    │
│  │  renderer.py (Python code)          │                    │
│  │  PIL-based frame rendering          │                    │
│  │  Saves as animated GIF              │                    │
│  └──────────────────┬──────────────────┘                    │
│                     │                                       │
│                     ▼                                       │
│  output.gif                                                 │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
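The actual `physics.py` is not included in this model repo. As a rough illustration of the kind of per-frame Newtonian update it performs on the JSON parameters (constant gravity plus a damped floor bounce — the function and its constants are assumptions for illustration only):

```python
def simulate(x, y, vx, vy, gravity=0.3, bounce=0.9, size=128, frames=40):
    """Toy per-frame position update: gravity plus an elastic floor bounce.
    Returns one (x, y) pair per frame."""
    positions = []
    for _ in range(frames):
        vy += gravity           # gravity accelerates the object downward
        x, y = x + vx, y + vy   # integrate velocity into position
        if y > size - 1:        # hit the floor: reflect and damp velocity
            y = size - 1
            vy = -vy * bounce
        positions.append((round(x, 2), round(y, 2)))
    return positions

frames = simulate(x=10, y=10, vx=3, vy=0)
```

A renderer would then draw the object at each `(x, y)` and save the frames as an animated GIF.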

## 🎯 What This Model Understands

### Objects
`ball`, `square`, `triangle`

### Colors
`red`, `blue`, `green`, `yellow`, `orange`, `purple`, `pink`, `cyan`, `white`

### Motion Patterns
- `bouncing` — Gravity + elastic bounce
- `falling` / `dropping` — Falls from top
- `floating` — No gravity
- `colliding` — Objects collide
- `exploding` — Triggers particle effects

### Multi-Object
`two balls`, `three triangles`

## 📁 Required Files for GIF Generation

This model alone cannot generate GIFs. You need:

| File | Purpose |
|------|---------|
| `src/parser.py` | Integrates this model |
| `src/physics.py` | Physics simulation |
| `src/renderer.py` | GIF rendering |
| `src/pipeline.py` | Combines all components |
| `generate.py` | CLI interface |

## 🔬 Training Details

- **Method**: LoRA fine-tuning
- **Target Modules**: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
- **Hardware**: CPU only (MacBook Pro)
- **Dataset**: 500 text-to-JSON examples

## 📜 License

Apache 2.0

## Citation

```bibtex
@misc{physicsgif2024,
  title={PhysicsGIF-135M: Text-to-Scene Parser for Physics-Based Animation},
  author={Your Name},
  year={2024},
  publisher={Hugging Face}
}
```
chat_template.jinja ADDED
{% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>
' }}{% endif %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}
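In pure Python, the string this ChatML template renders for a one-turn conversation looks like the following — a re-implementation for illustration; in practice `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)` produces it for you:

```python
DEFAULT_SYSTEM = ("<|im_start|>system\n"
                  "You are a helpful AI assistant named SmolLM, trained by Hugging Face<|im_end|>\n")

def render_chatml(messages, add_generation_prompt=True):
    """Mirror the Jinja template above: inject the default system turn when
    none is given, wrap each message in <|im_start|>/<|im_end|>, and
    optionally open the assistant turn."""
    out = "" if messages and messages[0]["role"] == "system" else DEFAULT_SYSTEM
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"
    return out

text = render_chatml([{"role": "user", "content": "Convert to scene JSON: a red ball bouncing"}])
print(text)
```

This matches the hand-written prompt in the README's usage example.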
config.json ADDED
{
  "architectures": [
    "LlamaForCausalLM"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "dtype": "float32",
  "eos_token_id": 2,
  "head_dim": 64,
  "hidden_act": "silu",
  "hidden_size": 576,
  "initializer_range": 0.041666666666666664,
  "intermediate_size": 1536,
  "is_llama_config": true,
  "max_position_embeddings": 8192,
  "mlp_bias": false,
  "model_type": "llama",
  "num_attention_heads": 9,
  "num_hidden_layers": 30,
  "num_key_value_heads": 3,
  "pad_token_id": 2,
  "pretraining_tp": 1,
  "rms_norm_eps": 1e-05,
  "rope_interleaved": false,
  "rope_scaling": null,
  "rope_theta": 100000,
  "tie_word_embeddings": true,
  "transformers.js_config": {
    "kv_cache_dtype": {
      "fp16": "float16",
      "q4f16": "float16"
    }
  },
  "transformers_version": "4.57.3",
  "use_cache": true,
  "vocab_size": 49152
}
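As a sanity check, these dimensions do account for the "135M" in the model name. Rough arithmetic from the config (layer norms are negligible and the output head is tied to the embeddings):

```python
hidden, inter, layers, vocab = 576, 1536, 30, 49152
heads, kv_heads, head_dim = 9, 3, 64

embed = vocab * hidden                               # token embeddings (tied LM head)
attn = (hidden * heads * head_dim                    # q_proj
        + hidden * kv_heads * head_dim * 2           # k_proj + v_proj (grouped-query)
        + heads * head_dim * hidden)                 # o_proj
mlp = 3 * hidden * inter                             # gate, up, down projections
total = embed + layers * (attn + mlp)
print(f"{total / 1e6:.1f}M parameters")  # → 134.5M parameters
```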
generation_config.json ADDED
{
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 2,
  "transformers_version": "4.57.3"
}
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:cff55b21f2bccf07f13d7bcfd60458c795f341f435ea83ea71d501e5434a1310
size 538090408
output.gif ADDED
output_1.gif ADDED
special_tokens_map.json ADDED
{
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "bos_token": {
    "content": "<|im_start|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<|im_end|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<|endoftext|>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": { "content": "<|endoftext|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "1": { "content": "<|im_start|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "2": { "content": "<|im_end|>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "3": { "content": "<repo_name>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "4": { "content": "<reponame>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "5": { "content": "<file_sep>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "6": { "content": "<filename>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "7": { "content": "<gh_stars>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "8": { "content": "<issue_start>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "9": { "content": "<issue_comment>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "10": { "content": "<issue_closed>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "11": { "content": "<jupyter_start>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "12": { "content": "<jupyter_text>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "13": { "content": "<jupyter_code>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "14": { "content": "<jupyter_output>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "15": { "content": "<jupyter_script>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true },
    "16": { "content": "<empty_output>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false, "special": true }
  },
  "additional_special_tokens": [
    "<|im_start|>",
    "<|im_end|>"
  ],
  "bos_token": "<|im_start|>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "extra_special_tokens": {},
  "model_max_length": 8192,
  "pad_token": "<|im_end|>",
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>",
  "vocab_size": 49152
}
training_config.json ADDED
{
  "base_model_name_or_path": "HuggingFaceTB/SmolLM2-135M-Instruct",
  "lora_r": 16,
  "lora_alpha": 32,
  "epochs": 20,
  "batch_size": 2,
  "learning_rate": 0.0002,
  "grad_accum": 4,
  "num_examples": 500,
  "training_time_seconds": 2538.535966873169,
  "seed": 42
}
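These numbers pin down the optimizer schedule: with `batch_size` 2 and `grad_accum` 4 the effective batch is 8, so 500 examples give about 63 optimizer steps per epoch and roughly 1,260 steps over 20 epochs (exact counts depend on how the trainer handles the last partial batch):

```python
import math

num_examples, batch_size, grad_accum, epochs = 500, 2, 4, 20

effective_batch = batch_size * grad_accum          # examples per optimizer step
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs
print(effective_batch, steps_per_epoch, total_steps)  # → 8 63 1260
```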
vocab.json ADDED
The diff for this file is too large to render. See raw diff