NotoriousH2/gemma-3-1b-it-Math-SFT

2.04 GB

1 contributor

History: 6 commits

Latest commit: Update README with detailed data pipeline and reproduction steps (f6007ba, verified, 4 months ago)

| File | Size | Last commit message | Updated | Notes |
|---|---|---|---|---|
| .gitattributes | 1.57 kB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | |
| README.md | 16.6 kB | Update README with detailed data pipeline and reproduction steps | 4 months ago | |
| added_tokens.json | 35 Bytes | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | |
| chat_template.jinja | 1.53 kB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | |
| config.json | 1.6 kB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | |
| eval.py | 3.47 kB | Add eval.py | 4 months ago | |
| generation_config.json | 217 Bytes | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | |
| pytorch_model.bin | 2 GB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | xet; 3 pickle imports detected (listed below) |
| special_tokens_map.json | 548 Bytes | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | |
| tokenizer.json | 33.4 MB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | xet |
| tokenizer.model | 4.69 MB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | xet |
| tokenizer_config.json | 1.16 MB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | 4 months ago | |
| train_sft.py | 3.32 kB | Add train_sft.py | 4 months ago | |

Detected Pickle imports (3) in pytorch_model.bin:
- "collections.OrderedDict",
- "torch._utils._rebuild_tensor_v2",
- "torch.BFloat16Storage"