Hugging Face
NotoriousH2/gemma-3-1b-it-Math-SFT
Tags: PyTorch, NotoriousH2/HRM8K, Korean, gemma3_text, math, korean, sft, gemma, distillation
License: apache-2.0
main / gemma-3-1b-it-Math-SFT (2.04 GB)
1 contributor. History: 6 commits
Latest commit: NotoriousH2, "Update README with detailed data pipeline and reproduction steps" (f6007ba, verified), about 2 months ago
File | Scan | Size | Last commit message | Updated
.gitattributes | Safe | 1.57 kB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
README.md | Safe | 16.6 kB | Update README with detailed data pipeline and reproduction steps | about 2 months ago
added_tokens.json | Safe | 35 Bytes | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
chat_template.jinja | Safe | 1.53 kB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
config.json | Safe | 1.6 kB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
eval.py | Safe | 3.47 kB | Add eval.py | about 2 months ago
generation_config.json | Safe | 217 Bytes | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
pytorch_model.bin | pickle | 2 GB (xet) | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
special_tokens_map.json | Safe | 548 Bytes | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
tokenizer.json | Safe | 33.4 MB (xet) | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
tokenizer.model | Safe | 4.69 MB (xet) | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
tokenizer_config.json | Safe | 1.16 MB | SFT only (teacher distillation from Qwen3-30B). GSM8K avg ~44.9% | about 2 months ago
train_sft.py | Safe | 3.32 kB | Add train_sft.py | about 2 months ago

pytorch_model.bin: Detected Pickle imports (3): "collections.OrderedDict", "torch._utils._rebuild_tensor_v2", "torch.BFloat16Storage"