TobiasLogic/ZeroShot-500M
Tags: Text Generation, PyTorch, 4 datasets, English, gpt2, from-scratch, decoder-only, transformer, zeroshot, llm-training, bfloat16, flash-attention
License: mit
Branch: main (ZeroShot-500M, 6.47 GB)
2 contributors. History: 16 commits.
Latest commit: Update README.md (2a4d19d, verified) by TobiasLogic, 2 days ago
Files:
.gitattributes (Safe, 1.69 kB): Upload loss_curve_sft.png with huggingface_hub, 2 days ago
README.md (4.36 kB): Update README.md, 2 days ago
ckpt_base_final.pt (pickle, 2.16 GB, xet): Upload ckpt_base_final.pt with huggingface_hub, 3 days ago
    Detected Pickle imports (3): "torch.FloatStorage", "collections.OrderedDict", "torch._utils._rebuild_tensor_v2"
ckpt_mid_final.pt (pickle, 2.16 GB, xet): Upload ckpt_mid_final.pt with huggingface_hub, 2 days ago
    Detected Pickle imports (3): "collections.OrderedDict", "torch.FloatStorage", "torch._utils._rebuild_tensor_v2"
ckpt_sft_final.pt (pickle, 2.16 GB, xet): Upload ckpt_sft_final.pt with huggingface_hub, 2 days ago
    Detected Pickle imports (3): "torch.FloatStorage", "collections.OrderedDict", "torch._utils._rebuild_tensor_v2"
config.json (Safe, 163 Bytes): Create config.json, 2 days ago
loss_base.json (Safe, 130 kB): Upload loss_base.json with huggingface_hub, 3 days ago
loss_curve_base.png (150 kB, xet): Upload loss_curve_base.png with huggingface_hub, 3 days ago
loss_curve_mid.png (177 kB, xet): Upload loss_curve_mid.png with huggingface_hub, 2 days ago
loss_curve_sft.png (216 kB, xet): Upload loss_curve_sft.png with huggingface_hub, 2 days ago
loss_mid.json (Safe, 21.6 kB): Upload loss_mid.json with huggingface_hub, 2 days ago
loss_sft.json (Safe, 8.62 kB): Upload loss_sft.json with huggingface_hub, 2 days ago
train.py (Safe, 35.1 kB): Upload train.py with huggingface_hub, 3 days ago
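
The "Detected Pickle imports" entries above name the module-level objects a pickle stream asks the loader to import. As a rough illustration of how such a list could be produced, here is a minimal sketch that walks the opcodes of a raw pickle byte string using only the Python standard library. It is not the Hub's scanner; the function name `list_pickle_imports` is hypothetical, the handling of `STACK_GLOBAL` is a simple heuristic that ignores memoized strings, and it does not unpack any archive that a checkpoint file may wrap around its pickle data.

```python
import collections
import io
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> set:
    """Return the 'module.name' globals referenced by a raw pickle stream.

    The stream is only disassembled, never unpickled, so no code runs.
    """
    imports = set()
    recent_strings = []  # string arguments seen so far (used for STACK_GLOBAL)
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" separated by a space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name were pushed as strings just before.
            # Heuristic: take the two most recent string arguments.
            if len(recent_strings) >= 2:
                module, name = recent_strings[-2:]
                imports.add(f"{module}.{name}")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return imports


if __name__ == "__main__":
    # Toy payload standing in for a checkpoint's state dict container.
    payload = pickle.dumps(collections.OrderedDict(a=1), protocol=2)
    print(sorted(list_pickle_imports(payload)))
```

Disassembling rather than loading is the point of the design: the imports can be reviewed before deciding whether to unpickle the file at all.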