LisaMegaWatts/SymbioGPT-10M
Tags: PyTorch, English, symbiogenesis, multi-organelle, monarch-mixer, philosophy
License: mit
Branch: main
Path: SymbioGPT-10M/data (1.38 GB)
1 contributor
History: 4 commits
Latest commit: LisaMegaWatts, 588ff7f (verified), about 2 months ago
  Add 2MB val text sample for Gemma/HF tokenizer notebooks
train_curated.txt.tokens.pt (1.06 GB, xet)
  Scan: Safe (pickle)
  Detected pickle imports (3): "torch._utils._rebuild_tensor_v2", "collections.OrderedDict", "torch.IntStorage"
  Last commit: Add curated training tokens (266M tokens, Chinchilla-optimal), about 2 months ago
train_curated_sample.txt (20 MB, xet)
  Last commit: Add 20MB raw text sample for Gemma/HF tokenizer notebooks, about 2 months ago
val.txt.tokens.pt (289 MB, xet)
  Scan: Safe (pickle)
  Detected pickle imports (3): "torch._utils._rebuild_tensor_v2", "torch.IntStorage", "collections.OrderedDict"
  Last commit: Add validation tokens (72M tokens), about 2 months ago
val_sample.txt (2 MB)
  Scan: Safe
  Last commit: Add 2MB val text sample for Gemma/HF tokenizer notebooks, about 2 months ago