LisaMegaWatts/JuliaFluxGPT
Tags: Text Generation, LisaMegaWatts/philosophy-corpus, English, flux, julia, flux-jl, llama-style, gqa, grouped-query-attention, rope, rmsnorm, swiglu, bpe, philosophy
License: mit
main, 1.19 GB, 1 contributor, History: 267 commits
Latest commit: LisaMegaWatts, "Add PyTorch weights (.pt) converted from JLD2 checkpoint", a7ea2a7 (verified), 2 days ago
| File | Size | Scan | Storage | Last commit | Updated |
|---|---|---|---|---|---|
| .gitattributes | 1.95 kB | Safe | | Upload julia-slm/5m-chinchilla/step_12000.jld2 with huggingface_hub | 5 days ago |
| README.md | 6.71 kB | | | Fix model card: match actual HF checkpoint (d=512, 8L, 8Q/2KV, ~23M params, ctx=256, FFN=1344) | 2 days ago |
| best_model.jld2 | 274 MB | | xet | Upload best_model.jld2 (261.1 MB) | 6 days ago |
| checkpoint_interrupted.jld2 | 274 MB | | xet | Upload checkpoint_interrupted.jld2 (261.1 MB) | 6 days ago |
| checkpoint_latest.jld2 | 274 MB | | xet | Upload checkpoint_latest.jld2 (261.2 MB) | 6 days ago |
| final_model.jld2 | 274 MB | | xet | Upload final_model.jld2 (261.1 MB) | 6 days ago |
| juliaflux_weights.pt | 91.3 MB | pickle | xet | Add PyTorch weights (.pt) converted from JLD2 checkpoint | 2 days ago |
| tokenizer.json | 59.5 kB | Safe | | Fix tokenizer: trim to 2000 vocab to match trained model | 5 days ago |

juliaflux_weights.pt: detected pickle imports (3): "collections.OrderedDict", "torch._utils._rebuild_tensor_v2", "torch.FloatStorage"
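
The README.md commit message quotes the model shape (d=512, 8 layers, 8 query heads and 2 KV heads, FFN=1344, ~23M params), and the tokenizer.json commit gives a vocabulary of 2000. As a rough sanity check, the sketch below counts parameters for a LLaMA-style decoder with those sizes. Several things here are assumptions that the listing does not confirm: no bias terms, head_dim = d_model / n_q_heads, tied input and output embeddings, and float32 storage (suggested, but not proven, by the `torch.FloatStorage` pickle import). The function name `count_params` is hypothetical.

```python
# Rough parameter count for a LLaMA-style decoder (GQA + SwiGLU + RMSNorm).
# ASSUMPTIONS, not confirmed by the repository listing: bias-free layers,
# head_dim = d_model // n_q_heads, tied embeddings, float32 weights.

def count_params(d_model=512, n_layers=8, n_q_heads=8, n_kv_heads=2,
                 d_ffn=1344, vocab=2000, tied_embeddings=True):
    head_dim = d_model // n_q_heads
    kv_dim = n_kv_heads * head_dim            # grouped-query K/V width
    attn = 2 * d_model * d_model              # Wq and Wo
    attn += 2 * d_model * kv_dim              # Wk and Wv (shared across groups)
    ffn = 3 * d_model * d_ffn                 # SwiGLU: gate, up, down
    norms = 2 * d_model                       # two RMSNorm weights per block
    per_layer = attn + ffn + norms
    embed = vocab * d_model                   # token embedding table
    head = 0 if tied_embeddings else vocab * d_model
    final_norm = d_model
    return n_layers * per_layer + embed + head + final_norm

total = count_params()
size_mb = total * 4 / 1e6                     # 4 bytes per float32 weight
print(f"{total:,} parameters, about {size_mb:.1f} MB as float32")
# prints: 22,790,656 parameters, about 91.2 MB as float32
```

Under these assumptions the estimate lands near both the ~23M figure in the commit message and the 91.3 MB listed for juliaflux_weights.pt. The agreement is suggestive only: it does not establish that the checkpoint actually ties its embeddings or omits biases.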