LisaMegaWatts/JuliaFluxGPT

Tags: Text Generation, English, flux, julia, flux-jl, llama-style, gqa, grouped-query-attention, rope, rmsnorm, swiglu, bpe, philosophy
JuliaFluxGPT
1.19 GB
  • 1 contributor
History: 267 commits
LisaMegaWatts
Add PyTorch weights (.pt) converted from JLD2 checkpoint
a7ea2a7 verified 2 days ago
  • .gitattributes
    1.95 kB
    Upload julia-slm/5m-chinchilla/step_12000.jld2 with huggingface_hub 5 days ago
  • README.md
    6.71 kB
    Fix model card: match actual HF checkpoint (d=512, 8L, 8Q/2KV, ~23M params, ctx=256, FFN=1344) 2 days ago
  • best_model.jld2
    274 MB
    xet
    Upload best_model.jld2 (261.1 MB) 6 days ago
  • checkpoint_interrupted.jld2
    274 MB
    xet
    Upload checkpoint_interrupted.jld2 (261.1 MB) 6 days ago
  • checkpoint_latest.jld2
    274 MB
    xet
    Upload checkpoint_latest.jld2 (261.2 MB) 6 days ago
  • final_model.jld2
    274 MB
    xet
    Upload final_model.jld2 (261.1 MB) 6 days ago
  • juliaflux_weights.pt

    Detected Pickle imports (3)

    • "collections.OrderedDict",
    • "torch._utils._rebuild_tensor_v2",
    • "torch.FloatStorage"

    91.3 MB
    xet
    Add PyTorch weights (.pt) converted from JLD2 checkpoint 2 days ago
  • tokenizer.json
    59.5 kB
    Fix tokenizer: trim to 2000 vocab to match trained model 5 days ago