LisaMegaWatts/juliadistill-v2

Text Generation
English
flux
julia
flux-jl
distillation
knowledge-distillation
llama-style
gqa
rope
rmsnorm
swiglu
bpe
philosophy
Eval Results (legacy)
193 MB
  • 3 contributors
History: 83 commits
Latest commit: dd923fa (verified) by LisaMegaWatts, 4 days ago
    Add 4000-vocab BPE tokenizer for inference serving
  • .gitattributes
    1.75 kB
    Last commit (5 days ago): Upload checkpoint_interrupted.jld2 (45.6 MB)
  • README.md
    3.78 kB
    Last commit (4 days ago): Fix model card: vocab=4000 (not 2000), correct architecture details from checkpoint
  • best_model.jld2
    47.9 MB (xet)
    Last commit (5 days ago): Upload best_model.jld2 (45.6 MB)
  • checkpoint_interrupted.jld2
    47.8 MB (xet)
    Last commit (5 days ago): Upload checkpoint_interrupted.jld2 (45.6 MB)
  • checkpoint_latest.jld2
    48.2 MB (xet)
    Last commit (5 days ago): Upload checkpoint_latest.jld2 (45.9 MB)
  • final_model.jld2
    48.6 MB (xet)
    Last commit (6 days ago): Upload final_model.jld2 (46.4 MB)
  • tokenizer.json
    268 kB
    Last commit (4 days ago): Add 4000-vocab BPE tokenizer for inference serving