JuliaFluxGPT / README.md

Commit History

Fix model card: match actual HF checkpoint (d=512, 8L, 8Q/2KV, ~23M params, ctx=256, FFN=1344)
afa692e
verified

LisaMegaWatts committed on

Fix model card: actual trained model is d=256, 4 layers, 4Q/2KV, ~4M params (was incorrectly listed as 10M)
287076b
verified

LisaMegaWatts committed on

Fix model card: context_length=256 (not 512), dropout=0.1 (not 0.0) per checkpoint
5907abe
verified

LisaMegaWatts committed on

Add model card with architecture details, provenance, and training metrics
9c956d0
verified

LisaMegaWatts committed on

Update model card with architecture details, training config, and usage instructions
1f7bea9
verified

LisaMegaWatts committed on