Fix model card: match actual HF checkpoint (d=512, 8L, 8Q/2KV, ~23M params, ctx=256, FFN=1344) (afa692e, verified, LisaMegaWatts, committed 3 days ago)
Fix model card: actual trained model is d=256, 4 layers, 4Q/2KV, ~4M params (was incorrectly listed as 10M) (287076b, verified, LisaMegaWatts, committed 3 days ago)
Fix model card: context_length=256 (not 512), dropout=0.1 (not 0.0) per checkpoint (5907abe, verified, LisaMegaWatts, committed 4 days ago)
Add model card with architecture details, provenance, and training metrics (9c956d0, verified, LisaMegaWatts, committed 4 days ago)
Update model card with architecture details, training config, and usage instructions (1f7bea9, verified, LisaMegaWatts, committed 7 days ago)
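The most recent commit pins the checkpoint at d=512, 8 layers, 8 query / 2 KV heads, FFN=1344, ctx=256, ~23M params. As a rough sanity check of those numbers, the non-embedding parameter count can be sketched as below. Everything beyond the quoted figures is an assumption: no bias terms, a plain two-matrix FFN (not gated), and head_dim = d_model / n_q_heads; none of these details appear in the commit log.

```python
# Sketch of the parameter count implied by the latest commit message
# (d=512, 8 layers, 8 query heads / 2 KV heads, FFN hidden size 1344).
# Assumptions not stated in the log: no biases, ungated two-matrix FFN,
# head_dim derived from the query-head count.

d_model    = 512
n_layers   = 8
n_q_heads  = 8
n_kv_heads = 2
ffn_dim    = 1344

head_dim = d_model // n_q_heads  # 64

# Grouped-query attention: full-width Q and O projections,
# narrower K/V projections shared across query-head groups.
attn_params = (
    d_model * (n_q_heads * head_dim)      # W_q
    + d_model * (n_kv_heads * head_dim)   # W_k
    + d_model * (n_kv_heads * head_dim)   # W_v
    + (n_q_heads * head_dim) * d_model    # W_o
)

# Up- and down-projection of a standard (ungated) FFN.
ffn_params = d_model * ffn_dim + ffn_dim * d_model

non_embedding = n_layers * (attn_params + ffn_params)

print(attn_params)    # 655360 per layer
print(ffn_params)     # 1376256 per layer
print(non_embedding)  # 16252928 across 8 layers
```

Under these assumptions the transformer body accounts for roughly 16.3M parameters; the remainder of the quoted ~23M would sit in the token embedding and LM head, whose size depends on the vocabulary, which the commit log does not state.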