Commit History

Upload ascii-chart5-L4-D768-mkii-c1932d6c-1962-493d-b0b7-78e84e30e4e5.txt with huggingface_hub
4adce2e
verified

SQCU committed on

Upload ascii-eos-L4-D768-rollout-test-01661b5f-2ff7-49b1-9ad8-fee77e14bd1c.txt with huggingface_hub
6a81277
verified

SQCU committed on

Upload folder using huggingface_hub
95a04a0
verified

SQCU committed on

Upload folder using huggingface_hub
07c9491
verified

SQCU committed on

Upload folder using huggingface_hub
8129d8f
verified

SQCU committed on

Compiled models train faster, so more of them can be trained to better convergence within a short experiment.
921107d
verified

SQCU committed on

89,301,000 parameter attention_ii, z_lossed model trained for 6250 steps at batchsize:4*32, device_batchsize:32
8a69386
verified

SQCU committed on
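The commit message above packs in two details worth unpacking: a z-loss auxiliary term and a batchsize of 4*32 with device_batchsize 32. A minimal pure-Python sketch, assuming "z_lossed" means the usual squared-log-partition penalty and that 4*32 denotes 4 gradient-accumulation micro-batches of 32 sequences each (the function name and coefficient below are illustrative, not taken from the repo):

```python
import math

def z_loss(logits, coef=1e-4):
    # Auxiliary loss that pulls the log-partition function log(sum(exp(z)))
    # toward zero, discouraging logits from drifting to large magnitudes.
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return coef * log_z ** 2

# "batchsize:4*32, device_batchsize:32" reads as 4 gradient-accumulation
# micro-steps of 32 sequences, i.e. an effective batch of 128 per optimizer step.
grad_accum, device_batch = 4, 32
effective_batch = grad_accum * device_batch  # 128
```

Under that reading, the 6250 optimizer steps correspond to 6250 * 128 = 800,000 training sequences.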

Sling the illustrious and mysterious "attention_II" models. Also some layerwise rmsnorm and qk-projection rmsnorm models, one twice as large as the other.
1f45909
verified

SQCU committed on
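The "layerwise rmsnorm, qkprojection rmsnorm" wording suggests RMSNorm applied both per layer and to the query/key projections inside attention. A minimal pure-Python sketch of RMSNorm itself, assuming the standard formulation (the function name, eps, and optional gain are illustrative; the internals of the repo's "attention_II" variant are not documented here):

```python
import math

def rms_norm(x, gain=None, eps=1e-6):
    # RMSNorm: rescale x by its root-mean-square. Unlike LayerNorm,
    # no mean is subtracted and no bias is added.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    y = [v / rms for v in x]
    if gain is not None:
        # Optional learned per-dimension gain, as in the usual RMSNorm layer.
        y = [g * v for g, v in zip(gain, y)]
    return y
```

In a qk-projection variant, this normalization would be applied to each query and key vector after the Q/K linear projections, before computing attention scores.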

Upload 8 files
6d543db
verified

SQCU committed on

Update README.md
87045f5
verified

SQCU committed on

Create README.md
fd3ca39
verified

SQCU committed on

initial commit
5e8f667
verified

SQCU committed on