cheese-112M / README.md
metadata
license: mit

This is a model I made on a rented 5090. It's trained from scratch.

It's a 112M-parameter model trained on 200k rows from HuggingFaceFW (I could do more if I wanted to) and 15k rows from dolly-15k.
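For a rough sense of the data mix, here's a small sketch that works out each source's share of the training rows. Only the two row counts come from the description above; the function name `mix_fractions` and the dictionary keys are just made up for illustration, and this is not the actual training code.

```python
# Row counts are taken from the description above; the rest is illustrative.
WEB_ROWS = 200_000    # rows from huggingfaceFW
DOLLY_ROWS = 15_000   # rows from dolly-15k


def mix_fractions(counts):
    """Return each source's share of the total row count."""
    total = sum(counts.values())
    return {name: rows / total for name, rows in counts.items()}


mix = mix_fractions({"huggingfaceFW": WEB_ROWS, "dolly-15k": DOLLY_ROWS})
for name, share in mix.items():
    print(f"{name}: {share:.1%}")
```

These are shares by row, not by token. Rows differ in length, so the split by tokens could look different.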

There will be a 400M-parameter model trained on 2M rows from HuggingFaceFW soon, maybe, once I get the money.