Inspired by the heroes of day-zero quants (@TheBloke @danielhanchen @shimmyshimmer @bartowski), I decided to join the race by releasing the first FP8 quant of glm-4.7-flash! Not as easy as I expected, but I'm happy I was still able to get it working within a few hours of the original model's release! Interested in feedback if anyone wants to try it out!
This isn’t a goal of ours because we have plenty of money in the bank, but I'm quite excited to see that @huggingface is profitable these days, with 220 team members and most of our platform being free (like model hosting) and open-source for the community!
Especially noteworthy at a time when most AI startups wouldn’t survive a year or two without VC money. Yay!