Finally, someone made GPT look good. Jackpot.
#1 by Trilogix1
I believe this is a breakthrough for small models. You've broken through the bottleneck of small, fast, but ineffective LLMs.
I'm wondering whether you can already apply this to 0.3B-0.6B models; LFM2 or Qwen3 would be excellent candidates (given the high number of tokens they were originally trained on).
Again, great job.