Delete blog-posts.json

blog-posts.json  +0 -10  DELETED

@@ -1,10 +0,0 @@
-[
-  {
-    "slug": "built-with-curiosity-over-compute",
-    "date": "2026-02-15",
-    "tag": "Philosophy",
-    "title": "Built with curiosity over compute.",
-    "excerpt": "There's a strange pressure in tech circles that every idea must be revolutionary, every project must be scalable, every experiment must lead somewhere. We disagree. Ideas don't have to be good to exist. They just have to exist.",
"content": "There's a strange pressure in tech circles. Every idea must be revolutionary. Every project must be scalable. Every experiment must lead somewhere worth going. If it's going to change the world, why build it? If it's going to beat the state of the art, why publish it? If it's going to get thousands of GitHub stars, why open source it?\n\nWe've internalized this idea that only \"good\" ideas deserve to exist. That only projects with clear paths to success are worth starting. That only experiments with predictable outcomes are worth running.\n\nBut here's the thing: FMN-GPT started as a weird question. What if a model could be small by design, avoiding compression entirely?\n\nWas that a good idea? Honestly, we still don't know. The model has ~100K parameters. It might fail. It might be a dead end. It might teach us something unexpected. And that's exactly the point.\n\nThe phrase 'Built with curiosity over compute' goes beyond a tagline. A philosophy drives our work. We build because we're curious, without needing infinite resources to throw at problems. A half-baked idea explored on a single GPU matters more than a perfect idea that never leaves the whiteboard.\n\nThis project exists because someone got curious. They lacked funding. They lacked a roadmap to success. They lacked certainty it would work. They just wanted to see what would happen.\n\nWhen I wiped my HuggingFace profile clean and started over, people probably thought I was crazy. Dozens of compressed models, gone. Why? Because quantity was masking the real problem. I was cloning and shrinking other people's work, avoiding building anything new. The work lacked genuine exploration. It was pure optimization.\n\nAnd optimization without exploration is just a race to the bottom.\n\nBad ideas teach us. Weird experiments surprise us. Small projects accumulate into something bigger.\n\nThe character-level tokenization in FMN-GPT might be inefficient compared to BPE. 
The recurrent mixer might add complexity without sufficient benefit. The whole architecture might be a glorified thought experiment. But that's fine.\n\n## What's the Point?\n\nThe point isnt to win benchmarks. The point is to understand what makes intelligence emerge from simple building blocks.\n\nWe could throw more compute at the problem. We could scale to billions of parameters. We could chase state of the art on leaderboards.\n\n**But we're doing something different.**\n\nWe're asking: what's the minimum viable architecture that can still reason? What can ~100K parameters actually do when given the right structure?\n\n### Key Principles\n\n- **Small by design** means thinking about efficiency from the start\n- **Curiosity over compute** means exploring ideas without massive resources\n- **Understanding over performance** means learning why things work\n\nThe journey matters as much as the destination. Maybe this project fails. Maybe it succeeds. Either way, we'll have learned something.\n\n> \"The goal is smaller models. The goal is understanding what makes models work in the first place.\"\n\nThat's what we're here for. That's why we build."
|
| 9 |
-
}
|
| 10 |
-
]