[HYPOTHESIS] Sleep is the solution to memory
Hey everyone! I'm a hobbyist who's recently been obsessed with local AI models and the question of persistent memory: not an external database or RAG pipeline, but memory baked directly into the weights themselves. As far as I can tell, AI memory isn't just unsolved; it's barely functional, and nobody seems to agree on what the solution (or even the right direction) could be.
I've been setting up a multi-Mac Studio cluster for distributed inference using exo, which gives me enough memory headroom to run frontier-class open models locally. But the experiments I really want to run are about something deeper: can we give models genuine in-weights memory that accumulates over time without catastrophic forgetting, and without needing an external database for memory files?
The Core Idea: Artificial Sleep Cycles
I'm a believer that good solutions often come from looking at how biological systems actually work, in this case, memory. What excites me is how little we truly understand about memory (in AI and in ourselves), and even more so about sleep: scientists still have only a partial grasp of what brains actually use sleep for. What we do know is this: during waking hours, the hippocampus rapidly buffers new experiences. During sleep, those experiences get replayed and slowly consolidated into the neocortex, filtered, compressed, and integrated without overwriting everything that came before.
I want to build an artificial version of this:
Wake phase: the model operates normally, with a LoRA adapter acting as a hippocampus-like rapid buffer, capturing new information without touching base weights.
Sleep phase: an offline consolidation pass where the most important/repeated information from the LoRA buffer gets written into base weights using MEMIT-style surgical editing, while less important memories get pruned. The LoRA adapter then resets for the next cycle.
The hypothesis is that this buffered, cyclical, offline consolidation approach could produce more stable in-weights memory than direct continuous fine-tuning, essentially using the sleep-cycle structure to solve catastrophic forgetting.
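To make the loop structure concrete, here's a toy sketch of the scheduling logic. Everything in it is a stand-in: a counter plays the role of the LoRA buffer, a plain dict plays the role of base weights, and a repetition threshold plays the role of importance scoring. It's not a LoRA or MEMIT implementation, just the wake/buffer/consolidate/reset skeleton under those assumptions:

```python
from collections import Counter

class SleepCycleMemory:
    """Toy wake/sleep loop: a fast buffer (stand-in for a LoRA adapter)
    accumulates facts during wake; sleep consolidates repeated facts into
    a long-term store (stand-in for base weights) and resets the buffer.
    The threshold is a placeholder for a real importance metric."""

    def __init__(self, consolidation_threshold=2):
        self.base = {}            # stand-in for base weights
        self.buffer = Counter()   # stand-in for the hippocampus-like LoRA buffer
        self.threshold = consolidation_threshold

    def wake(self, observations):
        # Wake phase: buffer new experiences without touching base memory.
        self.buffer.update(observations)

    def sleep(self):
        # Sleep phase: write sufficiently repeated items into base memory
        # (where MEMIT-style editing would go in the real system), prune
        # the rest, then reset the buffer for the next cycle.
        for fact, count in self.buffer.items():
            if count >= self.threshold:
                self.base[fact] = self.base.get(fact, 0) + count
        self.buffer.clear()

mem = SleepCycleMemory(consolidation_threshold=2)
mem.wake(["capital:Paris", "capital:Paris", "noise:xyz"])
mem.sleep()
print(sorted(mem.base))  # -> ['capital:Paris']  (one-off "noise" pruned)
```

In the real version, `wake` would be LoRA training steps, `sleep` would be an offline MEMIT-style edit pass driven by whatever the buffer captured, and the interesting open question is what replaces the naive repetition count as the consolidation criterion.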
What I'm Most Curious About
- Has anyone experimented with cyclical or staged fine-tuning as a forgetting mitigation strategy?
- Are there better alternatives to MEMIT for the consolidation step that I should be looking at?
- How would you design evals to cleanly measure what gets retained vs lost across sleep cycles?
- Anyone else thinking about LoRA adapters as temporary memory buffers rather than permanent fine-tunes?
I'm documenting everything publicly as I go and would love collaborators, feedback, or just people who want to nerd out about this stuff. Total hobbyist here but I think the idea is worth exploring seriously โ would love to hear what the community thinks!
– Will