aios-framework/aios-paper


acasavaraju committed 23 days ago

Commit 0503d3f · verified · 1 Parent(s): ccc6bd7

Update README.md

Files changed (1)

README.md +50 -3

@@ -1,3 +1,50 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+tags:
+- llm-inference
+- cpu-inference
+- memory-bandwidth
+- transformer
+- quantization
+- research
+---
+# AIOS: A CPU-Native Inference Architecture for Large Language Models
+**This is not a model.** This is the framework paper and specification
+for AIOS — a memory residency controller for CPU-native LLM inference.
+## Paper
+**Title:** AIOS: A CPU-Native Inference Architecture for Large Language Models
+**Author:** Anand Casavaraju
+**Published:** March 2026
+**SSRN:** https://ssrn.com/abstract=6467298
+**GitHub:** https://github.com/acasavaraju/AIOS
+## What AIOS Is
+AIOS is a memory residency controller that sits between inference
+engines (llama.cpp, Ollama, vLLM) and hardware, managing how weight
+data moves from DRAM to CPU. It addresses four resource dimensions:
+- **Weight reads** — aliasing + sparsity maps
+- **KV cache reads** — MQA/GQA + tiered residency
+- **Activation spill** — chunked prefill
+- **Attention compute** — sparsity map
+## Current State
+The framework and specification are published; the runtime is not yet implemented.
+All performance projections are analytical. Empirical validation is
+tracked at https://github.com/acasavaraju/AIOS/issues.
+## Citation
+```bibtex
+@misc{casavaraju2026aios,
+  title  = {AIOS: A CPU-Native Inference Architecture for Large Language Models},
+  author = {Casavaraju, Anand},
+  year   = {2026},
+  url    = {https://ssrn.com/abstract=6467298}
+}
+```
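
The README says all performance projections are analytical and lists weight reads and KV cache reads (with MQA/GQA) among the resource dimensions. As a rough illustration of what such a projection can look like, the sketch below computes a generic memory-bandwidth upper bound on decode throughput. It is a standard back-of-envelope estimate, not the model used in the AIOS paper: it assumes every weight and every cached K/V entry is read from DRAM once per generated token, and it ignores aliasing, sparsity maps, tiered residency, and CPU caches. The `ModelShape` class, the function names, and all numeric values are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer shape; every value here is an illustrative assumption."""
    n_params: int            # total number of weights
    n_layers: int            # number of transformer layers
    n_kv_heads: int          # n_heads for MHA, 1 for MQA, in between for GQA
    head_dim: int            # dimension of each attention head
    bytes_per_weight: float  # e.g. 0.5 for 4-bit quantized weights
    bytes_per_kv: float      # e.g. 2.0 for fp16 cache entries


def kv_bytes_per_token(m: ModelShape) -> float:
    """KV cache bytes held per cached token: one K and one V vector per layer per KV head."""
    return 2 * m.n_layers * m.n_kv_heads * m.head_dim * m.bytes_per_kv


def decode_bytes_per_step(m: ModelShape, context_len: int) -> float:
    """Bytes read to generate one token, assuming all weights and the whole KV cache are read once."""
    weight_bytes = m.n_params * m.bytes_per_weight
    return weight_bytes + context_len * kv_bytes_per_token(m)


def max_tokens_per_second(m: ModelShape, context_len: int, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode throughput when memory bandwidth is the only limit."""
    return bandwidth_bytes_per_s / decode_bytes_per_step(m, context_len)


# Assumed example: a 7B-parameter model, 4-bit weights, fp16 KV cache.
mha = ModelShape(n_params=7_000_000_000, n_layers=32, n_kv_heads=32,
                 head_dim=128, bytes_per_weight=0.5, bytes_per_kv=2.0)
gqa = replace(mha, n_kv_heads=8)  # grouped-query attention: 8 shared KV heads
mqa = replace(mha, n_kv_heads=1)  # multi-query attention: 1 shared KV head

if __name__ == "__main__":
    bandwidth = 50e9  # assumed 50 GB/s of DRAM bandwidth
    for name, shape in (("MHA", mha), ("GQA", gqa), ("MQA", mqa)):
        rate = max_tokens_per_second(shape, context_len=4096,
                                     bandwidth_bytes_per_s=bandwidth)
        print(f"{name}: {kv_bytes_per_token(shape):.0f} KV bytes/token, "
              f"at most {rate:.2f} tokens/s")
```

Under these assumptions, cutting the number of KV heads shrinks the per-token cache read, so the bound rises as context grows. Techniques that reduce weight reads would lower the `weight_bytes` term in the same formula.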