Upload folder using huggingface_hub
README.md
CHANGED
@@ -34,6 +34,11 @@ model-index:
 
 > **A JAX/Flax language model that separates *what it knows* from *how it thinks* — so the knowledge can grow to 100B+ vectors while inference stays fast and cheap.**
 
+[](https://colab.research.google.com/drive/1VM64IOZHj5rDvxWPbqktC037LlyOJih3?usp=sharing)
+
+> [!WARNING]
+> **Disclaimer**: This repository and checkpoint are provided as a **research proof-of-concept to demonstrate the novel DPSNR architecture**. It is an experimental model trained on a limited compute budget (for ~31,000 steps) to validate theoretical claims (such as $O(1)$ retrieval scaling, Sparse Adam optimizer speedups, and memory-bandwidth properties). **It is NOT a fully-trained competitive model** and is not intended to compete with state-of-the-art open-source text models (like LLaMA or Mistral) on downstream benchmarks.
+
 ---
 
 ## What Is DPSNR?