docs: switch architecture to PNG embed (HF model cards render PNG reliably); tighten prose
README.md
CHANGED
@@ -77,11 +77,11 @@ The cascade architecture (A gate + B specialist) is the result of **421 autonomo
 
 ## Architecture
 
-
+<p align="center">
+<img src="architecture.png" alt="ClinicalMem BitNet b1.58 — A+B cascade architecture" width="900">
+</p>
 
-
-
-> **Source**: BitNet b1.58 architecture from Ma, Wang, Ma, et al. ([arXiv:2402.17764](https://arxiv.org/abs/2402.17764)). This is a clean-room Python implementation with **pure-integer Q16.16 fixed-point arithmetic** — no `torch` runtime dep, no GPU required. Training used PyTorch + Straight-Through Estimator on H200 SXM (RunPod).
+Architecture from Ma, Wang, Ma, et al. ([arXiv:2402.17764](https://arxiv.org/abs/2402.17764)). Clean-room Python implementation with **pure-integer Q16.16 fixed-point arithmetic** — no `torch` runtime dep, no GPU required. Training used PyTorch + Straight-Through Estimator on H200 SXM (RunPod). Diagram source: [`architecture.mmd`](architecture.mmd).
 
 ---
 
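
The README text in this hunk mentions pure-integer Q16.16 fixed-point arithmetic with no `torch` runtime dependency. As a rough illustration of what that can look like, and not the repository's actual code, the sketch below stores each value as an integer scaled by 2^16 and computes a dot product against ternary weights in {-1, 0, +1} using only integer adds, subtracts, and one shift. All names here (`to_q16`, `from_q16`, `q16_mul`, `ternary_dot`) are hypothetical, and the single per-tensor scale is an assumption made for the example.

```python
FRAC_BITS = 16
ONE = 1 << FRAC_BITS  # the value 1.0 in Q16.16


def to_q16(x: float) -> int:
    """Convert a float to Q16.16 (used only to build example inputs)."""
    return int(round(x * ONE))


def from_q16(q: int) -> float:
    """Convert Q16.16 back to a float for display."""
    return q / ONE


def q16_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values; the product carries 32 fractional
    bits, so shift right by 16 to return to Q16.16."""
    return (a * b) >> FRAC_BITS


def ternary_dot(weights, activations_q16, scale_q16):
    """Dot product of ternary weights with Q16.16 activations.

    Weights of +1 add the activation, -1 subtract it, 0 skip it,
    so the inner loop needs no multiplications. The accumulated sum
    is rescaled once at the end (assumed per-tensor scale).
    """
    acc = 0
    for w, a in zip(weights, activations_q16):
        if w == 1:
            acc += a
        elif w == -1:
            acc -= a
        elif w != 0:
            raise ValueError(f"weight {w} is not ternary")
    return q16_mul(acc, scale_q16)


if __name__ == "__main__":
    acts = [to_q16(0.5), to_q16(2.0), to_q16(0.25)]
    out = ternary_dot([1, 0, -1], acts, to_q16(2.0))
    print(from_q16(out))
```

Python integers do not overflow, so this sketch ignores the 32-bit saturation or wrapping that a fixed-width Q16.16 implementation would have to handle. Note also that `>>` on a negative Python integer rounds toward negative infinity rather than toward zero.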