docs(model card): replace ASCII architecture with Mermaid flowchart (HF renders natively)
README.md
CHANGED
|
@@ -77,62 +77,36 @@ The cascade architecture (A gate + B specialist) is the result of **421 autonomo
|
|
| 77 |
|
| 78 |
## Architecture
|
| 79 |
|
| 80 |
-
```
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
|
| 89 |
-
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
|
| 95 |
-
|
| 96 |
-
|
| 97 |
-
|
| 98 |
-
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
│ 100% recall: contra (44/44) │ 100% recall: │
|
| 108 |
-
│ major (4/4) │ serious (69/69) │
|
| 109 |
-
│ 0 contra FP │ moderate (22/22) │
|
| 110 |
-
│ 0 major FP │ major (4/4 within non-contra) │
|
| 111 |
-
└──────────────────────────┘ └──────────────────────────────────┘
|
| 112 |
-
│
|
| 113 |
-
▼
|
| 114 |
-
┌─────────────────────────────────────────────────────────────────────┐
|
| 115 |
-
│ CASCADE DISPATCHER │
|
| 116 |
-
│ if A predicts "contraindicated" → return "contraindicated" │
|
| 117 |
-
│ else → return B's constrained argmax over │
|
| 118 |
-
│ {moderate, serious, major} │
|
| 119 |
-
│ composite weights_id = "{a_id}+{b_id}" (129 chars) │
|
| 120 |
-
└─────────────────────────────────────────────────────────────────────┘
|
| 121 |
-
│
|
| 122 |
-
▼
|
| 123 |
-
┌─────────────────────────────────────────────────────────────────────┐
|
| 124 |
-
│ OUTPUT: BitNetResult( │
|
| 125 |
-
│ severity_name ∈ {none, moderate, serious, major, contraindicated}, │
|
| 126 |
-
│ logits_q16 : 5×Q16.16 fixed-point logits, │
|
| 127 |
-
│ feature_hash : SHA-256 over canonical 193-dim feature vector, │
|
| 128 |
-
│ repro_hash : SHA-256 over (feature_hash, logits_q16, severity, │
|
| 129 |
-
│ weights_id) ← the audit primitive, │
|
| 130 |
-
│ weights_id : composite "{a_id}+{b_id}", │
|
| 131 |
-
│ ) │
|
| 132 |
-
└─────────────────────────────────────────────────────────────────────┘
|
| 133 |
```
|
| 134 |
|
| 135 |
-
Source: BitNet b1.58 architecture from Ma, Wang, Ma, et al. ([arXiv:2402.17764](https://arxiv.org/abs/2402.17764)). This is a clean-room Python implementation with **pure-integer Q16.16 fixed-point arithmetic**: no `torch` runtime dependency and no GPU required. Training used PyTorch with a Straight-Through Estimator on H200 SXM (RunPod).
|
| 136 |
|
| 137 |
---
|
| 138 |
|
|
|
|
| 77 |
|
| 78 |
## Architecture
|
| 79 |
|
| 80 |
+
```mermaid
|
| 81 |
+
flowchart TB
|
| 82 |
+
INPUT["**Input**<br/>(drug_a, drug_b)<br/>e.g. (warfarin, ibuprofen)"]:::input
|
| 83 |
+
|
| 84 |
+
ENCODE["**encode_pair()** → 193-dim ternary feature vector<br/>• 64 BLAKE2b-128 hash trits per drug (×2 = 128 bits)<br/>• 26 ATC pharmacology flag bits per drug (×2 = 52 bits)<br/>• 13 pair-derived DDI rule bits<br/>(CYP3A4 inhib×substrate, OATP1B1×statin, P-gp×substrate,<br/>CYP2C9×anticoag, MAOI×serotonergic, PDE5×nitrate,<br/>contrast×metformin, CYP1A2×substrate, XO×thiopurine,<br/>folate-antagonist, tetracycline×retinoid, ACE×neprilysin,<br/>metformin×renal-state)"]:::encoder
|
| 85 |
+
|
| 86 |
+
A["🔴 **A Bundle** · gate · 256-hidden<br/>193 → 256 → 5 · ternary {-1, 0, +1} · Q16.16 biases<br/>bundle_id: <code>1f0f8859…</code> · 50,949 params · 118 KB<br/><br/>**100% recall**: contraindicated (44/44) · major (4/4)<br/>**0 false positives** on contra and major"]:::gate
|
| 87 |
+
|
| 88 |
+
B["🔵 **B Bundle** · tier-2 specialist · 64-hidden<br/>193 → 64 → 5 · ternary {-1, 0, +1} · Q16.16 biases<br/>bundle_id: <code>5f7ed5f6…</code> · ~12,300 params · 30 KB<br/>trained on non-contra subset (95 samples)<br/><br/>**100% recall**: serious (69/69) · moderate (22/22)<br/>major (4/4 within non-contra)"]:::specialist
|
| 89 |
+
|
| 90 |
+
DISPATCH["**Cascade Dispatcher**<br/>if A predicts <strong>contraindicated</strong> → return contraindicated<br/>else → return B's constrained argmax over<br/>{moderate, serious, major}<br/><br/>composite weights_id = <code>{a_id}+{b_id}</code> (129 chars)"]:::dispatch
|
| 91 |
+
|
| 92 |
+
OUT["✅ **BitNetResult**<br/>severity_name ∈ {none, moderate, serious, major, contraindicated}<br/>logits_q16 : 5×Q16.16 fixed-point logits<br/>feature_hash : SHA-256 over canonical 193-dim feature vector<br/>repro_hash : SHA-256 over (feature_hash, logits_q16, severity, weights_id)<br/>weights_id : composite <code>{a_id}+{b_id}</code><br/><br/><strong>bit-identical replay primitive, verifiable decades later, on any chip</strong>"]:::output
|
| 93 |
+
|
| 94 |
+
INPUT --> ENCODE
|
| 95 |
+
ENCODE --> A
|
| 96 |
+
ENCODE --> B
|
| 97 |
+
A --> DISPATCH
|
| 98 |
+
B --> DISPATCH
|
| 99 |
+
DISPATCH --> OUT
|
| 100 |
+
|
| 101 |
+
classDef input fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a,stroke-width:2px
|
| 102 |
+
classDef encoder fill:#F0FDFA,stroke:#0F766E,color:#134E4A,stroke-width:2px
|
| 103 |
+
classDef gate fill:#FEF2F2,stroke:#dc2626,color:#7f1d1d,stroke-width:2px
|
| 104 |
+
classDef specialist fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a,stroke-width:2px
|
| 105 |
+
classDef dispatch fill:#FEF3C7,stroke:#d97706,color:#7c2d12,stroke-width:2px
|
| 106 |
+
classDef output fill:#F0FDF4,stroke:#16a34a,color:#14532d,stroke-width:2px
|
| 107 |
```
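The sizes quoted in the diagram can be cross-checked with a few lines of arithmetic. The sketch below assumes both bundles are plain two-layer dense networks with one bias per output unit; the diagram gives the layer widths but does not spell out the bias layout, so `dense_param_count` is an illustrative helper rather than the project's own code.

```python
# Feature layout as listed in the encode_pair() node of the diagram.
HASH_TRITS_PER_DRUG = 64
ATC_FLAGS_PER_DRUG = 26
PAIR_RULE_BITS = 13


def feature_dim() -> int:
    """Input width: hash and ATC features for two drugs, plus the pair rules."""
    return 2 * HASH_TRITS_PER_DRUG + 2 * ATC_FLAGS_PER_DRUG + PAIR_RULE_BITS


def dense_param_count(n_in: int, n_hidden: int, n_out: int) -> int:
    """Parameter count of an n_in -> n_hidden -> n_out network.

    Assumption: two fully connected layers, each with one bias per output unit.
    """
    first_layer = n_in * n_hidden + n_hidden
    second_layer = n_hidden * n_out + n_out
    return first_layer + second_layer


print(feature_dim())                              # 193
print(dense_param_count(feature_dim(), 256, 5))   # 50949, the A bundle figure
```

Under the same assumption the B shape (193 → 64 → 5) gives 12,741, somewhat above the diagram's approximate ~12,300, so B's exact layout may differ from this sketch.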
|
| 108 |
|
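The dispatcher rule in the diagram is short enough to sketch directly. The class ordering in `SEVERITIES`, the tie-breaking rule, and the function names below are assumptions made for illustration; only the branch logic and the `{a_id}+{b_id}` format come from the diagram.

```python
from typing import Sequence

# Assumed index order for the five logits; the diagram only lists the names.
SEVERITIES = ("none", "moderate", "serious", "major", "contraindicated")
# B's argmax is constrained to these three classes.
B_ALLOWED = ("moderate", "serious", "major")


def argmax_over(logits: Sequence[int], allowed: Sequence[str]) -> str:
    """Return the allowed class with the largest logit (first one wins ties)."""
    return max(allowed, key=lambda name: logits[SEVERITIES.index(name)])


def cascade(a_logits: Sequence[int], b_logits: Sequence[int]) -> str:
    """A acts as the gate; B decides among the remaining non-contra classes."""
    if argmax_over(a_logits, SEVERITIES) == "contraindicated":
        return "contraindicated"
    return argmax_over(b_logits, B_ALLOWED)


def composite_weights_id(a_id: str, b_id: str) -> str:
    """Join the two bundle ids the way the diagram describes."""
    return f"{a_id}+{b_id}"


# A does not fire, so B's constrained argmax picks "serious" even though
# B's unconstrained maximum is on "none".
print(cascade([0, 5, 0, 0, 1], [100, 1, 7, 3, 50]))
```

The quoted length of 129 characters is consistent with two 64-character hex digests joined by `+`, although the diff does not state the id format.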
| 109 |
+
> **Source**: BitNet b1.58 architecture from Ma, Wang, Ma, et al. ([arXiv:2402.17764](https://arxiv.org/abs/2402.17764)). This is a clean-room Python implementation with **pure-integer Q16.16 fixed-point arithmetic**: no `torch` runtime dependency and no GPU required. Training used PyTorch with a Straight-Through Estimator on H200 SXM (RunPod).
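To illustrate why ternary weights and Q16.16 values need no floating point at inference time, here is a minimal sketch of one linear layer. It assumes inputs and biases are already Q16.16 integers and omits any activation or scaling step; the helper names are hypothetical and this is not the repository's implementation.

```python
from typing import List, Sequence

FRACTION_BITS = 16
ONE = 1 << FRACTION_BITS  # the value 1.0 in Q16.16


def to_q16(x: float) -> int:
    """Convert a float to Q16.16 (used here only to build the example)."""
    return int(round(x * ONE))


def from_q16(v: int) -> float:
    """Convert a Q16.16 integer back to a float for display."""
    return v / ONE


def ternary_linear(
    x_q16: Sequence[int],
    weights: Sequence[Sequence[int]],
    bias_q16: Sequence[int],
) -> List[int]:
    """Compute W @ x + b where every weight is -1, 0, or +1.

    With ternary weights each multiply collapses to add, subtract, or skip,
    so the whole layer is integer addition on Q16.16 values.
    """
    out = []
    for row, bias in zip(weights, bias_q16):
        acc = bias
        for w, x in zip(row, x_q16):
            if w == 1:
                acc += x
            elif w == -1:
                acc -= x
        out.append(acc)
    return out


x = [to_q16(1.0), to_q16(-1.0), 0]
w = [[1, -1, 0], [0, 1, 1]]
b = [to_q16(0.5), to_q16(-0.25)]
y = ternary_linear(x, w, b)
print(y, [from_q16(v) for v in y])  # [163840, -81920] [2.5, -1.25]
```

Because every intermediate value is an exact integer, the same inputs give the same bits on any platform, which is the property the replay hash relies on.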
|
| 110 |
|
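The `repro_hash` field can be pictured as a digest over a fixed byte serialization of the result. The diagram names the inputs but not their encoding, so the byte layout below (one signed byte per trit, big-endian 32-bit logits, UTF-8 strings with a NUL separator) is purely an assumption for illustration.

```python
import hashlib
import struct
from typing import Sequence


def feature_hash(features: Sequence[int]) -> str:
    """SHA-256 over the feature vector, one signed byte per entry (assumed)."""
    packed = struct.pack(f"{len(features)}b", *features)
    return hashlib.sha256(packed).hexdigest()


def repro_hash(
    feature_hash_hex: str,
    logits_q16: Sequence[int],
    severity: str,
    weights_id: str,
) -> str:
    """SHA-256 over (feature_hash, logits_q16, severity, weights_id).

    Assumed serialization: raw digest bytes, big-endian int32 logits,
    then the two strings as UTF-8 separated by a NUL byte.
    """
    h = hashlib.sha256()
    h.update(bytes.fromhex(feature_hash_hex))
    h.update(struct.pack(f">{len(logits_q16)}i", *logits_q16))
    h.update(severity.encode("utf-8"))
    h.update(b"\x00")
    h.update(weights_id.encode("utf-8"))
    return h.hexdigest()


features = [1, 0, -1] * 64 + [0]  # 193 illustrative trits
fh = feature_hash(features)
print(repro_hash(fh, [1, 2, 3, 4, 5], "major", "a_id+b_id"))
```

Any fixed, documented serialization would serve; what matters for replay is that the same fields always produce the same bytes.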
| 111 |
---
|
| 112 |
|