docs(model card): simplify Mermaid syntax (HF-renderer-compatible — no HTML entities, single-attribute styles)
README.md
CHANGED
@@ -79,17 +79,12 @@ The cascade architecture (A gate + B specialist) is the result of **421 autonomo
 
 ```mermaid
 flowchart TB
-INPUT["
-
-
-
-
-
-B["🔵 **B Bundle** · tier-2 specialist · 64-hidden<br/>193 → 64 → 5 · ternary {-1, 0, +1} · Q16.16 biases<br/>bundle_id: <code>5f7ed5f6…</code> · ~12,300 params · 30 KB<br/>trained on non-contra subset (95 samples)<br/><br/>**100% recall**: serious (69/69) · moderate (22/22)<br/>major (4/4 within non-contra)"]:::specialist
-
-DISPATCH["⚖️ **Cascade Dispatcher**<br/>if A predicts <strong>contraindicated</strong> → return contraindicated<br/>else → return B's constrained argmax over<br/>{moderate, serious, major}<br/><br/>composite weights_id = <code>{a_id}+{b_id}</code> (129 chars)"]:::dispatch
-
-OUT["✅ **BitNetResult**<br/>severity_name ∈ {none, moderate, serious, major, contraindicated}<br/>logits_q16 : 5×Q16.16 fixed-point logits<br/>feature_hash : SHA-256 over canonical 193-dim feature vector<br/>repro_hash : SHA-256 over (feature_hash, logits_q16, severity, weights_id)<br/>weights_id : composite <code>{a_id}+{b_id}</code><br/><br/>↓ <strong>bit-identical replay primitive — verifiable decades later, on any chip</strong> ↓"]:::output
+INPUT["Input<br/>(drug_a, drug_b)<br/>e.g. warfarin + ibuprofen"]
+ENCODE["encode_pair → 193-dim ternary feature vector<br/>• 64 BLAKE2b-128 hash trits per drug (x2 = 128 bits)<br/>• 26 ATC pharmacology flag bits per drug (x2 = 52 bits)<br/>• 13 pair-derived DDI rule bits"]
+A["A Bundle (gate, 256-hidden)<br/>193 → 256 → 5<br/>ternary weights, Q16.16 biases<br/>bundle_id: 1f0f8859...<br/>50,949 params, 118 KB<br/>100% recall: contra (44/44), major (4/4)<br/>0 contra FP, 0 major FP"]
+B["B Bundle (tier-2 specialist, 64-hidden)<br/>193 → 64 → 5<br/>ternary weights, Q16.16 biases<br/>bundle_id: 5f7ed5f6...<br/>~12,300 params, 30 KB<br/>trained on non-contra subset (95 samples)<br/>100% recall: serious (69/69), moderate (22/22)"]
+DISPATCH["Cascade Dispatcher<br/>if A predicts contraindicated → contraindicated<br/>else → B's constrained argmax over moderate / serious / major<br/>composite weights_id = a_id + b_id (129 chars)"]
+OUT["BitNetResult<br/>severity_name in none, moderate, serious, major, contraindicated<br/>logits_q16: 5x Q16.16 fixed-point logits<br/>feature_hash: SHA-256 over 193-dim feature vector<br/>repro_hash: SHA-256 over feature_hash + logits + severity + weights_id<br/>weights_id: composite a_id + b_id<br/>= bit-identical replay primitive, verifiable on any chip, decades later"]
 
 INPUT --> ENCODE
 ENCODE --> A
@@ -98,12 +93,12 @@ flowchart TB
 B --> DISPATCH
 DISPATCH --> OUT
 
-
-
-
-
-
-
+style INPUT fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a
+style ENCODE fill:#F0FDFA,stroke:#0F766E,color:#134E4A
+style A fill:#FEF2F2,stroke:#dc2626,color:#7f1d1d
+style B fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a
+style DISPATCH fill:#FEF3C7,stroke:#d97706,color:#7c2d12
+style OUT fill:#F0FDF4,stroke:#16a34a,color:#14532d
 ```
 
 > **Source**: BitNet b1.58 architecture from Ma, Wang, Ma, et al. ([arXiv:2402.17764](https://arxiv.org/abs/2402.17764)). This is a clean-room Python implementation with **pure-integer Q16.16 fixed-point arithmetic** — no `torch` runtime dep, no GPU required. Training used PyTorch + Straight-Through Estimator on H200 SXM (RunPod).
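For readers unfamiliar with the "pure-integer Q16.16 fixed-point arithmetic" mentioned in the Source note, here is a minimal sketch of how a ternary-weight dot product can be evaluated with integer additions and subtractions only. The helper names `to_q16` and `ternary_dot_q16` are hypothetical and are not taken from the repository; the actual implementation may differ.

```python
from typing import Sequence

# Illustrative sketch only: names and structure are assumptions, not the model card's code.
Q = 16          # number of fractional bits in Q16.16
ONE = 1 << Q    # the value 1.0 represented in Q16.16 (65536)


def to_q16(x: float) -> int:
    """Convert a float to a Q16.16 integer (an offline step; inference stays integer-only)."""
    return int(round(x * ONE))


def ternary_dot_q16(weights: Sequence[int], inputs_q16: Sequence[int], bias_q16: int) -> int:
    """Dot product with weights in {-1, 0, +1}: no multiplications, only add/subtract."""
    acc = bias_q16
    for w, x in zip(weights, inputs_q16):
        if w == 1:
            acc += x
        elif w == -1:
            acc -= x
        # w == 0 contributes nothing
    return acc


# 0.25 (bias) + 1.0 - 0.5 = 0.75, i.e. 49152 in Q16.16
acc = ternary_dot_q16([1, 0, -1], [to_q16(1.0), to_q16(1.0), to_q16(0.5)], to_q16(0.25))
```

Because every intermediate value is a Python integer, the result does not depend on floating-point rounding behavior, which is the property a bit-identical replay would rely on.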
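The Cascade Dispatcher node in the diagram can be sketched as follows. This is an illustration under stated assumptions: the class index order in `SEVERITIES` follows the order listed for `severity_name` but is assumed, and `cascade_dispatch` is a hypothetical name rather than the repository's API.

```python
from typing import Optional, Sequence

# Assumed index order, copied from the severity_name list in the diagram.
SEVERITIES = ["none", "moderate", "serious", "major", "contraindicated"]


def argmax(values: Sequence[int], allowed: Optional[Sequence[int]] = None) -> int:
    """Index of the largest value, optionally restricted to the `allowed` indices."""
    candidates = range(len(values)) if allowed is None else allowed
    return max(candidates, key=lambda i: values[i])


def cascade_dispatch(logits_a: Sequence[int], logits_b: Sequence[int]) -> str:
    """If gate A predicts contraindicated, return it; otherwise defer to specialist B."""
    if SEVERITIES[argmax(logits_a)] == "contraindicated":
        return "contraindicated"
    # Constrained argmax: B may only choose among moderate, serious, major.
    allowed = [SEVERITIES.index(name) for name in ("moderate", "serious", "major")]
    return SEVERITIES[argmax(logits_b, allowed)]


gate_hit = cascade_dispatch([0, 0, 0, 0, 9], [9, 0, 0, 0, 0])
deferred = cascade_dispatch([9, 0, 0, 0, 1], [50, 1, 7, 3, 99])
```

In the second call, B's logits for `none` and `contraindicated` are the largest but are ignored by the constrained argmax, so the result is `serious`.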