star-ga committed on
Commit
6c2e017
·
verified ·
1 Parent(s): 4bbaa3f

docs(model card): simplify Mermaid syntax (HF-renderer-compatible — no HTML entities, single-attribute styles)

Files changed (1)
  1. README.md +12 -17
README.md CHANGED
@@ -79,17 +79,12 @@ The cascade architecture (A gate + B specialist) is the result of **421 autonomo
79
 
80
  ```mermaid
81
  flowchart TB
82
- INPUT["**Input**<br/>(drug_a, drug_b)<br/>e.g. (warfarin, ibuprofen)"]:::input
83
-
84
- ENCODE["**encode_pair()** → 193-dim ternary feature vector<br/> 64 BLAKE2b-128 hash trits per drug (×2 = 128 bits)<br/> 26 ATC pharmacology flag bits per drug (×2 = 52 bits)<br/> 13 pair-derived DDI rule bits<br/>(CYP3A4 inhib×substrate, OATP1B1×statin, P-gp×substrate,<br/>CYP2C9×anticoag, MAOI×serotonergic, PDE5×nitrate,<br/>contrast×metformin, CYP1A2×substrate, XO×thiopurine,<br/>folate-antagonist, tetracycline×retinoid, ACE×neprilysin,<br/>metformin×renal-state)"]:::encoder
85
-
86
- A["🔴 **A Bundle** &nbsp;·&nbsp; gate &nbsp;·&nbsp; 256-hidden<br/>193 → 256 → 5 &nbsp;·&nbsp; ternary {-1, 0, +1} &nbsp;·&nbsp; Q16.16 biases<br/>bundle_id: <code>1f0f8859…</code> &nbsp;·&nbsp; 50,949 params &nbsp;·&nbsp; 118 KB<br/><br/>**100% recall**: contraindicated (44/44) &nbsp;·&nbsp; major (4/4)<br/>**0 false positives** on contra and major"]:::gate
87
-
88
- B["🔵 **B Bundle** &nbsp;·&nbsp; tier-2 specialist &nbsp;·&nbsp; 64-hidden<br/>193 → 64 → 5 &nbsp;·&nbsp; ternary {-1, 0, +1} &nbsp;·&nbsp; Q16.16 biases<br/>bundle_id: <code>5f7ed5f6…</code> &nbsp;·&nbsp; ~12,300 params &nbsp;·&nbsp; 30 KB<br/>trained on non-contra subset (95 samples)<br/><br/>**100% recall**: serious (69/69) &nbsp;·&nbsp; moderate (22/22)<br/>major (4/4 within non-contra)"]:::specialist
89
-
90
- DISPATCH["⚖️ **Cascade Dispatcher**<br/>if A predicts <strong>contraindicated</strong> → return contraindicated<br/>else → return B's constrained argmax over<br/>{moderate, serious, major}<br/><br/>composite weights_id = <code>{a_id}+{b_id}</code> (129 chars)"]:::dispatch
91
-
92
- OUT["✅ **BitNetResult**<br/>severity_name ∈ {none, moderate, serious, major, contraindicated}<br/>logits_q16 : 5×Q16.16 fixed-point logits<br/>feature_hash : SHA-256 over canonical 193-dim feature vector<br/>repro_hash : SHA-256 over (feature_hash, logits_q16, severity, weights_id)<br/>weights_id : composite <code>{a_id}+{b_id}</code><br/><br/>↓ <strong>bit-identical replay primitive — verifiable decades later, on any chip</strong> ↓"]:::output
93
 
94
  INPUT --> ENCODE
95
  ENCODE --> A
@@ -98,12 +93,12 @@ flowchart TB
98
  B --> DISPATCH
99
  DISPATCH --> OUT
100
 
101
- classDef input fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a,stroke-width:2px
102
- classDef encoder fill:#F0FDFA,stroke:#0F766E,color:#134E4A,stroke-width:2px
103
- classDef gate fill:#FEF2F2,stroke:#dc2626,color:#7f1d1d,stroke-width:2px
104
- classDef specialist fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a,stroke-width:2px
105
- classDef dispatch fill:#FEF3C7,stroke:#d97706,color:#7c2d12,stroke-width:2px
106
- classDef output fill:#F0FDF4,stroke:#16a34a,color:#14532d,stroke-width:2px
107
  ```
108
 
109
  > **Source**: BitNet b1.58 architecture from Ma, Wang, Ma, et al. ([arXiv:2402.17764](https://arxiv.org/abs/2402.17764)). This is a clean-room Python implementation with **pure-integer Q16.16 fixed-point arithmetic** — no `torch` runtime dep, no GPU required. Training used PyTorch + Straight-Through Estimator on H200 SXM (RunPod).
 
79
 
80
  ```mermaid
81
  flowchart TB
82
+ INPUT["Input<br/>(drug_a, drug_b)<br/>e.g. warfarin + ibuprofen"]
83
+ ENCODE["encode_pair → 193-dim ternary feature vector<br/>• 64 BLAKE2b-128 hash trits per drug (x2 = 128 bits)<br/>• 26 ATC pharmacology flag bits per drug (x2 = 52 bits)<br/>• 13 pair-derived DDI rule bits"]
84
+ A["A Bundle (gate, 256-hidden)<br/>193 → 256 → 5<br/>ternary weights, Q16.16 biases<br/>bundle_id: 1f0f8859...<br/>50,949 params, 118 KB<br/>100% recall: contra (44/44), major (4/4)<br/>0 contra FP, 0 major FP"]
85
+ B["B Bundle (tier-2 specialist, 64-hidden)<br/>193 → 64 → 5<br/>ternary weights, Q16.16 biases<br/>bundle_id: 5f7ed5f6...<br/>~12,300 params, 30 KB<br/>trained on non-contra subset (95 samples)<br/>100% recall: serious (69/69), moderate (22/22)"]
86
+ DISPATCH["Cascade Dispatcher<br/>if A predicts contraindicated → contraindicated<br/>else → B's constrained argmax over moderate / serious / major<br/>composite weights_id = a_id + b_id (129 chars)"]
87
+ OUT["BitNetResult<br/>severity_name in none, moderate, serious, major, contraindicated<br/>logits_q16: 5x Q16.16 fixed-point logits<br/>feature_hash: SHA-256 over 193-dim feature vector<br/>repro_hash: SHA-256 over feature_hash + logits + severity + weights_id<br/>weights_id: composite a_id + b_id<br/>= bit-identical replay primitive, verifiable on any chip, decades later"]
 
 
 
 
 
88
 
89
  INPUT --> ENCODE
90
  ENCODE --> A
 
93
  B --> DISPATCH
94
  DISPATCH --> OUT
95
 
96
+ style INPUT fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a
97
+ style ENCODE fill:#F0FDFA,stroke:#0F766E,color:#134E4A
98
+ style A fill:#FEF2F2,stroke:#dc2626,color:#7f1d1d
99
+ style B fill:#EFF6FF,stroke:#2563eb,color:#1e3a8a
100
+ style DISPATCH fill:#FEF3C7,stroke:#d97706,color:#7c2d12
101
+ style OUT fill:#F0FDF4,stroke:#16a34a,color:#14532d
102
  ```
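
The dispatcher rule in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: the class index order in `SEVERITIES`, the helper names `argmax`, `cascade_dispatch`, and `composite_weights_id`, and the tie-breaking behaviour (first maximum wins) are all assumptions made for the example.

```python
from typing import Sequence

# Assumed class order: the diagram names these five severities but does not
# state which logit index belongs to which class.
SEVERITIES = ("none", "moderate", "serious", "major", "contraindicated")
CONTRA = SEVERITIES.index("contraindicated")
# B's argmax is restricted to these three classes, per the diagram.
B_ALLOWED = tuple(SEVERITIES.index(s) for s in ("moderate", "serious", "major"))


def argmax(values: Sequence[int]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def cascade_dispatch(a_logits: Sequence[int], b_logits: Sequence[int]) -> str:
    """Return A's verdict if it is 'contraindicated', else B's constrained argmax."""
    if argmax(a_logits) == CONTRA:
        return "contraindicated"
    best = max(B_ALLOWED, key=lambda i: b_logits[i])
    return SEVERITIES[best]


def composite_weights_id(a_id: str, b_id: str) -> str:
    """Join the two bundle ids with '+', as in 'a_id + b_id'."""
    return f"{a_id}+{b_id}"
```

With two 64-character ids, `composite_weights_id` yields 64 + 1 + 64 = 129 characters, which matches the length quoted in the diagram.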
103
 
104
  > **Source**: BitNet b1.58 architecture from Ma, Wang, Ma, et al. ([arXiv:2402.17764](https://arxiv.org/abs/2402.17764)). This is a clean-room Python implementation with **pure-integer Q16.16 fixed-point arithmetic** — no `torch` runtime dep, no GPU required. Training used PyTorch + Straight-Through Estimator on H200 SXM (RunPod).
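
To make the "pure-integer Q16.16" claim concrete, here is a hedged sketch of a ternary linear layer that uses only integer addition and subtraction. It assumes inputs and biases are already stored as Q16.16 integers (value × 65536); the names `to_q16`, `from_q16`, and `ternary_linear` are hypothetical and do not come from the repository.

```python
from typing import List, Sequence

Q_BITS = 16
ONE = 1 << Q_BITS  # 1.0 in Q16.16 is 65536


def to_q16(x: float) -> int:
    """Convert a float to a Q16.16 integer (used here only to build test data)."""
    return int(round(x * ONE))


def from_q16(q: int) -> float:
    """Convert a Q16.16 integer back to a float for display."""
    return q / ONE


def ternary_linear(
    x_q16: Sequence[int],
    weights: Sequence[Sequence[int]],
    bias_q16: Sequence[int],
) -> List[int]:
    """Compute W @ x + b where every weight is -1, 0, or +1.

    Because weights are ternary, no multiplication is needed: each input is
    added, subtracted, or skipped. All values stay Python ints throughout.
    """
    out = []
    for row, bias in zip(weights, bias_q16):
        acc = bias
        for w, xi in zip(row, x_q16):
            if w == 1:
                acc += xi
            elif w == -1:
                acc -= xi
        out.append(acc)
    return out
```

For example, `ternary_linear([to_q16(1.0), to_q16(-1.0), 0], [[1, -1, 0]], [to_q16(0.5)])` accumulates 0.5 + 1.0 + 1.0 and returns `[163840]`, which is 2.5 in Q16.16. Integer-only accumulation like this is what lets results be reproduced bit for bit across hardware, since no floating-point rounding is involved.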