zekaic committed on
Commit 422c2f4 · verified · 1 Parent(s): f740f14

Update README.md

Files changed (1)
  1. README.md +30 -3
README.md CHANGED
@@ -14,15 +14,42 @@ pipeline_tag: text-generation
 
  # SMB-v1-1.7B-Structure
 
- SMB-v1 Model Family is composed of biomedical multi-modal models for standard patient representations. We will release both the `SMB-v1-1.7B-Structure` (EHR Only) and the `SMB-v1-1.7B` (a complete multimodal version).
 
  ## Model Details
 
  - **LLM Backbone**: Qwen3-1.7B
- - **Vision Encoder**: None (text-only)
  - **Connector**: identity
  - **Model Family**: SMB-v1
- - **Modalities**: EHR Only
  - **Training Method**: SFT + JEPA Multi-objective
 
  ## Special Tokens
 
 
  # SMB-v1-1.7B-Structure
 
+ ## Model Type
+ Multimodal Longitudinal Oncology Foundation Model
+
+ ## Model Description
+ SMB-v1-1.7B-Structure is the initial release of the SMB-v1 family, built to model the time-varying dynamics of cancer biology from structured clinical signals. It treats structured clinical data as a multimodal environment, fusing heterogeneous data streams into a unified patient-state representation.
+
+ It is designed to ingest and synthesize diverse structured modalities across the patient journey, including:
+
+ - Temporal Physiological Signals: Modeling continuous longitudinal trajectories of laboratory values, vital signs, and functional status markers to capture disease progression and physiological drift over time.
+
+ - Clinical Events & Phenotypes: Encoding discrete, high-cardinality sequences of diagnosis codes (ICD), procedure events (CPT), and adverse events to reconstruct the semantic history of the patient's care.
+
+ - Therapeutic Interventions: Integrating complex treatment histories, including systemic therapies (chemotherapy, immunotherapy), radiation dosing schedules, and surgical interventions, to understand causal treatment-response dynamics.
+
+ - Molecular & Genomic Profiles: Embedding high-dimensional static and dynamic biomarker panels, including somatic mutations, gene expression signatures, and proteomic markers, directly alongside clinical phenotypes.
+
+ - Oncologic Staging & Outcomes: Processing structured tumor staging (TNM), histology classifications, and survival endpoints to anchor representations in ground-truth biological states.
+
+ ## Intended Use Cases
+ This model is optimized for downstream tasks requiring a deep understanding of longitudinal patient history, such as:
+
+ - Predictive Risk Stratification: Forecasting adverse events, toxicity, or rapid progression based on historical trajectories.
+
+ - Treatment Response Modeling: Simulating potential patient outcomes under different therapeutic regimens.
+
+ - Patient Similarity Search: Identifying cohorts with similar biological and clinical progressions for real-world evidence generation.
+
+ - Clinical Trial Matching: Aligning complex patient states with structured eligibility criteria.
+
+ Note: While the full `SMB-v1-1.7B` will introduce unstructured modalities, the `-Structure` variant establishes the foundation using the highest-fidelity structured signals available in modern oncology data warehouses.
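As a rough illustration of the Patient Similarity Search use case listed above, the sketch below ranks a small cohort by cosine similarity to a query patient. It is not taken from the README: the helper names (`cosine_similarity`, `most_similar_patients`) and the toy three-dimensional vectors are hypothetical, and how patient-state embeddings are actually extracted from the model is not specified here, so the code simply assumes one fixed-length vector per patient is already available.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_patients(query, cohort, top_k=2):
    """Return the top_k (patient_id, similarity) pairs, most similar first."""
    scored = [(pid, cosine_similarity(query, emb)) for pid, emb in cohort.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for model-derived patient-state vectors (assumption).
cohort = {
    "patient_a": [1.0, 0.0, 0.0],
    "patient_b": [0.0, 1.0, 0.0],
    "patient_c": [0.9, 0.1, 0.0],
}
query = [1.0, 0.0, 0.0]

ranked = most_similar_patients(query, cohort, top_k=2)
for patient_id, score in ranked:
    print(f"{patient_id}: {score:.3f}")
```

With real embeddings, the same ranking step would typically be delegated to a vector index rather than a linear scan; the linear version is kept here only because it is easy to read.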
 
  ## Model Details
 
  - **LLM Backbone**: Qwen3-1.7B
  - **Connector**: identity
  - **Model Family**: SMB-v1
  - **Training Method**: SFT + JEPA Multi-objective
 
  ## Special Tokens