VirialyD committed
Commit 7322445 · verified · 1 Parent(s): b262442

Update README.md

Files changed (1)
  1. README.md +17 -30
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- license: gpl-3.0
  library_name: pytorch
  tags:
  - biology
@@ -27,6 +27,8 @@ A multimodal deep learning ensemble for predicting T-cell functional states from
  **89.6% accuracy** | **macro F1 0.88** | **7 functional states** | **top-5 ensemble**

  ## Model Description

  This repository contains the weights for a top-5 ensemble of `FullGenesVJClassifier` models. Each model takes three input modalities:
@@ -71,15 +73,17 @@ Classification of T-cell functional states from paired scRNA-seq + TCR-seq data.

  ## Training Data

- ~290,000 T-cells from 4 public scRNA-seq datasets:

- | Dataset | Platform | Cells | Tissue |
  |---|---|---|---|
  | GSE144469 | 10x Genomics | ~60,000 | Colitis (colon) |
  | GSE179994 | 10x Genomics | ~77,000 | PBMC (exhaustion study) |
  | GSE181061 | 10x Genomics | ~31,000 | ccRCC (tumor-infiltrating) |
  | GSE108989 | Smart-seq2 | ~12,000 | CRC (tumor + blood) |

  Preprocessing: QC → normalization (scanpy) → 3,000 HVGs → Harmony batch correction → CDR3/V/J extraction via scirpy.

  ## Evaluation
@@ -109,37 +113,20 @@ Preprocessing: QC → normalization (scanpy) → 3,000 HVGs → Harmony batch co

  ## How to Use

- ### Quick Start (CLI)

  ```bash
- pip install tcell-classifier
- tcell-predict your_data.h5ad
  ```

- Model weights (~300 MB) download automatically on first run.
-
- ```bash
- tcell-predict data.h5ad -o results/ # custom output dir
- tcell-predict data.h5ad --true-labels cell_type # evaluate vs ground truth
- tcell-predict data.h5ad --device cpu # force CPU
- ```

  Output: interactive HTML report, predictions.csv, annotated .h5ad.

- ### Python API
-
- ```python
- from src.hub import ensure_weights
- from src.inference import load_ensemble, ensemble_predict
- from src.data import InferenceDataset, prepare_inference_features
-
- model_dir = ensure_weights() # auto-downloads from this repo
- models = load_ensemble(model_dir, device)
- dataset = InferenceDataset(gex, tcr_a_emb, tcr_b_emb, vj_encoded)
- predictions, probabilities, agreement = ensemble_predict(models, dataset, device)
- ```
-
- ### Manual Download

  ```python
  from huggingface_hub import snapshot_download
@@ -178,14 +165,14 @@ snapshot_download("VirialyD/tcell-classifier", local_dir="./weights")
  ## Citation

  ```bibtex
- @software{levchenko2026multimodal,
  author = {Shirokikh, Polina},
  title = {Multimodal T-Cell Functional State Classifier},
- year = {2026},
  url = {https://github.com/polinavd/multimodal-tcell-classifier}
  }
  ```

  ## License

- GPL-3.0
 
  ---
+ license: mit
  library_name: pytorch
  tags:
  - biology
 
  **89.6% accuracy** | **macro F1 0.88** | **7 functional states** | **top-5 ensemble**

+ **GitHub**: [polinavd/multimodal-tcell-classifier](https://github.com/polinavd/multimodal-tcell-classifier)
+
  ## Model Description

  This repository contains the weights for a top-5 ensemble of `FullGenesVJClassifier` models. Each model takes three input modalities:
 

  ## Training Data

+ **136,667 T-cells** (after QC filtering) from 4 public scRNA-seq datasets:

+ | Dataset | Platform | Cells* | Tissue |
  |---|---|---|---|
  | GSE144469 | 10x Genomics | ~60,000 | Colitis (colon) |
  | GSE179994 | 10x Genomics | ~77,000 | PBMC (exhaustion study) |
  | GSE181061 | 10x Genomics | ~31,000 | ccRCC (tumor-infiltrating) |
  | GSE108989 | Smart-seq2 | ~12,000 | CRC (tumor + blood) |

+ *Cell counts are pre-QC; 136,667 cells remain after quality control filtering.
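
As a rough consistency check on the footnote, the approximate per-dataset counts in the table can be summed and compared with the stated post-QC total. This is only a sketch: the table values are rounded, so the retained fraction is indicative rather than exact.

```python
# Approximate pre-QC counts copied from the table above (rounded values).
pre_qc = {
    "GSE144469": 60_000,
    "GSE179994": 77_000,
    "GSE181061": 31_000,
    "GSE108989": 12_000,
}
post_qc_total = 136_667  # post-QC total stated in the README

pre_qc_total = sum(pre_qc.values())
retained = post_qc_total / pre_qc_total

print(f"pre-QC total: ~{pre_qc_total:,}")     # pre-QC total: ~180,000
print(f"retained after QC: ~{retained:.0%}")  # retained after QC: ~76%
```

On these rounded figures, roughly three quarters of the listed cells pass quality control.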
+
  Preprocessing: QC → normalization (scanpy) → 3,000 HVGs → Harmony batch correction → CDR3/V/J extraction via scirpy.
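
The preprocessing chain above maps onto standard scanpy calls. The following is a minimal sketch, not the repository's code: the function name `preprocess`, the `batch_key` column name, the QC thresholds, and the 50-component PCA are assumptions chosen for illustration. The CDR3/V/J extraction step with scirpy is left out, and the Harmony call requires the optional `harmonypy` package.

```python
def preprocess(adata, batch_key="dataset", n_hvgs=3000):
    """Sketch of QC -> normalization -> HVG selection -> Harmony.

    `batch_key`, the QC thresholds and the PCA size are assumptions,
    not values taken from the repository.
    """
    # Imported lazily so the function can be defined even when the
    # optional dependencies (scanpy, harmonypy) are not installed.
    import scanpy as sc
    import scanpy.external as sce

    # QC: drop near-empty cells and rarely detected genes (assumed thresholds).
    sc.pp.filter_cells(adata, min_genes=200)
    sc.pp.filter_genes(adata, min_cells=3)

    # Library-size normalization followed by a log1p transform.
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)

    # Keep the most variable genes (3,000 in the README).
    sc.pp.highly_variable_genes(adata, n_top_genes=n_hvgs)
    adata = adata[:, adata.var["highly_variable"]].copy()

    # Harmony corrects a PCA embedding, so PCA has to be computed first.
    sc.pp.pca(adata, n_comps=50)
    sce.pp.harmony_integrate(adata, key=batch_key)  # writes obsm["X_pca_harmony"]
    return adata
```

Calling `preprocess(adata)` on an `AnnData` object whose `obs` has a `dataset` column would return a filtered copy with the corrected embedding in `adata.obsm["X_pca_harmony"]`.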

  ## Evaluation
 

  ## How to Use

+ ### Quick Start

  ```bash
+ git clone https://github.com/polinavd/multimodal-tcell-classifier.git
+ cd multimodal-tcell-classifier
+ pip install -r requirements.txt
+ python predict_report.py --input your_data.h5ad --output ./results
  ```

+ Model weights (~300 MB) are downloaded automatically from this HuggingFace repo on first run.

  Output: interactive HTML report, predictions.csv, annotated .h5ad.
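
The predictions in these outputs come from the top-5 ensemble. This diff does not show how the five models are combined, so the sketch below illustrates one common option, soft voting, under that assumption. The function name `soft_vote` and the definition of agreement (the fraction of models whose own top class matches the ensemble prediction) are hypothetical and may differ from the repository's `ensemble_predict`.

```python
def soft_vote(per_model_probs):
    """Average class probabilities across models for each cell.

    per_model_probs[m][i][c] is model m's probability for class c on cell i.
    Returns (predictions, mean_probs, agreement).
    """
    n_models = len(per_model_probs)
    n_cells = len(per_model_probs[0])
    predictions, mean_probs, agreement = [], [], []
    for i in range(n_cells):
        rows = [per_model_probs[m][i] for m in range(n_models)]
        # Column-wise mean over models gives the ensemble probabilities.
        avg = [sum(col) / n_models for col in zip(*rows)]
        pred = max(range(len(avg)), key=avg.__getitem__)
        # Count the models whose own argmax equals the ensemble prediction.
        votes = sum(max(range(len(r)), key=r.__getitem__) == pred for r in rows)
        predictions.append(pred)
        mean_probs.append(avg)
        agreement.append(votes / n_models)
    return predictions, mean_probs, agreement


# Toy example: 3 models, 2 cells, 3 classes.
probs = [
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]],
    [[0.6, 0.3, 0.1], [0.2, 0.3, 0.5]],
    [[0.5, 0.4, 0.1], [0.1, 0.7, 0.2]],
]
preds, mean_p, agree = soft_vote(probs)
```

In the toy data all three models pick class 0 for the first cell, while only two of three pick class 1 for the second, so the second cell gets a lower agreement score.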

+ ### Manual Weight Download

  ```python
  from huggingface_hub import snapshot_download
 
  ## Citation

  ```bibtex
+ @software{shirokikh2025multimodal,
  author = {Shirokikh, Polina},
  title = {Multimodal T-Cell Functional State Classifier},
+ year = {2025},
  url = {https://github.com/polinavd/multimodal-tcell-classifier}
  }
  ```

  ## License

+ MIT License - see [LICENSE](https://github.com/polinavd/multimodal-tcell-classifier/blob/main/LICENSE) for details.