Ham-Kris committed on
Commit 94f83d2 · unverified · 0 Parent(s)

Initial release: MacroPert IBD macrophage perturbation model

- scGPT v2 fine-tuned on IBD macrophage scRNA-seq (GSE134809 + 3 datasets)
- Continuous polarization embedding: Pearson r=0.909, IBD AUROC=0.989
- scGPT v1 discrete baseline archived for comparison
- In silico KO predictions for 8 IBD target genes (CellOracle + OmniPath)

.gitattributes ADDED
@@ -0,0 +1 @@
+ *.pt filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,119 @@
+ ---
+ license: mit
+ language:
+ - en
+ tags:
+ - biology
+ - single-cell
+ - scRNA-seq
+ - macrophage
+ - IBD
+ - inflammatory-bowel-disease
+ - perturbation
+ - gene-knockout
+ - scGPT
+ - transformers
+ datasets:
+ - GEO:GSE134809
+ - GEO:GSE116222
+ - GEO:GSE182270
+ - GEO:GSE148810
+ metrics:
+ - pearsonr
+ - roc_auc
+ ---
+
+ # MacroPert: IBD Macrophage Perturbation Model
+
+ A fine-tuned scGPT model for predicting macrophage polarization state and IBD disease status from single-cell RNA-seq, combined with in silico gene knockout predictions for 8 IBD target genes.
+
+ ## Model Description
+
+ Macrophages in inflammatory bowel disease (IBD) exist along a continuous spectrum between pro-inflammatory (M1) and anti-inflammatory (M2) states, rather than in discrete classes. **MacroPert** captures this continuous polarization spectrum and predicts how individual gene knockouts shift macrophage state.
+
+ The repository contains:
+
+ | File | Description |
+ |------|-------------|
+ | `scgpt_ibd_v2/best_model.pt` | scGPT v2 fine-tuned weights (continuous polarization model) |
+ | `scgpt_ibd_v2/test_metrics.json` | v2 test set performance metrics |
+ | `scgpt_ibd_v1/best_model.pt` | scGPT v1 weights (discrete 4-class baseline, archived) |
+ | `scgpt_ibd_v1/test_metrics.json` | v1 test set performance metrics |
+ | `results/ibd_ko_predictions_combined.json` | In silico KO predictions for 8 IBD target genes |
+
+ ## Performance
+
+ | Metric | scGPT v1 (4-class) | scGPT v2 (continuous) |
+ |--------|--------------------|-----------------------|
+ | IBD AUROC | 0.915 | **0.989** |
+ | Polarization Pearson r | n/a | **0.909** |
+ | Classification Accuracy | 0.754 | n/a |
+
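For readers who want to recompute the two headline metrics on their own predictions, here is a minimal standard-library sketch of Pearson r and AUROC. It uses the textbook definitions; the repository does not state how the reported values were computed, so the function names and this implementation are assumptions for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of floats."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a sanity check but slow for large test sets.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy inputs (made up for illustration, not taken from the test set)
print(pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise AUROC is chosen for readability; a rank-based version gives the same value and scales better.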
+ ## Training Data
+
+ Fine-tuned on four IBD macrophage scRNA-seq datasets:
+
+ | GEO Accession | Cells | Condition |
+ |---------------|-------|-----------|
+ | GSE134809 | 13,794 | IBD macrophages (primary) |
+ | GSE116222 | ~8,000 | Ulcerative colitis macrophages |
+ | GSE182270 | ~6,000 | IBD macrophage substates |
+ | GSE148810 | ~5,000 | Crohn's disease macrophages |
+
+ ## Model Architecture
+
+ - **Base:** scGPT pretrained transformer
+ - **Fine-tuning objective:**
+   ```
+   L = 0.5 × L_soft_contrastive + 0.3 × L_MSE(pol) + 0.2 × L_BCE(ibd)
+   ```
+ - **Polarization score:** `pol_score = z-score(M1_score − M2_score)` using literature-based gene signatures
+   - M1 genes: TNF, IL1B, IL6, CXCL10, NOS2, CD80, CD86, CCL5, CXCL9, PTGS2, IRF5, HIF1A, CXCL8, IL12A
+   - M2 genes: CD163, MRC1, ARG1, IL10, TGFB1, CCL18, CD209, FOLR2, SOCS3, HMOX1, CLEC7A
+ - **Soft contrastive loss:** Gaussian kernel `exp(−d² / (2σ²))` over pairwise polarization-score distances `d`; this shapes the embedding space continuously, without discrete cluster boundaries
+ - **Output heads:** polarization regression + IBD binary classification
+
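The polarization score, the Gaussian kernel, and the loss weighting listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the training code: the per-cell signature score is assumed to be the mean expression of the signature genes present in the cell, the z-score is taken over the cells passed in, and `sigma=1.0` is a placeholder. The README specifies none of these details.

```python
import math

M1_GENES = ["TNF", "IL1B", "IL6", "CXCL10", "NOS2", "CD80", "CD86",
            "CCL5", "CXCL9", "PTGS2", "IRF5", "HIF1A", "CXCL8", "IL12A"]
M2_GENES = ["CD163", "MRC1", "ARG1", "IL10", "TGFB1", "CCL18",
            "CD209", "FOLR2", "SOCS3", "HMOX1", "CLEC7A"]


def signature_score(expr, genes):
    """Mean expression over the signature genes found in `expr` (assumed rule)."""
    values = [expr[g] for g in genes if g in expr]
    return sum(values) / len(values) if values else 0.0


def polarization_scores(cells):
    """z-score of (M1_score - M2_score) across the given cells."""
    raw = [signature_score(c, M1_GENES) - signature_score(c, M2_GENES)
           for c in cells]
    mean = sum(raw) / len(raw)
    std = math.sqrt(sum((r - mean) ** 2 for r in raw) / len(raw))
    return [(r - mean) / std if std > 0 else 0.0 for r in raw]


def soft_similarity(pol_i, pol_j, sigma=1.0):
    """Gaussian kernel exp(-d^2 / (2 sigma^2)) on polarization-score distance."""
    d = pol_i - pol_j
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def total_loss(l_soft_contrastive, l_mse_pol, l_bce_ibd):
    """Weighted sum from the README: 0.5, 0.3 and 0.2."""
    return 0.5 * l_soft_contrastive + 0.3 * l_mse_pol + 0.2 * l_bce_ibd


# Two toy cells (made-up expression values): one M1-like, one M2-like
cells = [{"TNF": 2.0, "CD163": 0.0}, {"TNF": 0.0, "CD163": 2.0}]
pol = polarization_scores(cells)
print(pol, soft_similarity(pol[0], pol[1]))
```

Cells with similar polarization scores get a kernel value near 1 and are treated as soft positives; distant cells get a value near 0, so no hard class boundary is needed.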
+ ## In Silico KO Predictions
+
+ KO effects for 8 IBD target genes were predicted using two methods, CellOracle GRN simulation (HIF1A, IRF5) and OmniPath network propagation (the remaining six genes):
+
+ | Gene | Method | Key Predicted Effect |
+ |------|--------|---------------------|
+ | HIF1A | CellOracle (GRN) | PTGS2↓, CXCL8↓, CD74↓ |
+ | IRF5 | CellOracle (GRN) | ISG15↓, IFI6↓, IFIT3↓, IRF7↓ |
+ | IL6 | OmniPath propagation | JAK2↓, IL6ST↓, JAK1↓, IL6R↓, TYK2↓ |
+ | SOCS3 | OmniPath propagation | JAK2↓, STAT5A↓, STAT1↓, STAT3↓; AKT1↑ |
+ | TNF | OmniPath propagation | TNFRSF1A↓, TNFRSF1B↓, PIK3CG↓, AKT1↓ |
+ | TGFB1 | OmniPath propagation | PIK3R1↓, RAC1↓, TGFBR1↓, GRB2↓ |
+ | IL1B | OmniPath propagation | IL1R2↓, MYD88↓, STAT3↓, NR3C1↓ |
+ | PTGS2 | OmniPath propagation | IL4R↓, IL2RG↓, NFE2L2↓ |
+
+ Full predictions (up to 15 top upregulated and 15 top downregulated genes per knockout) are in `results/ibd_ko_predictions_combined.json`.
+
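The README does not describe how the OmniPath propagation was implemented. As a rough intuition for what signed network propagation can look like, the sketch below pushes a knockout signal of −1 along signed edges with a per-hop decay, summing contributions over simple paths. The function name, the decay of 0.5, the depth limit, and the toy edge list are all hypothetical and are not taken from OmniPath or from this repository.

```python
from collections import defaultdict


def propagate_knockout(edges, ko_gene, decay=0.5, max_depth=3):
    """Sum signed, decayed contributions over simple paths from `ko_gene`.

    edges: iterable of (source, target, sign), sign +1 for activation
    and -1 for inhibition. The knocked-out gene starts with signal -1.
    """
    graph = defaultdict(list)
    for src, dst, sign in edges:
        graph[src].append((dst, sign))

    effects = defaultdict(float)

    def walk(node, signal, depth, visited):
        if depth == max_depth:
            return
        for dst, sign in graph[node]:
            if dst in visited:  # skip cycles
                continue
            contribution = signal * sign * decay
            effects[dst] += contribution
            walk(dst, contribution, depth + 1, visited | {dst})

    walk(ko_gene, -1.0, 0, {ko_gene})
    return dict(effects)


# Toy network with placeholder node names (not real interactions)
toy_edges = [
    ("KO", "R1", +1),    # knocked-out ligand activates a receptor
    ("R1", "K1", +1),    # receptor activates a kinase
    ("K1", "TF1", +1),   # kinase activates a transcription factor
    ("K1", "INH", -1),   # kinase inhibits another node
]
print(propagate_knockout(toy_edges, "KO"))
```

In this toy setup, losing an activator lowers its targets and losing an inhibitor's upstream signal raises the inhibited node, with the magnitude halving at each hop.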
+ ## Usage
+
+ ```python
+ import json
+
+ import torch
+
+ # Load the fine-tuned v2 checkpoint onto CPU
+ checkpoint = torch.load("scgpt_ibd_v2/best_model.pt", map_location="cpu")
+
+ # Load the KO predictions (one entry per knocked-out gene)
+ with open("results/ibd_ko_predictions_combined.json") as f:
+     ko_predictions = json.load(f)
+
+ # Example: top downregulated genes after IL6 KO
+ il6_ko = ko_predictions["IL6"]
+ print(il6_ko["top_downregulated"][:5])
+ # [['JAK2', -0.875], ['IL6ST', -0.75], ['JAK1', -0.625], ['IL6R', -0.5], ['TYK2', -0.375]]
+ ```
+
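When iterating over every gene in the predictions file, note that the two methods store slightly different summary fields: the OmniPath entries carry `n_downstream_genes` and `mean_abs_effect`, while the CellOracle entries (HIF1A, IRF5) carry only `mean_abs_change`. The helper below is a small sketch that tolerates both layouts; `summarize_ko` is a hypothetical name, and the sample dictionary is a truncated copy of two entries from the JSON file.

```python
def summarize_ko(ko_predictions):
    """Return one summary row per knocked-out gene, whatever the method."""
    rows = []
    for gene, entry in ko_predictions.items():
        # OmniPath entries use "mean_abs_effect", CellOracle "mean_abs_change"
        magnitude = entry.get("mean_abs_effect", entry.get("mean_abs_change"))
        down = entry.get("top_downregulated", [])
        strongest = min(down, key=lambda pair: pair[1]) if down else None
        rows.append({
            "gene": gene,
            "mean_abs": magnitude,
            "n_downstream": entry.get("n_downstream_genes"),  # None for CellOracle
            "strongest_down": strongest,
        })
    return rows


# Truncated copy of two entries from results/ibd_ko_predictions_combined.json
sample = {
    "IL6": {
        "top_downregulated": [["JAK2", -0.875], ["IL6ST", -0.75]],
        "n_downstream_genes": 61,
        "mean_abs_effect": 0.2069672131147541,
    },
    "HIF1A": {
        "top_downregulated": [["HIF1A", -0.15231378650512128],
                              ["G0S2", -0.0899483601733625]],
        "mean_abs_change": 0.0007538475191108255,
    },
}
for row in summarize_ko(sample):
    print(row)
```

The effect sizes from the two methods are on visibly different scales in the file, so ranking genes across methods by these magnitudes is probably not meaningful without further normalization.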
+ ## Citation
+
+ If you use this model, please cite the underlying datasets and tools:
+
+ - **scGPT:** Cui et al., *Nature Methods* 2024
+ - **CellOracle:** Kamimoto et al., *Nature* 2023
+ - **OmniPath:** Türei et al., *Nature Methods* 2016
+ - **GSE134809:** Martin et al., *Cell* 2019
results/ibd_ko_predictions_combined.json ADDED
@@ -0,0 +1,992 @@
+ {
+ "PTGS2": {
+ "top_upregulated": [
+ [
+ "CUL3",
+ 0.25
+ ],
+ [
+ "SEC31A",
+ 0.125
+ ],
+ [
+ "RAPGEF2",
+ 0.125
+ ],
+ [
+ "YWHAB",
+ 0.125
+ ],
+ [
+ "ARIH1",
+ 0.125
+ ],
+ [
+ "PFKFB3",
+ 0.125
+ ],
+ [
+ "IKBKG",
+ 0.125
+ ],
+ [
+ "BAD",
+ 0.125
+ ],
+ [
+ "FOXO3",
+ 0.125
+ ],
+ [
+ "TARDBP",
+ 0.125
+ ],
+ [
+ "COPS5",
+ 0.125
+ ],
+ [
+ "ARHGAP24",
+ 0.125
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "IL4R",
+ -0.375
+ ],
+ [
+ "IL2RG",
+ -0.375
+ ],
+ [
+ "RHOA",
+ -0.25
+ ],
+ [
+ "NFE2L2",
+ -0.25
+ ],
+ [
+ "TP53",
+ -0.25
+ ],
+ [
+ "RAC1",
+ -0.25
+ ],
+ [
+ "SMARCA4",
+ -0.25
+ ],
+ [
+ "IKBKB",
+ -0.25
+ ],
+ [
+ "SMARCD3",
+ -0.25
+ ],
+ [
+ "IL13RA1",
+ -0.25
+ ],
+ [
+ "JAK1",
+ -0.25
+ ],
+ [
+ "JAK3",
+ -0.25
+ ],
+ [
+ "JAK2",
+ -0.25
+ ],
+ [
+ "WNK1",
+ -0.125
+ ],
+ [
+ "VAV1",
+ -0.125
+ ]
+ ],
+ "n_downstream_genes": 82,
+ "mean_abs_effect": 0.14939024390243902
+ },
+ "IL1B": {
+ "top_upregulated": [
+ [
+ "YWHAZ",
+ 0.375
+ ],
+ [
+ "STMN1",
+ 0.375
+ ],
+ [
+ "YWHAB",
+ 0.125
+ ],
+ [
+ "RCSD1",
+ 0.125
+ ],
+ [
+ "NFE2",
+ 0.125
+ ],
+ [
+ "LAT",
+ 0.125
+ ],
+ [
+ "NR4A1",
+ 0.125
+ ],
+ [
+ "KLF13",
+ 0.125
+ ],
+ [
+ "TOLLIP",
+ 0.125
+ ],
+ [
+ "SRSF1",
+ 0.125
+ ],
+ [
+ "TOB1",
+ 0.125
+ ],
+ [
+ "CFLAR",
+ 0.125
+ ],
+ [
+ "CTBP1",
+ 0.125
+ ],
+ [
+ "STAT6",
+ 0.125
+ ],
+ [
+ "AIMP1",
+ 0.125
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "IL1R2",
+ -0.5
+ ],
+ [
+ "MYD88",
+ -0.5
+ ],
+ [
+ "TP53",
+ -0.375
+ ],
+ [
+ "STAT3",
+ -0.375
+ ],
+ [
+ "NR3C1",
+ -0.375
+ ],
+ [
+ "JUN",
+ -0.375
+ ],
+ [
+ "BCL2L11",
+ -0.375
+ ],
+ [
+ "AKT1",
+ -0.25
+ ],
+ [
+ "PIN1",
+ -0.25
+ ],
+ [
+ "BID",
+ -0.25
+ ],
+ [
+ "STAT1",
+ -0.25
+ ],
+ [
+ "MYC",
+ -0.25
+ ],
+ [
+ "ATF2",
+ -0.25
+ ],
+ [
+ "APP",
+ -0.25
+ ],
+ [
+ "HNRNPK",
+ -0.25
+ ]
+ ],
+ "n_downstream_genes": 81,
+ "mean_abs_effect": 0.16666666666666666
+ },
+ "TNF": {
+ "top_upregulated": [
+ [
+ "PPP2CA",
+ 0.25
+ ],
+ [
+ "TRAF5",
+ 0.25
+ ],
+ [
+ "CHN2",
+ 0.125
+ ],
+ [
+ "PDHA1",
+ 0.125
+ ],
+ [
+ "CAPN1",
+ 0.125
+ ],
+ [
+ "TP53",
+ 0.125
+ ],
+ [
+ "VHL",
+ 0.125
+ ],
+ [
+ "ENG",
+ 0.125
+ ],
+ [
+ "FXN",
+ 0.125
+ ],
+ [
+ "VCL",
+ 0.125
+ ],
+ [
+ "MAPRE1",
+ 0.125
+ ],
+ [
+ "RUNX3",
+ 0.125
+ ],
+ [
+ "YY1",
+ 0.125
+ ],
+ [
+ "PTPN2",
+ 0.125
+ ],
+ [
+ "CFL1",
+ 0.125
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "PSIP1",
+ -0.5
+ ],
+ [
+ "TNFRSF1B",
+ -0.5
+ ],
+ [
+ "AKT1",
+ -0.5
+ ],
+ [
+ "TNFRSF1A",
+ -0.5
+ ],
+ [
+ "PIK3CG",
+ -0.5
+ ],
+ [
+ "TNFRSF21",
+ -0.5
+ ],
+ [
+ "NFE2L2",
+ -0.5
+ ],
+ [
+ "GCLC",
+ -0.5
+ ],
+ [
+ "CASP7",
+ -0.375
+ ],
+ [
+ "PAK2",
+ -0.375
+ ],
+ [
+ "MAP3K1",
+ -0.375
+ ],
+ [
+ "IKBKB",
+ -0.375
+ ],
+ [
+ "PDPK1",
+ -0.375
+ ],
+ [
+ "RAC1",
+ -0.375
+ ],
+ [
+ "SET",
+ -0.375
+ ]
+ ],
+ "n_downstream_genes": 202,
+ "mean_abs_effect": 0.1608910891089109
+ },
+ "IL6": {
+ "top_upregulated": [
+ [
+ "CDK4",
+ 0.125
+ ],
+ [
+ "YY1",
+ 0.125
+ ],
+ [
+ "LCP2",
+ 0.125
+ ],
+ [
+ "CDKN1B",
+ 0.125
+ ],
+ [
+ "KLF10",
+ 0.125
+ ],
+ [
+ "PDP1",
+ 0.125
+ ],
+ [
+ "PTEN",
+ 0.125
+ ],
+ [
+ "FOXO3",
+ 0.125
+ ],
+ [
+ "MAP3K5",
+ 0.125
+ ],
+ [
+ "PTPN2",
+ 0.125
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "JAK2",
+ -0.875
+ ],
+ [
+ "IL6ST",
+ -0.75
+ ],
+ [
+ "JAK1",
+ -0.625
+ ],
+ [
+ "IL6R",
+ -0.5
+ ],
+ [
+ "TYK2",
+ -0.375
+ ],
+ [
+ "GAB2",
+ -0.375
+ ],
+ [
+ "STAT3",
+ -0.375
+ ],
+ [
+ "PTPN11",
+ -0.375
+ ],
+ [
+ "STAT5A",
+ -0.375
+ ],
+ [
+ "STAT1",
+ -0.375
+ ],
+ [
+ "STAT5B",
+ -0.375
+ ],
+ [
+ "IFNGR1",
+ -0.25
+ ],
+ [
+ "GRB2",
+ -0.25
+ ],
+ [
+ "STAT6",
+ -0.25
+ ],
+ [
+ "IFNGR2",
+ -0.25
+ ]
+ ],
+ "n_downstream_genes": 61,
+ "mean_abs_effect": 0.2069672131147541
+ },
+ "SOCS3": {
+ "top_upregulated": [
+ [
+ "AKT1",
+ 1.25
+ ],
+ [
+ "MAPK14",
+ 1.125
+ ],
+ [
+ "IL6ST",
+ 1.0
+ ],
+ [
+ "CDKN1A",
+ 0.875
+ ],
+ [
+ "SIGLEC7",
+ 0.75
+ ],
+ [
+ "TP53",
+ 0.75
+ ],
+ [
+ "MAP3K7",
+ 0.75
+ ],
+ [
+ "MAP2K7",
+ 0.75
+ ],
+ [
+ "RAC1",
+ 0.625
+ ],
+ [
+ "RARA",
+ 0.5
+ ],
+ [
+ "MAPK1",
+ 0.5
+ ],
+ [
+ "IRF5",
+ 0.5
+ ],
+ [
+ "MCL1",
+ 0.5
+ ],
+ [
+ "BID",
+ 0.5
+ ],
+ [
+ "RELA",
+ 0.5
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "JAK2",
+ -1.75
+ ],
+ [
+ "STAT5A",
+ -1.25
+ ],
+ [
+ "JAK1",
+ -1.0
+ ],
+ [
+ "STAT1",
+ -0.75
+ ],
+ [
+ "STAT3",
+ -0.75
+ ],
+ [
+ "STAT4",
+ -0.75
+ ],
+ [
+ "GAB2",
+ -0.625
+ ],
+ [
+ "PTPN11",
+ -0.625
+ ],
+ [
+ "BTK",
+ -0.625
+ ],
+ [
+ "TYK2",
+ -0.625
+ ],
+ [
+ "MYC",
+ -0.625
+ ],
+ [
+ "IL12RB1",
+ -0.5
+ ],
+ [
+ "STAT6",
+ -0.5
+ ],
+ [
+ "CTNNB1",
+ -0.5
+ ],
+ [
+ "GTF2I",
+ -0.5
+ ]
+ ],
+ "n_downstream_genes": 763,
+ "mean_abs_effect": 0.165956749672346
+ },
+ "TGFB1": {
+ "top_upregulated": [
+ [
+ "CAPN1",
+ 0.625
+ ],
+ [
+ "MARCKS",
+ 0.375
+ ],
+ [
+ "VAPA",
+ 0.375
+ ],
+ [
+ "BCL2",
+ 0.375
+ ],
+ [
+ "CEBPA",
+ 0.375
+ ],
+ [
+ "CEBPB",
+ 0.375
+ ],
+ [
+ "CD36",
+ 0.25
+ ],
+ [
+ "NRAS",
+ 0.25
+ ],
+ [
+ "PPP2CA",
+ 0.25
+ ],
+ [
+ "ABCA1",
+ 0.25
+ ],
+ [
+ "LRRK1",
+ 0.25
+ ],
+ [
+ "CTNND1",
+ 0.25
+ ],
+ [
+ "RUNX2",
+ 0.25
+ ],
+ [
+ "EEF1A1",
+ 0.25
+ ],
+ [
+ "PTPN2",
+ 0.25
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "PIK3R1",
+ -2.0
+ ],
+ [
+ "RAC1",
+ -1.5
+ ],
+ [
+ "TGFBR1",
+ -1.25
+ ],
+ [
+ "GRB2",
+ -1.125
+ ],
+ [
+ "PAK1",
+ -1.0
+ ],
+ [
+ "SYK",
+ -1.0
+ ],
+ [
+ "DAPK1",
+ -0.875
+ ],
+ [
+ "CTNNB1",
+ -0.875
+ ],
+ [
+ "TGFBR2",
+ -0.875
+ ],
+ [
+ "PIK3CG",
+ -0.875
+ ],
+ [
+ "CDKN1A",
+ -0.875
+ ],
+ [
+ "PPP2R2A",
+ -0.875
+ ],
+ [
+ "AKT1",
+ -0.75
+ ],
+ [
+ "INSR",
+ -0.75
+ ],
+ [
+ "RHOA",
+ -0.75
+ ]
+ ],
+ "n_downstream_genes": 533,
+ "mean_abs_effect": 0.20309568480300189
+ },
+ "HIF1A": {
+ "top_upregulated": [
+ [
+ "S100A12",
+ 0.0008134663168389971
+ ],
+ [
+ "S100A8",
+ 0.0005305317544904874
+ ],
+ [
+ "PPBP",
+ 0.0005028657796491108
+ ],
+ [
+ "NRGN",
+ 0.0003803229125426224
+ ],
+ [
+ "MT1H",
+ 0.00032238103713177417
+ ],
+ [
+ "IDO1",
+ 0.00016444837297460802
+ ],
+ [
+ "SPP1",
+ 0.0001275240900207309
+ ],
+ [
+ "MT1G",
+ 0.0001227169890742035
+ ],
+ [
+ "MT1HL1",
+ 9.230909985226494e-05
+ ],
+ [
+ "MT1A",
+ 8.65444566497999e-05
+ ],
+ [
+ "MT1E",
+ 7.576200801041446e-05
+ ],
+ [
+ "MT1M",
+ 5.678793555160166e-05
+ ],
+ [
+ "INHBA",
+ 5.174907556743538e-05
+ ],
+ [
+ "LYPD2",
+ 1.659507360527506e-05
+ ],
+ [
+ "SERPINB2",
+ 1.557577640575456e-05
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "HIF1A",
+ -0.15231378650512128
+ ],
+ [
+ "G0S2",
+ -0.0899483601733625
+ ],
+ [
+ "CD74",
+ -0.08843891599198114
+ ],
+ [
+ "PTGS2",
+ -0.06775858018188559
+ ],
+ [
+ "HLA-DRB1",
+ -0.06695744893492933
+ ],
+ [
+ "TMSB4X",
+ -0.051565268523866295
+ ],
+ [
+ "C1QA",
+ -0.046627735305617694
+ ],
+ [
+ "NEAT1",
+ -0.04370720851767882
+ ],
+ [
+ "HLA-DPA1",
+ -0.04309927287770282
+ ],
+ [
+ "CXCL8",
+ -0.041510155249684176
+ ],
+ [
+ "SRGN",
+ -0.03785925150785177
+ ],
+ [
+ "HLA-DRA",
+ -0.03714633648291597
+ ],
+ [
+ "LYZ",
+ -0.03533002467842362
+ ],
+ [
+ "PLAUR",
+ -0.033321923706293444
+ ],
+ [
+ "SOD2",
+ -0.033279197447284886
+ ]
+ ],
+ "mean_abs_change": 0.0007538475191108255
+ },
+ "IRF5": {
+ "top_upregulated": [
+ [
+ "S100A8",
+ 0.006895962237654877
+ ],
+ [
+ "SDS",
+ 0.0005239871508121099
+ ],
+ [
+ "RASGEF1B",
+ 0.00020764254474868164
+ ],
+ [
+ "BCL6",
+ 0.00018783634677199616
+ ],
+ [
+ "MMP19",
+ 0.00014403313556970076
+ ],
+ [
+ "CCL5",
+ 0.0001088469455209754
+ ],
+ [
+ "SLC7A8",
+ 9.047486046896145e-05
+ ],
+ [
+ "ALOX5AP",
+ 4.92001214528229e-05
+ ],
+ [
+ "TFRC",
+ 4.881696522487404e-05
+ ],
+ [
+ "PHKG1",
+ 3.2193447632671586e-05
+ ],
+ [
+ "SOCS3",
+ 2.9233696066954194e-05
+ ],
+ [
+ "CLEC4D",
+ 2.270800460759542e-05
+ ],
+ [
+ "AQP9",
+ 2.116691988543141e-05
+ ],
+ [
+ "MIR4435-2HG",
+ 1.9844803099479844e-05
+ ],
+ [
+ "GLYR1",
+ 1.9713190685242186e-05
+ ]
+ ],
+ "top_downregulated": [
+ [
+ "IRF5",
+ -0.049034021232352516
+ ],
+ [
+ "SAT1",
+ -0.04257470526835846
+ ],
+ [
+ "HLA-DPB1",
+ -0.016567939176193442
+ ],
+ [
+ "HLA-DRB1",
+ -0.00982815800355405
+ ],
+ [
+ "ISG15",
+ -0.008719249415564087
+ ],
+ [
+ "IFI6",
+ -0.008579207224463483
+ ],
+ [
+ "IRF7",
+ -0.0078918440139776
+ ],
+ [
+ "NT5C3A",
+ -0.007360925077294011
+ ],
+ [
+ "SOD2",
+ -0.0063346473315063845
+ ],
+ [
+ "IFIT3",
+ -0.005948570201037761
+ ],
+ [
+ "PNPLA6",
+ -0.004070137653429122
+ ],
+ [
+ "ACP5",
+ -0.0028732792457450704
+ ],
+ [
+ "GBP1",
+ -0.002759290416514405
+ ],
+ [
+ "OTUD4",
+ -0.002481654304085488
+ ],
+ [
+ "NKG7",
+ -0.002318524392938188
+ ]
+ ],
+ "mean_abs_change": 7.050768231451016e-05
+ }
+ }
scgpt_ibd_v1/best_model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fed21d7eb4542781acd8c19dbe43202d1933c432b60c8568e1007abbfacba40b
+ size 205939765
scgpt_ibd_v1/test_metrics.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "acc": 0.7543160690571049,
+ "f1": 0.69594503489039,
+ "auroc": 0.9150601745956721
+ }
scgpt_ibd_v2/best_model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ebcd6dc411ecc68632a535ccdadc2ae65995582b1c48cf4d0aa03d3f5e7b70b
+ size 206465979
scgpt_ibd_v2/test_metrics.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "pearson_r": 0.909355103969574,
+ "ibd_auroc": 0.989314058956916
+ }