jruffle committed on
Commit 26a55a8 · verified · 1 Parent(s): d64f9ed

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +17 -45
  2. metadata.json +16 -0
  3. model.joblib +2 -2
README.md CHANGED
@@ -1,63 +1,35 @@
  ---
- title: Classical Methods (Transcriptome-centric, 32D)
- emoji: 📊
- colorFrom: purple
- colorTo: blue
- sdk: python
  tags:
  - transcriptomics
  - dimensionality-reduction
- - pca
-
+ - classical
+ - TRACERx
  license: mit
  ---
 
- # Classical Dimensionality Reduction (Transcriptome-centric, 32D)
-
- Pre-trained PCA models for transcriptomics data compression, part of the TRACERx Datathon 2025 project.
-
- ## Model Details
+ # CLASSICAL Model - transcriptome mode - 32D
 
- - **Methods**: PCA
- - **Compression Mode**: Transcriptome-centric
- - **Output Dimensions**: 32
- - **Training Data**: TRACERx open dataset (VST-normalized counts)
+ Pre-trained classical model for transcriptomic data compression.
 
- ## Contents
-
- The model file contains:
- - **PCA**: Principal Component Analysis model
- - **UMAP**: Uniform Manifold Approximation and Projection model (2-4D only)
- - **Scaler**: StandardScaler fitted on TRACERx data
- - **Feature Order**: Gene/sample order for alignment
+ ## Details
+ - **Mode**: transcriptome-centric compression
+ - **Dimensions**: 32
+ - **Training data**: TRACERx lung cancer transcriptomics
+ - **Created**: 2026-01-09T20:55:40.333646
 
  ## Usage
 
- These models are designed to be used with the TRACERx Datathon 2025 analysis pipeline.
- They will be automatically downloaded and cached when needed.
-
  ```python
  import joblib
+ from huggingface_hub import snapshot_download
 
- # Load the model bundle
- model_data = joblib.load("model.joblib")
+ # Download model
+ local_dir = snapshot_download("jruffle/classical_transcriptome_32d")
+ model = joblib.load(f"{local_dir}/model.joblib")
 
- # Access components
- pca = model_data['pca']
- scaler = model_data['scaler']
- gene_order = model_data.get('gene_order') # For sample-centric
+ # For classical models (PCA/UMAP):
+ # model contains: 'pca', 'umap', 'robust_scaler', 'gene_order'
 
- # Transform new data
- scaled_data = scaler.transform(aligned_data)
- embeddings = pca.transform(scaled_data)
+ # For TabPFN models:
+ # model contains: 'tabpfn_embedding', 'pca_final', 'input_scaler', etc.
  ```
-
- ## Training Details
-
- - **Input Features**: 1,051 samples
- - **Training Samples**: 20,136 genes
- - **Preprocessing**: StandardScaler normalization
-
- ## Files
-
- - `model.joblib`: Model bundle containing PCA, scaler, and feature order
metadata.json ADDED
@@ -0,0 +1,16 @@
+ {
+ "model_type": "classical",
+ "mode": "transcriptome",
+ "dimensions": 32,
+ "created": "2026-01-09T20:55:40.333915",
+ "keys": [
+ "robust_scaler",
+ "norm_params",
+ "pca",
+ "preprocessing_method",
+ "preprocessing_quantile_range",
+ "gene_ids",
+ "sample_order",
+ "umap"
+ ]
+ }
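The `keys` array in `metadata.json` offers a way to sanity-check a loaded bundle before using it. The README comment in this commit lists `'gene_order'`, while `metadata.json` lists `gene_ids` and `sample_order`, so inspecting the actual keys may be safer than relying on the comment. A minimal sketch, assuming `model.joblib` deserializes to a dict-like object whose keys correspond to the `keys` array; the helper name `check_bundle_keys` is hypothetical and not part of this repository, and a plain dict stands in for the loaded model.

```python
import json

# Contents of the metadata.json added in this commit (indentation added for readability).
METADATA_JSON = """
{
  "model_type": "classical",
  "mode": "transcriptome",
  "dimensions": 32,
  "created": "2026-01-09T20:55:40.333915",
  "keys": [
    "robust_scaler",
    "norm_params",
    "pca",
    "preprocessing_method",
    "preprocessing_quantile_range",
    "gene_ids",
    "sample_order",
    "umap"
  ]
}
"""


def check_bundle_keys(metadata: dict, bundle: dict) -> dict:
    """Compare the keys listed in metadata.json with the keys present in a bundle."""
    expected = set(metadata["keys"])
    actual = set(bundle.keys())
    return {
        "missing": sorted(expected - actual),      # listed in metadata, absent from bundle
        "unexpected": sorted(actual - expected),   # present in bundle, not listed in metadata
    }


metadata = json.loads(METADATA_JSON)

# Stand-in for joblib.load(...): a dict holding only the keys named in the README comment.
# This is an assumption for illustration; the real bundle may differ.
readme_keys = {"pca": None, "umap": None, "robust_scaler": None, "gene_order": None}

report = check_bundle_keys(metadata, readme_keys)
print(report)
```

With a real download, `bundle` would be the object returned by `joblib.load(f"{local_dir}/model.joblib")`, provided it is in fact a dict.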
model.joblib CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:6b5bbc85834784c5a52a4bf70335c71ca4b61eaddada562e4426c5e80da954d1
- size 498324
+ oid sha256:c1481d13c98c9476963d157f2edd125a35d635839febc814de4773309aab29de
+ size 351168542
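The new Git LFS pointer records the SHA-256 digest and byte size of the updated `model.joblib`. Because `joblib.load` unpickles the file, and unpickling can execute arbitrary code, it may be worth checking a downloaded copy against the pointer before loading it. A minimal sketch using only the standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repository.

```python
import hashlib
import os

# New pointer contents from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c1481d13c98c9476963d157f2edd125a35d635839febc814de4773309aab29de
size 351168542
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the size and digest recorded in the pointer."""
    # Cheap check first: a size mismatch rules the file out without hashing it.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Hash in chunks so a large file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

After a download, the check would look like `matches_pointer(f"{local_dir}/model.joblib", pointer)`, where `local_dir` is the directory returned by `snapshot_download` as in the README.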