Upload folder using huggingface_hub

Files changed (3)

README.md +20 -44
metadata.json +15 -0
model.joblib +2 -2

README.md CHANGED

@@ -1,63 +1,39 @@
 ---
-title: Classical Methods (Transcriptome-centric, 2D)
-emoji: 📊
-colorFrom: purple
-colorTo: blue
-sdk: python
 tags:
 - transcriptomics
 - dimensionality-reduction
-- pca
-- umap
 license: mit
 ---
-# Classical Dimensionality Reduction (Transcriptome-centric, 2D)
-Pre-trained PCA and UMAP models for transcriptomics data compression, part of the TRACERx Datathon 2025 project.
-## Model Details
-- **Methods**: PCA and UMAP
-- **Compression Mode**: Transcriptome-centric
-- **Output Dimensions**: 2
-- **Training Data**: TRACERx open dataset (VST-normalized counts)
-## Contents
-The model file contains:
-- **PCA**: Principal Component Analysis model
-- **UMAP**: Uniform Manifold Approximation and Projection model (2-4D only)
-- **Scaler**: StandardScaler fitted on TRACERx data
-- **Feature Order**: Gene/sample order for alignment
 ## Usage
-These models are designed to be used with the TRACERx Datathon 2025 analysis pipeline.
-They will be automatically downloaded and cached when needed.
 ```python
 import joblib
-# Load the model bundle
-model_data = joblib.load("model.joblib")
-# Access components
-pca = model_data['pca']
-scaler = model_data['scaler']
-gene_order = model_data.get('gene_order')  # For sample-centric
-# Transform new data
-scaled_data = scaler.transform(aligned_data)
-embeddings = pca.transform(scaled_data)
 ```
-## Training Details
-- **Input Features**: 1,051 samples
-- **Training Samples**: 20,136 genes
-- **Preprocessing**: StandardScaler normalization
-## Files
-- `model.joblib`: Model bundle containing PCA, UMAP, scaler, and feature order

 ---
 tags:
 - transcriptomics
 - dimensionality-reduction
+- classical
+- TRACERx
+- UMAP
+- PCA
 license: mit
 ---
+# Classical Models (PCA + UMAP) - transcriptome mode - 2D
+Pre-trained PCA and UMAP models for transcriptomic data compression.
+**UMAP models support transform()** - new data can be projected into the same embedding space.
+## Details
+- **Mode**: transcriptome-centric compression
+- **Dimensions**: 2
+- **Training data**: TRACERx lung cancer transcriptomics
+- **Created**: 2026-01-13T16:56:13.982002
+- **UMAP transform**: Enabled (low_memory=False)
 ## Usage
 ```python
 import joblib
+from huggingface_hub import snapshot_download
+# Download model
+local_dir = snapshot_download("jruffle/classical_transcriptome_2d")
+model = joblib.load(f"{local_dir}/model.joblib")
+# Model contains: 'pca', 'umap', 'robust_scaler', 'gene_order'
+# Use UMAP transform on new data:
+new_embeddings = model['umap'].transform(preprocessed_new_data)
 ```
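
The new usage snippet passes `preprocessed_new_data` to UMAP without showing how it is produced. The sketch below is one plausible reading, assuming the bundled `robust_scaler` was applied directly before the reducer during training. The commit does not confirm this, nor does it describe the role of `norm_params`, so `project_new_data` is a hypothetical helper and not the author's pipeline.

```python
def project_new_data(bundle, data, scaler_key="robust_scaler", reducer_key="umap"):
    """Scale `data` with the bundled scaler, then project it with the bundled reducer.

    Assumption (not confirmed by the commit): the reducer was fitted on the
    scaler's output. `data` must already have the same feature order and
    feature count as the training matrix.
    """
    # Fail early with a readable message if the bundle layout differs.
    missing = [key for key in (scaler_key, reducer_key) if key not in bundle]
    if missing:
        raise KeyError(f"bundle is missing expected keys: {missing}")

    scaled = bundle[scaler_key].transform(data)
    return bundle[reducer_key].transform(scaled)
```

With the bundle loaded as in the README, `project_new_data(model, new_matrix)` would return the 2D coordinates. Passing `reducer_key="pca"` would use the PCA model instead, under the same assumption.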

metadata.json ADDED

@@ -0,0 +1,15 @@
+{
+  "model_type": "classical",
+  "mode": "transcriptome",
+  "dimensions": 2,
+  "created": "2026-01-13T16:56:13.982185",
+  "umap_transform_enabled": true,
+  "keys": [
+    "pca",
+    "robust_scaler",
+    "gene_order",
+    "sample_ids",
+    "umap",
+    "norm_params"
+  ]
+}
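
The `keys` array declares six entries, while the README comment names only four (`sample_ids` and `norm_params` are not mentioned there). The sketch below shows how a consumer could compare the metadata against a loaded bundle. `check_bundle_keys` is a hypothetical helper, and it assumes the bundle is a plain dict, as the README's `model['umap']` indexing suggests.

```python
import json


def check_bundle_keys(metadata_text, bundle):
    """Return (missing, undeclared) key lists for a bundle versus its metadata.

    missing:    keys declared in metadata.json but absent from the bundle
    undeclared: keys present in the bundle but not declared in metadata.json
    """
    metadata = json.loads(metadata_text)
    declared = set(metadata.get("keys", []))
    present = set(bundle)
    return sorted(declared - present), sorted(present - declared)
```

An empty pair `([], [])` would mean the bundle and `metadata.json` agree.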

model.joblib CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7a078234fb7a7d98c93ee007defea741d40518906180092294af12199bed19d2
-size 349722542

 version https://git-lfs.github.com/spec/v1
+oid sha256:e8958280fa3dfedb395e63983e543a50f580da35f2f325f9d5ef9d81b37a39e8
+size 264433435
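
The new pointer records the object's SHA-256 digest and its byte size (264433435, down from 349722542). A downloaded `model.joblib` could be checked against those values with a stdlib-only sketch like the one below. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the pointer text is assumed to be copied without the diff's leading `+`/`-` markers.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer ('version', 'oid', 'size' lines) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, value = line.split(" ", 1)
        fields[key] = value
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    # Cheap size check first; hashing a large file is the slow part.
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Reading in chunks keeps memory use flat, which matters for a file of this size.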