| | --- |
| | title: Classical Methods (Transcriptome-centric, 8D) |
| | emoji: 📊 |
| | colorFrom: purple |
| | colorTo: blue |
| | sdk: python |
| | tags: |
| | - transcriptomics |
| | - dimensionality-reduction |
| | - pca |
| |
|
| | license: mit |
| | --- |
| | |
| | # Classical Dimensionality Reduction (Transcriptome-centric, 8D) |
| |
|
| | Pre-trained PCA models for transcriptomics data compression, part of the TRACERx Datathon 2025 project. |
| |
|
| | ## Model Details |
| |
|
| | - **Methods**: PCA |
| | - **Compression Mode**: Transcriptome-centric |
| | - **Output Dimensions**: 8 |
| | - **Training Data**: TRACERx open dataset (VST-normalized counts) |
| |
|
| | ## Contents |
| |
|
| | The model file contains: |
| | - **PCA**: Principal Component Analysis model |
| | - **UMAP**: Uniform Manifold Approximation and Projection model (included only in the 2-4D bundles, so absent from this 8D bundle) |
| | - **Scaler**: StandardScaler fitted on TRACERx data |
| | - **Feature Order**: Gene/sample order for alignment |
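Because the set of components depends on the bundle, it can help to check which keys are present before using them. The following is a minimal sketch, assuming the bundle is a plain dictionary; the key names are taken from the usage snippet in this card, and `describe_bundle` and `toy_bundle` are hypothetical names used only for illustration.

```python
def describe_bundle(bundle, expected=("pca", "scaler", "gene_order")):
    """Map each expected key to the type name of its value, or None if absent."""
    return {
        key: (type(bundle[key]).__name__ if key in bundle else None)
        for key in expected
    }


# Toy stand-in for the dict returned by joblib.load("model.joblib");
# the real values would be fitted scikit-learn objects.
toy_bundle = {"pca": object(), "scaler": object()}

summary = describe_bundle(toy_bundle)
print(summary)
# {'pca': 'object', 'scaler': 'object', 'gene_order': None}
```

A `None` entry signals that the key is missing, so downstream code can skip that component rather than fail with a `KeyError`.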
| |
|
| | ## Usage |
| |
|
| | These models are intended for use with the TRACERx Datathon 2025 analysis pipeline, |
| | which downloads and caches them automatically when they are needed. |
| |
|
| | ```python |
| | import joblib |
| | |
| | # Load the model bundle |
| | model_data = joblib.load("model.joblib") |
| | |
| | # Access components |
| | pca = model_data['pca'] |
| | scaler = model_data['scaler'] |
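| | gene_order = model_data.get('gene_order')  # Feature order for sample-centric bundles; None if the key is absent |
| | |
| | # Transform new data (aligned_data: matrix whose columns follow the training feature order) |
| | scaled_data = scaler.transform(aligned_data) |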
| | embeddings = pca.transform(scaled_data) |
| | ``` |
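The snippet above expects `aligned_data` to have its columns in the same order as the features seen during training. One way to do that alignment is sketched below, assuming the input is a NumPy matrix with a matching list of column names. `align_features`, `feature_names`, and `feature_order` are hypothetical names, and the key under which this bundle stores its feature order is not specified here.

```python
import numpy as np


def align_features(data, feature_names, feature_order):
    """Reorder the columns of `data` so they follow `feature_order`.

    Columns not listed in `feature_order` are dropped; a missing
    expected feature raises ValueError instead of being filled in.
    """
    position = {name: i for i, name in enumerate(feature_names)}
    missing = [name for name in feature_order if name not in position]
    if missing:
        raise ValueError(
            f"{len(missing)} expected features are missing, e.g. {missing[:5]}"
        )
    return data[:, [position[name] for name in feature_order]]


# Toy example: features arrive in a different order, with one extra column.
data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [5.0, 6.0, 7.0, 8.0]])
feature_names = ["s3", "s1", "extra", "s2"]
feature_order = ["s1", "s2", "s3"]

aligned_data = align_features(data, feature_names, feature_order)
print(aligned_data.shape)  # (2, 3)
```

Raising on missing features is a deliberate choice in this sketch: silently filling gaps would shift the scaled values and produce embeddings that are not comparable with the training data.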
| |
|
| | ## Training Details |
| |
|
| | - **Input Features**: 1,051 (one per sample) |
| | - **Training Samples**: 20,136 (one per gene) |
| | - **Preprocessing**: StandardScaler standardization (zero mean, unit variance per feature) |
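The orientation listed above (genes as rows, samples as columns) and the two processing stages can be illustrated on synthetic data. This is a rough sketch rather than the project's training code: it uses plain NumPy in place of the scikit-learn objects stored in the bundle, the matrix sizes are toy values, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the training matrix: rows are genes, columns are samples.
# The real matrix is 20,136 x 1,051; small sizes keep the sketch fast.
n_genes, n_samples, n_components = 200, 30, 8
X = rng.normal(loc=5.0, scale=2.0, size=(n_genes, n_samples))

# Standardization step: zero mean and unit variance per column (feature),
# using the population standard deviation as StandardScaler does.
mean = X.mean(axis=0)
scale = X.std(axis=0)
X_scaled = (X - mean) / scale

# PCA step: centre the data, then project onto the top right-singular vectors.
centred = X_scaled - X_scaled.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
components = vt[:n_components]          # shape (8, n_samples)
embeddings = centred @ components.T     # one 8D vector per gene

print(embeddings.shape)  # (200, 8)
```

Component signs from a raw SVD can differ from those of a fitted scikit-learn `PCA`, so individual coordinates may be flipped relative to the bundled model even though the subspace is the same.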
| |
|
| | ## Files |
| |
|
| | - `model.joblib`: Model bundle containing PCA, scaler, and feature order |
| |
|