| title: README | |
| emoji: 🐨 | |
| colorFrom: yellow | |
| colorTo: green | |
| sdk: static | |
| pinned: false | |
| # Datasets for NeurIPS Submission | |
| This page serves as the central index for all datasets associated with our NeurIPS 2026 submission. | |
| ## Overview | |
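| We release 95 datasets covering eight biomedical domains: Clinical & Healthcare, Drug Discovery, Genomics, Metabolomics, Proteomics, Single-cell, Systems Biology, and Transcriptomics. | |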
| All datasets are hosted on Hugging Face and are publicly accessible. | |
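Because every link on this page points to the `BenchmarkDatasets` organization, a repository ID can be derived from the dataset name alone. The following is a minimal sketch, not code from the submission: the helper names `repo_id`, `dataset_url`, `name_from_url`, and `load` are hypothetical. `load` assumes the `datasets` library is installed and that each repository stores its files in a layout `load_dataset` can detect automatically, which this page does not confirm.

```python
from urllib.parse import urlparse

ORG = "BenchmarkDatasets"  # organization name taken from the links on this page


def repo_id(name: str) -> str:
    """Return the Hub repository ID for a dataset name, e.g. 'pima_diabetes'."""
    return f"{ORG}/{name}"


def dataset_url(name: str) -> str:
    """Return the dataset page URL in the form used by the index links."""
    return f"https://huggingface.co/datasets/{repo_id(name)}"


def name_from_url(url: str) -> str:
    """Recover the dataset name (last path segment) from an index link."""
    return urlparse(url).path.rstrip("/").split("/")[-1]


def load(name: str):
    """Download one dataset from the Hub (requires network access).

    Assumption: the repository layout is one that load_dataset can
    auto-detect; consult the dataset card if loading fails.
    """
    from datasets import load_dataset  # imported lazily: optional dependency

    return load_dataset(repo_id(name))
```

For example, `load("pima_diabetes")` would request `BenchmarkDatasets/pima_diabetes`. The split names and column layout it returns depend on how that repository was uploaded.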
| --- | |
| ## 📦 Datasets | |
| ### Dataset 1 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/BNG_breast-w | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 2 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/CDC_Diabetes_Health_Indicators | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 3 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Cardiovascular-Disease-dataset | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 4 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Diabetes_UCI | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 5 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Diabetic_Retinopathy_Debrecen | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 6 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Heart-Disease-Dataset-_Comprehensive | |
| - **Task:** Regression | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 7 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/HeartDisease_UCI | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 8 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MIMIC_II | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 9 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MUSIC | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 10 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/National_Health_and_Nutrition_Health_Survey | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 11 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Parkinson_Speech | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 12 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/PatientCare | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 13 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/VitalDB | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 14 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/analcatdata-dmft | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 15 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/audiology | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 16 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/blood-transfusion-service | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 17 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ilpd_patient_data | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 18 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/lymphography | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 19 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/maternal_health_risk | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 20 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/pima_diabetes | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 21 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sick | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 22 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/thyroid | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 23 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/thyroid-ann | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 24 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/thyroid-dis | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 25 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/wisconsin-breast-cancer | |
| - **Task:** Binary Classification | |
| - **Domain:** Clinical & Healthcare | |
| ### Dataset 26 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ESOL | |
| - **Task:** Regression | |
| - **Domain:** Drug Discovery | |
| ### Dataset 27 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/HIV | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 28 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM8_E1-CC2 | |
| - **Task:** Regression | |
| - **Domain:** Drug Discovery | |
| ### Dataset 29 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM8_f1-CC2 | |
| - **Task:** Regression | |
| - **Domain:** Drug Discovery | |
| ### Dataset 30 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM9_g298 | |
| - **Task:** Regression | |
| - **Domain:** Drug Discovery | |
| ### Dataset 31 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM9_gap | |
| - **Task:** Regression | |
| - **Domain:** Drug Discovery | |
| ### Dataset 32 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QSAR_biodegradation | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 33 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/SIDER_gastro | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 34 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/SIDER_nervous | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 35 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Tox21_NRAhR | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 36 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Tox21_NRER | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 37 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Tox21_SRMMP | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 38 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/bioresponse | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 39 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/vep_pathogenic_coding | |
| - **Task:** Binary Classification | |
| - **Domain:** Drug Discovery | |
| ### Dataset 40 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/continental_1k | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Genomics | |
| ### Dataset 41 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/continental_50k | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Genomics | |
| ### Dataset 42 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/continental_50k_missing | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Genomics | |
| ### Dataset 43 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/coords_1k | |
| - **Task:** Multi-Target Regression | |
| - **Domain:** Genomics | |
| ### Dataset 44 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/coords_50k | |
| - **Task:** Multi-Target Regression | |
| - **Domain:** Genomics | |
| ### Dataset 45 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/dna | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Genomics | |
| ### Dataset 46 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/pca_1k | |
| - **Task:** Multi-Target Regression | |
| - **Domain:** Genomics | |
| ### Dataset 47 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/pca_50k | |
| - **Task:** Multi-Target Regression | |
| - **Domain:** Genomics | |
| ### Dataset 48 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sexcheck_chrX_50 | |
| - **Task:** Binary Classification | |
| - **Domain:** Genomics | |
| ### Dataset 49 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sexcheck_chrX_500 | |
| - **Task:** Binary Classification | |
| - **Domain:** Genomics | |
| ### Dataset 50 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sexcheck_chrX_50k | |
| - **Task:** Binary Classification | |
| - **Domain:** Genomics | |
| ### Dataset 51 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/subpop_1000_consec | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Genomics | |
| ### Dataset 52 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/subpop_50k | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Genomics | |
| ### Dataset 53 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Gum1_s | |
| - **Task:** Regression | |
| - **Domain:** Metabolomics | |
| ### Dataset 54 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MTBLS136 | |
| - **Task:** Binary Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 55 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MTBLS161 | |
| - **Task:** Binary Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 56 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MTBLS404 | |
| - **Task:** Binary Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 57 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Noc_s | |
| - **Task:** Regression | |
| - **Domain:** Metabolomics | |
| ### Dataset 58 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/PT-Mush21 | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 59 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST000369 | |
| - **Task:** Binary Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 60 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST000496 | |
| - **Task:** Binary Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 61 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST001000 | |
| - **Task:** Binary Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 62 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST001047 | |
| - **Task:** Binary Classification | |
| - **Domain:** Metabolomics | |
| ### Dataset 63 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_hil6_onehot | |
| - **Task:** Binary Classification | |
| - **Domain:** Proteomics | |
| ### Dataset 64 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_htnfa_3mer | |
| - **Task:** Binary Classification | |
| - **Domain:** Proteomics | |
| ### Dataset 65 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_htnfa_esm2 | |
| - **Task:** Binary Classification | |
| - **Domain:** Proteomics | |
| ### Dataset 66 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_htnfa_onehot | |
| - **Task:** Binary Classification | |
| - **Domain:** Proteomics | |
| ### Dataset 67 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_sarscov2_3mer | |
| - **Task:** Binary Classification | |
| - **Domain:** Proteomics | |
| ### Dataset 68 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/cptac_survival | |
| - **Task:** Binary Classification | |
| - **Domain:** Proteomics | |
| ### Dataset 69 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/proteinea_solubility | |
| - **Task:** Binary Classification | |
| - **Domain:** Proteomics | |
| ### Dataset 70 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/true_betalactamase_complete | |
| - **Task:** Regression | |
| - **Domain:** Proteomics | |
| ### Dataset 71 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/true_fluorescence | |
| - **Task:** Regression | |
| - **Domain:** Proteomics | |
| ### Dataset 72 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/true_melting_point | |
| - **Task:** Regression | |
| - **Domain:** Proteomics | |
| ### Dataset 73 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/hnoca_cells30 | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Single-cell | |
| ### Dataset 74 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_bonemarrow_velocity_umap | |
| - **Task:** Multi-Target Regression | |
| - **Domain:** Single-cell | |
| ### Dataset 75 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_dentategyrus_latenttime | |
| - **Task:** Regression | |
| - **Domain:** Single-cell | |
| ### Dataset 76 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_dentategyrus_transition | |
| - **Task:** Regression | |
| - **Domain:** Single-cell | |
| ### Dataset 77 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_dentategyrus_velocity_umap | |
| - **Task:** Multi-Target Regression | |
| - **Domain:** Single-cell | |
| ### Dataset 78 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_batch_integration_14plates | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Single-cell | |
| ### Dataset 79 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_cell_name_1000 | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Single-cell | |
| ### Dataset 80 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_cell_name_500 | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Single-cell | |
| ### Dataset 81 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_cell_name_5000 | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Single-cell | |
| ### Dataset 82 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_g2m_score | |
| - **Task:** Regression | |
| - **Domain:** Single-cell | |
| ### Dataset 83 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-BRCA | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Systems Biology | |
| ### Dataset 84 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-COAD | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Systems Biology | |
| ### Dataset 85 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-GBM | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Systems Biology | |
| ### Dataset 86 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-LGG | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Systems Biology | |
| ### Dataset 87 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-OV | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Systems Biology | |
| ### Dataset 88 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ALLAML | |
| - **Task:** Binary Classification | |
| - **Domain:** Transcriptomics | |
| ### Dataset 89 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/CLL_SUB_111 | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Transcriptomics | |
| ### Dataset 90 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/GLIOMA | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Transcriptomics | |
| ### Dataset 91 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/GLI_85 | |
| - **Task:** Binary Classification | |
| - **Domain:** Transcriptomics | |
| ### Dataset 92 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Prostate_GE | |
| - **Task:** Binary Classification | |
| - **Domain:** Transcriptomics | |
| ### Dataset 93 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/SMK_CAN_187 | |
| - **Task:** Binary Classification | |
| - **Domain:** Transcriptomics | |
| ### Dataset 94 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/TOX_171 | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Transcriptomics | |
| ### Dataset 95 | |
| - **Link:** https://huggingface.co/datasets/BenchmarkDatasets/lung | |
| - **Task:** MultiClass Classification | |
| - **Domain:** Transcriptomics |
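
Each entry in the index follows the same three-field pattern (Link, Task, Domain), so the list can be turned into records and grouped by domain or task. The sketch below is illustrative only: `parse_index` and `EXCERPT` are hypothetical names, and the excerpt holds just three entries copied from the list. The parser tolerates optional table pipes around each line.

```python
import re
from collections import Counter

# Assumption: a three-entry excerpt copied from the index; in practice the
# full README text would be read from a file instead.
EXCERPT = """\
### Dataset 26
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ESOL
- **Task:** Regression
- **Domain:** Drug Discovery
### Dataset 27
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/HIV
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 40
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/continental_1k
- **Task:** MultiClass Classification
- **Domain:** Genomics
"""

# Matches one "- **Field:** value" bullet and captures the field and value.
FIELD = re.compile(r"- \*\*(Link|Task|Domain):\*\*\s*(.+)")


def parse_index(text: str) -> list[dict[str, str]]:
    """Parse index entries into dicts with 'link', 'task', and 'domain' keys."""
    entries: list[dict[str, str]] = []
    for raw in text.splitlines():
        # Drop surrounding whitespace and any table pipes wrapping the line.
        line = raw.strip("| \t")
        if line.startswith("### Dataset"):
            entries.append({})  # a heading starts a new entry
            continue
        match = FIELD.match(line)
        if match and entries:
            entries[-1][match.group(1).lower()] = match.group(2)
    return entries


entries = parse_index(EXCERPT)
per_domain = Counter(entry["domain"] for entry in entries)
print(per_domain)
```

Run on the full list instead of the excerpt, the same tally gives the number of datasets per domain, and swapping `"domain"` for `"task"` gives the breakdown by task type.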