---
title: README
emoji: 🐨
colorFrom: yellow
colorTo: green
sdk: static
pinned: false
---
# Datasets for NeurIPS Submission
This page serves as the central index for all datasets associated with our NeurIPS 2026 submission.
## Overview
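We release 95 datasets covering 8 biomedical domains: Clinical & Healthcare, Drug Discovery, Genomics, Metabolomics, Proteomics, Single-cell, Systems Biology, and Transcriptomics.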
All datasets are hosted on Hugging Face and are publicly accessible.
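Because every entry in this index is a Hugging Face dataset link, a dataset can probably be loaded straight from its link. The snippet below is a minimal sketch, not an official loader: the helper names `repo_id_from_link` and `load_from_link` are hypothetical, and it assumes each repository stores its files in a layout that `datasets.load_dataset` can auto-detect. Split and column names are not documented on this page, so inspect the returned object before relying on them.

```python
from urllib.parse import urlparse

# Path prefix that Hugging Face uses for dataset pages.
HUB_DATASET_PREFIX = "/datasets/"


def repo_id_from_link(link: str) -> str:
    """Turn a dataset link from this index into a Hub repo id ("owner/name")."""
    path = urlparse(link).path
    if not path.startswith(HUB_DATASET_PREFIX):
        raise ValueError(f"not a Hugging Face dataset link: {link}")
    parts = path[len(HUB_DATASET_PREFIX):].strip("/").split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected an owner/name dataset link, got: {link}")
    return "/".join(parts)


def load_from_link(link: str):
    """Load a dataset by its link (needs `pip install datasets` and network access)."""
    # Imported lazily so repo_id_from_link works without the datasets library.
    from datasets import load_dataset

    # Assumption: the repository layout is auto-detectable by load_dataset.
    return load_dataset(repo_id_from_link(link))


# Example (requires network access):
# ds = load_from_link("https://huggingface.co/datasets/BenchmarkDatasets/BNG_breast-w")
# print(ds)  # shows the available splits and columns
```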
---
## 📦 Datasets
### Dataset 1
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/BNG_breast-w
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 2
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/CDC_Diabetes_Health_Indicators
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 3
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Cardiovascular-Disease-dataset
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 4
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Diabetes_UCI
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 5
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Diabetic_Retinopathy_Debrecen
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 6
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Heart-Disease-Dataset-_Comprehensive
- **Task:** Regression
- **Domain:** Clinical & Healthcare
### Dataset 7
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/HeartDisease_UCI
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 8
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MIMIC_II
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 9
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MUSIC
- **Task:** MultiClass Classification
- **Domain:** Clinical & Healthcare
### Dataset 10
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/National_Health_and_Nutrition_Health_Survey
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 11
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Parkinson_Speech
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 12
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/PatientCare
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 13
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/VitalDB
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 14
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/analcatdata-dmft
- **Task:** MultiClass Classification
- **Domain:** Clinical & Healthcare
### Dataset 15
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/audiology
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 16
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/blood-transfusion-service
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 17
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ilpd_patient_data
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 18
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/lymphography
- **Task:** MultiClass Classification
- **Domain:** Clinical & Healthcare
### Dataset 19
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/maternal_health_risk
- **Task:** MultiClass Classification
- **Domain:** Clinical & Healthcare
### Dataset 20
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/pima_diabetes
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 21
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sick
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 22
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/thyroid
- **Task:** MultiClass Classification
- **Domain:** Clinical & Healthcare
### Dataset 23
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/thyroid-ann
- **Task:** MultiClass Classification
- **Domain:** Clinical & Healthcare
### Dataset 24
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/thyroid-dis
- **Task:** MultiClass Classification
- **Domain:** Clinical & Healthcare
### Dataset 25
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/wisconsin-breast-cancer
- **Task:** Binary Classification
- **Domain:** Clinical & Healthcare
### Dataset 26
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ESOL
- **Task:** Regression
- **Domain:** Drug Discovery
### Dataset 27
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/HIV
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 28
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM8_E1-CC2
- **Task:** Regression
- **Domain:** Drug Discovery
### Dataset 29
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM8_f1-CC2
- **Task:** Regression
- **Domain:** Drug Discovery
### Dataset 30
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM9_g298
- **Task:** Regression
- **Domain:** Drug Discovery
### Dataset 31
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QM9_gap
- **Task:** Regression
- **Domain:** Drug Discovery
### Dataset 32
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/QSAR_biodegradation
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 33
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/SIDER_gastro
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 34
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/SIDER_nervous
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 35
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Tox21_NRAhR
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 36
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Tox21_NRER
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 37
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Tox21_SRMMP
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 38
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/bioresponse
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 39
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/vep_pathogenic_coding
- **Task:** Binary Classification
- **Domain:** Drug Discovery
### Dataset 40
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/continental_1k
- **Task:** MultiClass Classification
- **Domain:** Genomics
### Dataset 41
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/continental_50k
- **Task:** MultiClass Classification
- **Domain:** Genomics
### Dataset 42
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/continental_50k_missing
- **Task:** MultiClass Classification
- **Domain:** Genomics
### Dataset 43
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/coords_1k
- **Task:** Multi-Target Regression
- **Domain:** Genomics
### Dataset 44
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/coords_50k
- **Task:** Multi-Target Regression
- **Domain:** Genomics
### Dataset 45
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/dna
- **Task:** MultiClass Classification
- **Domain:** Genomics
### Dataset 46
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/pca_1k
- **Task:** Multi-Target Regression
- **Domain:** Genomics
### Dataset 47
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/pca_50k
- **Task:** Multi-Target Regression
- **Domain:** Genomics
### Dataset 48
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sexcheck_chrX_50
- **Task:** Binary Classification
- **Domain:** Genomics
### Dataset 49
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sexcheck_chrX_500
- **Task:** Binary Classification
- **Domain:** Genomics
### Dataset 50
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sexcheck_chrX_50k
- **Task:** Binary Classification
- **Domain:** Genomics
### Dataset 51
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/subpop_1000_consec
- **Task:** MultiClass Classification
- **Domain:** Genomics
### Dataset 52
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/subpop_50k
- **Task:** MultiClass Classification
- **Domain:** Genomics
### Dataset 53
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Gum1_s
- **Task:** Regression
- **Domain:** Metabolomics
### Dataset 54
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MTBLS136
- **Task:** Binary Classification
- **Domain:** Metabolomics
### Dataset 55
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MTBLS161
- **Task:** Binary Classification
- **Domain:** Metabolomics
### Dataset 56
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MTBLS404
- **Task:** Binary Classification
- **Domain:** Metabolomics
### Dataset 57
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Noc_s
- **Task:** Regression
- **Domain:** Metabolomics
### Dataset 58
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/PT-Mush21
- **Task:** MultiClass Classification
- **Domain:** Metabolomics
### Dataset 59
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST000369
- **Task:** Binary Classification
- **Domain:** Metabolomics
### Dataset 60
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST000496
- **Task:** Binary Classification
- **Domain:** Metabolomics
### Dataset 61
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST001000
- **Task:** Binary Classification
- **Domain:** Metabolomics
### Dataset 62
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ST001047
- **Task:** Binary Classification
- **Domain:** Metabolomics
### Dataset 63
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_hil6_onehot
- **Task:** Binary Classification
- **Domain:** Proteomics
### Dataset 64
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_htnfa_3mer
- **Task:** Binary Classification
- **Domain:** Proteomics
### Dataset 65
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_htnfa_esm2
- **Task:** Binary Classification
- **Domain:** Proteomics
### Dataset 66
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_htnfa_onehot
- **Task:** Binary Classification
- **Domain:** Proteomics
### Dataset 67
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/avida_sarscov2_3mer
- **Task:** Binary Classification
- **Domain:** Proteomics
### Dataset 68
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/cptac_survival
- **Task:** Binary Classification
- **Domain:** Proteomics
### Dataset 69
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/proteinea_solubility
- **Task:** Binary Classification
- **Domain:** Proteomics
### Dataset 70
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/true_betalactamase_complete
- **Task:** Regression
- **Domain:** Proteomics
### Dataset 71
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/true_fluorescence
- **Task:** Regression
- **Domain:** Proteomics
### Dataset 72
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/true_melting_point
- **Task:** Regression
- **Domain:** Proteomics
### Dataset 73
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/hnoca_cells30
- **Task:** MultiClass Classification
- **Domain:** Single-cell
### Dataset 74
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_bonemarrow_velocity_umap
- **Task:** Multi-Target Regression
- **Domain:** Single-cell
### Dataset 75
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_dentategyrus_latenttime
- **Task:** Regression
- **Domain:** Single-cell
### Dataset 76
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_dentategyrus_transition
- **Task:** Regression
- **Domain:** Single-cell
### Dataset 77
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/sc_dentategyrus_velocity_umap
- **Task:** Multi-Target Regression
- **Domain:** Single-cell
### Dataset 78
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_batch_integration_14plates
- **Task:** MultiClass Classification
- **Domain:** Single-cell
### Dataset 79
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_cell_name_1000
- **Task:** MultiClass Classification
- **Domain:** Single-cell
### Dataset 80
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_cell_name_500
- **Task:** MultiClass Classification
- **Domain:** Single-cell
### Dataset 81
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_cell_name_5000
- **Task:** MultiClass Classification
- **Domain:** Single-cell
### Dataset 82
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/tahoe_g2m_score
- **Task:** Regression
- **Domain:** Single-cell
### Dataset 83
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-BRCA
- **Task:** MultiClass Classification
- **Domain:** Systems Biology
### Dataset 84
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-COAD
- **Task:** MultiClass Classification
- **Domain:** Systems Biology
### Dataset 85
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-GBM
- **Task:** MultiClass Classification
- **Domain:** Systems Biology
### Dataset 86
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-LGG
- **Task:** MultiClass Classification
- **Domain:** Systems Biology
### Dataset 87
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/MultiOmics_GS-OV
- **Task:** MultiClass Classification
- **Domain:** Systems Biology
### Dataset 88
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/ALLAML
- **Task:** Binary Classification
- **Domain:** Transcriptomics
### Dataset 89
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/CLL_SUB_111
- **Task:** MultiClass Classification
- **Domain:** Transcriptomics
### Dataset 90
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/GLIOMA
- **Task:** MultiClass Classification
- **Domain:** Transcriptomics
### Dataset 91
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/GLI_85
- **Task:** Binary Classification
- **Domain:** Transcriptomics
### Dataset 92
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/Prostate_GE
- **Task:** Binary Classification
- **Domain:** Transcriptomics
### Dataset 93
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/SMK_CAN_187
- **Task:** Binary Classification
- **Domain:** Transcriptomics
### Dataset 94
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/TOX_171
- **Task:** MultiClass Classification
- **Domain:** Transcriptomics
### Dataset 95
- **Link:** https://huggingface.co/datasets/BenchmarkDatasets/lung
- **Task:** MultiClass Classification
- **Domain:** Transcriptomics