---
license: mit
datasets:
- LeBabyOx/EEGParquet
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
- roc_auc
pipeline_tag: tabular-classification
library_name: sklearn
tags:
- eeg
- seizure-detection
- biomedical
- time-series
- imbalanced-data
- healthcare
- classical-ml
---

## 🧠 Key Insights

- Tree-based models (RF, XGBoost) fail under extreme imbalance, predicting only the majority class.
- Linear models achieve high recall but suffer from extremely low precision.
- Threshold tuning substantially improves performance:
  - F1 improved from 0.0085 to 0.0769 (LogReg)

---

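## Usage

```python
import joblib

# Load the trained logistic regression model from the models/ folder.
model = joblib.load("models/logistic_regression.joblib")

# X is not defined here: supply features in the same layout the model was trained on.
preds = model.predict(X)
```
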
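For a scikit-learn logistic regression, `predict` applies a 0.5 probability cut-off, which is rarely a good choice under heavy class imbalance. The sketch below shows one way the threshold tuning mentioned under Key Insights might be done: choose the cut-off that maximizes F1 on a validation set, then apply it to predicted probabilities. It runs on synthetic data from `make_classification`; the helper name `best_f1_threshold`, the roughly 1% positive rate, and the split are illustrative assumptions, not the procedure behind the reported numbers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_recall_curve
from sklearn.model_selection import train_test_split


def best_f1_threshold(y_true, scores):
    """Return the score threshold that maximizes F1 on (y_true, scores)."""
    precision, recall, thresholds = precision_recall_curve(y_true, scores)
    # precision/recall have one more entry than thresholds; drop the last point.
    p, r = precision[:-1], recall[:-1]
    f1 = 2 * p * r / np.maximum(p + r, 1e-12)
    return float(thresholds[np.argmax(f1)])


# Synthetic imbalanced data (about 1% positives) standing in for EEG features.
X, y = make_classification(
    n_samples=20000, n_features=20, weights=[0.99], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)

clf = LogisticRegression(class_weight="balanced", max_iter=1000)
clf.fit(X_train, y_train)

# Probability of the positive class for each validation row.
val_scores = clf.predict_proba(X_val)[:, 1]
threshold = best_f1_threshold(y_val, val_scores)

f1_default = f1_score(y_val, clf.predict(X_val))
f1_tuned = f1_score(y_val, (val_scores >= threshold).astype(int))
print(f"threshold={threshold:.3f}  F1 at 0.5={f1_default:.3f}  F1 tuned={f1_tuned:.3f}")
```

Here the threshold is selected and scored on the same validation split, so the tuned F1 is optimistic; in practice the cut-off would be fixed on validation data and then reported on a separate test set.
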
## ⚠️ Limitations

- Models struggle with extreme class imbalance (~1600:1).
- Generalization across subjects is poor, as shown by the leave-one-subject-out (LOSO) results.
- Classical ML is insufficient for robust seizure detection in this setting.

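Leave-one-subject-out (LOSO) evaluation holds out every window from one subject per fold, so a model is never tested on a subject it has seen during training. A minimal sketch with scikit-learn's `LeaveOneGroupOut` follows; the synthetic features, the `subject_ids` array, and the choice of five subjects are assumptions for illustration and do not reflect the dataset's actual schema.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n_subjects, windows_per_subject, n_features = 5, 400, 8

# Hypothetical per-window features, labels, and subject IDs.
X = rng.normal(size=(n_subjects * windows_per_subject, n_features))
y = (rng.random(len(X)) < 0.05).astype(int)
X[y == 1] += 1.0  # shift positives so there is something to learn
subject_ids = np.repeat(np.arange(n_subjects), windows_per_subject)

fold_recalls = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subject_ids):
    # Train and test subjects must be disjoint in every fold.
    assert set(subject_ids[train_idx]).isdisjoint(subject_ids[test_idx])
    clf = LogisticRegression(class_weight="balanced", max_iter=1000)
    clf.fit(X[train_idx], y[train_idx])
    preds = clf.predict(X[test_idx])
    fold_recalls.append(recall_score(y[test_idx], preds, zero_division=0))

print("per-subject recall:", np.round(fold_recalls, 3))
```

Reporting the per-subject spread rather than only the mean makes weak cross-subject generalization visible.
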
## Citation

If you use this model, please cite:

```
@dataset{eegparquet_benchmark_2026,
  title={EEGParquet-Benchmark: Windowed and Feature-Enriched EEG Dataset for Seizure Detection},
  author={Daffa Tarigan},
  year={2026},
  publisher={Hugging Face}
}
```

## Notes

This repository is intended for:

- Benchmarking classical ML under imbalance
- Demonstrating limitations of accuracy-based evaluation
- Supporting research in biomedical signal classification

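To see why accuracy is misleading at roughly 1600:1, consider a classifier that always predicts the majority class, which is the failure mode described for the tree-based models. The sketch below uses synthetic labels only; the counts are illustrative assumptions that follow the approximate ratio, not actual dataset statistics.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, recall_score

# 1600 negative windows per positive window (approximate ratio from Limitations).
n_pos = 100
n_neg = 1600 * n_pos
y_true = np.concatenate([np.zeros(n_neg, dtype=int), np.ones(n_pos, dtype=int)])

# A degenerate model that always predicts the majority (non-seizure) class.
y_pred = np.zeros_like(y_true)

accuracy = accuracy_score(y_true, y_pred)
recall = recall_score(y_true, y_pred, zero_division=0)
f1 = f1_score(y_true, y_pred, zero_division=0)

print(f"accuracy={accuracy:.4f}  recall={recall:.1f}  f1={f1:.1f}")
# accuracy = 1600 / 1601, about 0.9994, while recall and F1 are both 0
```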
---

## Folder structure

```
/models
├── logistic_regression.joblib
├── random_forest.joblib
├── svm_rbf_cuml_gpu.joblib
└── xgboost_gpu_optuna.joblib
```
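
A small helper can load whichever of these files are present. This is a sketch under assumptions: the `load_models` name and the relative `models` directory are illustrative, and unpickling the SVM and XGBoost artifacts presumably requires the libraries they were trained with (cuML and XGBoost, going by the file names) to be installed.

```python
from pathlib import Path

import joblib

MODEL_FILES = [
    "logistic_regression.joblib",
    "random_forest.joblib",
    "svm_rbf_cuml_gpu.joblib",
    "xgboost_gpu_optuna.joblib",
]


def load_models(model_dir="models"):
    """Load every known model file found in model_dir, keyed by file stem."""
    models = {}
    for name in MODEL_FILES:
        path = Path(model_dir) / name
        if not path.exists():
            continue  # skip files that were not downloaded
        try:
            models[path.stem] = joblib.load(path)
        except ImportError as exc:
            # Raised when the library that defines the pickled class is missing.
            print(f"Skipping {name}: {exc}")
    return models


models = load_models()
print(sorted(models))
```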