---
license: mit
datasets:
- LeBabyOx/EEGParquet
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
- roc_auc
pipeline_tag: tabular-classification
library_name: sklearn
tags:
- eeg
- seizure-detection
- biomedical
- time-series
- imbalanced-data
- healthcare
- classical-ml
---
## 🧠 Key Insights
- Tree-based models (RF, XGBoost) fail under extreme imbalance, predicting only the majority class.
- Linear models achieve high recall but suffer from extremely low precision.
- Threshold tuning significantly improves performance:
  - F1 improved from 0.0085 → 0.0769 (LogReg)
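
The threshold tuning referred to above can be sketched with scikit-learn's `precision_recall_curve`: sweep candidate decision thresholds on held-out probabilities and keep the one that maximizes F1 instead of the default 0.5 cut. The data below is a synthetic, highly imbalanced stand-in for the EEG feature windows, not the actual EEGParquet split.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic imbalanced stand-in for windowed EEG features (~0.5% positives).
X, y = make_classification(
    n_samples=20_000, n_features=20, weights=[0.995, 0.005], random_state=0
)
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, stratify=y, test_size=0.3, random_state=0
)

clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)

# Sweep decision thresholds over validation probabilities and pick the
# one maximizing F1, instead of relying on the default 0.5 cutoff.
prec, rec, thresholds = precision_recall_curve(
    y_val, clf.predict_proba(X_val)[:, 1]
)
f1 = 2 * prec * rec / np.clip(prec + rec, 1e-12, None)
best_t = thresholds[np.argmax(f1[:-1])]  # f1[:-1] aligns with thresholds
print(f"best threshold: {best_t:.3f}, F1 at best: {f1[:-1].max():.4f}")
```

The tuned threshold is then applied at inference time in place of `model.predict()`.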
---
## βš™οΈ Usage
```python
import joblib

# Load the serialized scikit-learn estimator.
model = joblib.load("models/logistic_regression.joblib")

# X: feature matrix with the same columns used during training.
preds = model.predict(X)
```
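
Since the key results above depend on threshold tuning, inference should usually go through `predict_proba` with the tuned cutoff rather than `predict()`'s default 0.5. A minimal sketch follows; the stand-in model, data, and the `THRESHOLD` value are all illustrative (in practice, load the shipped `.joblib` artifact and use the threshold selected on your validation split).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in estimator; in practice:
#   model = joblib.load("models/logistic_regression.joblib")
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))
y = np.zeros(100, dtype=int)
y[:5] = 1  # a few rare "seizure" windows
model = LogisticRegression(max_iter=1000).fit(X, y)

# Apply a tuned decision threshold instead of predict()'s default 0.5.
THRESHOLD = 0.2  # illustrative value, not from the benchmark
proba = model.predict_proba(X)[:, 1]   # probability of the seizure class
preds = (proba >= THRESHOLD).astype(int)
```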
## ⚠️ Limitations
- Models struggle with extreme class imbalance (~1600:1).
- Poor generalization across subjects (LOSO results).
- Classical ML is insufficient for robust seizure detection in this setting.
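
The LOSO (leave-one-subject-out) protocol mentioned above can be sketched with scikit-learn's `LeaveOneGroupOut`: each fold trains on all subjects but one and tests on the held-out subject, which is what exposes the cross-subject generalization gap. Subjects, features, and labels below are synthetic placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n = 1200
X = rng.normal(size=(n, 16))
y = (rng.random(n) < 0.02).astype(int)   # rare seizure windows
subjects = rng.integers(0, 6, size=n)    # 6 synthetic subject IDs

# One fold per subject: train on 5 subjects, evaluate on the 6th.
scores = []
for tr, te in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = LogisticRegression(class_weight="balanced", max_iter=1000)
    clf.fit(X[tr], y[tr])
    scores.append(f1_score(y[te], clf.predict(X[te]), zero_division=0))

print(f"per-subject F1: {np.round(scores, 3)}")
```

Reporting the per-subject spread, rather than a single pooled score, is what makes the generalization failure visible.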
## 📚 Citation
If you use this model, please cite:
```
@dataset{eegparquet_benchmark_2026,
  title={EEGParquet-Benchmark: Windowed and Feature-Enriched EEG Dataset for Seizure Detection},
  author={Daffa Tarigan},
  year={2026},
  publisher={Hugging Face}
}
```
## 🚀 Notes
This repository is intended for:
- Benchmarking classical ML under imbalance
- Demonstrating limitations of accuracy-based evaluation
- Supporting research in biomedical signal classification
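
The second point is easy to demonstrate with synthetic numbers at the ~1600:1 ratio noted above: a classifier that never predicts "seizure" scores near-perfect accuracy while being useless, which F1 exposes immediately.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# ~1600:1 imbalance: 10 seizure windows among 16,010 total.
y_true = np.zeros(16_010, dtype=int)
y_true[:10] = 1
y_pred = np.zeros_like(y_true)  # degenerate model: always "no seizure"

print(f"accuracy: {accuracy_score(y_true, y_pred):.4f}")              # 0.9994
print(f"F1:       {f1_score(y_true, y_pred, zero_division=0):.4f}")   # 0.0000
```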
---
## 1. Folder structure (important)
```
/models
├── logistic_regression.joblib
├── random_forest.joblib
├── svm_rbf_cuml_gpu.joblib
└── xgboost_gpu_optuna.joblib
```