Model Card: Maternal Morbidity Risk Classifiers (Indiana 2010–2020)
Model Name: maternal-morbidity-classifier-suite-indiana-2010-2020
Version: 1.0
Developed by: Dr. Cary Woods, HarnessAI
License: GNU General Public License v3.0
Release Date: May 2025
Trained on: 813,837 public birth and infant death records from Indiana (2010–2020)
Research Context
These models were developed as part of the research paper:
"Predictive Modeling of Maternal Morbidity: Insights from a Decade of Regional Birth Data (2010–2020)"
Published on ResearchGate, May 2025.
DOI: http://dx.doi.org/10.13140/RG.2.2.26163.13608
The study investigates the use of machine learning to predict maternal birth complications using administrative birth record data. The work focuses on model interpretability, sensitivity to rare events, and the challenges posed by missing socioeconomic indicators in public health datasets.
Overview
This release includes three supervised learning classifiers trained to predict maternal morbidity from Indiana birth records: Logistic Regression, Random Forest, and Gradient Boosting. Each model uses the same engineered feature set and preprocessing pipeline. The classifiers were trained to identify rare maternal complication outcomes and are optimized for high recall.
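The card does not say how the classifiers were tuned toward recall. One common approach, shown here only as an illustrative sketch, is to lower the decision threshold on predicted probabilities until a target recall is reached. The function names `recall_at` and `threshold_for_recall`, the toy labels and scores, and the 0.75 target are all assumptions made for this example and are not taken from the released pipeline.

```python
from typing import Sequence


def recall_at(threshold: float, y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Recall when every case scoring at or above `threshold` is flagged positive."""
    positives = sum(y_true)
    if positives == 0:
        raise ValueError("recall is undefined without positive cases")
    hits = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    return hits / positives


def threshold_for_recall(
    y_true: Sequence[int], scores: Sequence[float], target_recall: float
) -> float:
    """Return the highest observed score that, used as a threshold, meets the target recall."""
    if not 0.0 < target_recall <= 1.0:
        raise ValueError("target_recall must be in (0, 1]")
    # Walk thresholds from strictest to most permissive; recall only grows.
    for candidate in sorted(set(scores), reverse=True):
        if recall_at(candidate, y_true, scores) >= target_recall:
            return candidate
    raise ValueError("no threshold reaches the target recall")


# Hypothetical validation data: 4 positive cases among 10 records.
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 1]
scores = [0.05, 0.10, 0.35, 0.20, 0.80, 0.15, 0.40, 0.60, 0.02, 0.25]

chosen = threshold_for_recall(y_true, scores, target_recall=0.75)
print(f"threshold={chosen}, recall={recall_at(chosen, y_true, scores):.2f}")
```

Lowering the threshold this way trades precision for recall, so any threshold chosen on validation data should be checked against the precision it leaves behind.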
Intended Use
- Triage support for public health researchers and analysts
- Educational use in public health informatics curricula
- Demonstration of risk modeling in imbalanced health datasets
- Not for direct clinical deployment without validation
Performance
| Model | Precision | Recall | F1 Score | ROC-AUC |
|---|---|---|---|---|
| Logistic Regression | 0.75 | 0.70 | 0.72 | 0.81 |
| Random Forest | 0.80 | 0.77 | 0.78 | 0.86 |
| Gradient Boosting | 0.83 | 0.80 | 0.81 | 0.89 |
Gradient Boosting showed the highest overall performance. Logistic Regression offers strong interpretability, and Random Forest provides a reliable non-linear baseline.
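As a quick consistency check on the table above, the F1 score can be recomputed from the reported precision and recall using the standard harmonic-mean formula, F1 = 2PR / (P + R). The sketch below uses only the figures printed in the table; the dictionary layout and the helper name `f1_from` are assumptions for illustration. Because the table values are rounded to two decimals, agreement is only expected at that precision.

```python
# (precision, recall, reported F1) as printed in the performance table.
reported = {
    "Logistic Regression": (0.75, 0.70, 0.72),
    "Random Forest": (0.80, 0.77, 0.78),
    "Gradient Boosting": (0.83, 0.80, 0.81),
}


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for each model and round to the table's two decimals.
recomputed = {
    name: round(f1_from(p, r), 2) for name, (p, r, _) in reported.items()
}

for name, (_, _, f1_reported) in reported.items():
    status = "matches" if recomputed[name] == f1_reported else "differs from"
    print(f"{name}: recomputed F1 {recomputed[name]:.2f} {status} reported {f1_reported:.2f}")
```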
Limitations
- Omits social determinants like race, education, and income
- Reflects administrative data only (not clinical records)
- Binary outcome may oversimplify severity levels
- Developed on Indiana data; needs regional validation
Reuse & Redistribution
This model suite is released under GPL v3. You may reuse, modify, and redistribute it, provided derivative works maintain this license and include attribution.
Files Included
- model_lr.joblib – Logistic Regression classifier
- model_rf.joblib – Random Forest classifier
- model_gb.joblib – Gradient Boosting classifier
- test_models.py – test script
- README.md – Usage guide
- LICENSE.txt – GPL v3 license
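A minimal sketch of how the three model files might be loaded is given below. It assumes the `.joblib` files are serialized with the `joblib` library and sit together in one directory; the helper names `missing_model_files` and `load_models` and the directory argument are hypothetical. The input columns the models expect are not listed in this card, so the sketch stops at loading. Joblib files are pickle-based, so they should only be loaded from a trusted source.

```python
from pathlib import Path

# File names as listed in this model card.
MODEL_FILES = {
    "Logistic Regression": "model_lr.joblib",
    "Random Forest": "model_rf.joblib",
    "Gradient Boosting": "model_gb.joblib",
}


def missing_model_files(model_dir: str) -> list:
    """Return the listed model files that are not present in `model_dir`."""
    root = Path(model_dir)
    return [name for name in MODEL_FILES.values() if not (root / name).is_file()]


def load_models(model_dir: str) -> dict:
    """Load all three classifiers, failing early if any file is absent."""
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(f"missing model files: {', '.join(missing)}")
    # Third-party dependency, imported only once the files are known to exist.
    import joblib

    root = Path(model_dir)
    return {label: joblib.load(root / name) for label, name in MODEL_FILES.items()}
```

Calling `load_models(".")` from the directory holding the files would return a dictionary keyed by model name. Unpickling presumably also requires the library the models were trained with to be installed.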