monajm36 committed on
Commit
e2ef18e
·
unverified ·
1 Parent(s): 0ff9e8e

Update README.md

Files changed (1)
  1. README.md +233 -100
README.md CHANGED
@@ -1,29 +1,43 @@
1
- # ohca-classifier-3.0
2
- BERT-based classifier for detecting Out-of-Hospital Cardiac Arrest (OHCA) cases in medical text
3
 
4
- ## NLP OHCA Classifier
5
- A BERT-based classifier for detecting Out-of-Hospital Cardiac Arrest (OHCA) cases in medical discharge notes using natural language processing.
 
 
 
 
 
 
 
 
 
 
 
6
 
7
  ## Overview
8
- This package provides two main modules:
9
 
10
- - **Training Pipeline** (`ohca_training_pipeline.py`) - Complete workflow from data annotation to model training
11
- - **Inference Module** (`ohca_inference.py`) - Apply pre-trained models to new datasets
12
 
13
  ## Features
14
 
15
- ### Training Pipeline
16
- - **Intelligent Sampling**: Two-stage sampling strategy (keyword-enriched + random)
17
- - **Annotation Interface**: Generates Excel files for manual annotation with guidelines
 
 
18
  - **BERT-based Training**: Uses PubMedBERT optimized for medical text
19
- - **Class Balancing**: Handles imbalanced datasets with oversampling
20
- - **Comprehensive Evaluation**: Clinical metrics including sensitivity, specificity, PPV, NPV
21
 
22
- ### Inference Module
23
- - **Pre-trained Model Loading**: Easy loading of trained OHCA models
 
24
  - **Batch Processing**: Efficient inference on large datasets
25
- - **Clinical Decision Support**: Probability thresholds and confidence categories
26
- - **Quality Analysis**: Built-in tools for analyzing prediction patterns
27
 
28
  ## Installation
29
 
@@ -56,101 +70,150 @@ pip install -e .
56
 
57
  ## Quick Start
58
 
59
- ### Training a New Model
 
60
  ```python
61
- from src.ohca_training_pipeline import create_training_sample, complete_annotation_and_train
62
  import pandas as pd
63
 
64
- # 1. Create annotation sample
65
- df = pd.read_csv("your_discharge_notes.csv") # Must have: hadm_id, clean_text
66
- annotation_df = create_training_sample(df, output_dir="./annotation_interface")
 
 
 
 
67
 
68
- # 2. Manually annotate the Excel file (ohca_annotation.xlsx)
 
 
69
  # Label each case: 1=OHCA, 0=Non-OHCA
70
 
71
- # 3. Train model after annotation
72
- results = complete_annotation_and_train(
73
- annotation_file="./annotation_interface/ohca_annotation.xlsx",
74
- model_save_path="./my_ohca_model",
 
 
 
 
75
  num_epochs=3
76
  )
 
 
 
77
  ```
78
 
79
- ### Using a Pre-trained Model
 
80
  ```python
81
- from src.ohca_inference import quick_inference
82
  import pandas as pd
83
 
84
- # Apply model to new data
85
  new_data = pd.read_csv("new_discharge_notes.csv") # Must have: hadm_id, clean_text
86
- results = quick_inference(
87
- model_path="./my_ohca_model",
88
  data_path=new_data,
89
  output_path="ohca_predictions.csv"
90
  )
91
 
92
- # View high-confidence predictions
93
- high_confidence = results[results['ohca_probability'] >= 0.8]
94
- print(f"Found {len(high_confidence)} high-confidence OHCA cases")
95
  ```
96
 
97
  ## Data Format
98
 
99
- ### Input Requirements
100
  Your CSV file must contain:
101
  - `hadm_id`: Unique identifier for each hospital admission
 
102
  - `clean_text`: Preprocessed discharge note text
103
 
104
  **Example:**
 
 
 
 
 
105
  ```
106
- hadm_id,clean_text
107
- 12345,"Chief complaint: Cardiac arrest at home. Patient found down by family..."
108
- 12346,"Chief complaint: Chest pain. Patient presents with acute onset chest pain..."
 
109
  ```
110
 
111
  ### Annotation Labels
112
- - `1`: OHCA case (cardiac arrest outside hospital)
113
- - `0`: Non-OHCA case (everything else, including all transfer cases)
114
 
115
  ## Module Documentation
116
 
117
- ### Training Pipeline (`ohca_training_pipeline.py`)
 
 
 
 
 
 
 
118
 
119
- **Main Functions:**
120
- - `create_training_sample()` - Create balanced annotation sample
121
- - `prepare_training_data()` - Process annotations for training
122
- - `train_ohca_model()` - Train BERT-based classifier
123
- - `evaluate_model()` - Comprehensive performance evaluation
124
- - `complete_training_pipeline()` - End-to-end training workflow
125
 
126
- **Example Usage:**
127
  ```python
128
- from src.ohca_training_pipeline import complete_training_pipeline
129
 
130
- # Complete training pipeline
131
- result = complete_training_pipeline(
132
  data_path="discharge_notes.csv",
133
- annotation_dir="./annotation",
134
- model_save_path="./trained_model"
 
135
  )
136
  ```
137
 
138
- ### Inference Module (`ohca_inference.py`)
 
 
 
 
 
 
139
 
140
- **Main Functions:**
141
- - `load_ohca_model()` - Load pre-trained model
142
- - `run_inference()` - Full inference with analysis
143
- - `quick_inference()` - Simple inference function
144
- - `process_large_dataset()` - Handle large datasets in chunks
145
- - `test_model_on_sample()` - Test on specific text samples
146
 
147
- **Example Usage:**
148
  ```python
149
- from src.ohca_inference import run_inference, load_ohca_model
150
 
151
- # Load model and run inference
152
- model, tokenizer = load_ohca_model("./trained_model")
153
- results = run_inference(model, tokenizer, new_data_df)
 
 
154
  ```
155
 
156
  ## Model Architecture
@@ -159,10 +222,17 @@ results = run_inference(model, tokenizer, new_data_df)
159
  - **Max Sequence Length**: 512 tokens
160
  - **Optimization**: AdamW with linear learning rate scheduling
161
  - **Class Balancing**: Weighted loss + minority class oversampling
 
162
 
163
  ## Performance Metrics
164
- The model reports comprehensive clinical metrics:
165
166
  - **Sensitivity (Recall)**: Percentage of OHCA cases correctly identified
167
  - **Specificity**: Percentage of non-OHCA cases correctly identified
168
  - **Precision (PPV)**: When model predicts OHCA, percentage that are correct
@@ -172,30 +242,40 @@ The model reports comprehensive clinical metrics:
172
 
173
  ## Clinical Usage
174
 
175
- ### Probability Thresholds
176
- - **≥0.9**: Very high confidence - Priority manual review
177
- - **0.7-0.9**: High confidence - Clinical review recommended
178
- - **0.3-0.7**: Uncertain - Manual review suggested
179
- - **<0.3**: Low probability - Likely non-OHCA
180
 
181
- ### Workflow Integration
182
- 1. Run inference on new discharge notes
183
- 2. Prioritize high-confidence predictions for review
184
- 3. Use medium-confidence cases for quality improvement
185
- 4. Monitor low-confidence cases for false negatives
 
 
 
 
 
 
 
 
 
 
 
 
 
186
 
187
  ## Repository Structure
188
  ```
189
- nlp-ohca-classifier/
190
  ├── src/
191
  │ ├── __init__.py
192
- │ ├── ohca_training_pipeline.py # Training workflow
193
- │ └── ohca_inference.py # Inference on new data
194
  ├── examples/
195
- │ ├── training_example.py # Complete training examples
196
- │ └── inference_example.py # Inference usage examples
 
197
  ├── docs/
198
- │ └── annotation_guidelines.md # Detailed annotation guidelines
199
  ├── requirements.txt
200
  ├── setup.py
201
  ├── README.md
@@ -204,64 +284,104 @@ nlp-ohca-classifier/
204
 
205
  ## Examples
206
 
207
- ### Complete Training Example
208
  ```bash
209
  cd examples
210
  python training_example.py
 
211
  ```
212
 
213
- ### Inference Examples
214
  ```bash
215
  cd examples
216
  python inference_example.py
 
 
 
 
 
 
 
 
217
  ```
218
 
219
  ## Advanced Usage
220
 
221
- ### Large Dataset Processing
222
  ```python
223
- from src.ohca_inference import process_large_dataset
224
 
225
- # Process 100K+ records in chunks
226
- process_large_dataset(
227
- model_path="./trained_model",
228
  data_path="large_dataset.csv",
229
  output_path="results.csv",
230
  chunk_size=5000
231
  )
232
  ```
233
 
234
- ### Model Testing
235
  ```python
236
  from src.ohca_inference import test_model_on_sample
237
 
238
- # Test on specific cases
239
  test_cases = {
240
  'case1': "Chief complaint: Cardiac arrest at home...",
241
  'case2': "Chief complaint: Chest pain, no arrest..."
242
  }
243
 
244
- results = test_model_on_sample("./trained_model", test_cases)
 
245
  ```
246
 
247
  ## Performance Benchmarks
248
- Typical performance on validation data:
249
- - **AUC-ROC**: 0.85-0.95
250
- - **Sensitivity**: 85-95%
251
- - **Specificity**: 85-95%
252
- - **F1-Score**: 0.7-0.9
 
 
 
 
 
 
 
 
253
 
254
  *Performance varies based on data quality and annotation consistency*
255
256
  ## Citation
257
  If you use this code in your research, please cite:
258
 
259
  ```bibtex
260
- @software{nlp_ohca_classifier,
261
- title={NLP OHCA Classifier: BERT-based Detection of Out-of-Hospital Cardiac Arrest in Medical Text},
262
  author={Mona Moukaddem},
263
  year={2025},
264
- url={https://github.com/monajm36/ohca-classifier-3.0}
 
265
  }
266
  ```
267
 
@@ -275,9 +395,22 @@ This project is licensed under the MIT License - see the LICENSE file for detail
275
  4. Push to the branch (`git push origin feature/AmazingFeature`)
276
  5. Open a Pull Request
277
278
 
279
  ## Acknowledgments
280
  - PubMedBERT model from Microsoft Research
281
  - MIMIC-III dataset for model development
282
  - Transformers library by Hugging Face
283
  - PyTorch for deep learning framework
 
 
1
+ # OHCA Classifier v3.0 - Improved Methodology
2
+ BERT-based classifier for detecting Out-of-Hospital Cardiac Arrest (OHCA) cases in medical text with enhanced machine learning methodology
3
 
4
+ ## NLP OHCA Classifier v3.0
5
+ A BERT-based classifier for detecting Out-of-Hospital Cardiac Arrest (OHCA) cases in medical discharge notes. Version 3.0 revises the methodology to address data leakage, threshold selection, and evaluation bias.
6
+
7
+ ## Key Improvements in v3.0
8
+
9
+ This version implements significant methodological improvements based on data science best practices:
10
+
11
+ - **Patient-Level Data Splits** - Prevents data leakage by ensuring all notes from the same patient stay in one split
12
+ - **Proper Train/Validation/Test** - Uses independent test set for unbiased evaluation
13
+ - **Optimal Threshold Finding** - Finds and saves optimal decision threshold during training
14
+ - **Larger Training Samples** - 800+ training samples instead of 264
15
+ - **Enhanced Clinical Decision Support** - Improved confidence categories and workflow integration
16
+ - **Unbiased Evaluation** - Eliminates threshold tuning on test data
17
 
18
  ## Overview
19
+ This package provides two main modules with v3.0 enhancements:
20
 
21
+ - **Training Pipeline** (`ohca_training_pipeline.py`) - Complete workflow with improved methodology
22
+ - **Inference Module** (`ohca_inference.py`) - Apply models with optimal threshold support
23
 
24
  ## Features
25
 
26
+ ### Training Pipeline (Enhanced v3.0)
27
+ - **Patient-Level Splits**: Prevents data leakage between training and test sets
28
+ - **Dual Annotation Strategy**: Separate training and validation annotation files
29
+ - **Intelligent Sampling**: Two-stage sampling strategy (keyword-enriched + random)
30
+ - **Larger Sample Sizes**: 800 training + 200 validation samples
31
  - **BERT-based Training**: Uses PubMedBERT optimized for medical text
32
+ - **Optimal Threshold Finding**: Automatically finds best decision threshold
33
+ - **Unbiased Evaluation**: Independent test set for reliable performance estimates
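The two-stage sampling idea (keyword-enriched plus random) can be illustrated with a small standard-library sketch. This is not the pipeline's own code: the keyword list, the `two_stage_sample` helper, and the sample sizes are all hypothetical and only show the shape of the technique.

```python
import random

# Hypothetical keyword list; the real pipeline's keywords are not shown in this README.
OHCA_KEYWORDS = ("cardiac arrest", "cpr", "rosc", "found down")


def two_stage_sample(notes, n_keyword, n_random, seed=42):
    """Pick up to n_keyword keyword-matching notes, then n_random others at random."""
    rng = random.Random(seed)
    matches = [n for n in notes
               if any(k in n["clean_text"].lower() for k in OHCA_KEYWORDS)]
    # Stage 1: keyword-enriched sample, capped by how many matches exist.
    stage1 = rng.sample(matches, min(n_keyword, len(matches)))
    chosen = {n["hadm_id"] for n in stage1}
    # Stage 2: random sample drawn from every note not already chosen.
    remaining = [n for n in notes if n["hadm_id"] not in chosen]
    stage2 = rng.sample(remaining, min(n_random, len(remaining)))
    return stage1 + stage2


notes = [
    {"hadm_id": 1, "clean_text": "Cardiac arrest at home, CPR by family."},
    {"hadm_id": 2, "clean_text": "Chest pain, no arrest."},
    {"hadm_id": 3, "clean_text": "Found down by neighbor, ROSC achieved."},
    {"hadm_id": 4, "clean_text": "Elective knee replacement."},
]
sample = two_stage_sample(notes, n_keyword=2, n_random=1)
```

Enriching with keywords raises the share of likely positives in a rare-event annotation set, while the random stage keeps ordinary notes represented.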
34
 
35
+ ### Inference Module (Enhanced v3.0)
36
+ - **Optimal Threshold Usage**: Automatically uses threshold found during training
37
+ - **Enhanced Clinical Priorities**: Improved confidence categories for clinical workflow
38
  - **Batch Processing**: Efficient inference on large datasets
39
+ - **Clinical Decision Support**: Evidence-based probability thresholds
40
+ - **Backward Compatibility**: Works with both v3.0 and legacy models
41
 
42
  ## Installation
43
 
 
70
 
71
  ## Quick Start
72
 
73
+ ### Training a New Model (v3.0 Methodology - RECOMMENDED)
74
+
75
  ```python
76
+ from src.ohca_training_pipeline import complete_improved_training_pipeline
77
  import pandas as pd
78
 
79
+ # Step 1: Create patient-level splits and annotation samples
80
+ results = complete_improved_training_pipeline(
81
+ data_path="your_discharge_notes.csv", # Must have: hadm_id, subject_id, clean_text
82
+ annotation_dir="./annotation_v3",
83
+ train_sample_size=800, # Much larger than legacy
84
+ val_sample_size=200 # Separate validation sample
85
+ )
86
 
87
+ # Step 2: Manually annotate BOTH Excel files:
88
+ # - annotation_v3/train_annotation.xlsx (800 cases)
89
+ # - annotation_v3/validation_annotation.xlsx (200 cases)
90
  # Label each case: 1=OHCA, 0=Non-OHCA
91
 
92
+ # Step 3: Complete training (after annotation)
93
+ from src.ohca_training_pipeline import complete_annotation_and_train_v3
94
+
95
+ model_results = complete_annotation_and_train_v3(
96
+ train_annotation_file="./annotation_v3/train_annotation.xlsx",
97
+ val_annotation_file="./annotation_v3/validation_annotation.xlsx",
98
+ test_file="./annotation_v3/test_set_DO_NOT_ANNOTATE.csv",
99
+ model_save_path="./my_ohca_model_v3",
100
  num_epochs=3
101
  )
102
+
103
+ print(f"Optimal threshold: {model_results['optimal_threshold']:.3f}")
104
+ print("Model automatically uses this threshold during inference")
105
  ```
106
 
107
+ ### Using a Pre-trained v3.0 Model
108
+
109
  ```python
110
+ from src.ohca_inference import quick_inference_with_optimal_threshold
111
  import pandas as pd
112
 
113
+ # Apply v3.0 model to new data (uses optimal threshold automatically)
114
  new_data = pd.read_csv("new_discharge_notes.csv") # Must have: hadm_id, clean_text
115
+ results = quick_inference_with_optimal_threshold(
116
+ model_path="./my_ohca_model_v3", # v3.0 model with metadata
117
  data_path=new_data,
118
  output_path="ohca_predictions.csv"
119
  )
120
 
121
+ # Enhanced v3.0 results with clinical priorities
122
+ immediate_review = results[results['clinical_priority'] == 'Immediate Review']
123
+ priority_review = results[results['clinical_priority'] == 'Priority Review']
124
+
125
+ print(f"Immediate review needed: {len(immediate_review)} cases")
126
+ print(f"Priority review needed: {len(priority_review)} cases")
127
+ print(f"Optimal threshold used: {results['optimal_threshold_used'].iloc[0]:.3f}")
128
+ ```
129
+
130
+ ### Backward Compatibility (Legacy Models)
131
+
132
+ ```python
133
+ from src.ohca_inference import quick_inference
134
+
135
+ # Works with both v3.0 and legacy models
136
+ results = quick_inference(
137
+ model_path="./any_model", # Auto-detects model version
138
+ data_path="new_data.csv"
139
+ )
140
  ```
141
 
142
  ## Data Format
143
 
144
+ ### Input Requirements (Enhanced for v3.0)
145
  Your CSV file must contain:
146
  - `hadm_id`: Unique identifier for each hospital admission
147
+ - `subject_id`: Patient identifier (for patient-level splits to prevent data leakage)
148
  - `clean_text`: Preprocessed discharge note text
149
 
150
  **Example:**
151
+ ```csv
152
+ hadm_id,subject_id,clean_text
153
+ 12345,101,"Chief complaint: Cardiac arrest at home. Patient found down by family..."
154
+ 12346,102,"Chief complaint: Chest pain. Patient presents with acute onset chest pain..."
155
+ 12347,101,"Follow-up visit. Patient doing well after recent arrest..."
156
  ```
157
+
158
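+ **If you don't have patient IDs**: Add this line to your preprocessing. Note that this treats every admission as its own patient, so notes from a patient's repeat admissions can still land in different splits: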
159
+ ```python
160
+ df['subject_id'] = df['hadm_id'] # Use admission ID as patient ID
161
  ```
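Before starting a training run, it may help to confirm that the input file has the three required columns. The sketch below uses only the standard library; `missing_columns` is a hypothetical helper, not a function of this package.

```python
import csv
import io

REQUIRED_COLUMNS = ("hadm_id", "subject_id", "clean_text")


def missing_columns(csv_file):
    """Return the required columns that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    return [c for c in REQUIRED_COLUMNS if c not in header]


# A legacy-style file without patient IDs, held in memory for the example.
sample_csv = io.StringIO(
    'hadm_id,clean_text\n'
    '12345,"Chief complaint: Cardiac arrest at home..."\n'
)
missing = missing_columns(sample_csv)
print(missing)  # ['subject_id']
```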
162
 
163
  ### Annotation Labels
164
+ - `1`: OHCA case (cardiac arrest outside hospital, primary reason for admission)
165
+ - `0`: Non-OHCA case (everything else, including transfers and historical arrests)
166
 
167
  ## Module Documentation
168
 
169
+ ### Training Pipeline (Enhanced v3.0)
170
+
171
+ **Main v3.0 Functions (RECOMMENDED):**
172
+ - `complete_improved_training_pipeline()` - Create patient-level splits and annotation samples
173
+ - `complete_annotation_and_train_v3()` - Train with optimal threshold finding
174
+ - `create_patient_level_splits()` - Create proper data splits
175
+ - `find_optimal_threshold()` - Find optimal decision threshold
176
+ - `evaluate_on_test_set()` - Unbiased final evaluation
177
 
178
+ **Legacy Functions (Backward Compatible):**
179
+ - `create_training_sample()` - Legacy single-file annotation
180
+ - `complete_annotation_and_train()` - Legacy training workflow
 
 
 
181
 
182
+ **Example Usage (v3.0):**
183
  ```python
184
+ from src.ohca_training_pipeline import complete_improved_training_pipeline
185
 
186
+ # Enhanced training with proper methodology
187
+ result = complete_improved_training_pipeline(
188
  data_path="discharge_notes.csv",
189
+ annotation_dir="./annotation_v3",
190
+ train_sample_size=800,
191
+ val_sample_size=200
192
  )
193
  ```
194
 
195
+ ### Inference Module (Enhanced v3.0)
196
+
197
+ **Main v3.0 Functions (RECOMMENDED):**
198
+ - `quick_inference_with_optimal_threshold()` - Uses optimal threshold automatically
199
+ - `load_ohca_model_with_metadata()` - Load model with optimal threshold
200
+ - `run_inference_with_optimal_threshold()` - Enhanced inference
201
+ - `analyze_predictions_enhanced()` - Improved prediction analysis
202
 
203
+ **Legacy Functions (Backward Compatible):**
204
+ - `quick_inference()` - Auto-detects model version
205
+ - `load_ohca_model()` - Basic model loading
206
+ - `run_inference()` - Basic inference
 
 
207
 
208
+ **Example Usage (v3.0):**
209
  ```python
210
+ from src.ohca_inference import load_ohca_model_with_metadata, run_inference_with_optimal_threshold
211
 
212
+ # Load v3.0 model with optimal threshold
213
+ model, tokenizer, optimal_threshold, metadata = load_ohca_model_with_metadata("./trained_model")
214
+
215
+ # Run inference with optimal threshold
216
+ results = run_inference_with_optimal_threshold(model, tokenizer, new_data_df, optimal_threshold)
217
  ```
218
 
219
  ## Model Architecture
 
222
  - **Max Sequence Length**: 512 tokens
223
  - **Optimization**: AdamW with linear learning rate scheduling
224
  - **Class Balancing**: Weighted loss + minority class oversampling
225
+ - **Threshold Selection**: Optimal threshold found via validation set (v3.0)
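As a rough illustration of minority-class oversampling, the sketch below duplicates randomly chosen minority rows until both classes are the same size. The `oversample_minority` helper and the `label` key are assumptions made for this example. The package's actual balancing code may differ, and oversampling like this would normally be applied to the training split only.

```python
import random


def oversample_minority(rows, label_key="label", seed=42):
    """Duplicate random minority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted((positives, negatives), key=len)
    if not minority:
        return list(rows)  # nothing to resample from
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = majority + minority + extra
    rng.shuffle(balanced)
    return balanced


# Toy training set: 2 OHCA cases, 8 non-OHCA cases.
train_rows = [{"hadm_id": i, "label": 1 if i < 2 else 0} for i in range(10)]
balanced = oversample_minority(train_rows)
```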
226
 
227
  ## Performance Metrics
 
228
 
229
+ ### v3.0 Enhanced Evaluation
230
+ The model provides unbiased performance estimates using:
231
+ - **Independent test set** for final evaluation
232
+ - **Optimal threshold** found on validation set only
233
+ - **Patient-level splits** preventing data leakage
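A patient-level split can be sketched by shuffling patient IDs rather than notes, so that every admission for a given `subject_id` lands in exactly one split. The helper name and the 70/15/15 fractions below are assumptions for illustration; the README does not state the ratios used by `create_patient_level_splits()`.

```python
import random


def patient_level_split(rows, fractions=(0.7, 0.15, 0.15), seed=42):
    """Split rows into train/val/test so no subject_id appears in two splits."""
    rng = random.Random(seed)
    patients = sorted({r["subject_id"] for r in rows})
    rng.shuffle(patients)
    n_train = int(len(patients) * fractions[0])
    n_val = int(len(patients) * fractions[1])
    groups = {
        "train": set(patients[:n_train]),
        "val": set(patients[n_train:n_train + n_val]),
        "test": set(patients[n_train + n_val:]),  # remainder goes to test
    }
    return {name: [r for r in rows if r["subject_id"] in ids]
            for name, ids in groups.items()}


# 60 admissions spread over 20 patients (3 admissions each).
rows = [{"hadm_id": 1000 + i, "subject_id": i % 20} for i in range(60)]
splits = patient_level_split(rows)
```

Splitting on admissions instead would let notes from the same patient appear in both training and test data, which inflates test performance.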
234
+
235
+ **Clinical Metrics:**
236
  - **Sensitivity (Recall)**: Percentage of OHCA cases correctly identified
237
  - **Specificity**: Percentage of non-OHCA cases correctly identified
238
  - **Precision (PPV)**: When model predicts OHCA, percentage that are correct
 
242
 
243
  ## Clinical Usage
244
 
245
+ ### Enhanced v3.0 Clinical Decision Support
 
 
 
 
246
 
247
+ **Clinical Priorities (v3.0):**
248
+ - **Immediate Review**: Very high probability cases requiring urgent attention
249
+ - **Priority Review**: High probability cases for clinical team review
250
+ - **Clinical Review**: Medium-high probability cases above optimal threshold
251
+ - **Consider Review**: Medium probability cases for potential review
252
+ - **Routine Processing**: Low probability cases
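The five priority labels could be assigned with a simple cascade like the one below. The numeric cutoffs (0.9, 0.7, 0.3) are hypothetical placeholders, since this README does not list the v3.0 cutoffs. Only the category names and the role of the optimal threshold come from the text above.

```python
def clinical_priority(probability, optimal_threshold,
                      immediate=0.9, priority=0.7, consider=0.3):
    """Map an OHCA probability to a review category (cutoffs are illustrative)."""
    if probability >= immediate:
        return "Immediate Review"
    if probability >= priority:
        return "Priority Review"
    if probability >= optimal_threshold:
        return "Clinical Review"
    if probability >= consider:
        return "Consider Review"
    return "Routine Processing"


labels = [clinical_priority(p, optimal_threshold=0.5)
          for p in (0.95, 0.75, 0.55, 0.35, 0.10)]
```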
253
+
254
+ **Optimal Threshold Usage:**
255
+ - Model automatically uses threshold found during validation
256
+ - Consistent decision-making across all datasets
257
+ - Better performance than static thresholds
258
+
259
+ **Workflow Integration:**
260
+ 1. Run inference on new discharge notes (uses optimal threshold)
261
+ 2. Prioritize "Immediate Review" cases for urgent manual review
262
+ 3. Schedule "Priority Review" cases for clinical team evaluation
263
+ 4. Use "Clinical Review" cases for quality improvement
264
+ 5. Monitor routine cases for false negatives
265
 
266
  ## Repository Structure
267
  ```
268
+ ohca-classifier-3.0/
269
  ├── src/
270
  │ ├── __init__.py
271
+ │ ├── ohca_training_pipeline.py # Enhanced v3.0 training workflow
272
+ │ └── ohca_inference.py # Enhanced v3.0 inference
273
  ├── examples/
274
+ │ ├── training_example.py # v3.0 training examples
275
+ │ ├── inference_example.py # v3.0 inference examples
276
+ │ └── clif_dataset_example.py # Cross-institutional deployment
277
  ├── docs/
278
+ │ └── annotation_guidelines.md # Enhanced annotation guidelines
279
  ├── requirements.txt
280
  ├── setup.py
281
  ├── README.md
 
284
 
285
  ## Examples
286
 
287
+ ### Complete v3.0 Training Example
288
  ```bash
289
  cd examples
290
  python training_example.py
291
+ # Choose option 1: v3.0 Training with Improved Methodology
292
  ```
293
 
294
+ ### Enhanced v3.0 Inference Examples
295
  ```bash
296
  cd examples
297
  python inference_example.py
298
+ # Choose option 1: v3.0 Inference with Optimal Threshold
299
+ ```
300
+
301
+ ### Cross-Institutional Deployment
302
+ ```bash
303
+ cd examples
304
+ python clif_dataset_example.py
305
+ # Apply v3.0 model to external datasets
306
  ```
307
 
308
  ## Advanced Usage
309
 
310
+ ### Large Dataset Processing (v3.0)
311
  ```python
312
+ from src.ohca_inference import process_large_dataset_with_optimal_threshold
313
 
314
+ # Process with optimal threshold automatically
315
+ process_large_dataset_with_optimal_threshold(
316
+ model_path="./trained_model_v3",
317
  data_path="large_dataset.csv",
318
  output_path="results.csv",
319
  chunk_size=5000
320
  )
321
  ```
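The effect of `chunk_size` can be pictured as reading the CSV a fixed number of rows at a time instead of loading it whole. The generator below is a standard-library sketch of that pattern, not the internals of `process_large_dataset_with_optimal_threshold()`. `iter_chunks` is a hypothetical name.

```python
import csv
import io
from itertools import islice


def iter_chunks(csv_file, chunk_size):
    """Yield lists of at most chunk_size row dicts from an open CSV file."""
    reader = csv.DictReader(csv_file)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# In-memory stand-in for a large file: 12 rows read 5 at a time.
data = io.StringIO(
    "hadm_id,clean_text\n" + "".join(f"{i},note {i}\n" for i in range(12))
)
sizes = [len(chunk) for chunk in iter_chunks(data, chunk_size=5)]
print(sizes)  # [5, 5, 2]
```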
322
 
323
+ ### Model Testing with v3.0 Features
324
  ```python
325
  from src.ohca_inference import test_model_on_sample
326
 
327
+ # Test with optimal threshold support
328
  test_cases = {
329
  'case1': "Chief complaint: Cardiac arrest at home...",
330
  'case2': "Chief complaint: Chest pain, no arrest..."
331
  }
332
 
333
+ results = test_model_on_sample("./trained_model_v3", test_cases)
334
+ # Results include optimal threshold predictions and clinical priorities
335
  ```
336
 
337
  ## Performance Benchmarks
338
+
339
+ ### v3.0 Methodology Performance
340
+ Typical performance with improved methodology:
341
+ - **AUC-ROC**: 0.85-0.95 (unbiased estimates)
342
+ - **Sensitivity**: 85-95% (at optimal threshold)
343
+ - **Specificity**: 85-95% (at optimal threshold)
344
+ - **F1-Score**: 0.7-0.9 (optimized via validation)
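For reference, the metrics listed above all follow from the four confusion-matrix counts at a given threshold. The sketch below shows the arithmetic. The counts are invented for illustration and are not results from this model.

```python
def clinical_metrics(tp, fp, tn, fn):
    """Compute standard clinical metrics from confusion-matrix counts."""
    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),   # recall on OHCA cases
        "specificity": ratio(tn, tn + fp),   # recall on non-OHCA cases
        "ppv": ratio(tp, tp + fp),           # precision
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Invented counts for a 196-note test set.
metrics = clinical_metrics(tp=18, fp=6, tn=170, fn=2)
```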
345
+
346
+ **Key Improvements over Legacy:**
347
+ - **Unbiased evaluation** using independent test set
348
+ - **Optimal threshold** provides better sensitivity/specificity balance
349
+ - **Larger training sets** (800 vs 264) improve generalization
350
+ - **Patient-level splits** prevent overoptimistic performance estimates
351
 
352
  *Performance varies based on data quality and annotation consistency*
353
 
354
+ ## Migration from Legacy Versions
355
+
356
+ ### Upgrading from Legacy to v3.0
357
+
358
+ **Benefits of Upgrading:**
359
+ - More reliable performance estimates
360
+ - Better clinical decision support
361
+ - Optimal threshold usage
362
+ - Enhanced workflow integration
363
+
364
+ **Migration Steps:**
365
+ 1. **Retrain with v3.0 methodology** using `complete_improved_training_pipeline()`
366
+ 2. **Add patient IDs** to your data (`subject_id` column)
367
+ 3. **Use v3.0 inference functions** for new predictions
368
+ 4. **Update workflows** to use clinical priorities
369
+
370
+ **Backward Compatibility:**
371
+ - Legacy models continue to work
372
+ - Legacy functions automatically detect model version
373
+ - Gradual migration supported
374
+
375
  ## Citation
376
  If you use this code in your research, please cite:
377
 
378
  ```bibtex
379
+ @software{nlp_ohca_classifier_v3,
380
+ title={NLP OHCA Classifier v3.0: BERT-based Detection of Out-of-Hospital Cardiac Arrest with Enhanced Methodology},
381
  author={Mona Moukaddem},
382
  year={2025},
383
+ url={https://github.com/monajm36/ohca-classifier-3.0},
384
+ note={Enhanced methodology addressing data leakage, threshold optimization, and evaluation bias}
385
  }
386
  ```
387
 
 
395
  4. Push to the branch (`git push origin feature/AmazingFeature`)
396
  5. Open a Pull Request
397
 
398
+ ## Support
399
+ For questions or issues:
400
+ - Check the [Issues](https://github.com/monajm36/ohca-classifier-3.0/issues) page
401
+ - Create a new issue if needed
402
+ - Review examples in the `examples/` folder
403
+
404
+ ## Methodology References
405
+ The v3.0 improvements are based on established machine learning best practices:
406
+ - Patient-level data splits prevent data leakage in healthcare AI
407
+ - Proper train/validation/test methodology ensures unbiased evaluation
408
+ - Optimal threshold finding improves clinical performance
409
+ - Larger sample sizes enhance model generalization
410
 
411
  ## Acknowledgments
412
  - PubMedBERT model from Microsoft Research
413
  - MIMIC-III dataset for model development
414
  - Transformers library by Hugging Face
415
  - PyTorch for deep learning framework
416
+ - Data science community for methodological guidance