# Reduction

This folder contains scripts for combining, reducing, filling and scaling processed EHR data for modelling.

Scripts must be run in the following order:
1. `combine.py` - combine datasets and perform any post-processing
2. `post_prod_reduction.py` - combine columns to reduce 0 values
3. `remove_ids.py` - remove receiver, scale up and test IDs
4. `clean_and_scale_train.py` - impute nulls and min-max scale training data
5. `clean_and_scale_test.py` - impute nulls and min-max scale testing data
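
The run order above could be enforced with a small driver script. The following is only a sketch: the script names come from the list above, but `build_commands` and `run_pipeline` are hypothetical helpers that are not part of this folder, and it assumes each script can be run without command-line arguments from this directory.

```python
import subprocess
import sys

# Script names are taken from the list above; the helpers below are illustrative.
SCRIPTS = [
    "combine.py",
    "post_prod_reduction.py",
    "remove_ids.py",
    "clean_and_scale_train.py",
    "clean_and_scale_test.py",
]


def build_commands(scripts=SCRIPTS, python=sys.executable):
    """Return one command per script, preserving the required order."""
    return [[python, script] for script in scripts]


def run_pipeline(commands, run=subprocess.run):
    """Run each command in turn; check=True raises on the first failure."""
    for command in commands:
        run(command, check=True)
```

Calling `run_pipeline(build_commands())` would then run all five steps in sequence and stop as soon as one of them exits with an error, so a later step never runs on incomplete output.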

_NB: The `data_type` in `clean_and_scale_test.py` can be changed to `rec`, `sup`, `val` or `test`._
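
One likely reason the training script runs before the testing script is that imputation and min-max parameters are learned from training data and then reused on the other datasets. The sketch below illustrates that idea only; it is not the code in these scripts. Median imputation, the reuse of training parameters, and the names `fit_column` and `transform_column` are all assumptions made for the example.

```python
from statistics import median


def fit_column(values):
    """Learn the fill value and min/max from one training column (None = null)."""
    present = [v for v in values if v is not None]
    return {"fill": median(present), "min": min(present), "max": max(present)}


def transform_column(values, params):
    """Impute nulls, then min-max scale using previously fitted parameters."""
    span = params["max"] - params["min"]
    filled = [params["fill"] if v is None else v for v in values]
    # A constant column has zero span; map it to 0.0 rather than divide by zero.
    return [0.0 if span == 0 else (v - params["min"]) / span for v in filled]


# Made-up values for illustration only.
train = [2.0, None, 4.0, 10.0]
test = [6.0, None, 12.0]

params = fit_column(train)
train_scaled = transform_column(train, params)
test_scaled = transform_column(test, params)
```

With parameters taken from the training column, values in another dataset that fall outside the training range scale to below 0 or above 1, which is expected under this approach.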