Processing

This folder contains scripts for processing raw EHR data, along with the mappings required to carry out the initial processing steps.

Before running any scripts, create a directory called 'Model_E_Extracts' within the 'S:/data' directory.
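
The directory can also be created from Python rather than by hand. The snippet below is a minimal sketch: the helper name `ensure_extract_dir` is hypothetical and is not part of the processing scripts, and the default path is taken from the instruction above.

```python
from pathlib import Path

# Location taken from the setup instruction; adjust if your data drive differs.
DATA_DIR = Path("S:/data")


def ensure_extract_dir(base_dir: Path = DATA_DIR, name: str = "Model_E_Extracts") -> Path:
    """Create the extracts directory under base_dir if it does not already exist.

    Hypothetical helper for illustration only.
    """
    extract_dir = base_dir / name
    # parents=True also creates base_dir if it is missing;
    # exist_ok=True makes repeated calls safe.
    extract_dir.mkdir(parents=True, exist_ok=True)
    return extract_dir


# Usage (run once before any processing script):
# ensure_extract_dir()
```

Calling the helper more than once is harmless, so it can be left at the top of a setup script.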

NB: The processing scripts below can be run in any order.

Admissions

  • process_admissions.py - SMR01 COPD/Resp admissions per patient per year
  • process_comorbidities.py - SMR01 comorbidities per patient per year
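
As an illustration of what "per patient per year" means for the admissions output, the sketch below counts admissions for each patient and year. It is not the code in process_admissions.py: the record layout, the sample data, and the use of calendar years are all assumptions made for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical records: (patient_id, admission_date).
# Real SMR01 extracts will have different column names and many more fields.
admissions = [
    ("P1", date(2018, 3, 2)),
    ("P1", date(2018, 11, 20)),
    ("P1", date(2019, 1, 5)),
    ("P2", date(2019, 6, 14)),
]


def admissions_per_patient_year(records):
    """Count admissions for each (patient_id, calendar year) pair."""
    return Counter((patient_id, adm_date.year) for patient_id, adm_date in records)


counts = admissions_per_patient_year(admissions)
for (patient_id, year), n in sorted(counts.items()):
    print(patient_id, year, n)
```

A `Counter` returns 0 for any patient and year pair with no admissions, which is convenient when later building one row per patient per year.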

Demographics

  • process_demographics.py - DOB, sex, marital status and SIMD data

Labs

  • process_labs.py - lab test values per patient per year, taking the median lab test value from the 2 years prior
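
The median-over-a-window idea can be sketched as follows. This is an assumption-laden illustration, not the logic of process_labs.py: the function name is hypothetical, the data are made up, and the window is assumed to be the two calendar years before the year of interest.

```python
from datetime import date
from statistics import median


def median_prior_two_years(results, year):
    """Median of lab values dated in the two calendar years before `year`.

    `results` is a list of (test_date, value) pairs for one patient and one
    lab test. Returns None when no results fall inside the window.
    """
    window = [
        value
        for test_date, value in results
        if year - 2 <= test_date.year < year
    ]
    return median(window) if window else None


# Made-up results for a single patient and a single lab test.
results = [
    (date(2017, 5, 1), 1.0),
    (date(2018, 2, 1), 3.0),
    (date(2018, 9, 1), 2.0),
    (date(2019, 4, 1), 9.0),  # outside the window for year 2019
]

print(median_prior_two_years(results, 2019))
```

Using the median rather than the mean keeps a single extreme result from dominating the per-year value.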

Prescribing

  • process_prescribing.py - prescriptions per patient per year