Commit 96d0042 (verified) by jpcorb20 · 1 Parent(s): ec481f1

Upload README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -115,13 +115,13 @@ Check `microsoft/Phi-3.5-mini-instruct` for details about the tokenizer, require
 ### Training Data
 
 Continual Pre-training:
-- [PubMed commercial](https://pubmed.ncbi.nlm.nih.gov/download/) and abstracts from `ncbi/pubmed`.
+- PubMed (commercial subset) and abstracts from `ncbi/pubmed`.
 - Medical Guideline `epfl-llm/guidelines`.
 - Medical Wikipedia `jpcorb20/medical_wikipedia`.
-- Medical Coding: ICD10CM, ICD10PROC, ICD9CM, ICD9PROC, and [ATC](https://en.wikipedia.org/wiki/Anatomical_Therapeutic_Chemical_Classification_System).
+- Medical Coding: ICD10CM, ICD10PROC, ICD9CM, ICD9PROC, and ATC.
 - Clinical documents:
 - `zhengyun21/PMC-Patients`, `akemiH/NoteChat`, and `starmpcc/Asclepius-Synthetic-Clinical-Notes` (only commercial-friendly licenses across all three datasets)
-- [mtsamples](https://www.kaggle.com/datasets/atharvakaushik/mtsamples)
+- mtsamples
 
 Clinical alignment:
 - `microsoft/mediflow`