Upload README.md
README.md
CHANGED
@@ -115,13 +115,13 @@ Check `microsoft/Phi-3.5-mini-instruct` for details about the tokenizer, require
 ### Training Data
 
 Continual Pre-training:
--
+- PubMed (commercial subset) and abstracts from `ncbi/pubmed`.
 - Medical Guideline `epfl-llm/guidelines`.
 - Medical Wikipedia `jpcorb20/medical_wikipedia`.
-- Medical Coding: ICD10CM, ICD10PROC, ICD9CM, ICD9PROC, and
+- Medical Coding: ICD10CM, ICD10PROC, ICD9CM, ICD9PROC, and ATC.
 - Clinical documents:
 - `zhengyun21/PMC-Patients`, `akemiH/NoteChat`, and `starmpcc/Asclepius-Synthetic-Clinical-Notes` (only commercial-friendly licenses across all three datasets)
--
+- mtsamples
 
 Clinical alignment:
 - `microsoft/mediflow`