# RadReportX
### Model description
RadReportX is a Llama3.1-8B-Instruct model fine-tuned on synthetic data. The model supports two tasks. The first is open-ended: detecting the phrases in a radiology report that correspond to ICD-10 codes, with no restriction on the underlying disease. The second is detecting diseases in a radiology report from a set of 13 candidates: [Atelectasis, Cardiomegaly, Consolidation, Edema, Enlarged Cardiomediastinum, Fracture, Lung Lesion, Lung Opacity, Pleural Effusion, Pleural Other, Pneumonia, Pneumothorax, Support Devices]. When none of the candidate diseases are present, the model outputs 'Normal'.
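The two tasks above can be sketched as prompt construction plus output parsing. This is an illustrative assumption, not the exact prompt format used during fine-tuning: the wording of `build_icd10_prompt`, `build_candidate_prompt`, and the `parse_candidate_output` helper are hypothetical, and the resulting prompts would be fed to the model through any standard chat/generation API.

```python
# Hypothetical sketch of the two RadReportX tasks: the prompt wording and the
# parsing helper are illustrative assumptions, not the fine-tuning format.

# The 13 candidate diseases for the closed-set task (from the model description).
CANDIDATES = [
    "Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
    "Enlarged Cardiomediastinum", "Fracture", "Lung Lesion", "Lung Opacity",
    "Pleural Effusion", "Pleural Other", "Pneumonia", "Pneumothorax",
    "Support Devices",
]

def build_icd10_prompt(report: str) -> str:
    """Task 1: open-ended detection of phrases that correspond to ICD-10 codes."""
    return ("Identify the phrases in the following radiology report that "
            f"correspond to ICD-10 codes.\n\nReport:\n{report}")

def build_candidate_prompt(report: str) -> str:
    """Task 2: closed-set detection among the 13 candidate diseases."""
    return ("Which of the following diseases are present in the report? "
            f"Candidates: {', '.join(CANDIDATES)}. "
            f"Answer 'Normal' if none apply.\n\nReport:\n{report}")

def parse_candidate_output(text: str) -> list[str]:
    """Map the model's free-text answer onto the candidate labels.

    Returns an empty list for the 'Normal' answer (no candidate disease found).
    """
    if text.strip().lower() == "normal":
        return []
    lowered = text.lower()
    return [c for c in CANDIDATES if c.lower() in lowered]
```

For example, `parse_candidate_output("Pneumonia and Pleural Effusion")` returns the matched labels in candidate order, while `parse_candidate_output("Normal")` returns an empty list.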
### Training set and training process
There are two sources of training data. The first is generated by GPT-4o. The second comes from the MIMIC-CXR dataset (https://arxiv.org/pdf/1901.07042), with labels extracted by the NegBio algorithm. Training was conducted with the torchtune framework (https://github.com/pytorch/torchtune). For details, please refer to our paper listed below.