- `pubmedqa_testset.csv` for inference.
The datasets are available under `med-rcq/med-rcq-dataset`: https://huggingface.co/datasets/med-rcq/med-rcq-dataset/tree/main
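The files in that repo can also be fetched directly via the Hub's `resolve` endpoint (or with `datasets.load_dataset`). A minimal stdlib-only sketch, assuming the standard Hugging Face URL layout; the helper name is ours, not part of this repo:

```python
import urllib.request

# Direct-download base URL for files in the med-rcq dataset repo
# (standard Hugging Face Hub "resolve" endpoint layout).
BASE = "https://huggingface.co/datasets/med-rcq/med-rcq-dataset/resolve/main/"

def dataset_url(filename: str) -> str:
    """Return the direct-download URL for one file in the dataset repo."""
    return BASE + filename

# Example (requires network access):
# urllib.request.urlretrieve(dataset_url("pubmedqa_testset.csv"),
#                            "pubmedqa_testset.csv")
```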

## Training and Inference Parameters

| Training Parameter          | MedConclusion  |
|-----------------------------|----------------|
| Learning rate               | 2e-04          |
| Seed                        | 42             |
| Scheduler                   | cosine         |
| Warmup ratio                | 0.05           |
| Optimizer                   | AdamW          |
| Gradient accumulation steps | 4              |
| Train batch size per device | 8              |
| Effective batch size        | 192            |
| Evaluation batch size       | 4              |
| Cut-off length              | 1024           |
| Number of GPUs              | 6              |
| LoRA rank                   | 32             |
| LoRA alpha                  | 32             |
| LoRA dropout                | 0.05           |
| LoRA target                 | All            |
| Number of epochs            | 1              |
| Max grad norm               | 1.0            |
| **Training time**           | **11.5 hours** |

| Inference Parameter | MedConclusion |
|---------------------|---------------|
| Temperature         | 0.01          |
| Max tokens          | 250           |

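As a sanity check, the effective batch size in the table above is the product of the per-device train batch size, the gradient accumulation steps, and the GPU count. The snippet below is illustrative only (the variable names are ours, not from the training scripts):

```python
# Hyperparameter values taken from the training table above.
per_device_batch_size = 8  # Train batch size per device
grad_accum_steps = 4       # Gradient accumulation steps
num_gpus = 6               # Number of GPUs

# One optimizer step sees: per-device batch x accumulation steps x GPUs.
effective_batch_size = per_device_batch_size * grad_accum_steps * num_gpus
print(effective_batch_size)  # 192, matching the table
```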
## Environment Setup
- OS: Ubuntu 22.04.3