---
language: en
license: cc-by-4.0
tags:
- Clinical notes
- Discharge summaries
- longformer
datasets:
- MIMIC-III
---
* Continued pre-training of RoBERTa-base using discharge summaries from the MIMIC-III dataset (a loading sketch appears at the end of this card).
* Details can be found in the following paper:
> Xiang Dai, Ilias Chalkidis, Sune Darkner, and Desmond Elliott. 2022. Revisiting Transformer-based Models for Long Document Classification. https://arxiv.org/abs/2204.06683
* Important hyper-parameters (a pre-training sketch using these values follows the table):
| Hyper-parameter | Value |
|---|---|
| Max sequence length | 4096 |
| Batch size | 8 |
| Learning rate | 5e-5 |
| Training epochs | 6 |
| Training time | 130 GPU-hours |
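* The sketch below shows how these hyper-parameters could map onto a masked-language-modelling run with the `transformers` Trainer. It is an illustration under stated assumptions, not the authors' training script: the discharge-summary file path is a placeholder (MIMIC-III requires credentialed access and cannot be redistributed), and `allenai/longformer-base-4096` stands in for a RoBERTa-base checkpoint whose position embeddings have been extended to 4096 tokens.

```python
# A sketch of continued pre-training with the hyper-parameters above; not the
# authors' script. Assumptions: discharge summaries sit in a local text file
# (MIMIC-III cannot be redistributed), and allenai/longformer-base-4096 stands
# in for a 4096-token-capable RoBERTa-base variant.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base = "allenai/longformer-base-4096"  # stand-in base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForMaskedLM.from_pretrained(base)

# One discharge summary per line; "discharge_summaries.txt" is a placeholder.
raw = load_dataset("text", data_files={"train": "discharge_summaries.txt"})
train = raw["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=4096),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="mimic-continued-pretraining",
    per_device_train_batch_size=8,  # batch size 8
    learning_rate=5e-5,             # learning rate 5e-5
    num_train_epochs=6,             # 6 training epochs
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train,
    data_collator=DataCollatorForLanguageModeling(tokenizer),  # standard 15% MLM masking
)
trainer.train()
```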
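* To try the released checkpoint, a minimal mask-filling sketch follows. The model ID used here is a placeholder, since the published Hub name is not given in this card.

```python
# A minimal sketch, assuming the checkpoint is on the Hugging Face Hub.
# "<org>/mimic-longformer-base" is a placeholder ID, not the published name.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "<org>/mimic-longformer-base"  # replace with the real model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill in a masked token in a clinical-style sentence.
text = f"The patient was discharged home in stable {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Report the top-5 predictions at the mask position.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos[0]].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```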