---
library_name: transformers
language: en
license: apache-2.0
datasets:
- community-datasets/ohsumed
base_model:
- google-bert/bert-base-uncased
---

# Model Card: BERT-Ohsumed

An in-domain BERT-base model, pre-trained from scratch on the Ohsumed dataset text.

## Model Details

### Description

This model is based on the [BERT base (uncased)](https://huggingface.co/google-bert/bert-base-uncased)
architecture and was pre-trained from scratch (in-domain) on the text of the Ohsumed dataset, excluding its test split.
Only the masked language modeling (MLM) objective was used during pre-training.

- **Developed by:** [Cesar Gonzalez-Gutierrez](https://ceguel.es)
- **Funded by:** [ERC](https://erc.europa.eu)
- **Architecture:** BERT-base
- **Language:** English
- **License:** Apache 2.0
- **Base model:** [BERT base model (uncased)](https://huggingface.co/google-bert/bert-base-uncased)

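
As a rough illustration of the MLM objective mentioned above, the sketch below applies BERT-style masking to a list of tokens: about 15% of positions are selected as prediction targets, and of those, 80% are replaced by `[MASK]`, 10% by a random token, and 10% are left unchanged, as described in the base model's training procedure. This is a simplified, self-contained sketch and not the code used to train this model; the function name `mask_tokens`, the toy vocabulary, and the example sentence are assumptions made for illustration.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (masked_tokens, labels) following BERT-style MLM masking.

    labels[i] holds the original token at selected positions and None
    elsewhere, so a loss would be computed only on selected positions.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() >= mask_prob:
            # Position not selected: keep the token, no prediction target.
            masked.append(token)
            labels.append(None)
            continue
        labels.append(token)
        roll = rng.random()
        if roll < 0.8:
            masked.append(MASK_TOKEN)         # 80%: replace with [MASK]
        elif roll < 0.9:
            masked.append(rng.choice(vocab))  # 10%: replace with a random token
        else:
            masked.append(token)              # 10%: keep the original token
    return masked, labels


# Toy example (hypothetical sentence and vocabulary).
tokens = "the patient was treated with aspirin".split()
vocab = ["heart", "dose", "study", "renal"]
masked, labels = mask_tokens(tokens, vocab, mask_prob=0.15, seed=0)
print(masked)
print(labels)
```

In actual pre-training the same idea operates on token IDs produced by the tokenizer rather than on whitespace-split words.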
### Checkpoints

Intermediate checkpoints from the pre-training process are available under specific tags,
which correspond to training epochs and steps:

| Epoch | Step | Epoch tag | Step tag |
|---|---|---|---|
| 1 | 98 | epoch-1 | step-98 |
| 5 | 490 | epoch-5 | step-490 |
| 10 | 980 | epoch-10 | step-980 |
| 20 | 1960 | epoch-20 | step-1960 |
| 30 | 2940 | epoch-30 | step-2940 |
| 40 | 3920 | epoch-40 | step-3920 |
| 50 | 4900 | epoch-50 | step-4900 |
| 60 | 5880 | epoch-60 | step-5880 |
| 70 | 6860 | epoch-70 | step-6860 |
| 80 | 7840 | epoch-80 | step-7840 |
| 90 | 8820 | epoch-90 | step-8820 |
| 100 | 9800 | epoch-100 | step-9800 |

To load a model from a specific intermediate checkpoint, pass the corresponding tag as the `revision` parameter:

```python
from transformers import AutoModelForMaskedLM

# Replace <model-name> with the repository ID and <checkpoint-tag> with a tag
# from the table above (for example, "epoch-10" or "step-980").
model = AutoModelForMaskedLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
```

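
The tags in the checkpoint table follow a regular pattern: every listed step count is 98 times its epoch number. The sketch below builds both tags for a given epoch and passes the epoch tag to `from_pretrained`. The names `checkpoint_tags`, `load_checkpoint`, and `STEPS_PER_EPOCH` are hypothetical and inferred from the table, not part of the released model, and the model name is left as a caller-supplied argument.

```python
# Epochs for which the table above lists a checkpoint.
AVAILABLE_EPOCHS = (1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)

# Inferred from the table (step / epoch == 98 in every row); an assumption,
# not a documented constant.
STEPS_PER_EPOCH = 98


def checkpoint_tags(epoch: int) -> tuple[str, str]:
    """Return the (epoch tag, step tag) pair for a listed epoch."""
    if epoch not in AVAILABLE_EPOCHS:
        raise ValueError(f"No checkpoint listed for epoch {epoch}")
    return f"epoch-{epoch}", f"step-{epoch * STEPS_PER_EPOCH}"


def load_checkpoint(model_name: str, epoch: int):
    """Load the checkpoint saved at `epoch` (hypothetical convenience wrapper)."""
    # Imported lazily so the tag helper works without transformers installed.
    from transformers import AutoModelForMaskedLM

    epoch_tag, _step_tag = checkpoint_tags(epoch)
    return AutoModelForMaskedLM.from_pretrained(model_name, revision=epoch_tag)
```

Validating the epoch before contacting the Hub gives a clear error for epochs that have no published tag, instead of a failed download.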
### Sources

- **Paper:** [Information pending]

## Training Details

For more details on the training procedure, please refer to the base model's documentation:
[Training procedure](https://huggingface.co/google-bert/bert-base-uncased#training-procedure).

### Training Data

All texts from the Ohsumed dataset, excluding the test partition.

#### Training Hyperparameters

- **Precision:** fp16
- **Batch size:** 32
- **Gradient accumulation steps:** 3

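
With gradient accumulation, the number of sequences contributing to each optimizer update is the batch size multiplied by the number of accumulation steps. The sketch below computes this for the values listed above, assuming the batch size of 32 is per device and that training ran on a single device; neither assumption is confirmed by this card, and the function name is hypothetical.

```python
def effective_batch_size(batch_size: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Sequences that contribute to one optimizer update."""
    return batch_size * accumulation_steps * num_devices


# Values from the hyperparameter list; num_devices=1 is an assumption.
print(effective_batch_size(32, 3))  # 96
```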
## Uses

For typical use cases and limitations, please refer to the base model's guidance:
[Intended uses & limitations](https://huggingface.co/google-bert/bert-base-uncased#intended-uses--limitations).

## Bias, Risks, and Limitations

This model inherits potential risks and limitations from the base model. Refer to:
[Limitations and bias](https://huggingface.co/google-bert/bert-base-uncased#limitations-and-bias).

## Environmental Impact

- **Hardware Type:** NVIDIA Tesla V100 PCIE 32GB
- **Cluster Provider:** [Artemisa](https://artemisa.ific.uv.es/web/)
- **Compute Region:** EU

## Citation

**BibTeX:**

[More Information Needed]