---
library_name: transformers
language: en
license: apache-2.0
datasets:
- stanfordnlp/imdb
base_model:
- google-bert/bert-base-uncased
---

# Model Card: BERT-IMDb

An in-domain BERT-base model, pre-trained from scratch on the IMDb dataset text.

## Model Details

### Description

This model is based on the [BERT base (uncased)](https://huggingface.co/google-bert/bert-base-uncased)
architecture and was pre-trained from scratch (in-domain) on the text of the IMDb dataset, excluding its test split.
Only the masked language modeling (MLM) objective was used during pre-training.

- **Developed by:** [Cesar Gonzalez-Gutierrez](https://ceguel.es)
- **Funded by:** [ERC](https://erc.europa.eu)
- **Architecture:** BERT-base
- **Language:** English
- **License:** Apache 2.0
- **Base model:** [BERT base model (uncased)](https://huggingface.co/google-bert/bert-base-uncased)

### Checkpoints

Intermediate checkpoints from the pre-training process are available under tags
that correspond to training epochs and steps:

| Epoch | Step | Epoch tag | Step tag |
|---|---|---|---|
| 1 | 703 | epoch-1 | step-703 |
| 5 | 3516 | epoch-5 | step-3516 |
| 10 | 7033 | epoch-10 | step-7033 |
| 20 | 14066 | epoch-20 | step-14066 |
| 30 | 21100 | epoch-30 | step-21100 |
| 40 | 28133 | epoch-40 | step-28133 |
| 50 | 35166 | epoch-50 | step-35166 |
| 60 | 42200 | epoch-60 | step-42200 |
| 70 | 49233 | epoch-70 | step-49233 |
| 80 | 56240 | epoch-80 | step-56240 |

To load a model from a specific intermediate checkpoint, use the `revision` parameter with the corresponding tag:
```python
from transformers import AutoModelForMaskedLM

# Replace the placeholders with the repository ID and one of the tags listed above.
model = AutoModelForMaskedLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
```
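
The tags follow a regular `epoch-N` / `step-N` pattern, so the revision string can be derived from the epoch number. The helper below is a minimal sketch of that lookup; `CHECKPOINT_STEPS` and `checkpoint_tags` are hypothetical names that are not part of this repository, and the step values are copied from the table above.

```python
# Epoch -> optimizer step, copied from the checkpoint table in this card.
CHECKPOINT_STEPS = {
    1: 703, 5: 3516, 10: 7033, 20: 14066, 30: 21100,
    40: 28133, 50: 35166, 60: 42200, 70: 49233, 80: 56240,
}


def checkpoint_tags(epoch):
    """Return the (epoch tag, step tag) pair for a listed checkpoint epoch."""
    if epoch not in CHECKPOINT_STEPS:
        available = ", ".join(str(e) for e in sorted(CHECKPOINT_STEPS))
        raise ValueError(f"No checkpoint for epoch {epoch}; available: {available}")
    return f"epoch-{epoch}", f"step-{CHECKPOINT_STEPS[epoch]}"


epoch_tag, step_tag = checkpoint_tags(10)
print(epoch_tag, step_tag)  # epoch-10 step-7033
```

Either string from the pair can then be passed as `revision` in the `from_pretrained` call shown above.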

### Sources

- **Paper:** [Information pending]

## Training Details

For more details on the training procedure, please refer to the base model's documentation:
[Training procedure](https://huggingface.co/google-bert/bert-base-uncased#training-procedure).
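
Because only the MLM objective was used, the core of the data preparation is token masking. The sketch below illustrates the masking recipe described in the base model's documentation: 15% of tokens are selected, and of those 80% are replaced by `[MASK]`, 10% by a random token, and 10% are left unchanged. That this model used the same rates is an assumption; `mask_tokens` and its parameters are hypothetical names, and the function works on plain string tokens rather than tokenizer IDs.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocab, mlm_probability=0.15, rng=None):
    """Return (inputs, labels) for masked language modeling.

    labels[i] holds the original token at selected positions and None
    elsewhere, so that only selected positions contribute to the loss.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for token in tokens:
        if rng.random() < mlm_probability:
            labels.append(token)  # the model must predict the original token
            draw = rng.random()
            if draw < 0.8:
                inputs.append(MASK_TOKEN)          # 80%: replace with [MASK]
            elif draw < 0.9:
                inputs.append(rng.choice(vocab))   # 10%: replace with a random token
            else:
                inputs.append(token)               # 10%: keep the token unchanged
        else:
            inputs.append(token)
            labels.append(None)  # position ignored by the loss
    return inputs, labels


tokens = "the movie was surprisingly good".split()
inputs, labels = mask_tokens(tokens, vocab=tokens, rng=random.Random(0))
print(inputs)
print(labels)
```

In practice this step is usually delegated to a library data collator that operates on token IDs; the pure-Python version is only meant to make the selection logic visible.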

### Training Data

All texts from the IMDb dataset, excluding the test partition.

#### Training Hyperparameters

- **Precision:** fp16
- **Batch size:** 32
- **Gradient accumulation steps:** 3
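
With gradient accumulation, the number of sequences behind each optimizer update is larger than the listed batch size. The following sketch works out that figure from the values above; it assumes the batch size is per device and that a single device was used (the card does not state the GPU count), and `effective_batch_size` is a hypothetical helper name.

```python
# Values reported in the hyperparameter list of this card.
PER_DEVICE_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 3


def effective_batch_size(per_device_batch_size, accumulation_steps, num_devices=1):
    """Number of sequences contributing to a single optimizer update."""
    return per_device_batch_size * accumulation_steps * num_devices


# num_devices=1 is an assumption; the card does not state how many GPUs were used.
print(effective_batch_size(PER_DEVICE_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))  # 96
```

If the Hugging Face `Trainer` were used to reproduce this setup, these settings would map to the `per_device_train_batch_size`, `gradient_accumulation_steps`, and `fp16` fields of `TrainingArguments`; the card does not say which training loop was actually used.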

## Uses

For typical use cases and limitations, please refer to the base model's guidance:
[Intended uses & limitations](https://huggingface.co/google-bert/bert-base-uncased#intended-uses--limitations).

## Bias, Risks, and Limitations

This model inherits potential risks and limitations from the base model. Refer to:
[Limitations and bias](https://huggingface.co/google-bert/bert-base-uncased#limitations-and-bias).

## Environmental Impact

- **Hardware Type:** NVIDIA Tesla V100 PCIE 32GB
- **Runtime:** 11.5 h
- **Cluster Provider:** [Artemisa](https://artemisa.ific.uv.es/web/)
- **Compute Region:** EU
- **Carbon Emitted:** 2.14 kg CO2 eq.

## Citation

**BibTeX:**

[More Information Needed]