---
library_name: transformers
language: en
license: apache-2.0
datasets:
- stanfordnlp/imdb
base_model:
- google-bert/bert-base-uncased
---

# BERT Fine-Tuned on IMDb

A BERT base (uncased) model fine-tuned on the IMDb movie review dataset for sentiment classification.

## Model Details

### Description

This model is based on the [BERT base (uncased)](https://huggingface.co/google-bert/bert-base-uncased)
architecture and has been fine-tuned on the [IMDb](https://huggingface.co/datasets/stanfordnlp/imdb) dataset.

- **Developed by:** [Cesar Gonzalez-Gutierrez](https://ceguel.es)
- **Funded by:** [ERC](https://erc.europa.eu)
- **Architecture:** BERT-base
- **Base model:** [BERT base model (uncased)](https://huggingface.co/google-bert/bert-base-uncased)
- **Language:** English
- **License:** Apache 2.0

### Seed Initializations

Alternative models trained with different initialization seeds are available on dedicated branches:

| Random Seed | Branch   |
|-------------|----------|
| 120         | seed-120 |
| 220         | seed-220 |
| 320         | seed-320 |
| 420         | seed-420 |
| 520         | seed-520 |

To load a model from a specific branch, use the `revision` parameter:

```python
from transformers import AutoModelForSequenceClassification

# Load the variant trained with random seed 120; replace "<model>" with this
# repository's model ID on the Hugging Face Hub.
model = AutoModelForSequenceClassification.from_pretrained("<model>", revision="seed-120")
```

### Sources

[Information pending]

## Training Details

Fine-tuning was performed end-to-end using a grid search over key hyperparameters;
an illustrative sketch of such a sweep follows the hyperparameter list below.
Model performance was evaluated by the validation loss computed on the development set.
After identifying the optimal hyperparameter configuration, the final model was retrained
on the entire training dataset.

### Training Data

The model was trained on the IMDb training partition, with validation performed on
a random 20% split of the training data.
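
A minimal sketch of how such a split could be produced with the `datasets` library (the split seed below is illustrative, not the one used for the released model):

```python
from datasets import load_dataset

imdb = load_dataset("stanfordnlp/imdb")

# Hold out a random 20% of the training partition for validation.
# The seed is illustrative only; the original split seed is not documented here.
splits = imdb["train"].train_test_split(test_size=0.2, seed=42)
train_ds, val_ds = splits["train"], splits["test"]
```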

#### Training Hyperparameters

- **Epochs:** 1-4
- **Batch size:** {16, 32}
- **Learning rate:** {5e-5, 3e-5, 2e-5}
- **Validation metric:** loss
- **Precision:** fp16
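
The original training script is not published in this repository, so the following is only an illustrative sketch of such a sweep with the `Trainer` API, continuing from the split sketch above (`train_ds` and `val_ds` are assumed names, not part of the released code):

```python
import itertools

from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Tokenize the hypothetical train/validation splits from the sketch above.
tok = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

def encode(batch):
    return tok(batch["text"], truncation=True, padding="max_length")

train_ds = train_ds.map(encode, batched=True)
val_ds = val_ds.map(encode, batched=True)

best_loss, best_config = float("inf"), None
for bs, lr, epochs in itertools.product([16, 32], [5e-5, 3e-5, 2e-5], [1, 2, 3, 4]):
    args = TrainingArguments(
        output_dir=f"runs/bs{bs}-lr{lr}-ep{epochs}",
        per_device_train_batch_size=bs,
        learning_rate=lr,
        num_train_epochs=epochs,
        fp16=True,  # mixed precision, as listed above
    )
    # A fresh classification head for every grid point.
    model = AutoModelForSequenceClassification.from_pretrained(
        "google-bert/bert-base-uncased", num_labels=2)
    trainer = Trainer(model=model, args=args,
                      train_dataset=train_ds, eval_dataset=val_ds)
    trainer.train()
    val_loss = trainer.evaluate()["eval_loss"]  # selection metric: validation loss
    if val_loss < best_loss:
        best_loss, best_config = val_loss, (bs, lr, epochs)
```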

## Uses

This model can be used for classification tasks aligned with the structure and intent of the IMDb corpus.

For broader guidance, refer to the BERT base model’s [Intended Uses & Limitations](https://huggingface.co/google-bert/bert-base-uncased#intended-uses--limitations).
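
For example, inference with the `pipeline` API might look like the following minimal sketch (replace `"<model>"` with this repository's ID):

```python
from transformers import pipeline

# "<model>" is a placeholder for this repository's ID on the Hugging Face Hub.
classifier = pipeline("text-classification", model="<model>")

print(classifier("A beautifully shot film, but the script drags badly."))
```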

## Bias, Risks, and Limitations

This model inherits the potential risks and limitations of its base model. For more details,
refer to the [Limitations and bias](https://huggingface.co/google-bert/bert-base-uncased#limitations-and-bias) section of the original model documentation.

Additionally, it may reflect or amplify patterns and biases present in the IMDb training data.

## Hardware

- **Hardware Type:** NVIDIA Tesla V100 PCIE 32GB
- **Cluster Provider:** [Artemisa](https://artemisa.ific.uv.es/web/)
- **Compute Region:** EU

## Citation

If you use this model in your research, please cite both the base BERT model
and the IMDb dataset.
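
The standard BibTeX entries for both are:

```bibtex
@inproceedings{devlin-etal-2019-bert,
  title     = "{BERT}: Pre-training of Deep Bidirectional Transformers for Language Understanding",
  author    = "Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina",
  booktitle = "Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
  year      = "2019",
  pages     = "4171--4186",
}

@inproceedings{maas-etal-2011-learning,
  title     = "Learning Word Vectors for Sentiment Analysis",
  author    = "Maas, Andrew L. and Daly, Raymond E. and Pham, Peter T. and Huang, Dan and Ng, Andrew Y. and Potts, Christopher",
  booktitle = "Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies",
  year      = "2011",
  pages     = "142--150",
}
```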