Model Card: GPT-2-IMDb

An in-domain GPT-2, pre-trained from scratch on the IMDb dataset text.

Model Details

Description

This model is based on the GPT-2 architecture and was pre-trained from scratch (in-domain) on the text of the IMDb dataset, excluding its test split.

Developed by: Cesar Gonzalez-Gutierrez
Funded by: ERC
Architecture: GPT-2
Language: English
License: MIT
Base model: GPT-2

Checkpoints

Intermediate checkpoints from the pre-training process are available and can be accessed using specific tags, which correspond to training epochs and steps:

Epoch	Step	Epoch tag	Step tag
1	703	epoch-1	step-703
5	3515	epoch-5	step-3515
10	7031	epoch-10	step-7031
20	14063	epoch-20	step-14063
30	21095	epoch-30	step-21095
40	28126	epoch-40	step-28126
50	35158	epoch-50	step-35158
60	42190	epoch-60	step-42190
70	49221	epoch-70	step-49221
80	56240	epoch-80	step-56240

To load a model from a specific intermediate checkpoint, use the revision parameter with the corresponding tag:

from transformers import AutoModelForCausalLM

# GPT-2 is a causal language model, so load it with the causal LM class.
model = AutoModelForCausalLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
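As a minimal sketch of how the tags in the table above could be resolved programmatically, the helper below maps an epoch to its epoch and step tags and loads the matching revision. The names `CHECKPOINT_STEPS`, `checkpoint_tags`, and `load_checkpoint` are hypothetical, and the default repository id `cglez/gpt2-imdb` is assumed from the model tree listed on this card.

```python
from typing import Tuple

# Epoch -> step values copied from the checkpoint table in this card.
CHECKPOINT_STEPS = {
    1: 703, 5: 3515, 10: 7031, 20: 14063, 30: 21095,
    40: 28126, 50: 35158, 60: 42190, 70: 49221, 80: 56240,
}


def checkpoint_tags(epoch: int) -> Tuple[str, str]:
    """Return the (epoch tag, step tag) pair for a listed checkpoint."""
    if epoch not in CHECKPOINT_STEPS:
        raise ValueError(f"No checkpoint listed for epoch {epoch}")
    return f"epoch-{epoch}", f"step-{CHECKPOINT_STEPS[epoch]}"


def load_checkpoint(epoch: int, model_name: str = "cglez/gpt2-imdb"):
    """Load the checkpoint for `epoch` (hypothetical helper).

    The default repository id is an assumption; replace it if the model
    is hosted under a different name.
    """
    # Imported lazily so the tag helper works without transformers installed.
    from transformers import AutoModelForCausalLM

    epoch_tag, _ = checkpoint_tags(epoch)
    return AutoModelForCausalLM.from_pretrained(model_name, revision=epoch_tag)
```

For example, `checkpoint_tags(10)` yields the pair `("epoch-10", "step-7031")`, and either tag can be passed as `revision`.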

Sources

Paper: [Information pending]

Training Details

For more details on the training procedure, please refer to the base model's documentation: Training procedure.

Training Data

All texts from the IMDb dataset, excluding the test partition.

Training Hyperparameters

Precision: fp16
Batch size: 8
Gradient accumulation steps: 12
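The batch size and gradient accumulation steps combine into an effective batch size per optimizer update. The sketch below works through that arithmetic, assuming the listed batch size is per device and that training ran on a single device; neither assumption is stated in this card, and `effective_batch_size` is a hypothetical helper.

```python
def effective_batch_size(per_device_batch: int,
                         accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


# Values from this card; num_devices=1 is an assumption.
print(effective_batch_size(8, 12))  # 96
```

Under these assumptions, each optimizer step would aggregate gradients from 96 examples.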

Uses

For typical use cases and limitations, please refer to the base model's guidance: Intended uses & limitations.

Bias, Risks, and Limitations

This model inherits potential risks and limitations from the base model. Refer to: Limitations and bias.

Environmental Impact

Hardware Type: NVIDIA A100 PCIE 40GB
Runtime: 7 h
Cluster Provider: Artemisa
Compute Region: EU
Carbon Emitted: 1.08 kg CO2 eq.

Citation

BibTeX:

[More Information Needed]


Safetensors

Model size: 0.1B params
Tensor type: F32

Model tree for cglez/gpt2-imdb

Base model: openai-community/gpt2