Model Card: GPT-2 Wiki Toxic

An in-domain GPT-2 model, pre-trained from scratch on the text of the Wiki Toxic dataset.

Model Details

Description

This model is based on the GPT-2 architecture and was pre-trained from scratch (in-domain) on the text of the Wiki Toxic dataset, excluding its test split.

Checkpoints

Intermediate checkpoints from the pre-training process are available and can be accessed using specific tags, which correspond to training epochs and steps:

Epoch   Step    Tags
1       1496    epoch-1, step-1496
5       7480    epoch-5, step-7480
10      14960   epoch-10, step-14960
15      22440   epoch-15, step-22440
20      29920   epoch-20, step-29920
25      37400   epoch-25, step-37400
30      44880   epoch-30, step-44880
35      52360   epoch-35, step-52360
40      59840   epoch-40, step-59840
45      67320   epoch-45, step-67320
50      74800   epoch-50, step-74800

To load a model from a specific intermediate checkpoint, use the revision parameter with the corresponding tag:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
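
For example, to generate text with the epoch-10 snapshot (a minimal sketch; it assumes each checkpoint tag also carries the tokenizer files):

from transformers import AutoModelForCausalLM, AutoTokenizer

tag = "epoch-10"
tokenizer = AutoTokenizer.from_pretrained("cglez/gpt2-wiki_toxic", revision=tag)
model = AutoModelForCausalLM.from_pretrained("cglez/gpt2-wiki_toxic", revision=tag)

# Greedy decoding of a short continuation
inputs = tokenizer("The talk page comment was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))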

Sources

  • Paper: [Information pending]

Training Details

For more details on the training procedure, please refer to the base model's documentation: Training procedure.

Training Data

All texts from the Wiki Toxic dataset, excluding its test split.
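
The loading code is not published in this card; the sketch below shows one way to select those splits with the datasets library. The dataset id is a hypothetical placeholder, not a verified Hub id.

from datasets import load_dataset, concatenate_datasets

# "<wiki-toxic-id>" is a hypothetical placeholder for the Wiki Toxic Hub id
ds = load_dataset("<wiki-toxic-id>")
# Use every split except the held-out test partition
train_text = concatenate_datasets([ds[name] for name in ds if name != "test"])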

Preprocessing

All markup and symbols, including punctuation, were removed from the texts.
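
The exact cleaning script is not included here; below is a minimal sketch of the described step, assuming "symbols" means any non-alphanumeric character:

import re

def clean_text(text: str) -> str:
    # Replace markup, symbols, and punctuation (anything that is not a
    # letter, digit, or whitespace) with spaces, then collapse whitespace.
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

print(clean_text("Don't panic!!"))  # "Don t panic"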

Training Hyperparameters

  • Precision: fp16
  • Batch size: 8
  • Gradient accumulation steps: 12 (effective batch size: 8 × 12 = 96; see the sketch below)
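
A minimal sketch of these settings with the transformers Trainer API; all other arguments (learning rate, scheduler, number of epochs) are not reported in this card and are left at their defaults:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2-wiki_toxic",    # hypothetical output path
    fp16=True,                       # precision: fp16
    per_device_train_batch_size=8,   # batch size: 8
    gradient_accumulation_steps=12,  # effective batch size 8 * 12 = 96
)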

Uses

For typical use cases and limitations, please refer to the base model's guidance: Intended uses & limitations.

Bias, Risks, and Limitations

This model inherits potential risks and limitations from the base model. Refer to: Limitations and bias.

Environmental Impact

  • Hardware Type: NVIDIA A100 PCIE 40GB
  • Runtime: 21.5 hours
  • Cluster Provider: Artemisa
  • Compute Region: EU
  • Carbon Emitted: 3.34 kg CO2 eq.
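
The reported figure is consistent with the standard estimate of energy times grid carbon intensity. In the sketch below, 250 W is the A100 PCIE 40GB TDP, and the grid intensity is an illustrative assumption, not a value reported by this card:

# Back-of-envelope check (inputs are assumptions, not measurements)
power_kw = 0.250      # A100 PCIE 40GB TDP
runtime_h = 21.5      # reported runtime
intensity = 0.62      # assumed grid intensity, kg CO2 eq per kWh
print(f"{power_kw * runtime_h * intensity:.2f} kg CO2 eq")  # 3.33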

Citation

BibTeX:

[More Information Needed]
