cglez committed · Commit 32fde23 · verified · 1 parent: 53e902d

Update README.md

Files changed (1): README.md (+54 -24)

README.md
 
---
library_name: transformers
language: en
license: mit
datasets:
- cglez/wiki_toxic_clean
base_model:
- openai-community/gpt2
---

# Model Card: GPT-2 DAPT Wiki Toxic

A domain-adapted GPT-2, further pre-trained on the text of the Wiki Toxic dataset.

## Model Details

### Description

This model is based on the [GPT-2](https://huggingface.co/openai-community/gpt2)
architecture and was further pre-trained (domain-adapted) on the text of the Wiki Toxic dataset, excluding its test split.

- **Developed by:** [Cesar Gonzalez-Gutierrez](https://ceguel.es)
- **Funded by:** [ERC](https://erc.europa.eu)
- **Architecture:** GPT-2
- **Language:** English
- **License:** MIT
- **Base model:** [GPT-2](https://huggingface.co/openai-community/gpt2)

### Checkpoints

Intermediate checkpoints from the pre-training process are available and can be accessed using Git tags,
which correspond to training epochs and steps:

| Epoch | Step | Tags |
|---|---|---|
| 1 | 1496 | `epoch-1`, `step-1496` |
| 5 | 7480 | `epoch-5`, `step-7480` |
| 10 | 14960 | `epoch-10`, `step-14960` |
| 15 | 22440 | `epoch-15`, `step-22440` |
| 20 | 29920 | `epoch-20`, `step-29920` |
| 25 | 37400 | `epoch-25`, `step-37400` |
| 30 | 44880 | `epoch-30`, `step-44880` |
| 35 | 52360 | `epoch-35`, `step-52360` |
| 40 | 59840 | `epoch-40`, `step-59840` |
| 45 | 67320 | `epoch-45`, `step-67320` |
| 50 | 74800 | `epoch-50`, `step-74800` |

To load the model from a specific intermediate checkpoint, pass the corresponding tag via the `revision` parameter:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("<model-name>", revision="<checkpoint-tag>")
```
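
The tag scheme in the table above is regular: each epoch corresponds to 1,496 optimization steps, and a checkpoint was saved at epoch 1 and then every 5 epochs. A small sketch reconstructing the tags programmatically (the helper `checkpoint_tags` is illustrative, not part of the repository):

```python
# Checkpoint tags follow "epoch-<e>" / "step-<s>", where one epoch
# corresponds to 1,496 optimization steps (figure taken from the table above).
STEPS_PER_EPOCH = 1496

def checkpoint_tags(epoch: int) -> tuple[str, str]:
    """Return the (epoch tag, step tag) pair for a saved epoch."""
    return f"epoch-{epoch}", f"step-{epoch * STEPS_PER_EPOCH}"

# Saved epochs listed in the table: 1, then every 5 epochs up to 50.
saved_epochs = [1] + list(range(5, 51, 5))
tags = dict(checkpoint_tags(e) for e in saved_epochs)
```

Either tag of a pair identifies the same checkpoint when passed as `revision=`.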

### Sources

- **Paper:** [Information pending]

## Training Details

For more details on the training procedure, please refer to the base model's documentation:
[Training procedure](https://huggingface.co/openai-community/gpt2#training-procedure).

### Training Data

All texts from the Wiki Toxic dataset, excluding the test partition.

#### Preprocessing

All markup and symbols, including punctuation, were removed from the texts.
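
The exact cleaning pipeline is not published in this card; purely as an illustration of the step described above, a minimal stdlib sketch (the `clean` helper is hypothetical, not the authors' code):

```python
import re

def clean(text: str) -> str:
    """Illustrative cleaning: drop HTML-style markup, then every
    character that is not a letter, digit, or whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)    # markup tags
    text = re.sub(r"[^\w\s]|_", " ", text)  # punctuation and symbols
    return " ".join(text.split())           # normalize whitespace

cleaned = clean("== Heading ==\nSee <b>this</b> talk-page, please!")
print(cleaned)  # -> "Heading See this talk page please"
```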

#### Training Hyperparameters

- **Precision:** fp16
- **Batch size:** 8
- **Gradient accumulation steps:** 12
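
With gradient accumulation, the effective batch size per optimizer step is the per-device batch size times the accumulation steps; a quick check of the figures above:

```python
# Hyperparameters from the list above.
batch_size = 8
grad_accum_steps = 12

# Gradients are accumulated over 12 micro-batches of 8 examples,
# so each optimizer update sees 96 examples.
effective_batch_size = batch_size * grad_accum_steps
print(effective_batch_size)  # -> 96
```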

## Uses

For typical use cases and limitations, please refer to the base model's guidance:
[Intended uses & limitations](https://huggingface.co/openai-community/gpt2#intended-uses--limitations).

## Bias, Risks, and Limitations

This model inherits potential risks and limitations from the base model. Refer to:
[Limitations and bias](https://huggingface.co/openai-community/gpt2#limitations-and-bias).

## Environmental Impact

- **Hardware Type:** NVIDIA A100 PCIE 40GB
- **Runtime:** 24.5 hours
- **Cluster Provider:** [Artemisa](https://artemisa.ific.uv.es/web/)
- **Compute Region:** EU
- **Carbon Emitted:** 3.8 kg CO2 eq.
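
Estimates of this kind are commonly computed as power draw × runtime × grid carbon intensity (the approach of the ML CO2 Impact calculator). A back-of-the-envelope sketch under stated assumptions: the 250 W figure is the A100 PCIE 40GB board TDP, and the carbon intensity is an arbitrary placeholder, not the value behind the 3.8 kg figure above (which may also account for whole-node power or datacenter overhead):

```python
# Rough estimate: power (kW) * time (h) * intensity (kg CO2 eq. per kWh).
gpu_power_kw = 0.250   # assumed: A100 PCIE 40GB board TDP
runtime_hours = 24.5   # from the card above
carbon_intensity = 0.25  # placeholder grid intensity, kg CO2 eq. / kWh

energy_kwh = gpu_power_kw * runtime_hours       # GPU-only energy at full draw
emissions_kg = energy_kwh * carbon_intensity    # resulting CO2 estimate
print(f"{energy_kwh:.3f} kWh, ~{emissions_kg:.2f} kg CO2 eq.")
```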

## Citation