cglez committed on
Commit
ae6e951
·
verified ·
1 Parent(s): 754c68b

Initial commit

  1. README.md +94 -0
README.md ADDED
@@ -0,0 +1,94 @@
+ ---
+ library_name: transformers
+ language: en
+ license: apache-2.0
+ datasets: []
+ base_model:
+ - google-bert/bert-base-uncased
+ ---
+
+ # BERT Fine-Tuned on <Dataset>
+
+ A BERT model fine-tuned on the <Dataset> dataset.
+
+ ## Model Details
+
+ ### Description
+
+ This model is based on the [BERT base (uncased)](https://huggingface.co/google-bert/bert-base-uncased)
+ architecture and has been fine-tuned on the <Dataset> dataset.
+
+ - **Developed by:** [Cesar Gonzalez-Gutierrez](https://ceguel.es)
+ - **Funded by:** [ERC](https://erc.europa.eu)
+ - **Architecture:** BERT-base
+ - **Base model:** [BERT base model (uncased)](https://huggingface.co/google-bert/bert-base-uncased)
+ - **Language:** English
+ - **License:** Apache 2.0
+
+ ### Seed Initializations
+
+ Models trained with different random initialization seeds are available on dedicated branches
+ of this repository:
+
+ | Random Seed | Branch   |
+ |-------------|----------|
+ | 120         | seed-120 |
+ | 220         | seed-220 |
+ | 320         | seed-320 |
+ | 420         | seed-420 |
+ | 520         | seed-520 |
+
+ To load a model from a specific branch, use the `revision` parameter:
+ ```python
+ from transformers import AutoModelForSequenceClassification
+
+ # `revision` selects the branch that holds the model trained with the desired seed.
+ model = AutoModelForSequenceClassification.from_pretrained("<model>", revision="seed-120")
+ ```
+
+ ### Sources
+
+ [Information pending]
+
+ ## Training Details
+
+ Fine-tuning was performed end-to-end, with hyperparameters selected by grid search.
+ Each configuration was scored by its validation loss on the development set.
+ Once the best configuration was identified, the final model was retrained
+ on the entire training dataset.
+
+ ### Training Data
+
+ The model was trained on the <Dataset> training partition, with validation performed on
+ either the dataset’s development set (if available) or a random 20% split of the training data.
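
The fallback validation split can be sketched as follows. This is a minimal illustration, not the author's training code: the helper name `split_train_validation`, its signature, the default seed, and the in-memory list of examples are all assumptions made for the example. Only the 20% validation fraction comes from the model card.

```python
import random


def split_train_validation(examples, validation_fraction=0.2, seed=0):
    """Randomly hold out a fraction of the training examples for validation.

    Hypothetical helper: the name, signature, and default seed are
    illustrative assumptions, not taken from the original training code.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    n_validation = int(len(examples) * validation_fraction)
    validation = [examples[i] for i in indices[:n_validation]]
    train = [examples[i] for i in indices[n_validation:]]
    return train, validation


train, validation = split_train_validation(list(range(100)))
print(len(train), len(validation))  # 80 20
```

Seeding a local `random.Random` instance keeps the split reproducible without touching the global random state.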
+
+ ### Training Hyperparameters
+
+ - **Epochs:** 1-4
+ - **Batch size:** {16, 32}
+ - **Learning rate:** {5e-5, 3e-5, 2e-5}
+ - **Validation metric:** loss
+ - **Precision:** fp16
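
As a rough sketch of how such a grid could be enumerated and the best configuration selected by validation loss, consider the snippet below. It is an illustration under stated assumptions, not the author's procedure: reading "1-4" as the epoch values 1 through 4 is an interpretation, and `grid_search`, `validation_loss`, and `toy_loss` are hypothetical names. The toy objective stands in for an actual fine-tuning run so the example executes without training anything.

```python
from itertools import product

# Search space as listed in the model card; reading "1-4" as epochs 1 through 4
# is an assumption.
EPOCHS = [1, 2, 3, 4]
BATCH_SIZES = [16, 32]
LEARNING_RATES = [5e-5, 3e-5, 2e-5]


def grid_search(validation_loss):
    """Return the configuration with the lowest validation loss.

    `validation_loss` is a hypothetical callable that would fine-tune with the
    given configuration and return the loss on the validation split.
    """
    configs = [
        {"epochs": e, "batch_size": b, "learning_rate": lr}
        for e, b, lr in product(EPOCHS, BATCH_SIZES, LEARNING_RATES)
    ]
    return min(configs, key=validation_loss)


def toy_loss(config):
    # Stand-in objective (not a real training run) so the sketch is runnable.
    return (
        abs(config["epochs"] - 3)
        + abs(config["learning_rate"] - 3e-5) * 1e4
        + config["batch_size"] / 1000
    )


best = grid_search(toy_loss)
print(best)
```

Under this reading the grid contains 4 × 2 × 3 = 24 configurations.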
+
+ ## Uses
+
+ This model can be used for classification tasks aligned with the structure and intent of the <Dataset> corpus.
+
+ For broader guidance, refer to the BERT base model’s [Intended Uses & Limitations](https://huggingface.co/google-bert/bert-base-uncased#intended-uses--limitations).
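
A minimal inference sketch for the classification use case might look like the following. It assumes that a tokenizer is published in the same repository and branch as the model and that PyTorch is installed; the helper name `classify` and its `"main"` default are illustrative. The mapping from class index to label depends on the <Dataset> label set and is not specified in this card.

```python
def classify(texts, model_id, revision="main"):
    """Return the predicted class index for each input text.

    Hypothetical helper: assumes the repository at `model_id` also provides
    a tokenizer on the requested branch.
    """
    # Imports are local so the helper can be defined without the libraries installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
    model = AutoModelForSequenceClassification.from_pretrained(model_id, revision=revision)
    model.eval()  # disable dropout for deterministic predictions

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.argmax(dim=-1).tolist()


# Example call (downloads the model; "<model>" is the card's placeholder id):
# predictions = classify(["an example sentence"], "<model>", revision="seed-120")
```

Passing `revision="seed-120"` (or another branch from the seed table) selects one of the alternative initializations.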
+
+ ## Bias, Risks, and Limitations
+
+ This model inherits the potential risks and limitations of its base model. For more details,
+ refer to the [Limitations and bias](https://huggingface.co/google-bert/bert-base-uncased#limitations-and-bias) section of the original model documentation.
+
+ Additionally, it may reflect or amplify patterns and biases present in the <Dataset> training data.
+
+ ## Hardware
+
+ - **Hardware Type:** NVIDIA Tesla V100 PCIe 32GB
+ - **Cluster Provider:** [Artemisa](https://artemisa.ific.uv.es/web/)
+ - **Compute Region:** EU
+
+ ## Citation
+
+ If you use this model in your research, please cite both the base BERT model
+ and the <Dataset> source (to be added).