This model is a fine-tuned version of [malteklaes/based-CodeBERTa-language-id-llm-module](https://huggingface.co/malteklaes/based-CodeBERTa-language-id-llm-module); the training data is listed under *Training and evaluation data* below.

## Model description and Framework version

- based on [malteklaes/based-CodeBERTa-language-id-llm-module](https://huggingface.co/malteklaes/based-CodeBERTa-language-id-llm-module_uniVienna-2) (7 programming languages), which in turn is based on [huggingface/CodeBERTa-language-id](https://huggingface.co/huggingface/CodeBERTa-language-id) (6 programming languages)
- model details:
- framework versions:
  - Transformers 4.39.3
  - Pytorch 2.2.1+cu121
  - Datasets 2.18.0
  - Tokenizers 0.15.2
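As a rough sketch, the label set of this classifier can be written out explicitly. This is an assumption for illustration: the six base labels follow huggingface/CodeBERTa-language-id, with C++ added by this fine-tune, and the exact label spellings (e.g. `"c++"` vs `"cpp"`) are not confirmed by this card.

```
# Hedged sketch: the seven languages this classifier distinguishes.
# Label strings are assumptions, not taken from the model config.
BASE_LABELS = ["go", "java", "javascript", "php", "python", "ruby"]
LABELS = BASE_LABELS + ["c++"]

print(len(BASE_LABELS), len(LABELS))  # 6 7
```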
## Training and evaluation data

### Training datasets used

- for Go, Java, JavaScript, PHP, Python, Ruby: [code_search_net](https://huggingface.co/datasets/code_search_net)
- for C++: [malteklaes/cpp-code-code_search_net-style](https://huggingface.co/datasets/malteklaes/cpp-code-code_search_net-style)
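Records in these CodeSearchNet-style datasets can be reduced to (text, label) pairs for classifier fine-tuning. A minimal sketch, assuming the usual `func_code_string` and `language` fields; the record and the `to_example` helper below are made up for illustration:

```
# Hedged sketch: shaping a CodeSearchNet-style record into a (text, label)
# pair for language-id fine-tuning. The record is a fabricated example.
record = {
    "func_code_string": "def add(a, b):\n    return a + b",
    "language": "python",
}

def to_example(rec):
    # text = the raw function source, label = its programming language
    return {"text": rec["func_code_string"], "label": rec["language"]}

example = to_example(record)
print(example["label"])  # python
```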
### Training procedure

- machine: T4 GPU (Google Colab)
- system RAM: 4.7/12.7 GB (during training)
- GPU RAM: 2.8/15.0 GB
### Training hyperparameters

The following hyperparameters were used during training:

```
training_args = TrainingArguments(
    output_dir="./based-CodeBERTa-language-id-llm-module_uniVienna",
    overwrite_output_dir=True,
    num_train_epochs=0.1,
    per_device_train_batch_size=8,
    save_steps=500,
    save_total_limit=2,
)
```
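Note that `num_train_epochs=0.1` means only a tenth of the training set is seen, once. A minimal sketch of the step arithmetic, assuming a dataset size of about 1.93M examples (a hypothetical figure chosen to be consistent with the `global_step` reported in the training results):

```
import math

# Assumed dataset size (hypothetical, chosen to match the reported global_step).
dataset_size = 1_930_880
per_device_train_batch_size = 8
num_train_epochs = 0.1

steps_per_epoch = math.ceil(dataset_size / per_device_train_batch_size)
total_steps = int(steps_per_epoch * num_train_epochs)
print(total_steps)  # 24136 under these assumptions
```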
### Training results

```
TrainOutput(global_step=24136, training_loss=0.005988701689750161, metrics={'train_runtime': 1936.0586, 'train_samples_per_second': 99.731, 'train_steps_per_second': 12.467, 'total_flos': 3197518224531456.0, 'train_loss': 0.005988701689750161, 'epoch': 0.1})
```
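As a sanity check, the throughput figures in this `TrainOutput` are internally consistent with the batch size configured above; a small sketch:

```
# Cross-checking the reported TrainOutput metrics against each other.
train_runtime = 1936.0586   # seconds
steps_per_second = 12.467
samples_per_second = 99.731
global_step = 24136

# steps_per_second * runtime should land near the reported global_step
assert abs(train_runtime * steps_per_second - global_step) < 50
# samples per step should match per_device_train_batch_size=8
assert abs(samples_per_second / steps_per_second - 8) < 0.01
print(round(samples_per_second / steps_per_second, 2))  # 8.0
```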