|
|
--- |
|
|
tags: |
|
|
- generated_from_trainer |
|
|
model-index: |
|
|
- name: multi-label-class-classification-on-github-issues |
|
|
  results: []
|
|
--- |
|
|
|
|
|
|
|
|
|
|
# multi-label-class-classification-on-github-issues |
|
|
|
|
|
This model is a fine-tuned version of [neuralmagic/oBERT-12-upstream-pruned-unstructured-97](https://huggingface.co/neuralmagic/oBERT-12-upstream-pruned-unstructured-97) on a multi-label dataset of GitHub issues.
|
|
It achieves the following results on the evaluation set: |
|
|
- Loss: 0.1301 |
|
|
- Micro f1: 0.5159 |
|
|
- Macro f1: 0.0352 |
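
The large gap between micro and macro F1 is typical of multi-label data with many rare labels: micro F1 pools every individual label decision, so frequent labels dominate, while macro F1 averages per-label scores, so labels that are never predicted pull the average toward zero. A minimal sketch with hypothetical toy data (not the actual evaluation set) illustrating the effect:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical toy data: 6 labels, but the model only ever predicts label 0.
y_true = np.array([[1, 0, 0, 1, 0, 0],
                   [1, 0, 0, 0, 1, 0],
                   [0, 0, 1, 0, 0, 0],
                   [1, 0, 0, 0, 0, 1]])
y_pred = np.array([[1, 0, 0, 0, 0, 0],
                   [1, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0],
                   [1, 0, 0, 0, 0, 0]])

# Micro F1 pools all label decisions, so the frequent label dominates: 0.60
print(f1_score(y_true, y_pred, average="micro", zero_division=0))
# Macro F1 averages per-label F1, so the five unpredicted labels drag it down: ~0.17
print(f1_score(y_true, y_pred, average="macro", zero_division=0))
```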
|
|
|
|
|
## Model description |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Intended uses & limitations |
|
|
|
|
|
More information needed |
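
Until this section is filled in, here is a minimal inference sketch. It assumes the checkpoint was trained with `problem_type="multi_label_classification"` (so a per-label sigmoid applies) and that the repository id below matches where the model was pushed; both are assumptions, not confirmed by this card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed repository id; adjust to the actual Hub location of the checkpoint.
repo_id = "multi-label-class-classification-on-github-issues"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)

text = "Trainer crashes with CUDA out of memory during fine-tuning"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label setup: a sigmoid per label, thresholded at 0.5.
probs = torch.sigmoid(logits)[0]
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)
```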
|
|
|
|
|
## Training and evaluation data |
|
|
|
|
|
More information needed |
|
|
|
|
|
## Training procedure |
|
|
|
|
|
### Training hyperparameters |
|
|
|
|
|
The following hyperparameters were used during training: |
|
|
- learning_rate: 3e-05 |
|
|
- train_batch_size: 64 |
|
|
- eval_batch_size: 8 |
|
|
- seed: 42 |
|
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
|
- lr_scheduler_type: linear |
|
|
- num_epochs: 15 |
|
|
- mixed_precision_training: Native AMP |
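
These settings correspond roughly to what the `Trainer` constructs internally; as a sketch (assuming a `model` is already defined), the optimizer and schedule amount to:

```python
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

# Adam with betas=(0.9, 0.999), epsilon=1e-08 and the learning rate above.
optimizer = AdamW(model.parameters(), lr=3e-5, betas=(0.9, 0.999), eps=1e-8)

# Linear decay over the whole run: 25 steps/epoch x 15 epochs = 375 steps
# (matching the step counts in the results table below).
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=375
)
```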
|
|
|
|
|
### Training results |
|
|
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Micro f1 | Macro f1 | |
|
|
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:| |
|
|
| No log | 1.0 | 25 | 0.3994 | 0.3783 | 0.0172 | |
|
|
| No log | 2.0 | 50 | 0.2846 | 0.3791 | 0.0172 | |
|
|
| No log | 3.0 | 75 | 0.2159 | 0.3791 | 0.0172 | |
|
|
| No log | 4.0 | 100 | 0.1802 | 0.3791 | 0.0172 | |
|
|
| No log | 5.0 | 125 | 0.1618 | 0.3791 | 0.0172 | |
|
|
| No log | 6.0 | 150 | 0.1515 | 0.3791 | 0.0172 | |
|
|
| No log | 7.0 | 175 | 0.1452 | 0.3791 | 0.0172 | |
|
|
| No log | 8.0 | 200 | 0.1411 | 0.3931 | 0.0202 | |
|
|
| No log | 9.0 | 225 | 0.1379 | 0.4413 | 0.0277 | |
|
|
| No log | 10.0 | 250 | 0.1350 | 0.4694 | 0.0309 | |
|
|
| No log | 11.0 | 275 | 0.1327 | 0.4993 | 0.0336 | |
|
|
| No log | 12.0 | 300 | 0.1309 | 0.5084 | 0.0344 | |
|
|
| No log | 13.0 | 325 | 0.1297 | 0.5147 | 0.0349 | |
|
|
| No log | 14.0 | 350 | 0.1291 | 0.5060 | 0.0343 | |
|
|
| No log | 15.0 | 375 | 0.1287 | 0.5107 | 0.0346 | |
|
|
|
|
|
|
|
|
### Framework versions |
|
|
|
|
|
- Transformers 4.25.1 |
|
|
- Pytorch 1.13.0+cu116 |
|
|
- Datasets 2.8.0 |
|
|
- Tokenizers 0.13.2 |
|
|
# Day 1 |
|
|
|
|
|
1. Tried the Neural Magic model `neuralmagic/oBERT-12-upstream-pruned-unstructured-97`. Its macro and micro F1 scores were much lower at the beginning of training, and the first steps barely improved them; however, at the same epoch it ultimately outperformed the previous run by a 0.159 difference in F1 score.
|
|
2. The more significant change was to the code: I added error handling so that, if training uses more memory than the GPU has available, the model is moved to the CPU and the cached GPU memory is freed, as in the block below.
|
|
```python
import gc
import torch

'''
Try/except block around training: if the model uses more memory than the
GPU has available, training raises a CUDA out-of-memory error. On that error:

1. Check the amount of GPU memory used
2. Move the model to the CPU
3. Call the garbage collector
4. Free the cached GPU memory
5. Check the amount of GPU memory used again to confirm it was freed
'''

def check_gpu_memory():
    used_gb = torch.cuda.memory_allocated() / 1e9
    print(used_gb)
    return used_gb

try:
    trainer.train()
except RuntimeError as e:
    if "CUDA out of memory" in str(e):
        print("CUDA out of memory")
        print("Let's free some GPU memory and re-allocate")
        check_gpu_memory()
        # Move the model to the CPU
        model.to("cpu")
        gc.collect()
        # Free the cached GPU memory
        torch.cuda.empty_cache()
        check_gpu_memory()
    else:
        raise e
```
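
Note that this handler only cleans up after the failure; it does not resume training. To actually recover, the usual next step would be to lower `per_device_train_batch_size` (or add gradient accumulation), move the model back to the GPU, and call `trainer.train()` again.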
|
|
3. Added checks for what the environment can support: whether the notebook is running on Colab (to decide whether to push to the Hub) and whether the GPU supports bfloat16 (to pick the mixed-precision mode).
|
|
```python
import sys
import torch
from transformers import Trainer, TrainingArguments

def is_on_colab():
    # Detect whether this notebook is running inside Google Colab
    return 'google.colab' in sys.modules

training_args_fine_tune = TrainingArguments(
    output_dir="./multi-label-class-classification-on-github-issues",
    num_train_epochs=15,
    learning_rate=3e-5,
    per_device_train_batch_size=64,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model='micro f1',
    save_total_limit=1,
    log_level='error',
    # Only push checkpoints to the Hub when running on Colab. This must be
    # a call, is_on_colab(); the bare function object is always truthy.
    push_to_hub=is_on_colab(),
)

if torch.cuda.is_available():
    # Check whether the CUDA GPU supports bfloat16
    if torch.cuda.is_bf16_supported():
        print("Cuda GPU can support bfloat16")
        training_args_fine_tune.bf16 = True
    else:
        print("Cuda GPU cannot support bfloat16, so instead we will use float16")
        training_args_fine_tune.fp16 = True
```
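
A note on the precision flags: `bf16=True` and `fp16=True` enable different mixed-precision modes, so only the one the hardware check selects should be set. Assigning them after construction works because they are plain attributes that the `Trainer` reads at setup time, but passing them directly to the `TrainingArguments` constructor is the more conventional form and keeps its built-in validation.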