Update README.md
README.md
@@ -70,3 +70,67 @@ The following hyperparameters were used during training:
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2
# Day 1

1. Tried the Neural Magic model `neuralmagic/oBERT-12-upstream-pruned-unstructured-97`. Its macro and micro F1 scores were much lower at the start of training, and the initial steps brought little improvement. However, compared at the same epoch, it came out ahead by 0.159 in F1 score.
2. The code changes were more significant: I added error handling so that the model is moved to the CPU if training runs out of GPU memory.
``` Python
import gc

import torch

'''
Wrap training in a try/except block: when the model needs more memory
than the GPU has, PyTorch raises a RuntimeError. In that case:
1. Check the amount of GPU memory used
2. Move the model to the CPU
3. Call the garbage collector
4. Free the GPU memory in the cache
5. Check the amount of GPU memory used again to see if it was freed
'''
def check_gpu_memory():
    """Print and return the GPU memory currently allocated, in GB."""
    allocated_gb = torch.cuda.memory_allocated() / 1e9
    print(allocated_gb)
    return allocated_gb

# `trainer` and `model` are assumed to be defined earlier in the notebook.
try:
    trainer.train()
except RuntimeError as e:
    if "CUDA out of memory" in str(e):
        print("CUDA out of memory")
        print("Let's free some GPU memory and re-allocate")
        check_gpu_memory()
        # Move the model to the CPU
        model.to("cpu")
        gc.collect()
        # Free the cached GPU memory
        torch.cuda.empty_cache()
        check_gpu_memory()
    else:
        # Any other RuntimeError is re-raised with its original traceback
        raise
```
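
The handler above stops once the memory has been freed. One possible follow-up, which these notes do not describe, is to retry training with a smaller batch size. The sketch below is only an illustration under that assumption: `train_with_oom_retry`, `free_gpu_memory`, and the `train_fn` callable are hypothetical names, not part of the original code or of any library.

```python
import gc

try:
    import torch  # optional, so the sketch also runs on machines without PyTorch
except ImportError:
    torch = None


def free_gpu_memory():
    """Run the garbage collector and release PyTorch's cached GPU memory."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


def train_with_oom_retry(train_fn, batch_size, min_batch_size=1):
    """Call train_fn(batch_size), halving batch_size after each CUDA OOM error.

    train_fn is a hypothetical callable that builds a Trainer for the given
    batch size and runs training. Returns the batch size that succeeded.
    """
    while True:
        try:
            train_fn(batch_size)
            return batch_size
        except RuntimeError as e:
            # Errors other than out-of-memory are re-raised unchanged, and so
            # is an out-of-memory error at the smallest allowed batch size.
            if "out of memory" not in str(e) or batch_size <= min_batch_size:
                raise
        # Clean up outside the except block, after the exception object
        # (and the traceback it holds) has been released.
        free_gpu_memory()
        batch_size = max(min_batch_size, batch_size // 2)
```

With the settings used in this README, a call might look like `train_with_oom_retry(run_training, batch_size=64)`, where `run_training` is whatever function rebuilds the `TrainingArguments` and `Trainer` for the given batch size.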

4. Added a check for which reduced-precision format the GPU supports (bfloat16, otherwise float16) before training.
``` Python
import sys

import torch
from transformers import Trainer, TrainingArguments

def is_on_colab():
    """Return True when running inside Google Colab."""
    return 'google.colab' in sys.modules

# Choose the mixed-precision mode before building the arguments:
# bfloat16 if the CUDA GPU supports it, otherwise float16.
use_bf16 = False
use_fp16 = False
if torch.cuda.is_available():
    if torch.cuda.is_bf16_supported():
        print("CUDA GPU supports bfloat16")
        use_bf16 = True
    else:
        print("CUDA GPU does not support bfloat16, so float16 is used instead")
        use_fp16 = True

training_args_fine_tune = TrainingArguments(
    output_dir="./multi-label-class-classification-on-github-issues",
    num_train_epochs=15,
    learning_rate=3e-5,
    per_device_train_batch_size=64,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model='micro f1',
    save_total_limit=1,
    log_level='error',
    push_to_hub=is_on_colab(),  # push only when running on Colab
    bf16=use_bf16,
    fp16=use_fp16,
)
```
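
The configuration selects the best checkpoint by a metric named `micro f1`, and the notes under Day 1 compare macro and micro F1. The function that computes these metrics is not shown in this README, so the following is only a minimal pure-Python sketch of how the two scores differ for multi-label 0/1 predictions. The names `f1_from_counts` and `micro_macro_f1` and the toy data are assumptions for illustration; in practice a library routine such as scikit-learn's `f1_score` with `average="micro"` or `average="macro"` is the more likely choice.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), taken as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Compute micro and macro F1 for multi-label 0/1 indicator rows.

    y_true and y_pred are lists of equal-length rows, one column per label.
    """
    n_labels = len(y_true[0])
    counts = [[0, 0, 0] for _ in range(n_labels)]  # per-label [TP, FP, FN]
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                counts[j][0] += 1
            elif p:
                counts[j][1] += 1
            elif t:
                counts[j][2] += 1
    # Micro: pool the counts over all labels, then compute a single F1.
    micro = f1_from_counts(*(sum(c[k] for c in counts) for k in range(3)))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(*c) for c in counts) / n_labels
    return {"micro f1": micro, "macro f1": macro}


# Toy data: label 0 is frequent and always predicted correctly,
# labels 1 and 2 are rare and never predicted correctly.
y_true = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 0, 0]]
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 1, 0]]
scores = micro_macro_f1(y_true, y_pred)
print(scores)
```

On this toy data the micro score (8/11) is much higher than the macro score (1/3), because micro F1 is dominated by the frequent label while macro F1 weights every label equally. That gap is one reason the two numbers can move differently during training.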