Update README.md
README.md CHANGED
@@ -22,12 +22,13 @@ model-index:
 metrics:
 - name: Loss
   type: loss
-  value: 4.
+  value: 4.69
 ---

 # T5-Small with LoRA on OpenCodeReasoning

 This is a LoRA fine-tuned version of T5-small on a subset of NVIDIA's OpenCodeReasoning dataset using [PEFT](https://github.com/huggingface/peft).
+Improved version to be uploaded soon.

 ## Loss Curve

@@ -43,7 +44,8 @@ This is a LoRA fine-tuned version of T5-small on a subset of NVIDIA's OpenCodeReasoning dataset using [PEFT](https://github.com/huggingface/peft).
 | 400 | 4.89 | 4.42 |
 | 450 | 4.69 | 4.40 |

-Final Train Loss: **
+Final Train Loss: **4.69**
+Final Eval Loss: **4.40**

 ## Example Usage

@@ -59,8 +61,9 @@ tokenizer = AutoTokenizer.from_pretrained("ShahzebKhoso/t5-small-opencode-lora")
 inputs = tokenizer("generate code: write a function to reverse a string", return_tensors="pt")
 outputs = model.generate(**inputs)
 print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+'''

-Notes
+## Notes

 Trained on subset of OpenCodeReasoning due to Colab memory limits

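The hunk above shows only the tail of the card's usage block: it calls model.generate without showing how model is built. Below is a minimal runnable sketch, assuming the adapter repo id ShahzebKhoso/t5-small-opencode-lora from the hunk header and loading via PeftModel over the t5-small base, as the card's Notes suggest; max_new_tokens=64 is an illustrative setting, not one stated in the card.

# Hedged sketch: reconstructs the card's usage block end to end.
# The adapter repo id and tokenizer source come from the diff above;
# loading PeftModel over the t5-small base follows the card's Notes.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # frozen base weights
model = PeftModel.from_pretrained(base, "ShahzebKhoso/t5-small-opencode-lora")  # attach LoRA adapter
tokenizer = AutoTokenizer.from_pretrained("ShahzebKhoso/t5-small-opencode-lora")

inputs = tokenizer("generate code: write a function to reverse a string", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)  # max_new_tokens is an assumed setting
print(tokenizer.decode(outputs[0], skip_special_tokens=True))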
@@ -69,6 +72,6 @@ Use PeftModel with t5-small base
 Metrics used: Loss (BLEU skipped due to output structure)


-License
+## License

 Apache 2.0
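For context on the card's "LoRA fine-tuned ... using PEFT" claim, here is a minimal sketch of how such an adapter is typically attached for training. None of the hyperparameters below (r, lora_alpha, lora_dropout, target_modules) appear in the card, so treat them as placeholders rather than the author's actual configuration.

# Hedged sketch of attaching a LoRA adapter to t5-small with PEFT.
# All hyperparameters are assumptions; the card does not state them.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,                        # assumed adapter rank
    lora_alpha=16,              # assumed scaling factor
    lora_dropout=0.05,          # assumed dropout
    target_modules=["q", "v"],  # common choice for T5 attention projections
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable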