Update README.md
README.md
CHANGED
@@ -14,7 +14,8 @@ datasets:
 ---
 
 ## Inroduction
-SA stands for Safety and alignment. We fine tuned DeepCoder-1.5B-Preview with STAR-1 for 250 steps to enhance safety alignment using unsloth SFT cookbook
+SA stands for Safety and alignment. We fine tuned DeepCoder-1.5B-Preview with STAR-1 for 250 steps to enhance safety alignment using unsloth SFT cookbook.
+
 This model is fine-tuned with policy-grounded data to be safe and aligned with human values while coding. Specifically, it utilizes the STAR-1 dataset, which integrates diverse, deliberative reasoning examples evaluated rigorously by GPT-4o. This ensures the model maintains robust safety standards and minimizes biases, promoting responsible, secure, and effective coding practices without compromising its core reasoning capabilities.
 
 # Uploaded model