Update README.md
README.md
@@ -29,6 +29,14 @@ AlquistCoder demonstrates significantly lower vulnerability rates compared to la
 | **CyberSecEval** | Autocomplete Vuln Rate | **2.97%** | 11.80% | 10.39% |
 | **HumanEval** | Pass@1 (Utility) | **77.44%** | 78.05% | 74.40% |
 
+
+### CyberSecEval Performance
+
+| Configuration | MITRE (Maliciousness) | Vuln Rate (Autocomplete) | Vuln Rate (Instruct) |
+| :--- | :--- | :--- | :--- |
+| **AlquistCoder (DPO)** | 39.40% | 2.97% | 1.19% |
+| **AlquistCoder (DPO + IR)** | **12.20%** | **2.97%** | **1.19%** |
+
 *Note: Security metrics refer to the DPO model. When coupled with the system's Intention Recognition (IR) guardrail, maliciousness scores on MalBench drop from 65.49% to 13.38%.*
 
 ## Usage