Update README.md
README.md
CHANGED
@@ -59,7 +59,9 @@ Debugged vibecoder dataset
 
 ### 📊 Model Evaluation Results
 
-
+| Tasks | Version | Filter | n-shot | Metric | Vcoder-120B | gpt-oss-120 | DeepSeek-V3.2-Exp |
+|---------------------|---------|------------------|--------|------------|-------------|------------ |-------------------|
+| gsm8k (cot) | 3 | flexible-extract | 5 | exact_match ↑ | 0.88 | 0.88 | - |
 
 **Notes:**
 - The `(+value)` indicates delta over baseline evaluation.
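
The `(+value)` note describes a score annotated with its difference from a baseline. The README does not say how that annotation is produced, so the following is only a minimal sketch of one way to format it. The function name `format_with_delta`, the two-decimal rounding, and the example scores are all assumptions for illustration, not values or code from the repository.

```python
def format_with_delta(score: float, baseline: float, digits: int = 2) -> str:
    """Render a score followed by its signed delta over a baseline.

    Hypothetical helper: the name and the rounding policy are assumptions.
    """
    delta = round(score - baseline, digits)
    if delta == 0:
        delta = 0.0  # normalise -0.0 so the output never reads "(-0.00)"
    # The "+" flag in the format spec forces an explicit sign on the delta.
    return f"{score:.{digits}f} ({delta:+.{digits}f})"


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the table.
    print(format_with_delta(0.90, 0.88))
    print(format_with_delta(0.85, 0.88))
```

A cell rendered this way would show the raw metric first and the signed change second, which matches the reading of `(+value)` given in the note.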