# CodeWraith Model Evaluation Report
## Summary
| Metric | CodeWraith-3b-v2 (Llama-3.2-3B-Instruct) | CodeWraith-8b-v2 (Llama-3.1-8B-Instruct) |
|--------|-----|-----|
| Avg Structural Score | 0.93 | 0.92 |
| Function Coverage | 0.84 | 0.85 |
| Class Coverage | 0.97 | 0.84 |
| Argument Coverage | 0.91 | 0.93 |
| Return Type Coverage | 0.97 | 0.97 |
| Good Scores (>=80%) | 25 | 24 |
| Avg Inference Time (s) | 20.01 | 21.91 |
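
The gaps between the two models are easier to read as per-metric differences. The following is a minimal sketch that copies the five 0-1 score rows from the table above and computes 8b minus 3b for each. The names `summary` and `metric_deltas` are illustrative and do not come from the CodeWraith evaluation code.

```python
# Score rows copied from the summary table: (3b-v2, 8b-v2).
# The dict and function names are hypothetical, chosen for this sketch only.
summary = {
    "Avg Structural Score": (0.93, 0.92),
    "Function Coverage": (0.84, 0.85),
    "Class Coverage": (0.97, 0.84),
    "Argument Coverage": (0.91, 0.93),
    "Return Type Coverage": (0.97, 0.97),
}


def metric_deltas(table):
    """Return 8b minus 3b for each metric, rounded to two decimals."""
    return {name: round(b - a, 2) for name, (a, b) in table.items()}


deltas = metric_deltas(summary)

# Metric with the largest absolute gap between the two models.
largest = max(deltas, key=lambda name: abs(deltas[name]))

for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
print("Largest gap:", largest)
```

On these figures, class coverage is the only metric where the two models differ by more than 0.02.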
## CodeWraith-3b-v2 (Llama-3.2-3B-Instruct)
- Examples evaluated: 31
- Valid (parseable): 28
- Perfect scores: 15
- Total inference time: 620.2s
## CodeWraith-8b-v2 (Llama-3.1-8B-Instruct)
- Examples evaluated: 31
- Valid (parseable): 28
- Perfect scores: 15
- Total inference time: 679.2s
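
The per-model counts can be cross-checked against the summary table. The sketch below assumes the average inference time is the total time divided by all 31 evaluated examples (not only the 28 parseable ones); under that assumption the results match the table. The names `runs` and `per_example_stats` are illustrative and are not part of the CodeWraith tooling.

```python
# Per-model figures copied from the two sections above.
# Names here are hypothetical, used only for this sketch.
runs = {
    "CodeWraith-3b-v2": {"examples": 31, "valid": 28, "perfect": 15, "total_time_s": 620.2},
    "CodeWraith-8b-v2": {"examples": 31, "valid": 28, "perfect": 15, "total_time_s": 679.2},
}


def per_example_stats(run):
    """Derive per-example rates, assuming all evaluated examples are the denominator."""
    n = run["examples"]
    return {
        "avg_time_s": round(run["total_time_s"] / n, 2),
        "valid_rate": round(run["valid"] / n, 3),
        "perfect_rate": round(run["perfect"] / n, 3),
    }


stats = {model: per_example_stats(run) for model, run in runs.items()}

for model, s in stats.items():
    print(model, s)
```

Both models parse on 28 of 31 examples and score perfectly on 15 of 31, so the timing is the only per-example figure in these two sections that differs.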