This is a test training run; the full training run is on its way. Looks promising?
First Turn Scores
| model | turn | score |
|---|---|---|
| gpt-4 | 1 | 8.95625 |
| claude-v1 | 1 | 8.15000 |
| gpt-3.5-turbo | 1 | 8.07500 |
| LexGPT-V2 | 1 | 7.55625 |
| vicuna-13b-v1.3 | 1 | 6.81250 |
Second Turn Scores
| model | turn | score |
|---|---|---|
| gpt-4 | 2 | 9.0250 |
| gpt-3.5-turbo | 2 | 7.8125 |
| claude-v1 | 2 | 7.6500 |
| LexGPT-V2 | 2 | 6.8375 |
| vicuna-13b-v1.3 | 2 | 5.9625 |
Average Scores
| model | score |
|---|---|
| gpt-4 | 8.990625 |
| gpt-3.5-turbo | 7.943750 |
| claude-v1 | 7.900000 |
| LexGPT-V2 | 7.196875 |
| vicuna-13b-v1.3 | 6.387500 |
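
The average scores are consistent with a plain arithmetic mean of each model's first-turn and second-turn scores. A minimal sketch of that calculation, assuming this is how the averages were produced (the names `first_turn`, `second_turn`, and `average_scores` are hypothetical and not taken from any evaluation tool):

```python
# Turn scores copied from the tables above.
first_turn = {
    "gpt-4": 8.95625,
    "claude-v1": 8.15000,
    "gpt-3.5-turbo": 8.07500,
    "LexGPT-V2": 7.55625,
    "vicuna-13b-v1.3": 6.81250,
}
second_turn = {
    "gpt-4": 9.0250,
    "gpt-3.5-turbo": 7.8125,
    "claude-v1": 7.6500,
    "LexGPT-V2": 6.8375,
    "vicuna-13b-v1.3": 5.9625,
}


def average_scores(first, second):
    """Return the per-model mean of the two turn scores (assumed method)."""
    return {model: (first[model] + second[model]) / 2 for model in first}


averages = average_scores(first_turn, second_turn)

# Print models from highest to lowest average score.
for model, score in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {score:.6f}")
```

Under this assumption, LexGPT-V2 comes out at (7.55625 + 6.8375) / 2 = 7.196875, matching the table.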