| 2025-04-01 20:38:45,445 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| 2025-04-01 20:38:45,448 - Evaluation - INFO - task: conv |
| 2025-04-01 20:38:45,449 - Evaluation - INFO - model_id: 3, |
| 2025-04-01 20:38:45,450 - Evaluation - INFO - average: 52.7, |
| 2025-04-01 20:38:45,451 - Evaluation - INFO - question: 30, |
| 2025-04-01 20:38:45,451 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| 2025-04-01 20:49:08,644 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| 2025-04-01 20:49:08,645 - Evaluation - INFO - task: detail |
| 2025-04-01 20:49:08,645 - Evaluation - INFO - model_id: 3, |
| 2025-04-01 20:49:08,646 - Evaluation - INFO - average: 44.9, |
| 2025-04-01 20:49:08,646 - Evaluation - INFO - question: 30, |
| 2025-04-01 20:49:08,647 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| 2025-04-01 20:59:33,281 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| 2025-04-01 20:59:33,282 - Evaluation - INFO - task: complex |
| 2025-04-01 20:59:33,283 - Evaluation - INFO - model_id: 3, |
| 2025-04-01 20:59:33,283 - Evaluation - INFO - average: 68.9, |
| 2025-04-01 20:59:33,284 - Evaluation - INFO - question: 30, |
| 2025-04-01 20:59:33,285 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| 2025-04-01 20:59:33,286 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |
| 2025-04-01 20:59:33,287 - Evaluation - INFO - model_id: 3, |
| 2025-04-01 20:59:33,287 - Evaluation - INFO - total_average: 55.5, |
| 2025-04-01 20:59:33,288 - Evaluation - INFO - total_question: 90, |
| 2025-04-01 20:59:33,288 - Evaluation - INFO - +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ |