| | Algorithm | Model | Rank | AC | CE | WA | RE | TLE | MLE | EXE | Hack Rate | | |
| |----------|--------|------|----|----|----|----|-----|-----|-----|-----------| | |
| | algo|Qwen2.5-14B-Instruct | rank1 | 93.88% | 0.00% | 5.64% | 0.28% | 0.21% | 0.00% | 0.00% | 6.12% | | |
| | algo|Qwen2.5-14B-Instruct | rank2 | 93.73% | 0.00% | 5.82% | 0.28% | 0.17% | 0.00% | 0.00% | 6.27% | | |
| | algo|Qwen2.5-14B-Instruct | rank3 | 93.41% | 0.00% | 6.07% | 0.29% | 0.23% | 0.00% | 0.00% | 6.59% | | |
| | algo|Qwen2.5-14B-Instruct | rank4 | 93.36% | 0.00% | 6.14% | 0.33% | 0.16% | 0.00% | 0.00% | 6.64% | | |
| | algo|Qwen2.5-14B-Instruct | rank5 | 93.40% | 0.00% | 6.08% | 0.30% | 0.22% | 0.00% | 0.00% | 6.60% | | |
| | crux|Qwen2.5-14B-Instruct | rank1 | 81.40% | 0.00% | 16.32% | 0.98% | 1.31% | 0.00% | 0.00% | 18.60% | | |
| | crux|Qwen2.5-14B-Instruct | rank2 | 79.89% | 0.00% | 17.54% | 1.04% | 1.52% | 0.00% | 0.00% | 20.11% | | |
| | crux|Qwen2.5-14B-Instruct | rank3 | 79.59% | 0.00% | 17.86% | 1.05% | 1.50% | 0.00% | 0.00% | 20.41% | | |
| | crux|Qwen2.5-14B-Instruct | rank4 | 79.28% | 0.00% | 18.03% | 1.08% | 1.62% | 0.00% | 0.00% | 20.72% | | |
| | crux|Qwen2.5-14B-Instruct | rank5 | 79.17% | 0.00% | 18.15% | 1.09% | 1.59% | 0.00% | 0.00% | 20.83% | | |
| | ht|Qwen2.5-14B-Instruct | rank1 | 62.31% | 0.00% | 32.89% | 1.34% | 3.47% | 0.00% | 0.00% | 37.69% | | |
| | ht|Qwen2.5-14B-Instruct | rank2 | 58.42% | 0.00% | 36.08% | 1.49% | 4.01% | 0.00% | 0.00% | 41.58% | | |
| | ht|Qwen2.5-14B-Instruct | rank3 | 57.33% | 0.00% | 37.01% | 1.44% | 4.22% | 0.00% | 0.00% | 42.67% | | |
| | ht|Qwen2.5-14B-Instruct | rank4 | 56.38% | 0.00% | 37.76% | 1.45% | 4.40% | 0.00% | 0.00% | 43.62% | | |
| | ht|Qwen2.5-14B-Instruct | rank5 | 55.78% | 0.00% | 38.09% | 1.52% | 4.62% | 0.00% | 0.00% | 44.22% | | |
| | lcb|Qwen2.5-14B-Instruct | rank1 | 51.35% | 0.00% | 42.89% | 1.76% | 4.00% | 0.00% | 0.00% | 48.65% | | |
| | lcb|Qwen2.5-14B-Instruct | rank2 | 45.76% | 0.00% | 47.73% | 1.85% | 4.66% | 0.00% | 0.00% | 54.24% | | |
| | lcb|Qwen2.5-14B-Instruct | rank3 | 44.32% | 0.00% | 48.97% | 1.98% | 4.74% | 0.00% | 0.00% | 55.68% | | |
| | lcb|Qwen2.5-14B-Instruct | rank4 | 42.57% | 0.00% | 50.74% | 1.86% | 4.83% | 0.00% | 0.00% | 57.43% | | |
| | lcb|Qwen2.5-14B-Instruct | rank5 | 41.88% | 0.00% | 50.92% | 1.94% | 5.27% | 0.00% | 0.00% | 58.12% | | |
| | predo|Qwen2.5-14B-Instruct | rank1 | 93.89% | 0.00% | 5.38% | 0.34% | 0.39% | 0.00% | 0.00% | 6.11% | | |
| | predo|Qwen2.5-14B-Instruct | rank2 | 93.35% | 0.00% | 5.80% | 0.33% | 0.52% | 0.00% | 0.00% | 6.65% | | |
| | predo|Qwen2.5-14B-Instruct | rank3 | 93.23% | 0.00% | 5.84% | 0.34% | 0.59% | 0.00% | 0.00% | 6.77% | | |
| | predo|Qwen2.5-14B-Instruct | rank4 | 93.02% | 0.00% | 6.04% | 0.34% | 0.61% | 0.00% | 0.00% | 6.98% | | |
| | predo|Qwen2.5-14B-Instruct | rank5 | 93.12% | 0.00% | 5.95% | 0.32% | 0.62% | 0.00% | 0.00% | 6.88% | | |
| | algo|Qwen2.5-32B-Instruct | rank1 | 94.56% | 0.00% | 5.05% | 0.21% | 0.18% | 0.00% | 0.00% | 5.44% | | |
| | algo|Qwen2.5-32B-Instruct | rank2 | 94.34% | 0.00% | 5.18% | 0.23% | 0.25% | 0.00% | 0.00% | 5.66% | | |
| | algo|Qwen2.5-32B-Instruct | rank3 | 94.14% | 0.00% | 5.32% | 0.23% | 0.31% | 0.00% | 0.00% | 5.86% | | |
| | algo|Qwen2.5-32B-Instruct | rank4 | 94.03% | 0.00% | 5.39% | 0.27% | 0.31% | 0.00% | 0.00% | 5.97% | | |
| | algo|Qwen2.5-32B-Instruct | rank5 | 93.95% | 0.00% | 5.42% | 0.23% | 0.40% | 0.00% | 0.00% | 6.05% | | |
| | crux|Qwen2.5-32B-Instruct | rank1 | 84.76% | 0.00% | 13.25% | 0.91% | 1.09% | 0.00% | 0.00% | 15.24% | | |
| | crux|Qwen2.5-32B-Instruct | rank2 | 83.58% | 0.00% | 14.20% | 0.94% | 1.27% | 0.00% | 0.00% | 16.42% | | |
| | crux|Qwen2.5-32B-Instruct | rank3 | 82.97% | 0.00% | 14.76% | 0.92% | 1.35% | 0.00% | 0.00% | 17.03% | | |
| | crux|Qwen2.5-32B-Instruct | rank4 | 82.68% | 0.00% | 15.06% | 0.90% | 1.36% | 0.00% | 0.00% | 17.32% | | |
| | crux|Qwen2.5-32B-Instruct | rank5 | 82.54% | 0.00% | 15.22% | 0.87% | 1.38% | 0.00% | 0.00% | 17.46% | | |
| | ht|Qwen2.5-32B-Instruct | rank1 | 67.26% | 0.00% | 28.29% | 1.12% | 3.33% | 0.00% | 0.00% | 32.74% | | |
| | ht|Qwen2.5-32B-Instruct | rank2 | 63.98% | 0.00% | 30.69% | 1.37% | 3.96% | 0.00% | 0.00% | 36.02% | | |
| | ht|Qwen2.5-32B-Instruct | rank3 | 62.44% | 0.00% | 31.75% | 1.27% | 4.53% | 0.00% | 0.00% | 37.56% | | |
| | ht|Qwen2.5-32B-Instruct | rank4 | 61.85% | 0.00% | 32.15% | 1.26% | 4.75% | 0.00% | 0.00% | 38.15% | | |
| | ht|Qwen2.5-32B-Instruct | rank5 | 61.22% | 0.00% | 32.70% | 1.33% | 4.75% | 0.00% | 0.00% | 38.78% | | |
| | lcb|Qwen2.5-32B-Instruct | rank1 | 52.13% | 0.00% | 42.30% | 1.80% | 3.77% | 0.00% | 0.00% | 47.87% | | |
| | lcb|Qwen2.5-32B-Instruct | rank2 | 47.93% | 0.00% | 45.58% | 2.02% | 4.47% | 0.00% | 0.00% | 52.07% | | |
| | lcb|Qwen2.5-32B-Instruct | rank3 | 45.86% | 0.00% | 47.56% | 2.01% | 4.57% | 0.00% | 0.00% | 54.14% | | |
| | lcb|Qwen2.5-32B-Instruct | rank4 | 44.36% | 0.00% | 48.34% | 2.16% | 5.14% | 0.00% | 0.00% | 55.64% | | |
| | lcb|Qwen2.5-32B-Instruct | rank5 | 43.90% | 0.00% | 48.78% | 2.18% | 5.14% | 0.00% | 0.00% | 56.10% | | |
| | predo|Qwen2.5-32B-Instruct | rank1 | 79.52% | 0.00% | 18.71% | 0.92% | 0.85% | 0.00% | 0.00% | 20.48% | | |
| | predo|Qwen2.5-32B-Instruct | rank2 | 76.89% | 0.00% | 21.16% | 1.01% | 0.94% | 0.00% | 0.00% | 23.11% | | |
| | predo|Qwen2.5-32B-Instruct | rank3 | 75.92% | 0.00% | 22.01% | 1.04% | 1.03% | 0.00% | 0.00% | 24.08% | | |
| | predo|Qwen2.5-32B-Instruct | rank4 | 75.09% | 0.00% | 22.86% | 1.03% | 1.02% | 0.00% | 0.00% | 24.91% | | |
| | predo|Qwen2.5-32B-Instruct | rank5 | 74.72% | 0.00% | 23.07% | 1.04% | 1.17% | 0.00% | 0.00% | 25.28% | | |
| | algo|Qwen2.5-7B-Instruct | rank1 | 97.58% | 0.00% | 2.20% | 0.14% | 0.09% | 0.00% | 0.00% | 2.42% | | |
| | algo|Qwen2.5-7B-Instruct | rank2 | 97.36% | 0.00% | 2.39% | 0.13% | 0.11% | 0.00% | 0.00% | 2.64% | | |
| | algo|Qwen2.5-7B-Instruct | rank3 | 97.17% | 0.00% | 2.49% | 0.15% | 0.19% | 0.00% | 0.00% | 2.83% | | |
| | algo|Qwen2.5-7B-Instruct | rank4 | 97.13% | 0.00% | 2.61% | 0.15% | 0.11% | 0.00% | 0.00% | 2.87% | | |
| | algo|Qwen2.5-7B-Instruct | rank5 | 97.12% | 0.00% | 2.56% | 0.14% | 0.18% | 0.00% | 0.00% | 2.88% | | |
| | crux|Qwen2.5-7B-Instruct | rank1 | 82.03% | 0.00% | 15.54% | 0.94% | 1.49% | 0.00% | 0.00% | 17.97% | | |
| | crux|Qwen2.5-7B-Instruct | rank2 | 80.80% | 0.00% | 16.64% | 0.93% | 1.63% | 0.00% | 0.00% | 19.20% | | |
| | crux|Qwen2.5-7B-Instruct | rank3 | 80.58% | 0.00% | 16.85% | 0.92% | 1.65% | 0.00% | 0.00% | 19.42% | | |
| | crux|Qwen2.5-7B-Instruct | rank4 | 80.48% | 0.00% | 16.97% | 0.94% | 1.62% | 0.00% | 0.00% | 19.52% | | |
| | crux|Qwen2.5-7B-Instruct | rank5 | 80.43% | 0.00% | 17.01% | 0.94% | 1.63% | 0.00% | 0.00% | 19.57% | | |
| | ht|Qwen2.5-7B-Instruct | rank1 | 68.70% | 0.00% | 27.39% | 1.36% | 2.54% | 0.00% | 0.00% | 31.30% | | |
| | ht|Qwen2.5-7B-Instruct | rank2 | 64.71% | 0.00% | 30.96% | 1.34% | 3.00% | 0.00% | 0.00% | 35.29% | | |
| | ht|Qwen2.5-7B-Instruct | rank3 | 63.40% | 0.00% | 32.21% | 1.44% | 2.96% | 0.00% | 0.00% | 36.60% | | |
| | ht|Qwen2.5-7B-Instruct | rank4 | 62.34% | 0.00% | 33.00% | 1.42% | 3.23% | 0.00% | 0.00% | 37.66% | | |
| | ht|Qwen2.5-7B-Instruct | rank5 | 62.00% | 0.00% | 33.41% | 1.47% | 3.12% | 0.00% | 0.00% | 38.00% | | |
| | lcb|Qwen2.5-7B-Instruct | rank1 | 50.28% | 0.00% | 43.08% | 2.08% | 4.55% | 0.00% | 0.00% | 49.72% | | |
| | lcb|Qwen2.5-7B-Instruct | rank2 | 44.51% | 0.00% | 47.67% | 2.23% | 5.59% | 0.00% | 0.00% | 55.49% | | |
| | lcb|Qwen2.5-7B-Instruct | rank3 | 42.13% | 0.00% | 49.44% | 2.34% | 6.10% | 0.00% | 0.00% | 57.87% | | |
| | lcb|Qwen2.5-7B-Instruct | rank4 | 40.95% | 0.00% | 50.47% | 2.36% | 6.21% | 0.00% | 0.00% | 59.05% | | |
| | lcb|Qwen2.5-7B-Instruct | rank5 | 40.07% | 0.00% | 51.32% | 2.33% | 6.27% | 0.00% | 0.00% | 59.93% | | |
| | predo|Qwen2.5-7B-Instruct | rank1 | 98.81% | 0.00% | 1.02% | 0.06% | 0.10% | 0.00% | 0.00% | 1.19% | | |
| | predo|Qwen2.5-7B-Instruct | rank2 | 98.76% | 0.00% | 1.08% | 0.08% | 0.08% | 0.00% | 0.00% | 1.24% | | |
| | predo|Qwen2.5-7B-Instruct | rank3 | 98.61% | 0.00% | 1.27% | 0.05% | 0.08% | 0.00% | 0.00% | 1.39% | | |
| | predo|Qwen2.5-7B-Instruct | rank4 | 98.68% | 0.00% | 1.20% | 0.06% | 0.06% | 0.00% | 0.00% | 1.32% | | |
| | predo|Qwen2.5-7B-Instruct | rank5 | 98.63% | 0.00% | 1.25% | 0.07% | 0.04% | 0.00% | 0.00% | 1.37% | | |
| | algo|Qwen2.5-Coder-14B-Instruct | rank1 | 98.90% | 0.00% | 0.95% | 0.14% | 0.01% | 0.00% | 0.00% | 1.10% | | |
| | algo|Qwen2.5-Coder-14B-Instruct | rank2 | 98.86% | 0.00% | 0.99% | 0.14% | 0.01% | 0.00% | 0.00% | 1.14% | | |
| | algo|Qwen2.5-Coder-14B-Instruct | rank3 | 98.81% | 0.00% | 1.02% | 0.15% | 0.03% | 0.00% | 0.00% | 1.19% | | |
| | algo|Qwen2.5-Coder-14B-Instruct | rank4 | 98.79% | 0.00% | 1.03% | 0.15% | 0.03% | 0.00% | 0.00% | 1.21% | | |
| | algo|Qwen2.5-Coder-14B-Instruct | rank5 | 98.78% | 0.00% | 1.06% | 0.15% | 0.01% | 0.00% | 0.00% | 1.22% | | |
| | crux|Qwen2.5-Coder-14B-Instruct | rank1 | 81.26% | 0.00% | 16.44% | 1.05% | 1.25% | 0.00% | 0.00% | 18.74% | | |
| | crux|Qwen2.5-Coder-14B-Instruct | rank2 | 80.15% | 0.00% | 17.24% | 1.17% | 1.44% | 0.00% | 0.00% | 19.85% | | |
| | crux|Qwen2.5-Coder-14B-Instruct | rank3 | 80.00% | 0.00% | 17.29% | 1.15% | 1.56% | 0.00% | 0.00% | 20.00% | | |
| | crux|Qwen2.5-Coder-14B-Instruct | rank4 | 79.72% | 0.00% | 17.65% | 1.13% | 1.50% | 0.00% | 0.00% | 20.28% | | |
| | crux|Qwen2.5-Coder-14B-Instruct | rank5 | 79.64% | 0.00% | 17.64% | 1.15% | 1.56% | 0.00% | 0.00% | 20.36% | | |
| | ht|Qwen2.5-Coder-14B-Instruct | rank1 | 66.79% | 0.00% | 28.21% | 1.31% | 3.70% | 0.00% | 0.00% | 33.21% | | |
| | ht|Qwen2.5-Coder-14B-Instruct | rank2 | 63.87% | 0.00% | 30.78% | 1.30% | 4.05% | 0.00% | 0.00% | 36.13% | | |
| | ht|Qwen2.5-Coder-14B-Instruct | rank3 | 62.62% | 0.00% | 31.47% | 1.37% | 4.54% | 0.00% | 0.00% | 37.38% | | |
| | ht|Qwen2.5-Coder-14B-Instruct | rank4 | 61.85% | 0.00% | 32.02% | 1.33% | 4.81% | 0.00% | 0.00% | 38.15% | | |
| | ht|Qwen2.5-Coder-14B-Instruct | rank5 | 61.58% | 0.00% | 32.43% | 1.34% | 4.66% | 0.00% | 0.00% | 38.42% | | |
| | lcb|Qwen2.5-Coder-14B-Instruct | rank1 | 43.35% | 0.00% | 48.58% | 2.07% | 6.00% | 0.00% | 0.00% | 56.65% | | |
| | lcb|Qwen2.5-Coder-14B-Instruct | rank2 | 36.47% | 0.00% | 53.92% | 2.36% | 7.26% | 0.00% | 0.00% | 63.53% | | |
| | lcb|Qwen2.5-Coder-14B-Instruct | rank3 | 33.93% | 0.00% | 56.01% | 2.37% | 7.69% | 0.00% | 0.00% | 66.07% | | |
| | lcb|Qwen2.5-Coder-14B-Instruct | rank4 | 32.25% | 0.00% | 57.35% | 2.45% | 7.95% | 0.00% | 0.00% | 67.75% | | |
| | lcb|Qwen2.5-Coder-14B-Instruct | rank5 | 31.34% | 0.00% | 57.45% | 2.42% | 8.79% | 0.00% | 0.00% | 68.66% | | |
| | predo|Qwen2.5-Coder-14B-Instruct | rank1 | 95.37% | 0.00% | 4.21% | 0.29% | 0.14% | 0.00% | 0.00% | 4.63% | | |
| | predo|Qwen2.5-Coder-14B-Instruct | rank2 | 95.15% | 0.00% | 4.43% | 0.29% | 0.14% | 0.00% | 0.00% | 4.85% | | |
| | predo|Qwen2.5-Coder-14B-Instruct | rank3 | 95.09% | 0.00% | 4.43% | 0.34% | 0.15% | 0.00% | 0.00% | 4.91% | | |
| | predo|Qwen2.5-Coder-14B-Instruct | rank4 | 94.93% | 0.00% | 4.61% | 0.31% | 0.15% | 0.00% | 0.00% | 5.07% | | |
| | predo|Qwen2.5-Coder-14B-Instruct | rank5 | 94.98% | 0.00% | 4.56% | 0.32% | 0.15% | 0.00% | 0.00% | 5.02% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank1 | 78.36% | 0.32% | 9.96% | 0.87% | 1.50% | 0.00% | 0.00% | 12.64% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank2 | 78.36% | 0.32% | 9.92% | 0.89% | 1.51% | 0.00% | 0.00% | 12.64% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank3 | 78.36% | 0.32% | 9.95% | 0.87% | 1.50% | 0.00% | 0.00% | 12.64% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank4 | 78.36% | 0.32% | 9.75% | 1.07% | 1.50% | 0.00% | 0.00% | 12.64% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank5 | 78.36% | 0.32% | 9.93% | 0.86% | 1.54% | 0.00% | 0.00% | 12.64% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank1 | 90.76% | 0.00% | 8.39% | 0.47% | 0.38% | 0.00% | 0.00% | 9.24% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank2 | 90.49% | 0.00% | 8.70% | 0.44% | 0.37% | 0.00% | 0.00% | 9.51% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank3 | 90.25% | 0.00% | 8.87% | 0.48% | 0.39% | 0.00% | 0.00% | 9.75% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank4 | 90.16% | 0.00% | 8.96% | 0.46% | 0.42% | 0.00% | 0.00% | 9.84% | | |
| | algo|Qwen2.5-Coder-32B-Instruct | rank5 | 90.04% | 0.00% | 9.11% | 0.48% | 0.38% | 0.00% | 0.00% | 9.96% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank1 | 66.73% | 0.32% | 19.64% | 0.91% | 3.40% | 0.00% | 0.00% | 24.27% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank2 | 66.35% | 0.32% | 19.94% | 1.05% | 3.35% | 0.00% | 0.00% | 24.65% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank3 | 66.27% | 0.32% | 19.94% | 1.07% | 3.40% | 0.00% | 0.00% | 24.73% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank4 | 66.27% | 0.32% | 20.18% | 0.91% | 3.32% | 0.00% | 0.00% | 24.73% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank5 | 66.27% | 0.32% | 20.14% | 0.90% | 3.37% | 0.00% | 0.00% | 24.73% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank1 | 81.22% | 0.00% | 16.31% | 0.86% | 1.61% | 0.00% | 0.00% | 18.78% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank2 | 79.66% | 0.00% | 17.63% | 0.93% | 1.77% | 0.00% | 0.00% | 20.34% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank3 | 78.56% | 0.00% | 18.52% | 0.99% | 1.93% | 0.00% | 0.00% | 21.44% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank4 | 78.31% | 0.00% | 18.76% | 0.93% | 2.00% | 0.00% | 0.00% | 21.69% | | |
| | crux|Qwen2.5-Coder-32B-Instruct | rank5 | 77.95% | 0.00% | 19.02% | 0.95% | 2.09% | 0.00% | 0.00% | 22.05% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank1 | 36.21% | 0.32% | 47.48% | 1.51% | 5.49% | 0.00% | 0.00% | 54.79% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank2 | 35.53% | 0.32% | 48.19% | 1.60% | 5.37% | 0.00% | 0.00% | 55.47% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank3 | 36.02% | 0.32% | 47.41% | 1.62% | 5.63% | 0.00% | 0.00% | 54.98% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank4 | 35.79% | 0.32% | 48.01% | 1.36% | 5.52% | 0.00% | 0.00% | 55.21% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank5 | 35.46% | 0.32% | 48.46% | 1.38% | 5.39% | 0.00% | 0.00% | 55.54% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank1 | 54.26% | 0.00% | 40.09% | 1.73% | 3.93% | 0.00% | 0.00% | 45.74% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank2 | 47.83% | 0.00% | 45.28% | 1.97% | 4.92% | 0.00% | 0.00% | 52.17% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank3 | 45.41% | 0.00% | 47.29% | 1.89% | 5.41% | 0.00% | 0.00% | 54.59% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank4 | 43.91% | 0.00% | 48.39% | 1.95% | 5.76% | 0.00% | 0.00% | 56.09% | | |
| | ht|Qwen2.5-Coder-32B-Instruct | rank5 | 43.19% | 0.00% | 49.27% | 1.93% | 5.60% | 0.00% | 0.00% | 56.81% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank1 | 20.79% | 0.32% | 63.50% | 1.93% | 4.47% | 0.00% | 0.00% | 70.21% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank2 | 19.81% | 0.32% | 64.31% | 1.95% | 4.61% | 0.00% | 0.00% | 71.19% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank3 | 19.36% | 0.32% | 64.77% | 1.97% | 4.58% | 0.00% | 0.00% | 71.64% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank4 | 19.26% | 0.32% | 64.92% | 2.00% | 4.50% | 0.00% | 0.00% | 71.74% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank5 | 19.21% | 0.32% | 65.02% | 1.96% | 4.50% | 0.00% | 0.00% | 71.79% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank1 | 39.20% | 0.00% | 52.42% | 1.96% | 6.41% | 0.00% | 0.00% | 60.80% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank2 | 31.72% | 0.00% | 58.56% | 2.18% | 7.54% | 0.00% | 0.00% | 68.28% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank3 | 28.86% | 0.00% | 60.56% | 2.18% | 8.40% | 0.00% | 0.00% | 71.14% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank4 | 27.35% | 0.00% | 62.15% | 2.16% | 8.34% | 0.00% | 0.00% | 72.65% | | |
| | lcb|Qwen2.5-Coder-32B-Instruct | rank5 | 26.10% | 0.00% | 62.66% | 2.42% | 8.82% | 0.00% | 0.00% | 73.90% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank1 | 63.34% | 0.32% | 21.50% | 1.28% | 4.58% | 0.00% | 0.00% | 27.66% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank2 | 62.85% | 0.32% | 21.86% | 1.34% | 4.62% | 0.00% | 0.00% | 28.15% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank3 | 62.41% | 0.32% | 22.42% | 1.30% | 4.56% | 0.00% | 0.00% | 28.59% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank4 | 62.13% | 0.32% | 22.62% | 1.34% | 4.59% | 0.00% | 0.00% | 28.87% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank5 | 62.40% | 0.32% | 22.50% | 1.38% | 4.41% | 0.00% | 0.00% | 28.60% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank1 | 78.79% | 0.00% | 19.33% | 0.96% | 0.91% | 0.00% | 0.00% | 21.21% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank2 | 76.48% | 0.00% | 21.19% | 1.15% | 1.18% | 0.00% | 0.00% | 23.52% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank3 | 74.96% | 0.00% | 22.66% | 1.08% | 1.31% | 0.00% | 0.00% | 25.04% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank4 | 74.32% | 0.00% | 23.15% | 1.13% | 1.39% | 0.00% | 0.00% | 25.68% | | |
| | predo|Qwen2.5-Coder-32B-Instruct | rank5 | 73.76% | 0.00% | 23.66% | 1.09% | 1.49% | 0.00% | 0.00% | 26.24% | | |
| | algo|Qwen2.5-Coder-7B-Instruct | rank1 | 99.22% | 0.00% | 0.72% | 0.04% | 0.03% | 0.00% | 0.00% | 0.78% | | |
| | algo|Qwen2.5-Coder-7B-Instruct | rank2 | 99.17% | 0.00% | 0.74% | 0.05% | 0.04% | 0.00% | 0.00% | 0.83% | | |
| | algo|Qwen2.5-Coder-7B-Instruct | rank3 | 99.17% | 0.00% | 0.73% | 0.07% | 0.02% | 0.00% | 0.00% | 0.83% | | |
| | algo|Qwen2.5-Coder-7B-Instruct | rank4 | 99.12% | 0.00% | 0.79% | 0.05% | 0.04% | 0.00% | 0.00% | 0.88% | | |
| | algo|Qwen2.5-Coder-7B-Instruct | rank5 | 99.11% | 0.00% | 0.79% | 0.07% | 0.02% | 0.00% | 0.00% | 0.89% | | |
| | crux|Qwen2.5-Coder-7B-Instruct | rank1 | 80.54% | 0.00% | 17.16% | 1.22% | 1.09% | 0.00% | 0.00% | 19.46% | | |
| | crux|Qwen2.5-Coder-7B-Instruct | rank2 | 79.83% | 0.00% | 17.69% | 1.24% | 1.23% | 0.00% | 0.00% | 20.17% | | |
| | crux|Qwen2.5-Coder-7B-Instruct | rank3 | 79.55% | 0.00% | 17.92% | 1.25% | 1.28% | 0.00% | 0.00% | 20.45% | | |
| | crux|Qwen2.5-Coder-7B-Instruct | rank4 | 79.53% | 0.00% | 17.95% | 1.24% | 1.28% | 0.00% | 0.00% | 20.47% | | |
| | crux|Qwen2.5-Coder-7B-Instruct | rank5 | 79.58% | 0.00% | 17.92% | 1.19% | 1.31% | 0.00% | 0.00% | 20.42% | | |
| | ht|Qwen2.5-Coder-7B-Instruct | rank1 | 76.61% | 0.00% | 20.24% | 0.90% | 2.25% | 0.00% | 0.00% | 23.39% | | |
| | ht|Qwen2.5-Coder-7B-Instruct | rank2 | 74.24% | 0.00% | 22.39% | 1.02% | 2.35% | 0.00% | 0.00% | 25.76% | | |
| | ht|Qwen2.5-Coder-7B-Instruct | rank3 | 73.52% | 0.00% | 22.91% | 1.01% | 2.56% | 0.00% | 0.00% | 26.48% | | |
| | ht|Qwen2.5-Coder-7B-Instruct | rank4 | 73.38% | 0.00% | 22.88% | 0.91% | 2.83% | 0.00% | 0.00% | 26.62% | | |
| | ht|Qwen2.5-Coder-7B-Instruct | rank5 | 73.24% | 0.00% | 23.08% | 0.98% | 2.70% | 0.00% | 0.00% | 26.76% | | |
| | lcb|Qwen2.5-Coder-7B-Instruct | rank1 | 67.30% | 0.00% | 28.61% | 1.46% | 2.63% | 0.00% | 0.00% | 32.70% | | |
| | lcb|Qwen2.5-Coder-7B-Instruct | rank2 | 63.58% | 0.00% | 32.15% | 1.47% | 2.81% | 0.00% | 0.00% | 36.42% | | |
| | lcb|Qwen2.5-Coder-7B-Instruct | rank3 | 62.65% | 0.00% | 32.91% | 1.52% | 2.92% | 0.00% | 0.00% | 37.35% | | |
| | lcb|Qwen2.5-Coder-7B-Instruct | rank4 | 61.99% | 0.00% | 33.41% | 1.66% | 2.94% | 0.00% | 0.00% | 38.01% | | |
| | lcb|Qwen2.5-Coder-7B-Instruct | rank5 | 61.45% | 0.00% | 33.96% | 1.49% | 3.10% | 0.00% | 0.00% | 38.55% | | |
| | predo|Qwen2.5-Coder-7B-Instruct | rank1 | 88.47% | 0.00% | 10.38% | 0.56% | 0.59% | 0.00% | 0.00% | 11.53% | | |
| | predo|Qwen2.5-Coder-7B-Instruct | rank2 | 87.30% | 0.00% | 11.40% | 0.62% | 0.68% | 0.00% | 0.00% | 12.70% | | |
| | predo|Qwen2.5-Coder-7B-Instruct | rank3 | 87.19% | 0.00% | 11.45% | 0.55% | 0.80% | 0.00% | 0.00% | 12.81% | | |
| | predo|Qwen2.5-Coder-7B-Instruct | rank4 | 86.68% | 0.00% | 11.88% | 0.55% | 0.89% | 0.00% | 0.00% | 13.32% | | |
| | predo|Qwen2.5-Coder-7B-Instruct | rank5 | 86.46% | 0.00% | 12.10% | 0.60% | 0.83% | 0.00% | 0.00% | 13.54% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank1 | 38.31% | 0.35% | 54.45% | 1.76% | 5.14% | 0.00% | 0.00% | 61.69% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank2 | 37.58% | 0.35% | 55.01% | 1.76% | 5.30% | 0.00% | 0.00% | 62.42% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank3 | 37.24% | 0.35% | 55.32% | 1.76% | 5.34% | 0.00% | 0.00% | 62.76% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank4 | 37.21% | 0.35% | 55.31% | 1.79% | 5.35% | 0.00% | 0.00% | 62.79% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank5 | 37.22% | 0.35% | 55.50% | 1.86% | 5.07% | 0.00% | 0.00% | 62.78% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank1 | 57.71% | 0.00% | 39.46% | 1.39% | 1.45% | 0.00% | 0.00% | 42.29% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank2 | 52.35% | 0.00% | 44.48% | 1.45% | 1.71% | 0.00% | 0.00% | 47.65% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank3 | 49.78% | 0.00% | 46.99% | 1.47% | 1.77% | 0.00% | 0.00% | 50.22% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank4 | 48.47% | 0.00% | 48.00% | 1.56% | 1.96% | 0.00% | 0.00% | 51.53% | | |
| | algo|claude-sonnet-4-20250514-thinking | rank5 | 47.25% | 0.00% | 49.18% | 1.61% | 1.96% | 0.00% | 0.00% | 52.75% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank1 | 43.26% | 0.35% | 46.62% | 1.77% | 8.01% | 0.00% | 0.00% | 56.74% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank2 | 41.29% | 0.35% | 48.32% | 1.66% | 8.38% | 0.00% | 0.00% | 58.71% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank3 | 40.30% | 0.35% | 49.20% | 1.82% | 8.33% | 0.00% | 0.00% | 59.70% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank4 | 40.23% | 0.35% | 49.23% | 1.74% | 8.46% | 0.00% | 0.00% | 59.77% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank5 | 40.04% | 0.35% | 49.23% | 1.70% | 8.69% | 0.00% | 0.00% | 59.96% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank1 | 65.99% | 0.00% | 31.23% | 1.46% | 1.32% | 0.00% | 0.00% | 34.01% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank2 | 60.00% | 0.00% | 37.10% | 1.40% | 1.50% | 0.00% | 0.00% | 40.00% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank3 | 57.12% | 0.00% | 39.77% | 1.67% | 1.44% | 0.00% | 0.00% | 42.88% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank4 | 54.94% | 0.00% | 41.84% | 1.60% | 1.62% | 0.00% | 0.00% | 45.06% | | |
| | crux|claude-sonnet-4-20250514-thinking | rank5 | 53.91% | 0.00% | 42.75% | 1.64% | 1.70% | 0.00% | 0.00% | 46.09% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank1 | 18.07% | 0.35% | 73.16% | 2.21% | 6.21% | 0.00% | 0.00% | 81.93% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank2 | 17.51% | 0.35% | 73.59% | 2.37% | 6.19% | 0.00% | 0.00% | 82.49% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank3 | 17.03% | 0.35% | 74.12% | 2.32% | 6.18% | 0.00% | 0.00% | 82.97% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank4 | 17.03% | 0.35% | 74.33% | 2.26% | 6.03% | 0.00% | 0.00% | 82.97% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank5 | 17.03% | 0.35% | 74.20% | 2.16% | 6.27% | 0.00% | 0.00% | 82.97% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank1 | 37.34% | 0.00% | 55.49% | 2.06% | 5.11% | 0.00% | 0.00% | 62.66% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank2 | 29.42% | 0.00% | 62.67% | 2.17% | 5.74% | 0.00% | 0.00% | 70.58% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank3 | 25.99% | 0.00% | 65.19% | 2.10% | 6.73% | 0.00% | 0.00% | 74.01% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank4 | 24.15% | 0.00% | 66.90% | 2.27% | 6.68% | 0.00% | 0.00% | 75.85% | | |
| | ht|claude-sonnet-4-20250514-thinking | rank5 | 22.77% | 0.00% | 68.15% | 2.29% | 6.79% | 0.00% | 0.00% | 77.23% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank1 | 13.29% | 0.35% | 77.72% | 2.32% | 6.32% | 0.00% | 0.00% | 86.71% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank2 | 11.57% | 0.35% | 79.99% | 2.37% | 5.73% | 0.00% | 0.00% | 88.43% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank3 | 11.58% | 0.35% | 79.44% | 2.26% | 6.38% | 0.00% | 0.00% | 88.42% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank4 | 11.25% | 0.35% | 79.88% | 2.32% | 6.20% | 0.00% | 0.00% | 88.75% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank5 | 10.97% | 0.35% | 79.81% | 2.35% | 6.52% | 0.00% | 0.00% | 89.03% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank1 | 38.09% | 0.00% | 57.20% | 1.84% | 2.87% | 0.00% | 0.00% | 61.91% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank2 | 28.61% | 0.00% | 66.47% | 2.16% | 2.76% | 0.00% | 0.00% | 71.39% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank3 | 24.98% | 0.00% | 69.48% | 2.12% | 3.42% | 0.00% | 0.00% | 75.02% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank4 | 22.65% | 0.00% | 71.39% | 2.20% | 3.75% | 0.00% | 0.00% | 77.35% | | |
| | lcb|claude-sonnet-4-20250514-thinking | rank5 | 20.68% | 0.00% | 73.26% | 2.46% | 3.60% | 0.00% | 0.00% | 79.32% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank1 | 76.09% | 0.35% | 22.04% | 0.64% | 0.89% | 0.00% | 0.00% | 23.91% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank2 | 75.17% | 0.35% | 22.89% | 0.65% | 0.94% | 0.00% | 0.00% | 24.83% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank3 | 75.37% | 0.35% | 22.79% | 0.63% | 0.86% | 0.00% | 0.00% | 24.63% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank4 | 75.12% | 0.35% | 22.99% | 0.67% | 0.87% | 0.00% | 0.00% | 24.88% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank5 | 75.12% | 0.35% | 23.03% | 0.64% | 0.87% | 0.00% | 0.00% | 24.88% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank1 | 85.93% | 0.00% | 12.91% | 0.50% | 0.66% | 0.00% | 0.00% | 14.07% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank2 | 83.63% | 0.00% | 15.08% | 0.51% | 0.78% | 0.00% | 0.00% | 16.37% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank3 | 82.62% | 0.00% | 15.94% | 0.58% | 0.85% | 0.00% | 0.00% | 17.38% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank4 | 81.87% | 0.00% | 16.68% | 0.51% | 0.95% | 0.00% | 0.00% | 18.13% | | |
| | predo|claude-sonnet-4-20250514-thinking | rank5 | 81.37% | 0.00% | 17.17% | 0.55% | 0.91% | 0.00% | 0.00% | 18.63% | | |
| | algo|claude4 | rank1 | 67.42% | 0.00% | 28.53% | 1.47% | 2.58% | 0.00% | 0.00% | 32.58% | | |
| | algo|claude4 | rank2 | 64.64% | 0.00% | 30.91% | 1.56% | 2.89% | 0.00% | 0.00% | 35.36% | | |
| | algo|claude4 | rank3 | 63.15% | 0.00% | 32.20% | 1.58% | 3.07% | 0.00% | 0.00% | 36.85% | | |
| | algo|claude4 | rank4 | 62.63% | 0.00% | 32.87% | 1.43% | 3.07% | 0.00% | 0.00% | 37.37% | | |
| | algo|claude4 | rank5 | 62.44% | 0.00% | 33.00% | 1.47% | 3.09% | 0.00% | 0.00% | 37.56% | | |
| | crux|claude4 | rank1 | 76.02% | 0.00% | 21.45% | 1.09% | 1.44% | 0.00% | 0.00% | 23.98% | | |
| | crux|claude4 | rank2 | 73.55% | 0.00% | 23.81% | 1.10% | 1.54% | 0.00% | 0.00% | 26.45% | | |
| | crux|claude4 | rank3 | 72.91% | 0.00% | 24.18% | 1.24% | 1.67% | 0.00% | 0.00% | 27.09% | | |
| | crux|claude4 | rank4 | 72.50% | 0.00% | 24.71% | 1.17% | 1.62% | 0.00% | 0.00% | 27.50% | | |
| | crux|claude4 | rank5 | 72.40% | 0.00% | 24.79% | 1.15% | 1.66% | 0.00% | 0.00% | 27.60% | | |
| | ht|claude4 | rank1 | 36.07% | 0.00% | 54.36% | 2.53% | 7.04% | 0.00% | 0.00% | 63.93% | | |
| | ht|claude4 | rank2 | 29.78% | 0.00% | 59.38% | 2.53% | 8.32% | 0.00% | 0.00% | 70.22% | | |
| | ht|claude4 | rank3 | 27.03% | 0.00% | 61.06% | 2.44% | 9.48% | 0.00% | 0.00% | 72.97% | | |
| | ht|claude4 | rank4 | 26.00% | 0.00% | 62.24% | 2.69% | 9.07% | 0.00% | 0.00% | 74.00% | | |
| | ht|claude4 | rank5 | 25.22% | 0.00% | 62.70% | 2.59% | 9.49% | 0.00% | 0.00% | 74.78% | | |
| | lcb|claude4 | rank1 | 39.54% | 0.00% | 54.12% | 2.20% | 4.15% | 0.00% | 0.00% | 60.46% | | |
| | lcb|claude4 | rank2 | 31.10% | 0.00% | 61.16% | 2.39% | 5.35% | 0.00% | 0.00% | 68.90% | | |
| | lcb|claude4 | rank3 | 28.23% | 0.00% | 63.67% | 2.46% | 5.63% | 0.00% | 0.00% | 71.77% | | |
| | lcb|claude4 | rank4 | 26.06% | 0.00% | 65.40% | 2.52% | 6.02% | 0.00% | 0.00% | 73.94% | | |
| | lcb|claude4 | rank5 | 25.26% | 0.00% | 66.52% | 2.47% | 5.75% | 0.00% | 0.00% | 74.74% | | |
| | predo|claude4 | rank1 | 63.85% | 0.00% | 32.44% | 1.32% | 2.39% | 0.00% | 0.00% | 36.15% | | |
| | predo|claude4 | rank2 | 58.52% | 0.00% | 37.33% | 1.43% | 2.72% | 0.00% | 0.00% | 41.48% | | |
| | predo|claude4 | rank3 | 56.52% | 0.00% | 38.79% | 1.40% | 3.30% | 0.00% | 0.00% | 43.48% | | |
| | predo|claude4 | rank4 | 54.26% | 0.00% | 40.43% | 1.41% | 3.90% | 0.00% | 0.00% | 45.74% | | |
| | predo|claude4 | rank5 | 53.19% | 0.00% | 41.55% | 1.53% | 3.73% | 0.00% | 0.00% | 46.81% | | |
| | algo|deepseek-v3 | rank1 | 67.03% | 0.00% | 30.26% | 1.25% | 1.46% | 0.00% | 0.00% | 32.97% | | |
| | algo|deepseek-v3 | rank2 | 63.46% | 0.00% | 33.45% | 1.39% | 1.70% | 0.00% | 0.00% | 36.54% | | |
| | algo|deepseek-v3 | rank3 | 62.45% | 0.00% | 34.32% | 1.30% | 1.93% | 0.00% | 0.00% | 37.55% | | |
| | algo|deepseek-v3 | rank4 | 61.41% | 0.00% | 35.41% | 1.36% | 1.82% | 0.00% | 0.00% | 38.59% | | |
| | algo|deepseek-v3 | rank5 | 60.77% | 0.00% | 36.04% | 1.33% | 1.87% | 0.00% | 0.00% | 39.23% | | |
| | crux|deepseek-v3 | rank1 | 83.04% | 0.00% | 15.57% | 0.86% | 0.54% | 0.00% | 0.00% | 16.96% | | |
| | crux|deepseek-v3 | rank2 | 81.36% | 0.00% | 17.21% | 0.89% | 0.54% | 0.00% | 0.00% | 18.64% | | |
| | crux|deepseek-v3 | rank3 | 80.62% | 0.00% | 17.87% | 0.91% | 0.61% | 0.00% | 0.00% | 19.38% | | |
| | crux|deepseek-v3 | rank4 | 80.07% | 0.00% | 18.34% | 0.95% | 0.64% | 0.00% | 0.00% | 19.93% | | |
| | crux|deepseek-v3 | rank5 | 79.92% | 0.00% | 18.55% | 0.93% | 0.59% | 0.00% | 0.00% | 20.08% | | |
| | ht|deepseek-v3 | rank1 | 100.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | | |
| | ht|deepseek-v3 | rank2 | 100.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | | |
| | ht|deepseek-v3 | rank3 | 100.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | | |
| | ht|deepseek-v3 | rank4 | 100.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | | |
| | ht|deepseek-v3 | rank5 | 100.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | 0.00% | | |
| | lcb|deepseek-v3 | rank1 | 40.31% | 0.00% | 56.26% | 1.96% | 1.46% | 0.00% | 0.00% | 59.69% | | |
| | lcb|deepseek-v3 | rank2 | 33.45% | 0.00% | 62.71% | 2.16% | 1.68% | 0.00% | 0.00% | 66.55% | | |
| | lcb|deepseek-v3 | rank3 | 30.35% | 0.00% | 65.52% | 2.25% | 1.88% | 0.00% | 0.00% | 69.65% | | |
| | lcb|deepseek-v3 | rank4 | 27.98% | 0.00% | 67.78% | 2.34% | 1.90% | 0.00% | 0.00% | 72.02% | | |
| | lcb|deepseek-v3 | rank5 | 26.82% | 0.00% | 69.10% | 2.33% | 1.74% | 0.00% | 0.00% | 73.18% | | |
| | predo|deepseek-v3 | rank1 | 88.27% | 0.00% | 11.07% | 0.32% | 0.34% | 0.00% | 0.00% | 11.73% | | |
| | predo|deepseek-v3 | rank2 | 86.51% | 0.00% | 12.65% | 0.40% | 0.44% | 0.00% | 0.00% | 13.49% | | |
| | predo|deepseek-v3 | rank3 | 86.06% | 0.00% | 13.01% | 0.44% | 0.49% | 0.00% | 0.00% | 13.94% | | |
| | predo|deepseek-v3 | rank4 | 85.35% | 0.00% | 13.70% | 0.41% | 0.53% | 0.00% | 0.00% | 14.65% | | |
| | predo|deepseek-v3 | rank5 | 85.11% | 0.00% | 13.89% | 0.41% | 0.58% | 0.00% | 0.00% | 14.89% | | |
| | algo|gpt-4o | rank1 | 75.86% | 0.00% | 22.16% | 1.15% | 0.82% | 0.00% | 0.00% | 24.14% | | |
| | algo|gpt-4o | rank2 | 73.19% | 0.00% | 24.77% | 1.23% | 0.80% | 0.00% | 0.00% | 26.81% | | |
| | algo|gpt-4o | rank3 | 71.95% | 0.00% | 25.88% | 1.17% | 1.00% | 0.00% | 0.00% | 28.05% | | |
| | algo|gpt-4o | rank4 | 71.48% | 0.00% | 26.22% | 1.22% | 1.08% | 0.00% | 0.00% | 28.52% | | |
| | algo|gpt-4o | rank5 | 70.84% | 0.00% | 26.90% | 1.23% | 1.02% | 0.00% | 0.00% | 29.16% | | |
| | crux|gpt-4o | rank1 | 71.26% | 0.00% | 25.79% | 1.40% | 1.55% | 0.00% | 0.00% | 28.74% | | |
| | crux|gpt-4o | rank2 | 66.21% | 0.00% | 30.28% | 1.56% | 1.95% | 0.00% | 0.00% | 33.79% | | |
| | crux|gpt-4o | rank3 | 64.03% | 0.00% | 32.26% | 1.66% | 2.05% | 0.00% | 0.00% | 35.97% | | |
| | crux|gpt-4o | rank4 | 62.49% | 0.00% | 33.56% | 1.66% | 2.28% | 0.00% | 0.00% | 37.51% | | |
| | crux|gpt-4o | rank5 | 61.60% | 0.00% | 34.52% | 1.67% | 2.21% | 0.00% | 0.00% | 38.40% | | |
| | ht|gpt-4o | rank1 | 53.09% | 0.00% | 42.23% | 1.94% | 2.74% | 0.00% | 0.00% | 46.91% | | |
| | ht|gpt-4o | rank2 | 48.77% | 0.00% | 45.86% | 1.99% | 3.38% | 0.00% | 0.00% | 51.23% | | |
| | ht|gpt-4o | rank3 | 45.93% | 0.00% | 48.49% | 2.08% | 3.50% | 0.00% | 0.00% | 54.07% | | |
| | ht|gpt-4o | rank4 | 44.83% | 0.00% | 49.04% | 2.18% | 3.95% | 0.00% | 0.00% | 55.17% | | |
| | ht|gpt-4o | rank5 | 44.31% | 0.00% | 49.88% | 2.07% | 3.74% | 0.00% | 0.00% | 55.69% | | |
| | lcb|gpt-4o | rank1 | 45.90% | 0.00% | 50.26% | 2.14% | 1.71% | 0.00% | 0.00% | 54.10% | | |
| | lcb|gpt-4o | rank2 | 39.46% | 0.00% | 56.44% | 2.46% | 1.64% | 0.00% | 0.00% | 60.54% | | |
| | lcb|gpt-4o | rank3 | 37.03% | 0.00% | 58.66% | 2.50% | 1.81% | 0.00% | 0.00% | 62.97% | | |
| | lcb|gpt-4o | rank4 | 35.32% | 0.00% | 60.28% | 2.58% | 1.82% | 0.00% | 0.00% | 64.68% | | |
| | lcb|gpt-4o | rank5 | 34.75% | 0.00% | 60.87% | 2.50% | 1.87% | 0.00% | 0.00% | 65.25% | | |
| | predo|gpt-4o | rank1 | 73.00% | 0.00% | 24.59% | 1.01% | 1.40% | 0.00% | 0.00% | 27.00% | | |
| | predo|gpt-4o | rank2 | 69.45% | 0.00% | 27.71% | 1.19% | 1.64% | 0.00% | 0.00% | 30.55% | | |
| | predo|gpt-4o | rank3 | 67.63% | 0.00% | 29.35% | 1.21% | 1.82% | 0.00% | 0.00% | 32.37% | | |
| | predo|gpt-4o | rank4 | 66.60% | 0.00% | 30.32% | 1.15% | 1.93% | 0.00% | 0.00% | 33.40% | | |
| | predo|gpt-4o | rank5 | 66.03% | 0.00% | 30.86% | 1.18% | 1.93% | 0.00% | 0.00% | 33.97% | | |
| | algo|qwen-coder-plus | rank1 | 78.53% | 0.00% | 19.94% | 0.73% | 0.80% | 0.00% | 0.00% | 21.47% | | |
| | algo|qwen-coder-plus | rank2 | 76.18% | 0.00% | 22.00% | 0.76% | 1.06% | 0.00% | 0.00% | 23.82% | | |
| | algo|qwen-coder-plus | rank3 | 75.00% | 0.00% | 23.15% | 0.79% | 1.06% | 0.00% | 0.00% | 25.00% | | |
| | algo|qwen-coder-plus | rank4 | 74.47% | 0.00% | 23.57% | 0.78% | 1.18% | 0.00% | 0.00% | 25.53% | | |
| | algo|qwen-coder-plus | rank5 | 74.09% | 0.00% | 23.80% | 0.91% | 1.19% | 0.00% | 0.00% | 25.91% | | |
| | crux|qwen-coder-plus | rank1 | 67.59% | 0.00% | 28.96% | 1.69% | 1.77% | 0.00% | 0.00% | 32.41% | | |
| | crux|qwen-coder-plus | rank2 | 63.63% | 0.00% | 32.47% | 1.80% | 2.10% | 0.00% | 0.00% | 36.37% | | |
| | crux|qwen-coder-plus | rank3 | 62.36% | 0.00% | 33.56% | 1.83% | 2.25% | 0.00% | 0.00% | 37.64% | | |
| | crux|qwen-coder-plus | rank4 | 61.46% | 0.00% | 34.36% | 1.86% | 2.32% | 0.00% | 0.00% | 38.54% | | |
| | crux|qwen-coder-plus | rank5 | 61.25% | 0.00% | 34.67% | 1.78% | 2.29% | 0.00% | 0.00% | 38.75% | | |
| | ht|qwen-coder-plus | rank1 | 44.75% | 0.00% | 48.47% | 2.01% | 4.78% | 0.00% | 0.00% | 55.25% | | |
| | ht|qwen-coder-plus | rank2 | 39.20% | 0.00% | 52.50% | 2.23% | 6.07% | 0.00% | 0.00% | 60.80% | | |
| | ht|qwen-coder-plus | rank3 | 36.77% | 0.00% | 54.97% | 2.21% | 6.06% | 0.00% | 0.00% | 63.23% | | |
| | ht|qwen-coder-plus | rank4 | 35.73% | 0.00% | 55.47% | 2.19% | 6.61% | 0.00% | 0.00% | 64.27% | | |
| | ht|qwen-coder-plus | rank5 | 35.20% | 0.00% | 55.74% | 2.40% | 6.66% | 0.00% | 0.00% | 64.80% | | |
| | lcb|qwen-coder-plus | rank1 | 38.06% | 0.00% | 55.88% | 2.15% | 3.92% | 0.00% | 0.00% | 61.94% | | |
| | lcb|qwen-coder-plus | rank2 | 29.77% | 0.00% | 62.60% | 2.42% | 5.21% | 0.00% | 0.00% | 70.23% | | |
| | lcb|qwen-coder-plus | rank3 | 26.62% | 0.00% | 65.39% | 2.56% | 5.43% | 0.00% | 0.00% | 73.38% | | |
| | lcb|qwen-coder-plus | rank4 | 24.90% | 0.00% | 67.06% | 2.41% | 5.63% | 0.00% | 0.00% | 75.10% | | |
| | lcb|qwen-coder-plus | rank5 | 23.76% | 0.00% | 67.85% | 2.51% | 5.89% | 0.00% | 0.00% | 76.24% | | |
| | predo|qwen-coder-plus | rank1 | 67.57% | 0.00% | 29.25% | 1.36% | 1.83% | 0.00% | 0.00% | 32.43% | | |
| | predo|qwen-coder-plus | rank2 | 62.88% | 0.00% | 33.41% | 1.44% | 2.27% | 0.00% | 0.00% | 37.12% | | |
| | predo|qwen-coder-plus | rank3 | 60.94% | 0.00% | 35.29% | 1.51% | 2.26% | 0.00% | 0.00% | 39.06% | | |
| | predo|qwen-coder-plus | rank4 | 60.13% | 0.00% | 36.01% | 1.46% | 2.41% | 0.00% | 0.00% | 39.87% | | |
| | predo|qwen-coder-plus | rank5 | 59.17% | 0.00% | 36.75% | 1.52% | 2.56% | 0.00% | 0.00% | 40.83% | | |
| | algo|qwen3-nothink | rank1 | 81.44% | 0.00% | 16.64% | 0.88% | 1.04% | 0.00% | 0.00% | 18.56% | | |
| | algo|qwen3-nothink | rank2 | 79.83% | 0.00% | 18.03% | 0.89% | 1.25% | 0.00% | 0.00% | 20.17% | | |
| | algo|qwen3-nothink | rank3 | 79.22% | 0.00% | 18.53% | 0.92% | 1.33% | 0.00% | 0.00% | 20.78% | | |
| | algo|qwen3-nothink | rank4 | 78.90% | 0.00% | 18.74% | 0.92% | 1.43% | 0.00% | 0.00% | 21.10% | | |
| | algo|qwen3-nothink | rank5 | 78.57% | 0.00% | 19.13% | 0.93% | 1.38% | 0.00% | 0.00% | 21.43% | | |
| | crux|qwen3-nothink | rank1 | 68.89% | 0.00% | 27.16% | 1.74% | 2.21% | 0.00% | 0.00% | 31.11% | | |
| | crux|qwen3-nothink | rank2 | 64.47% | 0.00% | 30.92% | 1.86% | 2.75% | 0.00% | 0.00% | 35.53% | | |
| | crux|qwen3-nothink | rank3 | 63.00% | 0.00% | 32.23% | 2.01% | 2.75% | 0.00% | 0.00% | 37.00% | | |
| | crux|qwen3-nothink | rank4 | 62.02% | 0.00% | 33.21% | 1.94% | 2.83% | 0.00% | 0.00% | 37.98% | | |
| | crux|qwen3-nothink | rank5 | 61.86% | 0.00% | 33.42% | 1.99% | 2.74% | 0.00% | 0.00% | 38.14% | | |
| | ht|qwen3-nothink | rank1 | 75.54% | 0.00% | 20.36% | 1.14% | 2.97% | 0.00% | 0.00% | 24.46% | | |
| | ht|qwen3-nothink | rank2 | 73.11% | 0.00% | 22.25% | 1.14% | 3.51% | 0.00% | 0.00% | 26.89% | | |
| | ht|qwen3-nothink | rank3 | 72.13% | 0.00% | 23.04% | 1.11% | 3.72% | 0.00% | 0.00% | 27.87% | | |
| | ht|qwen3-nothink | rank4 | 71.75% | 0.00% | 23.27% | 1.13% | 3.85% | 0.00% | 0.00% | 28.25% | | |
| | ht|qwen3-nothink | rank5 | 71.27% | 0.00% | 23.51% | 1.20% | 4.03% | 0.00% | 0.00% | 28.73% | | |
| | lcb|qwen3-nothink | rank1 | 53.53% | 0.00% | 37.31% | 1.99% | 7.17% | 0.00% | 0.00% | 46.47% | | |
| | lcb|qwen3-nothink | rank2 | 48.12% | 0.00% | 41.22% | 2.23% | 8.43% | 0.00% | 0.00% | 51.88% | | |
| | lcb|qwen3-nothink | rank3 | 46.22% | 0.00% | 42.67% | 2.41% | 8.71% | 0.00% | 0.00% | 53.78% | | |
| | lcb|qwen3-nothink | rank4 | 45.13% | 0.00% | 43.44% | 2.25% | 9.17% | 0.00% | 0.00% | 54.87% | | |
| | lcb|qwen3-nothink | rank5 | 44.68% | 0.00% | 44.08% | 2.19% | 9.05% | 0.00% | 0.00% | 55.32% | | |
| | predo|qwen3-nothink | rank1 | 97.60% | 0.00% | 2.07% | 0.19% | 0.13% | 0.00% | 0.00% | 2.40% | | |
| | predo|qwen3-nothink | rank2 | 97.41% | 0.00% | 2.28% | 0.18% | 0.13% | 0.00% | 0.00% | 2.59% | | |
| | predo|qwen3-nothink | rank3 | 97.30% | 0.00% | 2.37% | 0.19% | 0.13% | 0.00% | 0.00% | 2.70% | | |
| | predo|qwen3-nothink | rank4 | 97.22% | 0.00% | 2.44% | 0.18% | 0.16% | 0.00% | 0.00% | 2.78% | | |
| | predo|qwen3-nothink | rank5 | 97.16% | 0.00% | 2.50% | 0.21% | 0.13% | 0.00% | 0.00% | 2.84% | | |