Benchmark Summary
This release packages a stronger full checkpoint and a stronger bundled ONNX q8 artifact than temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2.
Recommended thresholds:
- full checkpoint: `ppsn_min_score = 0.55`, `other_min_score = 0.40`
- ONNX q8: `ppsn_min_score = 0.60`, `other_min_score = 0.45`
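The thresholds above could be applied as a post-filter on predicted spans. The following is a minimal sketch, not the repo's inference code: the `Span` type, the `filter_spans` helper, the `"PPSN"` label string, and the inclusive `>=` comparison are all assumptions made for illustration. Only the four threshold values come from this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical container for one predicted entity span."""
    label: str
    start: int
    end: int
    score: float


# Recommended thresholds from this section, keyed by artifact.
THRESHOLDS = {
    "full": {"ppsn_min_score": 0.55, "other_min_score": 0.40},
    "onnx_q8": {"ppsn_min_score": 0.60, "other_min_score": 0.45},
}


def filter_spans(spans, variant, ppsn_label="PPSN"):
    """Keep spans whose score meets the threshold for their label type.

    Assumes PPSN spans carry the label `ppsn_label` and that a score
    equal to the threshold is accepted; both are illustrative choices.
    """
    limits = THRESHOLDS[variant]
    kept = []
    for span in spans:
        if span.label == ppsn_label:
            min_score = limits["ppsn_min_score"]
        else:
            min_score = limits["other_min_score"]
        if span.score >= min_score:
            kept.append(span)
    return kept
```

With this sketch, a PPSN span scored at 0.57 would be kept under the full-checkpoint thresholds and dropped under the ONNX q8 thresholds.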
Broader CPU Benchmarks
| Variant | User PPSN | Core | Edge | Multilingual PPSN | Strict Remaining IoU=1.0 |
|---|---|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 full | 0.8571 | 0.9554 | 0.9500 | 0.8038 | 0.4000 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 full | 1.0000 | 0.9806 | 1.0000 | 0.9333 | 0.4444 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 ONNX q8 | 0.8571 | 0.9677 | 0.9500 | 0.8077 | 0.6000 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 ONNX q8 | 1.0000 | 0.9677 | 1.0000 | 0.9333 | 0.6667 |
Exact QA Suites
| Variant | Numeric v2 | Numeric v2 Passport | Numeric v2 Routing | Numeric v2 Phone | Gap | Gap Passport | Gap Routing | Gap Phone |
|---|---|---|---|---|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 full | 0.8966 | 0.9091 | 1.0000 | 0.7500 | 0.8696 | 0.8889 | 1.0000 | 0.6667 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 full | 0.9333 | 0.9091 | 1.0000 | 0.8889 | 0.9167 | 0.8889 | 1.0000 | 0.8571 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 ONNX q8 | 0.8667 | 0.8333 | 1.0000 | 0.7500 | 0.8696 | 0.8889 | 1.0000 | 0.6667 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 ONNX q8 | 0.9333 | 0.9091 | 1.0000 | 0.8889 | 0.9167 | 0.8889 | 1.0000 | 0.8571 |
CPU Throughput
| Variant | Core ex/s | Edge ex/s | Multilingual PPSN ex/s |
|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 full | 58.3521 | 63.1592 | 128.1891 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 full | 64.8768 | 62.4612 | 126.4996 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 ONNX q8 | 44.9578 | 45.5786 | 297.5262 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 ONNX q8 | 44.9398 | 41.7212 | 190.8299 |
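To compare throughput between release candidates, the relative change per column can be computed directly from the figures above. This is a small sketch for reading the table, not part of the benchmark harness; the `THROUGHPUT` dictionary layout and the `pct_change` helper are names introduced here for illustration, and the values are copied from the CPU Throughput table (examples per second).

```python
# Throughput in examples per second, copied from the CPU Throughput table.
# Tuple order: (Core, Edge, Multilingual PPSN).
THROUGHPUT = {
    "full": {
        "rc2": (58.3521, 63.1592, 128.1891),
        "rc3": (64.8768, 62.4612, 126.4996),
    },
    "onnx_q8": {
        "rc2": (44.9578, 45.5786, 297.5262),
        "rc3": (44.9398, 41.7212, 190.8299),
    },
}

COLUMNS = ("Core", "Edge", "Multilingual PPSN")


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def throughput_deltas(artifact):
    """Map each column name to the rc2 -> rc3 change in percent."""
    rc2 = THROUGHPUT[artifact]["rc2"]
    rc3 = THROUGHPUT[artifact]["rc3"]
    return {name: pct_change(a, b) for name, a, b in zip(COLUMNS, rc2, rc3)}


if __name__ == "__main__":
    for artifact in THROUGHPUT:
        for name, delta in throughput_deltas(artifact).items():
            print(f"{artifact:8s} {name:18s} {delta:+.1f}%")
```

Most columns move by a few percent in either direction. The largest change is the ONNX q8 Multilingual PPSN column, which goes from 297.5262 to 190.8299 ex/s between rc2 and rc3.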
Reading These Numbers
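- The new full checkpoint matches or improves on the previous RC on every quality suite measured here; the only ties are in Passport and Routing columns of the exact QA suites.
- The new ONNX q8 artifact likewise matches or improves on the previous ONNX q8 on every quality suite measured here, with ties on Core and on some Passport and Routing columns.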
- The best promoted q8 recipe remains the standard dynamic int8 export used in this repo. Two alternative recipes were tested locally and rejected because they hurt multilingual PPSN quality.
- The strict exact-span boundary suite is still not solved. The remaining failures are mostly passport values written as `prefix + space + digits`.
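To illustrate why a `prefix + space + digits` value is hard for a strict IoU=1.0 metric, the sketch below computes character-level span IoU for one plausible boundary error. It is an assumption-laden example: the sample text and the value `PA 1234567` are made up, the `span_iou` helper is not taken from this repo, and the source does not say which boundary the model actually gets wrong (here the prediction is assumed to cover only the digits).

```python
def span_iou(pred, gold):
    """Character-level intersection over union of two (start, end) spans.

    Spans are half-open offsets into the same text.
    """
    pred_start, pred_end = pred
    gold_start, gold_end = gold
    intersection = max(0, min(pred_end, gold_end) - max(pred_start, gold_start))
    union = (pred_end - pred_start) + (gold_end - gold_start) - intersection
    return intersection / union if union > 0 else 0.0


# Made-up example text; "PA 1234567" is an illustrative value only.
text = "Passport no. PA 1234567 issued in Dublin"

gold_value = "PA 1234567"
gold_start = text.index(gold_value)
gold = (gold_start, gold_start + len(gold_value))

# Hypothetical prediction that drops the prefix and the space.
digits = "1234567"
digits_start = text.index(digits)
digits_only = (digits_start, digits_start + len(digits))

exact_iou = span_iou(gold, gold)
partial_iou = span_iou(digits_only, gold)
strict_pass = partial_iou == 1.0
```

In this example the digits-only prediction overlaps 7 of the 10 gold characters, so it counts as a miss under the strict criterion even though the numeric part is found.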