
Benchmark Summary

This release packages a stronger full checkpoint and a stronger bundled ONNX q8 artifact than temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2.

Recommended thresholds:

  • full checkpoint: ppsn_min_score = 0.55, other_min_score = 0.40
  • ONNX q8: ppsn_min_score = 0.60, other_min_score = 0.45
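
One way these thresholds could be applied is sketched below. This is a minimal illustration, not the repo's actual inference code: the `filter_entities` helper, the `THRESHOLDS` layout, the variant keys, and the `"PPSN"` label string are all assumptions, and the entity dicts are assumed to carry `label` and `score` fields.

```python
from typing import Dict, List

# Recommended thresholds from this release. The dict layout and the
# variant keys ("full", "onnx_q8") are assumptions for illustration.
THRESHOLDS: Dict[str, Dict[str, float]] = {
    "full": {"ppsn_min_score": 0.55, "other_min_score": 0.40},
    "onnx_q8": {"ppsn_min_score": 0.60, "other_min_score": 0.45},
}


def filter_entities(entities: List[dict], variant: str) -> List[dict]:
    """Keep entities whose score meets the per-label minimum.

    Hypothetical helper: assumes each entity is a dict with "label" and
    "score" keys, and that PPSN spans are labelled exactly "PPSN".
    """
    limits = THRESHOLDS[variant]
    kept = []
    for ent in entities:
        if ent["label"] == "PPSN":
            min_score = limits["ppsn_min_score"]
        else:
            min_score = limits["other_min_score"]
        if ent["score"] >= min_score:
            kept.append(ent)
    return kept


# Made-up predictions: both pass the full-checkpoint thresholds,
# neither passes the stricter ONNX q8 thresholds.
predictions = [
    {"label": "PPSN", "score": 0.57},
    {"label": "PHONE", "score": 0.42},
]
kept_full = filter_entities(predictions, "full")
kept_q8 = filter_entities(predictions, "onnx_q8")
```

The q8 thresholds are slightly higher than the full-checkpoint ones, so the same raw predictions can be kept by one variant and dropped by the other.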

Broader CPU Benchmarks

| Variant | User PPSN | Core | Edge | Multilingual PPSN | Strict Remaining IoU=1.0 |
|---|---|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 full | 0.8571 | 0.9554 | 0.9500 | 0.8038 | 0.4000 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 full | 1.0000 | 0.9806 | 1.0000 | 0.9333 | 0.4444 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 ONNX q8 | 0.8571 | 0.9677 | 0.9500 | 0.8077 | 0.6000 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 ONNX q8 | 1.0000 | 0.9677 | 1.0000 | 0.9333 | 0.6667 |

Exact QA Suites

| Variant | Numeric v2 | Passport | Routing | Phone | Gap | Passport | Routing | Phone |
|---|---|---|---|---|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 full | 0.8966 | 0.9091 | 1.0000 | 0.7500 | 0.8696 | 0.8889 | 1.0000 | 0.6667 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 full | 0.9333 | 0.9091 | 1.0000 | 0.8889 | 0.9167 | 0.8889 | 1.0000 | 0.8571 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 ONNX q8 | 0.8667 | 0.8333 | 1.0000 | 0.7500 | 0.8696 | 0.8889 | 1.0000 | 0.6667 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 ONNX q8 | 0.9333 | 0.9091 | 1.0000 | 0.8889 | 0.9167 | 0.8889 | 1.0000 | 0.8571 |

CPU Throughput

| Variant | Core ex/s | Edge ex/s | Multilingual PPSN ex/s |
|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 full | 58.3521 | 63.1592 | 128.1891 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 full | 64.8768 | 62.4612 | 126.4996 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc2 ONNX q8 | 44.9578 | 45.5786 | 297.5262 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc3 ONNX q8 | 44.9398 | 41.7212 | 190.8299 |
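
To compare throughput between the two release candidates, the relative change per suite can be computed directly from the figures above. The sketch below only does arithmetic on the reported ex/s values; the `relative_change` helper and the `THROUGHPUT` layout are illustrative names, not part of this repo.

```python
# Reported CPU throughput in examples per second, as (rc2, rc3) pairs.
THROUGHPUT = {
    "full": {
        "Core": (58.3521, 64.8768),
        "Edge": (63.1592, 62.4612),
        "Multilingual PPSN": (128.1891, 126.4996),
    },
    "ONNX q8": {
        "Core": (44.9578, 44.9398),
        "Edge": (45.5786, 41.7212),
        "Multilingual PPSN": (297.5262, 190.8299),
    },
}


def relative_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means slower)."""
    return (new - old) / old * 100.0


# Build a nested dict of percentage changes, rc2 -> rc3.
changes = {
    variant: {suite: relative_change(rc2, rc3) for suite, (rc2, rc3) in suites.items()}
    for variant, suites in THROUGHPUT.items()
}

for variant, suites in changes.items():
    for suite, pct in suites.items():
        print(f"{variant:8s} {suite:18s} {pct:+.1f}%")
```

On these figures the largest movement is the ONNX q8 Multilingual PPSN suite, which is roughly 36% slower in rc3 than in rc2, while the full checkpoint is about 11% faster on Core and within about 1 to 2% elsewhere.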

Reading These Numbers

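  • The new full checkpoint matches or beats the previous RC on every quality suite reported here. The Passport and Routing columns are unchanged; every other score improves.
  • The new ONNX q8 artifact likewise matches or beats the previous ONNX q8 on every quality suite reported here. Core, Routing, and the second Passport column are unchanged; every other score improves.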
  • The promoted q8 recipe remains the standard dynamic int8 export used in this repo. Two alternative recipes were tested locally and rejected because they reduced multilingual PPSN quality.
  • The strict exact-span boundary suite is still not solved (0.4444 for the full checkpoint and 0.6667 for ONNX q8 at IoU=1.0). The remaining failures are mostly passport values written as prefix + space + digits.
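
To illustrate why the prefix + space + digits pattern is hard under an IoU=1.0 criterion, the sketch below computes character-level span IoU for a prediction that captures only the digits. It assumes spans are half-open `(start, end)` character offsets; the `span_iou` helper and the sample passport string are made up for illustration and are not taken from the benchmark data.

```python
from typing import Tuple

Span = Tuple[int, int]  # half-open character offsets: (start, end)


def span_iou(a: Span, b: Span) -> float:
    """Intersection over union of two character spans."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union else 0.0


# Made-up example: the passport value is written as prefix + space + digits.
text = "Passport PA 1234567"
gold = (9, 19)       # "PA 1234567"
predicted = (12, 19)  # "1234567" only, prefix missed

iou = span_iou(gold, predicted)
exact_match = iou == 1.0  # the strict suite only credits a perfect overlap
```

Here the prediction overlaps 7 of the 10 gold characters, so the IoU is 0.7. That would count as a hit under a looser overlap threshold but is a miss at IoU=1.0.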