# temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc1
QA release candidate for Irish core PII detection with OpenMed mLiteClinical.
This repository should be evaluated against the current public release:
- current public release: temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v1
- this repository: temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc1
The purpose of this RC is specific: improve weak-context PPSN detection without abandoning the raw-model-only approach.
In particular, this RC is intended to fix cases like:
- "1234567T - am I eligible for the housing grant?"
- "I was told to provide my number 1234567T when applying, what do I do next?"
- "My ppsn is 1234567tw and I need to know about carer's allowance"
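For context, a PPSN is seven digits followed by one letter, sometimes with a second letter (such as W, A, or H). The model is a learned tagger, not a rule matcher; the regex below is only a sketch of that surface shape, to show what the weak-context cases above have in common:

```python
import re

# Hedged sketch: this pattern only illustrates the PPSN surface shape
# (7 digits, one letter, optional second letter W/A/H). It is NOT how
# the model detects PPSNs; the model relies on learned context.
PPSN_PATTERN = re.compile(r"\b\d{7}[A-Za-z][WAHwah]?\b")

def find_ppsn_like(text: str) -> list[str]:
    """Return substrings that merely look like a PPSN."""
    return PPSN_PATTERN.findall(text)
```

Note that the third example uses a lowercase two-letter suffix (`1234567tw`), which the shape check still covers; the hard part for a model is deciding these are PPSNs with little surrounding context.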
## Coverage

`PPSN`, `account_number`, `bank_routing_number`, `credit_debit_card`, `PASSPORT_NUMBER`, `postcode`, `phone_number`, `email`, `first_name`, `last_name`, `swift_bic`
## Recommended Inference

Use the bundled `inference_mask.py` with split thresholds:

```shell
python3 inference_mask.py \
  --model temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc1 \
  --ppsn-min-score 0.5 \
  --other-min-score 0.4 \
  --text "I was told to provide my number 1234567T when applying, what do I do next?" \
  --json
```
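The split thresholds amount to a per-label score filter over predicted spans. A minimal sketch, assuming transformers-pipeline-style entity dicts with `entity_group` and `score` keys; `filter_entities` is a hypothetical helper for illustration, not code from the bundled `inference_mask.py`:

```python
# Hedged sketch of the split-threshold idea behind --ppsn-min-score and
# --other-min-score: PPSN spans are judged against their own threshold,
# every other label against a shared one.

def filter_entities(entities, ppsn_min_score=0.5, other_min_score=0.4):
    """Drop predicted spans whose score is under the per-label threshold."""
    kept = []
    for ent in entities:
        threshold = ppsn_min_score if ent["entity_group"] == "PPSN" else other_min_score
        if ent["score"] >= threshold:
            kept.append(ent)
    return kept
```

With the defaults above, a PPSN span at score 0.45 is dropped while an email span at 0.42 is kept, which is the asymmetry the two flags express.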
For deployment through the existing inference-server ONNX path, this repo also publishes a dynamic 8-bit ONNX artifact at `onnx/model.onnx`.
## Comparison To The Current Public Release
PPSN-only comparison:
| Model | User Raw | Core PPSN | Edge PPSN | QA v8 PPSN | Irish Large PPSN |
|---|---|---|---|---|---|
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v1 | 0.8000 | 0.0800 | 0.4211 | 0.7385 | 0.8980 |
| temsa/OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc1 | 1.0000 | 0.8571 | 0.8571 | 0.7353 | 0.9403 |
Broader Irish-core multilabel view at the recommended thresholds for this RC (`--ppsn-min-score 0.5 --other-min-score 0.4`):

- overall Irish core F1: 0.9487
- overall Irish edge F1: 0.8205
- `phone_number` core F1: 0.9167
- `postcode` core F1: 0.7500
- `PPSN` core F1: 0.8571
- `PPSN` edge F1: 0.8571
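For readers comparing these tables, F1 is the harmonic mean of precision and recall over predicted spans. A minimal sketch of computing it from raw counts; the example counts below are illustrative only, not the actual evaluation tallies:

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Span-level F1 from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For instance, 6 correct spans with 1 spurious and 1 missed gives precision and recall of 6/7 each, i.e. an F1 of about 0.8571.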
## ONNX Runtime Benchmark
The score tables above compare the model itself. The table below compares deployment artifacts for this RC on the same synthetic runtime corpus used by the inference-server benchmark harness.
| Artifact | Quantization | Size (MB) | Avg Latency (ms) | P95 Latency (ms) | Throughput (RPS) | CPU ms / req |
|---|---|---|---|---|---|---|
| previous ONNX export | float32 | 517.19 | 46.44 | 141.74 | 21.53 | 235.22 |
| published onnx/model.onnx | dynamic 8-bit (QUInt8, per-tensor) | 128.94 | 32.10 | 106.13 | 31.14 | 169.75 |
Notes:
- the published ONNX artifact is the dynamic 8-bit runtime export used by the current inference-server deployment path
- raw entity spans are not byte-identical to the float export on the synthetic benchmark corpus
- the endpoint-level redacted text matched on the smoke sample used for final validation (`first_name`, `last_name`, `email`, `phone_number`, `PPSN`)
## How To Read This RC
Compared with the current public v1 release, this RC is much stronger on the weak-context PPSN cases that were previously missed.
That is the main reason to test it.
This RC should still be validated carefully on:
- Irish phone numbers with spaces
- Irish Eircodes
- bank/account details
- names and emails in English and Irish Gaelic
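For the Eircode item in particular, a sketch of the surface format can help when building validation strings: an Eircode is a three-character routing key (a letter plus two characters, e.g. D02 or D6W) followed by a four-character unique identifier, with an optional space between them. This is only a shape check for generating test inputs, not how the model detects postcodes:

```python
import re

# Hedged sketch of the Eircode surface shape: routing key (letter, digit,
# then digit or W) plus a 4-character alphanumeric unique identifier,
# optionally separated by a space. Illustration only.
EIRCODE_PATTERN = re.compile(r"\b[A-Za-z]\d[\dWw]\s?[A-Za-z\d]{4}\b")

def looks_like_eircode(text: str) -> bool:
    """True if the text contains something shaped like an Eircode."""
    return EIRCODE_PATTERN.search(text) is not None
```

Strings matching this shape, with and without the internal space, are reasonable candidates for the `postcode` validation pass suggested above.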
## Included Files

- full `transformers` checkpoint in the repo root
- dynamic 8-bit ONNX Runtime artifact at `onnx/model.onnx`
- `inference_mask.py`
- `qa_config.json`
- `training_sources.json`
- clean benchmark summaries in `eval/`
## License And Attribution

- release license: Apache-2.0
- base model: `OpenMed/OpenMed-PII-mLiteClinical-Base-135M-v1`
- upstream attributed data: `joelniklaus/mapa`, `gretelai/synthetic_pii_finance_multilingual`
- synthetic Irish training data created in this workspace
See NOTICE for attribution details.