# Benchmark Summary

## ONNX q8

| Suite | F1 | Examples/s |
|---|---:|---:|
| Irish core | 0.9664 | 247.0809 |
| Edge | 1.0000 | 247.8374 |
| Finance | 1.0000 | 260.5229 |
| Finance boundary | 1.0000 | 111.3480 |
| User PPSN | 1.0000 | 240.0219 |
| GA weak PPSN | 1.0000 | 121.8613 |
| Multilingual PPSN | 0.9212 | 256.1316 |
| Hardening exact | 0.9744 | 231.8666 |
| UAT replay exact | 0.8276 | 183.6675 |

## Full checkpoint

| Suite | F1 | Examples/s |
|---|---:|---:|
| Irish core | 0.9664 | 47.2794 |
| Edge | 1.0000 | 38.1395 |
| Multilingual PPSN | 0.9212 | 65.8959 |
| Hardening exact | 0.9744 | 31.0518 |
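
For the four suites that appear in both tables, the throughput gain of the q8 export over the full checkpoint can be read off as a simple ratio. The sketch below is only arithmetic on the figures reported above; the dictionary names and the `speedup` helper are illustrative and not part of the benchmark tooling.

```python
# Examples/s copied from the two tables above (suites present in both).
onnx_q8 = {
    "Irish core": 247.0809,
    "Edge": 247.8374,
    "Multilingual PPSN": 256.1316,
    "Hardening exact": 231.8666,
}
full_checkpoint = {
    "Irish core": 47.2794,
    "Edge": 38.1395,
    "Multilingual PPSN": 65.8959,
    "Hardening exact": 31.0518,
}


def speedup(fast: dict, slow: dict) -> dict:
    """Return fast/slow throughput ratios for suites present in both dicts."""
    return {suite: fast[suite] / slow[suite] for suite in fast if suite in slow}


ratios = speedup(onnx_q8, full_checkpoint)
for suite, ratio in ratios.items():
    print(f"{suite}: {ratio:.2f}x")
```

On these numbers the ratio ranges from roughly 3.9x (Multilingual PPSN) to roughly 7.5x (Hardening exact), while F1 is identical between the two variants on all four suites.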

## UAT Replay Exact Comparison

| Model | F1 | Precision | Recall | Examples/s |
|---|---:|---:|---:|---:|
| `IrishCore-DiffMask-135M-v1-rc1` q8 | 0.4545 | 1.0000 | 0.2941 | 238.6524 |
| `OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc8` q8 | 0.3636 | 0.3750 | 0.3529 | 110.7595 |
| `IrishCore-DiffMask-135M-v1-rc2` q8 | 0.8276 | 1.0000 | 0.7059 | 183.6675 |
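
The F1 column in this table is consistent with the usual harmonic mean of precision and recall, F1 = 2PR / (P + R). A minimal check, assuming that standard definition (the benchmark's own scoring code is not shown here, and `f1_score` and `rows` are illustrative names):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (model, precision, recall, reported F1) copied from the table above.
rows = [
    ("IrishCore-DiffMask-135M-v1-rc1", 1.0000, 0.2941, 0.4545),
    ("OpenMed-mLiteClinical-IrishCorePII-135M-v2-rc8", 0.3750, 0.3529, 0.3636),
    ("IrishCore-DiffMask-135M-v1-rc2", 1.0000, 0.7059, 0.8276),
]

for model, precision, recall, reported in rows:
    computed = f1_score(precision, recall)
    # Inputs are rounded to four decimals, so allow a small tolerance.
    status = "ok" if abs(computed - reported) < 5e-4 else "mismatch"
    print(f"{model}: computed={computed:.4f} reported={reported:.4f} {status}")
```

Both `rc1` and `rc2` have precision 1.0000, so the F1 difference between them comes entirely from recall (0.2941 versus 0.7059).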

## Notes

- `rc2` was selected from an interpolation blend after cleaning label contamination in the v5 training mix.
- The remaining known misses on the UAT replay suite are `071 967 2616`, `R93 EC57` inside a longer centre block, `EPStamp4@enterprise.gov.ie`, and one `D02 XY45` form.