Clarify PPSN scanner boundary fix in model card
README.md
CHANGED
@@ -67,6 +67,7 @@ The change is the scanner implementation:
 - semantic validation remains procedural in `ppsn.py`, `eircode.py`, and `irish_core_decoder.py`
 - release Python files no longer depend on the third-party `regex` package
 - the scanner layer no longer uses regex-based candidate extraction on untrusted text
+- PPSN candidate extraction no longer consumes a following word-initial letter after whitespace, so cases like `1234567T a ...` stay bounded to `1234567T`
 
 This is not a pure PEG grammar because some labels need semantic checks after lexical scanning:
 
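The boundary behaviour described in the added bullet can be illustrated with a small regex-free scanner. This is only a sketch under stated assumptions, not the code in `ppsn.py`: the function name `scan_ppsn_candidates` is hypothetical, and the candidate shape (seven ASCII digits followed by one or two directly adjacent ASCII letters) is an assumption made for illustration. Checksum and other semantic validation are deliberately left out, since the README says those remain procedural.

```python
import string

# Assumed lexical shape for illustration: 7 ASCII digits + 1 or 2 ASCII letters.
DIGITS = frozenset(string.digits)
LETTERS = frozenset(string.ascii_letters)
ALNUM = DIGITS | LETTERS


def scan_ppsn_candidates(text):
    """Return (start, end, candidate) spans; hypothetical helper, lexical only."""
    spans = []
    n = len(text)
    i = 0
    while i < n:
        # A candidate must start on a digit that does not continue a longer token.
        starts_token = text[i] in DIGITS and (i == 0 or text[i - 1] not in ALNUM)
        if not starts_token:
            i += 1
            continue
        j = i
        while j < n and j - i < 7 and text[j] in DIGITS:
            j += 1
        if j - i == 7 and j < n and text[j] in LETTERS:
            j += 1  # first letter
            # The optional second letter must be directly adjacent. Whitespace is
            # never skipped, so a following word's first letter is not consumed.
            if j < n and text[j] in LETTERS:
                j += 1
            # Right boundary: the candidate must not run into more alphanumerics.
            if j == n or text[j] not in ALNUM:
                spans.append((i, j, text[i:j]))
                i = j
                continue
        i += 1
    return spans


print(scan_ppsn_candidates("1234567T a ..."))
```

With this sketch, `1234567T a ...` yields a single span covering `1234567T`, while an adjacent second letter, as in `1234567TW`, is still included in the candidate.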