Commit 85570bb (verified), committed by temsa
1 Parent(s): 32bcb86

Clarify PPSN scanner boundary fix in model card

Files changed (1): README.md (+1 -0)
@@ -67,6 +67,7 @@ The change is the scanner implementation:
 - semantic validation remains procedural in `ppsn.py`, `eircode.py`, and `irish_core_decoder.py`
 - release Python files no longer depend on the third-party `regex` package
 - the scanner layer no longer uses regex-based candidate extraction on untrusted text
+- PPSN candidate extraction no longer consumes a following word-initial letter after whitespace, so cases like `1234567T a ...` stay bounded to `1234567T`
 
 This is not a pure PEG grammar because some labels need semantic checks after lexical scanning:
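
The boundary rule added in this commit can be illustrated with a small regex-free scanner. This is a minimal sketch, not the code in `ppsn.py`: the function name `scan_ppsn_candidates`, the helper names, and the assumed lexical shape (seven ASCII digits, one letter, an optional second letter that must be directly adjacent) are all hypothetical. Checksum and other semantic validation are deliberately left out, since the README says those remain procedural in the release files.

```python
# Hypothetical sketch of bounded PPSN candidate extraction (no regex).
# Assumed shape: 7 ASCII digits + 1 letter + optional adjacent 2nd letter.

def _is_digit(ch: str) -> bool:
    return "0" <= ch <= "9"

def _is_letter(ch: str) -> bool:
    return ch.isascii() and ch.isalpha()

def _is_alnum(ch: str) -> bool:
    return ch.isascii() and ch.isalnum()

def scan_ppsn_candidates(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, candidate) spans; lexical shape only, no checksum."""
    spans = []
    i, n = 0, len(text)
    while i < n:
        # A candidate may only start on a digit that is not glued to a
        # preceding letter or digit.
        starts_token = _is_digit(text[i]) and (i == 0 or not _is_alnum(text[i - 1]))
        if not starts_token:
            i += 1
            continue
        end = i
        while end < n and end - i < 7 and _is_digit(text[end]):
            end += 1
        if end - i == 7 and end < n and _is_letter(text[end]):
            end += 1  # first (check) letter
            # The second letter is taken only when it is directly adjacent.
            # Whitespace is never skipped, so a following word such as "a"
            # cannot be absorbed into the candidate.
            if end < n and _is_letter(text[end]):
                end += 1
            # The candidate must end at a token boundary.
            if end == n or not _is_alnum(text[end]):
                spans.append((i, end, text[i:end]))
                i = end
                continue
        i += 1
    return spans

if __name__ == "__main__":
    print(scan_ppsn_candidates("1234567T a ..."))
```

Under these assumptions, the README's example `1234567T a ...` yields a single span covering `1234567T`, while an input with an adjacent second letter, such as `1234567TW`, keeps both letters.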