Commit ebce6ef
Parent(s): 5dfa9cd
FIX-43: Route bidirectional OOV→IV fixes through spelling guard

The bidirectional validation path (Phase 12 A5) was accepting OOV→IV
word replacements WITHOUT checking spelling safety guards. This bypassed:
- FIX-42b (first-letter change): واحتاج→وتحتاج, افهمه→تفهمة
- FIX-42a (length ratio): والممرضات→والرضا
- FIX-39 (edit distance): طبخ→طبي
Now validates through _is_small_spelling_change before accepting.
Tests: 39 passing.
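
For readers without access to the rest of src/app.py, the three guards named above can be sketched roughly as follows. This is a hypothetical illustration, not the real `_is_small_spelling_change`: the actual function also takes `spell_checker.vocab_manager` and its return value is treated as a confidence, whereas this sketch returns a plain bool. The thresholds (`min_len_ratio`, `max_norm_distance`), the stripping of a shared leading و before the first-letter check, and the accepted example مدرسه→مدرسة are all assumptions chosen so that the cases listed in this commit message are rejected.

```python
# Hypothetical sketch of a spelling guard; names and thresholds are assumptions.

CONJUNCTION = "و"  # assumed: a shared leading conjunction is ignored


def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def strip_shared_conjunction(old: str, new: str) -> tuple[str, str]:
    """Drop a leading و only when both words carry it (assumed behaviour)."""
    if (old.startswith(CONJUNCTION) and new.startswith(CONJUNCTION)
            and min(len(old), len(new)) > 2):
        return old[1:], new[1:]
    return old, new


def is_small_spelling_change_sketch(
    old: str,
    new: str,
    min_len_ratio: float = 0.75,      # assumed threshold (cf. FIX-42a)
    max_norm_distance: float = 0.25,  # assumed threshold (cf. FIX-39)
) -> bool:
    """Return True only if old→new looks like a minor spelling correction."""
    if not old or not new:
        return False
    old, new = strip_shared_conjunction(old, new)

    # First-letter guard (cf. FIX-42b): a changed first letter is rejected.
    if old[0] != new[0]:
        return False

    # Length-ratio guard (cf. FIX-42a): reject large length differences.
    longest = max(len(old), len(new))
    if min(len(old), len(new)) / longest < min_len_ratio:
        return False

    # Edit-distance guard (cf. FIX-39): normalise by the longer word so a
    # single edit in a three-letter word counts as a large change.
    return levenshtein(old, new) / longest <= max_norm_distance
```

With these assumed thresholds, each pair listed above is rejected by the guard it is attributed to, while a one-letter fix in a longer word (مدرسه→مدرسة) is accepted.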
- src/app.py +15 -0
src/app.py
@@ -1877,6 +1877,21 @@ def analyze_text():
                         f"'{_sw}'→'{_rw}'"
                     )
                     continue
+                # ── FIX-43: Validate bidirectional fix through spelling guard ──
+                # The bidirectional path bypassed ALL spelling guards (FIX-42b first-letter,
+                # FIX-42a length ratio, FIX-39 edit distance). Now we validate the
+                # OOV→IV replacement through _is_small_spelling_change to catch corruptions
+                # like واحتاج→وتحتاج, افهمه→تفهمة, والممرضات→والرضا.
+                _bidi_spell_conf = _is_small_spelling_change(
+                    _safe_words[_bi], _raw_words[_bi],
+                    spell_checker.vocab_manager
+                )
+                if not _bidi_spell_conf:
+                    logger.info(
+                        f"[SPELLING] Bidirectional blocked (spelling guard): "
+                        f"'{_safe_words[_bi]}'→'{_raw_words[_bi]}'"
+                    )
+                    continue
                 logger.info(
                     f"[SPELLING] Bidirectional fix: "
                     f"'{_safe_words[_bi]}'(OOV)→'{_raw_words[_bi]}'(IV)"