BAYAN v2.0 — Task List

Phase A: Test Infrastructure ✅

  • Create tests/v2/test_level1_raw.py — Raw model tests with TP/FP/FN/TN verdicts
  • Create tests/v2/test_level2_solo.py — Solo API endpoint tests
  • Create tests/v2/test_level3_integrated.py — Full pipeline tests
  • Create tests/v2/benchmark_matrix.py — Master comparison runner
  • Fix verdict logic (strip terminal punctuation before comparison)
  • Run baseline on entities + spelling datasets
  • Run full 320-test baseline across all 3 levels
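
The verdict fix above (stripping terminal punctuation before comparison) might look roughly like the sketch below. The names `strip_terminal` and `verdict`, the punctuation set, and the choice to count a wrong correction as FN are all assumptions for illustration, not the actual code in tests/v2/.

```python
# Hypothetical helpers; the real test harness may be organized differently.
# Assumed set of terminal punctuation (Latin and Arabic marks).
TERMINAL_PUNCT = ".!?:…؟،؛"


def strip_terminal(text: str) -> str:
    """Drop surrounding whitespace and trailing terminal punctuation."""
    return text.strip().rstrip(TERMINAL_PUNCT).strip()


def verdict(source: str, expected: str, output: str) -> str:
    """Classify one test case as TP, FP, FN, or TN."""
    src, exp, out = (strip_terminal(t) for t in (source, expected, output))
    needs_fix = src != exp  # the dataset says the input contains an error
    changed = out != src    # the model altered the input
    if needs_fix:
        # Assumption: a missed or incorrect correction both count as FN.
        return "TP" if out == exp else "FN"
    return "FP" if changed else "TN"
```

With this shape, an output that differs from the expected text only by a trailing period is no longer scored as a mismatch.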

Phase A.1: Project Cleanup ✅

  • Archive legacy scripts (AraSpell.py, Grammer_Rules.py, PuncAra.py)
  • Archive 36 old phase/verification reports
  • Archive 23 old test files + 8 phase10 helpers
  • Delete 35 orphaned debug/temp files
  • Fix .gitignore corruption (binary null bytes)
  • Fix PROJECT_DESCRIPTION.md stale reference
  • Archive docs/audit + docs/audits

Phase B: Extract Stages (NOT STARTED)

  • Create src/nlp/stages/spelling_stage.py
  • Create src/nlp/stages/grammar_stage.py
  • Create src/nlp/stages/punctuation_stage.py
  • Each stage wraps: model call → filter → verdict
  • Hash (comment out) old inline stage code in app.py
  • Re-run v2 benchmark → must match Phase A baseline
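
As a rough illustration of the "model call → filter → verdict" shape, a stage could be a thin wrapper like the one below. `Suggestion`, `StageResult`, `Stage`, and the verdict labels are hypothetical; the task list does not specify the interface or what the stage-level verdict contains.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Suggestion:
    start: int          # character offset where the span begins
    end: int            # character offset where the span ends (exclusive)
    original: str
    replacement: str


@dataclass
class StageResult:
    suggestions: List[Suggestion]
    verdict: str        # assumed labels: "corrected" or "clean"


Model = Callable[[str], List[Suggestion]]
Filter = Callable[[str, List[Suggestion]], List[Suggestion]]


class Stage:
    """Wraps one model: call it, filter its output, then decide a verdict."""

    def __init__(self, name: str, model: Model, filters: List[Filter]):
        self.name = name
        self.model = model
        self.filters = filters

    def run(self, text: str) -> StageResult:
        suggestions = self.model(text)                 # model call
        for apply_filter in self.filters:              # filter
            suggestions = apply_filter(text, suggestions)
        verdict = "corrected" if suggestions else "clean"  # verdict
        return StageResult(suggestions, verdict)
```

Keeping the model and filters as plain callables would let spelling_stage.py, grammar_stage.py, and punctuation_stage.py share one wrapper and differ only in what they pass in.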

Phase C: Extract Filters (NOT STARTED)

  • Create src/nlp/filters/ module
  • Extract overlap resolution, religious guard, entity guard
  • Hash (comment out) old filter code in app.py
  • Re-run v2 benchmark → must match baseline
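
One possible form of the overlap-resolution filter is sketched below: when two suggested spans overlap, keep the higher-priority one. The `Span` type and the priority rule are assumptions; the existing logic in app.py may resolve conflicts differently.

```python
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int        # inclusive character offset
    end: int          # exclusive character offset
    priority: int     # assumed: higher wins when spans overlap
    replacement: str


def resolve_overlaps(spans: List[Span]) -> List[Span]:
    """Greedily keep the highest-priority spans that do not overlap."""
    kept: List[Span] = []
    for span in sorted(spans, key=lambda s: (-s.priority, s.start)):
        # Two half-open ranges are disjoint if one ends before the other starts.
        if all(span.end <= k.start or span.start >= k.end for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s.start)
```

Adjacent spans that merely touch (one ends where the next begins) are both kept, since half-open ranges do not overlap in that case.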

Phase D: Extract Preprocessors (NOT STARTED)

  • Create src/nlp/preprocessors/ module
  • Extract text normalization, diacritic handling, chunk splitting
  • Hash (comment out) old preprocessor code in app.py
  • Re-run v2 benchmark → must match baseline
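
The three preprocessors named above could be as small as the following sketch. The function names, the exact normalization steps, and the 200-character chunk limit are assumptions; the current behavior in app.py should be treated as the reference when extracting.

```python
import re
import unicodedata
from typing import List

# Arabic diacritics (tashkeel) U+064B..U+0652, plus superscript alef U+0670.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
TATWEEL = "\u0640"  # elongation character


def normalize(text: str) -> str:
    """Apply NFC, drop tatweel, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.replace(TATWEEL, "")
    return re.sub(r"\s+", " ", text).strip()


def strip_diacritics(text: str) -> str:
    """Remove Arabic diacritic marks, leaving base letters intact."""
    return DIACRITICS.sub("", text)


def split_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Group sentences into chunks of at most max_chars where possible."""
    sentences = re.split(r"(?<=[.!?؟])\s+", text)
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks
```

A single sentence longer than `max_chars` is kept whole in this sketch rather than cut mid-sentence.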

Phase E: Create Pipeline Orchestrator (NOT STARTED)

  • Create src/nlp/pipeline.py — orchestrates stages via PipelineContext
  • Wire app.py /api/analyze to use pipeline.run(text)
  • Hash (comment out) old monolithic analyze code in app.py
  • Re-run v2 benchmark → must match baseline
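
A minimal sketch of the orchestrator follows. Only the names `PipelineContext` and `pipeline.run(text)` come from the task list; the context fields and the stage signature here are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineContext:
    original: str                      # text exactly as received
    text: str                          # working copy that stages may rewrite
    suggestions: Dict[str, list] = field(default_factory=dict)  # per stage


# Assumed stage signature: take the context, return the (updated) context.
StageFn = Callable[[PipelineContext], PipelineContext]


class Pipeline:
    """Runs stages in order, threading one context object through them."""

    def __init__(self, stages: List[StageFn]):
        self.stages = stages

    def run(self, text: str) -> PipelineContext:
        ctx = PipelineContext(original=text, text=text)
        for stage in self.stages:
            ctx = stage(ctx)
        return ctx
```

Passing a single context object keeps the /api/analyze handler down to one `pipeline.run(text)` call and gives each stage access to what earlier stages produced.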

Phase F: Clean app.py (NOT STARTED)

  • Move helpers (get_word_positions, OffsetMapper, etc.) to utility modules
  • Remove all hashed (commented) code blocks
  • app.py should contain only Flask routes and pipeline.run() calls
  • Final v2 benchmark → must match baseline
  • Target: app.py < 500 lines
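
Every phase from B onward ends with the same gate: the v2 benchmark must match the Phase A baseline. One way to check that is sketched below, assuming each run can be reduced to a mapping from test ID to verdict; that format and the function name are hypothetical, and benchmark_matrix.py may store results differently.

```python
from typing import Dict, List


def diff_against_baseline(baseline: Dict[str, str],
                          current: Dict[str, str]) -> List[str]:
    """Return the test IDs whose verdict changed or exists in only one run."""
    all_ids = sorted(set(baseline) | set(current))
    return [tid for tid in all_ids if baseline.get(tid) != current.get(tid)]


def matches_baseline(baseline: Dict[str, str],
                     current: Dict[str, str]) -> bool:
    """True only when both runs cover the same tests with the same verdicts."""
    return not diff_against_baseline(baseline, current)
```

Comparing per-test verdicts rather than aggregate counts would catch a refactor that flips one TP to FN and one FN to TP while leaving the totals unchanged.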