`tools/` — Operational scripts

A loose collection of CLI scripts for corpus operations, data uploads, probes, KB regeneration, and scheduled-job runners. Nothing under tools/ is imported by the live server; backend/ and rag/ are the runtime surface.

The long-running scripts are scheduled via macOS LaunchAgents. See CRON_README.md in this folder for cadence and script paths, and ADR-029 for the disk-safety LaunchAgents.

Corpus + extraction batch ops

| Script | Purpose |
| --- | --- |
| `extract_all_corpus.py`, `extract_batch_5.py`, `extract_failed.py`, `extract_pdf_range.py`, `reextract_all.py` | Batch re-extractions over `rag/corpus/`. Useful when the schema or extraction prompt changes. |
| `extract_pdf_text.py`, `extract_policy_text.py`, `extract_policy_text_batch2.py` | Raw text dumps for manual inspection / regex curation. |
| `curate_batch2.py`, `curate_remaining.py`, `clear_batch2.py` | Verbatim-quote curation passes that produced `40-data/policy_facts/`. See `40-data/policy_facts/_curation_report.md`. |
| `generate_policy_facts.py` | Convert extraction outputs to the `40-data/policy_facts/<id>.json` shape with `{value, unit, source_pdf_path, source_quote}` provenance. |
| `pydantic_validate_batch_5.py`, `validate_batch_5.py`, `validate_json.py`, `validate_schema.py` | Schema validators for the 62-field `HealthPolicy`. |
| `count_fields.py` | Per-policy completeness scorer that feeds the `kb/INDEX.md` completeness % column. |
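
The provenance shape written by `generate_policy_facts.py` and the completeness percentage computed by `count_fields.py` can be pictured with a small sketch. This is not the code in either script. The function names, the three example field names, and the rule that a field counts as complete only when it has a non-null `value` and a non-empty `source_quote` are all assumptions made for illustration.

```python
from typing import Any, Optional

# Keys named in the table above for each fact record.
PROVENANCE_KEYS = ("value", "unit", "source_pdf_path", "source_quote")


def make_fact(value: Any, unit: Optional[str], source_pdf_path: str, source_quote: str) -> dict:
    """Build one fact record in the {value, unit, source_pdf_path, source_quote} shape."""
    return {
        "value": value,
        "unit": unit,
        "source_pdf_path": source_pdf_path,
        "source_quote": source_quote,
    }


def completeness_pct(facts: dict, expected_fields: list) -> float:
    """Percentage of expected fields that have a value and a quote (assumed rule)."""
    if not expected_fields:
        return 0.0
    filled = 0
    for name in expected_fields:
        fact = facts.get(name)
        if isinstance(fact, dict) and fact.get("value") is not None and fact.get("source_quote"):
            filled += 1
    return round(100.0 * filled / len(expected_fields), 1)


# Hypothetical three-field schema; the real HealthPolicy has 62 fields.
EXAMPLE_FIELDS = ["sum_insured", "waiting_period", "room_rent_cap"]

# Hypothetical policy with one complete field, one empty field, one missing field.
example_policy = {
    "sum_insured": make_fact(500000, "INR", "rag/corpus/example.pdf", "Sum Insured: Rs. 5,00,000"),
    "waiting_period": make_fact(None, "months", "rag/corpus/example.pdf", ""),
}
```

Requiring a quote alongside the value keeps the score tied to source-grounded facts rather than to fields that are merely populated, which may or may not match how the real scorer counts.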

Source-map + verification

| Script | Purpose |
| --- | --- |
| `info_source_map.py` | Builds `eval/info_source_map.json` and `40-data/information_source_map.md`, mapping each claim → URL → verdict (✅ / ⚠️ / ❌ / ⏳). The canonical KPI for source-grounding quality. |
| `verify_urls.py` | HEAD-checks every URL in the corpus / facts; writes `eval/verified_urls.json`. |
| `verify_review_urls.py`, `verify_new_corpus.py` | Sub-verifiers for the reviews dataset and freshly-added corpus URLs. |
| `browser_verify.py` | Playwright-backed verifier for URLs that block HEAD requests. Output: `tools/browser_verified.json`. |
| `check_link_rot.py`, `check_pdf_etags.py` | LaunchAgent-driven freshness checks: corpus URL rot and PDF ETag drift. |
| `refresh_premiums.py` | LaunchAgent-driven refresh of `40-data/premiums/illustrative_premiums.json`. |
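
The HEAD-check step that `verify_urls.py` performs, and the hand-off to a browser-backed verifier for URLs that block HEAD, can be sketched with the standard library alone. This is an illustration under assumptions, not the script's actual logic: the function names, the verdict labels, and the choice of 403/405 as the "blocked HEAD" signal are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def head_status(url: str, timeout: float = 10.0, opener: Optional[Callable] = None) -> Optional[int]:
    """Return the HTTP status of a HEAD request, or None if the request fails outright."""
    open_fn = opener or urllib.request.urlopen
    request = urllib.request.Request(url, method="HEAD")
    try:
        with open_fn(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise, but they still carry a status code.
        return err.code
    except OSError:
        # Covers URLError, DNS failures, refused connections, and timeouts.
        return None


def classify(status: Optional[int]) -> str:
    """Map a status code to an assumed verdict label."""
    if status is None:
        return "unreachable"
    if 200 <= status < 400:
        return "ok"
    if status in (403, 405):
        # Some servers reject HEAD; these would go to a browser-backed re-check.
        return "needs_browser"
    return "broken"
```

The `opener` parameter exists only so the function can be exercised without network access; in normal use it defaults to `urllib.request.urlopen`.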

KB + dataset builders

| Script | Purpose |
| --- | --- |
| `build_kb_mirror.py` | Regenerates the entire `kb/policies/<id>.md` tree from `40-data/policy_facts/`. Idempotent. |
| `ingest_kb_summaries.py` | Ingests `kb/policies/*.md` summaries into Chroma so policy meta is retrievable. Carries the HNSW bloat tripwire. |
| `ingest_reviews.py` | Ingests `40-data/reviews/<insurer>.json` into Chroma. Carries the HNSW bloat tripwire. |
| `build_readme_pdf.py` | Renders the master `README.md` to PDF for offline review. |

HF Hub uploads (data-side mirror)

| Script | Target |
| --- | --- |
| `upload_to_hf.py` | Code-side push to the HF Space repo (`huggingface.co/spaces/rohitsar567/InsuranceBot`). |
| `upload_corpus_to_dataset.py`, `upload_extracted_to_dataset.py`, `upload_vectors_to_dataset.py`, `upload_all_to_dataset.py` | Push specific slices of `rag/` to the companion HF Dataset `rohitsar567/insurance-bot-data`. See ADR-020 and ADR-024. |
| `set_hf_secrets.py` | One-shot helper that pushes the runtime secrets into the HF Space (idempotent). Current secret set: `GOOGLE_API_KEY` (Google AI Studio, per ADR-040), `NVIDIA_NIM_API_KEY`, `OPENROUTER_API_KEY`, `SARVAM_API_KEY`, plus admin password / IP allowlist. |

Probes + diagnostics

| Script | What it probes |
| --- | --- |
| `sarvam_probe.py`, `sarvam_nothink_probe.py` | Sarvam-M / Saarika / Bulbul connectivity + latency. |
| `groq_probe.py`, `groq_long_probe.py` | Historical Groq Llama free-tier probe. Groq is no longer in any production chain (removed in ADR-038, not re-added in ADR-040). Kept for benchmarking. |
| `openrouter_probe.py`, `or_models.py` | OpenRouter routing + model-list inspection. Used by KI-178 to audit which `:free` models expose `response_format`. |
| `pdf_probe.py` | pdfplumber parse on a single PDF. The first stop when extraction silently produces empty text. |
| `heavy_smoke_test.py` | End-to-end smoke against the live HF Space (every provider in one call). |

Chunk-size & retrieval sweeps

| Script | Purpose |
| --- | --- |
| `chunk_sweep.py`, `chunk_sweep_diagnostic.py` | Grid-search over chunk size / overlap. Output: `eval/chunk_sweep_results.json`. See ADR-018. |
| `sweep_retrieval.py` | Retrieval-strategy A/B (filter vs no-filter, top-k variants). |
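
A grid search over chunk size and overlap, as `chunk_sweep.py` is described as doing, can be sketched as below. This is a minimal illustration and not the script itself: it chunks by characters, and `score_fn` is a hypothetical stand-in for whatever retrieval metric the real sweep measures (see ADR-018 for the actual method).

```python
from itertools import product
from typing import Callable


def chunk_text(text: str, size: int, overlap: int) -> list:
    """Split text into character windows of `size` that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start_bound = max(len(text) - overlap, 1)
    return [text[start:start + size] for start in range(0, last_start_bound, step)]


def sweep(text: str, sizes: list, overlaps: list, score_fn: Callable) -> list:
    """Score every valid (size, overlap) pair and return rows, best score first."""
    results = []
    for size, overlap in product(sizes, overlaps):
        if overlap >= size:
            continue  # an overlap this large would never advance the window
        chunks = chunk_text(text, size, overlap)
        results.append({
            "size": size,
            "overlap": overlap,
            "n_chunks": len(chunks),
            "score": score_fn(chunks),
        })
    return sorted(results, key=lambda row: row["score"], reverse=True)
```

The sorted rows could then be serialized with `json.dump` into a results file; the layout of `eval/chunk_sweep_results.json` is not documented here, so the row keys above are assumptions.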

Scheduled jobs / shell wrappers

| Path | Purpose |
| --- | --- |
| `install_crons.sh`, `CRON_README.md` | Install the LaunchAgents; the README is the canonical cadence + path reference. |
| `install_git_hooks.sh`, `git-hooks/` | Pre-commit hooks (decimal grep, secret scan, schema validation). |
| `full_pipeline.sh`, `pipeline_finish_all.sh`, `post_extract_deploy.sh`, `reextract_then_deploy.sh`, `quarterly_rebuild.sh` | Multi-step orchestrations (download → extract → ingest → push → smoke). |
| `reconcile_manifest.py` | Drift check between `rag/corpus/_manifest.json` and what's actually on disk. |
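
The drift check that `reconcile_manifest.py` performs amounts to a two-way set difference between manifest entries and files on disk. The sketch below shows that idea only. The manifest layout (`{"files": [...]}` with paths relative to the corpus directory) and the function names are assumptions; the real `_manifest.json` schema is not described in this README.

```python
import json
from pathlib import Path


def load_manifest_paths(manifest_path: Path) -> set:
    """Read relative paths from a manifest assumed to look like {"files": ["a.pdf", ...]}."""
    data = json.loads(manifest_path.read_text(encoding="utf-8"))
    return set(data.get("files", []))


def find_drift(corpus_dir: Path, manifest_name: str = "_manifest.json") -> dict:
    """Compare manifest entries with files on disk, in both directions."""
    listed = load_manifest_paths(corpus_dir / manifest_name)
    on_disk = {
        path.relative_to(corpus_dir).as_posix()
        for path in corpus_dir.rglob("*")
        if path.is_file() and path.name != manifest_name
    }
    return {
        "missing_on_disk": sorted(listed - on_disk),    # listed but not found
        "untracked_on_disk": sorted(on_disk - listed),  # found but not listed
    }
```

Calling `find_drift(Path("rag/corpus"))` would return two sorted lists, both empty when the manifest and the directory agree.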

Subdirectory

audit/ — multi-persona conversational audit framework. See tools/audit/README.md.

Related documents:

- CRON_README.md (this folder): LaunchAgent cadence reference
- ADR-020, ADR-024, ADR-029
- 80-audit/ENTERPRISE_AUDIT.md: defect register, including silent-LaunchAgent regressions (D-002)
