# Presentation Script

## One-line Pitch

I built a grounded QA system over official technical documentation with reproducible artifact generation, benchmark evaluation, and terminal-first usage for both factoid and explanatory questions.

## What It Does

- crawls official docs
- converts them into searchable chunks
- retrieves evidence with sparse and dense search
- fine-tunes the dense retriever on project-generated domain pairs
- reranks passages
- extracts answers with a QA reader
- synthesizes grounded explanatory answers from multiple supporting chunks
- evaluates quality with benchmark metrics

## Why It Is Not Just a Demo

- it has a real artifact pipeline
- it stores local model snapshots and indexes
- it includes benchmark evaluation and error analysis
- it has deterministic rebuild behavior
- it has CI and tests

## Suggested Live Flow

1. Show `scripts/qa_cli.py status`
2. Show `scripts/qa_cli.py ask "Which parameter type can you declare in a FastAPI path operation to set response headers?"`
3. Show `scripts/qa_cli.py ask --style explanatory "How do you set custom response headers in FastAPI, and why does using a Response parameter work?"`
4. Show `scripts/qa_cli.py eval --threshold 0.0`
5. Open `artifacts/real_qa/reports/evaluation_report.md`
6. Open `artifacts/real_qa/reports/error_analysis.md`

## Honest Framing

- this is a serious QA system, not a novelty LLM product
- the main strength is end-to-end retrieval QA engineering with grounded explanatory synthesis
- the next scaling step would be larger supervised training and a larger benchmark
- current snapshot is already reproducible and benchmarked, not just a notebook demo