Mayank Chugh committed on
Commit · 97a5277 · 1 Parent(s): ca3b61b
Implement Milestone 6 by adding audit persistence functionality. Introduce `aiosqlite` as a dependency, update the audit routes to support event logging and retrieval, and enhance the query endpoint to persist audit records. Modify response models to include audit details and ensure proper initialization of the audit database on startup.
- LOGICAL_DEVELOPMENT_SEQUENCE.md +506 -0
- README.md +6 -1
- api/main.py +9 -0
- api/routes/audit.py +26 -7
- api/routes/query.py +14 -3
- models/responses.py +22 -0
- pyproject.toml +1 -0
- requirements.txt +1 -0
- storage/__init__.py +1 -0
- storage/audit_store.py +102 -0
- uv.lock +11 -0
LOGICAL_DEVELOPMENT_SEQUENCE.md
ADDED
|
@@ -0,0 +1,506 @@
|
| 1 |
+
# DocuAudit AI - Milestone Development Sequence (Build -> Verify -> Move On)
|
| 2 |
+
|
| 3 |
+
This guide is written so you can develop strictly milestone-by-milestone and validate your output at each step.
|
| 4 |
+
|
| 5 |
+
Rule for progression:
|
| 6 |
+
- If a milestone verification fails, do not continue to the next milestone.
|
| 7 |
+
- Only move forward when all checks for the current milestone pass.
|
| 8 |
+
|
| 9 |
+
---
|
| 10 |
+
|
| 11 |
+
## Shared Setup and Run Commands
|
| 12 |
+
|
| 13 |
+
### One-time setup
|
| 14 |
+
```bash
|
| 15 |
+
uv venv --python 3.11
|
| 16 |
+
uv init --python 3.11
|
| 17 |
+
uv pip install -r requirements.txt
|
| 18 |
+
cp .env.example .env
|
| 19 |
+
```
|
| 20 |
+
|
| 21 |
+
### Run backend
|
| 22 |
+
```bash
|
| 23 |
+
uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload
|
| 24 |
+
```
|
| 25 |
+
|
| 26 |
+
### Run UI
|
| 27 |
+
```bash
|
| 28 |
+
uv run streamlit run streamlit_app.py
|
| 29 |
+
```
|
| 30 |
+
|
| 31 |
+
### Smoke checks
|
| 32 |
+
```bash
|
| 33 |
+
curl http://localhost:8000/health
|
| 34 |
+
curl http://localhost:8000/docs
|
| 35 |
+
```
|
| 36 |
+
|
| 37 |
+
---
|
| 38 |
+
|
| 39 |
+
## Milestone 1 - FastAPI Foundation
|
| 40 |
+
|
| 41 |
+
### Dependencies
|
| 42 |
+
- `fastapi`
|
| 43 |
+
- `uvicorn[standard]`
|
| 44 |
+
|
| 45 |
+
### After adding dependencies
|
| 46 |
+
- `uv add -r requirements.txt`
|
| 47 |
+
- `uv pip install -r requirements.txt`
|
| 48 |
+
|
| 49 |
+
### Depends on previous milestones
|
| 50 |
+
- None (starting point).
|
| 51 |
+
|
| 52 |
+
### Expected input
|
| 53 |
+
- Fresh repo with Python environment ready.
|
| 54 |
+
- `requirements.txt` and `.env` prepared.
|
| 55 |
+
|
| 56 |
+
### Build scope
|
| 57 |
+
- Create `api/main.py`.
|
| 58 |
+
- Add `GET /health`.
|
| 59 |
+
- App starts with Uvicorn.
|
| 60 |
+
|
| 61 |
+
### Expected output/result
|
| 62 |
+
- API starts without errors.
|
| 63 |
+
- `/health` returns success JSON.
|
| 64 |
+
- Swagger loads at `/docs`.
|
| 65 |
+
|
| 66 |
+
### Start backend server (FastAPI)
|
| 67 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 68 |
+
|
| 69 |
+
### Verification checks
|
| 70 |
+
```bash
|
| 71 |
+
curl http://localhost:8000/health
|
| 72 |
+
curl http://localhost:8000/docs
|
| 73 |
+
```
|
| 74 |
+
|
| 75 |
+
### Pass criteria
|
| 76 |
+
- No startup exception.
|
| 77 |
+
- No 404/500 for `/health` and `/docs`.
|
| 78 |
+
|
| 79 |
+
---
|
| 80 |
+
|
| 81 |
+
## Milestone 2 - Route Skeletons (Placeholder Only)
|
| 82 |
+
|
| 83 |
+
### Dependencies
|
| 84 |
+
- Milestone 1 dependencies only.
|
| 85 |
+
- (No new runtime dependency required if routes are placeholders.)
|
| 86 |
+
|
| 87 |
+
### Depends on previous milestones
|
| 88 |
+
- Milestone 1 must pass (`/health` and `/docs` working).
|
| 89 |
+
|
| 90 |
+
### Expected input
|
| 91 |
+
- Running FastAPI app from Milestone 1.
|
| 92 |
+
- Router module structure available under `api/routes`.
|
| 93 |
+
|
| 94 |
+
### Build scope
|
| 95 |
+
- Add routers:
|
| 96 |
+
- `api/routes/ingest.py`
|
| 97 |
+
- `api/routes/query.py`
|
| 98 |
+
- `api/routes/jobs.py`
|
| 99 |
+
- `api/routes/audit.py`
|
| 100 |
+
- Register routes in `api/main.py`.
|
| 101 |
+
- Keep responses as placeholders only.
|
| 102 |
+
|
| 103 |
+
### Expected output/result
|
| 104 |
+
- All route paths exist and respond.
|
| 105 |
+
- Placeholder payloads return consistently.
|
| 106 |
+
|
| 107 |
+
### Start backend server (FastAPI)
|
| 108 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 109 |
+
|
| 110 |
+
### Verification checks
|
| 111 |
+
- Open Swagger and call each endpoint once.
|
| 112 |
+
- Confirm no 404/500.
|
| 113 |
+
|
| 114 |
+
### Pass criteria
|
| 115 |
+
- Route wiring complete.
|
| 116 |
+
- No real business logic yet.
|
| 117 |
+
|
| 118 |
+
---
|
| 119 |
+
|
| 120 |
+
## Milestone 3 - Config + Request/Response Contracts
|
| 121 |
+
|
| 122 |
+
### Dependencies
|
| 123 |
+
- `pydantic`
|
| 124 |
+
- `pydantic-settings`
|
| 125 |
+
- `python-dotenv`
|
| 126 |
+
|
| 127 |
+
### After adding dependencies
|
| 128 |
+
- `uv add -r requirements.txt`
|
| 129 |
+
- `uv pip install -r requirements.txt`
|
| 130 |
+
|
| 131 |
+
### Depends on previous milestones
|
| 132 |
+
- Milestone 2 route skeletons must be wired and reachable.
|
| 133 |
+
|
| 134 |
+
### Expected input
|
| 135 |
+
- Existing route handlers ready for request body integration.
|
| 136 |
+
- `.env` file available for settings values.
|
| 137 |
+
|
| 138 |
+
### Build scope
|
| 139 |
+
- Add `api/config.py` (env-backed settings).
|
| 140 |
+
- Add `models/requests.py` and `models/responses.py`.
|
| 141 |
+
- Apply request validation in routes.
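
The env-backed settings in the build scope above can be sketched roughly as follows. This is a standard-library stand-in, not the `pydantic-settings` approach the milestone lists. `load_settings`, the `chunk_size` field, the environment variable names `AUDIT_DB_PATH` and `CHUNK_SIZE`, and all default values are assumptions made for illustration; only `get_settings`, `audit_db_path`, and `LLM_PROVIDER` appear elsewhere in this commit.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    # Field names and defaults are illustrative assumptions.
    llm_provider: str
    audit_db_path: str
    chunk_size: int


def load_settings(env: dict[str, str]) -> Settings:
    """Build Settings from a mapping of environment variables."""
    provider = env.get("LLM_PROVIDER", "ollama").lower()
    supported = {"ollama", "openai", "anthropic", "huggingface"}
    if provider not in supported:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
    return Settings(
        llm_provider=provider,
        audit_db_path=env.get("AUDIT_DB_PATH", "data/audit.db"),
        chunk_size=int(env.get("CHUNK_SIZE", "1000")),
    )


@lru_cache
def get_settings() -> Settings:
    # Cached so every caller sees the same settings object.
    return load_settings(dict(os.environ))
```

Passing the environment in as a mapping keeps `load_settings` easy to exercise in tests without touching the real process environment.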
|
| 142 |
+
|
| 143 |
+
### Expected output/result
|
| 144 |
+
- Config values read from `.env`.
|
| 145 |
+
- Valid requests succeed.
|
| 146 |
+
- Invalid payloads show schema errors.
|
| 147 |
+
|
| 148 |
+
### Start backend server (FastAPI)
|
| 149 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 150 |
+
|
| 151 |
+
### Verification checks
|
| 152 |
+
```bash
|
| 153 |
+
curl -X POST http://localhost:8000/query/ask \
|
| 154 |
+
-H "Content-Type: application/json" \
|
| 155 |
+
-d "{\"question\":\"What are key risks?\",\"collection_name\":\"default\"}"
|
| 156 |
+
```
|
| 157 |
+
- Also test one invalid payload (e.g., missing `question`).
|
| 158 |
+
|
| 159 |
+
### Pass criteria
|
| 160 |
+
- Validation behavior is visible and correct.
|
| 161 |
+
|
| 162 |
+
---
|
| 163 |
+
|
| 164 |
+
## Milestone 4 - RAG Ingestion Pipeline (No Answer Generation)
|
| 165 |
+
|
| 166 |
+
### Dependencies
|
| 167 |
+
- `langchain`
|
| 168 |
+
- `langchain-core` (installed transitively with `langchain`)
|
| 169 |
+
- `langchain-chroma`
|
| 170 |
+
- `chromadb`
|
| 171 |
+
- `langchain-community`
|
| 172 |
+
- `langchain-ollama` (if using Ollama embeddings)
|
| 173 |
+
- `langchain-openai` + `openai` (if using OpenAI embeddings)
|
| 174 |
+
- `pymupdf` (PDF loading)
|
| 175 |
+
- `python-multipart` (file upload handling in ingest route)
|
| 176 |
+
|
| 177 |
+
### After adding dependencies
|
| 178 |
+
- `uv add -r requirements.txt`
|
| 179 |
+
- `uv pip install -r requirements.txt`
|
| 180 |
+
|
| 181 |
+
### Depends on previous milestones
|
| 182 |
+
- Milestone 3 contracts/config must pass.
|
| 183 |
+
|
| 184 |
+
### Provider selection
|
| 185 |
+
- Choose provider in `.env` via `LLM_PROVIDER`.
|
| 186 |
+
- Supported values:
|
| 187 |
+
- `ollama`
|
| 188 |
+
- `openai`
|
| 189 |
+
- `anthropic`
|
| 190 |
+
- `huggingface`
|
| 191 |
+
|
| 192 |
+
### Expected input
|
| 193 |
+
- Valid upload route available to accept files.
|
| 194 |
+
- Configuration values for chunking, Chroma path, and provider present.
|
| 195 |
+
- Test documents (PDF/TXT/MD) ready for ingestion.
|
| 196 |
+
|
| 197 |
+
### Build scope
|
| 198 |
+
- Implement:
|
| 199 |
+
- `rag/loader.py`
|
| 200 |
+
- `rag/chunker.py`
|
| 201 |
+
- `rag/embedder.py`
|
| 202 |
+
- `rag/vector_store.py`
|
| 203 |
+
- Wire ingest flow: load -> chunk -> embed -> persist.
|
| 204 |
+
- Preserve metadata:
|
| 205 |
+
- `source`
|
| 206 |
+
- `page`
|
| 207 |
+
- `chunk_index`
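
The metadata requirement above can be sketched with a plain-Python chunker. This is a simplified fixed-size character splitter, not whatever splitter `rag/chunker.py` actually uses; the function name `chunk_pages` and the `chunk_size` and `overlap` parameters are assumptions for illustration.

```python
def chunk_pages(
    source: str,
    pages: list[str],
    chunk_size: int = 200,
    overlap: int = 20,
) -> list[dict]:
    """Split each page into overlapping character windows with metadata."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: list[dict] = []
    for page_number, text in enumerate(pages, start=1):
        for start in range(0, len(text), step):
            piece = text[start:start + chunk_size]
            if not piece.strip():
                continue  # skip whitespace-only windows
            chunks.append(
                {
                    "text": piece,
                    "metadata": {
                        "source": source,
                        "page": page_number,
                        # Running index across the whole document.
                        "chunk_index": len(chunks),
                    },
                }
            )
    return chunks
```

Keeping `source`, `page`, and `chunk_index` on every chunk is what later allows the query response to cite where an answer came from.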
|
| 208 |
+
|
| 209 |
+
### Expected output/result
|
| 210 |
+
- Upload/ingest creates vectors in Chroma.
|
| 211 |
+
- Collection has stored documents/chunks.
|
| 212 |
+
|
| 213 |
+
### Start backend server (FastAPI)
|
| 214 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 215 |
+
|
| 216 |
+
### Verification checks
|
| 217 |
+
- Ingest sample PDF(s).
|
| 218 |
+
- Confirm collection appears and has document count > 0.
|
| 219 |
+
|
| 220 |
+
### Pass criteria
|
| 221 |
+
- Vector persistence works.
|
| 222 |
+
- No LLM answer quality evaluation yet (that is Milestone 5).
|
| 223 |
+
|
| 224 |
+
---
|
| 225 |
+
|
| 226 |
+
## Milestone 5 - Retrieval + Grounded LLM Answer
|
| 227 |
+
|
| 228 |
+
### Dependencies
|
| 229 |
+
- Keep Milestone 4 dependencies.
|
| 230 |
+
- Add provider package(s) for your selected chat LLM:
|
| 231 |
+
- Ollama: `langchain-ollama`
|
| 232 |
+
- OpenAI: `langchain-openai`, `openai`
|
| 233 |
+
- Anthropic: `langchain-anthropic`, `anthropic`
|
| 234 |
+
- Hugging Face endpoint: `langchain-community` (+ API key)
|
| 235 |
+
|
| 236 |
+
### After adding dependencies
|
| 237 |
+
- `uv add -r requirements.txt`
|
| 238 |
+
- `uv pip install -r requirements.txt`
|
| 239 |
+
|
| 240 |
+
### Depends on previous milestones
|
| 241 |
+
- Milestone 4 vectors must exist in Chroma (ingestion verified).
|
| 242 |
+
|
| 243 |
+
### Expected input
|
| 244 |
+
- Ingested collection with non-zero document/chunk vectors.
|
| 245 |
+
- Query endpoint contract from Milestone 3.
|
| 246 |
+
- Valid LLM API key/local model access based on selected provider.
|
| 247 |
+
|
| 248 |
+
### Build scope
|
| 249 |
+
- Implement `rag/retriever.py` to:
|
| 250 |
+
- retrieve top-k chunks
|
| 251 |
+
- format context
|
| 252 |
+
- invoke configured LLM
|
| 253 |
+
- return answer + sources
|
| 254 |
+
- Wire query routes to retriever.
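
The retrieve, format, invoke, return steps above might look roughly like this. It is a toy sketch under stated assumptions: similarity is computed in plain Python instead of through Chroma, the LLM is an injected callable, and every name here (`retrieve_top_k`, `format_context`, `answer_question`, `FALLBACK_ANSWER`, `min_score`) is hypothetical. The commit's real functions are `retrieve_chunks` and `answer_with_grounding`, whose signatures are not shown in this diff.

```python
import math
from typing import Callable

FALLBACK_ANSWER = "I could not find this in the ingested documents."


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec, store, k=3, min_score=0.2):
    """Return up to k (score, chunk) pairs scoring at least min_score, best first."""
    scored = [(cosine(query_vec, item["vector"]), item) for item in store]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def format_context(hits) -> str:
    # Number each chunk so the model can cite it as [1], [2], ...
    return "\n\n".join(
        f"[{i}] ({chunk['source']}, p.{chunk['page']}) {chunk['text']}"
        for i, (_, chunk) in enumerate(hits, start=1)
    )


def answer_question(question, query_vec, store, llm: Callable[[str], str]) -> dict:
    hits = retrieve_top_k(query_vec, store)
    if not hits:
        # Nothing relevant retrieved: skip the LLM and return the safe fallback.
        return {"answer": FALLBACK_ANSWER, "sources": []}
    prompt = (
        "Answer using only the context below.\n\n"
        f"{format_context(hits)}\n\nQuestion: {question}"
    )
    sources = [{"source": c["source"], "page": c["page"]} for _, c in hits]
    return {"answer": llm(prompt), "sources": sources}
```

Short-circuiting before the LLM call when nothing is retrieved is one way to get the safe fallback the milestone expects; a score threshold is an assumption, not something the source specifies.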
|
| 255 |
+
|
| 256 |
+
### Expected output/result
|
| 257 |
+
- Query returns grounded answer based on retrieved chunks.
|
| 258 |
+
- Source citations are included.
|
| 259 |
+
- Empty/no-match returns safe fallback answer.
|
| 260 |
+
|
| 261 |
+
### Start backend server (FastAPI)
|
| 262 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 263 |
+
|
| 264 |
+
### Verification checks
|
| 265 |
+
- Ask question that is clearly answerable from uploaded docs.
|
| 266 |
+
- Ask question not present in docs and verify fallback message.
|
| 267 |
+
|
| 268 |
+
### Pass criteria
|
| 269 |
+
- Retrieve-then-generate flow works reliably.
|
| 270 |
+
|
| 271 |
+
---
|
| 272 |
+
|
| 273 |
+
## Milestone 6 - Audit Persistence
|
| 274 |
+
|
| 275 |
+
### Dependencies
|
| 276 |
+
- `aiosqlite`
|
| 277 |
+
|
| 278 |
+
### After adding dependencies
|
| 279 |
+
- `uv add -r requirements.txt`
|
| 280 |
+
- `uv pip install -r requirements.txt`
|
| 281 |
+
|
| 282 |
+
### Depends on previous milestones
|
| 283 |
+
- Milestone 5 query flow must return answers reliably.
|
| 284 |
+
|
| 285 |
+
### Expected input
|
| 286 |
+
- Query route already producing response payload (`answer`, `sources`, metadata).
|
| 287 |
+
- Writable SQLite path configured in environment.
|
| 288 |
+
|
| 289 |
+
### Build scope
|
| 290 |
+
- Implement `storage/audit_store.py` fully.
|
| 291 |
+
- Persist query request/response metadata.
|
| 292 |
+
- Add audit list/detail retrieval endpoints.
|
| 293 |
+
|
| 294 |
+
### Expected output/result
|
| 295 |
+
- Every query creates an audit record.
|
| 296 |
+
- You can fetch log entries by list and id.
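
The persist, list, and fetch-by-id behaviour expected here can be sketched with the standard-library `sqlite3` module. This is a synchronous stand-in for the `aiosqlite` store in `storage/audit_store.py`, with a trimmed version of its `audit_events` table; the function names `init_db`, `persist_query`, `list_events`, and `get_event` are assumptions for illustration.

```python
import json
import sqlite3
from typing import Optional
from uuid import uuid4


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS audit_events (
            event_id TEXT PRIMARY KEY,
            question TEXT NOT NULL,
            answer TEXT,
            sources_json TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    conn.commit()


def persist_query(conn, question: str, answer: str, sources: list[dict]) -> str:
    event_id = str(uuid4())
    # Parameter placeholders keep user text out of the SQL string.
    conn.execute(
        "INSERT INTO audit_events (event_id, question, answer, sources_json) "
        "VALUES (?, ?, ?, ?)",
        (event_id, question, answer, json.dumps(sources)),
    )
    conn.commit()
    return event_id


def list_events(conn, limit: int = 10, offset: int = 0) -> list[dict]:
    # Newest first; rowid breaks ties between rows written in the same second.
    rows = conn.execute(
        "SELECT event_id, question, created_at FROM audit_events "
        "ORDER BY created_at DESC, rowid DESC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return [dict(zip(("event_id", "question", "created_at"), row)) for row in rows]


def get_event(conn, event_id: str) -> Optional[dict]:
    row = conn.execute(
        "SELECT event_id, question, answer, sources_json FROM audit_events "
        "WHERE event_id = ?",
        (event_id,),
    ).fetchone()
    if row is None:
        return None
    return {
        "event_id": row[0],
        "question": row[1],
        "answer": row[2],
        "sources": json.loads(row[3]),
    }
```

Storing sources as a JSON string mirrors the `sources_json` column in the commit; returning `None` for an unknown id lets the route decide to answer with a 404.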
|
| 297 |
+
|
| 298 |
+
### Start backend server (FastAPI)
|
| 299 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 300 |
+
|
| 301 |
+
### Verification checks
|
| 302 |
+
- Run one query.
|
| 303 |
+
- Fetch corresponding audit entry.
|
| 304 |
+
|
| 305 |
+
### Pass criteria
|
| 306 |
+
- Every query has a matching audit record that can be fetched by id.
|
| 307 |
+
|
| 308 |
+
---
|
| 309 |
+
|
| 310 |
+
## Milestone 7 - Background Ingestion Jobs
|
| 311 |
+
|
| 312 |
+
|
| 313 |
+
### Dependencies
|
| 314 |
+
- No mandatory new package (uses FastAPI background tasks + existing modules).
|
| 315 |
+
|
| 316 |
+
### After adding dependencies (if any)
|
| 317 |
+
- `uv add -r requirements.txt`
|
| 318 |
+
- `uv pip install -r requirements.txt`
|
| 319 |
+
|
| 320 |
+
### Depends on previous milestones
|
| 321 |
+
- Milestone 4 ingestion logic must work synchronously first.
|
| 322 |
+
- Milestone 6 persistence layer should be available for job tracking.
|
| 323 |
+
|
| 324 |
+
### Expected input
|
| 325 |
+
- Working ingest function (`load -> chunk -> add_documents`).
|
| 326 |
+
- Job status storage schema and endpoints available.
|
| 327 |
+
|
| 328 |
+
### Build scope
|
| 329 |
+
- Implement `workers/ingest_worker.py`.
|
| 330 |
+
- Move ingestion processing to background.
|
| 331 |
+
- Track status in jobs endpoints/store.
|
| 332 |
+
|
| 333 |
+
### Expected output/result
|
| 334 |
+
- Upload returns `job_id`.
|
| 335 |
+
- Status transitions:
|
| 336 |
+
- `queued`
|
| 337 |
+
- `processing`
|
| 338 |
+
- `completed` or `failed`
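
The status transitions listed above form a small state machine, which could be enforced along these lines. This is a sketch only: `JobStatus`, `ALLOWED`, `advance`, and `run_job` are hypothetical names, and the real worker in `workers/ingest_worker.py` is not part of this commit.

```python
from enum import Enum
from typing import Callable


class JobStatus(str, Enum):
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Each status maps to the statuses it may move to next.
ALLOWED = {
    JobStatus.QUEUED: {JobStatus.PROCESSING},
    JobStatus.PROCESSING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def advance(current: JobStatus, new: JobStatus) -> JobStatus:
    if new not in ALLOWED[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {new.value}")
    return new


def run_job(ingest: Callable[[], None]) -> tuple:
    """Run ingest() and return (final_status, error_message_or_None)."""
    status = advance(JobStatus.QUEUED, JobStatus.PROCESSING)
    try:
        ingest()
    except Exception as exc:
        return advance(status, JobStatus.FAILED), str(exc)
    return advance(status, JobStatus.COMPLETED), None
```

Rejecting illegal transitions in one place means a job can never appear to move backwards, for example from `completed` to `processing`, when clients poll its status.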
|
| 339 |
+
|
| 340 |
+
### Start backend server (FastAPI)
|
| 341 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 342 |
+
|
| 343 |
+
### Verification checks
|
| 344 |
+
- Upload docs and poll job status endpoint.
|
| 345 |
+
|
| 346 |
+
### Pass criteria
|
| 347 |
+
- API remains responsive while ingestion runs.
|
| 348 |
+
|
| 349 |
+
---
|
| 350 |
+
|
| 351 |
+
## Milestone 8 - Endpoint Completion (Production Shape)
|
| 352 |
+
|
| 353 |
+
### Dependencies
|
| 354 |
+
- `httpx` (URL ingestion/download flow)
|
| 355 |
+
|
| 356 |
+
### After adding dependencies
|
| 357 |
+
- `uv add -r requirements.txt`
|
| 358 |
+
- `uv pip install -r requirements.txt`
|
| 359 |
+
|
| 360 |
+
### Depends on previous milestones
|
| 361 |
+
- Milestones 1 through 7 should be passing individually.
|
| 362 |
+
|
| 363 |
+
### Expected input
|
| 364 |
+
- Stable ingestion, retrieval, audit, and jobs internals.
|
| 365 |
+
- Final request/response models already defined.
|
| 366 |
+
|
| 367 |
+
### Build scope
|
| 368 |
+
- Ensure behavior and contracts are complete for:
|
| 369 |
+
- `POST /ingest/upload`
|
| 370 |
+
- `POST /ingest/url`
|
| 371 |
+
- `GET /ingest/collections`
|
| 372 |
+
- `DELETE /ingest/collection/{collection_name}`
|
| 373 |
+
- `POST /query/ask`
|
| 374 |
+
- `POST /query/summarise`
|
| 375 |
+
- `GET /jobs`
|
| 376 |
+
- `GET /jobs/{job_id}`
|
| 377 |
+
- `GET /audit/logs`
|
| 378 |
+
- `GET /audit/logs/{query_id}`
|
| 379 |
+
|
| 380 |
+
### Expected output/result
|
| 381 |
+
- Full backend flow works from upload to audited answer.
|
| 382 |
+
|
| 383 |
+
### Start backend server
|
| 384 |
+
- `uv run uvicorn api.main:app --host 0.0.0.0 --port 8000 --reload`
|
| 385 |
+
|
| 386 |
+
### Verification checks
|
| 387 |
+
- Run one complete cycle using API only:
|
| 388 |
+
- ingest -> job complete -> ask -> inspect sources -> fetch audit
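
The "job complete" step of this cycle implies polling the job status until it settles. A minimal helper for that, assuming a caller-supplied `get_status` function that returns the status string for a job id (for example by calling `GET /jobs/{job_id}`; the response shape is not shown in this commit, so that part is left to the caller):

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], str],
    job_id: str,
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the job reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} still {status!r} after {timeout}s")
        sleep(interval)
```

The `sleep` parameter is injectable only so the helper can be tested without real delays; `wait_for_job` itself is a hypothetical name.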
|
| 389 |
+
|
| 390 |
+
### Pass criteria
|
| 391 |
+
- No contract mismatches or broken endpoints.
|
| 392 |
+
|
| 393 |
+
---
|
| 394 |
+
|
| 395 |
+
## Milestone 9 - Streamlit UI Integration
|
| 396 |
+
|
| 397 |
+
### Dependencies
|
| 398 |
+
- `streamlit`
|
| 399 |
+
|
| 400 |
+
### After adding dependencies
|
| 401 |
+
- `uv add -r requirements.txt`
|
| 402 |
+
- `uv pip install -r requirements.txt`
|
| 403 |
+
|
| 404 |
+
### Depends on previous milestones
|
| 405 |
+
- Milestone 8 full backend flow must be stable.
|
| 406 |
+
|
| 407 |
+
### Expected input
|
| 408 |
+
- Running backend server with all finalized endpoints.
|
| 409 |
+
- Predictable API payload shapes for upload/query/jobs/audit.
|
| 410 |
+
|
| 411 |
+
### Build scope
|
| 412 |
+
- Connect `streamlit_app.py` to backend API.
|
| 413 |
+
- Include Upload, Jobs, Ask, Summarise, Audit sections.
|
| 414 |
+
|
| 415 |
+
### Expected output/result
|
| 416 |
+
- Full flow works from UI alone.
|
| 417 |
+
|
| 418 |
+
### Start backend server (FastAPI)
|
| 419 |
+
```bash
|
| 420 |
+
uv run uvicorn api.main:app --host 0.0.0.0 --port 8000
|
| 421 |
+
```
|
| 422 |
+
### Start frontend server (Streamlit)
|
| 423 |
+
```bash
|
| 424 |
+
uv run streamlit run streamlit_app.py --server.address=0.0.0.0 --server.port=8501
|
| 425 |
+
```
|
| 426 |
+
|
| 427 |
+
### Verification checks
|
| 428 |
+
- Perform end-to-end cycle from Streamlit without manual curl.
|
| 429 |
+
|
| 430 |
+
### Pass criteria
|
| 431 |
+
- UI reflects backend status and responses correctly.
|
| 432 |
+
|
| 433 |
+
---
|
| 434 |
+
|
| 435 |
+
## Milestone 10 - Tests and Hardening
|
| 436 |
+
|
| 437 |
+
### Dependencies
|
| 438 |
+
- `pytest`
|
| 439 |
+
- `pytest-asyncio`
|
| 440 |
+
|
| 441 |
+
### After adding dependencies
|
| 442 |
+
- `uv add -r requirements.txt`
|
| 443 |
+
- `uv pip install -r requirements.txt`
|
| 444 |
+
|
| 445 |
+
### Depends on previous milestones
|
| 446 |
+
- Milestones 1 through 9 completed and stable enough to test.
|
| 447 |
+
|
| 448 |
+
### Expected input
|
| 449 |
+
- Final endpoint behavior and contracts.
|
| 450 |
+
- Representative sample docs/test data and deterministic test cases.
|
| 451 |
+
|
| 452 |
+
### Build scope
|
| 453 |
+
- Add/update:
|
| 454 |
+
- `tests/test_ingest.py`
|
| 455 |
+
- `tests/test_query.py`
|
| 456 |
+
- `tests/test_audit.py`
|
| 457 |
+
- Cover success + validation + failure paths.
|
| 458 |
+
|
| 459 |
+
### Expected output/result
|
| 460 |
+
- Automated tests pass and catch regressions.
|
| 461 |
+
|
| 462 |
+
### Verification checks
|
| 463 |
+
```bash
|
| 464 |
+
uv run pytest -q
|
| 465 |
+
uv run pytest tests/test_ingest.py -q
|
| 466 |
+
uv run pytest tests/test_query.py -q
|
| 467 |
+
uv run pytest tests/test_audit.py -q
|
| 468 |
+
```
|
| 469 |
+
|
| 470 |
+
### Pass criteria
|
| 471 |
+
- Core behavior is test-covered and stable.
|
| 472 |
+
|
| 473 |
+
---
|
| 474 |
+
|
| 475 |
+
## Milestone-by-Milestone Output Checklist
|
| 476 |
+
|
| 477 |
+
Use this quick gate before advancing:
|
| 478 |
+
|
| 479 |
+
1. Milestone 1: API up + `/health` + `/docs`
|
| 480 |
+
2. Milestone 2: all route stubs reachable
|
| 481 |
+
3. Milestone 3: schema validation enforced
|
| 482 |
+
4. Milestone 4: vectors written to Chroma
|
| 483 |
+
5. Milestone 5: grounded answer + citations
|
| 484 |
+
6. Milestone 6: audit log persisted and fetchable
|
| 485 |
+
7. Milestone 7: background job lifecycle visible
|
| 486 |
+
8. Milestone 8: full API flow complete
|
| 487 |
+
9. Milestone 9: full UI flow complete
|
| 488 |
+
10. Milestone 10: tests passing
|
| 489 |
+
|
| 490 |
+
If any line fails, fix that milestone before moving forward.
|
| 491 |
+
|
| 492 |
+
---
|
| 493 |
+
|
| 494 |
+
## Development Completion Dependency Chain
|
| 495 |
+
|
| 496 |
+
Use this chain to understand what must be complete before a later milestone is considered valid:
|
| 497 |
+
|
| 498 |
+
- Milestone 2 depends on 1
|
| 499 |
+
- Milestone 3 depends on 2
|
| 500 |
+
- Milestone 4 depends on 3
|
| 501 |
+
- Milestone 5 depends on 4
|
| 502 |
+
- Milestone 6 depends on 5
|
| 503 |
+
- Milestone 7 depends on 4 and 6
|
| 504 |
+
- Milestone 8 depends on 1-7
|
| 505 |
+
- Milestone 9 depends on 8
|
| 506 |
+
- Milestone 10 depends on 1-9
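
The chain above can also be checked mechanically. The sketch below encodes the listed dependencies and reports which prerequisites, direct or transitive, are still unmet for a given milestone; `DEPENDS_ON` and `blockers` are hypothetical names.

```python
# Direct prerequisites, transcribed from the dependency chain above.
DEPENDS_ON = {
    1: set(),
    2: {1},
    3: {2},
    4: {3},
    5: {4},
    6: {5},
    7: {4, 6},
    8: set(range(1, 8)),
    9: {8},
    10: set(range(1, 10)),
}


def blockers(milestone: int, passed: set[int]) -> list[int]:
    """Return every unmet prerequisite, direct or transitive, in ascending order."""
    seen: set[int] = set()
    stack = list(DEPENDS_ON[milestone])
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        stack.extend(DEPENDS_ON[dep])
    return sorted(seen - passed)
```

For example, `blockers(7, {1, 2, 3, 4, 5})` reports that Milestone 6 still has to pass before Milestone 7 is considered valid.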
|
README.md
CHANGED
|
@@ -34,4 +34,9 @@ uv run uvicorn api.main:app --host 0.0.0.0 --port 8000
|
|
| 34 |
### Start frontend server (Streamlit)
|
| 35 |
```bash
|
| 36 |
uv run streamlit run streamlit_app.py --server.address=0.0.0.0 --server.port=8501
|
| 37 |
-
```
|
| 34 |
### Start frontend server (Streamlit)
|
| 35 |
```bash
|
| 36 |
uv run streamlit run streamlit_app.py --server.address=0.0.0.0 --server.port=8501
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
git diff --name-status --diff-filter=AM <from_commit_hash> <to_commit_hash>
|
| 40 |
+
|
| 41 |
+
git diff --name-status --diff-filter=AM 18ad0e6c94d041b1fd902e7f9b60113738eee1fa 0f2ee3afa124348adece82df0ff0e5a0943a7b8b
|
| 42 |
+
|
api/main.py
CHANGED
|
@@ -1,4 +1,7 @@
|
|
| 1 |
from fastapi import FastAPI
|
|
|
|
|
|
|
|
|
|
| 2 |
from .routes import audit, ingest, jobs, query
|
| 3 |
|
| 4 |
app = FastAPI()
|
|
@@ -8,6 +11,12 @@ app.include_router(ingest.router)
|
|
| 8 |
app.include_router(jobs.router)
|
| 9 |
app.include_router(query.router)
|
| 10 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 11 |
@app.get("/health", tags=["Health"])
|
| 12 |
def health() -> dict[str, str]:
|
| 13 |
return {"status": "ok","app_name": "doc-audi-ai", "version": "0.1.0"}
|
|
|
|
| 1 |
from fastapi import FastAPI
|
| 2 |
+
|
| 3 |
+
from api.config import get_settings
|
| 4 |
+
from storage.audit_store import init_audit_db
|
| 5 |
from .routes import audit, ingest, jobs, query
|
| 6 |
|
| 7 |
app = FastAPI()
|
|
|
|
| 11 |
app.include_router(jobs.router)
|
| 12 |
app.include_router(query.router)
|
| 13 |
|
| 14 |
+
|
| 15 |
+
@app.on_event("startup")
|
| 16 |
+
async def startup() -> None:
|
| 17 |
+
settings = get_settings()
|
| 18 |
+
await init_audit_db(settings.audit_db_path)
|
| 19 |
+
|
| 20 |
@app.get("/health", tags=["Health"])
|
| 21 |
def health() -> dict[str, str]:
|
| 22 |
return {"status": "ok","app_name": "doc-audi-ai", "version": "0.1.0"}
|
api/routes/audit.py
CHANGED
|
@@ -1,8 +1,11 @@
|
|
| 1 |
from typing import Annotated
|
| 2 |
-
from fastapi import APIRouter, Depends, Query
|
| 3 |
|
|
|
|
|
|
|
|
|
|
| 4 |
from models.requests import AuditListParams
|
| 5 |
-
from models.responses import AuditListResponse
|
|
|
|
| 6 |
|
| 7 |
def _audit_list_params(
|
| 8 |
limit: Annotated[int, Query(ge=1, le=100)] = 10,
|
|
@@ -12,12 +15,28 @@ def _audit_list_params(
|
|
| 12 |
|
| 13 |
router = APIRouter(tags=["audit"])
|
| 14 |
|
| 15 |
-
@router.get("/audit", response_model=AuditListResponse
|
| 16 |
-
def
|
| 17 |
params: Annotated[AuditListParams, Depends(_audit_list_params)],
|
| 18 |
) -> AuditListResponse:
|
|
|
|
|
|
|
|
|
|
| 19 |
return AuditListResponse(
|
| 20 |
-
status="
|
| 21 |
-
message="
|
| 22 |
-
events=
|
| 23 |
)
|
|
|
|
| 1 |
from typing import Annotated
|
|
|
|
| 2 |
|
| 3 |
+
from fastapi import APIRouter, Depends, HTTPException, Query, status
|
| 4 |
+
|
| 5 |
+
from api.config import get_settings
|
| 6 |
from models.requests import AuditListParams
|
| 7 |
+
from models.responses import AuditDetailResponse, AuditEvent, AuditListResponse
|
| 8 |
+
from storage.audit_store import get_audit_event, list_audit_events
|
| 9 |
|
| 10 |
def _audit_list_params(
|
| 11 |
limit: Annotated[int, Query(ge=1, le=100)] = 10,
|
|
|
|
| 15 |
|
| 16 |
router = APIRouter(tags=["audit"])
|
| 17 |
|
| 18 |
+
@router.get("/audit", response_model=AuditListResponse)
|
| 19 |
+
async def audit_list(
|
| 20 |
params: Annotated[AuditListParams, Depends(_audit_list_params)],
|
| 21 |
) -> AuditListResponse:
|
| 22 |
+
settings = get_settings()
|
| 23 |
+
rows = await list_audit_events(settings.audit_db_path, limit=params.limit, offset=params.offset)
|
| 24 |
+
events = [AuditEvent.model_validate(row) for row in rows]
|
| 25 |
return AuditListResponse(
|
| 26 |
+
status="success",
|
| 27 |
+
message=f"Returned {len(events)} audit event(s).",
|
| 28 |
+
events=events,
|
| 29 |
+
)
|
| 30 |
+
|
| 31 |
+
|
| 32 |
+
@router.get("/audit/{event_id}", response_model=AuditDetailResponse)
|
| 33 |
+
async def audit_detail(event_id: str) -> AuditDetailResponse:
|
| 34 |
+
settings = get_settings()
|
| 35 |
+
event = await get_audit_event(settings.audit_db_path, event_id)
|
| 36 |
+
if event is None:
|
| 37 |
+
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Audit event not found.")
|
| 38 |
+
return AuditDetailResponse(
|
| 39 |
+
status="success",
|
| 40 |
+
message="Audit event retrieved.",
|
| 41 |
+
event=event,
|
| 42 |
)
|
api/routes/query.py
CHANGED
|
@@ -6,12 +6,13 @@ from models.responses import QueryResponse, QueryResultItem, QuerySourceItem
|
|
| 6 |
from rag.embedder import create_embedding_function
|
| 7 |
from rag.retriever import answer_with_grounding, retrieve_chunks
|
| 8 |
from rag.vector_store import get_vector_store
|
|
|
|
| 9 |
|
| 10 |
router = APIRouter(tags=["query"])
|
| 11 |
|
| 12 |
|
| 13 |
@router.post("/query", response_model=QueryResponse)
|
| 14 |
-
def query_endpoint(payload: QueryRequest) -> QueryResponse:
|
| 15 |
settings = get_settings()
|
| 16 |
try:
|
| 17 |
embedding_function = create_embedding_function()
|
|
@@ -36,10 +37,20 @@ def query_endpoint(payload: QueryRequest) -> QueryResponse:
|
|
| 36 |
)
|
| 37 |
for chunk in chunks
|
| 38 |
]
|
| 39 |
-
|
| 40 |
status="success",
|
| 41 |
message=f"Retrieved {len(results)} chunks from '{payload.collection_name}' and generated grounded answer.",
|
| 42 |
answer=answer,
|
| 43 |
sources=sources,
|
| 44 |
results=results,
|
| 45 |
-
)
|
| 6 |
from rag.embedder import create_embedding_function
|
| 7 |
from rag.retriever import answer_with_grounding, retrieve_chunks
|
| 8 |
from rag.vector_store import get_vector_store
|
| 9 |
+
from storage.audit_store import persist_query_audit
|
| 10 |
|
| 11 |
router = APIRouter(tags=["query"])
|
| 12 |
|
| 13 |
|
| 14 |
@router.post("/query", response_model=QueryResponse)
|
| 15 |
+
async def query_endpoint(payload: QueryRequest) -> QueryResponse:
|
| 16 |
settings = get_settings()
|
| 17 |
try:
|
| 18 |
embedding_function = create_embedding_function()
|
|
|
|
| 37 |
)
|
| 38 |
for chunk in chunks
|
| 39 |
]
|
| 40 |
+
response = QueryResponse(
|
| 41 |
status="success",
|
| 42 |
message=f"Retrieved {len(results)} chunks from '{payload.collection_name}' and generated grounded answer.",
|
| 43 |
answer=answer,
|
| 44 |
sources=sources,
|
| 45 |
results=results,
|
| 46 |
+
)
|
| 47 |
+
try:
|
| 48 |
+
await persist_query_audit(
|
| 49 |
+
settings.audit_db_path,
|
| 50 |
+
question=payload.question,
|
| 51 |
+
collection_name=payload.collection_name,
|
| 52 |
+
response=response,
|
| 53 |
+
)
|
| 54 |
+
except Exception as exc:
|
| 55 |
+
raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(exc)) from exc
|
| 56 |
+
return response
|
models/responses.py
CHANGED
|
@@ -36,8 +36,30 @@ class JobListResponse(BaseModel):
|
|
| 36 |
class AuditEvent(BaseModel):
|
| 37 |
event_id: str
|
| 38 |
action: str
|
|
|
|
|
|
|
|
|
|
| 39 |
|
| 40 |
class AuditListResponse(BaseModel):
|
| 41 |
status: str
|
| 42 |
message: str
|
| 43 |
events: list[AuditEvent] = Field(default_factory=list)
|
| 36 |
class AuditEvent(BaseModel):
|
| 37 |
event_id: str
|
| 38 |
action: str
|
| 39 |
+
question: str | None = None
|
| 40 |
+
collection_name: str | None = None
|
| 41 |
+
created_at: str | None = None
|
| 42 |
|
| 43 |
class AuditListResponse(BaseModel):
|
| 44 |
status: str
|
| 45 |
message: str
|
| 46 |
events: list[AuditEvent] = Field(default_factory=list)
|
| 47 |
+
|
| 48 |
+
|
| 49 |
+
class AuditDetail(BaseModel):
|
| 50 |
+
event_id: str
|
| 51 |
+
action: str
|
| 52 |
+
question: str
|
| 53 |
+
collection_name: str
|
| 54 |
+
answer: str | None = None
|
| 55 |
+
status: str
|
| 56 |
+
message: str
|
| 57 |
+
sources: list[QuerySourceItem] = Field(default_factory=list)
|
| 58 |
+
results: list[QueryResultItem] = Field(default_factory=list)
|
| 59 |
+
created_at: str
|
| 60 |
+
|
| 61 |
+
|
| 62 |
+
class AuditDetailResponse(BaseModel):
|
| 63 |
+
status: str
|
| 64 |
+
message: str
|
| 65 |
+
event: AuditDetail | None = None
|
pyproject.toml
CHANGED
|
@@ -19,6 +19,7 @@ dependencies = [
|
|
| 19 |
"pydantic-settings==2.3.4",
|
| 20 |
"pymupdf==1.24.3",
|
| 21 |
"python-multipart==0.0.9",
|
|
|
|
| 22 |
"uvicorn[standard]==0.29.0",
|
| 23 |
"huggingface-hub>=1.13.0",
|
| 24 |
"langchain-huggingface>=0.0.3",
|
|
|
|
| 19 |
"pydantic-settings==2.3.4",
|
| 20 |
"pymupdf==1.24.3",
|
| 21 |
"python-multipart==0.0.9",
|
| 22 |
+
"aiosqlite>=0.21.0",
|
| 23 |
"uvicorn[standard]==0.29.0",
|
| 24 |
"huggingface-hub>=1.13.0",
|
| 25 |
"langchain-huggingface>=0.0.3",
|
requirements.txt
CHANGED
|
@@ -13,5 +13,6 @@ openai==1.30.1
|
|
| 13 |
anthropic==0.28.1
|
| 14 |
pymupdf==1.24.3
|
| 15 |
python-multipart==0.0.9
|
|
|
|
| 16 |
huggingface-hub
|
| 17 |
langchain-huggingface
|
|
|
|
| 13 |
anthropic==0.28.1
|
| 14 |
pymupdf==1.24.3
|
| 15 |
python-multipart==0.0.9
|
| 16 |
+
aiosqlite
|
| 17 |
huggingface-hub
|
| 18 |
langchain-huggingface
|
storage/__init__.py
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
|
storage/audit_store.py
ADDED
|
@@ -0,0 +1,102 @@
|
| 1 |
+
import json
|
| 2 |
+
from pathlib import Path
|
| 3 |
+
from typing import Any
|
| 4 |
+
from uuid import uuid4
|
| 5 |
+
|
| 6 |
+
import aiosqlite
|
| 7 |
+
|
| 8 |
+
from models.responses import AuditDetail, QueryResponse
|
| 9 |
+
|
| 10 |
+
|
| 11 |
+
async def init_audit_db(db_path: str) -> None:
|
| 12 |
+
db_file = Path(db_path)
|
| 13 |
+
db_file.parent.mkdir(parents=True, exist_ok=True)
|
| 14 |
+
async with aiosqlite.connect(db_file.as_posix()) as conn:
|
| 15 |
+
await conn.execute(
|
| 16 |
+
"""
|
| 17 |
+
CREATE TABLE IF NOT EXISTS audit_events (
|
| 18 |
+
event_id TEXT PRIMARY KEY,
|
| 19 |
+
action TEXT NOT NULL,
|
| 20 |
+
question TEXT NOT NULL,
|
| 21 |
+
collection_name TEXT NOT NULL,
|
| 22 |
+
answer TEXT,
|
| 23 |
+
status TEXT NOT NULL,
|
| 24 |
+
message TEXT NOT NULL,
|
| 25 |
+
sources_json TEXT NOT NULL,
|
| 26 |
+
results_json TEXT NOT NULL,
|
| 27 |
+
created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
|
| 28 |
+
)
|
| 29 |
+
"""
|
| 30 |
+
)
|
| 31 |
+
await conn.commit()
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
async def persist_query_audit(
|
| 35 |
+
db_path: str,
|
| 36 |
+
*,
|
| 37 |
+
question: str,
|
| 38 |
+
collection_name: str,
|
| 39 |
+
response: QueryResponse,
|
| 40 |
+
) -> str:
|
| 41 |
+
event_id = str(uuid4())
|
| 42 |
+
await init_audit_db(db_path)
|
| 43 |
+
async with aiosqlite.connect(db_path) as conn:
|
| 44 |
+
await conn.execute(
|
| 45 |
+
"""
|
| 46 |
+
INSERT INTO audit_events (
|
| 47 |
+
event_id, action, question, collection_name, answer, status, message, sources_json, results_json
|
| 48 |
+
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
|
| 49 |
+
""",
|
| 50 |
+
(
|
| 51 |
+
event_id,
|
| 52 |
+
"query",
|
| 53 |
+
question,
|
| 54 |
+
collection_name,
|
| 55 |
+
response.answer,
|
| 56 |
+
response.status,
|
| 57 |
+
response.message,
|
| 58 |
+
json.dumps([item.model_dump() for item in response.sources]),
|
| 59 |
+
json.dumps([item.model_dump() for item in response.results]),
|
| 60 |
+
),
|
| 61 |
+
)
|
| 62 |
+
await conn.commit()
|
| 63 |
+
return event_id
|
| 64 |
+
|
| 65 |
+
|
| 66 |
+
async def list_audit_events(db_path: str, *, limit: int, offset: int) -> list[dict[str, Any]]:
|
| 67 |
+
await init_audit_db(db_path)
|
| 68 |
+
async with aiosqlite.connect(db_path) as conn:
|
| 69 |
+
conn.row_factory = aiosqlite.Row
|
| 70 |
+
cursor = await conn.execute(
|
| 71 |
+
"""
|
| 72 |
+
SELECT event_id, action, question, collection_name, created_at
|
| 73 |
+
FROM audit_events
|
| 74 |
+
ORDER BY datetime(created_at) DESC, rowid DESC
|
| 75 |
+
LIMIT ? OFFSET ?
|
| 76 |
+
""",
|
| 77 |
+
(limit, offset),
|
| 78 |
+
)
|
| 79 |
+
rows = await cursor.fetchall()
|
| 80 |
+
return [dict(row) for row in rows]
|
| 81 |
+
|
| 82 |
+
|
| 83 |
+
async def get_audit_event(db_path: str, event_id: str) -> AuditDetail | None:
|
| 84 |
+
await init_audit_db(db_path)
|
| 85 |
+
async with aiosqlite.connect(db_path) as conn:
|
| 86 |
+
conn.row_factory = aiosqlite.Row
|
| 87 |
+
cursor = await conn.execute(
|
| 88 |
+
"""
|
| 89 |
+
SELECT event_id, action, question, collection_name, answer, status, message, sources_json, results_json, created_at
|
| 90 |
+
FROM audit_events
|
| 91 |
+
WHERE event_id = ?
|
| 92 |
+
""",
|
| 93 |
+
(event_id,),
|
| 94 |
+
)
|
| 95 |
+
row = await cursor.fetchone()
|
| 96 |
+
if row is None:
|
| 97 |
+
return None
|
| 98 |
+
|
| 99 |
+
payload = dict(row)
|
| 100 |
+
payload["sources"] = json.loads(payload.pop("sources_json") or "[]")
|
| 101 |
+
payload["results"] = json.loads(payload.pop("results_json") or "[]")
|
| 102 |
+
return AuditDetail.model_validate(payload)
|
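For readers who want to try the persistence pattern from `storage/audit_store.py` without the rest of the project, the following is a minimal sketch of the same write-then-read round trip. It is not the commit's code: it swaps `aiosqlite` for the synchronous standard-library `sqlite3` module, uses plain dicts in place of the `QueryResponse` and `AuditDetail` models, and the helper names `persist` and `fetch`, the in-memory database, and the sample values are all assumptions made for illustration.

```python
import json
import sqlite3
from uuid import uuid4

# Same table layout as the audit_events table created in init_audit_db.
SCHEMA = """
CREATE TABLE IF NOT EXISTS audit_events (
    event_id TEXT PRIMARY KEY,
    action TEXT NOT NULL,
    question TEXT NOT NULL,
    collection_name TEXT NOT NULL,
    answer TEXT,
    status TEXT NOT NULL,
    message TEXT NOT NULL,
    sources_json TEXT NOT NULL,
    results_json TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def persist(conn, *, question, collection_name, answer, status, message, sources, results):
    """Hypothetical helper: insert one audit row, storing the lists as JSON text."""
    event_id = str(uuid4())
    conn.execute(
        """
        INSERT INTO audit_events (
            event_id, action, question, collection_name, answer,
            status, message, sources_json, results_json
        ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
        """,
        (event_id, "query", question, collection_name, answer,
         status, message, json.dumps(sources), json.dumps(results)),
    )
    conn.commit()
    return event_id


def fetch(conn, event_id):
    """Hypothetical helper: read one row back and decode the JSON columns."""
    conn.row_factory = sqlite3.Row  # rows become mapping-like, so dict(row) works
    row = conn.execute(
        "SELECT * FROM audit_events WHERE event_id = ?", (event_id,)
    ).fetchone()
    if row is None:
        return None
    payload = dict(row)
    payload["sources"] = json.loads(payload.pop("sources_json") or "[]")
    payload["results"] = json.loads(payload.pop("results_json") or "[]")
    return payload


conn = sqlite3.connect(":memory:")  # assumption: throwaway in-memory database
conn.execute(SCHEMA)
eid = persist(
    conn,
    question="What is the retention policy?",
    collection_name="policies",
    answer=None,
    status="success",
    message="ok",
    sources=[{"source": "policy.pdf", "page": 3}],
    results=[],
)
event = fetch(conn, eid)
```

In this sketch `created_at` is never passed on insert, so SQLite fills it from `DEFAULT CURRENT_TIMESTAMP`, which is the same reason the commit's `INSERT` lists only nine columns.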
uv.lock
CHANGED
|
@@ -133,6 +133,15 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/fb/76/641ae371508676492379f16e2fa48f4e2c11741bd63c48be4b12a6b09cba/aiosignal-1.4.0-py3-none-any.whl", hash = "sha256:053243f8b92b990551949e63930a839ff0cf0b0ebbe0597b0f3fb19e1a0fe82e", size = 7490, upload-time = "2025-07-03T22:54:42.156Z" },
 ]
 
+[[package]]
+name = "aiosqlite"
+version = "0.22.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/4e/8a/64761f4005f17809769d23e518d915db74e6310474e733e3593cfc854ef1/aiosqlite-0.22.1.tar.gz", hash = "sha256:043e0bd78d32888c0a9ca90fc788b38796843360c855a7262a532813133a0650", size = 14821, upload-time = "2025-12-23T19:25:43.997Z" }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/00/b7/e3bf5133d697a08128598c8d0abc5e16377b51465a33756de24fa7dee953/aiosqlite-0.22.1-py3-none-any.whl", hash = "sha256:21c002eb13823fad740196c5a2e9d8e62f6243bd9e7e4a1f87fb5e44ecb4fceb", size = 17405, upload-time = "2025-12-23T19:25:42.139Z" },
+]
+
 [[package]]
 name = "annotated-doc"
 version = "0.0.4"
@@ -524,6 +533,7 @@ name = "doc-audi-ai"
 version = "0.1.0"
 source = { virtual = "." }
 dependencies = [
+    { name = "aiosqlite" },
     { name = "anthropic" },
     { name = "chromadb" },
     { name = "fastapi" },
@@ -547,6 +557,7 @@ dependencies = [
 
 [package.metadata]
 requires-dist = [
+    { name = "aiosqlite", specifier = ">=0.21.0" },
     { name = "anthropic", specifier = "==0.28.1" },
     { name = "chromadb", specifier = "==0.5.0" },
     { name = "fastapi", specifier = "==0.111.0" },
|