# Interfaces

> Reference document listing every external API the system depends on, and every important module-to-module contract inside the codebase. Use this when changing an integration or refactoring an internal module to understand what might break.

## External interfaces (the system calls out)

### OpenAI Chat Completions API

| Aspect | Value |
|---|---|
| Endpoint | `https://api.openai.com/v1/chat/completions` |
| Auth | `Authorization: Bearer ${OPENAI_API_KEY}` |
| Wrapper | `src/llm/adapters.py` (via `OpenAIClient` + `LLMRegistry`) |
| Models in use | `gpt-4o-mini` (cost-efficient: query rewriting, answer generation), `gpt-4.1` (strong: query analysis) |
| Configured at | `src/config/settings.yaml::reader.OPENAI` and `reader.OPENAI_STRONG` |
| Called from | `BaseMultiAgentChatbot._analyze_query_context`, `_rewrite_query_for_rag`; `MultiAgentRAGChatbot._generate_conversational_response*` |
| Failure mode | Caught at every call site; returns a sensible fallback (original query, generic error message) |
| Latency budget | 1-5 s per call typical; tolerated up to 30 s before the user sees a "thinking" timeout (Streamlit default) |
| Rate limits | OpenAI's standard rate limits per account/key tier; not actively monitored by the app |

### Qdrant Cloud API

| Aspect | Value |
|---|---|
| Endpoint | `${QDRANT_URL}` (set as HF Space secret) |
| Auth | `api-key: ${QDRANT_API_KEY}` header |
| Protocol | gRPC preferred (configurable via `prefer_grpc: true` in settings); HTTPS fallback |
| Wrapper | `src/vectorstore.py::VectorStoreManager` (uses `langchain_qdrant.Qdrant`) |
| Collection | `BAAI-bge-m3-full` (configured in `settings.yaml::qdrant.collection_name`) |
| Operations called | `similarity_search_with_score` (vector search), `count` (pre-validation), `scroll` (metadata cache rebuild), `create_payload_index` (one-off setup) |
| Called from | `src/retrieval/context.py`, `src/retrieval/filter.py` (`MetadataCache._fetch_from_qdrant`), `src/agents/agent_filtering.py` (`_prevalidate_filters`) |
| Failure mode | Connection failure at startup → chatbot init fails, app shows error banner. Query failure → caught, returns empty results. |
| Latency budget | `similarity_search`: ~200-500 ms typical. `count`: <100 ms typical. `scroll`: ~80 s on full collection (only on cold start without disk cache). |
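
The reason the ~80 s `scroll` only happens on a cold start is the disk cache in front of it. A minimal sketch of that read-through pattern follows; the JSON format, the `load_or_rebuild` name, and the `rebuild` callback are assumptions for illustration and do not describe how `MetadataCache` is actually implemented.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def load_or_rebuild(cache_path: Path, rebuild: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Return cached metadata from disk, or rebuild it (the slow path) and persist it."""
    if cache_path.exists():
        try:
            return json.loads(cache_path.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            pass  # unreadable or corrupt cache: fall through to a rebuild

    data = rebuild()  # in the real system this would be the full-collection scroll
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Only the first call pays the rebuild cost; later calls read the file, which is what keeps warm starts fast.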

### Hugging Face Hub: model file downloads

| Aspect | Value |
|---|---|
| Endpoint | `https://huggingface.co/<model>/resolve/main/*` |
| Auth | Public read, no auth needed for our models |
| Wrapper | `transformers` library (transitive via `sentence-transformers` and `langchain_huggingface`) |
| Models | `BAAI/bge-m3` (embeddings), `BAAI/bge-reranker-v2-m3` (reranker) |
| When called | Only at **Docker build time** (`download_models.py`); pre-populated cache in image avoids runtime downloads |
| Failure mode | Build fails; deploys blocked until HF Hub is reachable |

### Hugging Face Hub: dataset push (logging)

| Aspect | Value |
|---|---|
| Endpoint | `https://huggingface.co/api/datasets/GIZ/spaces_logs/*` |
| Auth | `Bearer ${SPACES_LOG}` (write token, set as HF Space secret) |
| Wrapper | `src/logging.py` (via `huggingface_hub.HfApi`) |
| What's pushed | Conversational JSON logs (audit trail) |
| Called from | `BaseMultiAgentChatbot.chat()` after each turn |
| Failure mode | Caught silently; logs an error but doesn't fail the user request |

### Ollama (optional, local development only)

| Aspect | Value |
|---|---|
| Endpoint | `${OLLAMA_BASE_URL}` (e.g. `http://localhost:11434/`) |
| Auth | None |
| Wrapper | `src/llm/adapters.py` (via `langchain_ollama.OllamaLLM`) |
| Status | **Not used in production**. Available for local dev where running OpenAI calls would be expensive or impossible offline. |

## Internal interfaces (module to module within the codebase)

### `app.py` → `BaseMultiAgentChatbot.chat()`

The **only** call from the Streamlit layer into the agent layer.

| Aspect | Value |
|---|---|
| Signature | `chat(user_input: str, conversation_id: str = "default") -> Dict[str, Any]` |
| Input | `user_input` may include a `FILTER CONTEXT:` preamble with sidebar selections; `conversation_id` is the per-Streamlit-session UUID |
| Output | Dict with keys: `response` (str), `rag_result` (PipelineResult), `agent_logs` (list), `relaxation_notes` (list), `gap_follow_up` (str or None) |
| Stability contract | This signature should be considered stable. Streamlit, tests, and any future front-ends depend on it. |
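
Because front-ends and tests depend on the output keys, it can help to pin them down as a type. The sketch below is one way to express the contract; `ChatResult` and `missing_keys` are hypothetical names, `rag_result` is typed as `Any` here only to keep the example self-contained, and the element types of the two lists are not specified by this document.

```python
from typing import Any, Dict, List, Optional, TypedDict


class ChatResult(TypedDict):
    """Shape of the dict returned by chat(), as listed in the table above."""
    response: str
    rag_result: Any                 # PipelineResult in the real code
    agent_logs: List[Any]
    relaxation_notes: List[Any]
    gap_follow_up: Optional[str]


REQUIRED_KEYS = frozenset(ChatResult.__annotations__)


def missing_keys(result: Dict[str, Any]) -> List[str]:
    """Return the contract keys absent from `result`, sorted for stable output."""
    return sorted(REQUIRED_KEYS - set(result))
```

A contract test could then assert `missing_keys(chatbot.chat("hello")) == []` so that an accidental key removal fails fast.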

### `MultiAgentRAGChatbot._perform_retrieval()` → `PipelineManager.run()`

How the agent triggers retrieval.

| Aspect | Value |
|---|---|
| Caller | `MultiAgentRAGChatbot._perform_retrieval` (subclass implementation of an abstract method) |
| Callee | `src/pipeline.py::PipelineManager.run(query, sources, auto_infer_filters, filters, skip_answer, ...)` |
| Key call-site arguments | `auto_infer_filters=False` (filter inference already happened upstream); `skip_answer=True` (answer generation happens in the agent, not the pipeline) |
| Returns | `PipelineResult` with `.sources` (List[Document]), `.answer` (always empty when `skip_answer=True`), `.metadata` |

### `PipelineManager.run()` → `ContextRetriever.retrieve_context()`

How the pipeline triggers actual vector search.

| Aspect | Value |
|---|---|
| Caller | `src/pipeline.py::PipelineManager.run` |
| Callee | `src/retrieval/context.py::ContextRetriever.retrieve_context(query, reports, sources, subtype, year, district, filenames, entity_type, use_reranking, top_k, ...)` |
| Returns | `List[Document]` with metadata fields `original_score`, `reranked_score` (if reranker applied), `reranking_applied`, plus the underlying Qdrant payload (year, district, source, filename, page, etc.) |
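
Consumers of the returned documents need to decide which score field to trust. The sketch below shows one plausible rule (prefer `reranked_score` when `reranking_applied` is true). The rule and the function names are assumptions for illustration, and this is not the project's `_best_score`; plain metadata dicts stand in for `Document.metadata`.

```python
from typing import Any, Dict, List, Optional


def effective_score(metadata: Dict[str, Any]) -> Optional[float]:
    """Prefer the reranked score when reranking was applied, else the original score."""
    if metadata.get("reranking_applied") and metadata.get("reranked_score") is not None:
        return metadata["reranked_score"]
    return metadata.get("original_score")


def best_score(all_metadata: List[Dict[str, Any]]) -> Optional[float]:
    """Highest effective score across documents, or None when there are none."""
    scores = [s for s in map(effective_score, all_metadata) if s is not None]
    return max(scores) if scores else None
```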

### Mixin contracts (intra-`src/agents/`)

| Mixin | Public attributes it sets on `self` | Methods it exposes |
|---|---|---|
| `_MetadataMixin` (`metadata.py`) | `self.year_whitelist`, `self.source_whitelist`, `self.district_whitelist`, `self.db_metadata_context`, `self.district_doc_counts`, `self.current_year`, `self.latest_data_year`, `self.earliest_data_year`, `self.UGANDA_REGIONS` | `_load_dynamic_data()`, `_load_db_metadata(vectorstore)`, `_normalize_district_name(s)` |
| `_FiltersMixin` (`agent_filtering.py`) | None | `_best_score(sources)` (static), `_prevalidate_filters(filters, anchored_keys)`, `_post_relaxation_relevance_check(sources, anchored_keys, original_filters)`, `_normalize_source_name(raw)`, `_llm_overrides_ui(...)` (static), `_validate_filter_values(filters)` |
| `_ConversationHistoryMixin` (`conversation_history.py`) | None | `_load_conversation(file_path)`, `_save_conversation(file_path, conversation)` |

All mixin methods are called via `self.X()` in `BaseMultiAgentChatbot`. The orchestrator's call sites look the same regardless of which file defines a method, because Python's method resolution order (MRO) handles the dispatch.
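
The dispatch mechanism can be seen in a toy example. The class names below (`_NormalizeMixin`, `_CountsMixin`, `DemoChatbot`) are hypothetical and only mirror the pattern: one mixin that sets attributes on `self`, one that only exposes methods, and an orchestrator that calls both through `self`.

```python
from typing import Dict


class _NormalizeMixin:
    """Hypothetical mixin that exposes a helper and sets no attributes."""

    def _normalize(self, raw: str) -> str:
        return raw.strip().lower()


class _CountsMixin:
    """Hypothetical mixin that sets a public attribute on `self`."""

    counts: Dict[str, int]

    def _load_counts(self) -> None:
        self.counts = {"kampala": 3}


class DemoChatbot(_NormalizeMixin, _CountsMixin):
    """Orchestrator: calls mixin methods via self.X(); the MRO finds them."""

    def __init__(self) -> None:
        self._load_counts()

    def lookup(self, raw: str) -> int:
        return self.counts.get(self._normalize(raw), 0)
```

Moving `_normalize` to another mixin or file would not change `lookup`, as long as the class providing it stays in `DemoChatbot`'s bases.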

### Filter-construction APIs in `src/retrieval/filter.py`

Two coexisting filter constructors, with different consumers (see [ADR 002](architecture/adrs/002-dense-only-retrieval-hybrid-disabled.md) and DEFERRED #1 for the rationale):

| Function | Signature | Used by |
|---|---|---|
| `build_qdrant_filter_from_dict(filters: dict)` | Dict-based, returns `Optional[Filter]` | `_FiltersMixin._prevalidate_filters` (for cheap `count()` queries before retrieval) |
| `create_filter(reports=, sources=, subtype=, year=, district=, filenames=, entity_type=)` | Kwarg-based, returns `Filter`; handles filename mutual-exclusivity | `src/retrieval/hybrid.py`, `src/retrieval/context.py` (for actual vector-search queries) |
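
To make the dict-based style concrete, here is a minimal sketch that turns a filter dict into the JSON shape Qdrant's REST API accepts for filters (`must` conditions with `match.value` or `match.any`). It is an illustration only: the real `build_qdrant_filter_from_dict` returns a `Filter` model, and the function name `build_filter_payload` and the flat payload key names used here are assumptions.

```python
from typing import Any, Dict, List, Optional


def build_filter_payload(filters: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Build a Qdrant-style filter dict; return None when nothing is constrained."""
    conditions: List[Dict[str, Any]] = []
    for key, value in filters.items():
        is_sequence = isinstance(value, (list, tuple))
        if value is None or (is_sequence and not value):
            continue  # unset or empty selection: no constraint on this key
        if is_sequence:
            # Several accepted values: match any of them
            conditions.append({"key": key, "match": {"any": list(value)}})
        else:
            # Single value: exact match
            conditions.append({"key": key, "match": {"value": value}})
    return {"must": conditions} if conditions else None
```

Returning `None` for an empty filter mirrors the `Optional[Filter]` return type in the table: the caller can skip filtering entirely instead of sending an empty condition list.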

### LangGraph state contract (`MultiAgentState`)

Defined in `src/agents/state.py`. Every agent node reads from and writes to a shared `MultiAgentState` (a TypedDict). Keys that travel between nodes:

| Key | Written by | Read by |
|---|---|---|
| `current_query` | `chat()` | All agents |
| `messages` | `chat()` (after a turn completes) | `_main_agent` (for context), `_rewrite_query_for_rag` |
| `query_context` | `_main_agent._analyze_query_context` | `_route_after_main`, `_rag_agent`, `_response_agent` |
| `rag_filters` | `_rag_agent._build_filters` | `_response_agent` |
| `anchored_filter_keys` | `_rag_agent._build_filters` | `_response_agent` (relaxation logic) |
| `rag_query` | `_response_agent._rewrite_query_for_rag` (after prevalidation succeeds) | `_response_agent._perform_retrieval` |
| `retrieved_documents` | `_response_agent` (after retrieval) | `_response_agent._generate_conversational_response` |
| `final_response` | `_response_agent`, or pre-validation early-exit | `chat()` (returned to caller) |
| `agent_logs` | All agents (append-only) | `chat()` (returned to caller for debugging) |
| `relaxation_notes` | `_response_agent` (during/after relaxation) | `chat()` (returned to caller) |
| `gap_follow_up` | `_response_agent` (when pre-validation finds an all-anchored gap) | `chat()` |
| `conversation_context["last_filters"]` | `_main_agent` (end of turn) | `_main_agent._analyze_query_context` (next turn's filter carryover) |
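
As a rough picture of what such a state type looks like, the sketch below declares the keys from the table as a `TypedDict` and shows an append-only update to `agent_logs`. The value types, the `MultiAgentStateSketch` name, and the `append_log` helper are assumptions; the authoritative definition lives in `src/agents/state.py`.

```python
from typing import Any, Dict, List, Optional, TypedDict


class MultiAgentStateSketch(TypedDict, total=False):
    """Keys from the state-contract table; value types are illustrative guesses."""
    current_query: str
    messages: List[Any]
    query_context: Dict[str, Any]
    rag_filters: Dict[str, Any]
    anchored_filter_keys: List[str]
    rag_query: str
    retrieved_documents: List[Any]
    final_response: str
    agent_logs: List[Dict[str, Any]]
    relaxation_notes: List[str]
    gap_follow_up: Optional[str]
    conversation_context: Dict[str, Any]


def append_log(state: MultiAgentStateSketch, agent: str, message: str) -> MultiAgentStateSketch:
    """Return a copy of `state` with one log entry appended; earlier entries are kept."""
    new_state: MultiAgentStateSketch = dict(state)  # type: ignore[assignment]
    new_state["agent_logs"] = [*state.get("agent_logs", []), {"agent": agent, "message": message}]
    return new_state
```

Building a new list instead of mutating the existing one keeps the append-only guarantee easy to reason about: a node can add entries but cannot rewrite what earlier nodes logged.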

## Versioning and stability

- **External APIs**: OpenAI and Qdrant are vendor-controlled; we depend on their API stability. Both have versioning policies; we're using stable public versions.
- **Internal APIs**: no formal versioning; changes to mixin signatures or agent state keys are coordinated within the same PR. The code is small enough that this works without overhead.

---

*Related:* [`docs/system-requirements.md`](system-requirements.md) details what external services must be available for the system to function. [`docs/architecture/05-deployment-view.md`](architecture/05-deployment-view.md) shows the deployment topology of these interfaces.