Interfaces
Reference document listing every external API the system depends on, and every important module-to-module contract inside the codebase. Use this when changing an integration or refactoring an internal module to understand what might break.
External interfaces (the system calls out)
OpenAI Chat Completions API
| Aspect | Value |
| --- | --- |
| Endpoint | https://api.openai.com/v1/chat/completions |
| Auth | Authorization: Bearer ${OPENAI_API_KEY} |
| Wrapper | src/llm/adapters.py (via OpenAIClient + LLMRegistry) |
| Models in use | gpt-4o-mini (cost-efficient: query rewriting, answer generation), gpt-4.1 (strong: query analysis) |
| Configured at | src/config/settings.yaml::reader.OPENAI and reader.OPENAI_STRONG |
| Called from | BaseMultiAgentChatbot._analyze_query_context, _rewrite_query_for_rag; MultiAgentRAGChatbot._generate_conversational_response* |
| Failure mode | Caught at every call site; returns a sensible fallback (original query, generic error message) |
| Latency budget | 1-5 s per call typical; tolerated up to 30 s before the user sees a "thinking" timeout (Streamlit default) |
| Rate limits | OpenAI's standard rate limits per account/key tier; not actively monitored by the app |
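The failure mode in the table above (catch at the call site, fall back to the original query) can be illustrated with a minimal sketch. The helper name rewrite_query_with_fallback and the injected rewrite callable are hypothetical and not taken from src/llm/adapters.py; treating an empty completion as a failure is also an assumption.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def rewrite_query_with_fallback(rewrite: Callable[[str], str], query: str) -> str:
    """Return the rewritten query, or the original query if the LLM call fails.

    `rewrite` is a hypothetical stand-in for whatever function performs the
    OpenAI call; it is injected so the fallback logic can run offline.
    """
    try:
        rewritten = rewrite(query)
    except Exception as exc:  # broad on purpose: any LLM failure degrades gracefully
        logger.warning("Query rewrite failed, using original query: %s", exc)
        return query
    # Assumption: a blank completion is treated like a failure.
    return rewritten.strip() or query


def failing_llm(_: str) -> str:
    """Simulates a timeout from the remote API."""
    raise TimeoutError("simulated OpenAI timeout")
```

With this shape, a caller never has to distinguish a successful rewrite from a degraded one: it always receives a usable query string.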
Qdrant Cloud API
| Aspect | Value |
| --- | --- |
| Endpoint | ${QDRANT_URL} (set as HF Space secret) |
| Auth | api-key: ${QDRANT_API_KEY} header |
| Protocol | gRPC preferred (configurable via prefer_grpc: true in settings); HTTPS fallback |
| Wrapper | src/vectorstore.py::VectorStoreManager (uses langchain_qdrant.Qdrant) |
| Collection | BAAI-bge-m3-full (configured in settings.yaml::qdrant.collection_name) |
| Operations called | similarity_search_with_score (vector search), count (pre-validation), scroll (metadata cache rebuild), create_payload_index (one-off setup) |
| Called from | src/retrieval/context.py, src/retrieval/filter.py (MetadataCache._fetch_from_qdrant), src/agents/agent_filtering.py (_prevalidate_filters) |
| Failure mode | Connection failure at startup → chatbot init fails, app shows error banner. Query failure → caught, returns empty results. |
| Latency budget | similarity_search: ~200-500 ms typical. count: <100 ms typical. scroll: ~80 s on full collection (only on cold start without disk cache). |
Hugging Face Hub: model file downloads
| Aspect | Value |
| --- | --- |
| Endpoint | https://huggingface.co/<model>/resolve/main/* |
| Auth | Public read, no auth needed for our models |
| Wrapper | transformers library (transitive via sentence-transformers and langchain_huggingface) |
| Models | BAAI/bge-m3 (embeddings), BAAI/bge-reranker-v2-m3 (reranker) |
| When called | Only at Docker build time (download_models.py); pre-populated cache in image avoids runtime downloads |
| Failure mode | Build fails; deploys blocked until HF Hub is reachable |
Hugging Face Hub: dataset push (logging)
| Aspect | Value |
| --- | --- |
| Endpoint | https://huggingface.co/api/datasets/GIZ/spaces_logs/* |
| Auth | Bearer ${SPACES_LOG} (write token, set as HF Space secret) |
| Wrapper | src/logging.py (via huggingface_hub.HfApi) |
| What's pushed | Conversational JSON logs (audit trail) |
| Called from | BaseMultiAgentChatbot.chat() after each turn |
| Failure mode | Caught silently; logs an error but doesn't fail the user request |
Ollama (optional, local development only)
| Aspect | Value |
| --- | --- |
| Endpoint | ${OLLAMA_BASE_URL} (e.g. http://localhost:11434/) |
| Auth | None |
| Wrapper | src/llm/adapters.py (via langchain_ollama.OllamaLLM) |
| Status | Not used in production. Available for local dev where running OpenAI calls would be expensive or impossible offline. |
Internal interfaces (module to module within the codebase)
app.py → BaseMultiAgentChatbot.chat()
The only call from the Streamlit layer into the agent layer.
| Aspect | Value |
| --- | --- |
| Signature | chat(user_input: str, conversation_id: str = "default") -> Dict[str, Any] |
| Input | user_input may include a FILTER CONTEXT: preamble with sidebar selections; conversation_id is the per-Streamlit-session UUID |
| Output | Dict with keys: response (str), rag_result (PipelineResult), agent_logs (list), relaxation_notes (list), gap_follow_up (str or None) |
| Stability contract | This signature should be considered stable. Streamlit, tests, and any future front-ends depend on it. |
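As an illustration of the contract above, here is a minimal sketch of a contract check a front-end or test could run against the result of chat(). StubChatbot and check_chat_contract are hypothetical names used only for this example; only the signature and the five output keys come from the table.

```python
from typing import Any, Dict

# The five keys documented for the chat() return value.
EXPECTED_KEYS = {"response", "rag_result", "agent_logs", "relaxation_notes", "gap_follow_up"}


class StubChatbot:
    """Hypothetical stand-in for BaseMultiAgentChatbot; only the contract is real."""

    def chat(self, user_input: str, conversation_id: str = "default") -> Dict[str, Any]:
        return {
            "response": f"echo: {user_input}",
            "rag_result": None,  # a PipelineResult in the real system
            "agent_logs": [f"conversation={conversation_id}"],
            "relaxation_notes": [],
            "gap_follow_up": None,
        }


def check_chat_contract(result: Dict[str, Any]) -> None:
    """Raise if the result does not carry the documented keys and types."""
    missing = EXPECTED_KEYS - set(result)
    if missing:
        raise KeyError(f"chat() result is missing keys: {sorted(missing)}")
    if not isinstance(result["response"], str):
        raise TypeError("response must be a str")
    if result["gap_follow_up"] is not None and not isinstance(result["gap_follow_up"], str):
        raise TypeError("gap_follow_up must be a str or None")


result = StubChatbot().chat("What changed in 2022?", conversation_id="session-1234")
check_chat_contract(result)
```

A check along these lines in the test suite would flag an accidental rename or removal of a key before a front-end notices it.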
MultiAgentRAGChatbot._perform_retrieval() → PipelineManager.run()
How the agent triggers retrieval.
| Aspect | Value |
| --- | --- |
| Caller | MultiAgentRAGChatbot._perform_retrieval (subclass implementation of an abstract method) |
| Callee | src/pipeline.py::PipelineManager.run(query, sources, auto_infer_filters, filters, skip_answer, ...) |
| Key call-site arguments | auto_infer_filters=False (we did filter inference upstream); skip_answer=True (we do answer generation in the agent, not the pipeline) |
| Returns | PipelineResult with .sources (List[Document]), .answer (always empty when skip_answer=True), .metadata |
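The call-site convention above can be sketched with stand-in types. StubPipelineManager, the PipelineResult dataclass shown here, and the default values of the run() parameters are assumptions for illustration; only the parameter names, the two call-site arguments, and the three result attributes come from the table.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class PipelineResult:
    """Stand-in with the three documented attributes."""
    sources: List[Any] = field(default_factory=list)
    answer: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)


class StubPipelineManager:
    """Hypothetical stand-in for src/pipeline.py::PipelineManager."""

    def run(
        self,
        query: str,
        sources: Optional[List[str]] = None,
        auto_infer_filters: bool = True,  # default is an assumption
        filters: Optional[Dict[str, Any]] = None,
        skip_answer: bool = False,  # default is an assumption
    ) -> PipelineResult:
        docs = [f"document matching {query!r}"]
        # When skip_answer is set, the pipeline leaves .answer empty.
        answer = "" if skip_answer else "pipeline-generated answer"
        meta = {"filters": filters or {}, "auto_infer_filters": auto_infer_filters}
        return PipelineResult(sources=docs, answer=answer, metadata=meta)


def perform_retrieval(pipeline: StubPipelineManager, rag_query: str,
                      rag_filters: Dict[str, Any]) -> PipelineResult:
    """Agent-side call: filters were inferred upstream, the answer is generated later."""
    return pipeline.run(
        query=rag_query,
        filters=rag_filters,
        auto_infer_filters=False,
        skip_answer=True,
    )


retrieval = perform_retrieval(StubPipelineManager(), "health budget", {"year": ["2022"]})
```

Passing both flags explicitly at the call site keeps the agent independent of whatever defaults the pipeline chooses.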
PipelineManager.run() → ContextRetriever.retrieve_context()
How the pipeline triggers actual vector search.
| Aspect | Value |
| --- | --- |
| Caller | src/pipeline.py::PipelineManager.run |
| Callee | src/retrieval/context.py::ContextRetriever.retrieve_context(query, reports, sources, subtype, year, district, filenames, entity_type, use_reranking, top_k, ...) |
| Returns | List[Document] with metadata fields original_score, reranked_score (if reranker applied), reranking_applied, plus the underlying Qdrant payload (year, district, source, filename, page, etc.) |
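A consumer of these documents has to decide which score to trust. The sketch below shows one plausible reading of the metadata fields: prefer reranked_score when reranking_applied is set, otherwise use original_score. The Document dataclass is a stand-in for the LangChain class, and effective_score and best_document are hypothetical helpers, not functions from the codebase.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in for the LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def effective_score(doc: Document) -> float:
    """Prefer the reranker's score when reranking was applied to this document."""
    meta = doc.metadata
    if meta.get("reranking_applied") and "reranked_score" in meta:
        return float(meta["reranked_score"])
    return float(meta["original_score"])


def best_document(docs: List[Document]) -> Document:
    """Return the highest-scoring document of a non-empty result list."""
    return max(docs, key=effective_score)


docs = [
    Document("A", {"original_score": 0.82, "reranked_score": 0.40,
                   "reranking_applied": True, "year": "2022"}),
    Document("B", {"original_score": 0.61, "reranked_score": 0.93,
                   "reranking_applied": True, "year": "2023"}),
]
top = best_document(docs)
```

Vector-similarity and reranker scores are generally not on the same scale, so the comparison is only meaningful when every document in a list went through the same path.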
Mixin contracts (intra-src/agents/)
| Mixin | Public attributes it sets on self | Methods it exposes |
| --- | --- | --- |
| _MetadataMixin (metadata.py) | self.year_whitelist, self.source_whitelist, self.district_whitelist, self.db_metadata_context, self.district_doc_counts, self.current_year, self.latest_data_year, self.earliest_data_year, self.UGANDA_REGIONS | _load_dynamic_data(), _load_db_metadata(vectorstore), _normalize_district_name(s) |
| _FiltersMixin (agent_filtering.py) | None | _best_score(sources) (static), _prevalidate_filters(filters, anchored_keys), _post_relaxation_relevance_check(sources, anchored_keys, original_filters), _normalize_source_name(raw), _llm_overrides_ui(...) (static), _validate_filter_values(filters) |
| _ConversationHistoryMixin (conversation_history.py) | None | _load_conversation(file_path), _save_conversation(file_path, conversation) |
All mixin methods are called as self.X() from BaseMultiAgentChatbot. The call sites stay the same regardless of which file a method physically lives in, because Python's method resolution order (MRO) handles the dispatch.
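A minimal sketch of this dispatch, using two of the mixin names from the table. The method bodies are placeholders written for this example, the shape of sources (pairs of document and score) is an assumption, and BaseChatbotSketch is a hypothetical class.

```python
from typing import List, Tuple


class _MetadataMixin:
    def _normalize_district_name(self, s: str) -> str:
        # Placeholder body; the real normalization rules are not shown here.
        return s.strip().title()


class _FiltersMixin:
    @staticmethod
    def _best_score(sources: List[Tuple[str, float]]) -> float:
        # Placeholder body; assumes sources are (document, score) pairs.
        return max((score for _, score in sources), default=0.0)


class BaseChatbotSketch(_MetadataMixin, _FiltersMixin):
    """The orchestrator calls self.X() without knowing which mixin defines X."""

    def describe(self, district: str, sources: List[Tuple[str, float]]) -> str:
        name = self._normalize_district_name(district)  # resolved on _MetadataMixin
        score = self._best_score(sources)               # resolved on _FiltersMixin
        return f"{name}: {score:.2f}"


mro_names = [cls.__name__ for cls in BaseChatbotSketch.__mro__]
summary = BaseChatbotSketch().describe(" kampala ", [("a", 0.5), ("b", 0.91)])
```

Moving a method from one mixin file to another changes where the MRO finds it, but no call site in the orchestrator needs to change.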
Filter-construction APIs in src/retrieval/filter.py
Two filter constructors coexist, each with different consumers (see ADR 002 and DEFERRED #1 for the rationale):
| Function | Signature | Used by |
| --- | --- | --- |
| build_qdrant_filter_from_dict(filters: dict) | Dict-based, returns Optional[Filter] | _FiltersMixin._prevalidate_filters (for cheap count() queries before retrieval) |
| create_filter(reports=, sources=, subtype=, year=, district=, filenames=, entity_type=) | Kwarg-based, returns Filter; handles filename mutual-exclusivity | src/retrieval/hybrid.py, src/retrieval/context.py (for actual vector-search queries) |
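The dict-based path and its use for pre-validation can be sketched without a Qdrant dependency. Everything below is a stand-in: conditions are plain dicts rather than qdrant_client Filter objects, and build_filter_from_dict, prevalidate, and fake_count are hypothetical names. The sketch only illustrates the Optional return (None when no usable filter remains) and the count-before-search idea.

```python
from typing import Any, Callable, Dict, List, Optional

Condition = Dict[str, Any]  # stand-in for a Qdrant field condition


def _is_empty(value: Any) -> bool:
    return value is None or (isinstance(value, (str, list, tuple, set)) and len(value) == 0)


def build_filter_from_dict(filters: Dict[str, Any]) -> Optional[List[Condition]]:
    """Turn {'year': ['2022'], 'district': 'Gulu'} into conditions, or None if empty."""
    conditions: List[Condition] = []
    for key, value in filters.items():
        if _is_empty(value):
            continue  # empty selections do not constrain the search
        values = list(value) if isinstance(value, (list, tuple, set)) else [value]
        conditions.append({"key": key, "match_any": values})
    return conditions or None


def prevalidate(filters: Dict[str, Any],
                count: Callable[[List[Condition]], int]) -> bool:
    """Return False when the filters are known to match zero documents."""
    conditions = build_filter_from_dict(filters)
    if conditions is None:
        return True  # nothing to validate
    return count(conditions) > 0


# In-memory stand-in for the collection's payloads and its count() operation.
PAYLOADS = [
    {"year": "2022", "district": "Kampala"},
    {"year": "2023", "district": "Gulu"},
]


def fake_count(conditions: List[Condition]) -> int:
    return sum(
        all(payload.get(c["key"]) in c["match_any"] for c in conditions)
        for payload in PAYLOADS
    )
```

Running the cheap count first lets the agent skip an embedding and vector search that is already known to return nothing.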
LangGraph state contract (MultiAgentState)
Defined in src/agents/state.py. Every agent node reads from and writes to a shared MultiAgentState (a TypedDict). Keys that travel between nodes:
| Key | Written by | Read by |
| --- | --- | --- |
| current_query | chat() | All agents |
| messages | chat() (after a turn completes) | _main_agent (for context), _rewrite_query_for_rag |
| query_context | _main_agent._analyze_query_context | _route_after_main, _rag_agent, _response_agent |
| rag_filters | _rag_agent._build_filters | _response_agent |
| anchored_filter_keys | _rag_agent._build_filters | _response_agent (relaxation logic) |
| rag_query | _response_agent._rewrite_query_for_rag (after prevalidation succeeds) | _response_agent._perform_retrieval |
| retrieved_documents | _response_agent (after retrieval) | _response_agent._generate_conversational_response |
| final_response | _response_agent, or pre-validation early-exit | chat() (returned to caller) |
| agent_logs | All agents (append-only) | chat() (returned to caller for debugging) |
| relaxation_notes | _response_agent (during/after relaxation) | chat() (returned to caller) |
| gap_follow_up | _response_agent (when pre-validation finds an all-anchored gap) | chat() |
| conversation_context["last_filters"] | _main_agent (end of turn) | _main_agent._analyze_query_context (next turn's filter carryover) |
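For orientation, the keys in the table could be declared roughly as follows. This is a sketch, not the contents of src/agents/state.py: the class name MultiAgentStateSketch, every value type, and the log_step helper are assumptions; only the key names and the append-only rule for agent_logs come from the table.

```python
from typing import Any, Dict, List, Optional, TypedDict


class MultiAgentStateSketch(TypedDict, total=False):
    """Key names follow the table; all value types are assumptions."""
    current_query: str
    messages: List[Any]
    query_context: Any
    rag_filters: Dict[str, Any]
    anchored_filter_keys: List[str]
    rag_query: str
    retrieved_documents: List[Any]
    final_response: str
    agent_logs: List[Any]
    relaxation_notes: List[Any]
    gap_follow_up: Optional[str]
    conversation_context: Dict[str, Any]  # holds "last_filters" between turns


def log_step(state: MultiAgentStateSketch, entry: str) -> MultiAgentStateSketch:
    """Append to agent_logs without mutating the incoming state (append-only)."""
    updated = dict(state)
    updated["agent_logs"] = [*state.get("agent_logs", []), entry]
    return updated  # type: ignore[return-value]


initial: MultiAgentStateSketch = {"current_query": "health budget 2022", "agent_logs": []}
after_main = log_step(initial, "main_agent: analyzed query context")
after_rag = log_step(after_main, "rag_agent: built filters")
```

Returning a new state object on every step keeps earlier log entries intact, which is what lets chat() hand the full list back to the caller for debugging.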
Versioning and stability
- External APIs: OpenAI and Qdrant are vendor-controlled; we depend on their API stability. Both have versioning policies; we're using stable public versions.
- Internal APIs: no formal versioning; changes to mixin signatures or agent state keys are coordinated within the same PR. The code is small enough that this works without overhead.
Related: docs/system-requirements.md details what external services must be available for the system to function. docs/architecture/05-deployment-view.md shows the deployment topology of these interfaces.