Interfaces

Reference document listing every external API the system depends on and every important module-to-module contract inside the codebase. Consult it before changing an integration or refactoring an internal module to see what might break.

External interfaces (the system calls out)

OpenAI Chat Completions API

Aspect	Value
Endpoint	`https://api.openai.com/v1/chat/completions`
Auth	`Authorization: Bearer ${OPENAI_API_KEY}`
Wrapper	`src/llm/adapters.py` (via `OpenAIClient` + `LLMRegistry`)
Models in use	`gpt-4o-mini` (cost-efficient — query rewriting, answer generation), `gpt-4.1` (strong — query analysis)
Configured at	`src/config/settings.yaml::reader.OPENAI` and `reader.OPENAI_STRONG`
Called from	`BaseMultiAgentChatbot._analyze_query_context`, `_rewrite_query_for_rag`; `MultiAgentRAGChatbot._generate_conversational_response*`
Failure mode	Caught at every call site; returns a sensible fallback (original query, generic error message)
Latency budget	1-5 s per call typical; tolerated up to 30 s before the user sees a "thinking" timeout (Streamlit default)
Rate limits	OpenAI's standard rate limits per account/key tier; not actively monitored by the app
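
The failure mode in the table (catch at the call site, fall back to the original query) can be sketched as below. This is a minimal illustration rather than the code in `src/llm/adapters.py`: `rewrite_query_with_fallback` and the `call_llm` callable are hypothetical names, and treating an empty completion as a failure is an assumption.

```python
from typing import Callable


def rewrite_query_with_fallback(query: str, call_llm: Callable[[str], str]) -> str:
    """Return the LLM's rewritten query, or the original query if the call fails.

    `call_llm` is a hypothetical stand-in for whatever wraps the
    Chat Completions request; it takes the query and returns text.
    """
    try:
        rewritten = call_llm(query)
    except Exception:
        # Broad on purpose: any API or network error triggers the fallback.
        return query
    # Assumption: an empty or whitespace-only completion counts as a failure.
    return rewritten.strip() or query
```

Passing the LLM call in as a parameter keeps the fallback path testable without network access.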

Qdrant Cloud API

Aspect	Value
Endpoint	`${QDRANT_URL}` (set as HF Space secret)
Auth	`api-key: ${QDRANT_API_KEY}` header
Protocol	gRPC preferred (configurable via `prefer_grpc: true` in settings); HTTPS fallback
Wrapper	`src/vectorstore.py::VectorStoreManager` (uses `langchain_qdrant.Qdrant`)
Collection	`BAAI-bge-m3-full` (configured in `settings.yaml::qdrant.collection_name`)
Operations called	`similarity_search_with_score` (vector search), `count` (pre-validation), `scroll` (metadata cache rebuild), `create_payload_index` (one-off setup)
Called from	`src/retrieval/context.py`, `src/retrieval/filter.py` (`MetadataCache._fetch_from_qdrant`), `src/agents/agent_filtering.py` (`_prevalidate_filters`)
Failure mode	Connection failure at startup → chatbot init fails, app shows error banner. Query failure → caught, returns empty results.
Latency budget	`similarity_search_with_score`: ~200-500 ms typical. `count`: <100 ms typical. `scroll`: ~80 s on the full collection (only on a cold start without a disk cache).
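
The `scroll` row implies a disk cache that avoids the slow full-collection scan on warm starts. A minimal sketch of that pattern, assuming a JSON cache file: `load_metadata_cache` and `fetch_from_qdrant` are hypothetical names, and the real `MetadataCache` may store its data differently.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List

Metadata = Dict[str, List[str]]


def load_metadata_cache(cache_path: Path, fetch_from_qdrant: Callable[[], Metadata]) -> Metadata:
    """Return metadata from disk, calling the slow fetch only on a cache miss."""
    if cache_path.exists():
        try:
            return json.loads(cache_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            pass  # Corrupt cache file: fall through and rebuild it.
    # Slow path: in the real system this would be the full-collection scroll.
    metadata = fetch_from_qdrant()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_text(json.dumps(metadata), encoding="utf-8")
    return metadata
```

The second call with the same `cache_path` reads the file and never invokes the fetch function.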

Hugging Face Hub — model file downloads

Aspect	Value
Endpoint	`https://huggingface.co/<model>/resolve/main/*`
Auth	Public read, no auth needed for our models
Wrapper	`transformers` library (transitive via `sentence-transformers` and `langchain_huggingface`)
Models	`BAAI/bge-m3` (embeddings), `BAAI/bge-reranker-v2-m3` (reranker)
When called	Only at Docker build time (`download_models.py`); pre-populated cache in image avoids runtime downloads
Failure mode	Build fails; deploys blocked until HF Hub is reachable

Hugging Face Hub — dataset push (logging)

Aspect	Value
Endpoint	`https://huggingface.co/api/datasets/GIZ/spaces_logs/*`
Auth	`Bearer ${SPACES_LOG}` (write token, set as HF Space secret)
Wrapper	`src/logging.py` (via `huggingface_hub.HfApi`)
What's pushed	Conversational JSON logs (audit trail)
Called from	`BaseMultiAgentChatbot.chat()` after each turn
Failure mode	Caught silently; logs an error but doesn't fail the user request

Ollama (optional, local development only)

Aspect	Value
Endpoint	`${OLLAMA_BASE_URL}` (e.g. `http://localhost:11434/`)
Auth	None
Wrapper	`src/llm/adapters.py` (via `langchain_ollama.OllamaLLM`)
Status	Not used in production. Available for local dev where running OpenAI calls would be expensive or impossible offline.

Internal interfaces (module to module within the codebase)

`app.py` → `BaseMultiAgentChatbot.chat()`

The only call from the Streamlit layer into the agent layer.

Aspect	Value
Signature	`chat(user_input: str, conversation_id: str = "default") -> Dict[str, Any]`
Input	`user_input` may include a `FILTER CONTEXT:` preamble with sidebar selections; `conversation_id` is the per-Streamlit-session UUID
Output	Dict with keys: `response` (str), `rag_result` (PipelineResult), `agent_logs` (list), `relaxation_notes` (list), `gap_follow_up` (str or None)
Stability contract	This signature should be considered stable. Streamlit, tests, and any future front-ends depend on it.
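
Because front-ends and tests depend on the output keys listed above, a small shape check can guard the contract. The sketch below is illustrative only: `ChatResult` and `missing_chat_keys` are hypothetical names, and the value types other than `response` and `gap_follow_up` are loosened to `Any` because the table does not specify them further.

```python
from typing import Any, Dict, List, Optional, Set, TypedDict


class ChatResult(TypedDict):
    """Documented keys of the dict returned by chat()."""

    response: str
    rag_result: Any  # a PipelineResult in the real code
    agent_logs: List[Any]
    relaxation_notes: List[Any]
    gap_follow_up: Optional[str]


def missing_chat_keys(result: Dict[str, Any]) -> Set[str]:
    """Return the documented keys that are absent from `result`."""
    return set(ChatResult.__annotations__) - set(result)
```

A test suite could assert `missing_chat_keys(bot.chat("..."))` is empty, where `bot` is whatever chatbot instance the test constructs.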

`MultiAgentRAGChatbot._perform_retrieval()` → `PipelineManager.run()`

How the agent triggers retrieval.

Aspect	Value
Caller	`MultiAgentRAGChatbot._perform_retrieval` (subclass implementation of an abstract method)
Callee	`src/pipeline.py::PipelineManager.run(query, sources, auto_infer_filters, filters, skip_answer, ...)`
Key call-site arguments	`auto_infer_filters=False` (we did filter inference upstream); `skip_answer=True` (we do answer generation in the agent, not the pipeline)
Returns	`PipelineResult` with `.sources` (List[Document]), `.answer` (always empty when `skip_answer=True`), `.metadata`
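
The call-site arguments above can be illustrated with a stub. Everything here is a stand-in: `StubPipelineResult`, `StubPipelineManager`, and `perform_retrieval` are hypothetical, and the stub only mimics the documented behaviour that `.answer` is empty when `skip_answer=True`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StubPipelineResult:
    """Stand-in for PipelineResult with the three documented attributes."""

    sources: List[Any] = field(default_factory=list)
    answer: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)


class StubPipelineManager:
    """Stand-in for PipelineManager; no real retrieval happens here."""

    def run(self, query: str, sources: Optional[List[str]] = None,
            auto_infer_filters: bool = True, filters: Optional[Dict[str, Any]] = None,
            skip_answer: bool = False) -> StubPipelineResult:
        docs = [f"doc matching {query!r}"]
        answer = "" if skip_answer else "generated answer"
        meta = {"filters": filters or {}, "auto_infer_filters": auto_infer_filters}
        return StubPipelineResult(sources=docs, answer=answer, metadata=meta)


def perform_retrieval(pipeline: StubPipelineManager, rag_query: str,
                      filters: Dict[str, Any]) -> StubPipelineResult:
    # Filters were inferred upstream and the agent writes the answer itself,
    # so both pipeline features are switched off at the call site.
    return pipeline.run(query=rag_query, filters=filters,
                        auto_infer_filters=False, skip_answer=True)
```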

`PipelineManager.run()` → `ContextRetriever.retrieve_context()`

How the pipeline triggers actual vector search.

Aspect	Value
Caller	`src/pipeline.py::PipelineManager.run`
Callee	`src/retrieval/context.py::ContextRetriever.retrieve_context(query, reports, sources, subtype, year, district, filenames, entity_type, use_reranking, top_k, ...)`
Returns	`List[Document]` with metadata fields `original_score`, `reranked_score` (if reranker applied), `reranking_applied`, plus the underlying Qdrant payload (year, district, source, filename, page, etc.)
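
A consumer of these metadata fields typically needs one score per document. The sketch below operates on plain metadata dicts; `effective_score` and `sort_by_effective_score` are hypothetical helpers, and preferring `reranked_score` whenever `reranking_applied` is true is an assumption about how the fields are meant to be read.

```python
from typing import Any, Dict, List


def effective_score(metadata: Dict[str, Any]) -> float:
    """Prefer the reranker's score when reranking was applied, else the vector score."""
    if metadata.get("reranking_applied") and "reranked_score" in metadata:
        return float(metadata["reranked_score"])
    return float(metadata["original_score"])


def sort_by_effective_score(metadatas: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a new list ordered from highest to lowest effective score."""
    return sorted(metadatas, key=effective_score, reverse=True)
```

Reranker and vector-search scores are on different scales, so this ordering is only meaningful when all documents in the list went through the same path.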

Mixin contracts (intra-`src/agents/`)

Mixin	Public attributes it sets on `self`	Methods it exposes
`_MetadataMixin` (`metadata.py`)	`self.year_whitelist`, `self.source_whitelist`, `self.district_whitelist`, `self.db_metadata_context`, `self.district_doc_counts`, `self.current_year`, `self.latest_data_year`, `self.earliest_data_year`, `self.UGANDA_REGIONS`	`_load_dynamic_data()`, `_load_db_metadata(vectorstore)`, `_normalize_district_name(s)`
`_FiltersMixin` (`agent_filtering.py`)	None	`_best_score(sources)` (static), `_prevalidate_filters(filters, anchored_keys)`, `_post_relaxation_relevance_check(sources, anchored_keys, original_filters)`, `_normalize_source_name(raw)`, `_llm_overrides_ui(...)` (static), `_validate_filter_values(filters)`
`_ConversationHistoryMixin` (`conversation_history.py`)	None	`_load_conversation(file_path)`, `_save_conversation(file_path, conversation)`

All mixin methods are called as `self.X()` from `BaseMultiAgentChatbot`. The call sites stay the same regardless of which file a method physically lives in, because Python's method resolution order (MRO) handles the dispatch.
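
A toy example of this dispatch, under stated assumptions: the class names and method bodies below are invented for illustration (only the `_best_score` and `_load_conversation` method names come from the table), and the real mixins do considerably more.

```python
from typing import Any, Dict, List


class ToyFiltersMixin:
    @staticmethod
    def _best_score(sources: List[Dict[str, Any]]) -> float:
        # Placeholder body: highest "score" value, 0.0 for an empty list.
        return max((s["score"] for s in sources), default=0.0)


class ToyConversationHistoryMixin:
    def _load_conversation(self, file_path: str) -> List[Any]:
        # Placeholder body: the real method reads a file.
        return []


class ToyChatbot(ToyFiltersMixin, ToyConversationHistoryMixin):
    def summarize(self, sources: List[Dict[str, Any]]) -> float:
        # The orchestrator calls self.X(); the MRO finds X on the right mixin.
        return self._best_score(sources)
```

Moving `_best_score` to another mixin or file would leave `summarize` unchanged, which is the property the paragraph above describes.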

Filter-construction APIs in `src/retrieval/filter.py`

Two filter constructors coexist, each with different consumers (see ADR 002 and DEFERRED #1 for the rationale):

Function	Signature	Used by
`build_qdrant_filter_from_dict(filters: dict)`	Dict-based, returns `Optional[Filter]`	`_FiltersMixin._prevalidate_filters` (for cheap `count()` queries before retrieval)
`create_filter(reports=, sources=, subtype=, year=, district=, filenames=, entity_type=)`	Kwarg-based, returns `Filter`; handles filename mutual-exclusivity	`src/retrieval/hybrid.py`, `src/retrieval/context.py` (for actual vector-search queries)
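
To make the dict-based style concrete, the sketch below builds a filter in the JSON shape accepted by Qdrant's REST API (`must` conditions with `match.value` or `match.any`) using only the standard library. It is not the project's `build_qdrant_filter_from_dict`: the function name `build_filter_payload` is hypothetical, the real function returns a `Filter` model rather than a dict, and the payload key paths (`year`, `district`) are assumed to be top-level.

```python
from typing import Any, Dict, List, Optional

FilterPayload = Dict[str, List[Dict[str, Any]]]


def build_filter_payload(filters: Dict[str, Any]) -> Optional[FilterPayload]:
    """Turn {field: value-or-list} into a Qdrant-style `must` filter, or None if empty."""
    conditions: List[Dict[str, Any]] = []
    for key, value in filters.items():
        if isinstance(value, (list, tuple)):
            if value:  # Skip empty lists: they would match nothing.
                conditions.append({"key": key, "match": {"any": list(value)}})
        elif value not in (None, ""):
            conditions.append({"key": key, "match": {"value": value}})
    # Mirror the Optional return: no usable conditions means no filter at all.
    return {"must": conditions} if conditions else None
```

Returning `None` for an empty filter lets a caller run an unfiltered `count()` without special-casing an empty `must` list.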

LangGraph state contract (`MultiAgentState`)

Defined in `src/agents/state.py`. Every agent node reads from and writes to a shared `MultiAgentState` (a `TypedDict`). Keys that travel between nodes:

Key	Written by	Read by
`current_query`	`chat()`	All agents
`messages`	`chat()` (after a turn completes)	`_main_agent` (for context), `_rewrite_query_for_rag`
`query_context`	`_main_agent._analyze_query_context`	`_route_after_main`, `_rag_agent`, `_response_agent`
`rag_filters`	`_rag_agent._build_filters`	`_response_agent`
`anchored_filter_keys`	`_rag_agent._build_filters`	`_response_agent` (relaxation logic)
`rag_query`	`_response_agent._rewrite_query_for_rag` (after prevalidation succeeds)	`_response_agent._perform_retrieval`
`retrieved_documents`	`_response_agent` (after retrieval)	`_response_agent._generate_conversational_response`
`final_response`	`_response_agent`, or pre-validation early-exit	`chat()` (returned to caller)
`agent_logs`	All agents (append-only)	`chat()` (returned to caller for debugging)
`relaxation_notes`	`_response_agent` (during/after relaxation)	`chat()` (returned to caller)
`gap_follow_up`	`_response_agent` (when pre-validation finds an all-anchored gap)	`chat()`
`conversation_context["last_filters"]`	`_main_agent` (end of turn)	`_main_agent._analyze_query_context` (next turn's filter carryover)
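
The key table can be read as a `TypedDict` along the following lines. This is a sketch only: the key names come from the table, but every value type is an assumption, `MultiAgentStateSketch` is not the real class, and `log_step` is a hypothetical helper showing one way to keep `agent_logs` append-only.

```python
from typing import Any, Dict, List, Optional, TypedDict


class MultiAgentStateSketch(TypedDict, total=False):
    """Illustrative shape of the shared state; value types are guesses."""

    current_query: str
    messages: List[Any]
    query_context: Any
    rag_filters: Dict[str, Any]
    anchored_filter_keys: List[str]
    rag_query: str
    retrieved_documents: List[Any]
    final_response: str
    agent_logs: List[Any]
    relaxation_notes: List[Any]
    gap_follow_up: Optional[str]
    conversation_context: Dict[str, Any]  # holds "last_filters" between turns


def log_step(state: MultiAgentStateSketch, entry: Any) -> Dict[str, List[Any]]:
    """Return a partial state update that appends `entry` without mutating `state`."""
    return {"agent_logs": [*state.get("agent_logs", []), entry]}
```

Returning a partial update instead of mutating the state in place matches how LangGraph nodes usually report changes, though whether the project's nodes do so is not stated in this document.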

Versioning and stability

External APIs: OpenAI and Qdrant are vendor-controlled, so we depend on their API stability. Both publish versioning policies, and we use their stable public versions.
Internal APIs: no formal versioning. Changes to mixin signatures or agent state keys are coordinated within a single PR; the codebase is small enough that this works without extra process.

Related: `docs/system-requirements.md` details which external services must be available for the system to function. `docs/architecture/05-deployment-view.md` shows the deployment topology of these interfaces.