XQ committed on
Commit ·
6fd2f67
1
Parent(s): 6ce81cf
Add agent flow
Browse files- README.md +60 -4
- src/agent/react_router.py +253 -0
- src/agent/tools.py +153 -0
- src/api/main.py +22 -9
- src/api/routes.py +3 -2
- src/config.py +7 -0
- src/retrieval/vector_store.py +50 -1
- src/ui/app.py +29 -8
README.md
CHANGED
|
@@ -14,13 +14,27 @@ A RAG-based document assistant for Danish-language PDFs, featuring hybrid search
|
|
| 14 |
|
| 15 |
## Architecture
|
| 16 |
|
| 17 |
-
The system follows a three-stage RAG pipeline:
|
| 18 |
|
| 19 |
**Ingestion:** PDF documents are parsed with PyMuPDF, cleaned, and split into chunks using one of three strategies (fixed-size, recursive, or semantic). Each chunk is embedded via a multilingual sentence-transformer and stored in a Qdrant vector collection. A parallel BM25 index is built from the same chunks for sparse keyword matching.
|
| 20 |
|
| 21 |
-
**Retrieval:** User queries run through both dense (Qdrant cosine similarity) and sparse (BM25) search paths. Results are merged via reciprocal rank fusion, then a cross-encoder reranker scores each candidate for final ordering.
|
| 22 |
|
| 23 |
-
**Generation:** Top-ranked chunks are assembled into a prompt context and passed to the LLM. The
|
| 24 |
|
| 25 |
## Tech Stack
|
| 26 |
|
|
@@ -37,6 +51,7 @@ The system follows a three-stage RAG pipeline:
|
|
| 37 |
| Evaluation | RAGAS |
|
| 38 |
| UI | Streamlit |
|
| 39 |
| Config | python-dotenv |
|
|
|
|
| 40 |
|
| 41 |
## Provider Support
|
| 42 |
|
|
@@ -51,6 +66,45 @@ Both LLM and embedding backends are swappable through environment variables —
|
|
| 51 |
|
| 52 |
Switch providers by editing `LLM_PROVIDER` and `EMBEDDING_PROVIDER` in your `.env` file. See `.env.example` for per-provider configuration details.
|
| 53 |
|
| 54 |
## Quick Start
|
| 55 |
|
| 56 |
Prerequisites: Python 3.11+ and [Ollama](https://ollama.com/) installed.
|
|
@@ -146,7 +200,9 @@ src/
|
|
| 146 |
routes.py # REST endpoints (query, ingest, health)
|
| 147 |
agent/
|
| 148 |
intent_classifier.py # Query intent detection
|
| 149 |
-
router.py #
|
| 150 |
evaluation/
|
| 151 |
evaluator.py # RAGAS-based retrieval quality metrics
|
| 152 |
ui/
|
|
|
|
| 14 |
|
| 15 |
## Architecture
|
| 16 |
|
| 17 |
+
The system follows a three-stage RAG pipeline with an optional Agent Flows mode:
|
| 18 |
|
| 19 |
**Ingestion:** PDF documents are parsed with PyMuPDF, cleaned, and split into chunks using one of three strategies (fixed-size, recursive, or semantic). Each chunk is embedded via a multilingual sentence-transformer and stored in a Qdrant vector collection. A parallel BM25 index is built from the same chunks for sparse keyword matching.
|
| 20 |
|
| 21 |
+
**Retrieval:** User queries run through both dense (Qdrant cosine similarity) and sparse (BM25) search paths. Results are merged via reciprocal rank fusion, then a cross-encoder reranker scores each candidate for final ordering.
|
| 22 |
|
| 23 |
+
**Generation:** Top-ranked chunks are assembled into a prompt context and passed to the LLM. The response is returned via a FastAPI endpoint with full SSE streaming and displayed in a Streamlit UI. Retrieval quality can be measured offline using RAGAS metrics.
|
| 24 |
+
|
| 25 |
+
**Routing — two modes (switchable via `AGENT_MODE`):**
|
| 26 |
+
|
| 27 |
+
- **Pipeline mode** (default, `AGENT_MODE=pipeline`): Fixed LangGraph DAG — language detection → optional translation → hybrid retrieval → cross-encoder reranking → intent-specific generation. Robust on any LLM including local Ollama models.
|
| 28 |
+
|
| 29 |
+
- **ReAct Agent mode** (`AGENT_MODE=react`): Replaces the fixed DAG with a multi-step reasoning loop. The LLM decides which tools to call and how many times, then produces a grounded answer citing source documents. Supports multi-hop questions, comparisons across documents, and procedural queries that benefit from iterative retrieval. Requires an LLM with tool-calling support (OpenAI, Anthropic, Google GenAI, or compatible Ollama models such as `llama3.1` / `qwen2.5`).
|
| 30 |
+
|
| 31 |
+
Available tools in ReAct mode:
|
| 32 |
+
|
| 33 |
+
| Tool | When the LLM uses it |
|
| 34 |
+
|------|----------------------|
|
| 35 |
+
| `hybrid_search(query, top_k)` | Find relevant passages — called once or multiple times with refined queries |
|
| 36 |
+
| `list_documents()` | Discover which documents are in the knowledge base |
|
| 37 |
+
| `fetch_document(document_id)` | Read the full text of a named document (e.g. for summaries) |
|
| 38 |
|
| 39 |
## Tech Stack
|
| 40 |
|
|
|
|
| 51 |
| Evaluation | RAGAS |
|
| 52 |
| UI | Streamlit |
|
| 53 |
| Config | python-dotenv |
|
| 54 |
+
| Agent Flows | LangGraph `create_react_agent` + LangChain `@tool` |
|
| 55 |
|
| 56 |
## Provider Support
|
| 57 |
|
|
|
|
| 66 |
|
| 67 |
Switch providers by editing `LLM_PROVIDER` and `EMBEDDING_PROVIDER` in your `.env` file. See `.env.example` for per-provider configuration details.
|
| 68 |
|
| 69 |
+
## Agent Mode
|
| 70 |
+
|
| 71 |
+
The system supports two routing modes, controlled by `AGENT_MODE` in `.env`:
|
| 72 |
+
|
| 73 |
+
| Mode | Value | Description |
|
| 74 |
+
|------|-------|-------------|
|
| 75 |
+
| Pipeline (default) | `AGENT_MODE=pipeline` | Fixed LangGraph DAG. Works with any LLM including local Ollama models such as `gemma3:4b`. |
|
| 76 |
+
| ReAct Agent | `AGENT_MODE=react` | Multi-step reasoning loop. The LLM calls tools as many times as needed — `hybrid_search` for targeted passages, `list_documents` to navigate the knowledge base, `fetch_document` for full document reads — then cites sources in the final answer. |
|
| 77 |
+
|
| 78 |
+
**LLM compatibility for ReAct mode:**
|
| 79 |
+
|
| 80 |
+
`AGENT_MODE=react` requires a model with native tool-calling support. Use `AGENT_MODE=pipeline` (the default) if your model does not support it.
|
| 81 |
+
|
| 82 |
+
| Provider | Tool-calling support |
|
| 83 |
+
|----------|---------------------|
|
| 84 |
+
| OpenAI (`gpt-4o-mini`, `gpt-4o`) | Yes |
|
| 85 |
+
| Anthropic (`claude-*`) | Yes |
|
| 86 |
+
| Google GenAI (`gemini-*`) | Yes |
|
| 87 |
+
| Azure OpenAI | Yes |
|
| 88 |
+
| Ollama — `llama3.1`, `qwen2.5`, `mistral-nemo` | Yes (model-dependent) |
|
| 89 |
+
| Ollama — `gemma3:4b` (default) | No → use `pipeline` mode |
|
| 90 |
+
|
| 91 |
+
Example `.env` for ReAct mode with OpenAI:
|
| 92 |
+
|
| 93 |
+
```dotenv
|
| 94 |
+
AGENT_MODE=react
|
| 95 |
+
LLM_PROVIDER=openai
|
| 96 |
+
OPENAI_API_KEY=sk-...
|
| 97 |
+
OPENAI_MODEL=gpt-4o-mini
|
| 98 |
+
```
|
| 99 |
+
|
| 100 |
+
Example `.env` for pipeline mode with local Ollama (default, no API key needed):
|
| 101 |
+
|
| 102 |
+
```dotenv
|
| 103 |
+
AGENT_MODE=pipeline
|
| 104 |
+
LLM_PROVIDER=ollama
|
| 105 |
+
OLLAMA_MODEL=gemma3:4b
|
| 106 |
+
```
|
| 107 |
+
|
| 108 |
## Quick Start
|
| 109 |
|
| 110 |
Prerequisites: Python 3.11+ and [Ollama](https://ollama.com/) installed.
|
|
|
|
| 200 |
routes.py # REST endpoints (query, ingest, health)
|
| 201 |
agent/
|
| 202 |
intent_classifier.py # Query intent detection
|
| 203 |
+
router.py # Fixed-DAG pipeline router (AGENT_MODE=pipeline)
|
| 204 |
+
tools.py # @tool-decorated hybrid_search, list_documents, fetch_document + ToolResultStore
|
| 205 |
+
react_router.py # ReAct agent router with tool-calling loop (AGENT_MODE=react)
|
| 206 |
evaluation/
|
| 207 |
evaluator.py # RAGAS-based retrieval quality metrics
|
| 208 |
ui/
|
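The fusion step named in the README's Retrieval paragraph can be illustrated with a minimal sketch. It assumes plain, unweighted reciprocal rank fusion with the conventional constant `k = 60`; the function name and chunk IDs are hypothetical, and the project's own `HybridRetriever` (which is configured with a `bm25_weight`) may well differ.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each list contributes 1 / (k + rank) for every ID it contains,
    with rank starting at 1. IDs found by several lists accumulate score.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical dense (vector) and sparse (BM25) rankings for one query.
dense = ["c1", "c2", "c3"]
sparse = ["c3", "c1", "c4"]
fused = reciprocal_rank_fusion([dense, sparse])
fused_ids = [chunk_id for chunk_id, _ in fused]
print(fused_ids)
```

Chunks that appear in both lists rise above chunks found by only one path, which is the property the cross-encoder reranker then refines.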
src/agent/react_router.py
ADDED
@@ -0,0 +1,253 @@
+"""ReAct agent router using a LangGraph tool-calling loop.
+
+Replaces the fixed detect→translate→retrieve→rerank→generate DAG with a
+multi-step reasoning loop where the LLM decides which tools to call and
+when it has gathered enough information to produce a final answer.
+
+Requires an LLM that supports bind_tools (OpenAI, Anthropic, Google GenAI,
+and compatible Ollama models such as llama3.1 / qwen2.5). Set
+AGENT_MODE=react in .env to activate; falls back to QueryRouter otherwise.
+"""
+
+import logging
+from collections.abc import Generator
+
+from langchain_core.messages import AIMessage, HumanMessage, SystemMessage, ToolMessage
+from langchain_core.runnables import Runnable
+from langgraph.prebuilt import create_react_agent
+
+from src.models import GenerationResponse, IntentType, PipelineDetails, QueryResult
+from src.agent.tools import ToolResultStore, make_retrieval_tools
+from src.retrieval.hybrid import HybridRetriever
+from src.retrieval.reranker import Reranker
+from src.retrieval.vector_store import VectorStore
+
+logger = logging.getLogger(__name__)
+
+_SYSTEM_PROMPT = (
+    "You are a helpful assistant for administrative staff at the University of Copenhagen (KU).\n\n"
+    "You have access to a hybrid_search tool that searches KU policy documents stored in the "
+    "knowledge base.\n\n"
+    "Guidelines:\n"
+    "- Always call hybrid_search before answering questions about KU rules, policies, exams, "
+    "employment conditions, or administrative procedures.\n"
+    "- If the first search does not return sufficient information, call hybrid_search again "
+    "with a refined or more specific query.\n"
+    "- For comparison questions, search for each item separately.\n"
+    "- Cite the document sources ([1], [2], …) in your answer.\n"
+    "- Answer in the same language as the user's question."
+)
+
+
+def _ser_sources(sources: list[QueryResult]) -> list[dict]:
+    """Serialise QueryResult list to a JSON-safe list of dicts."""
+    return [
+        {
+            "chunk_id": r.chunk.chunk_id,
+            "document_id": r.chunk.document_id,
+            "text": r.chunk.text,
+            "score": r.score,
+            "source": r.source,
+        }
+        for r in sources
+    ]
+
+
+class ReActRouter:
+    """Routes queries through a multi-step ReAct agent with tool-calling LLM.
+
+    The agent runs in a loop: the LLM reasons about the query, calls
+    hybrid_search as many times as needed, observes results, and finally
+    produces a grounded answer. Results from every tool call are merged into
+    a single ranked source list that is returned alongside the answer.
+    """
+
+    def __init__(
+        self,
+        llm: Runnable,
+        hybrid_retriever: HybridRetriever,
+        reranker: Reranker,
+        vector_store: VectorStore,
+        default_top_k: int = 5,
+    ) -> None:
+        """Initialise the ReAct router.
+
+        Args:
+            llm: LLM with tool-calling support (must implement bind_tools).
+            hybrid_retriever: HybridRetriever instance.
+            reranker: Reranker instance.
+            vector_store: VectorStore instance for document-level tool access.
+            default_top_k: Default number of results returned per tool call.
+        """
+        self._llm = llm
+        self._hybrid_retriever = hybrid_retriever
+        self._reranker = reranker
+        self._vector_store = vector_store
+        self._default_top_k = default_top_k
+
+    # ------------------------------------------------------------------
+    # Internal helpers
+    # ------------------------------------------------------------------
+
+    def _make_graph(self, store: ToolResultStore) -> object:
+        """Build a fresh ReAct graph bound to *store* for one request."""
+        tools = make_retrieval_tools(
+            self._hybrid_retriever,
+            self._reranker,
+            self._vector_store,
+            store,
+            self._default_top_k,
+        )
+        return create_react_agent(self._llm, tools)
+
+    @staticmethod
+    def _extract_answer(messages: list) -> str:
+        """Return the last non-tool-call AIMessage content as the final answer."""
+        for msg in reversed(messages):
+            if (
+                isinstance(msg, AIMessage)
+                and msg.content
+                and not getattr(msg, "tool_calls", None)
+            ):
+                return str(msg.content)
+        return ""
+
+    # ------------------------------------------------------------------
+    # Public interface (mirrors QueryRouter)
+    # ------------------------------------------------------------------
+
+    def route(self, query: str, top_k: int) -> GenerationResponse:
+        """Route a query through the ReAct agent pipeline.
+
+        Args:
+            query: The user's natural language query.
+            top_k: Number of top documents to retrieve per tool call.
+
+        Returns:
+            GenerationResponse with answer, sources, intent, and confidence.
+        """
+        logger.info("ReAct routing query: %s", query)
+        store = ToolResultStore()
+        graph = self._make_graph(store)
+
+        result = graph.invoke(
+            {
+                "messages": [
+                    SystemMessage(content=_SYSTEM_PROMPT),
+                    HumanMessage(content=query),
+                ]
+            }
+        )
+
+        messages = result.get("messages", [])
+        answer = self._extract_answer(messages)
+
+        sources = store.retrieved[:top_k]
+        confidence = max((r.score for r in sources), default=0.0)
+
+        logger.info(
+            "ReAct answer ready (confidence=%.4f, sources=%d, tool_calls=%d)",
+            confidence,
+            len(sources),
+            len(store.tool_calls),
+        )
+
+        return GenerationResponse(
+            answer=answer,
+            sources=sources,
+            intent=IntentType.RAG if sources else IntentType.FACTUAL,
+            confidence=confidence,
+            pipeline_details=PipelineDetails(
+                original_query=query,
+                retrieval_query=", ".join(q for _, q in store.tool_calls) or query,
+                reranked_results=sources,
+            ),
+        )
+
+    def route_stream(self, query: str, top_k: int) -> Generator[dict, None, None]:
+        """Stream ReAct agent events step by step.
+
+        Yields event dicts with the following step types (in addition to the
+        existing pipeline steps understood by the UI):
+
+        - ``tool_call`` — LLM decided to call a tool; carries ``tool`` and ``query``.
+        - ``tool_result`` — Tool returned; carries ``tool``, ``result_count``.
+        - ``generate`` — LLM is writing the final answer.
+        - ``done`` — Final event with the full result payload.
+
+        Args:
+            query: User query.
+            top_k: Number of results to retrieve per tool call.
+
+        Yields:
+            Step event dicts.
+        """
+        store = ToolResultStore()
+        graph = self._make_graph(store)
+
+        all_messages: list = []
+
+        for chunk in graph.stream(
+            {
+                "messages": [
+                    SystemMessage(content=_SYSTEM_PROMPT),
+                    HumanMessage(content=query),
+                ]
+            },
+            stream_mode="updates",
+        ):
+            for _node_name, update in chunk.items():
+                if update is None:
+                    continue
+                node_messages = update.get("messages", [])
+                all_messages.extend(node_messages)
+
+                for msg in node_messages:
+                    if isinstance(msg, AIMessage):
+                        for tc in getattr(msg, "tool_calls", []):
+                            yield {
+                                "step": "tool_call",
+                                "tool": tc.get("name", ""),
+                                "query": tc.get("args", {}).get("query", ""),
+                            }
+                        if msg.content and not getattr(msg, "tool_calls", None):
+                            yield {"step": "generate"}
+
+                    elif isinstance(msg, ToolMessage):
+                        yield {
+                            "step": "tool_result",
+                            "tool": getattr(msg, "name", ""),
+                            "result_count": len(store.retrieved),
+                        }
+
+        answer = self._extract_answer(all_messages)
+        sources = store.retrieved[:top_k]
+        confidence = max((r.score for r in sources), default=0.0)
+
+        yield {
+            "step": "done",
+            "result": {
+                "answer": answer,
+                "sources": _ser_sources(sources),
+                "intent": (IntentType.RAG if sources else IntentType.FACTUAL).value,
+                "confidence": confidence,
+                "pipeline_details": {
+                    "original_query": query,
+                    "retrieval_query": ", ".join(q for _, q in store.tool_calls) or query,
+                    "detected_language": "unknown",
+                    "translated": False,
+                    "dense_results": [],
+                    "sparse_results": [],
+                    "fused_results": [],
+                    "reranked_results": [
+                        {
+                            "document_id": r.chunk.document_id,
+                            "chunk_id": r.chunk.chunk_id,
+                            "score": r.score,
+                            "source": r.source,
+                        }
+                        for r in sources
+                    ],
+                },
+            },
+        }
|
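The event dicts documented in `route_stream` (`tool_call`, `tool_result`, `generate`, `done`) could be consumed roughly as follows. This is a sketch only: `render_events` and the sample events are hypothetical, the events are hand-written stand-ins rather than real router output, and how the Streamlit UI actually renders them is not shown in this commit.

```python
from collections.abc import Iterable


def render_events(events: Iterable[dict]) -> tuple[list[str], str]:
    """Turn step events into human-readable progress lines plus the final answer."""
    progress: list[str] = []
    answer = ""
    for event in events:
        step = event.get("step")
        if step == "tool_call":
            progress.append(f"calling {event['tool']}({event['query']!r})")
        elif step == "tool_result":
            progress.append(
                f"{event['tool']} done, {event['result_count']} sources so far"
            )
        elif step == "generate":
            progress.append("writing answer")
        elif step == "done":
            # The final event carries the full result payload.
            answer = event["result"]["answer"]
    return progress, answer


# Hand-written stand-in for what ReActRouter.route_stream(...) might yield.
sample_events = [
    {"step": "tool_call", "tool": "hybrid_search", "query": "eksamen regler"},
    {"step": "tool_result", "tool": "hybrid_search", "result_count": 5},
    {"step": "generate"},
    {"step": "done", "result": {"answer": "Se [1].", "sources": []}},
]
progress, answer = render_events(sample_events)
```

Note that `result_count` in the router is the size of the accumulated source list, not the number of hits from the most recent call, so the sketch labels it "so far".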
src/agent/tools.py
ADDED
@@ -0,0 +1,153 @@
+"""LangChain tools for the ReAct agent."""
+
+import logging
+from dataclasses import dataclass, field
+
+from langchain_core.tools import tool
+
+from src.models import QueryResult
+from src.retrieval.hybrid import HybridRetriever
+from src.retrieval.reranker import Reranker
+from src.retrieval.vector_store import VectorStore
+
+logger = logging.getLogger(__name__)
+
+
+@dataclass
+class ToolResultStore:
+    """Captures structured retrieval results produced during tool invocations.
+
+    Attributes:
+        retrieved: Accumulated QueryResult list across all hybrid_search calls,
+            merged by chunk_id and sorted by descending score.
+        tool_calls: Log of (tool_name, query_or_arg) tuples in invocation order.
+    """
+
+    retrieved: list[QueryResult] = field(default_factory=list)
+    tool_calls: list[tuple[str, str]] = field(default_factory=list)
+
+
+def make_retrieval_tools(
+    hybrid_retriever: HybridRetriever,
+    reranker: Reranker,
+    vector_store: VectorStore,
+    store: ToolResultStore,
+    default_top_k: int = 5,
+) -> list:
+    """Create retrieval tools bound to the given components and result store.
+
+    The returned tools write structured QueryResult objects into *store* on each
+    invocation so the calling router can surface them as sources without having
+    to re-parse the tool's text output.
+
+    Args:
+        hybrid_retriever: HybridRetriever instance.
+        reranker: Reranker instance.
+        vector_store: VectorStore instance for document-level access.
+        store: Shared ToolResultStore that captures structured results.
+        default_top_k: Default number of results to return per call.
+
+    Returns:
+        List of LangChain tool callables ready for bind_tools / ToolNode.
+    """
+
+    @tool
+    def hybrid_search(query: str, top_k: int = default_top_k) -> str:
+        """Search the KU document knowledge base using hybrid retrieval.
+
+        Combines dense semantic search (Qdrant) and sparse keyword search (BM25),
+        then re-ranks results with a cross-encoder. Use this tool to find relevant
+        passages from ingested KU policy documents about rules, regulations, exam
+        procedures, employment conditions, and administrative guidelines.
+
+        Call this tool before answering any question that requires factual
+        information from KU documents. You may call it multiple times with
+        different queries if the first result is insufficient.
+
+        Args:
+            query: Search query. Danish gives the best recall against KU documents.
+            top_k: Number of top results to return (1–20). Default is 5.
+
+        Returns:
+            Formatted string of ranked document passages with source references
+            and relevance scores.
+        """
+        logger.info("Tool hybrid_search: query=%r top_k=%d", query, top_k)
+        store.tool_calls.append(("hybrid_search", query))
+
+        hybrid_result = hybrid_retriever.search_detailed(query, top_k=top_k)
+        results = reranker.rerank(query, hybrid_result.fused_results, top_k=top_k)
+
+        # Accumulate results across multiple calls (union by chunk_id, keep highest score)
+        existing = {r.chunk.chunk_id: r for r in store.retrieved}
+        for r in results:
+            cid = r.chunk.chunk_id
+            if cid not in existing or r.score > existing[cid].score:
+                existing[cid] = r
+        store.retrieved = sorted(existing.values(), key=lambda r: r.score, reverse=True)
+
+        if not results:
+            return "Ingen relevante dokumenter fundet. (No relevant documents found.)"
+
+        parts: list[str] = []
+        for i, r in enumerate(results, 1):
+            parts.append(
+                f"[{i}] {r.chunk.document_id} (relevance: {r.score:.3f})\n{r.chunk.text}"
+            )
+        return "\n\n---\n\n".join(parts)
+
+    @tool
+    def list_documents() -> str:
+        """List all documents currently available in the KU knowledge base.
+
+        Use this tool when the user asks which documents are available, wants to
+        know what topics are covered, or before fetching a specific document by ID.
+
+        Returns:
+            Newline-separated list of document IDs, or a message if the
+            knowledge base is empty.
+        """
+        logger.info("Tool list_documents called")
+        store.tool_calls.append(("list_documents", ""))
+
+        ids = vector_store.list_document_ids()
+        if not ids:
+            return "Ingen dokumenter i videnbasen. (Knowledge base is empty.)"
+        lines = "\n".join(f"- {doc_id}" for doc_id in ids)
+        return f"Dokumenter i videnbasen ({len(ids)} i alt):\n{lines}"
+
+    @tool
+    def fetch_document(document_id: str) -> str:
+        """Fetch the full text of a specific document from the knowledge base.
+
+        Use this tool when the user asks for a summary or overview of a named
+        document, or when hybrid_search results reference a document that
+        warrants deeper reading. Prefer hybrid_search for targeted questions.
+
+        Args:
+            document_id: The exact document ID as returned by list_documents or
+                seen in hybrid_search results (e.g. 'ku_ai_policy.pdf').
+
+        Returns:
+            The concatenated text of all chunks belonging to the document, or
+            an error message if the document ID is not found.
+        """
+        logger.info("Tool fetch_document: document_id=%r", document_id)
+        store.tool_calls.append(("fetch_document", document_id))
+
+        chunks = vector_store.get_chunks_by_document_id(document_id)
+        if not chunks:
+            return (
+                f"Dokumentet '{document_id}' blev ikke fundet i videnbasen. "
+                f"(Document not found. Use list_documents to see available IDs.)"
+            )
+
+        # Sort chunks by chunk_id to preserve document order
+        chunks.sort(key=lambda c: c.chunk_id)
+        full_text = "\n\n".join(c.text for c in chunks)
+        return (
+            f"Dokument: {document_id} ({len(chunks)} afsnit)\n\n"
+            f"{full_text}"
+        )
+
+    return [hybrid_search, list_documents, fetch_document]
|
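The accumulation step inside `hybrid_search` (union by `chunk_id`, keep the highest score, re-sort) can be isolated for illustration. The sketch below is not the project's code: `Hit` is a hypothetical stand-in for `QueryResult`, flattened to a chunk ID and a score, and `merge_hits` is an invented name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for QueryResult: just a chunk ID and a score."""

    chunk_id: str
    score: float


def merge_hits(accumulated: list[Hit], new_hits: list[Hit]) -> list[Hit]:
    """Union two result lists by chunk_id, keeping the higher score per chunk."""
    best = {hit.chunk_id: hit for hit in accumulated}
    for hit in new_hits:
        current = best.get(hit.chunk_id)
        if current is None or hit.score > current.score:
            best[hit.chunk_id] = hit
    # Highest score first, matching the order the router slices with [:top_k].
    return sorted(best.values(), key=lambda hit: hit.score, reverse=True)


# Two successive tool calls that overlap on chunk "b".
first_call = [Hit("a", 0.9), Hit("b", 0.4)]
second_call = [Hit("b", 0.7), Hit("c", 0.5)]
merged = merge_hits(first_call, second_call)
```

One consequence of this design is that scores from different queries are compared directly, so a chunk that scored highly for an early, loosely related query can outrank a better match for a later refined query.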
src/api/main.py
CHANGED
|
@@ -16,6 +16,7 @@ from src.retrieval.hybrid import HybridRetriever
|
|
| 16 |
from src.retrieval.reranker import Reranker
|
| 17 |
from src.agent.intent_classifier import IntentClassifier
|
| 18 |
from src.agent.router import QueryRouter
|
|
|
|
| 19 |
from src.ingestion.pipeline import IngestionPipeline
|
| 20 |
from src.api.routes import router, set_dependencies
|
| 21 |
|
|
@@ -69,15 +70,27 @@ def create_app() -> FastAPI:
|
|
| 69 |
bm25_weight=settings.bm25_weight,
|
| 70 |
)
|
| 71 |
reranker = Reranker(model=create_reranker(settings.reranker_model))
|
| 72 |
-
|
| 73 |
-
|
| 74 |
-
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
|
| 82 |
set_dependencies(
|
| 83 |
query_router=query_router,
|
|
|
|
| 16 |
from src.retrieval.reranker import Reranker
|
| 17 |
from src.agent.intent_classifier import IntentClassifier
|
| 18 |
from src.agent.router import QueryRouter
|
| 19 |
+
from src.agent.react_router import ReActRouter
|
| 20 |
from src.ingestion.pipeline import IngestionPipeline
|
| 21 |
from src.api.routes import router, set_dependencies
|
| 22 |
|
|
|
|
| 70 |
bm25_weight=settings.bm25_weight,
|
| 71 |
)
|
| 72 |
reranker = Reranker(model=create_reranker(settings.reranker_model))
|
| 73 |
+
|
| 74 |
+
if settings.agent_mode == "react":
|
| 75 |
+
logger.info("Agent mode: ReAct (tool-calling loop)")
|
| 76 |
+
query_router: QueryRouter | ReActRouter = ReActRouter(
|
| 77 |
+
llm=llm,
|
| 78 |
+
hybrid_retriever=hybrid_retriever,
|
| 79 |
+
reranker=reranker,
|
| 80 |
+
vector_store=vector_store,
|
| 81 |
+
default_top_k=settings.top_k,
|
| 82 |
+
)
|
| 83 |
+
else:
|
| 84 |
+
logger.info("Agent mode: pipeline (fixed DAG)")
|
| 85 |
+
intent_classifier = IntentClassifier(llm=llm, model_name=settings.generation_model)
|
| 86 |
+
generator = llm | StrOutputParser()
|
| 87 |
+
query_router = QueryRouter(
|
| 88 |
+
intent_classifier=intent_classifier,
|
| 89 |
+
hybrid_retriever=hybrid_retriever,
|
| 90 |
+
reranker=reranker,
|
| 91 |
+
generator=generator,
|
| 92 |
+
translate_query=settings.translate_query,
|
| 93 |
+
)
|
| 94 |
|
| 95 |
set_dependencies(
|
| 96 |
query_router=query_router,
|
src/api/routes.py
CHANGED
|
@@ -14,6 +14,7 @@ from pydantic import BaseModel
|
|
| 14 |
|
| 15 |
if TYPE_CHECKING:
|
| 16 |
from src.agent.router import QueryRouter
|
|
|
|
| 17 |
from src.config import Settings
|
| 18 |
from src.ingestion.pipeline import IngestionPipeline
|
| 19 |
from src.retrieval.bm25_search import BM25Search
|
|
@@ -24,7 +25,7 @@ logger = logging.getLogger(__name__)
|
|
| 24 |
|
| 25 |
router = APIRouter()
|
| 26 |
|
| 27 |
-
_query_router: "QueryRouter | None" = None
|
| 28 |
_ingestion_pipeline: "IngestionPipeline | None" = None
|
| 29 |
_embedder: "Embedder | None" = None
|
| 30 |
_vector_store: "VectorStore | None" = None
|
|
@@ -33,7 +34,7 @@ _settings: "Settings | None" = None
|
|
| 33 |
|
| 34 |
|
| 35 |
def set_dependencies(
|
| 36 |
-
query_router: "QueryRouter",
|
| 37 |
ingestion_pipeline: "IngestionPipeline",
|
| 38 |
embedder: "Embedder",
|
| 39 |
vector_store: "VectorStore",
|
|
|
|
| 14 |
|
| 15 |
if TYPE_CHECKING:
|
| 16 |
from src.agent.router import QueryRouter
|
| 17 |
+
from src.agent.react_router import ReActRouter
|
| 18 |
from src.config import Settings
|
| 19 |
from src.ingestion.pipeline import IngestionPipeline
|
| 20 |
from src.retrieval.bm25_search import BM25Search
|
|
|
|
| 25 |
|
| 26 |
router = APIRouter()
|
| 27 |
|
| 28 |
+
_query_router: "QueryRouter | ReActRouter | None" = None
|
| 29 |
_ingestion_pipeline: "IngestionPipeline | None" = None
|
| 30 |
_embedder: "Embedder | None" = None
|
| 31 |
_vector_store: "VectorStore | None" = None
|
|
|
|
| 34 |
|
| 35 |
|
| 36 |
def set_dependencies(
|
| 37 |
+
query_router: "QueryRouter | ReActRouter",
|
| 38 |
ingestion_pipeline: "IngestionPipeline",
|
| 39 |
embedder: "Embedder",
|
| 40 |
vector_store: "VectorStore",
|
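The `"QueryRouter | ReActRouter"` annotations work for two router classes; a structural type is one possible alternative if more routers are added later. The sketch below assumes both routers expose `route` and `route_stream` with the signatures shown in `react_router.py`; the `Router` protocol, `EchoRouter`, and `answer_with` are hypothetical names that do not exist in the repository.

```python
from collections.abc import Iterator
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Router(Protocol):
    """Structural interface assumed to be shared by both router classes."""

    def route(self, query: str, top_k: int) -> Any: ...

    def route_stream(self, query: str, top_k: int) -> Iterator[dict]: ...


class EchoRouter:
    """Toy router used only to demonstrate structural typing."""

    def route(self, query: str, top_k: int) -> str:
        return f"echo: {query}"

    def route_stream(self, query: str, top_k: int) -> Iterator[dict]:
        yield {"step": "done", "result": {"answer": self.route(query, top_k)}}


def answer_with(router: Router, query: str) -> str:
    """Accept any object that satisfies the Router protocol."""
    return str(router.route(query, top_k=5))


echo_answer = answer_with(EchoRouter(), "hej")
```

With a protocol, `set_dependencies` would not need to import each concrete router under `TYPE_CHECKING`; the trade-off is that the shared interface has to be kept in sync by hand.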
src/config.py
CHANGED
|
@@ -64,6 +64,9 @@ class Settings:
|
|
| 64 |
# Query translation
|
| 65 |
translate_query: bool
|
| 66 |
|
| 67 |
|
| 68 |
def _parse_bool(value: str, *, default: bool) -> bool:
|
| 69 |
"""Parse a boolean environment variable string.
|
|
@@ -141,4 +144,8 @@ def load_settings() -> Settings:
|
|
| 141 |
os.environ.get("TRANSLATE_QUERY", ""),
|
| 142 |
default=os.environ.get("LLM_PROVIDER", "ollama") == "ollama",
|
| 143 |
),
|
| 144 |
)
|
|
|
|
| 64 |
# Query translation
|
| 65 |
translate_query: bool
|
| 66 |
|
| 67 |
+
# Agent mode: "pipeline" (fixed DAG) or "react" (tool-calling ReAct loop)
|
| 68 |
+
agent_mode: str
|
| 69 |
+
|
| 70 |
|
| 71 |
def _parse_bool(value: str, *, default: bool) -> bool:
|
| 72 |
"""Parse a boolean environment variable string.
|
|
|
|
| 144 |
os.environ.get("TRANSLATE_QUERY", ""),
|
| 145 |
default=os.environ.get("LLM_PROVIDER", "ollama") == "ollama",
|
| 146 |
),
|
| 147 |
+
|
| 148 |
+
# Agent mode: "pipeline" keeps the existing fixed DAG; "react" enables
|
| 149 |
+
# the multi-step ReAct loop (requires an LLM with tool-calling support).
|
| 150 |
+
agent_mode=os.environ.get("AGENT_MODE", "pipeline"),
|
| 151 |
)
|
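The `agent_mode` setting above is read from `AGENT_MODE` as a free-form string, so a typo such as `AGENT_MODE=reac` would pass through unnoticed. A minimal sketch of a validating parser follows. The helper name `parse_agent_mode` and the constant `VALID_AGENT_MODES` are hypothetical and not part of this commit; the two accepted values and the `"pipeline"` default are taken from the diff.

```python
import os
from typing import Optional

# The two modes named in the commit's config comment.
VALID_AGENT_MODES = ("pipeline", "react")


def parse_agent_mode(raw: Optional[str], *, default: str = "pipeline") -> str:
    """Normalize and validate an AGENT_MODE value.

    Hypothetical helper: an empty or missing value falls back to `default`,
    and any other unknown value raises instead of being silently accepted.
    """
    value = (raw or "").strip().lower()
    if not value:
        return default
    if value not in VALID_AGENT_MODES:
        raise ValueError(
            f"AGENT_MODE must be one of {VALID_AGENT_MODES}, got {raw!r}"
        )
    return value


if __name__ == "__main__":
    print(parse_agent_mode(os.environ.get("AGENT_MODE")))
```

Failing at startup when the value is unknown is a design choice; falling back to `"pipeline"` with a logged warning would be an equally reasonable alternative.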
src/retrieval/vector_store.py
CHANGED
|
@@ -9,7 +9,7 @@ from langchain_core.documents import Document
|
|
| 9 |
from langchain_core.retrievers import BaseRetriever
|
| 10 |
from pydantic import ConfigDict
|
| 11 |
from qdrant_client import QdrantClient
|
| 12 |
-
from qdrant_client.models import Distance, PointStruct, VectorParams
|
| 13 |
|
| 14 |
from src.models import ChunkStrategy, DocumentChunk, QueryResult
|
| 15 |
|
|
@@ -144,6 +144,55 @@ class VectorStore:
|
|
| 144 |
logger.info("Loaded %d chunks from collection '%s'", len(chunks), self._collection_name)
|
| 145 |
return chunks
|
| 146 |
|
| 147 |
def as_retriever(self, embedder: Any, top_k: int) -> BaseRetriever:
|
| 148 |
"""Return a LangChain BaseRetriever wrapping this vector store.
|
| 149 |
|
|
|
|
| 9 |
from langchain_core.retrievers import BaseRetriever
|
| 10 |
from pydantic import ConfigDict
|
| 11 |
from qdrant_client import QdrantClient
|
| 12 |
+
from qdrant_client.models import Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams
|
| 13 |
|
| 14 |
from src.models import ChunkStrategy, DocumentChunk, QueryResult
|
| 15 |
|
|
|
|
| 144 |
logger.info("Loaded %d chunks from collection '%s'", len(chunks), self._collection_name)
|
| 145 |
return chunks
|
| 146 |
|
| 147 |
+
def list_document_ids(self) -> list[str]:
|
| 148 |
+
"""Return a sorted list of unique document IDs in the collection.
|
| 149 |
+
|
| 150 |
+
Returns:
|
| 151 |
+
Sorted list of document ID strings.
|
| 152 |
+
"""
|
| 153 |
+
all_chunks = self.get_all_chunks()
|
| 154 |
+
ids = sorted({chunk.document_id for chunk in all_chunks})
|
| 155 |
+
logger.debug("Found %d unique document IDs", len(ids))
|
| 156 |
+
return ids
|
| 157 |
+
|
| 158 |
+
def get_chunks_by_document_id(self, document_id: str) -> list[DocumentChunk]:
|
| 159 |
+
"""Retrieve all chunks belonging to a specific document.
|
| 160 |
+
|
| 161 |
+
Uses a Qdrant payload filter to avoid loading the full collection.
|
| 162 |
+
|
| 163 |
+
Args:
|
| 164 |
+
document_id: The document identifier to filter by.
|
| 165 |
+
|
| 166 |
+
Returns:
|
| 167 |
+
List of DocumentChunk objects for that document, in storage order.
|
| 168 |
+
"""
|
| 169 |
+
records, _offset = self._client.scroll(
|
| 170 |
+
collection_name=self._collection_name,
|
| 171 |
+
scroll_filter=Filter(
|
| 172 |
+
must=[FieldCondition(key="document_id", match=MatchValue(value=document_id))]
|
| 173 |
+
),
|
| 174 |
+
limit=10_000,
|
| 175 |
+
with_payload=True,
|
| 176 |
+
with_vectors=False,
|
| 177 |
+
)
|
| 178 |
+
|
| 179 |
+
chunks: list[DocumentChunk] = []
|
| 180 |
+
for record in records:
|
| 181 |
+
payload = record.payload
|
| 182 |
+
chunks.append(
|
| 183 |
+
DocumentChunk(
|
| 184 |
+
chunk_id=payload["chunk_id"],
|
| 185 |
+
document_id=payload["document_id"],
|
| 186 |
+
text=payload["text"],
|
| 187 |
+
metadata=json.loads(payload["metadata"]),
|
| 188 |
+
strategy=ChunkStrategy(payload["strategy"]),
|
| 189 |
+
)
|
| 190 |
+
)
|
| 191 |
+
logger.debug(
|
| 192 |
+
"Fetched %d chunks for document '%s'", len(chunks), document_id
|
| 193 |
+
)
|
| 194 |
+
return chunks
|
| 195 |
+
|
| 196 |
def as_retriever(self, embedder: Any, top_k: int) -> BaseRetriever:
|
| 197 |
"""Return a LangChain BaseRetriever wrapping this vector store.
|
| 198 |
|
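`get_chunks_by_document_id` issues a single `scroll` call with `limit=10_000` and discards the returned offset, so a document with more chunks than that would be truncated without warning. A possible paginated variant is sketched below. `scroll_all` and `FakeClient` are hypothetical names used only for illustration; the sketch relies solely on `scroll` returning a `(records, next_offset)` pair, with `next_offset` being `None` on the last page, which is the contract of `QdrantClient.scroll`.

```python
from typing import Any


def scroll_all(
    client: Any,
    collection_name: str,
    scroll_filter: Any = None,
    page_size: int = 256,
) -> list[Any]:
    """Collect every record matching `scroll_filter`, page by page."""
    records: list[Any] = []
    offset = None  # None asks for the first page
    while True:
        page, offset = client.scroll(
            collection_name=collection_name,
            scroll_filter=scroll_filter,
            limit=page_size,
            offset=offset,
            with_payload=True,
            with_vectors=False,
        )
        records.extend(page)
        if offset is None:  # no further pages
            break
    return records


class FakeClient:
    """In-memory stand-in that mimics the (records, next_offset) contract."""

    def __init__(self, items):
        self._items = list(items)

    def scroll(self, *, collection_name, scroll_filter, limit, offset,
               with_payload, with_vectors):
        start = offset or 0
        end = start + limit
        next_offset = end if end < len(self._items) else None
        return self._items[start:end], next_offset


if __name__ == "__main__":
    client = FakeClient(range(10))
    print(scroll_all(client, "demo", page_size=3))
```

In the real method, the `Filter(must=[FieldCondition(...)])` object from the diff would be passed as `scroll_filter`, and the loop over `records` that builds `DocumentChunk` objects would stay unchanged.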
src/ui/app.py
CHANGED
|
@@ -54,8 +54,10 @@ TEXTS: dict[str, dict[str, str]] = {
|
|
| 54 |
"- **LLM-integration** — provider-agnostisk, prompt-styret "
|
| 55 |
"svargenerering\n"
|
| 56 |
"- **Evaluering** — RAGAS-baseret kvalitetsmaaling\n"
|
| 57 |
-
"- **Agent
|
| 58 |
-
"
|
| 59 |
),
|
| 60 |
"chunking_label": "Chunking-strategi",
|
| 61 |
"chunking_help": "Vaelg hvordan dokumenterne opdeles i tekststykker.",
|
|
@@ -99,7 +101,7 @@ TEXTS: dict[str, dict[str, str]] = {
|
|
| 99 |
"pipeline_original": "Original foresporgsel",
|
| 100 |
"pipeline_translated": "Oversat til dansk",
|
| 101 |
"pipeline_lang": "Sprog registreret",
|
| 102 |
-
"pipeline_no_translation": "Ingen oversaettelse
|
| 103 |
"pipeline_bm25": "BM25-resultater (leksikalsk soegning)",
|
| 104 |
"pipeline_dense": "Vektorsoegning (semantisk)",
|
| 105 |
"pipeline_fused": "RRF-fusioneret raekkefoelge",
|
|
@@ -129,8 +131,10 @@ TEXTS: dict[str, dict[str, str]] = {
|
|
| 129 |
"- **LLM integration** — provider-agnostic, prompt-driven "
|
| 130 |
"answer generation\n"
|
| 131 |
"- **Evaluation** — RAGAS-based quality measurement\n"
|
| 132 |
-
"- **Agent
|
| 133 |
-
"
|
| 134 |
),
|
| 135 |
"chunking_label": "Chunking strategy",
|
| 136 |
"chunking_help": "Choose how documents are split into text chunks.",
|
|
@@ -174,7 +178,7 @@ TEXTS: dict[str, dict[str, str]] = {
|
|
| 174 |
"pipeline_original": "Original query",
|
| 175 |
"pipeline_translated": "Translated to Danish",
|
| 176 |
"pipeline_lang": "Detected language",
|
| 177 |
-
"pipeline_no_translation": "No
|
| 178 |
"pipeline_bm25": "BM25 Results (lexical search)",
|
| 179 |
"pipeline_dense": "Vector Search (semantic)",
|
| 180 |
"pipeline_fused": "RRF Fused Ranking",
|
|
@@ -487,9 +491,9 @@ if search_clicked and question.strip():
|
|
| 487 |
)
|
| 488 |
else:
|
| 489 |
st.write(
|
| 490 |
-
"
|
| 491 |
if lang == "da"
|
| 492 |
-
else "
|
| 493 |
)
|
| 494 |
|
| 495 |
elif _step == "retrieve":
|
|
@@ -510,6 +514,23 @@ if search_clicked and question.strip():
|
|
| 510 |
else (f"Reranked to **{_rc}** results · confidence **{_cf:.0%}**")
|
| 511 |
)
|
| 512 |
|
| 513 |
elif _step == "generate":
|
| 514 |
st.write(
|
| 515 |
"Svar genereret"
|
|
|
|
| 54 |
"- **LLM-integration** — provider-agnostisk, prompt-styret "
|
| 55 |
"svargenerering\n"
|
| 56 |
"- **Evaluering** — RAGAS-baseret kvalitetsmaaling\n"
|
| 57 |
+
"- **Agent Flows** — valgfri ReAct-loop med vaerktoejskald: "
|
| 58 |
+
"LLM bestemmer selv hvor mange soegninger der behoeves og "
|
| 59 |
+
"stoetter flertrinssraesonnering paa tvaers af dokumenter "
|
| 60 |
+
"(`AGENT_MODE=react`)"
|
| 61 |
),
|
| 62 |
"chunking_label": "Chunking-strategi",
|
| 63 |
"chunking_help": "Vaelg hvordan dokumenterne opdeles i tekststykker.",
|
|
|
|
| 101 |
"pipeline_original": "Original foresporgsel",
|
| 102 |
"pipeline_translated": "Oversat til dansk",
|
| 103 |
"pipeline_lang": "Sprog registreret",
|
| 104 |
+
"pipeline_no_translation": "Ingen oversaettelse nødvendig",
|
| 105 |
"pipeline_bm25": "BM25-resultater (leksikalsk soegning)",
|
| 106 |
"pipeline_dense": "Vektorsoegning (semantisk)",
|
| 107 |
"pipeline_fused": "RRF-fusioneret raekkefoelge",
|
|
|
|
| 131 |
"- **LLM integration** — provider-agnostic, prompt-driven "
|
| 132 |
"answer generation\n"
|
| 133 |
"- **Evaluation** — RAGAS-based quality measurement\n"
|
| 134 |
+
"- **Agent Flows** — optional ReAct loop with tool calling: "
|
| 135 |
+
"the LLM decides how many searches are needed and supports "
|
| 136 |
+
"multi-step reasoning across documents "
|
| 137 |
+
"(`AGENT_MODE=react`)"
|
| 138 |
),
|
| 139 |
"chunking_label": "Chunking strategy",
|
| 140 |
"chunking_help": "Choose how documents are split into text chunks.",
|
|
|
|
| 178 |
"pipeline_original": "Original query",
|
| 179 |
"pipeline_translated": "Translated to Danish",
|
| 180 |
"pipeline_lang": "Detected language",
|
| 181 |
+
"pipeline_no_translation": "No need for translation",
|
| 182 |
"pipeline_bm25": "BM25 Results (lexical search)",
|
| 183 |
"pipeline_dense": "Vector Search (semantic)",
|
| 184 |
"pipeline_fused": "RRF Fused Ranking",
|
|
|
|
| 491 |
)
|
| 492 |
else:
|
| 493 |
st.write(
|
| 494 |
+
"Ingen oversættelse nødvendig for forespørgslen"
|
| 495 |
if lang == "da"
|
| 496 |
+
else "No translation needed for the query"
|
| 497 |
)
|
| 498 |
|
| 499 |
elif _step == "retrieve":
|
|
|
|
| 514 |
else (f"Reranked to **{_rc}** results · confidence **{_cf:.0%}**")
|
| 515 |
)
|
| 516 |
|
| 517 |
+
elif _step == "tool_call":
|
| 518 |
+
_tool_name = _event.get("tool", "")
|
| 519 |
+
_tool_query = _event.get("query", "")
|
| 520 |
+
st.write(
|
| 521 |
+
(f"Vaerktoej **{_tool_name}** kaldt: _{_tool_query}_")
|
| 522 |
+
if lang == "da"
|
| 523 |
+
else (f"Tool **{_tool_name}** called: _{_tool_query}_")
|
| 524 |
+
)
|
| 525 |
+
|
| 526 |
+
elif _step == "tool_result":
|
| 527 |
+
_rc = _event.get("result_count", 0)
|
| 528 |
+
st.write(
|
| 529 |
+
(f"Hentet **{_rc}** dokumenter")
|
| 530 |
+
if lang == "da"
|
| 531 |
+
else (f"Retrieved **{_rc}** documents")
|
| 532 |
+
)
|
| 533 |
+
|
| 534 |
elif _step == "generate":
|
| 535 |
st.write(
|
| 536 |
"Svar genereret"
|