RAG Email Assistant (Haystack 2 + PydanticAI + Gradio)

Implementation Specifications — 2025-ready, modular, production-minded

Full engineering specification for implementing the staff-assist RAG system based on Haystack 2, PydanticAI, and Gradio.


Overview

This document specifies how to implement a Retrieval-Augmented Generation (RAG) system that helps university administrative staff draft replies to student emails, using policy documents written in Markdown.

It defines:

  • System architecture
  • Technologies and dependencies
  • Data ingestion and indexing
  • Retrieval and generation pipelines
  • Agents and validation
  • User interface behavior
  • Deployment and CI/CD guidance

System Objectives

  • Automate the first draft of admin replies using institution policy sources.
  • Ensure every output is traceable and grounded in authoritative sources.
  • Provide a non-technical, auditable interface for staff.
  • Maintain strict privacy and institutional compliance.

Core Architecture

Components

| Layer | Technology | Description |
|---|---|---|
| Storage | OpenSearch | Hybrid BM25 + vector retrieval, 1024-d embeddings |
| Embeddings | multilingual-e5-large-instruct | Multilingual, normalized vectors |
| Retrieval | Haystack 2 pipelines | Modular retrievers + re-ranker |
| Reasoning | PydanticAI agents | Typed, validated outputs |
| Interface | Gradio 4.x | Simple GUI for staff use |
| Logging | Structlog JSON | Observability and auditability |

Agents

  1. IntentExtractor — Extracts intent, questions, entities, and language from a student email.
  2. Retriever (Haystack) — Finds relevant chunks based on extracted questions.
  3. Composer — Drafts a reply using retrieved content.
  4. FactChecker — Validates draft against evidence and flags unsupported claims.

Project Layout

app/
  config.py                # settings
  logging_setup.py
  models.py                # Pydantic schemas
  utils/
    markdown_loader.py     # Markdown parser
  retriever/
    indexer.py
    pipeline.py
  agents/
    llm_client.py
    intent_extractor.py
    composer.py
    fact_checker.py
  gradio_app.py
  main.py
scripts/
  ingest_markdown.py
  create_index.py
  healthcheck.py
configs/
  opensearch_index.json
docs/
  RAG_Email_Assistant_Specifications_v1.0.md

Technologies & Dependencies

  • Python: 3.11+
  • Libraries: haystack-ai, opensearch-py, sentence-transformers, pydantic, pydantic-ai, gradio, structlog, fastapi
  • Model weights: hosted or local (intfloat/multilingual-e5-large-instruct, BAAI/bge-reranker-v2-m3, openai/gpt-oss-20b)
  • Infra: OpenSearch ≥ 2.11 with k-NN; Docker-compose optional.
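
As a rough illustration of what configs/opensearch_index.json might contain, the sketch below builds an index body with k-NN enabled and a 1024-dimensional vector field. The field names (`text`, `embedding`, and so on), the HNSW method, the `lucene` engine, and the cosine space type are assumptions chosen for the example; only the k-NN requirement and the embedding size come from this specification.

```python
import json

# Assumed field names and k-NN method settings; adjust to the real schema.
INDEX_BODY = {
    "settings": {"index": {"knn": True}},  # enable the k-NN plugin for this index
    "mappings": {
        "properties": {
            "text": {"type": "text"},              # BM25 side of hybrid retrieval
            "title": {"type": "text"},
            "section_path": {"type": "keyword"},
            "lang": {"type": "keyword"},
            "embedding": {
                "type": "knn_vector",
                "dimension": 1024,                 # matches the 1024-d embeddings
                "method": {
                    "name": "hnsw",
                    "engine": "lucene",
                    "space_type": "cosinesimil",
                },
            },
        }
    },
}


def render_index_json() -> str:
    """Serialize the index body as it could be stored in a JSON config file."""
    return json.dumps(INDEX_BODY, indent=2)


if __name__ == "__main__":
    print(render_index_json())
```

Because the embeddings are normalized, cosine similarity and inner product rank documents identically, so either space type would fit if the chosen engine supports it.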

Configuration

All config uses environment variables (Twelve-Factor pattern).
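Example (prefix RAG_, nested keys separated by a double underscore): RAG_MODELS__LLM_MODEL=openai/gpt-oss-20b

The full variable mapping was given in an earlier chat message and is not reproduced in this document.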
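
To make the naming pattern concrete, here is a minimal standard-library sketch that folds prefixed environment variables into a nested dictionary. The function name `load_settings` and the lowercase key convention are assumptions for illustration. A real implementation would more likely use a settings library with prefix and nested-delimiter support, which this specification does not name.

```python
import os
from typing import Any, Dict, Mapping


def load_settings(environ: Mapping[str, str], prefix: str = "RAG_",
                  delimiter: str = "__") -> Dict[str, Any]:
    """Collect prefixed variables into a nested dict with lowercase keys."""
    settings: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated environment variables
        parts = name[len(prefix):].lower().split(delimiter)
        node = settings
        for part in parts[:-1]:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{name} conflicts with a scalar setting")
        node[parts[-1]] = value
    return settings


if __name__ == "__main__":
    print(load_settings(os.environ))
```

With `RAG_MODELS__LLM_MODEL=openai/gpt-oss-20b` set, the result contains `{"models": {"llm_model": "openai/gpt-oss-20b"}}`.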


Markdown Ingestion and Indexing

  • Parse Markdown with markdown-it-py.
  • Preserve headings, tables, and metadata (section_path, lang, title).
  • Sentence-based chunking (~350 tokens) for prose; 1 chunk per table.
  • Embed all chunks with normalized E5 embeddings and store in OpenSearch.
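
The sentence-based chunking rule can be sketched as follows. This is a simplified illustration, not the project's loader: it counts whitespace-separated words as a stand-in for model tokens and splits sentences with a basic regular expression, both of which are assumptions. Tables, which the rule above keeps as single chunks, would bypass this function.

```python
import re
from typing import Iterator, List

# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def chunk_sentences(text: str, max_tokens: int = 350) -> Iterator[str]:
    """Yield chunks made of whole sentences, each within max_tokens words.

    A single sentence longer than the budget is emitted as its own chunk.
    """
    current: List[str] = []
    count = 0
    for sentence in SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        n = len(sentence.split())  # word count approximates token count
        if current and count + n > max_tokens:
            yield " ".join(current)
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        yield " ".join(current)


if __name__ == "__main__":
    sample = "One two three. Four five six. Seven eight nine."
    for chunk in chunk_sentences(sample, max_tokens=6):
        print(chunk)
```

Swapping the word count for the embedding model's tokenizer would make the ~350-token budget exact.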

Retrieval Pipeline

  1. Hybrid retrieval = BM25 + dense embeddings.
  2. Fuse via Reciprocal Rank Fusion (RRF).
  3. Re-rank top candidates using a multilingual cross-encoder (BAAI/bge-reranker-v2-m3).
  4. Return top 5 for generation.
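
The fusion step (step 2) can be illustrated with a small standalone function. Each document receives the sum of 1 / (k + rank) over the rankings it appears in. The constant k = 60 is a commonly used default and an assumption here, as are the function and parameter names. Haystack 2's DocumentJoiner provides a reciprocal rank fusion join mode, so production code would probably not need a hand-written version.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60,
                           top_n: Optional[int] = None) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of document ids into one scored list."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id for deterministic output.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused if top_n is None else fused[:top_n]


if __name__ == "__main__":
    bm25_hits = ["a", "b", "c"]
    dense_hits = ["b", "c", "d"]
    for doc_id, score in reciprocal_rank_fusion([bm25_hits, dense_hits]):
        print(doc_id, round(score, 5))
```

Documents found by both retrievers rise to the top because their reciprocal ranks add up, which is why RRF needs no score calibration between BM25 and vector similarity.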

Agentic Pipeline (PydanticAI)

StudentQuery(intent, questions, language, entities)
→ retrieve()
→ compose()
→ fact_check()
→ EmailDraft(body, citations, warnings)

Agents exchange typed objects, not raw text, ensuring safe re-prompting and audit logging.
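
The flow above might look roughly like this in code. To keep the sketch runnable with only the standard library, dataclasses stand in for the Pydantic models, and the three stage functions are placeholders rather than real agents. The `Evidence` type, the `run_pipeline` helper, and the warning messages are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class StudentQuery:
    intent: str
    questions: List[str]
    language: str
    entities: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class Evidence:  # hypothetical container for one retrieved chunk
    chunk_id: str
    text: str


@dataclass
class EmailDraft:
    body: str
    citations: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def retrieve(query: StudentQuery) -> List[Evidence]:
    # Placeholder: a real version would call the Haystack retrieval pipeline.
    return [Evidence("chunk-1", "placeholder policy text")]


def compose(query: StudentQuery, evidence: List[Evidence]) -> EmailDraft:
    # Placeholder: a real version would prompt the LLM with the evidence.
    body = f"[{query.language}] Draft answering {len(query.questions)} question(s)."
    return EmailDraft(body=body, citations=[e.chunk_id for e in evidence])


def fact_check(draft: EmailDraft, evidence: List[Evidence]) -> EmailDraft:
    # Toy rule: every citation must point at a retrieved chunk.
    known = {e.chunk_id for e in evidence}
    warnings = [f"citation {c} not in retrieved evidence"
                for c in draft.citations if c not in known]
    if not draft.citations:
        warnings.append("draft cites no evidence")
    return EmailDraft(draft.body, draft.citations, draft.warnings + warnings)


def run_pipeline(query: StudentQuery) -> EmailDraft:
    evidence = retrieve(query)
    return fact_check(compose(query, evidence), evidence)
```

Because each stage accepts and returns a typed object, any stage can be logged, replayed, or re-prompted in isolation without parsing free text.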


Gradio UI

  • Single text box → staff pastes student email.
  • Button → “Generate Draft”.
  • Output → editable email draft.
  • Accordion (hidden by default) → shows evidence chunks with metadata.

Designed for Hugging Face Spaces deployment with external OpenSearch.


Deployment Notes

  • Run OpenSearch separately (local Docker, managed cluster, or university infra).
  • Host Gradio app on Spaces; connect via HTTPS.
  • Optional FAISS fallback for demo mode.
  • Logging to JSON; metrics optional.

Quality & Evaluation

  • Maintain a gold dataset of real, anonymized queries.
  • Evaluate Recall@5, nDCG, and groundedness ratio.
  • Audit the warning rate (% of drafts flagged by FactChecker).
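
Two of the retrieval metrics above can be computed as sketched below. The sketch assumes binary relevance labels (a chunk is either relevant or not) and unique document ids in each result list. The function names are illustrative. The groundedness ratio is left out because its exact definition is not given in this document.

```python
import math
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of relevant ids that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def ndcg_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc_id in enumerate(retrieved[:k], start=1)
              if doc_id in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


if __name__ == "__main__":
    results = ["a", "x", "b", "y", "z"]
    gold = {"a", "b", "c"}
    print(recall_at_k(results, gold), ndcg_at_k(results, gold))
```

Averaging both values over the gold dataset gives one number per metric that can be tracked across index or model changes.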

Security & Privacy

  • No student data stored persistently.
  • All transmissions over HTTPS.
  • LLM endpoints configured for no data retention.
  • Mask PII in logs.
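
One way to approach log masking is a small processor function that rewrites string values before they are rendered. The sketch below only masks email addresses; the regular expression, the `[EMAIL]` placeholder, and the function name are assumptions, and other identifiers (names, student numbers, phone numbers) would need their own patterns. The function takes `(logger, method_name, event_dict)`, which is the call shape structlog uses for processors.

```python
import re
from typing import Any, MutableMapping

# Deliberately simple email pattern; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_pii(logger: Any, method_name: str,
             event_dict: MutableMapping[str, Any]) -> MutableMapping[str, Any]:
    """Replace email addresses in top-level string values with a placeholder."""
    for key, value in list(event_dict.items()):
        if isinstance(value, str):
            event_dict[key] = EMAIL_RE.sub("[EMAIL]", value)
    return event_dict


if __name__ == "__main__":
    print(mask_pii(None, "info", {"event": "mail from jane.doe@example.org"}))
```

Placed in the processor chain before the JSON renderer, a function like this would keep raw addresses out of the log files. Nested structures are not traversed in this version.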

File Naming and Storage

  • Save this specification as: docs/RAG_Email_Assistant_Specifications_v1.0.md
  • Treat it as the authoritative internal design document.

Maintainer: Virtuelle Akademie AI Lab
Author: Andrew Ellis, BFH
Version: 1.0 (October 2025)
License: Internal use only.