RAG Email Assistant (Haystack 2 + PydanticAI + Gradio)

Implementation Specifications — 2025-ready, modular, production-minded

Full engineering specification for implementing the staff-assist RAG system based on Haystack 2, PydanticAI, and Gradio.

Overview

This document specifies how to implement a Retrieval-Augmented Generation (RAG) system that helps university administrative staff draft replies to student emails, using policy documents written in Markdown.

It defines:

System architecture
Technologies and dependencies
Data ingestion and indexing
Retrieval and generation pipelines
Agents and validation
User interface behavior
Deployment and CI/CD guidance

System Objectives

Automate the first draft of admin replies using institution policy sources.
Ensure every output is traceable and grounded in authoritative sources.
Provide a non-technical, auditable interface for staff.
Maintain strict privacy and institutional compliance.

Core Architecture

Components

| Layer | Technology | Description |
| --- | --- | --- |
| Storage | OpenSearch | Hybrid BM25 + vector retrieval, 1024-d embeddings |
| Embeddings | multilingual-e5-large-instruct | Multilingual, normalized vectors |
| Retrieval | Haystack 2 pipelines | Modular retrievers + re-ranker |
| Reasoning | PydanticAI agents | Typed, validated outputs |
| Interface | Gradio 4.x | Simple GUI for staff use |
| Logging | Structlog JSON | Observability and auditability |

Agents

IntentExtractor — Extracts intent, questions, entities, and language from a student email.
Retriever (Haystack) — Finds relevant chunks based on extracted questions.
Composer — Drafts a reply using retrieved content.
FactChecker — Validates draft against evidence and flags unsupported claims.

Project Layout

app/
  config.py                # settings
  logging_setup.py
  models.py                # Pydantic schemas
  utils/
    markdown_loader.py     # Markdown parser
  retriever/
    indexer.py
    pipeline.py
  agents/
    llm_client.py
    intent_extractor.py
    composer.py
    fact_checker.py
  gradio_app.py
  main.py
scripts/
  ingest_markdown.py
  create_index.py
  healthcheck.py
configs/
  opensearch_index.json
docs/
  RAG_Email_Assistant_Specifications_v1.0.md

Technologies & Dependencies

Python: 3.11+
Libraries: haystack-ai, opensearch-py, sentence-transformers, pydantic, pydantic-ai, gradio, structlog, fastapi
Models (hosted or local weights): intfloat/multilingual-e5-large-instruct (embeddings), BAAI/bge-reranker-v2-m3 (re-ranker), openai/gpt-oss-20b (LLM)
Infra: OpenSearch ≥ 2.11 with k-NN enabled; Docker Compose optional.
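
The k-NN requirement and the 1024-d embeddings from the components table translate into an index body along the following lines. This is a minimal sketch of what configs/opensearch_index.json might contain, written as a Python dict so it can be dumped to JSON. The field names `text` and `embedding` and the helper `build_index_body` are assumptions; `title`, `section_path`, and `lang` are the metadata fields named in this specification.

```python
import json

EMBEDDING_DIM = 1024  # embedding size stated in the components table


def build_index_body(dim: int = EMBEDDING_DIM) -> dict:
    """Return an index body with BM25 text fields and one k-NN vector field."""
    return {
        # Enables the k-NN plugin for this index.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "text": {"type": "text"},             # assumed name for the chunk body (BM25)
                "title": {"type": "text"},
                "section_path": {"type": "keyword"},
                "lang": {"type": "keyword"},
                # Assumed name for the dense vector field.
                "embedding": {"type": "knn_vector", "dimension": dim},
            }
        },
    }


def to_json(body: dict) -> str:
    """Serialize the index body, e.g. for writing to a config file."""
    return json.dumps(body, indent=2)
```

No `method` block is set here, so OpenSearch should fall back to its default L2 space. For unit-length vectors that ranks neighbours in the same order as cosine similarity, which is one reason to keep the embeddings normalized.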

Configuration

All config uses environment variables (Twelve-Factor pattern).
Example (prefix RAG_, double underscore for nesting): RAG_MODELS__LLM_MODEL=openai/gpt-oss-20b

The full variable mapping was given in an earlier chat message and is not reproduced here.
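
The naming convention in the example can be read as a `RAG_` prefix plus `__` as a nesting delimiter. The sketch below shows one way to turn such variables into a nested settings dict using only the standard library. The function name `load_settings` and the lower-casing of keys are assumptions; in app/config.py a settings library such as pydantic-settings (with its `env_prefix` and `env_nested_delimiter` options) would probably do this job instead.

```python
import os
from typing import Mapping

PREFIX = "RAG_"          # assumed from the example variable name
NESTED_DELIMITER = "__"  # assumed from the double underscore in the example


def load_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect RAG_* variables into a nested dict keyed by lower-cased path parts."""
    settings: dict = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue  # ignore unrelated environment variables
        path = name[len(PREFIX):].lower().split(NESTED_DELIMITER)
        node = settings
        for key in path[:-1]:
            node = node.setdefault(key, {})  # descend, creating sections as needed
        node[path[-1]] = value
    return settings
```

With only `RAG_MODELS__LLM_MODEL=openai/gpt-oss-20b` set, the result is `{"models": {"llm_model": "openai/gpt-oss-20b"}}`.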

Markdown Ingestion and Indexing

Parse Markdown with markdown-it-py.
Preserve headings, tables, and metadata (section_path, lang, title).
Sentence-based chunking (~350 tokens) for prose; 1 chunk per table.
Embed every chunk with E5, L2-normalize the vectors, and store them in OpenSearch.
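
The sentence-based chunking rule for prose can be sketched as a greedy packer: sentences are added to a chunk until the next one would exceed the budget. This is an illustration under simplifying assumptions, not the project's loader. Sentences are split with a naive regular expression, and whitespace-separated words stand in for model tokens. The names `split_sentences` and `chunk_sentences` are hypothetical.

```python
import re

MAX_TOKENS = 350  # target from the spec; words approximate tokens in this sketch

# Naive boundary: whitespace that follows ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    """Split prose into sentences, dropping empty fragments."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def chunk_sentences(text: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens words."""
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in split_sentences(text):
        n = len(sentence.split())
        # Close the running chunk if this sentence would push it over budget.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A single sentence longer than the budget is kept whole as its own oversized chunk, since sentences are never split. Tables would bypass this function entirely, because the spec assigns one chunk per table.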

Retrieval Pipeline

Hybrid retrieval = BM25 + dense embeddings.
Fuse via Reciprocal Rank Fusion (RRF).
Re-rank the top candidates with a multilingual cross-encoder (BAAI/bge-reranker-v2-m3).
Pass the top 5 chunks to generation.
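
To make the fusion step concrete, here is a small standalone sketch of Reciprocal Rank Fusion over two ranked lists of chunk IDs. Each document scores the sum of 1 / (k + rank) across the lists it appears in. The constant `k = 60` is the value commonly used for RRF and is an assumption here, as are the function and parameter names. In the actual pipeline, Haystack 2's `DocumentJoiner` with its `reciprocal_rank_fusion` join mode would likely take this role.

```python
from collections import defaultdict
from typing import Optional


def reciprocal_rank_fusion(
    rankings: list[list[str]],
    k: int = 60,
    limit: Optional[int] = None,
) -> list[str]:
    """Fuse several ranked ID lists; higher summed 1/(k + rank) comes first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:limit]]
```

For example, fusing a BM25 ranking `["a", "b", "c"]` with a dense ranking `["b", "d", "a"]` puts `b` first, because it ranks near the top of both lists, followed by `a`, `d`, and `c`. The fused list would then go to the cross-encoder, and only the re-ranked top 5 would reach generation.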

Agentic Pipeline (PydanticAI)

StudentQuery(intent, questions, language, entities)
→ retrieve()
→ compose()
→ fact_check()
→ EmailDraft(body, citations, warnings)

Agents exchange typed objects, not raw text, ensuring safe re-prompting and audit logging.
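
The typed hand-off between stages can be illustrated as follows. To keep the sketch dependency-free, standard-library dataclasses stand in for the Pydantic models that app/models.py would define. The field types, the `Evidence` class, `run_pipeline`, and the toy `check_citations` check are all assumptions; only the field names of `StudentQuery` and `EmailDraft` come from this specification.

```python
from dataclasses import dataclass, field, replace
from typing import Callable


@dataclass(frozen=True)
class StudentQuery:
    intent: str
    questions: list[str]
    language: str
    entities: list[str] = field(default_factory=list)  # element type is assumed


@dataclass(frozen=True)
class Evidence:  # hypothetical container for one retrieved chunk
    chunk_id: str
    text: str


@dataclass(frozen=True)
class EmailDraft:
    body: str
    citations: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)


def run_pipeline(
    query: StudentQuery,
    retrieve: Callable[[StudentQuery], list[Evidence]],
    compose: Callable[[StudentQuery, list[Evidence]], EmailDraft],
    fact_check: Callable[[EmailDraft, list[Evidence]], EmailDraft],
) -> EmailDraft:
    """Chain the stages; every hand-off is a typed object, never raw text."""
    evidence = retrieve(query)
    draft = compose(query, evidence)
    return fact_check(draft, evidence)


def check_citations(draft: EmailDraft, evidence: list[Evidence]) -> EmailDraft:
    """Toy fact check: warn when a citation matches no retrieved chunk."""
    known = {ev.chunk_id for ev in evidence}
    warnings = list(draft.warnings)
    if not draft.citations:
        warnings.append("draft has no citations")
    warnings += [f"citation not in evidence: {c}" for c in draft.citations if c not in known]
    return replace(draft, warnings=warnings)
```

Because each stage accepts and returns a structured object, a failed validation can trigger a re-prompt for that stage alone, and each intermediate object can be logged as-is for audit. The real FactChecker would compare claims against evidence text with an LLM rather than only checking citation IDs.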

Gradio UI

Single text box → staff pastes student email.
Button → “Generate Draft”.
Output → editable email draft.
Accordion (hidden by default) → shows evidence chunks with metadata.

Designed for Hugging Face Spaces deployment with external OpenSearch.
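
The layout described above maps onto a handful of Gradio components. The following is a sketch of how gradio_app.py might be structured, not the actual file. `generate_draft` is a hypothetical placeholder that returns fixed strings where the agent pipeline would be called, and `build_ui` is likewise an assumed name.

```python
def generate_draft(email_text: str) -> tuple[str, str]:
    """Placeholder handler returning (draft, evidence_markdown)."""
    if not email_text.strip():
        return "", "No email provided."
    # The intent -> retrieve -> compose -> fact-check pipeline would run here.
    draft = "[draft reply would appear here]"
    evidence = "[retrieved chunks with metadata would be listed here]"
    return draft, evidence


def build_ui():
    """Assemble the interface: text box, button, editable draft, hidden evidence."""
    import gradio as gr  # imported lazily so the handler is usable without Gradio

    with gr.Blocks() as demo:
        email_in = gr.Textbox(label="Student email", lines=12)
        generate_btn = gr.Button("Generate Draft")
        # interactive=True keeps the draft editable by staff.
        draft_out = gr.Textbox(label="Draft reply", lines=12, interactive=True)
        # open=False keeps the evidence hidden until staff expand it.
        with gr.Accordion("Evidence", open=False):
            evidence_out = gr.Markdown()
        generate_btn.click(
            generate_draft,
            inputs=email_in,
            outputs=[draft_out, evidence_out],
        )
    return demo
```

Calling `build_ui().launch()` would start the app locally. On Spaces the same call would typically sit at the bottom of the entry-point file.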

Deployment Notes

Run OpenSearch separately (local Docker, managed cluster, or university infra).
Host Gradio app on Spaces; connect via HTTPS.
Optional FAISS fallback for demo mode.
Logging to JSON; metrics optional.

Quality & Evaluation

Maintain a gold dataset of anonymized real queries.
Evaluate Recall@5, nDCG, and groundedness ratio.
Audit the warning rate (% of drafts flagged by FactChecker).
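
Two of the retrieval metrics and the warning rate are simple enough to sketch directly. The functions below assume each gold query comes with a set of relevant chunk IDs (for recall) or graded relevance scores per chunk ID (for nDCG), and that a retrieved list contains no duplicates. All function names are hypothetical. The groundedness ratio is left out because this specification does not define how it is computed.

```python
import math


def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of relevant chunk IDs that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def ndcg_at_k(retrieved: list[str], relevance: dict[str, float], k: int = 5) -> float:
    """DCG of the top k results divided by the DCG of the ideal ordering."""
    dcg = sum(
        relevance.get(doc_id, 0.0) / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
    )
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(rank + 1) for rank, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


def warning_rate(warnings_per_draft: list[list[str]]) -> float:
    """Share of drafts that carry at least one FactChecker warning."""
    if not warnings_per_draft:
        return 0.0
    flagged = sum(1 for warnings in warnings_per_draft if warnings)
    return flagged / len(warnings_per_draft)
```

Averaging `recall_at_k` and `ndcg_at_k` over the gold dataset gives the headline retrieval numbers, and `warning_rate` can be tracked over time as the audit signal.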

Security & Privacy

No student data stored persistently.
All transmissions over HTTPS.
LLM endpoints configured for no data retention.
Mask PII in logs.
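
Masking PII in logs could be implemented as a log processor that rewrites string values before they are rendered. The sketch below masks email addresses only, using a deliberately simple pattern. Which other identifiers count as PII (names, student numbers) depends on institutional rules that this specification does not spell out, so they are not covered. The function names are hypothetical; the three-argument signature follows the structlog processor convention.

```python
import re

# Simple email pattern; intentionally loose rather than RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_pii(text: str) -> str:
    """Replace every email address in the text with a fixed placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def mask_pii_processor(logger, method_name, event_dict):
    """structlog-style processor: mask top-level string values in the event dict."""
    return {
        key: mask_pii(value) if isinstance(value, str) else value
        for key, value in event_dict.items()
    }
```

In a structlog configuration this processor would be placed ahead of the JSON renderer so that masked values are what get serialized. Nested structures are not traversed in this sketch.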

File Naming and Storage

Save this specification as: docs/RAG_Email_Assistant_Specifications_v1.0.md
Treat it as the authoritative internal design document.

Maintainer: Virtuelle Akademie AI Lab
Author: Andrew Ellis, BFH
Version: 1.0 (October 2025)
License: Internal use only.