metadata
title: Hanbridge Clare Assistant (Product UI)
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: docker
pinned: false
license: mit
Hanbridge Clare Assistant – Product Version
This Space hosts Clare, an AI-powered personalized learning assistant for Hanbridge University.
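How to run (recommended: product Web UI)
Use the React product interface (Hanbridge dashboard style: Ask / Review / Quiz, sidebar, SmartReview, and so on):
# 1. Install Python dependencies (from the project root)
pip install -r requirements.txt
# 2. Configure .env (set at least OPENAI_API_KEY)
# 3. One-step start (builds the web frontend and starts the backend; then open http://localhost:8000 in a browser)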
chmod +x run_web.sh && ./run_web.sh
Or run the steps separately:
cd web && npm install && npm run build
cd .. && uvicorn api.server:app --host 0.0.0.0 --port 8000
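See web/使用说明.md for more details.
Optional: the Gradio interface (run python app.py from the project root, port 7860) is suited to quick demos or to the Gradio version of the Hugging Face Space; for product deployment, the Web UI above is recommended.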
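Since the backend needs at least OPENAI_API_KEY in .env, a quick pre-flight check can catch a missing key before the server starts. The following is a minimal sketch, not part of this project: `parse_env_text` and `missing_keys` are hypothetical helper names, and the parser only handles plain KEY=VALUE lines.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines; skip blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env, required=("OPENAI_API_KEY",)):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]
```

To use it against a real file, read the text first, for example `missing_keys(parse_env_text(open(".env", encoding="utf-8").read()))`; an empty list means every required key is set.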
Architecture Overview
- Frontend: React + Vite (exported from Figma design)
- Backend: FastAPI (Python)
- LLM Orchestration: OpenAI + LangChain
- RAG: Vector database (FAISS) + OpenAI embeddings (text-embedding-3-small)
- PDF Parsing: unstructured.io (priority) + pypdf (fallback)
- Observability: LangSmith
- Deployment: Hugging Face Docker Space
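The PDF parsing priority listed above (unstructured.io first, pypdf as fallback) can be sketched as an ordered chain of parsers. This is an illustration under assumptions, not the code in this repository: the function names are hypothetical, and treating empty output as a failure is a design choice made for the example.

```python
def parse_with_unstructured(path):
    # Imported lazily so the fallback still works if unstructured is not installed.
    from unstructured.partition.pdf import partition_pdf
    elements = partition_pdf(filename=path)
    return "\n".join(el.text for el in elements if el.text)


def parse_with_pypdf(path):
    from pypdf import PdfReader
    reader = PdfReader(path)
    # extract_text() may return an empty value for image-only pages.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def extract_pdf_text(path, parsers=(parse_with_unstructured, parse_with_pypdf)):
    """Try each parser in order and return the first non-empty result."""
    errors = []
    for parser in parsers:
        try:
            text = parser(path)
        except Exception as exc:  # missing dependency, malformed PDF, ...
            errors.append(f"{parser.__name__}: {exc}")
            continue
        if text.strip():
            return text
        errors.append(f"{parser.__name__}: empty output")
    raise RuntimeError("all PDF parsers failed: " + "; ".join(errors))
```

Passing the parsers as a tuple keeps the priority order explicit and makes it easy to substitute fakes in tests.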
Optional: Text-to-Speech & Podcast
- TTS: Uses the same OpenAI API key (no extra secrets). Right panel: “Listen (TTS)” converts the current export/summary text to speech.
- Podcast: “Podcast (summary)” or “Podcast (chat)” generates an MP3 from the session summary or full conversation.
- Hugging Face: Set OPENAI_API_KEY in the Space Settings → Secrets. No extra env vars needed. Long podcasts may need a longer timeout on the Space (by default the backend allows up to 2 minutes for /api/podcast).
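A full conversation can be longer than a single TTS request accepts; the OpenAI speech endpoint documents a 4096-character input limit at the time of writing. One way to prepare long text is to split it at sentence boundaries first. The sketch below is an assumption about how this could be done and is not taken from tts_podcast.py; `split_for_tts` is a hypothetical name, and joining the resulting audio segments is not shown.

```python
import re


def split_for_tts(text, max_chars=4096):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that exceeds the limit on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be sent as one TTS request; more requests mean more time, which is where the 2-minute backend timeout becomes relevant.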
📦 project/
├── app.py
├── api/
│ ├── server.py
│ ├── clare_core.py
│ ├── rag_engine.py ← RAG with vector DB (FAISS) + embeddings
│ └── tts_podcast.py ← TTS & podcast (OpenAI TTS)
├── web/ ← React frontend
└── requirements.txt
RAG with Vector Database
- Embeddings: OpenAI text-embedding-3-small (1536 dimensions)
- Vector Storage: FAISS (in-memory, L2 distance)
- Retrieval Strategy: Vector similarity search + token overlap rerank
- PDF Parsing:
  - Primary: unstructured.io (better quality, handles complex layouts)
  - Fallback: pypdf (if unstructured fails)
- Backward Compatible: Falls back to token-based retrieval if embeddings unavailable
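The retrieval strategy described above (L2 vector search, then a token-overlap rerank, with a token-only fallback) can be illustrated with a small self-contained sketch. To avoid external dependencies it uses a brute-force L2 scan in place of FAISS, and toy 2-dimensional vectors in place of real embeddings; the function names and the `k` / `top_n` parameters are assumptions for this example, not the API of rag_engine.py.

```python
import re


def tokens(text):
    return set(re.findall(r"\w+", text.lower()))


def token_overlap(query, text):
    """Fraction of query tokens that also appear in text."""
    q = tokens(query)
    return len(q & tokens(text)) / len(q) if q else 0.0


def l2_squared(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve(query, query_vec, chunks, chunk_vecs, k=4, top_n=2):
    """Pick k nearest chunks by L2 distance, then rerank them by token overlap.

    If query_vec or chunk_vecs is None (embeddings unavailable), every chunk
    is ranked by token overlap alone.
    """
    if query_vec is None or chunk_vecs is None:
        candidates = list(range(len(chunks)))
    else:
        by_distance = sorted(
            range(len(chunks)),
            key=lambda i: l2_squared(query_vec, chunk_vecs[i]),
        )
        candidates = by_distance[:k]
    reranked = sorted(
        candidates,
        key=lambda i: token_overlap(query, chunks[i]),
        reverse=True,
    )
    return [chunks[i] for i in reranked[:top_n]]
```

The vector stage narrows the candidates cheaply, and the rerank stage favors chunks that share literal terms with the question, which helps when two chunks are almost equally close in embedding space.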
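Optional: GenAICoursesDB vector knowledge base (Option 3)
Clare can call the GenAICoursesDB Space on Hugging Face to retrieve GENAI course search results. Set GENAI_COURSES_SPACE=claudqunwang/GenAICoursesDB to enable it; Clare then automatically adds the retrieval results from the course knowledge base to the RAG context on every conversation turn.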
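The GENAI_COURSES_SPACE switch can be pictured as an optional extra retrieval step. The sketch below is an assumption about the control flow only: `build_context` and the `fetch_course_chunks(space_id, question)` callable are hypothetical, and the actual call to the remote Space is left to the caller because its API is not described here.

```python
import os


def build_context(question, local_chunks, fetch_course_chunks, env=os.environ):
    """Join local RAG chunks with course results when the Space is configured."""
    context = list(local_chunks)
    space_id = env.get("GENAI_COURSES_SPACE", "").strip()
    if space_id:
        try:
            context.extend(fetch_course_chunks(space_id, question))
        except Exception:
            # A remote failure (for example a sleeping Space) should not
            # break the chat turn; continue with local context only.
            pass
    return "\n\n".join(context)
```

When the variable is unset the remote call is skipped entirely, so the feature stays opt-in.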