---
title: Hanbridge Clare Assistant (Product UI)
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: docker
pinned: false
license: mit
---
# Hanbridge Clare Assistant – Product Version
This Space hosts **Clare**, an AI-powered personalized learning assistant for Hanbridge University.
## How to Run (Recommended: Product Web UI)
**Use the React product interface (Hanbridge dashboard style: Ask / Review / Quiz, sidebar, SmartReview, and more):**
```bash
# 1. Install Python dependencies (from the project root)
pip install -r requirements.txt
# 2. Configure .env (set at least OPENAI_API_KEY)
# 3. One-step launch (builds the web frontend and starts the backend; open http://localhost:8000 in a browser)
chmod +x run_web.sh && ./run_web.sh
```
Or run the steps separately:
```bash
cd web && npm install && npm run build
cd .. && uvicorn api.server:app --host 0.0.0.0 --port 8000
```
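See **web/使用说明.md** for more details.
**Optional: Gradio UI** (run `python app.py` from the project root, port 7860). It is suited to quick demos or the Gradio version of the Hugging Face Space; for product deployment, the Web UI above is recommended.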
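Because the backend needs at least `OPENAI_API_KEY` in `.env`, a quick pre-flight check can catch a missing key before launch. The following is a minimal sketch, not part of the project: `parse_env` and `missing_keys` are hypothetical helper names, and the parser only handles simple `KEY=value` lines.

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str],
                 required: Iterable[str] = ("OPENAI_API_KEY",)) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    from pathlib import Path

    env = parse_env(Path(".env").read_text(encoding="utf-8"))
    problems = missing_keys(env)
    print("OK" if not problems else f"Missing in .env: {', '.join(problems)}")
```

Run it from the project root before `./run_web.sh`; it only reads the file and never prints key values.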
## Architecture Overview
- **Frontend**: React + Vite (exported from Figma design)
- **Backend**: FastAPI (Python)
- **LLM Orchestration**: OpenAI + LangChain
- **RAG**: Vector database (FAISS) + OpenAI embeddings (text-embedding-3-small)
- **PDF Parsing**: unstructured.io (priority) + pypdf (fallback)
- **Observability**: LangSmith
- **Deployment**: Hugging Face Docker Space
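The PDF parsing order listed above (unstructured.io first, pypdf as fallback) could be sketched roughly as follows. This is an illustration under assumptions, not the code in `api/rag_engine.py`: the function names are hypothetical, and a parser that returns only whitespace is treated as a failure so the next one gets a chance.

```python
from typing import Callable, Sequence


def parse_with_unstructured(path: str) -> str:
    # Imported lazily so the fallback still works if unstructured is not installed.
    from unstructured.partition.pdf import partition_pdf

    return "\n".join(el.text for el in partition_pdf(filename=path) if el.text)


def parse_with_pypdf(path: str) -> str:
    from pypdf import PdfReader

    return "\n".join((page.extract_text() or "") for page in PdfReader(path).pages)


def extract_pdf_text(
    path: str,
    parsers: Sequence[Callable[[str], str]] = (parse_with_unstructured, parse_with_pypdf),
) -> str:
    """Try each parser in priority order and return the first non-empty result."""
    errors = []
    for parser in parsers:
        try:
            text = parser(path)
            if text.strip():
                return text
            errors.append(f"{parser.__name__}: empty output")
        except Exception as exc:  # includes ImportError for a missing library
            errors.append(f"{parser.__name__}: {exc!r}")
    raise RuntimeError(f"All PDF parsers failed for {path}: " + "; ".join(errors))
```

Passing the parsers as a sequence keeps the priority order explicit and makes it easy to substitute stubs in tests.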
### Optional: Text-to-Speech & Podcast
- **TTS**: Uses the same **OpenAI API key** (no extra secrets). In the right panel, “Listen (TTS)” converts the current export/summary text to speech.
- **Podcast**: “Podcast (summary)” or “Podcast (chat)” generates an MP3 from the session summary or the full conversation.
- **Hugging Face**: Set `OPENAI_API_KEY` in the Space **Settings → Secrets**. No extra env vars are needed. For long podcasts, make sure the Space's request timeout is long enough; by default the backend allows up to 2 minutes for `/api/podcast`.
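To illustrate why long podcasts can run into the timeout, here is a hedged sketch of one way to turn long text into an MP3 with the OpenAI Python SDK. It is not the code in `api/tts_podcast.py`. The 4096-character per-request limit, the `tts-1` model, the `alloy` voice, and the helper names are assumptions; check the current OpenAI documentation before relying on them.

```python
import re
from typing import List

MAX_TTS_CHARS = 4096  # assumed per-request input limit


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` chars, preferring sentence ends."""
    chunks: List[str] = []
    current = ""
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_mp3(text: str, out_path: str,
                   model: str = "tts-1", voice: str = "alloy") -> None:
    """Synthesize each chunk in turn and append the MP3 bytes to one file."""
    from openai import OpenAI  # lazy import: the splitter works without the SDK

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(out_path, "wb") as f:
        for chunk in split_for_tts(text):
            response = client.audio.speech.create(model=model, voice=voice, input=chunk)
            f.write(response.content)  # naive byte-level concatenation of MP3 data
```

Each chunk is a separate API round trip, so total latency grows with text length, which is why a full-conversation podcast may approach the 2-minute allowance.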
| ``` | |
| 📦 project/ | |
| ├── app.py | |
| ├── api/ | |
| │ ├── server.py | |
| │ ├── clare_core.py | |
| │ ├── rag_engine.py ← RAG with vector DB (FAISS) + embeddings | |
| │ └── tts_podcast.py ← TTS & podcast (OpenAI TTS) | |
| ├── web/ ← React frontend | |
| └── requirements.txt | |
| ``` | |
| ### RAG with Vector Database | |
| - **Embeddings**: OpenAI `text-embedding-3-small` (1536 dimensions) | |
| - **Vector Storage**: FAISS (in-memory, L2 distance) | |
| - **Retrieval Strategy**: Vector similarity search + token overlap rerank | |
- **PDF Parsing**:
  - Primary: `unstructured.io` (better quality, handles complex layouts)
  - Fallback: `pypdf` (used if unstructured fails)
- **Backward Compatible**: Falls back to token-based retrieval if embeddings are unavailable
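The retrieval strategy above (vector search by L2 distance, then a token-overlap rerank, with a token-only fallback) can be sketched in pure Python. This is an illustration, not the project's `rag_engine.py`: a brute-force distance scan stands in for the FAISS index, the vectors are toy values rather than 1536-dimensional embeddings, and `retrieve` with its parameters is a hypothetical name.

```python
import math
import re
from typing import List, Optional, Sequence, Set, Tuple

Chunk = Tuple[str, Sequence[float]]  # (text, embedding vector)


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def tokenize(text: str) -> Set[str]:
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query: str, query_vec: Optional[Sequence[float]],
             chunks: List[Chunk], top_k: int = 4, candidates: int = 8) -> List[str]:
    """Return up to top_k chunk texts for the query."""
    if query_vec is not None:
        # Stage 1: nearest neighbours by L2 distance (brute force, stands in for FAISS).
        pool = sorted(chunks, key=lambda c: l2_distance(query_vec, c[1]))[:candidates]
    else:
        # Fallback when embeddings are unavailable: rank every chunk by tokens only.
        pool = list(chunks)

    q_tokens = tokenize(query)

    def overlap(text: str) -> float:
        if not q_tokens:
            return 0.0
        return len(q_tokens & tokenize(text)) / len(q_tokens)

    # Stage 2: rerank by token overlap; the stable sort keeps distance order on ties.
    reranked = sorted(pool, key=lambda c: overlap(c[0]), reverse=True)
    return [text for text, _ in reranked[:top_k]]
```

Over-fetching `candidates` neighbours before reranking lets a chunk with strong lexical overlap move ahead of one that is only slightly closer in embedding space.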
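### Optional: GenAICoursesDB Vector Knowledge Base (Option 3)
Clare can call the **GenAICoursesDB** Space on Hugging Face to retrieve GENAI course search results. Set `GENAI_COURSES_SPACE=claudqunwang/GenAICoursesDB` to enable it; Clare then automatically adds the course knowledge base's retrieval results to the RAG context on every conversation.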
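One way the optional lookup could be wired in is sketched below. It is a guess at the shape of the integration, not Clare's actual code: `build_rag_context` and `fetch_from_space` are hypothetical names, the `/predict` endpoint and its string return value are assumptions about the remote Space, and failures of the remote call are ignored because the knowledge base is optional.

```python
import os
from typing import Callable, Iterable, List


def fetch_from_space(space: str, query: str) -> List[str]:
    """Query a Gradio Space; the endpoint name and return type are assumed."""
    from gradio_client import Client  # lazy import: only needed when enabled

    result = Client(space).predict(query, api_name="/predict")
    return [str(result)]


def build_rag_context(
    query: str,
    local_chunks: Iterable[str],
    fetch_remote: Callable[[str, str], List[str]] = fetch_from_space,
) -> str:
    """Join local chunks, plus remote course results when the env var is set."""
    context = list(local_chunks)
    space = os.environ.get("GENAI_COURSES_SPACE")
    if space:
        try:
            context.extend(fetch_remote(space, query))
        except Exception:
            pass  # the remote knowledge base is optional; keep local context only
    return "\n\n".join(context)
```

With `GENAI_COURSES_SPACE` unset, the function returns only the local chunks, so the feature stays opt-in.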