---
title: GitConnect FastAPI Service
emoji: "🚀"
colorFrom: blue
colorTo: green
sdk: docker
sdk_version: "1.0.0"
python_version: "3.12"
app_file: app.py
pinned: false
---

# GitConnect FastAPI Service

FastAPI backend with two primary features:

- Syllabus processing from PDF URLs, with FAISS indexing and multilingual AI summaries.
- Chatbot responses grounded via RAG in both syllabus content and student performance data.

## Core Stack

- API: FastAPI + Uvicorn
- Embeddings: `sentence-transformers/all-MiniLM-L6-v2`
- Vector search: FAISS (`IndexFlatIP` with normalized vectors)
- LLM generation/summarization: Gemini
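
Why normalized vectors matter with an inner-product index: once every vector is scaled to unit length, the inner product equals cosine similarity, so ranking no longer depends on vector magnitude. The sketch below illustrates that idea in plain Python. It is not the service's code and does not use FAISS; the helper names and toy vectors are made up for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, corpus, k=2):
    """Rank corpus vectors by inner product with the query, highest first."""
    q = l2_normalize(query)
    scored = [(inner_product(q, l2_normalize(v)), i) for i, v in enumerate(corpus)]
    return sorted(scored, reverse=True)[:k]


# Toy 2-D "embeddings"; real MiniLM embeddings have far more dimensions.
corpus = [[1.0, 0.5], [3.0, 3.0], [0.0, 2.0]]
results = top_k([2.0, 2.0], corpus, k=2)
ranked_ids = [idx for _, idx in results]
```

Vector 1 ranks first even though its magnitude differs from the query's, because only direction counts after normalization.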

RAG index persistence:

- Runtime retrieval uses per-semester FAISS index files plus per-semester PKL record files.
- Optional cloud persistence to Neon is enabled by setting `RAG_INDEX_DB_URL`.
- On syllabus processing, the updated semester FAISS and PKL files are uploaded to Neon.
- On server restart, if local files are missing, the server hydrates them from Neon automatically.
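
The restart behavior above can be pictured as a "hydrate if missing" step. The following is a minimal sketch under stated assumptions, not the service's implementation: `hydrate_if_missing`, `artifact_paths`, and the `fetch_blob` callable are hypothetical, the remote store is faked with a dictionary, and the file names assume the `x` in `semester_x` is the semester number.

```python
import tempfile
from pathlib import Path


def artifact_paths(index_dir, semester):
    """Local paths for one semester's index and record files (assumed naming)."""
    base = Path(index_dir)
    return [base / f"semester_{semester}.faiss", base / f"semester_{semester}.pkl"]


def hydrate_if_missing(index_dir, semester, fetch_blob, db_url=None):
    """Restore missing artifacts from remote storage; return the names written."""
    if not db_url:  # cloud persistence is optional
        return []
    restored = []
    for path in artifact_paths(index_dir, semester):
        if path.exists():
            continue  # local copy wins; nothing to do
        data = fetch_blob(db_url, path.name)  # hypothetical remote lookup
        if data is None:
            continue  # nothing stored remotely for this file
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        restored.append(path.name)
    return restored


# Demo with an in-memory stand-in for the remote store.
fake_store = {"semester_5.faiss": b"index-bytes", "semester_5.pkl": b"records"}


def fake_fetch(db_url, name):
    return fake_store.get(name)


with tempfile.TemporaryDirectory() as tmp:
    first = hydrate_if_missing(tmp, 5, fake_fetch, db_url="postgresql://example")
    second = hydrate_if_missing(tmp, 5, fake_fetch, db_url="postgresql://example")
```

The second call restores nothing because both files already exist locally. In the real service the URL would presumably come from `RAG_INDEX_DB_URL`.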

About stored artifacts:

- Needed for RAG runtime: `semester_x.faiss` and `semester_x.pkl`.
- Raw text files under `data/raw_text` are not required for search-time RAG.
- Raw text is mainly useful for debugging and auditing ingestion outputs.

## Local Setup

1. Create and activate a virtual environment.
2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Create `.env` from `.env.example` and set `GEMINI_API_KEY`.
4. Run the service:

```bash
uvicorn app.main:app --reload
```
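
If the service starts without `GEMINI_API_KEY`, the failure may only surface on the first LLM call. A fail-fast check along these lines can make the problem visible at startup. This is a sketch only: `require_env` is a hypothetical helper, and it assumes the `.env` values have already been loaded into the process environment by whatever mechanism the service uses.

```python
import os


def require_env(name, environ=os.environ):
    """Return a non-empty environment value or raise a descriptive error."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; copy .env.example to .env and fill it in"
        )
    return value


# Demo with an explicit mapping instead of the real environment.
api_key_demo = require_env("GEMINI_API_KEY", {"GEMINI_API_KEY": "dummy-key"})
```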

## Endpoints

- `GET /health`
- `POST /api/syllabus/process`
- `POST /api/chat`

Student performance source:

- `STUDENT_PERFORMANCE_URL_TEMPLATE` (default):
  - `https://git-connect-backend-v2.vercel.app/api/student/{student_id}/performance`
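
The `{student_id}` placeholder suggests the template is filled per request. A minimal sketch of that substitution, assuming the variable is read from the environment and falls back to the default shown above (`performance_url` is a hypothetical helper, not the service's function):

```python
import os

DEFAULT_TEMPLATE = (
    "https://git-connect-backend-v2.vercel.app/api/student/{student_id}/performance"
)


def performance_url(student_id, environ=os.environ):
    """Fill the URL template, preferring an override from the environment."""
    template = environ.get("STUDENT_PERFORMANCE_URL_TEMPLATE", DEFAULT_TEMPLATE)
    return template.format(student_id=student_id)


# With no override set, the default template is used.
url = performance_url(2, environ={})
```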

## Hugging Face Spaces Deployment

This repository is set up for Docker Spaces deployment.

Deployment-critical files:

- `Dockerfile`
- `requirements.txt`
- `app/`
- `.github/workflows/deploy-hf-space.yml`

Files excluded from git push:

- `.env` and `app/.env`
- generated `data/` files
- local caches and `__pycache__/`

GitHub Actions deployment workflow:

- Trigger: push to `main` or manual run
- Workflow: `.github/workflows/deploy-hf-space.yml`
- Required repository secrets:
  - `HF_TOKEN`
  - `HF_SPACE_REPO_ID` (format: `username/space-name`)
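
A malformed `HF_SPACE_REPO_ID` is an easy mistake to make when setting secrets. The sketch below shows a loose pre-deploy check for the `username/space-name` shape. `parse_space_repo_id` is a hypothetical helper, and it only checks for exactly one slash with non-empty sides; it does not enforce Hugging Face's full naming rules.

```python
def parse_space_repo_id(repo_id):
    """Split 'username/space-name' into its two parts, or raise ValueError."""
    owner, sep, name = repo_id.strip().partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(
            f"expected 'username/space-name', got {repo_id!r}"
        )
    return owner, name


owner, space = parse_space_repo_id("username/space-name")
```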

## Sample Requests

Syllabus processing:

```json
[
  {
    "course_code": "22CS501",
    "name": "Database Management Systems",
    "course_type": "theory",
    "syllabus_url": "https://example.com/dbms.pdf",
    "semester": 5
  }
]
```
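
To send the payload above from Python, a request can be built with the standard library. This is a sketch, not an official client: it assumes the service is running locally on Uvicorn's default `127.0.0.1:8000`, and `build_syllabus_request` is a hypothetical helper. The request is only constructed here; sending it is left as a comment.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumption: Uvicorn's default host and port


def build_syllabus_request(courses, base_url=BASE_URL):
    """Build a JSON POST request for the syllabus processing endpoint."""
    body = json.dumps(courses).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/syllabus/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


courses = [
    {
        "course_code": "22CS501",
        "name": "Database Management Systems",
        "course_type": "theory",
        "syllabus_url": "https://example.com/dbms.pdf",
        "semester": 5,
    }
]
request = build_syllabus_request(courses)
# To actually send it: urllib.request.urlopen(request)
```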

Chat:

```json
{
  "query": "How can I improve attendance this semester?",
  "history": [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello"}
  ],
  "student_id": 2,
  "lang_code": "en",
  "semester": 5
}
```
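
When a client assembles chat requests programmatically, a small builder keeps the payload shape consistent with the sample above. The sketch below is illustrative: `build_chat_payload` is a hypothetical helper, the `"en"` default is an assumption, and the role check covers only the two roles that appear in the sample, so the service may accept others.

```python
import json

ALLOWED_ROLES = {"user", "assistant"}  # roles seen in the sample request


def build_chat_payload(query, student_id, semester, lang_code="en", history=None):
    """Assemble a chat request body, rejecting malformed history turns."""
    history = list(history or [])
    for turn in history:
        if turn.get("role") not in ALLOWED_ROLES or "content" not in turn:
            raise ValueError(f"bad history turn: {turn!r}")
    return {
        "query": query,
        "history": history,
        "student_id": student_id,
        "lang_code": lang_code,
        "semester": semester,
    }


payload = build_chat_payload(
    "How can I improve attendance this semester?",
    student_id=2,
    semester=5,
    history=[
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello"},
    ],
)
body = json.dumps(payload)  # ready to POST to /api/chat
```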

Hugging Face Spaces configuration reference:

- https://huggingface.co/docs/hub/spaces-config-reference