---
title: Non-QM Glossary Bot
emoji: 🏠
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.31.0
app_file: app.py
pinned: false
---

# Non-QM Glossary Chatbot

A professional RAG-powered chatbot that provides instant, accurate definitions of Non-Qualified Mortgage (Non-QM) terms, with strict compliance controls and conversation memory.

## Features

- 🏠 **Non-QM Expertise**: Specialized glossary of mortgage terminology
- 💬 **Conversation Memory**: Smart follow-up question handling
- 🔒 **Compliance First**: Built-in disclaimers and PII protection
- ⚡ **Streaming Responses**: Real-time text generation
- 🎨 **Professional UI**: Modern Gradio interface with custom styling
- 💰 **Cost Efficient**: Optimized for <$10/month operation

## Prerequisites

- Python 3.10 or higher (required by Gradio 5.x)
- OpenAI API key (for embeddings)
- OpenRouter API key (for Gemini LLM access)

## Installation

1. **Clone the repository:**
   ```bash
   git clone <repository-url>
   cd ChatBot
   ```

2. **Create and activate a virtual environment:**
   ```bash
   python -m venv venv
   # On Windows:
   venv\Scripts\activate
   # On macOS/Linux:
   source venv/bin/activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

## API Key Setup

### 1. OpenAI API Key

1. Go to [OpenAI API Keys](https://platform.openai.com/api-keys)
2. Create a new API key
3. Copy the key (starts with `sk-proj-...`)

### 2. OpenRouter API Key

1. Go to [OpenRouter Keys](https://openrouter.ai/keys)
2. Create a new API key
3. Copy the key (starts with `sk-or-...`)

### 3. Environment Configuration

Create a `.env` file in the project root:

```bash
# Create .env file
touch .env
```

Add your API keys to the `.env` file:

```env
OPENAI_API_KEY=sk-proj-your-openai-key-here
OPENROUTER_API_KEY=sk-or-your-openrouter-key-here
```

⚠️ **Important:** Never commit your `.env` file to version control. It's already included in `.gitignore`.

## Running the Application

### 1. Generate Vector Index (First Time Only)

Before running the chatbot for the first time, generate the search index:

```bash
python build_index.py
```

This creates:

- `glossary.index` - FAISS vector search index
- `chunks.json` - Text chunks metadata

### 2. Start the Chatbot

```bash
python app.py
```

The application will start and display:

```
Running on local URL: http://127.0.0.1:7860
```

### 3. Access the Interface

Open your browser and go to: `http://127.0.0.1:7860`

## Usage

### Basic Questions

Ask about Non-QM mortgage terms:

- "What is a Non-QM loan?"
- "Define debt-to-income ratio"
- "What does DSCR mean?"
- "Explain asset-based lending"

### Follow-up Questions

The chatbot remembers conversation context, so after asking about a term you can follow up with:

- "Tell me more"
- "Can you elaborate on that?"
- "Give me more details"

### What NOT to Ask

- Personal financial information
- Rate quotes or loan applications
- Questions outside the glossary scope

## Project Structure

```
ChatBot/
├── app.py              # Main Gradio application
├── build_index.py      # Vector index generation
├── requirements.txt    # Python dependencies
├── glossary.txt        # Source glossary content
├── glossary.index      # Generated FAISS index (after build)
├── chunks.json         # Generated text chunks (after build)
├── .env                # API keys (create this file)
├── .gitignore          # Files to exclude from git
└── memory-bank/        # Project documentation
```

## Configuration

Key settings in `app.py`:

```python
EMBED_MODEL = "text-embedding-3-small"                # OpenAI embeddings
GPT_MODEL = "google/gemini-2.5-flash-preview-05-20"   # OpenRouter LLM
SIM_THRESHOLD = 0.30                                  # Similarity threshold
TOP_K = 3                                             # Number of chunks to retrieve
```

## Deployment

### Hugging Face Spaces

1. **Create a new Space:**
   - Go to [Hugging Face Spaces](https://huggingface.co/spaces)
   - Choose Gradio SDK
   - Set hardware to CPU Basic (free)

2. **Upload required files:**
   ```
   app.py
   requirements.txt
   glossary.txt
   glossary.index
   chunks.json
   build_index.py
   ```

3. **Configure secrets in HF Spaces:**
   - Go to Settings → Variables and Secrets
   - Add `OPENAI_API_KEY`
   - Add `OPENROUTER_API_KEY`

4. **Deploy:**
   - Push files to the Space repository
   - The app will automatically build and deploy

## Maintenance

### Updating the Glossary

1. Edit `glossary.txt` with new terms
2. Regenerate the index:
   ```bash
   python build_index.py
   ```
3. Restart the application

### Cost Monitoring

- **OpenAI**: ~$0.0001 per query (embeddings)
- **OpenRouter**: ~$0.005 per response (Gemini)
- **Target**: <$10/month total operation

### Troubleshooting

**Common Issues:**

1. **"Module not found" error:**
   ```bash
   pip install -r requirements.txt
   ```

2. **"No such file" for index files:**
   ```bash
   python build_index.py
   ```

3. **API key errors:**
   - Check that the `.env` file exists and contains the correct keys
   - Verify that the API keys are valid and have sufficient credits

4. **Import errors:**
   ```bash
   pip install faiss-cpu numpy openai requests gradio python-dotenv
   ```

## Compliance Features

- **Automatic Disclaimers**: Every response includes required compliance text
- **PII Detection**: Blocks emails, SSNs, and credit score references
- **Scope Limiting**: Only answers questions about glossary terms
- **Session Memory**: Context resets when chat is cleared (no persistent data)

## Security

- API keys stored in environment variables
- No user data persistence
- Input sanitization and validation
- PII detection and rejection

## Support

For technical issues:

1. Check the troubleshooting section above
2. Verify all dependencies are installed
3. Ensure API keys are correctly configured
4. Check that vector index files exist

## License

This project is designed for internal compliance-focused use with strict business requirements.
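
## Appendix: PII Check Sketch

The Compliance Features section says the bot blocks emails, SSNs, and credit score references. This README does not show how `app.py` does that, so the following is only a minimal sketch of one way such a check might look. The names `PII_PATTERNS`, `find_pii`, and `is_blocked` are hypothetical, and the regular expressions are illustrative assumptions rather than the project's actual rules.

```python
import re

# Hypothetical patterns: the real rules in app.py are not shown in this README.
PII_PATTERNS = {
    # Simplified email shape: local part, "@", domain, dot, TLD.
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # US Social Security number written as 123-45-6789.
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # Any mention of a credit or FICO score (assumed to count as a reference).
    "credit_score": re.compile(r"\b(?:credit|fico)\s+score\b", re.IGNORECASE),
}


def find_pii(message: str) -> list[str]:
    """Return the names of the PII categories detected in the message."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(message)]


def is_blocked(message: str) -> bool:
    """Return True if the message should be rejected before retrieval."""
    return bool(find_pii(message))


if __name__ == "__main__":
    for text in ["What does DSCR mean?", "My SSN is 123-45-6789"]:
        print(text, "->", find_pii(text))
```

Running such a check before the embedding call would keep flagged input from being sent to either API, which fits the "PII detection and rejection" item under Security. Regex matching of this kind is coarse: it can miss unusual formats and can flag harmless questions, so the patterns would need tuning against real traffic.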