Spaces:

decodingdatascience
/

insuranceomantel

Sleeping

File size: 5,181 Bytes

60eb159

DDS Insurance Q&A — RAG Assistant (Pinecone + OpenAI + Gradio)

Summary: A beginner-friendly, document-grounded insurance bot that you can replicate and deploy on Hugging Face Spaces. It answers only from your uploaded insurance documents using LlamaIndex + Pinecone (serverless) + OpenAI with a simple, polite system prompt.

What You’ll Get

Deployed Space URL you can share.

Grounded answers (no docs → the bot politely says it can’t find it).

Simple UI with an FAQ dropdown + free-text question box.

Clean structure designed for easy replication.

Features

Answers strictly from your data/ documents (RAG).

Pinecone serverless index (AWS us-east-1, cosine, 1536-dim).

OpenAI for embeddings (text-embedding-3-small) and LLM (gpt-4o-mini).

Gradio interface with a centered required logo (data/dds_logo.png).



Beginner-friendly defaults and error messages.



Repository Structure

.

├─ data/                     # Your insurance docs + required logo

│  └─ dds_logo.png           # REQUIRED (shown in header)
├─ app.py                    # Main app: indexing + query + Gradio UI
├─ requirements.txt          # Dependencies
└─ README.md                 # This file

Configuration (in app.py)
EMBED_MODEL = "text-embedding-3-small"   # 1536-dim

LLM_MODEL   = "gpt-4o-mini"
TOP_K       = 4                          # retrieval depth





System Prompt (keeps answers grounded + polite):



SYSTEM_PROMPT = """You are Aisha, a polite and professional Insurance assistant.
Answer ONLY using the information found in the indexed insurance document(s).
If the answer is not in the document(s), say: "I couldn’t find that in the document."
Keep responses concise, helpful, and courteous.
"""


FAQ List (editable):

FAQS = [
    "",

    "What benefits are covered under the policy?",

    "How do I file a claim and what documents are required?",

    "What are the exclusions and limitations?",

    "Is pre-authorization needed for hospitalization?",

    "What is the reimbursement timeline?",

    "How are outpatient vs inpatient services handled?",

    "How can I check my network hospitals/clinics?",

    "What is the co-pay or deductible policy?",

]


Deploy to Hugging Face Spaces (Beginner-Friendly)
1) Create a Space

Go to Hugging Face → Spaces → New Space

SDK: Gradio

Visibility/licensing: your choice

2) Add Project Files

Upload these into your Space:

app.py

requirements.txt

README.md

Create folder data/ and upload:

Your insurance documents (PDF/TXT/MD…)

dds_logo.png (mandatory; exact filename)



Tip: Your Space file tree should match the Repository Structure above.



3) Set Secrets (Environment Variables)



In Space → Settings → Variables and secrets, add:



OPENAI_API_KEY → your OpenAI key



PINECONE_API_KEY → your Pinecone key



No legacy Pinecone environment URL needed. This app uses pinecone-client ≥ 5 with serverless.



4) Build & Run



Spaces auto-install from requirements.txt.



Default CPU hardware is fine.



Entry point auto-detected from app.py.



On first start, the app will:



Ensure a Pinecone serverless index:

dds-insurance-index · cosine · 1536-dim · aws/us-east-1



Read and index documents from data/



Launch the Gradio UI



Your deployed link is simply the Space URL once its status is Running.



5) Updating Documents Later



Upload/change files in data/



Click Restart on the Space so it re-indexes your documents



Troubleshooting (Common Issues)



“Missing PINECONE_API_KEY or OPENAI_API_KEY”

Add both secrets in Space → Settings → Variables and secrets.



Pinecone 401 / “Malformed domain”



Ensure you’re on pinecone-client>=5.0.1 (already in requirements.txt).



Use a valid Pinecone API key; no environment URL needed for serverless.



“Logo not found: data/dds_logo.png”
Upload an image named exactly dds_logo.png into the data/ folder.



“No documents found in data/”

Upload at least one doc (PDF/TXT/MD) into data/, then Restart the Space.



OpenAI authorization/rate-limit errors

Confirm key validity and model access; reduce usage if rate-limited.



Slow first load

First run installs dependencies and builds the index; later runs are faster.



Manual Test Checklist



Ask a question clearly answered in your docs → response should quote that knowledge.



Ask something not in your docs → bot should say it can’t find it.



Adjust TOP_K in app.py to see how answer completeness changes.

Requirements (from requirements.txt)
gradio>=4.44.0
pinecone-client>=5.0.1
openai>=1.51.0
llama-index>=0.11.0
llama-index-vector-stores-pinecone>=0.3.0
llama-index-embeddings-openai>=0.3.0
llama-index-llms-openai>=0.2.0
tiktoken>=0.7.0

Customization Ideas

Swap LLMs by editing LLM_MODEL.



Add a file uploader to refresh docs from the UI.



Add metadata filters (e.g., policy type).



Log queries to refine the FAQ list.



License



Add your chosen license (e.g., MIT) as LICENSE.



Acknowledgments



Thanks to LlamaIndex, Pinecone, OpenAI, and Gradio for the tooling that makes this simple and reproducible.