Upload 8 files
- .Dockerignore +2 -0
- .opik.config +5 -0
- Dockerfile +30 -0
- README.md +127 -11
- app.py +240 -0
- requirements.txt +0 -0
- start.sh +39 -0
- streamlit_app.py +76 -0
.Dockerignore
ADDED
@@ -0,0 +1,2 @@
+.env
+__pycache__/
.opik.config
ADDED
@@ -0,0 +1,5 @@
+[opik]
+url_override = https://www.comet.com/opik/api/
+workspace = komalgupta991000-gmail-com
+api_key = <redacted>
+
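The `.opik.config` above keeps the API key in a tracked file. A minimal sketch of reading the same three settings from the environment instead. `load_opik_settings` is a hypothetical helper, and the variable names `OPIK_WORKSPACE` and `OPIK_URL_OVERRIDE` and the `"default"` workspace fallback are assumptions; `app.py` itself only reads `OPIK_API_KEY` and `workspace`.

```python
import os


def load_opik_settings(env=None):
    """Collect Opik settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("OPIK_API_KEY")
    if not api_key:
        # Fail early instead of falling back to a key stored in the repository.
        raise RuntimeError("OPIK_API_KEY is not set")
    return {
        "api_key": api_key,
        # Assumed variable names and fallbacks; adjust to the real deployment.
        "workspace": env.get("OPIK_WORKSPACE", "default"),
        "url_override": env.get(
            "OPIK_URL_OVERRIDE", "https://www.comet.com/opik/api/"
        ),
    }
```

The returned dictionary could then be passed on to whatever configuration call the application uses, so the key never has to be committed.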
Dockerfile
ADDED
@@ -0,0 +1,30 @@
+FROM python:3.11.4-slim-buster
+
+
+
+# Install curl and Ollama
+RUN apt-get update && apt-get install -y curl && \
+    curl -fsSL https://ollama.ai/install.sh | sh && \
+    apt-get clean && rm -rf /var/lib/apt/lists/*
+
+# Set up user and environment
+RUN useradd -m -u 1000 user
+USER user
+ENV HOME=/home/user \
+    PATH="/home/user/.local/bin:$PATH"
+
+WORKDIR $HOME/app
+
+COPY --chown=user requirements.txt .
+RUN pip install --no-cache-dir --upgrade -r requirements.txt
+COPY . .
+
+
+COPY --chown=user . .
+
+# Make the start script executable
+RUN chmod +x start.sh
+# Expose FastAPI & Streamlit ports
+EXPOSE 7860 8501
+
+CMD ["./start.sh"]
README.md
CHANGED
@@ -1,11 +1,127 @@
-
-
-
-
-
-
-
-
-
-
-
+# AI Assistant API
+
+## 🚀 Overview
+
+This project is an AI-powered assistant that uses FastAPI and FAISS for retrieval-augmented generation (RAG). It answers user queries from documents indexed in a FAISS vector store and evaluates the responses with Opik.
+
+## 🛠️ Features
+
+- Upload and manage datasets
+- Query the AI assistant with domain-specific constraints
+- Use FAISS for efficient document retrieval
+- Evaluate LLM responses using Opik
+
+## 📽️ Demo Video
+
+[🎥 Click here to watch the demo](https://drive.google.com/file/d/10h4VnTm_y5SBczI6NnoTuqRxyq55HAn5/view?usp=sharing)
+
+
+## 📦 Installation
+
+### Install Ollama
+
+Ollama is required for this project. Follow these steps to install it:
+
+```bash
+# For macOS
+brew install ollama
+
+# For Linux
+curl -fsSL https://ollama.ai/install.sh | sh
+
+# Verify installation
+ollama --version
+
+# For Windows
+# Download the installer from https://ollama.com/
+```
+### Clone and Set Up the Project
+#### Clone the repository
+```
+git clone https://github.com/Komal-99/cyfuture_bot.git
+```
+
+#### Navigate to the project directory
+```
+cd cyfuture_bot
+```
+
+#### Install dependencies
+```
+pip install -r requirements.txt  # For Python projects
+yarn install  # For JavaScript projects
+```
+
+## 🚀 Usage
+### Start the Project
+Run the `start.sh` script to set up and launch the application:
+```
+chmod +x start.sh
+./start.sh
+
+```
+This script:
+
+- Sets environment variables for optimization
+
+- Starts Ollama in the background
+
+- Pulls the required models (deepseek-r1:7b, nomic-embed-text)
+
+- Waits for Ollama to initialize
+
+- Launches the FastAPI server on http://127.0.0.1:7860
+- Launches the Streamlit application on http://127.0.0.1:8501
+
+## API Endpoints
+### Upload Dataset
+```
+POST /upload_dataset/  # Upload an Excel dataset to be used for evaluation.
+```
+### Run Evaluation
+```
+POST /run_evaluation/  # Evaluate the model's performance using Opik.
+
+```
+### Query AI Assistant
+```
+GET /query/?input_text=your_question  # Ask the assistant a question. The model retrieves relevant information and generates an answer based on the indexed documents.
+
+```
+## 📂 Folder Structure
+
+```
+.
+├── AI_Agent/            # Datasource
+├── deepseek_cyfuture/   # DeepSeek vector DB
+├── .env                 # Environment variables
+├── .gitignore           # Files to ignore in Git
+├── dataset.xlsx         # Sample dataset file
+├── Dockerfile           # Docker configuration
+├── requirements.txt     # Dependencies (Python projects)
+├── start.sh             # Startup script
+├── app.py               # Main application file
+├── README.md            # Project documentation
+
+```
+## 🤝 Contributing
+Contributions are welcome! Please follow these steps:
+
+1. Fork the repository
+
+2. Create a new branch (`git checkout -b feature-branch`)
+
+3. Commit your changes (`git commit -m 'Add new feature'`)
+
+4. Push to the branch (`git push origin feature-branch`)
+
+5. Create a pull request
+
+## 📜 License
+This project is licensed under the MIT License. See the LICENSE file for details.
+
+## 📬 Contact
+For questions or issues, reach out:
+
+GitHub: https://github.com/Komal-99
+
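The README documents `GET /query/?input_text=your_question`, so a client has to URL-encode the question before sending it. A minimal sketch using only the standard library. `build_query_url` is a hypothetical helper, not part of the repository, and the base URL is the FastAPI address given in the README.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:7860"  # FastAPI address from the README


def build_query_url(question, base_url=BASE_URL):
    """Return the GET URL for /query/ with the question URL-encoded."""
    return f"{base_url}/query/?{urlencode({'input_text': question})}"


# Spaces and reserved characters in the question are escaped by urlencode.
print(build_query_url("How do I make payments?"))
```

The resulting URL can be passed to any HTTP client; the endpoint streams plain text, so the client should read the body incrementally.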
app.py
ADDED
@@ -0,0 +1,240 @@
+import os
+import re
+import pandas as pd
+import backoff
+import asyncio
+from datetime import datetime
+from dotenv import load_dotenv
+from langchain_ollama import OllamaEmbeddings, ChatOllama
+from langchain_community.vectorstores import FAISS
+
+from langchain_core.prompts import ChatPromptTemplate
+from langchain_core.output_parsers import StrOutputParser
+from langchain_core.runnables import RunnablePassthrough
+from opik import Opik, track, evaluate
+from opik.evaluation.metrics import Hallucination, AnswerRelevance
+import litellm
+import opik
+from fastapi.responses import StreamingResponse
+from litellm.integrations.opik.opik import OpikLogger
+from litellm import completion, APIConnectionError
+from fastapi import FastAPI, UploadFile, File, HTTPException, Query, Response
+
+from langchain.document_loaders import PyMuPDFLoader, UnstructuredWordDocumentLoader
+from langchain.text_splitter import RecursiveCharacterTextSplitter
+
+app = FastAPI()
+
+def initialize_opik():
+    opik_logger = OpikLogger()
+    litellm.callbacks = [opik_logger]
+    opik.configure(api_key=os.getenv("OPIK_API_KEY"), workspace=os.getenv("workspace"), force=True)
+
+
+# Load environment variables, then initialize Opik
+load_dotenv()
+initialize_opik()
+
+# Get or create the Opik dataset used for evaluation
+dataset = Opik().get_or_create_dataset(
+    name="Cyfuture_faq",
+    description="Dataset on IGL FAQ",
+)
+
+@app.post("/upload_dataset/")
+def upload_dataset(file: UploadFile = File(...)):
+    try:
+        df = pd.read_excel(file.file)
+        dataset.insert(df.to_dict(orient='records'))
+        return {"message": "Dataset uploaded successfully"}
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+# Load the local dataset.xlsx manually (renamed so it no longer shadows the route handler)
+def upload_local_dataset():
+    df = pd.read_excel("dataset.xlsx")
+    dataset.insert(df.to_dict(orient='records'))
+    return "Dataset uploaded successfully"
+
+# Initialize the LLM
+model = ChatOllama(model="deepseek-r1:7b", base_url="http://localhost:11434", temperature=0.2, max_tokens=200)
+
+def load_documents(folder_path):
+    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
+    all_documents = []
+    os.makedirs('data', exist_ok=True)
+
+    for filename in os.listdir(folder_path):
+        file_path = os.path.join(folder_path, filename)
+
+        if filename.endswith('.pdf'):
+            loader = PyMuPDFLoader(file_path)
+        elif filename.endswith('.docx'):
+            loader = UnstructuredWordDocumentLoader(file_path)
+        else:
+            continue  # Skip unsupported files
+
+        documents = loader.load()
+        all_documents.extend(text_splitter.split_documents(documents))
+        print(f"Processed and indexed {filename}")
+
+    return all_documents
+# Vector Store Setup
+def setup_vector_store(documents):
+    embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+    vectorstore = FAISS.from_documents(documents, embeddings)
+    vectorstore.save_local("deepseek_cyfuture")
+    return vectorstore
+
+
+# Create RAG Chain
+def create_rag_chain(retriever):
+    prompt_template = ChatPromptTemplate.from_template(
+        """
+        You are an AI question-answering assistant specialized in answering user queries strictly from the provided context. Give a detailed answer to the user's question based on the context.
+
+        STRICT RULES:
+        - You *must not* answer any questions outside the provided context.
+        - If the question is unrelated to billing, payments, customer, or meter reading, respond with exactly:
+          **"This question is outside my specialized domain."**
+        - Do NOT attempt to generate an answer from loosely related context.
+        - If the context does not contain a valid answer, simply state: **"I don't know the answer."**
+
+        VALIDATION STEP:
+        1. Check if the query is related to **billing, payments, customer, or meter reading**.
+        2. If NOT, respond with: `"This question is outside my specialized domain."` and nothing else.
+        3. If the context does not contain relevant data, try to find the best possible answer from the context.
+        4. Do NOT generate speculative answers.
+        5. If the generated answer doesn't address the question, try to find the best possible answer from the context; you can add more relevant context to the answer.
+
+        Question: {question}
+        Context: {context}
+        Answer:
+        """
+
+    )
+    return (
+        {"context": retriever | format_docs, "question": RunnablePassthrough()}
+        | prompt_template
+        | model
+        | StrOutputParser()
+    )
+
+def format_docs(docs):
+    return "\n\n".join(doc.page_content for doc in docs)
+
+def clean_response(response):
+    return re.sub(r'<think>.*?</think>', '', response, flags=re.DOTALL).strip()
+
+
+
+@track()
+def llm_chain(input_text):
+    try:
+        context = "\n".join(doc.page_content for doc in retriever.invoke(input_text))
+        response = "".join(chunk for chunk in rag_chain.stream(input_text) if isinstance(chunk, str))
+        return {"response": clean_response(response), "context_used": context}
+    except Exception as e:
+        return {"error": str(e)}
+
+def evaluation_task(x):
+    try:
+        result = llm_chain(x['user_question'])
+        return {"input": x['user_question'], "output": result["response"], "context": result["context_used"], "expected": x['expected_output']}
+    except Exception as e:
+        return {"input": x['user_question'], "output": "", "context": x['expected_output']}
+
+# experiment_name = f"Deepseek_{dataset.name}_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}"
+# metrics = [Hallucination(model=model1), AnswerRelevance(model=model1)]
+
+
+@app.post("/run_evaluation/")
+@backoff.on_exception(backoff.expo, (APIConnectionError, Exception), max_tries=3, max_time=300)
+def run_evaluation():
+    experiment_name = f"Deepseek_{dataset.name}_{datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}"
+    metrics = [Hallucination(), AnswerRelevance()]
+    try:
+        evaluate(
+            experiment_name=experiment_name,
+            dataset=dataset,
+            task=evaluation_task,
+            scoring_metrics=metrics,
+            experiment_config={"model": model},
+            task_threads=2
+        )
+        return {"message": "Evaluation completed successfully"}
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+# @backoff.on_exception(backoff.expo, (APIConnectionError, Exception), max_tries=3, max_time=300)
+# def run_evaluation():
+#     return evaluate(experiment_name=experiment_name, dataset=dataset, task=evaluation_task, scoring_metrics=metrics, experiment_config={"model": model}, task_threads=2)
+
+# run_evaluation()
+
+# Create Vector Database
+def create_db():
+    source = r'AI Agent'
+    markdown_content = load_documents(source)
+    setup_vector_store(markdown_content)
+    return "Database created successfully"
+
+embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+vectorstore = FAISS.load_local("deepseek_cyfuture", embeddings, allow_dangerous_deserialization=True)
+retriever = vectorstore.as_retriever(search_kwargs={'k': 2})
+rag_chain = create_rag_chain(retriever)
+
+@track()
+@app.get("/query/")
+def chain(input_text: str = Query(..., description="Enter your question")):
+    try:
+        # def generate():
+        #     for chunk in rag_chain.stream(input_text):
+        #         if isinstance(chunk, str):
+        #             yield chunk
+        def generate():
+            buffer = ""  # Temporary buffer to hold chunks until `</think>` is found
+            start_sending = False
+
+            for chunk in rag_chain.stream(input_text):
+                # if isinstance(chunk, str):
+                #     buffer += chunk  # Append chunk to buffer
+
+                #     # Check if `</think>` is found
+                #     if "</think>" in buffer:
+                #         start_sending = True
+                #         # Yield everything after `</think>` (the tag itself is dropped)
+                #         yield buffer.split("</think>", 1)[1]
+                #         buffer = ""  # Clear the buffer after sending the first response
+                #     elif start_sending:
+                yield chunk  # Filtering is disabled: every chunk is streamed, including the <think> section
+        return StreamingResponse(generate(), media_type="text/plain")
+
+    except Exception as e:
+        raise HTTPException(status_code=500, detail=str(e))
+@app.get("/")
+def read_root():
+    return {"message": "Welcome to the AI Assistant API!"}
+
+if __name__ == "__main__":
+    # Start the FastAPI app
+    import uvicorn
+    uvicorn.run(app, host="127.0.0.1", port=7860)
+
+
+# questions=[ "Is the website accessible through mobile also? please tell the benefits of it","How do I register for a new connection?","how to make payments?",]
+# # Questions for retrieval
+# # Answer questions
+# create_db()
+# # Load Vector Store
+# embeddings = OllamaEmbeddings(model='nomic-embed-text', base_url="http://localhost:11434")
+# vectorstore = FAISS.load_local("deepseek_cyfuture", embeddings, allow_dangerous_deserialization=True)
+# retriever = vectorstore.as_retriever( search_kwargs={'k': 3})
+# rag_chain = create_rag_chain(retriever)
+
+# for question in questions:
+#     print(f"Question: {question}")
+#     for chunk in rag_chain.stream(question):
+#         print(chunk, end="", flush=True)
+#     print("\n" + "-" * 50 + "\n")
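In `app.py`, `clean_response` strips the `<think>…</think>` section from a complete string, while the `/query/` handler streams chunks and leaves its tag filtering commented out. A minimal sketch of how the same filtering could be done chunk by chunk, under the assumption that the model emits a single reasoning block ending in `</think>` before the answer. `strip_think_stream` is a hypothetical helper, not the author's implementation.

```python
def strip_think_stream(chunks, end_tag="</think>"):
    """Yield only the text that follows the first `end_tag` in a chunk stream.

    Hypothetical helper. Chunks are buffered until the closing tag has been
    seen, so a tag split across two chunks is still detected. If the tag
    never appears, the buffered text is flushed at the end so the answer is
    not lost.
    """
    buffer = ""
    tag_seen = False
    for chunk in chunks:
        if tag_seen:
            # Past the reasoning block: pass chunks straight through.
            yield chunk
            continue
        buffer += chunk
        if end_tag in buffer:
            tag_seen = True
            tail = buffer.split(end_tag, 1)[1]
            buffer = ""
            if tail:
                yield tail
    if not tag_seen and buffer:
        yield buffer
```

A generator like this could wrap `rag_chain.stream(input_text)` inside `generate()`. The trade-off is that nothing reaches the client until the reasoning block has finished.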
requirements.txt
ADDED
Binary file (8.09 kB)
start.sh
ADDED
@@ -0,0 +1,39 @@
+#!/bin/bash
+
+# Set environment variables for optimization
+export OMP_NUM_THREADS=4
+export MKL_NUM_THREADS=4
+export CUDA_VISIBLE_DEVICES=0,1
+
+# Start Ollama in the background
+ollama serve &
+
+# Wait for Ollama to start up before talking to it
+max_attempts=30
+attempt=0
+while ! curl -s http://localhost:11434/api/tags >/dev/null; do
+    sleep 1
+    attempt=$((attempt + 1))
+    if [ "$attempt" -eq "$max_attempts" ]; then
+        echo "Ollama failed to start within 30 seconds. Exiting."
+        exit 1
+    fi
+done
+# Pull the models if not already present (requires the server to be up)
+if ! ollama list | grep -q "deepseek-r1:7b"; then
+    ollama pull deepseek-r1:7b
+fi
+if ! ollama list | grep -q "nomic-embed-text"; then
+    ollama pull nomic-embed-text
+fi
+
+echo "Ollama is ready."
+
+# Print the API URL
+echo "API is running on: http://0.0.0.0:7860"
+
+# Start FastAPI in the background
+uvicorn app:app --host 0.0.0.0 --port 7860 --workers 4 --limit-concurrency 20 &
+
+# Start Streamlit for UI
+streamlit run streamlit_app.py --server.port 8501 --server.address 0.0.0.0
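The readiness loop in `start.sh` polls an endpoint once per second and gives up after a fixed number of attempts. The same pattern, sketched in Python for cases where the check has to live inside the application. `wait_until_ready` and its `check` callback are hypothetical names, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import time


def wait_until_ready(check, max_attempts=30, delay=1.0, sleep=time.sleep):
    """Call `check()` until it returns True; raise after `max_attempts` failures.

    Returns the number of the attempt that succeeded (1-based).
    """
    for attempt in range(1, max_attempts + 1):
        if check():
            return attempt
        sleep(delay)
    raise TimeoutError(f"service not ready after {max_attempts} attempts")
```

Here `check` would be any callable that returns `True` once the server answers, for example a small function that requests `http://localhost:11434/api/tags` and catches connection errors.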
streamlit_app.py
ADDED
@@ -0,0 +1,76 @@
+
+import streamlit as st
+import requests
+import re  # For space cleanup
+
+st.set_page_config(page_title="AI Chatbot", layout="centered")
+st.title("🤖 AI Chatbot")
+
+if "messages" not in st.session_state:
+    st.session_state.messages = []
+
+# Function to query AI API and stream response
+def query_ai(question):
+    url = "http://127.0.0.1:7860/query/"
+    params = {"input_text": question}
+
+    with requests.get(url, params=params, stream=True) as response:
+        if response.status_code == 200:
+            full_response = ""
+            for chunk in response.iter_content(chunk_size=1024):
+                if chunk:
+                    text_chunk = chunk.decode("utf-8")
+                    full_response += text_chunk
+                    yield full_response  # Streamed response
+
+# Custom CSS for spacing fix
+st.markdown("""
+    <style>
+    .chat-box {
+        background-color: #1e1e1e;
+        padding: 12px;
+        border-radius: 10px;
+        margin-top: 5px;
+        font-size: 154x;
+        font-family: monospace;
+        white-space: pre-wrap;
+        word-wrap: break-word;
+        line-height: 1.2;
+        color: #ffffff;
+    }
+    </style>
+""", unsafe_allow_html=True)
+
+user_input = st.text_input("Ask a question:", "", key="user_input")
+submit_button = st.button("Submit")
+
+if submit_button and user_input:
+    st.session_state.messages.append({"role": "user", "content": user_input})
+
+    # Placeholder for streaming
+    response_container = st.empty()
+    full_response = ""
+
+    with st.spinner("🤖 AI is thinking..."):
+        for chunk in query_ai(user_input):
+            full_response = chunk
+            response_container.markdown(f'<div class="chat-box">{full_response}</div>', unsafe_allow_html=True)
+
+    response_container.empty()  # Hides the streamed "Thinking" response after completion
+
+    # Extract refined answer after "</think>"
+    if "</think>" in full_response:
+        refined_response = full_response.split("</think>", 1)[-1].strip()
+    else:
+        refined_response = full_response  # Fallback if </think> is missing
+
+    # Remove extra newlines and excessive spaces
+    refined_response = re.sub(r'\n\s*\n', '\n', refined_response.strip())
+
+    # Expandable AI Thought Process Box
+    with st.expander("🤖 AI's Thought Process (Click to Expand)"):
+        st.markdown(f'<div class="chat-box">{full_response}</div>', unsafe_allow_html=True)
+
+    # Display refined answer with clean formatting
+    st.write("Answer:")
+    st.markdown(refined_response, unsafe_allow_html=True)
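`streamlit_app.py` separates the model's reasoning from its answer by splitting the full response on `</think>` and collapsing blank lines. That post-processing can be isolated into a pure function, which makes it easy to check without running Streamlit. A minimal sketch; `split_thought_and_answer` is a hypothetical helper, and stripping the opening `<think>` tag from the thought is an added assumption.

```python
import re


def split_thought_and_answer(full_response, end_tag="</think>"):
    """Split a model reply into (thought, answer) at the first `end_tag`."""
    if end_tag in full_response:
        thought, answer = full_response.split(end_tag, 1)
        # Assumption: the reasoning starts with an opening <think> tag.
        thought = thought.replace("<think>", "", 1).strip()
    else:
        # Fallback when the tag is missing: treat everything as the answer.
        thought, answer = "", full_response
    # Collapse runs of blank lines into a single newline, as the app does.
    answer = re.sub(r"\n\s*\n", "\n", answer.strip())
    return thought, answer
```

With this split, the expander could show only the thought instead of the whole response, and the answer could be rendered separately.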