| --- |
| title: SolarAI |
| emoji: 🌞 |
| colorFrom: yellow |
| colorTo: green |
| sdk: gradio |
| sdk_version: 5.16.0 |
| app_file: app.py |
| pinned: false |
| short_description: AI-powered chatbot for solar industry questions |
| --- |
| # SolarAI Chatbot |
|
|
| This project is an **AI-powered chatbot** that answers questions about the **solar industry**, including **solar panel technology, installation processes, maintenance, costs, ROI analysis, and market trends**. It combines an **LLM (Mixtral-8x7B served through ChatGroq)** with **FAISS vector search**, so answers are grounded in passages retrieved from a curated knowledge document. |
|
|
| ## Features |
|
|
| - Extracts solar panel knowledge from a **DOCX file** (compiled manually from publicly available online sources) |
| - Converts the text into **embeddings** using `SentenceTransformer` |
| - Stores the embeddings in a **FAISS vector index** for efficient retrieval |
| - Retrieves the most relevant chunks for each question and passes them as context to **ChatGroq (Mixtral-8x7B)** |
| - Provides an **interactive chatbot UI using Gradio** |
|
|
| --- |
|
|
| ## Installation & Setup |
|
|
| ### **Step 1: Install Dependencies** |
|
|
| ```bash |
| pip install -r requirements.txt |
| ``` |
|
|
| ### **Step 2: Run the Chatbot** |
|
|
| ```bash |
| python app.py |
| ``` |
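The app calls Groq's hosted API, so a key has to be available before `python app.py` can answer anything. `ChatGroq` reads it from the `GROQ_API_KEY` environment variable by default. A minimal sketch of a startup check follows; `require_env` is a hypothetical helper and is not part of the original `app.py`.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or fail with a clear message."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it before starting the app, "
            f"for example: export {name}=<your key>"
        )
    return value


# Hypothetical usage near the top of app.py:
# require_env("GROQ_API_KEY")
```

Failing early with a readable message is usually easier to debug than an authentication error raised on the first question.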
|
|
| --- |
|
|
| ## Code Breakdown (Function-by-Function) |
|
|
| ### **1️⃣ Extracting Text from DOCX** |
|
|
| ```python |
| from docx import Document  # provided by the python-docx package |
| |
| def extract_text_from_docx(file_path): |
|     # Join all non-empty paragraphs, one per line |
|     doc = Document(file_path) |
|     text = "\n".join([para.text for para in doc.paragraphs if para.text.strip()]) |
|     return text |
| ``` |
|
|
| 🔹 **Purpose:** Reads a `.docx` file and returns the text of its non-empty paragraphs, which serves as the solar knowledge base. |
|
|
| --- |
|
|
| ### **2️⃣ Splitting Text into Chunks** |
|
|
| ```python |
| def split_text(text, chunk_size=300): |
|     # Naive sentence split; drop empty pieces and trailing periods |
|     sentences = [s.strip().rstrip(".") for s in text.split(". ") if s.strip()] |
|     chunks, current_chunk = [], "" |
|     for sentence in sentences: |
|         if len(current_chunk) + len(sentence) < chunk_size: |
|             current_chunk += sentence + ". " |
|         else: |
|             # Avoid an empty chunk when the first sentence is already too long |
|             if current_chunk: |
|                 chunks.append(current_chunk.strip()) |
|             current_chunk = sentence + ". " |
|     if current_chunk: |
|         chunks.append(current_chunk.strip()) |
|     return chunks |
| ``` |
|
|
| 🔹 **Purpose:** Groups sentences into **chunks** of up to roughly `chunk_size` characters, so each embedding covers one focused passage and retrieval stays precise. |
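Because each chunk is embedded on its own, a fact that straddles a chunk boundary can be missed by retrieval. One common mitigation is to repeat the last sentence of a chunk at the start of the next one. The sketch below is a hypothetical variant: `split_text_with_overlap` and its `overlap` parameter are not in the original code, and it keeps the same naive `". "` sentence split.

```python
def split_text_with_overlap(text, chunk_size=300, overlap=1):
    """Group sentences into chunks of roughly chunk_size characters.

    The last `overlap` sentences of each chunk are repeated at the start
    of the next one so that context spanning a boundary is not lost.
    """
    sentences = [s.strip().rstrip(".") for s in text.split(". ") if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        candidate = ". ".join(current + [sentence]) + "."
        if current and len(candidate) > chunk_size:
            chunks.append(". ".join(current) + ".")
            # Carry the tail of the finished chunk into the next one.
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(". ".join(current) + ".")
    return chunks
```

Overlap makes the index slightly larger, since some sentences are embedded twice. In exchange, a question about a boundary sentence can match either neighbouring chunk.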
|
|
| --- |
|
|
| ### **3️⃣ Generating Embeddings** |
|
|
| ```python |
| from sentence_transformers import SentenceTransformer |
| |
| model = SentenceTransformer("all-MiniLM-L6-v2")  # Embedding model (384-dimensional vectors) |
| # `chunks` is the list of strings returned by split_text() |
| embeddings = model.encode(chunks)  # NumPy array of shape (len(chunks), 384) |
| ``` |
|
|
| 🔹 **Purpose:** Converts each text **chunk** into a numerical vector so that chunks can be compared with a query by similarity. |
|
|
| --- |
|
|
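| ### **4️⃣ Storing Embeddings in FAISS Vector Database** |
|
|
| ```python |
| import faiss |
| import numpy as np |
| |
| vector_dim = embeddings.shape[1] |
| index = faiss.IndexFlatL2(vector_dim)  # exact (brute-force) L2 search |
| index.add(np.asarray(embeddings, dtype="float32"))  # FAISS expects float32 |
| faiss.write_index(index, "solar_vectors.index")  # persist the index to disk |
| ``` |
|
|
| 🔹 **Purpose:** Builds a **FAISS** index over the chunk embeddings and saves it to `solar_vectors.index`, so the nearest chunks can be looked up when a user asks a question. The index holds only vectors; the text itself stays in `chunks`. |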
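`IndexFlatL2` does no clustering or compression: it compares the query against every stored vector and ranks the results by squared Euclidean distance. The pure-Python sketch below mirrors that behaviour on toy 2-D vectors to make the ranking step concrete. `flat_l2_search` is an illustrative stand-in, not a FAISS function.

```python
def flat_l2_search(vectors, query, top_k=2):
    """Exhaustive nearest-neighbour search by squared L2 distance.

    Returns (distances, indices) sorted from closest to farthest.
    """
    scored = []
    for i, vec in enumerate(vectors):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, i))
    scored.sort()
    best = scored[:top_k]
    return [d for d, _ in best], [i for _, i in best]


# Toy "embeddings" standing in for three chunks.
stored = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
distances, indices = flat_l2_search(stored, [0.9, 1.2], top_k=2)
print(indices)  # [1, 0]
```

An exhaustive scan like this is comfortably fast for a knowledge base of a few hundred chunks. Approximate index types mainly pay off at much larger scales.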
|
|
| --- |
|
|
| ### **5️⃣ Retrieving Relevant Information** |
|
|
| ```python |
| def retrieve_relevant_text(query, top_k=2): |
|     query_embedding = model.encode([query]) |
|     distances, indices = index.search(np.asarray(query_embedding, dtype="float32"), top_k) |
|     # FAISS returns -1 when fewer than top_k vectors are indexed |
|     return " ".join([chunks[i] for i in indices[0] if i != -1]) |
| ``` |
|
|
| 🔹 **Purpose:** Embeds the question, finds the `top_k` **closest chunks** in the index, and joins them into one context string to **pass to the chatbot** before it generates a response. |
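Taking the top `top_k` hits unconditionally means an off-topic question still receives two chunks of context, however distant they are. A hedged sketch of a post-filter follows: it skips the `-1` placeholders FAISS returns when fewer than `top_k` vectors exist and discards hits beyond a distance cutoff. `select_chunks` and the default `max_distance` are assumptions for illustration; a sensible cutoff depends on the embedding model and the data.

```python
def select_chunks(distances, indices, chunks, max_distance=1.0):
    """Keep only valid hits whose distance is within max_distance."""
    selected = []
    for dist, idx in zip(distances, indices):
        if idx < 0:  # FAISS pads missing results with -1
            continue
        if dist > max_distance:  # too far from the query to be useful context
            continue
        selected.append(chunks[idx])
    return selected
```

Inside a retrieval function this would be called on the first row of the search output, for example `select_chunks(distances[0], indices[0], chunks)`.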
|
|
| --- |
|
|
| ### **6️⃣ Chatbot Integration with ChatGroq (Mixtral-8x7B)** |
|
|
| ```python |
| from langchain_core.prompts import ChatPromptTemplate |
| from langchain_groq import ChatGroq |
| |
| # ChatGroq reads the API key from the GROQ_API_KEY environment variable |
| llm = ChatGroq(model="mixtral-8x7b-32768", temperature=0.2) |
| |
| def chat_with_groq(user_query): |
|     retrieved_text = retrieve_relevant_text(user_query) |
|     system_message = "You are an AI assistant that provides accurate solar energy information." |
|     # Use template variables rather than an f-string, so braces in the |
|     # retrieved text are not mistaken for placeholders |
|     prompt_template = ChatPromptTemplate.from_messages([ |
|         ("system", system_message), |
|         ("human", "Use the following information to answer: {context}\n\nUser Query: {question}") |
|     ]) |
|     chain = prompt_template | llm |
|     response = chain.invoke({"context": retrieved_text, "question": user_query}) |
|     return response.content |
| ``` |
|
|
| 🔹 **Purpose:** Combines the **retrieved context** with the **user query** in a single prompt and returns the **LLM's answer** as text. |
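The prompt itself can also be assembled as plain data, which makes it easy to inspect and to handle the case where retrieval returns nothing. The following is a sketch under assumptions: `build_messages` and the fallback wording are hypothetical, and the `(role, content)` pairs follow the tuple format that LangChain chat models generally accept in `invoke`.

```python
SYSTEM_MESSAGE = "You are an AI assistant that provides accurate solar energy information."


def build_messages(context, question):
    """Assemble (role, content) pairs for a chat model from retrieved context."""
    context = context.strip()
    if context:
        human = (
            f"Use the following information to answer: {context}\n\n"
            f"User Query: {question}"
        )
    else:
        # Nothing relevant was retrieved, so say so instead of sending an empty context.
        human = (
            "No reference information was found for this question. "
            "Say so if you cannot answer reliably.\n\n"
            f"User Query: {question}"
        )
    return [("system", SYSTEM_MESSAGE), ("human", human)]
```

Because the context is inserted as a finished string rather than through a template, curly braces in the source document need no escaping.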
|
|
| --- |
|
|
| ### **7️⃣ Gradio Chatbot UI** |
|
|
| ```python |
| import gradio as gr |
| |
| def gradio_chatbot(user_input): |
|     response = chat_with_groq(user_input) |
|     return response |
| |
| with gr.Blocks() as demo: |
|     gr.Markdown("# SolarAI") |
|     with gr.Row(): |
|         user_input = gr.Textbox(placeholder="Ask me anything about solar energy...", lines=2, interactive=True) |
|     with gr.Row(): |
|         # Read-only: this box only displays the model's answer |
|         output_box = gr.Textbox(lines=6, interactive=False, label="Chatbot Response") |
|     submit_btn = gr.Button("Ask") |
|     submit_btn.click(fn=gradio_chatbot, inputs=user_input, outputs=output_box) |
| |
| demo.launch() |
| ``` |
|
|
| 🔹 **Purpose:** Creates a **Gradio UI** with a question box, an "Ask" button, and a response box wired to `chat_with_groq`. |
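The click handler forwards whatever is in the textbox straight to the model, so an empty submission or a network failure surfaces as a raw error in the UI. Below is a minimal sketch of a defensive wrapper; `safe_chat` is a hypothetical helper, and `chat_fn` stands in for `chat_with_groq`.

```python
def safe_chat(user_input, chat_fn):
    """Call chat_fn with cleaned input and turn failures into readable messages."""
    question = (user_input or "").strip()
    if not question:
        return "Please enter a question about solar energy."
    try:
        return chat_fn(question)
    except Exception as exc:  # broad on purpose: this is the UI boundary
        return f"Sorry, something went wrong: {exc}"
```

In the UI, the button handler could then be something like `lambda text: safe_chat(text, chat_with_groq)`.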
|
|
| --- |
|
|
| ## Deployment Guide |
|
|
| ### **Option 1: Run Locally** |
|
|
| ```bash |
| python app.py |
| ``` |
|
|
| ### **Option 2: Deploy on Hugging Face Spaces** |
|
|
| 1. Create `requirements.txt`. |
|
|
| 2. Push to Hugging Face: |
|
|
| ```bash |
| git init |
| git add . |
| git commit -m "Deploy Solar Chatbot" |
| git remote add origin https://huggingface.co/spaces/YOUR_USERNAME/solar-chatbot |
| git push origin main |
| ``` |
|
|
| ✅ Your chatbot is now **LIVE** on Hugging Face Spaces! |
|
|
| --- |
|
|
| ## Future Improvements |
|
|
| - ✅ Add **voice-based interaction** |
| - ✅ Improve **multi-turn conversation memory** |
| - ✅ Enable **real-time solar industry data fetching** |
| - ✅ Integrate **WhatsApp/Telegram bot support** |
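Of these, multi-turn memory is the most self-contained to prototype. One simple approach is a bounded buffer, sketched here on the assumption that the last few exchanges are replayed as text in the next prompt. `ConversationMemory` is hypothetical and not part of the current app.

```python
from collections import deque


class ConversationMemory:
    """Keep the most recent question/answer pairs for reuse in later prompts."""

    def __init__(self, max_turns=3):
        self.turns = deque(maxlen=max_turns)  # oldest turns fall off automatically

    def add(self, question, answer):
        self.turns.append((question, answer))

    def as_text(self):
        """Render the stored turns as a plain-text transcript."""
        return "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.turns)
```

Capping the number of turns keeps the prompt size predictable, which matters because the retrieved context already takes up part of the model's context window.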
|
|
|
|
|
|
|
|
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |
|
|