---
title: Stecu RAG Chatbot
emoji: 🏃‍♂️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
---

# 🏃‍♂️ Stecu: Scrum Teaching Chatbot Unit

**Live Demo:** [**Try Stecu on Hugging Face Spaces**](https://huggingface.co/spaces/firman-ml/Stecu-RAG)

---

Welcome to Stecu, your personal AI Scrum coach!

Stecu is a specialized chatbot designed to answer your questions about the Scrum framework with high accuracy, drawing its knowledge exclusively from the official Scrum Guide. Whether you are a beginner learning the basics or an experienced practitioner needing a quick reference, Stecu is here to help you understand Scrum concepts, roles, events, and artifacts.

## ✨ How It Works: Retrieval-Augmented Generation (RAG)

Stecu is not a general-purpose chatbot. It is built on a **Retrieval-Augmented Generation (RAG)** architecture to keep its answers accurate and trustworthy, preventing the model from "hallucinating" or drawing on information outside its designated knowledge base.

The process is as follows:

1. **Load Knowledge:** The official "Scrum Guide.pdf" is loaded and split into small, manageable chunks of text.
2. **Create Embeddings:** Each chunk is converted into a numerical representation (a vector embedding) using a sentence-transformer model. These vectors are stored in a `Chroma` vector database.
3. **Retrieve Context:** When you ask a question, Stecu converts your query into a vector and searches the database for the chunks of the Scrum Guide most semantically relevant to it.
4. **Generate Answer:** The retrieved chunks are passed as context to a large language model (`mistralai/Mistral-7B-Instruct-v0.3`), which is explicitly instructed to formulate an answer using **only** the provided information.

This ensures that every answer is grounded in the official Scrum Guide.

## 🛠️ Tech Stack

This project was built with the following key technologies and libraries:

* **Python:** The core programming language.
* **Gradio:** To create the interactive web UI for the chatbot.
* **LangChain:** To orchestrate the RAG pipeline, including document loading and text splitting.
* **Hugging Face:** For the `InferenceClient` to access the Mistral model and `HuggingFaceEmbeddings` for creating text embeddings.
* **ChromaDB:** As the in-memory vector store for efficient similarity search.
* **PyPDFLoader:** To load and parse the content of the PDF file.

## 🚀 How to Run Locally

Want to run Stecu on your own machine? Follow these steps.

### Prerequisites

* Python 3.8 or higher
* The `Scrum Guide.pdf` file in the project directory

### 1. Clone the Repository

First, get the files from the Hugging Face Space repository:

```bash
git clone https://huggingface.co/spaces/firman-ml/Stecu-RAG
cd Stecu-RAG
```

### 2. Create a Virtual Environment

It is highly recommended to use a virtual environment to manage dependencies.

```bash
# Create the virtual environment
python -m venv .venv

# Activate it
# On Windows
.venv\Scripts\activate

# On macOS/Linux
source .venv/bin/activate
```

### 3. Install Dependencies

The required libraries are listed in `requirements.txt`:

```bash
pip install -r requirements.txt
```

### 4. Run the Application

Launch the Gradio app:

```bash
python app.py
```

A local URL (e.g., `http://127.0.0.1:7860`) will appear in your terminal. Open it in your browser to start chatting with Stecu!

## ⚠️ Disclaimer

Stecu's knowledge is strictly limited to the contents of the 2020 version of the Scrum Guide. It cannot answer questions outside that scope or offer opinions. Its purpose is to be a reliable, accurate guide to the Scrum framework as written.
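## 📎 Appendix: The RAG Steps in Miniature

For readers curious what the "load and split" step of the pipeline above boils down to, here is a toy, self-contained sketch of fixed-size chunking with overlap. The chunk size, overlap, and sample text are illustrative only; the real app uses LangChain's text splitters, and `app.py` may use different values.

```python
# Toy version of step 1 ("Load Knowledge"): cut a long text into
# fixed-size, overlapping chunks so neighbouring chunks share context.
# Chunk size and overlap here are illustrative, not taken from app.py.
def split_text(text, chunk_size=40, overlap=10):
    step = chunk_size - overlap  # how far the window slides each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

sample = "Scrum is a lightweight framework that helps people, teams and organizations generate value."
for chunk in split_text(sample):
    print(repr(chunk))
```

Overlap matters because a sentence cut in half at a chunk boundary would otherwise be unrecoverable from either chunk alone.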
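The "Retrieve Context" step is, at its core, a nearest-neighbour search over embedding vectors. The toy sketch below uses hand-made 3-dimensional vectors to show the idea; real sentence-transformer embeddings have hundreds of dimensions and are produced by a model, not written by hand.

```python
# Toy version of step 3 ("Retrieve Context"): rank stored chunks by
# cosine similarity to the query vector and pick the closest one.
# The 3-d "embeddings" below are made up purely for illustration.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

chunks = {
    "The Product Owner is accountable for the Product Backlog.": [0.9, 0.1, 0.2],
    "The Sprint is a fixed-length event of one month or less.": [0.1, 0.8, 0.3],
    "The Scrum Master serves the Scrum Team.": [0.4, 0.2, 0.7],
}
query = [0.85, 0.15, 0.25]  # pretend embedding of "Who owns the Product Backlog?"

best = max(chunks, key=lambda text: cosine(query, chunks[text]))
print(best)  # → the Product Backlog chunk is the closest match
```

In the real app, Chroma performs this search over the embedded Scrum Guide chunks, and the top matches become the context handed to the language model.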