---
title: Stecu RAG Chatbot
emoji: πŸƒβ€β™‚οΈ
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
---
# πŸƒβ€β™‚οΈ Stecu: Scrum Teaching Chatbot Unit
**Live Demo:** [**Try Stecu on Hugging Face Spaces**](https://huggingface.co/spaces/firman-ml/Stecu-RAG)
---
Welcome to Stecu, your personal AI Scrum coach! Stecu is a specialized chatbot designed to answer your questions about the Scrum framework with high accuracy, drawing its knowledge exclusively from the official Scrum Guide.
Whether you are a beginner learning the basics or an experienced practitioner needing a quick reference, Stecu is here to help you understand Scrum concepts, roles, events, and artifacts.
## ✨ How It Works: Retrieval-Augmented Generation (RAG)
Stecu is not a general-purpose chatbot. It's built using a **Retrieval-Augmented Generation (RAG)** architecture to ensure its answers are accurate and trustworthy. This prevents the model from "hallucinating" or providing information from outside its designated knowledge base.
The process is as follows:
1. **Load Knowledge:** The official "Scrum Guide.pdf" is loaded and split into small, manageable chunks of text.
2. **Create Embeddings:** Each chunk is converted into a numerical representation (a vector embedding) using a sentence-transformer model. These vectors are stored in a `Chroma` vector database.
3. **Retrieve Context:** When you ask a question, Stecu converts your query into a vector and searches the database to find the most semantically relevant chunks from the Scrum Guide.
4. **Generate Answer:** The retrieved chunks are passed as context to a large language model (`mistralai/Mistral-7B-Instruct-v0.3`). The model is explicitly instructed to formulate an answer **only** using the provided information.
This ensures that every answer is grounded in the official Scrum Guide.
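The retrieval step above can be sketched in plain Python. This is an illustrative toy, not the app's actual code: a bag-of-words counter stands in for the sentence-transformer embeddings, and a plain list stands in for the Chroma database, but the core idea of ranking chunks by cosine similarity to the query is the same.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; the real app uses a
    # sentence-transformer model to produce dense vectors instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=1):
    # Rank all chunks by similarity to the query and keep the top k.
    # In the real app, the winning chunks are then pasted into the
    # prompt sent to the language model.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Two example "chunks" paraphrasing the Scrum Guide.
chunks = [
    "The Scrum Team consists of one Scrum Master, one Product Owner, and Developers.",
    "The Sprint is a container for all other Scrum events.",
]
print(retrieve("scrum master product owner developers", chunks))
```

Swapping `embed` for a real embedding model and the list for a vector database is essentially what the LangChain pipeline does at scale.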
## πŸ› οΈ Tech Stack
This project was built using the following key technologies and libraries:
* **Python:** The core programming language.
* **Gradio:** To create the interactive web UI for the chatbot.
* **LangChain:** To orchestrate the RAG pipeline, including document loading and text splitting.
* **Hugging Face:** For the `InferenceClient` to access the Mistral model and `HuggingFaceEmbeddings` for creating text embeddings.
* **ChromaDB:** As the in-memory vector store for efficient similarity search.
* **PyPDFLoader:** To load and parse the content from the PDF file.
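Based on the stack above, a `requirements.txt` for a project like this would typically look something like the following. This is an illustrative sketch only; the actual file shipped in the repository is authoritative and may pin versions or use different package names.

```text
gradio
langchain
langchain-community
chromadb
sentence-transformers
huggingface_hub
pypdf
```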
## πŸš€ How to Run Locally
Want to run Stecu on your own machine? Follow these steps.
### Prerequisites
* Python 3.8 or higher
* The `Scrum Guide.pdf` file in the project directory.
### 1. Clone the Repository
First, get the files from the Hugging Face Space repository.
```bash
git clone https://huggingface.co/spaces/firman-ml/Stecu-RAG
cd Stecu-RAG
```
### 2. Create a Virtual Environment
It's highly recommended to use a virtual environment to manage dependencies.
```bash
# Create the virtual environment
python -m venv .venv
# Activate it
# On Windows
.venv\Scripts\activate
# On macOS/Linux
source .venv/bin/activate
```
### 3. Install Dependencies
The required libraries are listed in the `requirements.txt` file.
```bash
pip install -r requirements.txt
```
### 4. Run the Application
Launch the Gradio app with the following command:
```bash
python app.py
```
A local URL (e.g., `http://127.0.0.1:7860`) will be displayed in your terminal. Open it in your browser to start chatting with Stecu!
## ⚠️ Disclaimer
Stecu's knowledge is strictly limited to the contents of the 2020 version of the Scrum Guide. It cannot answer questions outside of this scope or provide opinions. Its purpose is to be a reliable and accurate guide to the Scrum framework as written.