metadata

license: mit
title: ' QA Bug Triage Pipeline'
sdk: gradio
emoji: 🏆
colorFrom: red
colorTo: gray

🐛 QA Bug Triage Pipeline

A retrieval-augmented generation (RAG) workflow that turns messy app reviews into structured, searchable QA bug reports.

🔗 Links: Hugging Face Demo · GitHub Repository

📖 Overview

Teams often receive product feedback as noisy, repetitive, and unstructured review text. This project converts those reviews into structured bug reports with an LLM, stores them in a local vector database, and makes them easy to search and summarize.

The result is a lightweight bug triage assistant built with Python, Gradio, OpenAI, ChromaDB, and RAG evaluation tooling.

✨ What It Does

| Capability | Description |
| --- | --- |
| 📥 Review collection | Fetches real Google Play reviews |
| 🔀 Query routing | Classifies incoming text before triage |
| 🗂️ Structured triage | Generates JSON bug reports with consistent fields |
| 🔍 Hybrid retrieval | Combines semantic retrieval with BM25 keyword matching |
| 🤖 AI summaries | Produces concise summaries for triage and search results |
| 🗑️ Store reset | Clears persisted bugs directly from the UI |
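
To make "consistent fields" concrete, here is a minimal sketch of what a structured bug record and its validation could look like. The field names (`title`, `severity`, `component`, `steps_to_reproduce`), the severity levels, and the `parse_bug_report` helper are assumptions for illustration; they are not taken from triage.py.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical schema: field names and severity levels are illustrative
# assumptions, not the actual fields produced by triage.py.
REQUIRED_FIELDS = {"title", "severity", "component"}
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}


@dataclass
class BugReport:
    title: str
    severity: str
    component: str
    steps_to_reproduce: list[str] = field(default_factory=list)


def parse_bug_report(raw_json: str) -> BugReport:
    """Parse an LLM response into a BugReport, rejecting malformed records."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    severity = str(data["severity"]).lower()
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    return BugReport(
        title=str(data["title"]).strip(),
        severity=severity,
        component=str(data["component"]).strip(),
        steps_to_reproduce=[str(s) for s in data.get("steps_to_reproduce", [])],
    )


if __name__ == "__main__":
    raw = '{"title": "Crash on login", "severity": "High", "component": "auth"}'
    print(asdict(parse_bug_report(raw)))
```

Validating each record before it is stored keeps every entry in the index in the same shape, which is what makes later search and summarization predictable.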

🏗️ Architecture

Google Play Reviews
        │
        ▼
  ┌──────────────┐
  │ Query Router │  ──→  feature request / general complaint (dropped)
  └──────────────┘
        │ bug report
        ▼
  ┌─────────────┐
  │   Triage    │  ──→  structured JSON bug record
  └─────────────┘
        │
        ▼
  ┌─────────────┐
  │  ChromaDB   │  ──→  vector + BM25 hybrid index
  └─────────────┘
        │
        ▼
  ┌─────────────┐
  │ AI Summary  │  ──→  concise triage output
  └─────────────┘
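
The flow in the diagram can be sketched as a small function that wires the four stages together. This is a sketch under stated assumptions: `run_pipeline` and the stage callables are hypothetical stand-ins, and the toy keyword router in the demo replaces the LLM classifier purely so the example runs offline. The real logic lives in triage.py and rag.py.

```python
from typing import Callable, Optional

# All names below are hypothetical stand-ins for illustration only.
BUG = "bug report"


def run_pipeline(
    review: str,
    route: Callable[[str], str],
    triage: Callable[[str], dict],
    store: Callable[[dict], None],
    summarize: Callable[[dict], str],
) -> Optional[str]:
    """Route a review; triage, store, and summarize it only if it is a bug."""
    if route(review) != BUG:
        return None  # feature requests and general complaints are dropped
    record = triage(review)
    store(record)
    return summarize(record)


if __name__ == "__main__":
    saved: list[dict] = []

    # Toy keyword router standing in for the LLM classifier.
    def toy_route(text: str) -> str:
        return BUG if "crash" in text.lower() else "feature request"

    def toy_triage(text: str) -> dict:
        return {"title": text[:40], "severity": "high"}

    def toy_summarize(record: dict) -> str:
        return f"[{record['severity']}] {record['title']}"

    for review in ("App crashes on login", "Please add dark mode"):
        print(run_pipeline(review, toy_route, toy_triage, saved.append, toy_summarize))
    print(len(saved), "record(s) stored")
```

Passing the stages in as callables keeps the control flow testable without an API key, since each stage can be swapped for a stub.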

🚀 Quick Start

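# Windows PowerShell
python -m venv .venv                # create a virtual environment
.\.venv\Scripts\Activate.ps1        # activate it (macOS/Linux: source .venv/bin/activate)
pip install -r requirements.txt     # install dependencies
python app.py                       # launch the Gradio app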

Then open the local URL that Gradio prints in the terminal (http://127.0.0.1:7860 by default) in your browser.

🔑 API Keys

This app uses BYOK (Bring Your Own Key):

Paste your OpenAI API key into the masked field in the UI
The key is hidden on screen and is never committed to the repository

⚠️ Never commit API keys to source control.

🖥️ How To Use

Collect — fetch and triage live Google Play reviews
Triage — analyze a single custom review
Search — retrieve similar bugs via hybrid retrieval
Clear bugs — reset the ChromaDB store
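
This README does not say how rag.py merges the semantic and BM25 result lists. One common option is reciprocal rank fusion (RRF), sketched below as an illustration of hybrid retrieval rather than as the project's actual method. The function name, the sample bug IDs, and the constant `k = 60` (a conventional default for RRF) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    semantic_hits = ["bug-7", "bug-2", "bug-9"]  # e.g. from a vector query
    keyword_hits = ["bug-2", "bug-4", "bug-7"]   # e.g. from BM25
    print(reciprocal_rank_fusion([semantic_hits, keyword_hits]))
```

RRF works on ranks rather than raw scores, so it avoids having to normalize cosine distances and BM25 scores onto a common scale. Bugs that appear in both lists rise to the top.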

📁 Project Structure

qa-bug-triage/
├── app.py                  # Gradio app and interaction flows
├── collect.py              # Google Play review collection
├── triage.py               # Routing and structured triage logic
├── rag.py                  # Chroma storage and hybrid retrieval
└── eval/
    ├── eval.py             # RAG evaluation script
    ├── eval_dataset.json   # Evaluation dataset
    └── results.json        # Latest saved evaluation metrics

📊 Evaluation

Run the evaluation suite:

python eval\eval.py --api-key YOUR_OPENAI_API_KEY

Latest results (RAGAS metrics on a 0 to 1 scale, higher is better; saved in eval/results.json):

| Metric | Score |
| --- | --- |
| Answer Relevancy | `0.868` |
| Faithfulness | `0.292` |
| Context Precision | `0.020` |
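
As a reading aid for the Context Precision score, the sketch below shows a simplified version of the rank-weighted calculation for a single query. It assumes a relevance verdict is already known for each retrieved chunk, whereas RAGAS obtains those verdicts with an LLM judge and averages over the dataset. `context_precision` is an illustrative helper, not the RAGAS API.

```python
def context_precision(relevance: list[bool]) -> float:
    """Average precision@k over the ranks k that hold a relevant chunk.

    `relevance[i]` says whether the chunk retrieved at rank i + 1 was
    judged relevant. Returns 0.0 when nothing relevant was retrieved.
    """
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / k  # precision@k at a relevant position
    return total / hits if hits else 0.0


if __name__ == "__main__":
    print(context_precision([True, False, False]))   # relevant chunk ranked first
    print(context_precision([False, False, True]))   # relevant chunk ranked last
    print(context_precision([False, False, False]))  # nothing relevant retrieved
```

Under this definition a query scores 0 when none of its retrieved chunks is judged relevant, so a low average can point at either the retriever or the relevance labels in the evaluation dataset.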

💰 Cost Estimate

Target: under $0.50 for a short demo session.

| Parameter | Value |
| --- | --- |
| Token range | ~8k – 20k tokens |
| Typical cost | < $0.50 per session |
| Recommended max reviews | 5 – 10 |

Tips to keep costs low:

Keep max reviews between 5 and 10
Avoid repeated large collect runs
Use short test inputs for manual triage validation
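
The estimate above can be checked with simple arithmetic. In this sketch the per-million-token rates (2.50 USD for input, 10.00 USD for output) and the 16k/4k split are placeholder assumptions, not figures from this project; substitute the current prices for the model you use.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Return the estimated session cost in USD for the given token counts."""
    total = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Placeholder rates (assumptions); check the provider's pricing page.
    INPUT_RATE = 2.50
    OUTPUT_RATE = 10.00
    # Upper end of the ~8k to 20k token range, with an assumed 16k/4k split.
    cost = estimate_cost_usd(16_000, 4_000, INPUT_RATE, OUTPUT_RATE)
    print(f"Estimated session cost: ${cost:.2f}")
```

With these placeholder rates, 16k input tokens plus 4k output tokens come to $0.08, and even 20k tokens billed entirely at the output rate would be $0.20, both under the $0.50 target.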

🛠️ Tech Stack

| Tool | Role |
| --- | --- |
| Python | Core language |
| Gradio | Web UI |
| OpenAI GPT-4o | LLM for triage and summaries |
| ChromaDB | Vector store |
| rank-bm25 | Keyword retrieval |
| RAGAS | RAG evaluation framework |
| google-play-scraper | Review data source |

✅ Features Implemented

Requirements covered

RAG project written in Python
Uses at least one LLM
Public repository with collection and curation scripts
README with project explanation and setup
BYOK input in the UI — see API Keys
Cost estimate included — see Cost Estimate
API key requirements listed — see API Keys
More than 5 optional techniques covered (7 total — see below)

Techniques implemented

Streaming responses in the UI — app.py
Dynamic few-shot prompting using similar bugs — triage.py
Evaluation code and dataset included — eval/eval.py, eval/eval_dataset.json
Domain-specific app for QA bug triage — triage.py, app.py
Structured JSON data curation for RAG — triage.py
Hybrid retrieval with semantic search and BM25 — rag.py
Query routing in the active app flow — triage.py
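
Of the techniques listed, dynamic few-shot prompting is the least self-explanatory, so here is a minimal sketch of the idea: previously triaged bugs that resemble the new review are placed in the prompt as worked examples. The function name, the prompt wording, and the `review`/`report` keys are assumptions for illustration and may differ from what triage.py does.

```python
import json


def build_few_shot_prompt(
    review: str,
    similar_bugs: list[dict],
    max_examples: int = 3,
) -> str:
    """Assemble a triage prompt that shows similar, already-triaged bugs."""
    lines = ["Convert the review into a JSON bug report.", ""]
    # Each example pairs an earlier review with the report generated for it.
    for bug in similar_bugs[:max_examples]:
        lines.append(f"Review: {bug['review']}")
        lines.append(f"Report: {json.dumps(bug['report'], sort_keys=True)}")
        lines.append("")
    # The new review goes last, leaving the report for the model to complete.
    lines.append(f"Review: {review}")
    lines.append("Report:")
    return "\n".join(lines)


if __name__ == "__main__":
    retrieved = [
        {
            "review": "App freezes when I upload a photo",
            "report": {"title": "Freeze on photo upload", "severity": "high"},
        },
    ]
    print(build_few_shot_prompt("Crashes every time I open settings", retrieved))
```

The examples are "dynamic" because they are retrieved per review rather than hard-coded, so the model sees reports from the same problem area as the input.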