npuliga committed on
Commit f551b90 · 1 Parent(s): 18d2107

updated files

Files changed (4)
  1. Dockerfile +2 -2
  2. README.md +4 -60
  3. app.py +3 -3
  4. requirements.txt +1 -4
Dockerfile CHANGED
@@ -17,5 +17,5 @@ COPY . .
  RUN mkdir -p /code/cache && chmod 777 /code/cache

  # Command to run the application
- # We use host 0.0.0.0 and port 7860 (Hugging Face's default port)
- CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
+ # Run Gradio directly (compatible with Hugging Face Spaces)
+ CMD ["python", "app.py"]
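Review note: the removed `uvicorn` command set the host (`0.0.0.0`) and port (`7860`) explicitly, while `python app.py` leaves both to `demo.launch()`. Gradio's `launch()` binds to `127.0.0.1` unless `server_name` or the `GRADIO_SERVER_NAME` environment variable says otherwise, so inside a container the app may not be reachable from outside. The sketch below is one possible way to keep the old binding. The helper name `resolve_launch_kwargs` and its defaults are assumptions for illustration, not part of this commit.

```python
import os
from typing import Dict, Mapping, Union

# Assumed defaults: the values the removed uvicorn command passed explicitly.
DEFAULT_HOST = "0.0.0.0"
DEFAULT_PORT = 7860


def resolve_launch_kwargs(
    env: Mapping[str, str] = os.environ,
) -> Dict[str, Union[str, int]]:
    """Build keyword arguments for demo.launch() (hypothetical helper).

    GRADIO_SERVER_NAME / GRADIO_SERVER_PORT take precedence when set;
    otherwise fall back to the host and port used before this commit.
    """
    host = env.get("GRADIO_SERVER_NAME", DEFAULT_HOST)
    port = int(env.get("GRADIO_SERVER_PORT", DEFAULT_PORT))
    return {"server_name": host, "server_port": port}


# Possible use at the bottom of app.py (not executed here):
#     if __name__ == "__main__":
#         demo.launch(**resolve_launch_kwargs())
```

If the Space runs through the Gradio SDK rather than this Dockerfile, the platform may already provide these settings, in which case the plain `demo.launch()` is sufficient.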
README.md CHANGED
@@ -3,74 +3,18 @@ title: RAG Analytics Dashboard
  colorFrom: blue
  colorTo: green
  sdk: gradio
- sdk_version: 4.44.1
  app_file: app.py
  pinned: false
  license: apache-2.0
- short_description: Compare RAG system performance across multiple domains
  ---

  # RAG Pipeline Analytics Dashboard

- Interactive dashboard for analyzing RAG (Retrieval-Augmented Generation) system performance across multiple domains.
-
- ## Features
-
- - **Intra-Domain Analysis:** Compare different RAG configurations within a single domain
- - **Performance Metrics:** RMSE (Relevance, Utilization, Completeness), F1 Score, AUC-ROC
- - **Interactive Filtering:** Filter tests by reranker model, summarization model, and chunking strategy
- - **Inter-Domain Comparison:** Compare peak performance across different domains
- - **Data Preview:** Inspect raw data and configuration parameters
-
- ## Supported Domains
-
- - **Biomedical** (PubMedQA)
- - **Finance** (FinQA)
- - **General** (MS MARCO)
- - **Legal** (CUAD)
+ Interactive dashboard for analyzing RAG system performance across multiple domains (Biomedical, Finance, General, Legal).

  ## Usage

- 1. **Load Data:** Click "Load/Refresh Data" to load all test results
- 2. **Select Domain:** Choose a domain from the dropdown
- 3. **Apply Filters:** Use the filter dropdowns to compare specific configurations
- 4. **View Metrics:**
- - RMSE graph shows relevance, utilization, and completeness (lower is better)
- - Performance graph shows F1 Score and AUC-ROC (higher is better)
- 5. **Compare Domains:** Switch to "Inter-Domain Comparison" tab to see overall best configurations
-
- ## Interpreting Results
-
- ### RMSE Metrics (Lower is Better)
- - **Relevance:** How well retrieved documents match the query
- - **Utilization:** How efficiently the context is used
- - **Completeness:** Coverage of required information
-
- ### Performance Metrics (Higher is Better)
- - **F1 Score:** Balance of precision and recall
- - **AUC-ROC:** Overall classification performance
-
- ## Configuration Parameters
-
- The dashboard analyzes variations in:
- - Embedding models
- - Reranker models
- - Summarization strategies
- - Chunking strategies
- - Retrieval strategies (Dense, Sparse, Hybrid)
- - Hyperparameters (chunk size, overlap, alpha, top-k)
-
- ## Technology Stack
-
- - **Framework:** Gradio 4.0+
- - **Visualization:** Plotly Express
- - **Data Processing:** Pandas
- - **Backend:** FastAPI
-
- ## License
-
- Apache 2.0
-
- ---
+ 1. Click **Load/Refresh Data** to load test results
+ 2. Select a domain and apply filters to compare configurations
+ 3. View RMSE metrics (lower is better) and Performance metrics (higher is better)

- **Version:** v2.1.0-fixed | Built for AIML @ IIIT Hyderabad - TalentSprint
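Review note: the shortened README keeps the "lower is better" reading of RMSE but drops the explanation. For readers who want the reasoning, here is a minimal sketch of how a root-mean-square error is generally computed from predicted and reference scores. The function and argument names are hypothetical, and this is not taken from the dashboard's code.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Root-mean-square error between two equally long score sequences."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each per-item error, average the squares, then take the root.
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

A value of 0 means the predicted scores match the reference exactly, and larger errors raise the value, which is why lower is better for the relevance, utilization, and completeness metrics.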
 
 
app.py CHANGED
@@ -1,13 +1,11 @@
  import pandas as pd
  import gradio as gr
  import plotly.express as px
- from fastapi import FastAPI
  from typing import Dict

  from config import METADATA_COLUMNS, DATA_FOLDER
  from data_loader import load_csv_from_folder, get_available_datasets

- app = FastAPI()
  DB: Dict[str, pd.DataFrame] = {}

  # --- 1. DATA PROCESSING FUNCTIONS ---
@@ -332,4 +330,6 @@ print(f"Loading data from {DATA_FOLDER}...")
  startup_status = load_data()
  print(startup_status)

- app = gr.mount_gradio_app(app, demo, path="/")
+ # Launch Gradio app
+ if __name__ == "__main__":
+     demo.launch()
requirements.txt CHANGED
@@ -1,7 +1,4 @@
  gradio==4.44.1
  huggingface-hub==0.22.2
  plotly>=5.18.0
- pandas>=2.0.0
- fastapi>=0.104.0
- uvicorn[standard]>=0.24.0
- python-multipart>=0.0.6
+ pandas>=2.0.0