André Oliveira committed on
Commit
f7d462d
·
1 Parent(s): 434392c

refactor: mcp entrypoint changed

Files changed (3)
  1. README.md +140 -2
  2. api.py +3 -4
  3. app.py +16 -14
README.md CHANGED
@@ -6,9 +6,147 @@ colorTo: purple
6
  sdk: gradio
7
  sdk_version: "5.49.1"
8
  app_file: app.py
9
- pinned: false
10
  ---
11
 
12
  # Ragmint MCP Server
13
 
14
- Gradio-based MCP server for Ragmint.
6
  sdk: gradio
7
  sdk_version: "5.49.1"
8
  app_file: app.py
9
+ pinned: true
10
  ---
11
 
12
  # Ragmint MCP Server
13
 
14
+ Gradio-based MCP server for Ragmint, enabling **Retrieval-Augmented Generation (RAG) pipeline optimization and tuning** via an MCP interface.
15
+
16
+ <p align="center">
17
+ <img src="https://raw.githubusercontent.com/andyolivers/ragmint/main/src/ragmint/assets/img/ragmint-banner.png" width="auto" height="70px" alt="Ragmint Banner">
18
+ </p>
19
+
20
+ ![Python](https://img.shields.io/badge/python-3.9%2B-blue) ![License](https://img.shields.io/badge/license-Apache%202.0-green) ![Status](https://img.shields.io/badge/Status-Active-success)
21
+
22
+ ---
23
+
24
+ ## 🧩 Overview
25
+
26
+ Ragmint MCP Server exposes **Ragmint**, a modular Python library for **evaluating, optimizing, and tuning RAG pipelines**, through the **Model Context Protocol (MCP)**. This allows external clients (like Claude Desktop or Cursor) to **run experiments, retrieve leaderboard results, and tune RAG parameters programmatically**.
27
+
28
+ ### Features exposed via MCP:
29
+
30
+ * ✅ Automated hyperparameter optimization (Grid, Random, Bayesian via Optuna)
31
+ * 🤖 Auto-RAG Tuner for dynamic retriever–embedding recommendations
32
+ * 🧮 Validation QA generation for corpora without labeled data
33
+ * 🏆 Leaderboard tracking and experiment comparison
34
+ * 🧠 Explainability via Gemini / Claude
35
+ * 📦 Chunking, embeddings, retrievers, rerankers configuration
36
+ * ⚙️ Full RAG pipeline control programmatically
37
+
38
+ ---
39
+
40
+ ## 🚀 Quick Start
41
+
42
+ ### Installation
43
+
44
+ ```bash
45
+ pip install -r requirements.txt
46
+ ```
47
+
48
+ ### Running the MCP Server
49
+
50
+ ```bash
51
+ python app.py
52
+ ```
53
+
54
+ The server will expose MCP-compatible endpoints, allowing clients to:
55
+
56
+ * Run optimization experiments.
57
+ * Autotune pipelines.
58
+ * Generate validation QA sets with an LLM.
59
+
60
+
61
+ ### Environment Variables
62
+
63
+ Set API keys for LLMs used in explainability and QA generation:
64
+
65
+ ```bash
66
+ export ANTHROPIC_API_KEY="your_claude_key"
67
+ export GOOGLE_API_KEY="your_gemini_key"
68
+ ```
69
+
70
+ ---
71
+
72
+ ## 🧠 MCP Usage
73
+
74
+ Ragmint MCP Server provides Python-callable interfaces for programmatic control. Example usage with MCP:
75
+
76
+ ```python
77
+ from mcp_client import MCPClient
78
+
79
+ client = MCPClient(server_url="http://localhost:7860")
80
+
81
+ # Run Auto-RAG tuning
82
+ config, results = client.autotune(docs_path="data/docs/", trials=5)
83
+ print("Best config:", config)
84
+
85
+ # Retrieve leaderboard
86
+ top_results = client.leaderboard(top_k=5)
87
+ print(top_results)
88
+ ```
89
+
90
+ ---
91
+
92
+ ## 🔤 Supported Embeddings
93
+
94
+ * `sentence-transformers/all-MiniLM-L6-v2`
95
+ * `sentence-transformers/all-mpnet-base-v2`
96
+ * `BAAI/bge-base-en-v1.5`
97
+ * `intfloat/multilingual-e5-base`
98
+
99
+ ### Configuration Example
100
+
101
+ ```yaml
102
+ embedding_model: sentence-transformers/all-MiniLM-L6-v2
103
+ ```
104
+
105
+ ---
106
+
107
+ ## 🔍 Supported Retrievers
108
+
109
+ | Retriever | Description |
110
+ | ------------ | ---------------------------------- |
111
+ | FAISS | Fast vector similarity search |
112
+ | Chroma | Persistent vector DB |
113
+ | scikit-learn | Local lightweight NearestNeighbors |
114
+
115
+ ### Configuration Example
116
+
117
+ ```yaml
118
+ retriever: faiss
119
+ ```
120
+
121
+ ---
122
+
123
+ ## 🧮 Dataset Options
124
+
125
+ | Mode | Example | Description |
126
+ | -------------------- | ---------------------------------- | -------------------------------------------- |
127
+ | Default | validation_set=None | Uses built-in experiments/validation_qa.json |
128
+ | Custom File | validation_set="data/my_eval.json" | Your QA dataset |
129
+ | Hugging Face Dataset | validation_set="squad" | Downloads benchmark dataset |
130
+
131
+
132
+ ---
133
+
134
+ ## 🧩 Folder Structure
135
+
136
+ ```
137
+ ragmint_mcp_server/
138
+ ├── app.py # MCP server entrypoint
139
+ ├── models.py
140
+ └── api.py
141
+ ```
142
+
143
+ ---
144
+
145
+ ## 📘 License
146
+
147
+ This project is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for details.
148
+
149
+
150
+ <p align="center">
151
+ <sub>Built with ❤️ by <a href="https://andyolivers.com">André Oliveira</a> | Apache 2.0 License</sub>
152
+ </p>
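
Review note on the README's Environment Variables section: the two key names it lists can be checked before the server starts, so a missing key surfaces early instead of as a failed explainability or QA call. The following is a minimal sketch, not part of this commit. The helper name `missing_api_keys` is hypothetical, and treating both keys as required is an assumption (the README only says they are used for explainability and QA generation).

```python
import os

# Key names copied from the README; treating both as required is an assumption.
EXPECTED_KEYS = ("ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def missing_api_keys(env=None):
    """Return the expected key names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in EXPECTED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All expected API keys are set.")
```

Passing a plain dict instead of `os.environ` keeps the check easy to exercise in tests without touching the real environment.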
api.py CHANGED
@@ -303,7 +303,6 @@ def generate_qa(req: QARequest):
303
  raise HTTPException(status_code=500, detail=str(exc))
304
 
305
 
306
- # only run uvicorn if script is executed directly
307
- if __name__ == "__main__":
308
- import uvicorn as _uvicorn
309
- _uvicorn.run(app, host="0.0.0.0", port=7860, log_level="info")
 
303
  raise HTTPException(status_code=500, detail=str(exc))
304
 
305
 
306
+ def start_api():
307
+ import uvicorn
308
+ uvicorn.run(app, host="0.0.0.0", port=8000, log_level="info")
 
app.py CHANGED
@@ -1,15 +1,17 @@
1
- # app.py
2
  import gradio as gr
3
  import requests
4
  import json
5
  import os
6
  import shutil
7
- import uvicorn
8
  from models import OptimizeRequest, AutotuneRequest, QARequest
9
- from api import app as backend_app # import the FastAPI app we just saved
 
10
 
11
- # Base URL for internal calls (same process)
12
- BASE_INTERNAL = "http://127.0.0.1:7860"
 
 
 
13
 
14
  def call_api(endpoint: str, payload: dict) -> str:
15
  try:
@@ -46,13 +48,13 @@ DEFAULT_AUTOTUNE_JSON = model_to_json(AutotuneRequest)
46
  DEFAULT_QA_JSON = model_to_json(QARequest)
47
 
48
  with gr.Blocks(theme=gr.themes.Soft()) as demo:
49
- gr.Markdown("# Ragmint MCP Client (UI)")
50
  with gr.Column():
51
  gr.Markdown("## Upload Documents")
52
  upload_files = gr.File(file_count="multiple", type="filepath")
53
  upload_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
54
  upload_btn = gr.Button("Upload", variant="primary")
55
- upload_out = gr.Textbox()
56
  upload_btn.click(upload_docs_tool, inputs=[upload_files, upload_path], outputs=upload_out)
57
  gr.Markdown("---")
58
 
@@ -60,7 +62,7 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
60
  gr.Markdown("## Optimize RAG")
61
  optimize_input = gr.Textbox(lines=12, value=DEFAULT_OPTIMIZE_JSON, label="OptimizeRequest JSON")
62
  optimize_btn = gr.Button("Submit", variant="primary")
63
- optimize_out = gr.Textbox(lines=15)
64
  optimize_btn.click(optimize_rag_tool, inputs=optimize_input, outputs=optimize_out)
65
  gr.Markdown("---")
66
 
@@ -76,13 +78,13 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
76
  gr.Markdown("## Generate QA")
77
  qa_input = gr.Textbox(lines=12, value=DEFAULT_QA_JSON, label="QARequest JSON")
78
  qa_btn = gr.Button("Submit", variant="primary")
79
- qa_out = gr.Textbox(lines=15)
80
  qa_btn.click(generate_qa_tool, inputs=qa_input, outputs=qa_out)
81
  gr.Markdown("---")
82
 
83
- # mount the Gradio app on FastAPI at root ("/")
84
- gr.mount_gradio_app(backend_app, demo, path="/")
85
-
86
- # When run directly, serve with uvicorn (HF will run this)
87
  if __name__ == "__main__":
88
- uvicorn.run(backend_app, host="0.0.0.0", port=7860, log_level="info")
 
 
 
 
 
 
1
  import gradio as gr
2
  import requests
3
  import json
4
  import os
5
  import shutil
 
6
  from models import OptimizeRequest, AutotuneRequest, QARequest
7
+ import threading
8
+ from api import start_api
9
 
10
+ threading.Thread(target=start_api, daemon=True).start()
11
+
12
+
13
+ # Base URL for internal calls
14
+ BASE_INTERNAL = "http://127.0.0.1:8000"
15
 
16
  def call_api(endpoint: str, payload: dict) -> str:
17
  try:
 
48
  DEFAULT_QA_JSON = model_to_json(QARequest)
49
 
50
  with gr.Blocks(theme=gr.themes.Soft()) as demo:
51
+ gr.Markdown("# Ragmint MCP Client")
52
  with gr.Column():
53
  gr.Markdown("## Upload Documents")
54
  upload_files = gr.File(file_count="multiple", type="filepath")
55
  upload_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
56
  upload_btn = gr.Button("Upload", variant="primary")
57
+ upload_out = gr.Textbox(label="Response")
58
  upload_btn.click(upload_docs_tool, inputs=[upload_files, upload_path], outputs=upload_out)
59
  gr.Markdown("---")
60
 
 
62
  gr.Markdown("## Optimize RAG")
63
  optimize_input = gr.Textbox(lines=12, value=DEFAULT_OPTIMIZE_JSON, label="OptimizeRequest JSON")
64
  optimize_btn = gr.Button("Submit", variant="primary")
65
+ optimize_out = gr.Textbox(lines=15, label="Response")
66
  optimize_btn.click(optimize_rag_tool, inputs=optimize_input, outputs=optimize_out)
67
  gr.Markdown("---")
68
 
 
78
  gr.Markdown("## Generate QA")
79
  qa_input = gr.Textbox(lines=12, value=DEFAULT_QA_JSON, label="QARequest JSON")
80
  qa_btn = gr.Button("Submit", variant="primary")
81
+ qa_out = gr.Textbox(lines=15, label="Response")
82
  qa_btn.click(generate_qa_tool, inputs=qa_input, outputs=qa_out)
83
  gr.Markdown("---")
84
 
 
 
 
 
85
  if __name__ == "__main__":
86
+ demo.launch(
87
+ server_name="0.0.0.0",
88
+ server_port=7860,
89
+ mcp_server=True
90
+ )
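
Review note on the new entrypoint: `app.py` now starts the FastAPI backend in a daemon thread on port 8000 and then serves the Gradio UI on port 7860, with UI callbacks calling `BASE_INTERNAL`. A request issued before uvicorn has bound its port would fail, so a readiness check may be worth considering. The sketch below is an illustration under that assumption, not code from this commit. The helper name `wait_for_port` is hypothetical and it uses only the standard library.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); pause briefly and retry.
            time.sleep(interval)
    return False
```

One possible use would be calling `wait_for_port("127.0.0.1", 8000)` right after the `threading.Thread(...).start()` line and logging a warning if it returns `False`. Note that this only confirms the port accepts connections, not that every route is ready.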