André Oliveira committed
Commit 709c564 · 1 Parent(s): 92127ad

docs: updated tool info

Files changed (2)
  1. README.md +21 -19
  2. app.py +34 -35
README.md CHANGED
@@ -44,11 +44,11 @@ Ragmint MCP Server exposes the full power of **Ragmint**, a modular Python libra
 
  ### Features exposed via MCP:
 
- * ✅ Automated hyperparameter optimization (Grid, Random, Bayesian via Optuna)
- * 🤖 Auto-RAG Tuner for dynamic retriever–embedding recommendations
- * 🧮 Validation QA generation for corpora without labeled data
- * 📦 Chunking, embeddings, retrievers, rerankers configuration
- * ⚙️ Full RAG pipeline control programmatically
 
  ---
 
@@ -68,7 +68,7 @@ python app.py
 
  The server will expose MCP-compatible endpoints, allowing clients to:
 
- * Perform optimization experiments
  * Automatically autotune pipelines.
  * Generate validation QA sets with LLM.
 
@@ -108,11 +108,13 @@ embedding_model: sentence-transformers/all-MiniLM-L6-v2
 
  ## 🔍 Supported Retrievers
 
- | Retriever    | Description                        |
- | ------------ | ---------------------------------- |
- | FAISS        | Fast vector similarity search      |
- | Chroma       | Persistent vector DB               |
- | scikit-learn | Local lightweight NearestNeighbors |
 
  ### Configuration Example
 
@@ -124,12 +126,12 @@ retriever: faiss
 
  ## 🧮 Dataset Options
 
- | Mode                 | Example                            | Description                                  |
- | -------------------- | ---------------------------------- | -------------------------------------------- |
- | Default              | validation_set=None                | Uses built-in experiments/validation_qa.json |
- | Custom File          | validation_set="data/my_eval.json" | Your QA dataset                              |
- | Hugging Face Dataset | validation_set="squad"             | Downloads benchmark dataset                  |
-
 
  ---
 
@@ -497,8 +499,8 @@ The Ragmint MCP Server exposes three main endpoints with the following example o
  ```
  </details>
 
- - **deleted_files**: Number of documents removed
- - **status**: "ok" indicates successful workspace reset
 
  ---
 
 
 
  ### Features exposed via MCP:
 
+ * ✅ Automated hyperparameter optimization (Grid, Random, Bayesian via Optuna).
+ * 🤖 Auto-RAG Tuner for dynamic retriever–embedding recommendations.
+ * 🧮 Validation QA generation for corpora without labeled data.
+ * 📦 Chunking, embeddings, retrievers, rerankers configuration.
+ * ⚙️ Full RAG pipeline control programmatically.
 
  ---
 
 
 
  The server will expose MCP-compatible endpoints, allowing clients to:
 
+ * Perform optimization experiments.
  * Automatically autotune pipelines.
  * Generate validation QA sets with LLM.
 
 
 
  ## 🔍 Supported Retrievers
 
+ | Retriever    | Description                                                      |
+ |--------------|------------------------------------------------------------------|
+ | FAISS        | Fast vector similarity search and indexing.                      |
+ | Chroma       | Persistent vector database with embeddings.                      |
+ | scikit-learn | Local lightweight NearestNeighbors retrieval.                    |
+ | bm25         | Classical lexical search based on term relevance (TF-IDF-style). |
+ | numpy        | Brute-force similarity search using raw vectors and matrix ops.  |
 
  ### Configuration Example
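Review note: the two rows added to the retriever table (`bm25` and `numpy`) are each described in a single line. The sketch below illustrates, under stated assumptions, what "classical lexical search" and "brute-force similarity search" usually mean. It is not Ragmint's implementation: the function names `bm25_scores` and `brute_force_top_k` are hypothetical, `k1=1.5` and `b=0.75` are commonly used BM25 defaults rather than values taken from this commit, and plain Python lists stand in for the NumPy arrays and matrix operations the table mentions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def brute_force_top_k(query_vec, doc_vecs, k=1):
    """Rank documents by cosine similarity, comparing against every vector."""
    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm else 0.0

    sims = [cosine(query_vec, vec) for vec in doc_vecs]
    # No index structure: every document is scored, then the best k are kept.
    return sorted(range(len(doc_vecs)), key=lambda i: sims[i], reverse=True)[:k]
```

The lexical scorer only rewards exact token matches, while the brute-force ranker needs embeddings but no index. That trade-off is presumably why both appear next to FAISS and Chroma in the table.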
 
 
 
  ## 🧮 Dataset Options
 
+ | Mode                 | Example                            | Description                        |
+ |----------------------|------------------------------------|------------------------------------|
+ | Default              | validation_set=None                | Uses built-in validation_qa.json.  |
+ | Custom File          | validation_set="data/my_eval.json" | Your QA dataset.                   |
+ | Hugging Face Dataset | validation_set="squad"             | Downloads benchmark dataset.       |
+ | Generate             | validation_set="generate"          | Generates the QA dataset with LLM. |
 
  ---
 
 
  ```
  </details>
 
+ - **deleted_files**: Number of documents removed.
+ - **status**: "ok" indicates successful workspace reset.
 
  ---
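Review note: the README describes the clear-cache response only by its two fields. The server-side handler is not shown in this commit, so the sketch below is a guess at how a reset that "deletes all files and directories inside docs_path" could produce them. The function name `clear_docs` is hypothetical, and counting every removed file (including files inside removed subdirectories) as `deleted_files` is an assumption.

```python
import shutil
from pathlib import Path


def clear_docs(docs_path="data/docs"):
    """Empty docs_path and report how many files were removed (hypothetical helper)."""
    root = Path(docs_path)
    deleted = 0
    if root.is_dir():
        for entry in root.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                # Count the files in the subtree before removing it.
                deleted += sum(1 for p in entry.rglob("*") if p.is_file())
                shutil.rmtree(entry)
            else:
                entry.unlink()
                deleted += 1
    # The folder itself is kept so later uploads have somewhere to go.
    return {"status": "ok", "deleted_files": deleted}
```

In this sketch a missing `docs_path` still returns `"ok"` with a count of zero. Whether the real endpoint does the same is not visible from the diff.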
 
app.py CHANGED
@@ -25,7 +25,7 @@ def call_api(endpoint: str, payload: dict) -> str:
 
  def clear_cache_tool(docs_path="data/docs"):
  """
- 🧹 Clear Cache MCP Tool.
 
  Deletes all files and directories inside docs_path on the server.
 
@@ -136,8 +136,8 @@ DEFAULT_AUTOTUNE_JSON = model_to_json(AutotuneRequest)
  DEFAULT_QA_JSON = model_to_json(QARequest)
 
 
- with gr.Blocks(theme=gr.themes.Soft()) as demo:
- gr.Markdown("# 🤖 Ragmint MCP Server")
 
  gr.HTML("""
  <div style="display:flex; gap:5px; flex-wrap:wrap; align-items:center;">
@@ -163,35 +163,35 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
 
  <br>
 
- ## 🔧 MCP Tools (AI-Driven & Automated)
 
- 📄 **Upload Docs**: Upload .txt files to workspace for evaluation
- 🔗 **Upload URLs**: Import remote .txt docs via URLs
- 🧠 **Optimize RAG**: Full hyperparameter search (Grid / Random / Bayesian) with metrics
- ⚙️ **Autotune RAG**: Automated recommendations for best chunking + embeddings
- ❓ **Generate QA Dataset**: Create validation QA pairs with LLMs for benchmarking
- 🧹 **Clear Cache**: Reset workspace and delete stored docs
 
  <br>
 
  ## 🧠 What Ragmint Solves
 
- Automated RAG hyperparameter optimization
- Retriever, embedding, reranker selection
- Synthetic validation QA generation
- Evaluation metrics (faithfulness, latency, etc.)
- Experiment tracking & reproducible pipeline comparison
 
  🔬 **Built for RAG engineers, researchers, and LLM developers** who want consistent performance improvement without trial-and-error.
 
  <br>
 
- ## 🧠 Powered by
 
- **Optuna** (Bayesian Optimization)
- **Google Gemini 2.5 Flash Lite / Pro**
- **FAISS, Chroma, BM25, scikit-learn retrievers**
- **Sentence-Transformers / BGE embeddings**
 
  <br>
 
@@ -207,10 +207,10 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
 
  ## 📦 Example MCP Use Cases
 
- 🧠 Run Auto-Optimization for RAG pipelines
- 📊 Compare embedding + retriever combinations
- ❓ Automatically generate QA validation datasets
- 🔁 Rapid experiment iteration inside Claude / Cursor
 
  <br>
 
@@ -234,18 +234,17 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
  # Upload Documents
  with gr.Column(scale=1):
  gr.Markdown("## Upload Documents")
- gr.Markdown("📂 Upload files (local paths or URLs) to your `data/docs` folder")
  upload_files = gr.File(file_count="multiple", type="filepath")
  upload_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
- upload_btn = gr.Button("Upload", variant="primary")
  upload_out = gr.JSON(label="Response")
  upload_btn.click(upload_docs_tool, inputs=[upload_files, upload_path], outputs=upload_out)
 
-
  # Upload MCP Documents (no file uploader)
  with gr.Column(scale=1):
  gr.Markdown("## Upload Documents from URLs")
- gr.Markdown("📂 Upload files (URLs) to your `data/docs` folder on MCP.")
 
  upload_mcp_input = gr.TextArea(
  placeholder="Paste URLs (one per line without commas)",
@@ -265,7 +264,7 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
  return upload_docs_tool(urls, docs_path)
 
  upload_mcp_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
- upload_mcp_btn = gr.Button("Upload", variant="primary")
  upload_mcp_out = gr.JSON(label="Response")
 
  upload_mcp_btn.click(
@@ -323,7 +322,7 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
  label="LLM Model"
  )
 
- autotune_btn = gr.Button("Autotune", variant="primary")
  autotune_out = gr.Textbox(label="Response", lines=15)
 
 
@@ -433,7 +432,7 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
  label="LLM Model"
  )
 
- optimize_btn = gr.Button("Optimize", variant="primary")
  optimize_out = gr.Textbox(label="Response", lines=15)
 
 
@@ -497,7 +496,7 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
  min_q = gr.Slider(1, 20, step=1, value=3, label="Min Questions")
  max_q = gr.Slider(1, 50, step=1, value=25, label="Max Questions")
 
- qa_btn = gr.Button("Generate QA", variant="primary")
  qa_out = gr.Textbox(lines=15, label="Response")
 
 
@@ -524,13 +523,13 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
 
  gr.Markdown("---")
 
- with gr.Tab("🧹 Clear Cache"):
  # Clear Cache
  with gr.Column():
  gr.Markdown("## Clear Cache")
- gr.Markdown("🧹 Deletes all files and directories inside docs_path on the server.")
  clear_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path to Clear")
- clear_btn = gr.Button("Clear Cache", variant="primary")
  clear_out = gr.JSON(label="Response")
  clear_btn.click(clear_cache_tool, inputs=[clear_path], outputs=clear_out)
  gr.Markdown("---")
 
 
  def clear_cache_tool(docs_path="data/docs"):
  """
+ 🗑️ Clear Cache MCP Tool.
 
  Deletes all files and directories inside docs_path on the server.
 
 
  DEFAULT_QA_JSON = model_to_json(QARequest)
 
 
+ with gr.Blocks(theme=gr.themes.Ocean()) as demo:
+ gr.Markdown("# 🧠 Ragmint MCP Server")
 
  gr.HTML("""
  <div style="display:flex; gap:5px; flex-wrap:wrap; align-items:center;">
 
 
  <br>
 
+ ## 🔧 MCP Tools
 
+ - 📄 **Upload Docs**: Upload .txt files to workspace for evaluation using `upload_docs`.
+ - 🔗 **Upload URLs**: Import remote docs via URLs with `upload_urls`.
+ - 🔧 **Optimize RAG**: Full hyperparameter search (Grid/Random/Bayesian) with metrics on `optimize_rag`.
+ - ⚡️ **Autotune RAG**: Automated recommendations for best chunking and embeddings with `autotune`.
+ - 🧩 **Generate QA Dataset**: Create validation QA pairs with LLMs for benchmarking using `generate_qa`.
+ - 🗑️ **Clear Cache**: Reset workspace and delete stored docs with `clear_cache`.
 
  <br>
 
  ## 🧠 What Ragmint Solves
 
+ - Automated RAG hyperparameter optimization.
+ - Retriever, embedding, reranker selection.
+ - Synthetic validation QA generation.
+ - Evaluation metrics (faithfulness, latency, etc.).
+ - Experiment tracking & reproducible pipeline comparison.
 
  🔬 **Built for RAG engineers, researchers, and LLM developers** who want consistent performance improvement without trial-and-error.
 
  <br>
 
+ ## ⚙ Powered by
 
+ - Optuna (Bayesian Optimization).
+ - Google Gemini 2.5 Flash Lite/Pro.
+ - FAISS, Chroma, BM25, scikit-learn retrievers.
+ - Sentence-Transformers/BGE embeddings.
 
  <br>
 
 
 
  ## 📦 Example MCP Use Cases
 
+ - Run Auto-Optimization for RAG pipelines.
+ - Compare embedding + retriever combinations.
+ - Automatically generate QA validation datasets.
+ - Rapid experiment iteration inside Claude/Cursor.
 
  <br>
 
 
  # Upload Documents
  with gr.Column(scale=1):
  gr.Markdown("## Upload Documents")
+ gr.Markdown("📄 Upload files (local paths or URLs) to your `data/docs` folder.")
  upload_files = gr.File(file_count="multiple", type="filepath")
  upload_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
+ upload_btn = gr.Button("Upload", variant="huggingface")
  upload_out = gr.JSON(label="Response")
  upload_btn.click(upload_docs_tool, inputs=[upload_files, upload_path], outputs=upload_out)
 
 
  # Upload MCP Documents (no file uploader)
  with gr.Column(scale=1):
  gr.Markdown("## Upload Documents from URLs")
+ gr.Markdown("🔗 Upload files (URLs) to your `data/docs` folder on MCP.")
 
  upload_mcp_input = gr.TextArea(
  placeholder="Paste URLs (one per line without commas)",
 
  return upload_docs_tool(urls, docs_path)
 
  upload_mcp_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path")
+ upload_mcp_btn = gr.Button("Upload", variant="huggingface")
  upload_mcp_out = gr.JSON(label="Response")
 
  upload_mcp_btn.click(
 
  label="LLM Model"
  )
 
+ autotune_btn = gr.Button("Autotune", variant="huggingface")
  autotune_out = gr.Textbox(label="Response", lines=15)
 
 
 
  label="LLM Model"
  )
 
+ optimize_btn = gr.Button("Optimize", variant="huggingface")
  optimize_out = gr.Textbox(label="Response", lines=15)
 
 
 
  min_q = gr.Slider(1, 20, step=1, value=3, label="Min Questions")
  max_q = gr.Slider(1, 50, step=1, value=25, label="Max Questions")
 
+ qa_btn = gr.Button("Generate QA", variant="huggingface")
  qa_out = gr.Textbox(lines=15, label="Response")
 
 
 
 
  gr.Markdown("---")
 
+ with gr.Tab("🗑️ Clear Cache"):
  # Clear Cache
  with gr.Column():
  gr.Markdown("## Clear Cache")
+ gr.Markdown("🗑️ Deletes all files and directories inside docs_path on the server.")
  clear_path = gr.Textbox(value=DEFAULT_UPLOAD_PATH, label="Docs Path to Clear")
+ clear_btn = gr.Button("Clear Cache", variant="huggingface")
  clear_out = gr.JSON(label="Response")
  clear_btn.click(clear_cache_tool, inputs=[clear_path], outputs=clear_out)
  gr.Markdown("---")
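
Review note: the URL upload tab asks for URLs "one per line without commas" and then calls `upload_docs_tool(urls, docs_path)`, but the code that turns the text area content into `urls` falls outside the hunks shown here. A minimal sketch of that step might look like the following. The helper name `parse_url_lines` and the choice to skip lines that do not start with `http://` or `https://` are assumptions, not the behavior of `app.py`.

```python
def parse_url_lines(raw_text):
    """Split text-area input into a list of URLs, one per line (hypothetical helper)."""
    urls = []
    for line in raw_text.splitlines():
        candidate = line.strip()
        # Skip blank lines and anything that is not an http(s) URL.
        if candidate.startswith(("http://", "https://")):
            urls.append(candidate)
    return urls
```

The resulting list could then be passed as the first argument of `upload_docs_tool`, which the diff shows receiving `urls` and `docs_path`.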