Sync backend Docker context from GitHub main
README.md (changed)

````diff
@@ -26,6 +26,7 @@ This project is a Retrieval-Augmented Generation (RAG) system built to answer CB
 - [Installation and Setup](#installation-and-setup)
 - [Configuration](#configuration)
 - [Testing](#testing)
+- [Running the Main Pipeline](#running-the-main-pipeline)
 - [Contributors](#contributors)
 
 ## Live Demo and Repository
@@ -124,13 +125,41 @@ To replicate the system, ensure your environment variables contain valid API key
 
 ## Testing
 
-Run `test.py` to
+Run `test.py` to benchmark the chunking strategies and retrieval configurations, then generate a complete Markdown report of the results.
 
 ```bash
 python test.py
 ```
 
-This script evaluates multiple test queries across the configured chunking techniques and retrieval strategies, then writes the full output to `retrieval_report.md`.
+This script evaluates multiple test queries across the configured chunking techniques and retrieval strategies, then writes the full output to `retrieval_report.md`. Use that report to choose the best chunking strategy and retrieval configuration.
+
+### Key variables you can change in `test.py`
+
+- `test_queries`: the questions used for benchmarking.
+- `CHUNKING_TECHNIQUES_FILTERED`: the chunking strategies included in the report.
+- `RETRIEVAL_STRATEGIES`: the retrieval modes and MMR settings being compared.
+- `index_name`: the Pinecone index that stores the chunked data.
+- `top_k` and `final_k`: how many candidates are retrieved and how many are kept in the final context.
+
+## Running the Main Pipeline
+
+After testing, run `main.py` to reproduce the main experiment with the selected configuration and evaluate faithfulness and relevancy across the model set. This script is part of the reproducibility workflow, since changing its configuration lets you rerun the same evaluation under different chunking, retrieval, and model settings.
+
+```bash
+python main.py
+```
+
+This step runs the end-to-end comparison flow for all models, measures faithfulness and relevancy for each one, and writes the detailed findings to `rag_ablation_findings.md`.
+
+### Key variables you can change in `main.py`
+
+- `CHUNKING_TECHNIQUES` or the technique filter used in the script: controls which chunking methods are evaluated.
+- `test_queries`: the query set used for the ablation study.
+- `MODEL_MAP`: the model lineup being compared.
+- `retrieval_strategy`: the retrieval mode, MMR setting, and label for each run.
+- `top_k` and `final_k`: candidate retrieval depth and final context size.
+- `temperature` in `cfg.gen`: generation randomness for the model outputs.
+- `output_file`: the markdown report written by the run, usually `rag_ablation_findings.md`.
 
 ## Contributors
 
````
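The `test.py` variables named in the diff can be sketched as a configuration block. This is illustrative only: the variable names come from the README's list, but every value, query string, and data shape below is an assumption, not the project's actual code.

```python
# Illustrative sketch of the test.py knobs described in the README.
# Names are from the README; all values and structures are assumptions.

# Questions used for benchmarking.
test_queries = [
    "Example question 1 about the indexed documents",
    "Example question 2 about the indexed documents",
]

# Chunking strategies included in the report (labels made up here).
CHUNKING_TECHNIQUES_FILTERED = ["fixed_size", "recursive", "semantic"]

# Retrieval modes and MMR settings being compared (shape assumed).
RETRIEVAL_STRATEGIES = [
    {"label": "similarity", "use_mmr": False},
    {"label": "mmr", "use_mmr": True, "lambda_mult": 0.5},
]

# Pinecone index that stores the chunked data (placeholder name).
index_name = "example-rag-index"

top_k = 20   # candidates retrieved per query
final_k = 5  # chunks kept in the final context

# Sanity check: the final context can only keep retrieved candidates.
assert final_k <= top_k
```

Narrowing `CHUNKING_TECHNIQUES_FILTERED` or `RETRIEVAL_STRATEGIES` shrinks the benchmark matrix, so `retrieval_report.md` only compares the combinations you actually care about.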
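The `main.py` settings can be sketched the same way. Again, only the names are taken from the README; the values, model identifiers, and the `cfg` structure are placeholders for illustration.

```python
# Illustrative sketch of the main.py knobs described in the README.
# Names are from the README; all values and structures are assumptions.
from types import SimpleNamespace

# Which chunking methods are evaluated (labels made up here).
CHUNKING_TECHNIQUES = ["recursive", "semantic"]

# Query set used for the ablation study.
test_queries = ["Example ablation question"]

# Model lineup being compared (placeholder identifiers).
MODEL_MAP = {
    "model-a": "provider/model-a",
    "model-b": "provider/model-b",
}

# Retrieval mode, MMR setting, and label for the run (shape assumed).
retrieval_strategy = {"mode": "mmr", "use_mmr": True, "label": "mmr_top5"}

top_k = 20   # candidate retrieval depth
final_k = 5  # final context size

# Generation randomness lives under cfg.gen per the README's description.
cfg = SimpleNamespace(gen=SimpleNamespace(temperature=0.2))

# Markdown report written by the run.
output_file = "rag_ablation_findings.md"
```

Keeping `cfg.gen.temperature` low makes the faithfulness and relevancy scores in `rag_ablation_findings.md` more repeatable across reruns of the same configuration.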