juliaturc committed on
Commit 0470638 · 1 Parent(s): 8a9d053

Add usage instructions to retrieve_kaggle.py

benchmarks/retrieval/retrieve_kaggle.py CHANGED
@@ -1,4 +1,19 @@
-"""Script to call retrieval on the Kaggle dataset."""
+"""Script to call retrieval on the Kaggle dataset.
+
+Steps:
+1. Make sure that your repository is already indexed. The README explains how to run the `sage-index` command.
+2. Download the test file from the Kaggle competition (https://www.kaggle.com/competitions/code-retrieval-for-hugging-face-transformers/data). You will pass the path to this file via the --benchmark flag below.
+3. Run this script:
+```
+# After cloning the repository:
+cd sage
+pip install -e .
+
+# Run the actual retrieval script. Your flags may vary, but this is one example:
+python benchmarks/retrieval/retrieve_kaggle.py --benchmark=/path/to/kaggle/test/file.csv --mode=remote --pinecone-index-name=your-index --index-namespace=your-namespace
+```
+To see a full list of flags, check out config.py (https://github.com/Storia-AI/sage/blob/main/sage/config.py).
+"""
 
 import csv
 import json
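
The invocation in step 3 of the docstring can also be assembled programmatically, which may help when running the benchmark against several indexes. The following is a minimal sketch, not part of this commit: the helper names `build_retrieval_command` and `check_benchmark_file` are hypothetical, the check that the benchmark file ends in `.csv` is an assumption based on the example path, and only the four flags shown in the docstring are used.

```python
import shlex
import sys
from pathlib import Path


def build_retrieval_command(benchmark_csv, index_name, namespace, mode="remote"):
    """Assemble the argv list for the example invocation in the docstring.

    Hypothetical helper: it only uses the four flags that appear in the
    docstring. See sage/config.py for the full list of flags.
    """
    return [
        sys.executable,
        "benchmarks/retrieval/retrieve_kaggle.py",
        f"--benchmark={benchmark_csv}",
        f"--mode={mode}",
        f"--pinecone-index-name={index_name}",
        f"--index-namespace={namespace}",
    ]


def check_benchmark_file(path):
    """Return True if the path points to an existing file with a .csv suffix.

    Assumption: the Kaggle test file is a CSV, as in the example path.
    """
    p = Path(path)
    return p.is_file() and p.suffix.lower() == ".csv"


if __name__ == "__main__":
    cmd = build_retrieval_command(
        "/path/to/kaggle/test/file.csv", "your-index", "your-namespace"
    )
    # Print a copy-pasteable command line. To execute it instead, run
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when the benchmark path contains spaces.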