juliaturc committed on
Commit 7cec107 · 1 Parent(s): 292215b

Default --retrieval-alpha to 1.0

Our benchmark shows that BM25 actively hurts retrieval quality for code. It also adds overhead (it needs nltk models, etc.), so it makes sense to default to dense retrieval only.

Files changed (1)
  1. sage/config.py +1 -1
sage/config.py CHANGED
@@ -137,7 +137,7 @@ def add_vector_store_args(parser: ArgumentParser) -> Callable:
   )
   parser.add(
   "--retrieval-alpha",
-  default=0.5,
+  default=1.0,
   type=float,
   help="Takes effect for Pinecone retriever only. The weight of the dense (embeddings-based) vs sparse (BM25) "
   "encoder in the final retrieval score. A value of 0.0 means BM25 only, 1.0 means embeddings only.",
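
For reviewers unfamiliar with the alpha convention in the help text, here is a minimal sketch of how such a weight is commonly applied to a hybrid query: the dense vector is scaled by alpha and the sparse (BM25) values by 1 - alpha. The function name `weight_hybrid_query` and the sparse-vector layout (`indices` / `values`) are assumptions for illustration and are not taken from sage's code; the actual retriever may apply the weight differently.

```python
from typing import Dict, List, Tuple

# Assumed sparse layout: {"indices": [token ids], "values": [BM25 weights]}.
SparseVector = Dict[str, list]


def weight_hybrid_query(
    dense: List[float], sparse: SparseVector, alpha: float
) -> Tuple[List[float], SparseVector]:
    """Scale a dense/sparse query pair by a convex weight (hypothetical helper).

    alpha = 1.0 keeps only the dense (embeddings) signal,
    alpha = 0.0 keeps only the sparse (BM25) signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


if __name__ == "__main__":
    dense_query = [2.0, 4.0]
    sparse_query = {"indices": [7, 42], "values": [1.0, 3.0]}
    # With the new default, every sparse value is multiplied by zero.
    print(weight_hybrid_query(dense_query, sparse_query, alpha=1.0))
```

Under this convention, the new default of 1.0 zeroes out the BM25 contribution entirely, which matches the "embeddings only" wording in the help text.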