Spaces: sentence-transformers/quantized-retrieval
Tom Aarsen committed on Jan 5

Commit cf19736
1 Parent(s): e8e8b51

Keep URL when filtering dataset, removes only id

Files changed (1)

app.py CHANGED

@@ -10,7 +10,7 @@ import numpy as np
 # Load titles, texts, and int8 embeddings in a lazy Dataset, allowing us to efficiently access specific rows on demand
 # Note that we never actually use the int8 embeddings for search directly, they are only used for rescoring after the binary search
-title_text_int8_dataset = load_dataset("sentence-transformers/quantized-retrieval-data", split="train").select_columns(["title", "text", "embedding"])
+title_text_int8_dataset = load_dataset("sentence-transformers/quantized-retrieval-data", split="train").select_columns(["url", "title", "text", "embedding"])
 # title_text_int8_dataset = load_from_disk("wikipedia-mxbai-embed-int8-index").select_columns(["url", "title", "text", "embedding"])
 # Load the binary indices
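
Note on the change: the comments in this hunk describe a two-stage flow in which a binary search picks candidates and the int8 embeddings are used only to rescore them. The sketch below illustrates that idea in plain Python and shows why keeping "url" in the selected columns matters for the returned rows. It is a toy under stated assumptions, not the code in app.py: the rows, the example.org URLs, and the helpers `binarize`, `hamming`, and `search` are all hypothetical, and the binary codes are derived from the int8 rows for brevity, whereas the app loads separate binary indices.

```python
# Illustrative sketch only: rows, URLs, and helper names are hypothetical,
# not taken from app.py.

def binarize(vector):
    """Pack the sign of each dimension into one int (1 bit per dimension)."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a, b):
    """Count the bit positions where two packed binary codes differ."""
    return bin(a ^ b).count("1")


def search(query, rows, top_k=1, rescore_multiplier=2):
    """Return the top_k rows: binary candidate search, then int8 rescoring."""
    query_bits = binarize(query)
    # Stage 1: rank all rows by Hamming distance on the binary codes (cheap).
    # Assumption: codes are computed here from the int8 rows to keep the
    # sketch short; a real setup would use a prebuilt binary index.
    order = sorted(
        range(len(rows)),
        key=lambda i: hamming(query_bits, binarize(rows[i]["embedding"])),
    )
    candidates = order[: top_k * rescore_multiplier]
    # Stage 2: rescore only the candidates with the float query against
    # their int8 embeddings (dot product), highest score first.
    rescored = sorted(
        candidates,
        key=lambda i: sum(q * d for q, d in zip(query, rows[i]["embedding"])),
        reverse=True,
    )
    # "url" can be returned only because it was kept in the column selection.
    return [
        {"url": rows[i]["url"], "title": rows[i]["title"], "text": rows[i]["text"]}
        for i in rescored[:top_k]
    ]


# Hypothetical rows with the four selected columns.
rows = [
    {"url": "https://example.org/a", "title": "A", "text": "first", "embedding": [100, -20, 30, -90]},
    {"url": "https://example.org/b", "title": "B", "text": "second", "embedding": [-80, 60, -10, 40]},
    {"url": "https://example.org/c", "title": "C", "text": "third", "embedding": [90, -70, -15, -60]},
]
query = [0.9, -0.1, 0.4, -0.8]
results = search(query, rows, top_k=1, rescore_multiplier=2)
```

With the pre-change selection (`["title", "text", "embedding"]`), a lookup such as `rows[i]["url"]` would have no column to read from, which is the gap this commit closes.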