# DARE: Distribution-Aware Retrieval for R Functions
DARE (Distribution-Aware Retrieval Embedding) is a specialized bi-encoder model designed to retrieve statistical and data analysis tools (R functions) based on **both the user's query and the profile of the data**.
It is fine-tuned from `sentence-transformers/all-MiniLM-L6-v2` to serve as a high-precision tool retrieval module for Large Language Model (LLM) Agents in automated data science workflows.
- **Architecture:** Bi-encoder (Sentence Transformer)
- **Base Model:** `sentence-transformers/all-MiniLM-L6-v2` (22.7M parameters)
- **Task:** Dense Retrieval for Tool-Augmented LLMs
- **Performance:** State-of-the-art on R package retrieval tasks
- **Domain:** R programming language, Data Science, Statistical Analysis functions
- **Max Sequence Length:** 256 tokens
## 💡 Why DARE? (The Input Formatting)
Unlike traditional semantic search models that only take a natural language query, DARE is trained to be **distribution-conditional**. It expects a concatenated input of the user's intent AND the data profile (e.g., high-dimensional, sparse, categorical).
To get optimal retrieval results, **do not just pass the raw query**. Append the data constraints as a JSON-like string at the end of the query.
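As a concrete illustration, the concatenation can be done with a few lines of Python. The helper name and the profile keys below are hypothetical examples, not part of the model's API:

```python
import json

def build_dare_query(intent: str, profile: dict) -> str:
    """Append a JSON-style data profile to the user's raw intent."""
    return f"{intent} {json.dumps(profile)}"

# Hypothetical profile keys, for illustration only.
q = build_dare_query(
    "Fit a regression model robust to outliers",
    {"n_rows": 120, "n_cols": 4, "outliers": True, "target": "numeric"},
)
print(q)
```

Keeping the profile as a compact JSON tail lets the encoder condition on the data's shape and type without requiring a second input field.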
### Usage (Sentence-Transformers)