Update README.md
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
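
Dense retrievers such as this one are typically consumed downstream by embedding queries and documents and ranking documents by cosine similarity. As a general illustration only (the vectors below are toy values and the helper names are not part of the model's API; in practice the embeddings would come from the model), a minimal ranking sketch:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rank(query_vec, doc_vecs):
    # Indices of documents, best match first.
    return sorted(range(len(doc_vecs)),
                  key=lambda i: cosine(query_vec, doc_vecs[i]),
                  reverse=True)

# Toy 2-D "embeddings": doc 2 matches the query exactly.
query = [1.0, 0.0]
docs = [[0.0, 1.0], [0.9, 0.1], [1.0, 0.0]]
print(rank(query, docs))  # [2, 1, 0]
```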

---

## Performance

**INF-X-Retriever** achieves state-of-the-art results on the [BRIGHT Benchmark](https://brightbenchmark.github.io/) (as of Dec 20, 2025).

**BRIGHT** (a Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval) is a rigorous text retrieval benchmark designed to evaluate how well retrieval models handle questions that require intensive reasoning and cross-document synthesis. Its queries, collected from real-world sources such as StackExchange, competitive programming platforms, and mathematical competitions, span diverse domains including mathematics, coding, biology, economics, and robotics.
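
BRIGHT-style leaderboards conventionally report nDCG@10; assuming that is the metric behind the scores in the tables below, it can be sketched in a few lines (an illustration, not the official evaluation code):

```python
import math

def ndcg_at_k(ranked_gains, all_gold_gains, k=10):
    # ranked_gains: relevance gains of the retrieved docs, in ranked order.
    # all_gold_gains: gains of every relevant doc (any order).
    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))
    ideal = dcg(sorted(all_gold_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0

# A perfect ranking scores 1.0; demoting a relevant doc lowers the score.
print(ndcg_at_k([1, 1, 0], [1, 1]))  # 1.0
```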

### Short document

#### Overall & Category Performance

| Model | **Avg ALL** | **StackExchange** | **Coding** | **Theorem-based** |
|:---|:---:|:---:|:---:|:---:|
| **INF-X-Retriever** | **63.4** | **68.3** | **55.3** | **57.7** |
| DIVER (v3) | 46.8 | 51.8 | 39.9 | 39.7 |
| BGE-Reasoner-0928 | 46.4 | 52.0 | 35.3 | 40.7 |
| LATTICE | 42.1 | 51.6 | 26.9 | 30.0 |
| ReasonRank | 40.8 | 46.9 | 27.6 | 35.5 |
| XDR2 | 40.3 | 47.1 | 28.5 | 32.1 |

#### Detailed Results Across 12 Datasets

| Model | Avg | Bio. | Earth. | Econ. | Psy. | Rob. | Stack. | Sus. | Leet. | Pony | AoPS | TheoQ. | TheoT. |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **INF-X-Retriever** | **63.4** | **79.8** | **70.9** | **69.9** | **73.3** | **57.7** | **64.3** | **61.9** | **56.1** | **54.5** | **51.9** | **53.1** | **67.9** |
| DIVER (v3) | 46.8 | 66.0 | 63.7 | 42.4 | 55.0 | 40.6 | 44.7 | 50.4 | 32.5 | 47.3 | 17.2 | 46.4 | 55.6 |
| BGE-Reasoner-0928 | 46.4 | 68.5 | 66.4 | 40.6 | 53.1 | 43.2 | 44.1 | 47.8 | 29.0 | 41.6 | 17.2 | 46.5 | 58.4 |
| LATTICE | 42.1 | 64.4 | 62.4 | 45.4 | 57.4 | 47.6 | 37.6 | 46.4 | 19.9 | 34.0 | 12.0 | 30.1 | 47.8 |
| ReasonRank | 40.8 | 62.7 | 55.5 | 36.7 | 54.6 | 35.7 | 38.0 | 44.8 | 29.5 | 25.6 | 14.4 | 42.0 | 50.1 |
| XDR2 | 40.3 | 63.1 | 55.4 | 38.5 | 52.9 | 37.1 | 38.2 | 44.6 | 21.9 | 35.0 | 15.7 | 34.4 | 46.2 |

### Long document

#### Detailed Results Across 8 Datasets

| Model | Avg | Bio. | Earth. | Econ. | Pony | Psy. | Rob. | Stack. | Sus. |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **INF-X-Retriever** | **54.6** | **73.2** | **59.6** | **69.3** | **12.1** | **74.3** | **55.9** | **27.8** | **64.8** |
| inf-retriever-v1-pro | 30.5 | 44.1 | 42.2 | 31.4 | 0.4 | 43.1 | 20.8 | 21.4 | 41.0 |

---

## 🖊️ Citation

If you find this model useful, please consider citing our work:

## 📬 Contact

Email: [eason.yyc@inftech.ai](mailto:eason.yyc@inftech.ai)