Add metadata for library_name and pipeline_tag

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +30 -11
README.md CHANGED
@@ -1,31 +1,49 @@
1
  ---
2
  base_model:
3
  - Qwen/Qwen3-Embedding-4B
 
 
4
  ---
 
 
 
5
  AgentIR-4B is a retriever specialized for Deep Research agents. Unlike conventional retrievers that process queries with no awareness of the agent, AgentIR explicitly incorporates the agent's reasoning trace by jointly embedding it with the query, leveraging the rich intent and contextual information expressed in the agent's reasoning.
6
 
7
  When employed for end-to-end Deep Research on BrowseComp-Plus, AgentIR brings substantial effectiveness and efficiency gains for agents, improving agent accuracy while reducing the number of problem-solving iterations.
8
- - Paper: https://arxiv.org/abs/2603.04384
9
- - Code: https://github.com/texttron/AgentIR/tree/main
10
- - Data: https://huggingface.co/datasets/Tevatron/AgentIR-data
11
- - Project Page: https://texttron.github.io/AgentIR/
 
12
 
13
  ## Quick Usage
 
14
  For a quick start with only minimal dependencies (`torch` and `transformers`):
15
 
16
- ```py
17
  import torch
18
  from transformers import AutoModel, AutoTokenizer
19
 
20
  MODEL = "Tevatron/AgentIR-4B"
21
- PREFIX = "Instruct: Given a user's reasoning followed by a web search query, retrieve relevant passages that answer the query while incorporating the user's reasoning\nQuery:"
 
22
  QUERY = """Reasoning: Search results show some relevant info about music and Grammy. We need a composer who won a Grammy, could be from Sweden/Finland/Austria (joined 1995)? The person is known for a certain creation that is a subgenre known for euphoric finale. Which subgenre has a euphoric finale? "Progressive house"? There's a structure: Build-up, breakdown, climax, drop, euphoria. They started creating this piece in a small studio's backroom.
23
 
24
  Query: "backroom" "studio" "early 2010s" "euphoric"
25
  """
26
  DOCS = [
27
- "35+ Studios With Upcoming Games to Watch: Turtle Rock Studios\n\nMaking its name on the classic Left 4 Dead series of games, Turtle Rock Studios is working on an all-new co-op game called Back 4 Blood that sees you fighting through a zombie apocalypse. Sound familiar? Announced in early 2019 and being published",
28
- "name: Otto Knows\nimage_upright: 1.25\nbirth_name: Otto Jettman\nbirth_date: 6 05 1989\nbirth_place: Stockholm, Sweden\ngenre: Electro house, house, progressive house\noccupation: DJ, music producer, remixer\n\nOtto Jettman (born 6 May 1989), better known by his stage name Otto Knows is a Swedish DJ, producer and remixer who has had a number of hits in Sweden, Belgium and the Netherlands"
29
  ]
30
 
31
  def embed(texts, model, tokenizer, device, is_query=False):
@@ -52,14 +70,15 @@ for doc, vec in zip(DOCS, docs):
52
  print(f"{torch.dot(q, vec).item():.6f} {doc}")
53
  ```
54
 
55
- To run end-to-end Deep Research with AgentIR-4B, please see https://github.com/texttron/AgentIR/tree/main.
56
 
57
  ## Citation
58
- ```
 
59
  @article{chen2026AgentIR,
60
  title={AgentIR: Reasoning-Aware Retrieval for Deep Research Agents},
61
  author={Zijian Chen and Xueguang Ma and Shengyao Zhuang and Jimmy Lin and Akari Asai and Victor Zhong},
62
  year={2026},
63
  journal={arXiv preprint arXiv:2603.04384}
64
  }
65
- ```
 
1
  ---
2
  base_model:
3
  - Qwen/Qwen3-Embedding-4B
4
+ library_name: transformers
5
+ pipeline_tag: feature-extraction
6
  ---
7
+
8
+ # AgentIR-4B
9
+
10
  AgentIR-4B is a retriever specialized for Deep Research agents. Unlike conventional retrievers that process queries with no awareness of the agent, AgentIR explicitly incorporates the agent's reasoning trace by jointly embedding it with the query, leveraging the rich intent and contextual information expressed in the agent's reasoning.
11
 
12
  When employed for end-to-end Deep Research on BrowseComp-Plus, AgentIR brings substantial effectiveness and efficiency gains for agents, improving agent accuracy while reducing the number of problem-solving iterations.
13
+
14
+ - **Paper:** [AgentIR: Reasoning-Aware Retrieval for Deep Research Agents](https://arxiv.org/abs/2603.04384)
15
+ - **Code:** [https://github.com/texttron/AgentIR](https://github.com/texttron/AgentIR)
16
+ - **Data:** [Tevatron/AgentIR-data](https://huggingface.co/datasets/Tevatron/AgentIR-data)
17
+ - **Project Page:** [https://texttron.github.io/AgentIR/](https://texttron.github.io/AgentIR/)
18
 
19
  ## Quick Usage
20
+
21
  For a quick start with only minimal dependencies (`torch` and `transformers`):
22
 
23
+ ```python
24
  import torch
25
  from transformers import AutoModel, AutoTokenizer
26
 
27
  MODEL = "Tevatron/AgentIR-4B"
28
+ PREFIX = "Instruct: Given a user's reasoning followed by a web search query, retrieve relevant passages that answer the query while incorporating the user's reasoning\nQuery:"
30
  QUERY = """Reasoning: Search results show some relevant info about music and Grammy. We need a composer who won a Grammy, could be from Sweden/Finland/Austria (joined 1995)? The person is known for a certain creation that is a subgenre known for euphoric finale. Which subgenre has a euphoric finale? "Progressive house"? There's a structure: Build-up, breakdown, climax, drop, euphoria. They started creating this piece in a small studio's backroom.
31
 
32
  Query: "backroom" "studio" "early 2010s" "euphoric"
33
  """
34
  DOCS = [
35
+ "35+ Studios With Upcoming Games to Watch: Turtle Rock Studios\n\nMaking its name on the classic Left 4 Dead series of games, Turtle Rock Studios is working on an all-new co-op game called Back 4 Blood that sees you fighting through a zombie apocalypse. Sound familiar? Announced in early 2019 and being published",
+ "name: Otto Knows\nimage_upright: 1.25\nbirth_name: Otto Jettman\nbirth_date: 6 05 1989\nbirth_place: Stockholm, Sweden\ngenre: Electro house, house, progressive house\noccupation: DJ, music producer, remixer\n\nOtto Jettman (born 6 May 1989), better known by his stage name Otto Knows is a Swedish DJ, producer and remixer who has had a number of hits in Sweden, Belgium and the Netherlands"
47
  ]
48
 
49
  def embed(texts, model, tokenizer, device, is_query=False):
 
70
  print(f"{torch.dot(q, vec).item():.6f} {doc}")
71
  ```
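
The `QUERY` string in the snippet above places the agent's reasoning trace before the search query, in a `Reasoning: ... Query: ...` layout. A minimal sketch of building that string programmatically is shown below. The helper name `format_agent_query` is hypothetical and is not part of the AgentIR codebase, and the layout is inferred only from the example `QUERY`, so treat it as an assumption rather than the authors' exact preprocessing.

```python
def format_agent_query(reasoning: str, query: str) -> str:
    """Join a reasoning trace and a search query into one retriever input.

    Hypothetical helper: the 'Reasoning: ... / blank line / Query: ...' layout
    mirrors the example QUERY string in the README and is an assumption.
    """
    # Strip stray whitespace so the template controls all spacing.
    return f"Reasoning: {reasoning.strip()}\n\nQuery: {query.strip()}\n"


if __name__ == "__main__":
    text = format_agent_query(
        "We need a Grammy-winning composer known for a progressive house track.",
        '"backroom" "studio" "early 2010s" "euphoric"',
    )
    print(text)
```

The resulting string would then be passed to `embed(..., is_query=True)` in place of the hard-coded `QUERY`.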
72
 
73
+ To run end-to-end Deep Research with AgentIR-4B, please see [https://github.com/texttron/AgentIR/tree/main](https://github.com/texttron/AgentIR/tree/main).
74
 
75
  ## Citation
76
+
77
+ ```bibtex
78
  @article{chen2026AgentIR,
79
  title={AgentIR: Reasoning-Aware Retrieval for Deep Research Agents},
80
  author={Zijian Chen and Xueguang Ma and Shengyao Zhuang and Jimmy Lin and Akari Asai and Victor Zhong},
81
  year={2026},
82
  journal={arXiv preprint arXiv:2603.04384}
83
  }
84
+ ```