Add pipeline tag and library name metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +7 -5
README.md CHANGED
@@ -1,15 +1,17 @@
 ---
-license: apache-2.0
+base_model:
+- Qwen/Qwen3-4B
 language:
 - en
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 tags:
 - long-context
 - reinforcement-learning
 - reasoning
 - rubric-reward
 - qwen3
-base_model:
-- Qwen/Qwen3-4B
 ---
 
 # LongTraceRL-4B
@@ -19,7 +21,7 @@ base_model:
 
 ## Model Description
 
-**LongTraceRL-4B** is a 4-billion parameter reasoning model trained with reinforcement learning on long-context multi-hop QA tasks using trajectory-based tiered distractors and entity-level rubric rewards.
+**LongTraceRL-4B** is a 4-billion parameter reasoning model introduced in the paper [LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards](https://arxiv.org/abs/2605.31584). It is trained with reinforcement learning on long-context multi-hop QA tasks using trajectory-based tiered distractors and entity-level rubric rewards.
 
 ## Model Details
 
@@ -61,4 +63,4 @@ tokenizer = AutoTokenizer.from_pretrained("THU-KEG/LongTraceRL-4B")
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2605.31584},
 }
-```
+```