Improve model card: Add pipeline tag, library name, and relevant tags (#1)
(3cd0fdaa10cb6b4ce123561e54be797f13f6592b)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md CHANGED

```diff
@@ -1,19 +1,28 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-14B-Instruct
+language:
+- en
+license: apache-2.0
+pipeline_tag: text-generation
+library_name: transformers
+tags:
+- honesty-alignment
+- confidence-calibration
+- lora
+- peft
+- llm-alignment
 ---
+
 # Introduction
 
 This is the official repo of the paper [Annotation-Efficient Universal Honesty Alignment](https://arxiv.org/abs/2510.17509)
 
 This repository provides modules that extend **Qwen2.5-14B-Instruct** with the ability to generate accurate confidence scores *before* response generation, indicating how likely the model is to answer a given question correctly across tasks. We offer two types of modules—**LoRA + Linear Head** and **Linear Head**—along with model parameters under three training settings:
 
-1.
-2.
-3.
+1. **Elicitation (greedy):** Trained on all questions (over 560k) using self-consistency-based confidence annotations.
+2. **Calibration-Only (right):** Trained on questions with explicit correctness annotations.
+3. **EliCal (hybrid):** Initialized from the Elicitation model and further trained on correctness-labeled data.
 
 For both **Calibration-Only** and **EliCal** settings, we provide models trained with different amounts of annotated data (1k, 2k, 3k, 5k, 8k, 10k, 20k, 30k, 50k, 80k, 200k, 560k+). Since **LoRA + Linear Head** is the main configuration used in our paper, the following description is based on this setup.
 
@@ -131,4 +140,6 @@ base_model = AutoModel.from_pretrained(args.model_path)
 
 /mlp
 ...
-```
+```
+
+For more details, visit the [GitHub repository](https://github.com/Trustworthy-Information-Access/Annotation-Efficient-Universal-Honesty-Alignment).
```