Add library name and pipeline tag
This PR adds the `library_name` and `pipeline_tag` metadata, ensuring the "how to use" button appears and that people can find your model at https://huggingface.co/models?pipeline_tag=text-generation. It also adds the paper abstract and a link to the demo.
README.md
CHANGED
---
base_model:
- mistralai/Mistral-7B-v0.1
datasets:
- allenai/tulu-v2-sft-mixture
- weqweasdas/preference_dataset_mixture2_and_safe_pku
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
---

# TESS 2 RM

This model is the reward model used for reward guidance decoding.
It was finetuned from Mistral 7B v0.1: first instruction-tuned on the Tulu 2 SFT mixture, and then RM-trained on the preference dataset mixture found [here](https://huggingface.co/datasets/weqweasdas/preference_dataset_mixture2_and_safe_pku).
For more details, please check out our paper [TESS-2: A Large-Scale, Generalist Diffusion Language Model](https://arxiv.org/abs/2502.13917).

We introduce TESS 2, a general instruction-following diffusion language model that outperforms contemporary instruction-tuned diffusion models, as well as matches and sometimes exceeds strong autoregressive (AR) models. We train TESS 2 by first adapting a strong AR model via continued pretraining with the usual cross-entropy as diffusion loss, and then performing further instruction tuning. We find that adaptation training as well as the choice of the base model is crucial for training good instruction-following diffusion models. We further propose reward guidance, a novel and modular inference-time guidance procedure to align model outputs without needing to train the underlying model. Finally, we show that TESS 2 further improves with increased inference-time compute, highlighting the utility of diffusion LMs in having fine-grained controllability over the amount of compute used at inference time. Code and models are available at https://github.com/hamishivi/tess-2.

## Using this model

This model is intended to be used with the repository https://github.com/hamishivi/tess-2 for guiding diffusion LM generations. You can also try out the demo: https://huggingface.co/spaces/hamishivi/tess-2-demo.

To run this, first clone https://github.com/hamishivi/tess-2.
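Since the metadata sets `library_name: transformers`, the checkpoint should be loadable with the standard `transformers` API. Below is a minimal sketch of scoring a response with the RM; the repo id placeholder, the sequence-classification head, and the single-logit reward are all assumptions not confirmed by this card — the canonical reward-guidance code lives in https://github.com/hamishivi/tess-2.

```python
# Sketch: score a prompt/response pair with the TESS 2 RM via transformers.
# Assumptions (not confirmed by the card): the checkpoint loads as a
# sequence-classification model and emits a single scalar reward logit.
REPO_ID = "<tess-2-rm-checkpoint>"  # placeholder, not a confirmed model id


def score_response(prompt: str, response: str, repo_id: str = REPO_ID) -> float:
    """Return a scalar reward for `response` given `prompt`."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id, num_labels=1)
    model.eval()
    # Encode prompt and response together as a single paired input.
    inputs = tokenizer(prompt, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        reward = model(**inputs).logits  # shape: (1, 1)
    return reward.squeeze().item()
```

For actual reward-guided generation (rather than standalone scoring), follow the scripts in the tess-2 repository, which wire the RM into the diffusion decoding loop.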