---
library_name: transformers
license: llama3.2
datasets:
- ConicCat/Lamp-P-ImplicitPreference
base_model:
- meta-llama/Llama-3.2-3B-Instruct
---
|
|
|
|
|
# ConicCat/Lamp-P-Writing-Quality-RM |
|
|
|
|
|
This is a paragraph-level writing-quality Bradley-Terry reward model, trained on the Lamp-P dataset from "[AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation](https://arxiv.org/abs/2504.07532v1)".
|
|
|
|
|
On the validation set, this model achieves 100% accuracy and an eval loss of 0.0756.
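For context, a Bradley-Terry reward model is trained on preference pairs so that the preferred text scores higher than the rejected one. Below is a minimal sketch of the standard pairwise objective; this is the general form of the loss, not this model's exact training code.

```python
# Minimal sketch of the standard Bradley-Terry pairwise loss
# (general form of the objective, not this model's exact training code).
import torch
import torch.nn.functional as F

def bradley_terry_loss(chosen_rewards: torch.Tensor, rejected_rewards: torch.Tensor) -> torch.Tensor:
    # Maximize P(chosen > rejected) = sigmoid(r_chosen - r_rejected)
    # by minimizing -log sigmoid(r_chosen - r_rejected), averaged over the batch.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```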
|
|
|
|
|
For accurate scoring of long texts, I highly recommend chunking the input into paragraphs, scoring each paragraph, and averaging the scores; a sketch of this is shown after the inference example below.
|
|
|
|
|
Compared to ConicCat/Litbench-Creative-Writing-RM-3B, this model focuses on low-level writing skill.
|
|
|
|
|
|
|
|
## Inference
|
|
|
|
|
```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained(
    "ConicCat/Lamp-P-Writing-Quality-RM", torch_dtype=torch.bfloat16
).to("cuda:0")  # Move the model to the same device as the inputs below
tokenizer = AutoTokenizer.from_pretrained("ConicCat/Lamp-P-Writing-Quality-RM")

text = "Dummy text."  # Expects raw text input: no instructions, chat template, or formatting

tokenized_text = tokenizer(text, return_tensors="pt").to("cuda:0")
with torch.no_grad():
    print(model(**tokenized_text).logits[0][0].item())  # Reward score
```
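Following the chunk-and-average recommendation above, here is a minimal sketch of long-text scoring that reuses the `model` and `tokenizer` from the example. Splitting on blank lines is an assumption; adapt the paragraph splitting to your own text format.

```python
# Minimal sketch of the chunk-and-average scoring recommended above.
# Assumes `model` and `tokenizer` from the inference example are in scope.
import torch

def score_long_text(text: str) -> float:
    # Splitting on blank lines is an assumption; adapt to your text format.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    scores = []
    with torch.no_grad():
        for paragraph in paragraphs:
            inputs = tokenizer(paragraph, return_tensors="pt", truncation=True).to("cuda:0")
            scores.append(model(**inputs).logits[0][0].item())
    return sum(scores) / len(scores)
```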
|
|
|
|
|
|
|
|
|