---
library_name: transformers
datasets:
- HumanLLMs/Human-Like-DPO-Dataset
language:
- en
base_model:
- google-bert/bert-base-uncased
---

# BERT Human-like Reward Model

This is a human-likeness reward model fine-tuned from google-bert/bert-base-uncased on the HumanLLMs/Human-Like-DPO-Dataset. It scores a response by how human-like it reads, with higher scores indicating more human-like text.
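Reward models trained on chosen/rejected pairs like this dataset are commonly optimized with a pairwise (Bradley-Terry) loss that pushes the score of the chosen response above the rejected one. Whether this exact loss was used for this model is an assumption; the sketch below just illustrates the standard formulation.

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(chosen_scores: torch.Tensor,
                         rejected_scores: torch.Tensor) -> torch.Tensor:
    """Standard pairwise reward loss: -log sigmoid(r_chosen - r_rejected).

    The loss shrinks as the chosen response's score rises above the
    rejected one's. (Illustrative sketch, not the confirmed training code.)
    """
    return -F.logsigmoid(chosen_scores - rejected_scores).mean()

# Toy scalar scores for two chosen/rejected pairs (illustrative values).
chosen = torch.tensor([1.5, 0.8])
rejected = torch.tensor([0.2, -0.4])

loss = pairwise_reward_loss(chosen, rejected)
print(loss.item())
```

A wider margin between chosen and rejected scores drives the loss toward zero, which is what encourages the model to separate human-like from non-human-like responses.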

### Inference

```python
!pip install transformers accelerate

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "entfane/BERT_human_like_RM"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

messages = [
    "How are you doing? Great",
    "How are you doing? Greetings! I am doing just fine, may I ask you, how are you doing?",
]

# Pad to the longest sequence in the batch and truncate overly long inputs
inputs = tokenizer(messages, return_tensors="pt", padding=True, truncation=True).to(model.device)

with torch.no_grad():
    output = model(**inputs)

print(output.logits)
```
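To pick the most human-like response from a batch of candidates, the per-message logits can be compared directly. The logit values below are illustrative placeholders, not real model output; the shape assumes a sequence-classification head with a single label.

```python
import torch

# Hypothetical logits of shape (batch, 1) for the two messages above
# (values are illustrative only, not actual model output).
logits = torch.tensor([[0.12], [1.43]])

# Collapse to one scalar score per message and rank candidates,
# highest (most human-like under this model) first.
scores = logits.squeeze(-1).tolist()
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranked)  # -> [1, 0]
```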