# HAL-Qwen14B

Part of the collection **HAL: Inducing Human-likeness in LLMs with Alignment** (3 items).
HAL is an alignment framework that makes LLMs more human-like in conversation by optimizing for interpretable conversational traits derived from contrastive dialogue data. It uses a transparent reward signal to align models via preference optimization, improving perceived human-likeness without degrading core capabilities.
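The card does not spell out the exact preference-optimization objective. As a hedged illustration only, a DPO-style loss over one chosen/rejected dialogue pair looks like the sketch below; the function name and all log-probabilities are made up for the example, not taken from the HAL repo.

```python
import math

def dpo_loss(logp_chosen_policy, logp_rejected_policy,
             logp_chosen_ref, logp_rejected_ref, beta=0.1):
    """DPO-style preference loss: push the policy to prefer the
    'more human-like' dialogue over the less human-like one,
    relative to a frozen reference model."""
    margin = ((logp_chosen_policy - logp_chosen_ref)
              - (logp_rejected_policy - logp_rejected_ref))
    # -log sigmoid(beta * margin); smaller when the policy's
    # preference for the chosen dialogue grows.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Toy numbers: the policy already slightly prefers the chosen response.
loss = dpo_loss(-10.0, -12.0, -11.0, -11.5, beta=0.1)
```

The reward signal stays interpretable because the chosen/rejected pairs are built from contrastive dialogue data scored on conversational traits, rather than from an opaque learned reward model.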
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_path = "roc-hci/HAL-Qwen14B-8bit"

# Load the model weights in 8-bit precision via bitsandbytes
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True
)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,
    device_map="auto",
)
model.eval()

prompt = """Generate a full conversation of the following person at the doctor's visit.
Give them a unique personality based on their biography. Follow a linguistic style suitable for the person. Vary the statement lengths to make it natural.
Don't say anything else.
<input_data>
Biography: Elliot Meyer is a 25-year-old researcher based in Boston. Known for his deep, contemplative demeanor, Elliot was always that child who asked too many questi...
Medical condition: chronic tension headaches
Reason for clinical visit: Elliot is visiting his primary care physician to discuss worsening tension headaches that have started to interfere with his work and concen...
</input_data>
<dialogue_format>
P: ...
D: ...
P: ...
D: ...
</dialogue_format>
<dialogue>"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample a continuation of the dialogue
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.7,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
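The generated text follows the `P:`/`D:` turn format requested in the prompt. A minimal helper (the function name is mine, not part of the repo) to split such output into (speaker, utterance) pairs:

```python
def parse_dialogue(text):
    """Split 'P:'/'D:' formatted dialogue text into (speaker, utterance) pairs."""
    turns = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("P:") or line.startswith("D:"):
            speaker, _, utterance = line.partition(":")
            turns.append((speaker, utterance.strip()))
    return turns

sample = """P: Hi, I've been getting these headaches almost daily.
D: How long has this been going on?
P: A few months now."""
turns = parse_dialogue(sample)
```

This assumes one utterance per line; if the model wraps an utterance across lines, continuation lines would need to be joined onto the previous turn.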