|
|
--- |
|
|
library_name: transformers |
|
|
tags: |
|
|
- parenting |
|
|
- empathy |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- mistralai/Mistral-7B-Instruct-v0.3 |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
# Model Card for ParentPalAI
|
|
|
|
|
**ParentPalAI** fine-tunes `mistralai/Mistral-7B-Instruct-v0.3` using **Direct Preference Optimization (DPO)** combined with **Parameter-Efficient Fine-Tuning (PEFT)** via **Quantized Low-Rank Adaptation (QLoRA)**. |
|
|
The goal is to enhance **empathy and emotional resonance** in parenting-related conversations while studying the trade-offs between emotional alignment, clarity, and factual quality. |
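
For reference, DPO skips an explicit reward model and directly optimizes the policy on preference pairs $(x, y_w, y_l)$ using the standard objective from Rafailov et al. (2023):

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}}) = -\,\mathbb{E}_{(x, y_w, y_l)}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}\right)\right]
$$

where $y_w$ and $y_l$ are the preferred and dispreferred responses, $\pi_{\text{ref}}$ is the frozen base model, and $\beta$ controls how strongly the policy is kept close to the reference.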
|
|
|
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
**Goal:** Improve the empathy and emotional resonance of parenting-focused LLM responses while analyzing the impact of alignment techniques on overall quality. |
|
|
|
|
|
**Action:** Fine-tuned Mistral-7B-Instruct-v0.3 on ~1K synthetic preference pairs using Direct Preference Optimization (DPO) with Parameter-Efficient Fine-Tuning (PEFT) via Quantized Low-Rank Adaptation (QLoRA). Built a complete alignment workflow covering prompt engineering, preference-pair generation, QLoRA fine-tuning, and LLM-as-a-Judge (GPT-4o) evaluation with custom empathy and quality metrics.
|
|
|
|
|
**Result:** Drove a +65-point increase in empathy win rate (11% to 76%), revealing meaningful trade-offs between emotional alignment on one hand and clarity and overall quality on the other, which inform subsequent multi-objective fine-tuning strategies.
|
|
|
|
|
- **Developed by:** Prerna Chikersal |
|
|
- **Model type:** Causal language model (PEFT/QLoRA adapter)
|
|
- **Language(s) (NLP):** English |
|
|
- **License:** Apache 2.0 |
|
|
- **Finetuned from model:** Mistral-7B-Instruct-v0.3 |
|
|
|
|
|
### Model Sources |
|
|
|
|
|
- **Repository:** https://github.com/prernaa/ParentPalAI (includes sample responses) |
|
|
|
|
|
## Uses |
|
|
|
|
|
ParentPalAI was developed for research and educational purposes, primarily to explore:
|
|
|
|
|
- How fine-tuning on synthetic preference pairs affects empathy, tone, and relatability in LLM responses. |
|
|
- The trade-off between emotional resonance and clarity/helpfulness in aligned models. |
|
|
- Methods for enhancing warmth and naturalness in conversational AI through DPO and PEFT (QLoRA). |
|
|
|
|
|
Researchers, educators, and ML practitioners can use this model to: |
|
|
|
|
|
- Study fine-tuning effects on emotional style and alignment. |
|
|
- Prototype empathy-driven LLMs for social or psychological dialogue settings. |
|
|
|
|
|
### Direct Use |
|
|
|
|
|
You can use ParentPalAI to: |
|
|
|
|
|
- Generate empathetic, supportive, and warm responses to parenting-related prompts. |
|
|
- Experiment with style transfer and tone control in conversational AI. |
|
|
- Test LLM evaluation metrics (e.g., LLM-as-a-Judge) for empathy, tone, and clarity. |
|
|
|
|
|
Example: |
|
|
```python
# Assumes `model` and `tokenizer` are already loaded as shown in
# "How to Get Started with the Model" below.
prompt = "My toddler cries every night before bed. What should I do?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
|
|
|
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
This model is not suitable for: |
|
|
|
|
|
- Clinical, medical, or therapeutic advice. |
|
|
- Real-world parenting counseling or behavioral guidance. |
|
|
- Any deployment scenario involving high-stakes decision-making, mental health support, or childcare recommendations. |
|
|
- Content moderation, bias-free generation, or factual question answering: the training data may contain noisy or biased language.
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
The model should not be used for real parenting, psychological, or medical guidance. Instead, it serves as a research tool for exploring empathy and tone in language models, and all outputs should be reviewed critically before use. |
|
|
|
|
|
### Recommendations |
|
|
|
|
|
- Always pair this adapter with the base model mistralai/Mistral-7B-Instruct-v0.3. |
|
|
- Use bfloat16 precision and FlashAttention 2 on A100 or H100 GPUs for optimal speed. |
|
|
- Evaluate generations qualitatively for empathy, clarity, and factual accuracy before any downstream use. |
|
|
- For production or sensitive domains, fine-tune further using curated, high-quality data or Direct Preference Optimization (DPO) to balance warmth and helpfulness. |
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
This repository only contains PEFT adapter weights, not the full 7B model.
|
|
To use the model, you must load the base Mistral model and apply this adapter. |
|
|
|
|
|
- Base model: mistralai/Mistral-7B-Instruct-v0.3 |
|
|
- Fine-tuning method: QLoRA (PEFT) |
|
|
- Training data: synthetic preference pairs generated with GPT
|
|
- Goal: Explore how DPO optimization for empathy versus overall quality affects empathy and warmth in responses.
|
|
|
|
|
```python
# Load the base model in 4-bit precision with double quantization
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

torch.backends.cuda.matmul.allow_tf32 = True
torch.set_float32_matmul_precision("high")

BASE_MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"
HF_TOKEN = "hf_..."  # your Hugging Face access token (or use `huggingface-cli login`)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,               # loads base model in 4-bit precision
    bnb_4bit_use_double_quant=True,  # double quantization saves VRAM
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, token=HF_TOKEN)
if tokenizer.pad_token is None:      # Mistral has no pad token by default
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # FA2 is fastest on A100
    token=HF_TOKEN,
)

model.config.pad_token_id = tokenizer.pad_token_id
model.generation_config.pad_token_id = tokenizer.pad_token_id

# Load the ParentPalAI PEFT adapter
from peft import PeftModel
model = PeftModel.from_pretrained(model, "prernac1/parentpalai")

# Inference
prompt = "You're a supportive parent responding to another parent who is struggling with toddler tantrums."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,   # required for temperature/top_p sampling
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
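
Note that Mistral-Instruct checkpoints are usually prompted through the tokenizer's chat template rather than raw text. A sketch of the inference step using the same `model` and `tokenizer` as above:

```python
# Optional: wrap the prompt in Mistral's instruction format via the chat template.
messages = [{"role": "user", "content": "My toddler cries every night before bed. What should I do?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=300, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```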
|
|
|
|
|
## Training Details |
|
|
|
|
|
### Training Data |
|
|
|
|
|
- V1 (optimizing for empathy): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v1.jsonl

- V2 (optimizing for overall quality): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v2.jsonl
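
As a minimal sketch, such a file could be loaded with the `datasets` library, assuming the conventional DPO schema of `prompt`/`chosen`/`rejected` fields (the exact field names in the files above may differ):

```python
from datasets import load_dataset

# Hypothetical local copy of the V1 preference pairs; assumes the
# conventional DPO schema of prompt / chosen / rejected fields.
pairs = load_dataset(
    "json",
    data_files="dpo_dataset_dpo_labels_v1.jsonl",
    split="train",
)
print(pairs[0]["prompt"])    # the parenting question
print(pairs[0]["chosen"])    # preferred (more empathetic) response
print(pairs[0]["rejected"])  # dispreferred response
```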
|
|
|
|
|
### Training Procedure |
|
|
|
|
|
PEFT with QLoRA (4-bit precision) on an A100 GPU in Google Colab.
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
|
|
ParentPalAI was fine-tuned using Quantized Low-Rank Adaptation (QLoRA) on the base model `mistralai/Mistral-7B-Instruct-v0.3`, in 4-bit precision with double quantization (NF4) and bfloat16 compute for VRAM efficiency on T4 and A100 GPUs; this run was trained on an A100.
|
|
|
|
|
- Training method: QLoRA (Parameter-Efficient Fine-Tuning) |
|
|
|
|
|
- Precision: 4-bit quantization (NF4) with double quantization, compute in bfloat16 |
|
|
|
|
|
- Optimizer: paged_adamw_8bit |
|
|
|
|
|
- Scheduler: Cosine learning rate decay with 3% warmup |
|
|
|
|
|
- Batching: Effective batch size of 24 (per_device_train_batch_size=6, gradient_accumulation_steps=4)

- Epochs: 1-2 (best checkpoint after 1 epoch, ~40 steps)

- Dropout: 0.15 (LoRA)

- LoRA rank: 8 (r=8), scaling factor alpha=32

- Trainable parameters: ~0.18% of total model parameters

- Gradient checkpointing: Enabled

- Attention implementation: FlashAttention 2

- Mixed precision: bfloat16

- Base precision (non-quantized runs): bfloat16
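
As a reference point, here is a minimal sketch of how these hyperparameters could be wired together with TRL's `DPOTrainer`, assuming a recent TRL version and the `model`, `tokenizer`, and `pairs` objects from the snippets above; it is an illustrative reconstruction, not the exact training script:

```python
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

# LoRA adapter matching the settings above: r=8, alpha=32, dropout=0.15.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.15,
    task_type="CAUSAL_LM",
)

# Training arguments matching the listed hyperparameters.
dpo_args = DPOConfig(
    output_dir="parentpalai-dpo",    # placeholder
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,   # effective batch size 24
    num_train_epochs=1,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,               # 3% warmup
    bf16=True,
    gradient_checkpointing=True,
    beta=0.1,                        # DPO beta (not reported on this card; TRL default shown)
)

trainer = DPOTrainer(
    model=model,            # 4-bit quantized base model from above
    ref_model=None,         # with a PEFT config, TRL derives the frozen reference model
    args=dpo_args,
    train_dataset=pairs,    # preference pairs (prompt/chosen/rejected)
    processing_class=tokenizer,
    peft_config=lora_config,
)
trainer.train()
```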
|
|
|
|
|
## Evaluation |
|
|
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
|
|
#### Testing Data |
|
|
|
|
|
Here is the test dataset, generated with GPT-4o: https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_test.jsonl
|
|
|
|
|
#### Metrics |
|
|
ParentPalAI was evaluated using **GPT-4o as an LLM-as-a-Judge**, comparing its responses (System B) to the base model `mistralai/Mistral-7B-Instruct-v0.3` (System A). |
|
|
Each model pair was scored on six qualitative dimensions (*empathy, clarity, comprehensiveness, practicality, adoptability,* and *overall quality*) across 100 GPT-generated parenting prompts.
|
|
Two variants of ParentPalAI were tested to understand alignment trade-offs. |
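
A minimal sketch of what a pairwise judging call could look like with the OpenAI API; the rubric text below is a hypothetical stand-in, and the actual judging prompt and parsing live in the GitHub repository:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def judge_pair(question: str, response_a: str, response_b: str) -> str:
    """Ask GPT-4o to pick a per-dimension winner between two responses."""
    judge_prompt = (
        "You are judging two responses to a parenting question.\n\n"
        f"Question: {question}\n\n"
        f"Response A: {response_a}\n\n"
        f"Response B: {response_b}\n\n"
        "For each dimension (empathy, clarity, comprehensiveness, "
        "practicality, adoptability, overall), output `dimension: A` "
        "or `dimension: B`, one per line."
    )
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": judge_prompt}],
        temperature=0,  # deterministic judging
    )
    return completion.choices[0].message.content
```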
|
|
|
|
|
|
|
|
### Results |
|
|
|
|
|
#### **Version 1 (Empathy-Focused DPO)**

*(optimized for empathy while also considering overall quality)*
|
|
|
|
|
| System | winner_empathy | winner_clarity | winner_overall | |
|
|
|:-------|:---------------:|:---------------:|:---------------:| |
|
|
| **System A** | 0.1066 | **0.8883** | **0.7462** | |
|
|
| **System B (ParentPalAI V1)** | **0.7640** | 0.1117 | 0.2538 | |
|
|
|
|
|
**Findings:** |
|
|
- ParentPalAI V1 dramatically increased *empathy* (+65 points, from ~11% to ~76%).
|
|
- The model produced a noticeably warmer, more supportive tone, but with reduced *clarity* and *practical helpfulness*.
|
|
- Despite lower clarity, some responses were judged as more *relatable* and *emotionally resonant*, showing that empathic alignment can enhance perceived authenticity even when utility drops. |
|
|
|
|
|
#### **Version 2 (Overall-Quality-Focused DPO)**
|
|
*(optimized only for overall win rate)* |
|
|
|
|
|
| System | winner_empathy | winner_clarity | winner_overall | |
|
|
|:-------|:---------------:|:---------------:|:---------------:| |
|
|
| **System A** | 0.4340 | **0.8604** | **0.6371** | |
|
|
| **System B (ParentPalAI V2)** | 0.2843 | 0.1371 | 0.3629 | |
|
|
|
|
|
**Findings:** |
|
|
- Optimizing purely for *overall quality* partially recovered clarity and practicality but reduced empathic warmth (43% to 28%).
|
|
- The model balanced tone and coherence better than V1 but sounded less emotionally attuned. |
|
|
- This highlights a core alignment tension: maximizing clarity and factual strength can come at the expense of empathy and perceived connection. |
|
|
|
|
|
|
|
|
#### Summary |
|
|
- **V1**: Highest empathy and relatability, weaker clarity -> ideal for exploring affective alignment. |
|
|
- **V2**: More balanced but emotionally flatter -> better for generalized instruction following. |
|
|
- Empathy and clarity appear inversely correlated when optimizing single-objective DPO. |
|
|
- Future work will explore **multi-objective DPO** and **reinforcement from human preferences** to jointly optimize warmth, clarity, and factual helpfulness. |
|
|
|
|
|
## Environmental Impact |
|
|
|
|
|
|
|
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
|
|
|
- **Hardware Type:** A100 |
|
|
- **Hours used:** 5 |
|
|
- **Cloud Provider:** Google Colab
|
|
- **Compute Region:** USA |
|
|
- **Carbon Emitted:** [More Information Needed] |
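
For future runs, emissions could also be measured directly during training with a tracker such as `codecarbon`; this was not used for the run above and is shown only as a sketch:

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="parentpalai-dpo")
tracker.start()
try:
    trainer.train()  # the DPO training run from above
finally:
    emissions_kg = tracker.stop()  # estimated kg CO2eq
    print(f"Estimated emissions: {emissions_kg:.4f} kg CO2eq")
```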
|
|
|
|
|
## Citation
|
|
|
|
|
```bibtex
@misc{chikersal2025parentpalai,
  author = {Prerna Chikersal},
  title = {ParentPalAI: Empathic Fine-Tuning of LLMs using Direct Preference Optimization (DPO) with QLoRA},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/prernaa/ParentPalAI}},
  note = {Hugging Face Model: https://huggingface.co/prernac1/parentpalai}
}
```
|
|
|
|
|
## Model Card Contact |
|
|
|
|
|
Prerna Chikersal: pchikersal@gmail.com |