---
library_name: transformers
tags:
- parenting
- empathy
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: text-generation
---
# Model Card for ParentPalAI
**ParentPalAI** fine-tunes `mistralai/Mistral-7B-Instruct-v0.3` using **Direct Preference Optimization (DPO)** combined with **Parameter-Efficient Fine-Tuning (PEFT)** via **Quantized Low-Rank Adaptation (QLoRA)**.
The goal is to enhance **empathy and emotional resonance** in parenting-related conversations while studying the trade-offs between emotional alignment, clarity, and factual quality.
## Model Details
### Model Description
**Goal:** Improve the empathy and emotional resonance of parenting-focused LLM responses while analyzing the impact of alignment techniques on overall quality.
**Action:** Fine-tuned Mistral-7B-Instruct on ~1K synthetic preference pairs using Direct Preference Optimization (DPO) with Parameter-Efficient Fine-Tuning (PEFT), specifically Quantized Low-Rank Adaptation (QLoRA). Built a complete alignment workflow covering prompt engineering, preference-pair generation, QLoRA fine-tuning, and LLM-as-a-Judge (GPT-4o) evaluation with custom empathy and quality metrics.
**Result:** Drove a +65-point increase in empathy win rate (11% to 76%), revealing meaningful trade-offs between emotional alignment on one side and clarity and overall quality on the other, which informs subsequent multi-objective fine-tuning strategies.
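For reference, DPO optimizes the standard preference objective from Rafailov et al. (2023) over triples of a prompt $x$, a preferred response $y_w$, and a dispreferred response $y_l$:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x, y_w, y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where $\pi_{\mathrm{ref}}$ is the frozen base model and $\beta$ controls how far the fine-tuned policy may drift from it.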
- **Developed by:** Prerna Chikersal
- **Model type:** PEFT (QLoRA) adapter for a causal language model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Mistral-7B-Instruct-v0.3
### Model Sources
- **Repository:** https://github.com/prernaa/ParentPalAI (includes sample responses)
## Uses
ParentPalAI was developed for research and educational purposes, primarily to explore:
- How fine-tuning on synthetic preference pairs affects empathy, tone, and relatability in LLM responses.
- The trade-off between emotional resonance and clarity/helpfulness in aligned models.
- Methods for enhancing warmth and naturalness in conversational AI through DPO and PEFT (QLoRA).
Researchers, educators, and ML practitioners can use this model to:
- Study fine-tuning effects on emotional style and alignment.
- Prototype empathy-driven LLMs for social or psychological dialogue settings.
### Direct Use
You can use ParentPalAI to:
- Generate empathetic, supportive, and warm responses to parenting-related prompts.
- Experiment with style transfer and tone control in conversational AI.
- Test LLM evaluation metrics (e.g., LLM-as-a-Judge) for empathy, tone, and clarity.
Example (assumes `model` and `tokenizer` are loaded as shown in "How to Get Started with the Model" below):
```python
prompt = "My toddler cries every night before bed. What should I do?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Out-of-Scope Use
This model is not suitable for:
- Clinical, medical, or therapeutic advice.
- Real-world parenting counseling or behavioral guidance.
- Any deployment scenario involving high-stakes decision-making, mental health support, or childcare recommendations.
- Content moderation, bias-free generation, or factual question answering: the training data may contain noisy or biased language.
## Bias, Risks, and Limitations
The model should not be used for real parenting, psychological, or medical guidance. Instead, it serves as a research tool for exploring empathy and tone in language models, and all outputs should be reviewed critically before use.
### Recommendations
- Always pair this adapter with the base model mistralai/Mistral-7B-Instruct-v0.3.
- Use bfloat16 precision and FlashAttention 2 on A100 or H100 GPUs for optimal speed.
- Evaluate generations qualitatively for empathy, clarity, and factual accuracy before any downstream use.
- For production or sensitive domains, fine-tune further using curated, high-quality data or additional DPO rounds to balance warmth and helpfulness.
## How to Get Started with the Model
This repository contains only the PEFT adapter weights, not the full 7B model.
To use the model, load the base Mistral model and apply this adapter.
- Base model: mistralai/Mistral-7B-Instruct-v0.3
- Fine-tuning method: QLoRA (PEFT)
- Training data: synthetic preference pairs generated with GPT
- Goal: explore how DPO optimization, for empathy (V1) and for overall quality (V2), affects empathy and warmth in responses.
```python
# LOAD THE BASE MODEL IN 4-BIT PRECISION WITH DOUBLE QUANTIZATION
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

torch.backends.cuda.matmul.allow_tf32 = True
torch.set_float32_matmul_precision("high")

BASE_MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"
HF_TOKEN = "hf_..."  # your Hugging Face access token (the Mistral weights are gated)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, token=HF_TOKEN)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Mistral has no dedicated pad token

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,               # loads base model in 4-bit precision
    bnb_4bit_use_double_quant=True,  # double quantization saves VRAM
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation="flash_attention_2",  # FA2 is fastest on A100/H100
    token=HF_TOKEN,
)
model.config.pad_token_id = tokenizer.pad_token_id
model.generation_config.pad_token_id = tokenizer.pad_token_id

# LOAD THE PARENTPALAI PEFT ADAPTER ON TOP OF THE BASE MODEL
from peft import PeftModel
model = PeftModel.from_pretrained(model, "prernac1/parentpalai")

# INFERENCE
prompt = "You're a supportive parent responding to another parent who is struggling with toddler tantrums."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,  # temperature and top_p only apply when sampling is enabled
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
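Optionally, if you want to serve the model without the `peft` dependency, the adapter can be merged into the base weights. A minimal sketch, assuming the base model is reloaded in full bfloat16 precision (merging into 4-bit quantized weights is not supported); the output directory name is illustrative:
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Reload the base model unquantized, apply the adapter, and fold it into the weights.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3", torch_dtype=torch.bfloat16
)
merged = PeftModel.from_pretrained(base, "prernac1/parentpalai").merge_and_unload()
merged.save_pretrained("parentpalai-merged")  # standalone checkpoint, no adapter required
```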
## Training Details
### Training Data
- V1 (optimized for empathy): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v1.jsonl
- V2 (optimized for overall quality): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v2.jsonl
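Each line of these files is one preference pair. A hypothetical record, assuming the standard DPO schema of `prompt`, `chosen`, and `rejected` fields (check the linked files for the exact field names):
```json
{
  "prompt": "My toddler cries every night before bed. What should I do?",
  "chosen": "That sounds exhausting, and it's completely understandable to feel worn down. Many toddlers struggle at bedtime...",
  "rejected": "Establish a consistent bedtime routine and enforce it firmly."
}
```
DPO then increases the model's preference margin for the `chosen` response over the `rejected` one.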
### Training Procedure
PEFT with QLoRA (4-bit precision) on an A100 GPU in Google Colab.
#### Training Hyperparameters
- ParentPalAI was fine-tuned with Quantized Low-Rank Adaptation (QLoRA) on the base model mistralai/Mistral-7B-Instruct-v0.3, in 4-bit precision with double quantization (NF4) and bfloat16 compute, optimized for VRAM efficiency on T4 and A100 GPUs; training ran on an A100.
- Training method: QLoRA (Parameter-Efficient Fine-Tuning)
- Precision: 4-bit quantization (NF4) with double quantization, compute in bfloat16
- Optimizer: paged_adamw_8bit
- Scheduler: cosine learning-rate decay with 3% warmup
- Batching: effective batch size of 24 (per_device_train_batch_size=6, gradient_accumulation_steps=4)
- Epochs: 1–2 (best checkpoint after 1 epoch, ~40 steps)
- LoRA dropout: 0.15
- LoRA rank: 8 (r=8), scaling factor alpha=32
- Trainable parameters: ~0.18% of total model parameters
- Gradient checkpointing: enabled
- Attention implementation: FlashAttention 2
- Mixed precision: bfloat16
- Base precision (non-quantized runs): bfloat16
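A minimal sketch of how these hyperparameters map onto a TRL DPO run; it assumes a recent `trl` release (with `DPOConfig`), reuses the 4-bit `model` and `tokenizer` from the quickstart above, and the `beta` value and local data path are assumptions, not reported values:
```python
# Hypothetical reconstruction of the DPO training setup described above.
from datasets import load_dataset
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

# Preference pairs in prompt/chosen/rejected format (local copy of the V1 file).
dataset = load_dataset("json", data_files="dpo_dataset_dpo_labels_v1.jsonl", split="train")

peft_config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.15, task_type="CAUSAL_LM")

args = DPOConfig(
    output_dir="parentpalai-dpo",
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,   # effective batch size 24
    num_train_epochs=1,              # best checkpoint after 1 epoch (~40 steps)
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,               # 3% warmup
    optim="paged_adamw_8bit",
    gradient_checkpointing=True,
    bf16=True,
    beta=0.1,                        # DPO KL penalty; assumed, not reported in the card
)

trainer = DPOTrainer(
    model=model,                     # 4-bit base model from the quickstart above
    ref_model=None,                  # with PEFT, the frozen base serves as the reference
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,      # `tokenizer=` in older trl versions
    peft_config=peft_config,
)
trainer.train()
```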
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
Here is the test dataset generated by GPT-4o: https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_test.jsonl
#### Metrics
ParentPalAI was evaluated using **GPT-4o as an LLM-as-a-Judge**, comparing its responses (System B) to the base model `mistralai/Mistral-7B-Instruct-v0.3` (System A).
Each model pair was scored on six qualitative dimensions (*empathy, clarity, comprehensiveness, practicality, adoptability,* and *overall quality*) across 100 GPT-generated parenting prompts.
Two variants of ParentPalAI were tested to understand alignment trade-offs.
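A minimal sketch of the pairwise judging step, assuming the OpenAI Python SDK (v1+); the prompt wording and the `judge` helper are illustrative, not the exact evaluation script:
```python
# Hypothetical LLM-as-a-Judge comparison: base model (System A) vs. ParentPalAI (System B).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are judging two replies to a parenting question.

Question: {question}

System A: {answer_a}

System B: {answer_b}

For each dimension (empathy, clarity, comprehensiveness, practicality,
adoptability, overall), reply with the winner ("A" or "B") as a JSON object,
e.g. {{"winner_empathy": "A", ...}}."""

def judge(question: str, answer_a: str, answer_b: str) -> str:
    """Return GPT-4o's per-dimension verdicts for one test prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # deterministic verdicts
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, answer_a=answer_a, answer_b=answer_b)}],
    )
    return response.choices[0].message.content  # parse the JSON verdicts downstream
```
The win rates in the tables below are the fraction of the 100 test prompts on which each system wins a given dimension.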
### Results
### **Version 1 (Empathy-Focused DPO)**
*(optimized for empathy but considers overall quality)*
| System | Empathy win rate | Clarity win rate | Overall win rate |
|:-------|:----------------:|:----------------:|:----------------:|
| **System A (base model)** | 0.1066 | **0.8883** | **0.7462** |
| **System B (ParentPalAI V1)** | **0.7640** | 0.1117 | 0.2538 |
**Findings:**
- ParentPalAI V1 dramatically increased *empathy* (+65 points, from ~11% → 76%).
- The model produced noticeably warmer, more supportive tone but with reduced *clarity* and *practical helpfulness*.
- Despite lower clarity, some responses were judged as more *relatable* and *emotionally resonant*, showing that empathic alignment can enhance perceived authenticity even when utility drops.
### **Version 2 (Overall-Quality-Focused DPO)**
*(optimized only for overall win rate)*
| System | Empathy win rate | Clarity win rate | Overall win rate |
|:-------|:----------------:|:----------------:|:----------------:|
| **System A (base model)** | 0.4340 | **0.8604** | **0.6371** |
| **System B (ParentPalAI V2)** | 0.2843 | 0.1371 | 0.3629 |
**Findings:**
- Optimizing purely for *overall quality* partially recovered clarity and practicality but reduced empathic warmth (43% → 28%).
- The model balanced tone and coherence better than V1 but sounded less emotionally attuned.
- This highlights a core alignment tension: maximizing clarity and factual strength can come at the expense of empathy and perceived connection.
#### Summary
- **V1**: Highest empathy and relatability, weaker clarity → ideal for exploring affective alignment.
- **V2**: More balanced but emotionally flatter → better for generalized instruction following.
- Empathy and clarity appear inversely correlated when optimizing single-objective DPO.
- Future work will explore **multi-objective DPO** and **reinforcement from human preferences** to jointly optimize warmth, clarity, and factual helpfulness.
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** A100
- **Hours used:** 5
- **Cloud Provider:** Google Colab
- **Compute Region:** USA
- **Carbon Emitted:** [More Information Needed]
## Citation
```bibtex
@misc{chikersal2025parentpalai,
author = {Prerna Chikersal},
title = {ParentPalAI: Empathic Fine-Tuning of LLMs using Direct Preference Optimization (DPO) with QLoRA},
year = {2025},
publisher = {GitHub},
howpublished = {\url{https://github.com/prernaa/ParentPalAI}},
note = {Hugging Face Model: https://huggingface.co/prernac1/parentpalai}
}
```
## Model Card Contact
Prerna Chikersal: pchikersal@gmail.com