---
library_name: transformers
tags:
- parenting
- empathy
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: text-generation
---

# Model Card for ParentPalAI

**ParentPalAI** fine-tunes `mistralai/Mistral-7B-Instruct-v0.3` using **Direct Preference Optimization (DPO)** combined with **Parameter-Efficient Fine-Tuning (PEFT)** via **Quantized Low-Rank Adaptation (QLoRA)**.  
The goal is to enhance **empathy and emotional resonance** in parenting-related conversations while studying the trade-offs between emotional alignment, clarity, and factual quality.


## Model Details

### Model Description

**Goal:** Improve the empathy and emotional resonance of parenting-focused LLM responses while analyzing the impact of alignment techniques on overall quality.

**Action:** Fine-tuned Mistral-7B-Instruct on ~1K synthetic preference pairs using Direct Preference Optimization (DPO) with Parameter-Efficient Fine-Tuning (PEFT), specifically Quantized Low-Rank Adaptation (QLoRA). Built a complete alignment workflow covering prompt engineering, preference-pair generation, QLoRA fine-tuning, and LLM-as-a-Judge (GPT-4o) evaluation with custom empathy and quality metrics.

**Result:** Raised the empathy win rate by about 65 points (11% to 76%) while lowering the clarity and overall-quality win rates, a trade-off that informs subsequent multi-objective fine-tuning strategies.

- **Developed by:** Prerna Chikersal
- **Model type:** PEFT adapter (QLoRA) for text generation
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Mistral-7B-Instruct-v0.3

### Model Sources

- **Repository:** https://github.com/prernaa/ParentPalAI (includes sample responses)

## Uses

ParentPalAI was developed for research and educational purposes — primarily to explore:

- How fine-tuning on synthetic preference pairs affects empathy, tone, and relatability in LLM responses.
- The trade-off between emotional resonance and clarity/helpfulness in aligned models.
- Methods for enhancing warmth and naturalness in conversational AI through DPO and PEFT (QLoRA).

Researchers, educators, and ML practitioners can use this model to:

- Study fine-tuning effects on emotional style and alignment.
- Prototype empathy-driven LLMs for social or psychological dialogue settings.

### Direct Use

You can use ParentPalAI to:

- Generate empathetic, supportive, and warm responses to parenting-related prompts.
- Experiment with style transfer and tone control in conversational AI.
- Test LLM evaluation metrics (e.g., LLM-as-a-Judge) for empathy, tone, and clarity.

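Example (assumes `model` and `tokenizer` are already loaded as described in "How to Get Started with the Model"):
```python
prompt = "My toddler cries every night before bed. What should I do?"
# Tokenize the prompt and move the tensors to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```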


### Out-of-Scope Use

This model is not suitable for:

- Clinical, medical, or therapeutic advice.
- Real-world parenting counseling or behavioral guidance.
- Any deployment scenario involving high-stakes decision-making, mental health support, or childcare recommendations.
- Content moderation, bias-free generation, or factual question answering — the Reddit dataset may contain noisy or biased language.

## Bias, Risks, and Limitations

The model should not be used for real parenting, psychological, or medical guidance. Instead, it serves as a research tool for exploring empathy and tone in language models, and all outputs should be reviewed critically before use.

### Recommendations

- Always pair this adapter with the base model mistralai/Mistral-7B-Instruct-v0.3.
- Use bfloat16 precision and FlashAttention 2 on A100 or H100 GPUs for optimal speed.
- Evaluate generations qualitatively for empathy, clarity, and factual accuracy before any downstream use.
- For production or sensitive domains, fine-tune further using curated, high-quality data or Direct Preference Optimization (DPO) to balance warmth and helpfulness.
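
FlashAttention 2 depends on the separate `flash-attn` package, so a loading script may want a fallback when it is not installed. The sketch below picks a value for `attn_implementation` based on whether `flash_attn` is importable. The helper name `pick_attn_implementation` is hypothetical, and falling back to PyTorch's `sdpa` attention is an assumption, not a configuration this card evaluated.

```python
import importlib.util


def pick_attn_implementation() -> str:
    """Return "flash_attention_2" if flash_attn is installed, otherwise "sdpa"."""
    # find_spec returns None when the top-level package cannot be found
    has_flash_attn = importlib.util.find_spec("flash_attn") is not None
    return "flash_attention_2" if has_flash_attn else "sdpa"


attn_implementation = pick_attn_implementation()
print(attn_implementation)
```

The returned string can be passed as `attn_implementation=...` to `AutoModelForCausalLM.from_pretrained`.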

## How to Get Started with the Model

This repository only contains PEFT adapter weights — not the full 7B model.
To use the model, you must load the base Mistral model and apply this adapter.

- Base model: mistralai/Mistral-7B-Instruct-v0.3
- Fine-tuning method: QLoRA (PEFT)
- Training data: synthetic preference pairs generated with GPT
- Goal: Explore how DPO, optimized either for empathy or for overall quality, affects empathy and warmth in responses.

```python
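# LOAD THE BASE MODEL IN 4-BIT PRECISION WITH DOUBLE QUANTIZATION

import os

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, AutoTokenizer

BASE_MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"
# Hugging Face access token; reading it from the HF_TOKEN environment
# variable is an assumption, adapt this to however you store credentials.
HF_TOKEN = os.environ.get("HF_TOKEN")

torch.backends.cuda.matmul.allow_tf32 = True
torch.set_float32_matmul_precision("high")

# Load the tokenizer before the model so its pad token can be copied below
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, token=HF_TOKEN)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # fall back to EOS for padding

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, # loads base model in 4-bit precision
    bnb_4bit_use_double_quant=True, # double quantization saves VRAM
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    dtype=torch.bfloat16, # older transformers releases call this torch_dtype
    attn_implementation="flash_attention_2", # FA2 is fastest on A100
    token=HF_TOKEN # login to hugging face
)

model.config.pad_token_id = tokenizer.pad_token_id
model.generation_config.pad_token_id = tokenizer.pad_token_id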

## Load the ParentPalAI PEFT Model
from peft import PeftModel
model = PeftModel.from_pretrained(model, "prernac1/parentpalai")

## Inference
prompt = """You’re a supportive parent responding to another parent who is struggling with toddler tantrums."""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True, # sampling must be on for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

```

## Training Details

### Training Data

- V1 (optimizing for empathy): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v1.jsonl
- V2 (optimizing for overall quality): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v2.jsonl
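
Both files are JSON Lines, so each line should parse as one preference record. The following is a minimal loading sketch, not the project's own loader. It assumes the conventional DPO field names `prompt`, `chosen`, and `rejected`, which should be checked against the actual files, and the sample record is invented for illustration.

```python
import json

# Assumed field names (the usual DPO convention); verify against the real files.
REQUIRED_KEYS = {"prompt", "chosen", "rejected"}


def load_preference_pairs(lines):
    """Parse JSONL lines into dicts, skipping blanks and checking required keys."""
    pairs = []
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"line {line_number} is missing {sorted(missing)}")
        pairs.append(record)
    return pairs


# Invented sample record, for illustration only.
sample_lines = [
    json.dumps({
        "prompt": "My toddler cries every night before bed. What should I do?",
        "chosen": "That sounds exhausting. A calm, predictable routine may help.",
        "rejected": "Establish a routine.",
    }),
    "",
]
pairs = load_preference_pairs(sample_lines)
print(len(pairs), sorted(pairs[0]))
```

For a downloaded file, the same function accepts an open file handle, for example `load_preference_pairs(open(path, encoding="utf-8"))`.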

### Training Procedure

DPO with QLoRA (PEFT, 4-bit precision) on an A100 GPU in Google Colab.
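
For readers unfamiliar with the objective, the per-pair DPO loss can be sketched in plain Python. This illustrates the standard formulation and is not the training code used here. The function name is hypothetical and `beta=0.1` is an assumed value, since this card does not report the beta that was used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin difference))."""
    # How much more the policy likes each response than the frozen reference does
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) = log(1 + exp(-x)), written in a numerically stable form
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Policy identical to the reference: loss equals log(2)
print(dpo_loss(-3.0, -3.0, -3.0, -3.0))
# Policy favors the chosen response more than the reference does: lower loss
print(dpo_loss(-1.0, -5.0, -3.0, -3.0))
```

The loss falls as the policy raises the log-probability of the chosen response relative to the rejected one, measured against the reference model.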

#### Training Hyperparameters

ParentPalAI was fine-tuned using Quantized Low-Rank Adaptation (QLoRA) on the base model `mistralai/Mistral-7B-Instruct-v0.3`. The base weights were loaded in 4-bit NF4 precision with double quantization and bfloat16 compute, a setup chosen for VRAM efficiency on T4 and A100 GPUs. The released model was trained on an A100.

- Training method: QLoRA (Parameter-Efficient Fine-Tuning)
- Precision: 4-bit quantization (NF4) with double quantization, compute in bfloat16
- Optimizer: paged_adamw_8bit
- Scheduler: Cosine learning rate decay with 3% warmup
- Batching: Effective batch size of 24 (per_device_train_batch_size=6, gradient_accumulation_steps=4)
- Epochs: 1–2 (best checkpoint after 1 epoch, ~40 steps)
- Dropout: 0.15 (LoRA)
- LoRA rank: 8 (r=8), scaling factor alpha=32
- Trainable parameters: ~0.18% of total model parameters
- Gradient checkpointing: Enabled
- Attention implementation: FlashAttention 2
- Mixed precision: bfloat16 mixed precision
- Base precision (non-quantized runs): bfloat16
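
The batching and LoRA numbers above are linked by simple arithmetic, sketched below. The single-GPU setting and the round figure of 1,000 training pairs are assumptions (the card only says "~1K" pairs), so the step count is approximate.

```python
import math

per_device_train_batch_size = 6
gradient_accumulation_steps = 4
num_devices = 1            # assumption: one A100
num_training_pairs = 1000  # assumption: the card reports "~1K" pairs

# Examples consumed per optimizer step
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
# Approximate optimizer steps needed for one pass over the data
steps_per_epoch = math.ceil(num_training_pairs / effective_batch_size)

lora_r, lora_alpha = 8, 32
# Standard LoRA multiplies the low-rank update by alpha / r
lora_scaling = lora_alpha / lora_r

print(effective_batch_size, steps_per_epoch, lora_scaling)
```

Under these assumptions one epoch takes roughly 42 optimizer steps, in line with the "~40 steps" reported for the best checkpoint.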

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The test dataset, generated by GPT-4o: https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_test.jsonl

#### Metrics
ParentPalAI was evaluated using **GPT-4o as an LLM-as-a-Judge**, comparing its responses (System B) to the base model `mistralai/Mistral-7B-Instruct-v0.3` (System A).  
Each pair of responses was scored on six qualitative dimensions (*empathy, clarity, comprehensiveness, practicality, adoptability,* and *overall quality*) across 100 GPT-generated parenting prompts.  
Two variants of ParentPalAI were tested to understand alignment trade-offs.
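
A win rate for one dimension is the fraction of comparisons in which the judge picked a given system. The sketch below shows that aggregation under stated assumptions: the verdict labels `"A"`, `"B"`, and `"tie"` are hypothetical, as is the existence of a tie option, and the verdict list is made up for illustration.

```python
from collections import Counter


def win_rates(verdicts):
    """Return the share of comparisons won by A, won by B, and tied."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    counts = Counter(verdicts)
    total = len(verdicts)
    return {label: counts[label] / total for label in ("A", "B", "tie")}


# Made-up empathy verdicts for ten comparisons (not real evaluation data)
empathy_verdicts = ["B", "B", "A", "tie", "B", "B", "A", "B", "tie", "B"]
rates = win_rates(empathy_verdicts)
print(rates)
```

If ties are allowed, the two systems' win rates for a dimension need not sum to 1.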


### Results

#### **Version 1 (Empathy-Focused DPO)**  
*(optimized for empathy, with overall quality also considered)*  

| System | winner_empathy | winner_clarity | winner_overall |
|:-------|:---------------:|:---------------:|:---------------:|
| **System A** | 0.1066 | **0.8883** | **0.7462** |
| **System B (ParentPalAI V1)** | **0.7640** | 0.1117 | 0.2538 |

**Findings:**  
- ParentPalAI V1 dramatically increased *empathy* (+65 points, from ~11% to 76%).  
- The model produced a noticeably warmer, more supportive tone but with reduced *clarity* and *practical helpfulness*.  
- Despite lower clarity, some responses were judged as more *relatable* and *emotionally resonant*, showing that empathic alignment can enhance perceived authenticity even when utility drops.

#### **Version 2 (Overall-Quality-Focused DPO)**  
*(optimized only for overall win rate)*  

| System | winner_empathy | winner_clarity | winner_overall |
|:-------|:---------------:|:---------------:|:---------------:|
| **System A** | 0.4340 | **0.8604** | **0.6371** |
| **System B (ParentPalAI V2)** | 0.2843 | 0.1371 | 0.3629 |

**Findings:**  
- Optimizing purely for *overall quality* partially recovered clarity and practicality but reduced empathic warmth: V2 won on empathy in only 28% of comparisons, versus 43% for the base model.  
- The model balanced tone and coherence better than V1 but sounded less emotionally attuned.  
- This highlights a core alignment tension: maximizing clarity and factual strength can come at the expense of empathy and perceived connection.


#### Summary
- **V1**: Highest empathy and relatability, weaker clarity -> ideal for exploring affective alignment.  
- **V2**: More balanced but emotionally flatter -> better for generalized instruction following.  
- Empathy and clarity appear to trade off under single-objective DPO.  
- Future work will explore **multi-objective DPO** and **reinforcement learning from human preferences** to jointly optimize warmth, clarity, and factual helpfulness.
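
To make the trade-offs concrete, the gaps between ParentPalAI and the base model can be computed directly from the two results tables. The snippet below does only that arithmetic; the helper name `gap_in_points` is hypothetical.

```python
# Win rates copied from the results tables as (System A = base model, System B = ParentPalAI)
results = {
    "V1": {"empathy": (0.1066, 0.7640), "clarity": (0.8883, 0.1117), "overall": (0.7462, 0.2538)},
    "V2": {"empathy": (0.4340, 0.2843), "clarity": (0.8604, 0.1371), "overall": (0.6371, 0.3629)},
}


def gap_in_points(base_rate, tuned_rate):
    """Percentage-point gap between the tuned model and the base model."""
    return round((tuned_rate - base_rate) * 100, 2)


gaps = {
    version: {metric: gap_in_points(*rates) for metric, rates in metrics.items()}
    for version, metrics in results.items()
}
for version, metric_gaps in gaps.items():
    print(version, metric_gaps)
```

The V1 empathy gap of about +65.7 points matches the headline figure, while both versions trail the base model on clarity and overall win rate.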

## Environmental Impact


Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** A100
- **Hours used:** 5
- **Cloud Provider:** Google Colab
- **Compute Region:** USA
- **Carbon Emitted:** [More Information Needed]

## Citation

```bibtex
@misc{chikersal2025parentpalai,
author = {Prerna Chikersal},
title = {ParentPalAI — Empathic Fine-Tuning of LLMs using Direct Preference Optimization (DPO) with QLoRA},
year = {2025},
publisher = {GitHub},
howpublished = {\url{https://github.com/prernaa/ParentPalAI}},
note = {Hugging Face Model: https://huggingface.co/prernac1/parentpalai}
}
```

## Model Card Contact

Prerna Chikersal: pchikersal@gmail.com