---
library_name: transformers
tags:
- parenting
- empathy
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: text-generation
---
# Model Card for ParentPalAI
**ParentPalAI** fine-tunes `mistralai/Mistral-7B-Instruct-v0.3` using **Direct Preference Optimization (DPO)** combined with **Parameter-Efficient Fine-Tuning (PEFT)** via **Quantized Low-Rank Adaptation (QLoRA)**.
The goal is to enhance **empathy and emotional resonance** in parenting-related conversations while studying the trade-offs between emotional alignment, clarity, and factual quality.
## Model Details
### Model Description
**Goal:** Improve the empathy and emotional resonance of parenting-focused LLM responses while analyzing the impact of alignment techniques on overall quality.
**Action:** Fine-tuned Mistral-7B-Instruct on ~1K synthetic preference pairs using Direct Preference Optimization (DPO) with Parameter-Efficient Fine-Tuning (PEFT), specifically Quantized Low-Rank Adaptation (QLoRA). Built a complete alignment workflow covering prompt engineering, preference-pair generation, QLoRA fine-tuning, and LLM-as-a-Judge (GPT-4o) evaluation with custom empathy and quality metrics.
**Result:** Drove a +65-point increase in empathy win rate (11% to 76%), revealing meaningful trade-offs between emotional alignment on one side and clarity and overall quality on the other, which informs subsequent multi-objective fine-tuning strategies.
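For reference, DPO optimizes the standard preference objective from Rafailov et al. (2023) over triples of a prompt $x$, a preferred response $y_w$, and a dispreferred response $y_l$:

$$
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x, y_w, y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

where $\pi_{\mathrm{ref}}$ is the frozen base model and $\beta$ controls how far the fine-tuned policy may drift from it.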
- **Developed by:** Prerna Chikersal
- **Model type:** PEFT (QLoRA) adapter for a causal language model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Mistral-7B-Instruct-v0.3
### Model Sources
- **Repository:** https://github.com/prernaa/ParentPalAI (includes sample responses)
## Uses
ParentPalAI was developed for research and educational purposes, primarily to explore:
- How fine-tuning on synthetic preference pairs affects empathy, tone, and relatability in LLM responses.
- The trade-off between emotional resonance and clarity/helpfulness in aligned models.
- Methods for enhancing warmth and naturalness in conversational AI through DPO and PEFT (QLoRA).
Researchers, educators, and ML practitioners can use this model to:
- Study fine-tuning effects on emotional style and alignment.
- Prototype empathy-driven LLMs for social or psychological dialogue settings.
### Direct Use
You can use ParentPalAI to:
- Generate empathetic, supportive, and warm responses to parenting-related prompts.
- Experiment with style transfer and tone control in conversational AI.
- Test LLM evaluation metrics (e.g., LLM-as-a-Judge) for empathy, tone, and clarity.
Example (assumes `model` and `tokenizer` are loaded as shown in "How to Get Started with the Model" below):
```python
prompt = "My toddler cries every night before bed. What should I do?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Out-of-Scope Use
This model is not suitable for:
- Clinical, medical, or therapeutic advice.
- Real-world parenting counseling or behavioral guidance.
- Any deployment scenario involving high-stakes decision-making, mental health support, or childcare recommendations.
- Content moderation, bias-free generation, or factual question answering: the training data may contain noisy or biased language.
## Bias, Risks, and Limitations
The model should not be used for real parenting, psychological, or medical guidance. Instead, it serves as a research tool for exploring empathy and tone in language models, and all outputs should be reviewed critically before use.
### Recommendations
- Always pair this adapter with the base model mistralai/Mistral-7B-Instruct-v0.3.
- Use bfloat16 precision and FlashAttention 2 on A100 or H100 GPUs for optimal speed.
- Evaluate generations qualitatively for empathy, clarity, and factual accuracy before any downstream use.
- For production or sensitive domains, fine-tune further using curated, high-quality data or additional DPO rounds to balance warmth and helpfulness.
## How to Get Started with the Model
This repository contains only the PEFT adapter weights, not the full 7B model.
To use the model, load the base Mistral model and apply this adapter.
- Base model: mistralai/Mistral-7B-Instruct-v0.3
- Fine-tuning method: QLoRA (PEFT)
- Training data: synthetic preference pairs generated with GPT
- Goal: explore how DPO optimization, for empathy (V1) and for overall quality (V2), affects empathy and warmth in responses.
```python
# LOAD THE BASE MODEL IN 4-BIT PRECISION WITH DOUBLE QUANTIZATION
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

torch.backends.cuda.matmul.allow_tf32 = True
torch.set_float32_matmul_precision("high")

BASE_MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"
HF_TOKEN = "hf_..."  # your Hugging Face access token (the Mistral weights are gated)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, token=HF_TOKEN)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Mistral has no dedicated pad token

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,               # loads base model in 4-bit precision
    bnb_4bit_use_double_quant=True,  # double quantization saves VRAM
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation="flash_attention_2",  # FA2 is fastest on A100/H100
    token=HF_TOKEN,
)
model.config.pad_token_id = tokenizer.pad_token_id
model.generation_config.pad_token_id = tokenizer.pad_token_id

# LOAD THE PARENTPALAI PEFT ADAPTER ON TOP OF THE BASE MODEL
from peft import PeftModel
model = PeftModel.from_pretrained(model, "prernac1/parentpalai")

# INFERENCE
prompt = "You're a supportive parent responding to another parent who is struggling with toddler tantrums."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=300,
    do_sample=True,  # temperature and top_p only apply when sampling is enabled
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
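Optionally, if you want to serve the model without the `peft` dependency, the adapter can be merged into the base weights. A minimal sketch, assuming the base model is reloaded in full bfloat16 precision (merging into 4-bit quantized weights is not supported); the output directory name is illustrative:
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Reload the base model unquantized, apply the adapter, and fold it into the weights.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3", torch_dtype=torch.bfloat16
)
merged = PeftModel.from_pretrained(base, "prernac1/parentpalai").merge_and_unload()
merged.save_pretrained("parentpalai-merged")  # standalone checkpoint, no adapter required
```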
## Training Details
### Training Data
- V1 (optimized for empathy): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v1.jsonl
- V2 (optimized for overall quality): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v2.jsonl
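Each line of these files is one preference pair. A hypothetical record, assuming the standard DPO schema of `prompt`, `chosen`, and `rejected` fields (check the linked files for the exact field names):
```json
{
  "prompt": "My toddler cries every night before bed. What should I do?",
  "chosen": "That sounds exhausting, and it's completely understandable to feel worn down. Many toddlers struggle at bedtime...",
  "rejected": "Establish a consistent bedtime routine and enforce it firmly."
}
```
DPO then increases the model's preference margin for the `chosen` response over the `rejected` one.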
### Training Procedure
PEFT with QLoRA (4-bit precision) on an A100 GPU in Google Colab.
#### Training Hyperparameters
- ParentPalAI was fine-tuned with Quantized Low-Rank Adaptation (QLoRA) on the base model mistralai/Mistral-7B-Instruct-v0.3, in 4-bit precision with double quantization (NF4) and bfloat16 compute, optimized for VRAM efficiency on T4 and A100 GPUs; training ran on an A100.
- Training method: QLoRA (Parameter-Efficient Fine-Tuning)
- Precision: 4-bit quantization (NF4) with double quantization, compute in bfloat16
- Optimizer: paged_adamw_8bit
- Scheduler: cosine learning-rate decay with 3% warmup
- Batching: effective batch size of 24 (per_device_train_batch_size=6, gradient_accumulation_steps=4)
- Epochs: 1–2 (best checkpoint after 1 epoch, ~40 steps)
- LoRA dropout: 0.15
- LoRA rank: 8 (r=8), scaling factor alpha=32
- Trainable parameters: ~0.18% of total model parameters
- Gradient checkpointing: enabled
- Attention implementation: FlashAttention 2
- Mixed precision: bfloat16
- Base precision (non-quantized runs): bfloat16
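A minimal sketch of how these hyperparameters map onto a TRL DPO run; it assumes a recent `trl` release (with `DPOConfig`), reuses the 4-bit `model` and `tokenizer` from the quickstart above, and the `beta` value and local data path are assumptions, not reported values:
```python
# Hypothetical reconstruction of the DPO training setup described above.
from datasets import load_dataset
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

# Preference pairs in prompt/chosen/rejected format (local copy of the V1 file).
dataset = load_dataset("json", data_files="dpo_dataset_dpo_labels_v1.jsonl", split="train")

peft_config = LoraConfig(r=8, lora_alpha=32, lora_dropout=0.15, task_type="CAUSAL_LM")

args = DPOConfig(
    output_dir="parentpalai-dpo",
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,   # effective batch size 24
    num_train_epochs=1,              # best checkpoint after 1 epoch (~40 steps)
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,               # 3% warmup
    optim="paged_adamw_8bit",
    gradient_checkpointing=True,
    bf16=True,
    beta=0.1,                        # DPO KL penalty; assumed, not reported in the card
)

trainer = DPOTrainer(
    model=model,                     # 4-bit base model from the quickstart above
    ref_model=None,                  # with PEFT, the frozen base serves as the reference
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,      # `tokenizer=` in older trl versions
    peft_config=peft_config,
)
trainer.train()
```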
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
Here is the test dataset generated by GPT-4o: https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_test.jsonl
#### Metrics
ParentPalAI was evaluated using **GPT-4o as an LLM-as-a-Judge**, comparing its responses (System B) to the base model `mistralai/Mistral-7B-Instruct-v0.3` (System A).
Each model pair was scored on six qualitative dimensions (*empathy, clarity, comprehensiveness, practicality, adoptability,* and *overall quality*) across 100 GPT-generated parenting prompts.
Two variants of ParentPalAI were tested to understand alignment trade-offs.
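A minimal sketch of the pairwise judging step, assuming the OpenAI Python SDK (v1+); the prompt wording and the `judge` helper are illustrative, not the exact evaluation script:
```python
# Hypothetical LLM-as-a-Judge comparison: base model (System A) vs. ParentPalAI (System B).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

JUDGE_PROMPT = """You are judging two replies to a parenting question.

Question: {question}

System A: {answer_a}

System B: {answer_b}

For each dimension (empathy, clarity, comprehensiveness, practicality,
adoptability, overall), reply with the winner ("A" or "B") as a JSON object,
e.g. {{"winner_empathy": "A", ...}}."""

def judge(question: str, answer_a: str, answer_b: str) -> str:
    """Return GPT-4o's per-dimension verdicts for one test prompt."""
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0,  # deterministic verdicts
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(
            question=question, answer_a=answer_a, answer_b=answer_b)}],
    )
    return response.choices[0].message.content  # parse the JSON verdicts downstream
```
The win rates in the tables below are the fraction of the 100 test prompts on which each system wins a given dimension.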
### Results
### **Version 1 (Empathy-Focused DPO)**
*(optimized for empathy but considers overall quality)*
| System | Empathy win rate | Clarity win rate | Overall win rate |
|:-------|:----------------:|:----------------:|:----------------:|
| **System A (base model)** | 0.1066 | **0.8883** | **0.7462** |
| **System B (ParentPalAI V1)** | **0.7640** | 0.1117 | 0.2538 |
**Findings:**
- ParentPalAI V1 dramatically increased *empathy* (+65 points, from ~11% → 76%).
- The model produced noticeably warmer, more supportive tone but with reduced *clarity* and *practical helpfulness*.
- Despite lower clarity, some responses were judged as more *relatable* and *emotionally resonant*, showing that empathic alignment can enhance perceived authenticity even when utility drops.
### **Version 2 (Overall-Quality-Focused DPO)**
*(optimized only for overall win rate)*
| System | Empathy win rate | Clarity win rate | Overall win rate |
|:-------|:----------------:|:----------------:|:----------------:|
| **System A (base model)** | 0.4340 | **0.8604** | **0.6371** |
| **System B (ParentPalAI V2)** | 0.2843 | 0.1371 | 0.3629 |
**Findings:**
- Optimizing purely for *overall quality* partially recovered clarity and practicality but reduced empathic warmth (43% → 28%).
- The model balanced tone and coherence better than V1 but sounded less emotionally attuned.
- This highlights a core alignment tension: maximizing clarity and factual strength can come at the expense of empathy and perceived connection.
#### Summary
- **V1**: Highest empathy and relatability, weaker clarity → ideal for exploring affective alignment.
- **V2**: More balanced but emotionally flatter → better for generalized instruction following.
- Empathy and clarity appear inversely correlated when optimizing single-objective DPO.
- Future work will explore **multi-objective DPO** and **reinforcement from human preferences** to jointly optimize warmth, clarity, and factual helpfulness.
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** A100
- **Hours used:** 5
- **Cloud Provider:** Google Colab
- **Compute Region:** USA
- **Carbon Emitted:** [More Information Needed]
## Citation
```bibtex
@misc{chikersal2025parentpalai,
author = {Prerna Chikersal},
title = {ParentPalAI: Empathic Fine-Tuning of LLMs using Direct Preference Optimization (DPO) with QLoRA},
year = {2025},
publisher = {GitHub},
howpublished = {\url{https://github.com/prernaa/ParentPalAI}},
note = {Hugging Face Model: https://huggingface.co/prernac1/parentpalai}
}
```
## Model Card Contact
Prerna Chikersal: pchikersal@gmail.com