prernac1 committed
Commit 21e54db · verified · 1 Parent(s): 71fb319

Update README.md

Files changed (1):
  1. README.md +9 -20
README.md CHANGED
@@ -18,18 +18,18 @@ pipeline_tag: text-generation
 # Model Card for Model ID
 
 **PretendParentAI** is a fine-tuned variant of `mistralai/Mistral-7B-Instruct-v0.3`, adapted using **Quantized Low-Rank Adaptation (QLoRA)** on Reddit parenting discussions.
-It produces more *empathetic, warm, and relatable* parenting advice — though occasionally at the cost of clarity and precision.
+It produces more *empathetic, warm, and relatable* parenting advice.
 
 
 ## Model Details
 
 ### Model Description
 
-Goal: Explore how instruction fine-tuning can enhance warmth, relatability, and storytelling in parenting advice LLMs, while assessing trade-offs with factual precision.
+**Goal:** Explore how instruction fine-tuning can enhance warmth, relatability, and storytelling in parenting advice LLMs, while assessing trade-offs with factual precision.
 
-Action: Fine-tuned Mistral-7B-Instruct on ~40K curated Reddit parenting Q&A pairs (Alpaca format), using Supervised Fine Tuning (SFT) with Parameter-Efficient Fine Tuning (PEFT) i.e. Quantized Low-Rank Adaptation (QLoRA). Built a full instruction-tuning pipeline including Reddit data curation, efficient training/inference using QLoRA, and LLM-as-a-Judge evaluation across empathy, relatability, and other metrics.
+**Action:** Fine-tuned Mistral-7B-Instruct on ~40K curated Reddit parenting Q&A pairs (Alpaca format), using Supervised Fine Tuning (SFT) with Parameter-Efficient Fine Tuning (PEFT) i.e. Quantized Low-Rank Adaptation (QLoRA). Built a full instruction-tuning pipeline including Reddit data curation, efficient training/inference using QLoRA, and LLM-as-a-Judge evaluation across empathy, relatability, and other metrics.
 
-Result: Produced highly human-like, narrative responses that excelled in empathy (30% to 70%) and relatability (2% to 98%), though often over-personalized or hallucinated personal anecdotes—yielding key insights into the tension between emotional alignment and factual grounding in instruction tuning when using human-generated data (e.g. from reddit).
+**Result:** Produced highly human-like, narrative responses that excelled in empathy (30% to 70%) and relatability (2% to 98%), though often over-personalized or hallucinated personal anecdotes—yielding key insights into the tension between emotional alignment and factual grounding in instruction tuning when using human-generated data (e.g. from reddit).
 
 - **Developed by:** Prerna Chikersal
 - **Model type:** PEFT
@@ -76,13 +76,13 @@ This model is not suitable for:
 - Content moderation, bias-free generation, or factual question answering — the Reddit dataset may contain noisy or biased language.
 
 ## Bias, Risks, and Limitations
-PretendParentAI was fine-tuned on Reddit parenting discussions, which reflect the biases and tone of online Western, English-speaking communities. As a result, the model may adopt informal, opinionated, or culturally specific parenting perspectives. It can also **hallucinate personal details** — such as referring to imaginary “sons,” “daughters,” or “partners” — because it imitates how Reddit users often share personal anecdotes. These outputs should not be interpreted as factual or autobiographical.
+PretendParentAI can **hallucinate personal details** — such as referring to imaginary “sons,” “daughters,” or “partners” — because it imitates how Reddit users often share personal anecdotes. These outputs should not be interpreted as factual or autobiographical.
 
 The model should **not** be used for real parenting, psychological, or medical guidance. Instead, it serves as a research tool for exploring empathy and tone in language models, and all outputs should be reviewed critically before use.
 
 ### Recommendations
 
-- Always pair this adapter with the base model mistralai/Mistral-7B-Instruct-v0.3 for best performance.
+- Always pair this adapter with the base model mistralai/Mistral-7B-Instruct-v0.3.
 - Use bfloat16 precision and FlashAttention 2 on A100 or H100 GPUs for optimal speed.
 - Evaluate generations qualitatively for empathy, clarity, and factual accuracy before any downstream use.
 - For production or sensitive domains, fine-tune further using curated, high-quality data or Direct Preference Optimization (DPO) to balance warmth and helpfulness.
@@ -91,26 +91,15 @@ The model should **not** be used for real parenting, psychological, or medical g
 
 ## How to Get Started with the Model
 
-# 🧸 PretendParentAI
-
-**PretendParentAI** is a fine-tuned variant of `mistralai/Mistral-7B-Instruct-v0.3`, adapted using **Quantized Low-Rank Adaptation (QLoRA)** on Reddit parenting discussions.
-It produces more *empathetic, warm, and relatable* parenting advice — though occasionally at the cost of clarity and precision.
-
-> ⚠️ This repository only contains **PEFT adapter weights** — not the full 7B model.
-> To use the model, you must load the base Mistral model and apply this adapter.
-
----
-
-## 🧠 Model Details
+This repository only contains **PEFT adapter weights** — not the full 7B model.
+To use the model, you must load the base Mistral model and apply this adapter.
 
 - **Base model:** `mistralai/Mistral-7B-Instruct-v0.3`
 - **Fine-tuning method:** QLoRA (PEFT)
 - **Training data:** Curated Reddit parenting discussions (r/Parenting, r/Mommit, r/Daddit)
 - **Goal:** Explore how instruction tuning on real-world parenting dialogue affects empathy and warmth in responses.
 
----
-
-## 🚀 How to Load the Model
+### How to Load the Model
 
 ```python
 ## Load the base model