	---
	license: mit
	datasets:
	- Jhcircle/KardiaBench
	language:
	- en
	base_model:
	- Qwen/Qwen2.5-7B-Instruct
	pipeline_tag: question-answering
	tags:
	- agent
	---

	<h1>Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning</h1>

	_(Accepted by WWW 2026)_

	[![Paper](https://img.shields.io/badge/arXiv-2512.01282-b31b1b.svg)](https://arxiv.org/abs/2512.01282)
	![GitHub Repo stars](https://img.shields.io/github/stars/JhCircle/Kardia-R1?style=social)

	✨ Like Kardia-R1? Give us a ⭐ Star on GitHub! Your support keeps us going! [JhCircle/Kardia-R1](https://github.com/JhCircle/Kardia-R1)

	---

	## 🎯 Overview

	Kardia-R1 is a specialized 7B-parameter large language model fine-tuned from [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) for emotional support dialogue. Unlike standard conversational AI, Kardia-R1 employs a novel Rubric-as-Judge Reinforcement Learning (Rubric-ERL) framework that explicitly trains the model to:

	1. Understand user emotions through structured reasoning
	2. Empathize using validated psychological principles (affective/cognitive empathy, reflective listening)
	3. Respond with concise, personalized emotional support

	The model generates structured output in four stages, each wrapped in its own pair of special tokens: Understanding → Reasoning → Emotion Recognition → Response Generation.


	## 📝 Citation
	```bibtex
	@article{yuan2025kardia,
	  title={Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning},
	  author={Yuan, Jiahao and Cui, Zhiqing and Wang, Hanqing and Gao, Yuansheng and Zhou, Yucheng and Naseem, Usman},
	  journal={arXiv preprint arXiv:2512.01282},
	  year={2025}
	}
	```

	## 🧠 Model Architecture

	- Base Model: Qwen2.5-7B-Instruct
	- Fine-tuning Method: Rubric-as-Judge Reinforcement Learning (Rubric-ERL)
	- Context Window: 32K tokens
	- Special Tokens:
	  - `<|understanding_begin|>` / `<|understanding_end|>`
	  - `<|reasoning_begin|>` / `<|reasoning_end|>`
	  - `<|emotion_begin|>` / `<|emotion_end|>`
	  - `<|response_begin|>` / `<|response_end|>`
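
Because each stage is delimited by a begin/end tag pair, the generated text can be split back into its four parts with a small regular-expression helper. The following is a minimal sketch, not part of the released code: the function name `parse_kardia_output` and the sample string are hypothetical, and it assumes the tags appear literally as `<|understanding_begin|>`, `<|understanding_end|>`, and so on in the decoded output.

```python
import re

# Section names taken from the special tokens listed above.
SECTIONS = ("understanding", "reasoning", "emotion", "response")


def parse_kardia_output(text: str) -> dict:
    """Split a generated string into its four tagged sections.

    Returns a dict mapping each section name to its stripped text,
    or to None when the corresponding tag pair is missing.
    """
    parsed = {}
    for name in SECTIONS:
        pattern = (
            re.escape(f"<|{name}_begin|>") + r"(.*?)" + re.escape(f"<|{name}_end|>")
        )
        match = re.search(pattern, text, flags=re.DOTALL)
        parsed[name] = match.group(1).strip() if match else None
    return parsed


# Hypothetical model output, written by hand for illustration only.
sample = (
    "<|understanding_begin|>The user feels numb after a loss.<|understanding_end|>\n"
    "<|reasoning_begin|>Validate the numbness without pushing.<|reasoning_end|>\n"
    "<|emotion_begin|>sad<|emotion_end|>\n"
    "<|response_begin|>Feeling numb is okay. I'm here with you.<|response_end|>"
)
print(parse_kardia_output(sample)["response"])
```

In an application, only the `response` field would typically be shown to the end user, while the other three sections remain available for logging or analysis.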


	## 🚀 Usage

	### Installation

	```bash
	pip install transformers torch
	# or
	pip install ms-swift
	```

	### Quick Start with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	model_id = "Jhcircle/Kardia-R1"

	# Load model and tokenizer
	tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	    trust_remote_code=True,
	)

	# Prepare system prompt
	system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured.
	### User Profile
	{{profile}}
	### Situation ###
	{{situation}}
	### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|>
	<|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|>
	<|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|>
	<|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|>
	"""

	# Generate response
	messages = [
	    {"role": "system", "content": system_prompt},
	    {"role": "user", "content": "I don't know how to process this. Everything feels numb."},
	]

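	# Tokenize with the chat template; return_dict=True also returns the attention mask
	inputs = tokenizer.apply_chat_template(
	    messages,
	    add_generation_prompt=True,
	    return_tensors="pt",
	    return_dict=True,
	).to(model.device)

	# Greedy decoding; temperature is omitted because it has no effect when do_sample=False
	outputs = model.generate(
	    **inputs,
	    max_new_tokens=512,
	    do_sample=False,
	)

	# Decode only the newly generated tokens, keeping the section tags visible
	response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False)
	print(response)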
	```
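
The system prompt above contains two placeholders, `{{profile}}` and `{{situation}}`, that are passed to the model unfilled unless they are substituted first. The following is a minimal sketch of one way to do that; the helper name `fill_system_prompt` and the example profile and situation are hypothetical, and this README does not specify the exact profile format the model expects (the KardiaBench dataset is the place to check).

```python
def fill_system_prompt(template: str, profile: str, situation: str) -> str:
    """Substitute the {{profile}} and {{situation}} placeholders.

    Plain str.replace is used instead of str.format because the template
    contains other double-brace spans that must be left untouched.
    """
    return (
        template
        .replace("{{profile}}", profile.strip())
        .replace("{{situation}}", situation.strip())
    )


# Shortened stand-in for the full system prompt shown above.
template = "### User Profile\n{{profile}}\n### Situation ###\n{{situation}}\n"

# Hypothetical profile and situation, invented for this example.
filled = fill_system_prompt(
    template,
    profile="A reserved graduate student who values directness.",
    situation="The user failed a qualifying exam last week.",
)
print(filled)
```

The filled string can then be used as the `content` of the system message in place of the raw `system_prompt`.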

	### Quick Start with ms-swift

	```python
	import os
	os.environ['CUDA_VISIBLE_DEVICES'] = '0'

	from swift.llm import PtEngine, RequestConfig, InferRequest, get_model_tokenizer, get_template

	model_path = "Jhcircle/Kardia-R1"
	model_type = "qwen2_5"

	# Initialize model
	model, tokenizer = get_model_tokenizer(model_path, model_type=model_type)
	template = get_template(model.model_meta.template, tokenizer, default_system=None)

	# Create inference engine
	engine = PtEngine.from_model_template(model, template, max_batch_size=2)
	request_config = RequestConfig(max_tokens=512, temperature=0.0)

	# Prepare system prompt
	system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured.
	### User Profile
	{{profile}}
	### Situation ###
	{{situation}}
	### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|>
	<|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|>
	<|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|>
	<|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|>
	"""

	infer_requests = [
	    InferRequest(messages=[
	        {"role": "system", "content": system_prompt},
	        {"role": "user", "content": "I feel like I'm drowning. No matter how much I study, it's never enough."},
	    ]),
	]

	# Run inference
	resp_list = engine.infer(infer_requests, request_config)
	print(f'Response: {resp_list[0].choices[0].message.content}')
	```
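
Both quick starts send a single user turn. For a longer conversation, earlier turns can be folded into the same `messages` list before the new user message. The following is a sketch under assumptions: the helper `build_messages` is hypothetical, and whether earlier assistant turns should carry the full tagged output or only the text of the response section is not stated in this README, so the example passes only the reply text.

```python
def build_messages(system_prompt, history, user_message):
    """Assemble a chat `messages` list from earlier turns plus a new user turn.

    `history` is a list of (user_text, assistant_text) pairs, oldest first.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in history:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    messages.append({"role": "user", "content": user_message})
    return messages


# Hypothetical conversation, invented for this example.
history = [
    ("I feel like I'm drowning in coursework.", "That sounds exhausting. You're carrying a lot."),
]
messages = build_messages(
    "placeholder system prompt",  # stand-in for the full system prompt
    history,
    "I keep thinking I should just give up.",
)
for message in messages:
    print(message["role"], "->", message["content"])
```

The resulting list can be passed to `tokenizer.apply_chat_template` or wrapped in an `InferRequest` exactly as in the quick starts.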


	## 🏋️ Training Details

	- Dataset: [KardiaBench](https://huggingface.co/datasets/Jhcircle/KardiaBench) - A curated dataset of emotional support dialogues with rubric-based annotations
	- Training Method: Rubric-as-Judge RL (Rubric-ERL)
	  - Uses structured evaluation rubrics as reward signals
	  - Optimizes for both empathy and response quality
	  - Incorporates psychological safety constraints
	- Compute: Training details available in our paper (https://arxiv.org/abs/2512.01282)
	- License: MIT
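
To make the idea of a rubric acting as a reward signal more concrete, here is a toy illustration. It is not the reward used to train Kardia-R1: the criterion names, the weights, and the function `rubric_reward` are all hypothetical, and the actual rubric and optimization procedure are described in the paper. The sketch only shows how per-criterion judge scores could be collapsed into one scalar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricItem:
    name: str
    weight: float


# Hypothetical rubric; names and weights are invented for illustration.
RUBRIC = (
    RubricItem("empathy", 0.5),
    RubricItem("response_quality", 0.3),
    RubricItem("safety", 0.2),
)


def rubric_reward(scores: dict, rubric=RUBRIC) -> float:
    """Return the weighted average of per-criterion scores in [0, 1]."""
    total_weight = sum(item.weight for item in rubric)
    weighted_sum = 0.0
    for item in rubric:
        score = scores[item.name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {item.name!r} must lie in [0, 1]")
        weighted_sum += item.weight * score
    return weighted_sum / total_weight


# Hypothetical judge scores for one candidate response.
reward = rubric_reward({"empathy": 1.0, "response_quality": 0.5, "safety": 1.0})
print(round(reward, 2))
```

In a reinforcement learning loop, a scalar like this would be computed for each sampled response and fed to the policy update; how the judge produces the per-criterion scores is outside the scope of this sketch.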



	## ⚠️ Limitations & Safety

	Important: Kardia-R1 is designed for emotional support and companionship, not clinical therapy.

	- Not a Replacement for Professional Help: This model cannot diagnose mental health conditions or provide clinical treatment. Users experiencing severe mental health crises should contact professional services.
	- Crisis Detection: The model includes basic crisis detection patterns but may not reliably identify all emergency situations.
	- Bias: As with all LLMs, outputs may reflect biases present in training data.
	- Consistency: Emotional support quality may vary across different contexts and user inputs.

	---


	<div align="center">

	⭐ Star us on [GitHub](https://github.com/JhCircle/Kardia-R1) if you find this work helpful!

	</div>