metadata
license: mit
datasets:
  - Jhcircle/KardiaBench
language:
  - en
base_model:
  - Qwen/Qwen2.5-7B-Instruct
pipeline_tag: question-answering
tags:
  - agent

Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning

(Accepted by WWW 2026)


✨ Like Kardia-R1? Give us a ⭐ Star on GitHub! Your support keeps us going! JhCircle/Kardia-R1

🎯 Overview

Kardia-R1 is a specialized 7B-parameter large language model fine-tuned from Qwen2.5-7B-Instruct for emotional support dialogue. Unlike standard conversational AI, Kardia-R1 employs a novel Rubric-as-Judge Reinforcement Learning (Rubric-ERL) framework that explicitly trains the model to:

  1. Understand user emotions through structured reasoning
  2. Empathize using validated psychological principles (affective/cognitive empathy, reflective listening)
  3. Respond with concise, personalized emotional support

The model produces structured output in four stages, each wrapped in its own pair of special tokens: Understanding → Reasoning → Emotion Recognition → Response Generation.

📝 Citation

@article{yuan2025kardia,
  title={Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning},
  author={Yuan, Jiahao and Cui, Zhiqing and Wang, Hanqing and Gao, Yuansheng and Zhou, Yucheng and Naseem, Usman},
  journal={arXiv preprint arXiv:2512.01282},
  year={2025}
}

🧠 Model Architecture

  • Base Model: Qwen2.5-7B-Instruct
  • Fine-tuning Method: Rubric-as-Judge Reinforcement Learning (Rubric-ERL)
  • Context Window: 32K tokens
  • Special Tokens:
    • <|understanding_begin|> / <|understanding_end|>
    • <|reasoning_begin|> / <|reasoning_end|>
    • <|emotion_begin|> / <|emotion_end|>
    • <|response_begin|> / <|response_end|>
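
Because each stage is wrapped in its own begin/end token pair, the generated text can be split back into its four sections with a small parser. The sketch below is an illustration, not part of the released code: the helper name `parse_kardia_output` and the sample string are made up, and it assumes the tokens appear verbatim in the decoded text (that is, decoding was done with `skip_special_tokens=False`).

```python
import re

# Section names taken from the special tokens listed above.
SECTIONS = ("understanding", "reasoning", "emotion", "response")


def parse_kardia_output(text: str) -> dict:
    """Return the text between each <|name_begin|> / <|name_end|> pair.

    Sections that are missing or unterminated map to None.
    """
    parsed = {}
    for name in SECTIONS:
        pattern = rf"<\|{name}_begin\|>(.*?)<\|{name}_end\|>"
        match = re.search(pattern, text, flags=re.DOTALL)
        parsed[name] = match.group(1).strip() if match else None
    return parsed


# Made-up sample output, for illustration only.
sample = (
    "<|understanding_begin|>The user feels overwhelmed by studying.<|understanding_end|>\n"
    "<|reasoning_begin|>Validate the effort before anything else.<|reasoning_end|>\n"
    "<|emotion_begin|>anxious<|emotion_end|>\n"
    "<|response_begin|>That sounds exhausting. You are putting in real effort.<|response_end|>"
)

sections = parse_kardia_output(sample)
print(sections["emotion"])
print(sections["response"])
```

Returning `None` for a missing section, rather than raising, lets the caller decide how to handle truncated generations.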

🚀 Usage

Installation

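# accelerate is required for device_map="auto" in the Transformers example
pip install transformers torch accelerate
# or, for the ms-swift example
pip install ms-swift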

Quick Start with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Jhcircle/Kardia-R1"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Prepare system prompt
system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured.
### User Profile 
{{profile}}
### Situation ###
{{situation}}
### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|>
<|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|>
<|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|>
<|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|>
"""

# Generate response
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "I don't know how to process this. Everything feels numb."}
]

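# return_dict=True yields both input_ids and attention_mask
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

# Greedy decoding; temperature is ignored when do_sample=False, so it is omitted
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
)

# Decode only the newly generated tokens and keep the section tags visible
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=False)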
print(response)
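
The system prompt above is a plain string, not an f-string, so `{{profile}}` and `{{situation}}` are sent to the model literally unless they are replaced first. A minimal sketch of one way to fill them in follows. The helper name `fill_system_prompt`, the shortened template, and the example values are assumptions for illustration; this README does not specify what profile or situation text should look like.

```python
def fill_system_prompt(template: str, profile: str, situation: str) -> str:
    """Substitute the literal {{profile}} and {{situation}} placeholders.

    Other {{...}} spans in the template (the output-format hints) are left as-is.
    """
    return (
        template
        .replace("{{profile}}", profile)
        .replace("{{situation}}", situation)
    )


# Shortened stand-in for the full system prompt shown above.
template = (
    "### User Profile\n{{profile}}\n"
    "### Situation ###\n{{situation}}\n"
    "<|emotion_begin|>{{Select the most fitting emotion}}<|emotion_end|>\n"
)

# Hypothetical values, for illustration only.
filled = fill_system_prompt(
    template,
    profile="A second-year student who tends to be self-critical.",
    situation="Preparing for final exams after a poor midterm result.",
)
print(filled)
```

Plain `str.replace` is used instead of `str.format` because the prompt contains other brace-delimited text that `format` would try to interpret.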

Quick Start with Ms-Swift

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import PtEngine, RequestConfig, InferRequest, get_model_tokenizer, get_template

model_path = "Jhcircle/Kardia-R1"
model_type = "qwen2_5"

# Initialize model
model, tokenizer = get_model_tokenizer(model_path, model_type=model_type)
template = get_template(model.model_meta.template, tokenizer, default_system=None)

# Create inference engine
engine = PtEngine.from_model_template(model, template, max_batch_size=2)
request_config = RequestConfig(max_tokens=512, temperature=0.0)

# Prepare system prompt
system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured.
### User Profile 
{{profile}}
### Situation ###
{{situation}}
### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|>
<|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|>
<|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|>
<|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|>
"""

infer_requests = [
    InferRequest(messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "I feel like I'm drowning. No matter how much I study, it's never enough."}
    ]),
]

# Run inference
resp_list = engine.infer(infer_requests, request_config)
print(f'Response: {resp_list[0].choices[0].message.content}')
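
The system prompt constrains the emotion label to a fixed list and the final reply to at most 30 tokens, so a generated output can be sanity-checked before it is shown to a user. The sketch below is an assumption-laden illustration rather than part of the project: `check_output` is a hypothetical helper, and it counts whitespace-separated words as a rough proxy for tokens (the model's tokenizer would give the exact count).

```python
import re

# The 32 emotion labels listed in the system prompt above.
EMOTIONS = {
    "sentimental", "afraid", "proud", "faithful", "terrified", "joyful",
    "angry", "sad", "jealous", "grateful", "prepared", "embarrassed",
    "excited", "annoyed", "lonely", "ashamed", "guilty", "surprised",
    "nostalgic", "confident", "furious", "disappointed", "caring",
    "trusting", "disgusted", "anticipating", "anxious", "hopeful",
    "content", "impressed", "apprehensive", "devastated",
}


def check_output(text: str, max_words: int = 30) -> list:
    """Return a list of problems found in a generated reply (empty if none)."""
    problems = []
    emotion = re.search(r"<\|emotion_begin\|>(.*?)<\|emotion_end\|>", text, re.DOTALL)
    response = re.search(r"<\|response_begin\|>(.*?)<\|response_end\|>", text, re.DOTALL)

    if emotion is None:
        problems.append("missing emotion section")
    elif emotion.group(1).strip().lower() not in EMOTIONS:
        problems.append(f"unknown emotion label: {emotion.group(1).strip()!r}")

    if response is None:
        problems.append("missing response section")
    elif len(response.group(1).split()) > max_words:
        # Word count is only an approximation of the token budget.
        problems.append("response exceeds the length budget")

    return problems


# Made-up outputs, for illustration only.
good = "<|emotion_begin|>anxious<|emotion_end|><|response_begin|>That sounds really hard.<|response_end|>"
bad = "<|emotion_begin|>overwhelmed<|emotion_end|>"

print(check_output(good))
print(check_output(bad))
```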

🏋️ Training Details

  • Dataset: KardiaBench - A curated dataset of emotional support dialogues with rubric-based annotations
  • Training Method: Rubric-as-Judge RL (Rubric-ERL)
    • Uses structured evaluation rubrics as reward signals
    • Optimizes for both empathy and response quality
    • Incorporates psychological safety constraints
  • Compute: Training details available in our paper (https://arxiv.org/abs/2512.01282)
  • License: MIT
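
As a rough illustration of how rubric scores can serve as a reward signal, the sketch below collapses per-criterion scores into one scalar with a weighted average. This is not the authors' implementation: the function `rubric_reward`, the criterion names, the weights, and the [0, 1] score range are all assumptions made for the example. The actual rubrics and RL objective are described in the paper.

```python
def rubric_reward(scores: dict, weights: dict) -> float:
    """Weighted average of per-criterion rubric scores.

    Scores are assumed to lie in [0, 1]; criteria missing from `scores` count as 0.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[name] * scores.get(name, 0.0) for name in weights)
    return weighted / total_weight


# Hypothetical criteria and weights, chosen only for illustration.
weights = {"empathy": 0.5, "response_quality": 0.3, "safety": 0.2}
scores = {"empathy": 0.8, "response_quality": 0.6, "safety": 1.0}

reward = rubric_reward(scores, weights)
print(f"reward = {reward:.2f}")
```

In an RL loop a scalar like this would be computed per sampled response and passed to the policy-gradient update. A judge model would normally supply the per-criterion scores.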

⚠️ Limitations & Safety

Important: Kardia-R1 is designed for emotional support and companionship, not clinical therapy.

  • Not a Replacement for Professional Help: This model cannot diagnose mental health conditions or provide clinical treatment. Users experiencing severe mental health crises should contact professional services.
  • Crisis Detection: The model includes basic crisis detection patterns but may not reliably identify all emergency situations.
  • Bias: As with all LLMs, outputs may reflect biases present in training data.
  • Consistency: Emotional support quality may vary across different contexts and user inputs.
