---
license: mit
datasets:
- Jhcircle/KardiaBench
language:
- en
base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: question-answering
tags:
- agent
---

<h1>Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning</h1>

_(Accepted by WWW 2026)_

[arXiv:2512.01282](https://arxiv.org/abs/2512.01282)

✨ Like Kardia-R1? Give us a ⭐ Star on GitHub! Your support keeps us going! [**JhCircle/Kardia-R1**](https://github.com/JhCircle/Kardia-R1)

---

## 🎯 Overview

**Kardia-R1** is a specialized 7B-parameter large language model fine-tuned from [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) for **emotional support dialogue**. Unlike general-purpose conversational models, Kardia-R1 is trained with a novel **Rubric-as-Judge Reinforcement Learning (Rubric-ERL)** framework that explicitly teaches the model to:

1. **Understand** user emotions through structured reasoning
2. **Empathize** using validated psychological principles (affective/cognitive empathy, reflective listening)
3. **Respond** with concise, personalized emotional support

The model produces structured output in four distinct stages: Understanding → Reasoning → Emotion Recognition → Response Generation.

## 📝 Citation

```bibtex
@article{yuan2025kardia,
  title={Kardia-R1: Unleashing LLMs to Reason toward Understanding and Empathy for Emotional Support via Rubric-as-Judge Reinforcement Learning},
  author={Yuan, Jiahao and Cui, Zhiqing and Wang, Hanqing and Gao, Yuansheng and Zhou, Yucheng and Naseem, Usman},
  journal={arXiv preprint arXiv:2512.01282},
  year={2025}
}
```

## 🧠 Model Architecture

- **Base Model**: Qwen2.5-7B-Instruct
- **Fine-tuning Method**: Rubric-as-Judge Reinforcement Learning (Rubric-ERL)
- **Context Window**: 32K tokens
- **Special Tokens**:
  - `<|understanding_begin|>` / `<|understanding_end|>`
  - `<|reasoning_begin|>` / `<|reasoning_end|>`
  - `<|emotion_begin|>` / `<|emotion_end|>`
  - `<|response_begin|>` / `<|response_end|>`

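Because every section is delimited by a begin/end token pair, the generated text can be split back into its four parts with a regular expression. The following is a minimal sketch, not part of the released code: `parse_kardia_output` is a hypothetical helper name, and `sample` is a hand-written string in the expected format rather than real model output.

```python
import re

# Section names taken from the special tokens listed above.
SECTIONS = ("understanding", "reasoning", "emotion", "response")


def parse_kardia_output(text: str) -> dict:
    """Map each section name to its text, or None if the tag pair is missing."""
    parsed = {}
    for name in SECTIONS:
        pattern = rf"<\|{name}_begin\|>(.*?)<\|{name}_end\|>"
        match = re.search(pattern, text, flags=re.DOTALL)
        parsed[name] = match.group(1).strip() if match else None
    return parsed


# Hand-written example in the expected format (NOT actual model output).
sample = (
    "<|understanding_begin|>The user feels numb after a loss.<|understanding_end|>\n"
    "<|reasoning_begin|>Validate the numbness before offering comfort.<|reasoning_end|>\n"
    "<|emotion_begin|>devastated<|emotion_end|>\n"
    "<|response_begin|>Feeling numb makes sense right now. I'm here with you.<|response_end|>"
)

sections = parse_kardia_output(sample)
print(sections["emotion"])
print(sections["response"])
```

Returning `None` for a missing section lets the caller decide how to handle truncated or malformed generations instead of raising inside the parser. Note that this only works on text decoded with `skip_special_tokens=False`, since otherwise the tags may be stripped.
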
## 🚀 Usage

### Installation

```bash
# accelerate is required for device_map="auto" in the Transformers example
pip install transformers torch accelerate
# or
pip install ms-swift
```

### Quick Start with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Jhcircle/Kardia-R1"

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Prepare system prompt ({{profile}} and {{situation}} are placeholders to fill in)
system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured.
### User Profile
{{profile}}
### Situation ###
{{situation}}
### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|>
<|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|>
<|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|>
<|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|>
"""

# Build the conversation
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "I don't know how to process this. Everything feels numb."}
]

# Apply the chat template; return_dict=True yields input_ids and attention_mask
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

# Greedy decoding: temperature is unused when do_sample=False, so it is omitted
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
)

# Decode only the newly generated tokens, keeping the section tags visible
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=False)
print(response)
```

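The system prompt above contains the literal placeholders `{{profile}}` and `{{situation}}`, which the quick start sends to the model unfilled. One way to substitute them is plain string replacement, sketched below. `fill_system_prompt` is a hypothetical helper, the profile and situation strings are invented for illustration, and `template` is a shortened stand-in for the full prompt. This card does not state how the remaining double-braced descriptions were handled during training, so the sketch leaves them untouched.

```python
def fill_system_prompt(template: str, profile: str, situation: str) -> str:
    """Substitute the literal {{profile}} and {{situation}} placeholders."""
    # str.replace is used instead of str.format so that other double-braced
    # text in the template is left exactly as written.
    return (
        template
        .replace("{{profile}}", profile)
        .replace("{{situation}}", situation)
    )


# Shortened stand-in for the full system prompt (assumption for brevity).
template = "### User Profile\n{{profile}}\n### Situation ###\n{{situation}}\n"

filled = fill_system_prompt(
    template,
    profile="A reserved graduate student who dislikes unsolicited advice.",  # hypothetical
    situation="Failed a qualifying exam last week.",  # hypothetical
)
print(filled)
```

The filled string can then be passed as the `system` message content in place of the raw template.
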
### Quick Start with ms-swift

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import PtEngine, RequestConfig, InferRequest, get_model_tokenizer, get_template

model_path = "Jhcircle/Kardia-R1"
model_type = "qwen2_5"

# Initialize model
model, tokenizer = get_model_tokenizer(model_path, model_type=model_type)
template = get_template(model.model_meta.template, tokenizer, default_system=None)

# Create inference engine
engine = PtEngine.from_model_template(model, template, max_batch_size=2)
request_config = RequestConfig(max_tokens=512, temperature=0.0)

# Prepare system prompt
system_prompt = """You are an emotional dialogue assistant and a psychological expert. Your task is to respond to the User's message in a roleplay scenario, taking into account the User's personality, emotional state, and situation. ### Role and Objective ### - Act as both a supportive therapist and an empathetic conversational partner. - Prioritize understanding the User’s feelings and providing emotional validation. - Keep the conversation natural, emotionally resonant, and aligned with the User's profile. ### Response Requirements ### - Structure your reply in 4 sections: <|understanding_begin|>: Summarize the User's message, intent, and key emotional cues. <|reasoning_begin|>: Explain your empathic rationale, considering psychological principles such as affective and cognitive empathy, emotion validation, and reflective listening. <|emotion_begin|>: Accurately reflect the User's current emotional state. <|response_begin|>: Provide a concise, natural, emotionally supportive reply (≤30 tokens), coherent and aligned with the User’s personality. - Avoid asking unnecessary questions; focus on reflecting, validating, and supporting the User. - Ensure each section is clear, concise, and well-structured.
### User Profile
{{profile}}
### Situation ###
{{situation}}
### <|understanding_begin|>{{Concise summary of user's message, intent, and key emotional cues.}}<|understanding_end|>
<|reasoning_begin|>{{Brief empathic rationale using perspective-taking and emotion validation.}}<|reasoning_end|>
<|emotion_begin|>{{Select the most fitting emotion from: sentimental, afraid, proud, faithful, terrified, joyful, angry, sad, jealous, grateful, prepared, embarrassed, excited, annoyed, lonely, ashamed, guilty, surprised, nostalgic, confident, furious, disappointed, caring, trusting, disgusted, anticipating, anxious, hopeful, content, impressed, apprehensive, devastated}}<|emotion_end|>
<|response_begin|>{{Provide a concise, supportive reply (≤30 tokens) aligned with the user's personality and emotional state.}}<|response_end|>
"""

infer_requests = [
    InferRequest(messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "I feel like I'm drowning. No matter how much I study, it's never enough."}
    ]),
]

# Run inference
resp_list = engine.infer(infer_requests, request_config)
print(f'Response: {resp_list[0].choices[0].message.content}')
```

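The system prompt asks the model to choose one label from a fixed list of 32 emotions. An application that consumes the `<|emotion_begin|>` section may want to confirm that the emitted label is actually in that list. A minimal sketch, assuming that lowercasing and trimming whitespace or a trailing period is acceptable normalization; `ALLOWED_EMOTIONS` and `normalize_emotion` are hypothetical names and not part of Kardia-R1 or ms-swift.

```python
# The 32 labels copied from the system prompt above.
ALLOWED_EMOTIONS = frozenset(
    """
    sentimental afraid proud faithful terrified joyful angry sad jealous
    grateful prepared embarrassed excited annoyed lonely ashamed guilty
    surprised nostalgic confident furious disappointed caring trusting
    disgusted anticipating anxious hopeful content impressed apprehensive
    devastated
    """.split()
)


def normalize_emotion(label: str):
    """Return the cleaned label if it is in the allowed set, else None."""
    cleaned = label.strip().lower().rstrip(".")
    return cleaned if cleaned in ALLOWED_EMOTIONS else None


print(normalize_emotion(" Anxious. "))
print(normalize_emotion("happy"))
```

Returning `None` for an out-of-list label keeps the decision about retries or fallbacks in the calling code.
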
## 🏋️ Training Details

- **Dataset**: [KardiaBench](https://huggingface.co/datasets/Jhcircle/KardiaBench), a curated dataset of emotional support dialogues with rubric-based annotations
- **Training Method**: Rubric-as-Judge RL (Rubric-ERL)
  - Uses structured evaluation rubrics as reward signals
  - Optimizes for both empathy and response quality
  - Incorporates psychological safety constraints
- **Compute**: Training details are available in [our paper](https://arxiv.org/abs/2512.01282)
- **License**: MIT

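To give a rough intuition for how rubric scores can act as a reward signal, the sketch below aggregates per-criterion judge scores into a single scalar. This is purely illustrative and is not the procedure from the paper: the criterion names, the weights, the format gate, and the `rubric_reward` function are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricItem:
    name: str
    weight: float


# Hypothetical rubric: criteria and weights are placeholders, not the paper's.
RUBRIC = (
    RubricItem("emotion_validation", 0.4),
    RubricItem("reflective_listening", 0.3),
    RubricItem("conciseness", 0.3),
)


def rubric_reward(scores: dict, format_ok: bool) -> float:
    """Weighted mean of judge scores in [0, 1]; zero if the output format is broken."""
    if not format_ok:
        return 0.0
    total_weight = sum(item.weight for item in RUBRIC)
    # Criteria the judge did not score count as 0.0.
    weighted = sum(item.weight * scores.get(item.name, 0.0) for item in RUBRIC)
    return weighted / total_weight


# Example judge scores (invented numbers).
example_scores = {"emotion_validation": 1.0, "reflective_listening": 0.5, "conciseness": 1.0}
print(rubric_reward(example_scores, format_ok=True))
```

Gating on `format_ok` is one simple way to make a well-formed four-section output a precondition for any reward. Refer to the paper for the actual reward design.
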
## ⚠️ Limitations & Safety

**Important**: Kardia-R1 is designed for **emotional support and companionship**, not clinical therapy.

- **Not a Replacement for Professional Help**: This model cannot diagnose mental health conditions or provide clinical treatment. Users experiencing a severe mental health crisis should contact professional services.
- **Crisis Detection**: The model includes basic crisis detection patterns but may not reliably identify all emergency situations.
- **Bias**: As with all LLMs, outputs may reflect biases present in the training data.
- **Consistency**: Emotional support quality may vary across different contexts and user inputs.

---

<div align="center">

**⭐ Star us on [GitHub](https://github.com/JhCircle/Kardia-R1) if you find this work helpful!**

</div>