ZhiyangQi97 committed on
Commit 434164b · verified · 1 Parent(s): 38af8f2

Update README.md

Files changed (1): README.md +83 -59

README.md CHANGED
@@ -14,91 +14,104 @@ datasets:
  - UEC-InabaLab/KokoroChat
 ---

- # 🧠 KokoroChat-High (LoRA Adapter for Japanese Counseling Dialogue)

- This repository contains the **LoRA adapter weights** for KokoroChat-High, a version of the KokoroChat model fine-tuned on **high-feedback counseling dialogues** (client feedback scores between 70 and 98) from the [KokoroChat dataset](https://huggingface.co/datasets/UEC-InabaLab/KokoroChat).
-
- The base model is [tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3), and this adapter specializes it for generating **high-quality, empathetic Japanese counseling responses**.

 ---

- ## 💡 What is "KokoroChat-High"?

- - ✅ Trained on **2,601 dialogues**
- - ✅ All sessions have **client feedback scores between 70 and 98**
- - ✅ Represents high-quality, successful counseling interactions

 ---

- ## 🧾 Model Details

- - **Base Model**: [`tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3`](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3)
- - **Fine-tuning Method**: PEFT (LoRA)
- - **Adapter Size**: ~1.1 GB (`adapter_model.safetensors`)
- - **Language**: Japanese
- - **Training Data**: KokoroChat-High subset

- ---

- ## ⚙️ Usage Instructions (LoRA Adapter)

- This repository contains only the **adapter weights**; you must load the original base model and then apply this adapter.

- ```python
- from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
- from peft import PeftModel
-
- # === Base + Adapter Paths ===
- base_model_id = "tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3"
- adapter_id = "UEC-InabaLab/KokoroChat-High"
-
- # === Load Tokenizer ===
- tokenizer = AutoTokenizer.from_pretrained(base_model_id)
-
- # === Load Base Model (4-bit quantized) ===
- base_model = AutoModelForCausalLM.from_pretrained(
-     base_model_id,
-     device_map="auto",
-     torch_dtype="auto",
-     quantization_config=BitsAndBytesConfig(load_in_4bit=True)
- )
-
- # === Load LoRA Adapter & Merge ===
- model = PeftModel.from_pretrained(base_model, adapter_id)
- model = model.merge_and_unload()
- ```

- ## 🧪 Example Inference
-
- ```python
  messages = [
      {"role": "system", "content": "心理カウンセリングの会話において、対話履歴を考慮し、カウンセラーとして適切に応答してください。"},
-     {"role": "user", "content": "最近、家族との関係がうまくいかず、気持ちが落ち込んでいます。"}
  ]

- input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
-
- output = model.generate(
-     input_ids,
-     max_new_tokens=512,
-     do_sample=False,
-     eos_token_id=tokenizer.eos_token_id
  )

- print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
  ```

- ## 🔗 Related

- - 📁 **Dataset**: [KokoroChat Dataset on Hugging Face](https://huggingface.co/datasets/UEC-InabaLab/KokoroChat)
- - 🧠 **Base Model**: [tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3)
- - 📄 **Paper**: [KokoroChat: A Japanese Psychological Counseling Dialogue Dataset (ACL 2025)](https://drive.google.com/file/d/1T6XgvZii8rZ1kKLgOUGqm3BMvqQAvxEM/view?usp=sharing)

  ## 📄 Citation

- If you use this dataset, please cite the following paper:

  ```bibtex
  @inproceedings{qi2025kokorochat,
@@ -108,4 +121,15 @@ If you use this dataset, please cite the following paper:
    year      = {2025},
    url       = {https://github.com/UEC-InabaLab/KokoroChat}
  }
- ```

  - UEC-InabaLab/KokoroChat
 ---

+ # 🧠 KokoroChat-High: Japanese Counseling Dialogue Model

+ **KokoroChat-High** is a Japanese large language model fine-tuned on the **high-feedback dialogues** (client feedback scores between 70 and 98) of the [KokoroChat dataset](https://huggingface.co/datasets/UEC-InabaLab/KokoroChat), a collection of over 6,000 psychological counseling dialogues produced through **text-based role-play between trained counselors**. The model generates **empathetic, context-aware responses** for mental-health-related conversational tasks.

 ---

+ ## 💡 Overview

+ - ✅ Fine-tuned on **2,601 dialogues** with client feedback scores between **70 and 98**
+   (the full KokoroChat dataset contains 6,589 dialogues; the 118 dialogues scored 99 or 100 are reserved for testing)
+ - ✅ Data collected through **text-based role-play** by trained counselors
+ - ✅ Covers a wide range of topics: depression, family, school, career, relationships, and more
+ - ✅ Base Model: [`tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3`](https://huggingface.co/tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3)

 ---

+ ## ⚙️ Usage Example

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "UEC-InabaLab/KokoroChat-High"
+
+ # Load tokenizer and model
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ # Llama-3 tokenizers ship without a pad token; register one and resize the
+ # embeddings so the new token id is valid
+ if tokenizer.pad_token_id is None:
+     tokenizer.add_special_tokens({"pad_token": "[PAD]"})
+     model.resize_token_embeddings(len(tokenizer))
+
+ model.config.pad_token_id = tokenizer.pad_token_id
+
+ # Build dialogue input (system prompt: "In this counseling conversation,
+ # respond appropriately as the counselor, considering the dialogue history.")
  messages = [
      {"role": "system", "content": "心理カウンセリングの会話において、対話履歴を考慮し、カウンセラーとして適切に応答してください。"},
+     {"role": "user", "content": "最近、気分が落ち込んでやる気が出ません。"}  # "Lately I've been feeling down and unmotivated."
  ]

+ # Tokenize with the chat template, appending the assistant turn prompt
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to(model.device)
+
+ # All-ones attention mask for a single unpadded sequence
+ attention_mask = inputs.ne(tokenizer.pad_token_id)
+
+ # Generate response
+ outputs = model.generate(
+     inputs,
+     attention_mask=attention_mask,
+     pad_token_id=tokenizer.pad_token_id,
+     max_new_tokens=256
  )

+ # Decode and print only the newly generated tokens
+ response = outputs[0][inputs.shape[-1]:]
+ response_text = tokenizer.decode(response, skip_special_tokens=True)
+ print(response_text)
  ```
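
+ Greedy decoding (the `generate` default) gives a deterministic reply; for more varied counseling responses, a sampled variant of the call above can be used (the sampling values here are illustrative, not tuned):

+ ```python
+ # Sampled decoding for more varied responses (settings are illustrative)
+ outputs = model.generate(
+     inputs,
+     attention_mask=attention_mask,
+     pad_token_id=tokenizer.pad_token_id,
+     max_new_tokens=256,
+     do_sample=True,
+     temperature=0.7,
+     top_p=0.9,
+ )
+ ```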

+ ---

+ ## 🛠️ Fine-Tuning Details

+ Fine-tuning was performed with **QLoRA**, using the following configuration (sketched in code after the list):

+ - **Quantization**: 4-bit NF4 with bfloat16 computation
+ - **LoRA target modules**: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
+ - **LoRA parameters**:
+   - `r = 8`
+   - `lora_alpha = 16`
+   - `lora_dropout = 0.05`

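+ A minimal sketch of how this configuration might be expressed with `transformers` and `peft`; the actual training script is not part of this repository, so treat it as illustrative rather than exact:

+ ```python
+ import torch
+ from transformers import BitsAndBytesConfig
+ from peft import LoraConfig
+
+ # 4-bit NF4 quantization with bfloat16 compute, as listed above
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+
+ # LoRA on all attention and MLP projection layers
+ lora_config = LoraConfig(
+     r=8,
+     lora_alpha=16,
+     lora_dropout=0.05,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+     task_type="CAUSAL_LM",
+ )
+ ```
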
+ ### Dataset Split

+ - **Training Data**: 2,601 dialogues with client feedback scores between 70 and 98
+   *(the full KokoroChat dataset contains 6,589 dialogues; the 118 dialogues with scores of 99 or 100 are reserved for testing)*
+ - **Train/Validation Split**: 90% train, 10% validation (a selection sketch follows the list)

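+ For reference, a sketch of how this subset might be selected from the public dataset; the score column name (`feedback_score`) and the single `train` split are assumptions, so check the dataset card for the actual schema:

+ ```python
+ from datasets import load_dataset
+
+ # Load the public KokoroChat dataset (assuming a single "train" split)
+ ds = load_dataset("UEC-InabaLab/KokoroChat", split="train")
+
+ # Keep dialogues with client feedback scores between 70 and 98
+ # ("feedback_score" is a hypothetical column name, see the dataset card)
+ high = ds.filter(lambda ex: 70 <= ex["feedback_score"] <= 98)
+
+ # 90% train / 10% validation, as described above (seed is illustrative)
+ split = high.train_test_split(test_size=0.1, seed=42)
+ train_ds, val_ds = split["train"], split["test"]
+ ```
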
+ ### Hyperparameter Settings

+ - **Optimizer**: `adamw_8bit`
+ - **Warm-up Steps**: `100`
+ - **Learning Rate**: `1e-3`
+ - **Epochs**: `5`
+ - **Batch Size**: `8`
+ - **Validation Frequency**: every 400 steps (see the `TrainingArguments` sketch below)

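+ These settings map directly onto `transformers.TrainingArguments`; a hedged sketch (the output path is illustrative, and argument names follow recent `transformers` releases):

+ ```python
+ from transformers import TrainingArguments
+
+ args = TrainingArguments(
+     output_dir="./kokorochat-high-qlora",  # illustrative path
+     optim="adamw_8bit",                    # bitsandbytes 8-bit AdamW
+     warmup_steps=100,
+     learning_rate=1e-3,
+     num_train_epochs=5,
+     per_device_train_batch_size=8,
+     eval_strategy="steps",                 # validate every 400 steps
+     eval_steps=400,
+     bf16=True,                             # matches the bfloat16 compute above
+ )
+ ```
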
+ ---

  ## 📄 Citation

+ If you use this model or dataset, please cite the following paper:

  ```bibtex
  @inproceedings{qi2025kokorochat,
    year      = {2025},
    url       = {https://github.com/UEC-InabaLab/KokoroChat}
  }
+ ```
+ ---
+
+ ## 🔗 Related
+
+ - 📁 **Dataset**:
+   - [KokoroChat on Hugging Face Datasets](https://huggingface.co/datasets/UEC-InabaLab/KokoroChat)
+   - [KokoroChat on GitHub (UEC-InabaLab)](https://github.com/UEC-InabaLab/KokoroChat)
+ - 🤖 **Model Variants**:
+   - [KokoroChat-Low](https://huggingface.co/UEC-InabaLab/KokoroChat-Low): fine-tuned on **3,870 dialogues** with client feedback scores **< 70**
+   - [KokoroChat-High](https://huggingface.co/UEC-InabaLab/KokoroChat-High) (this model): fine-tuned on **2,601 dialogues** with client feedback scores between **70 and 98**
+ - 📄 **Paper**: [KokoroChat: A Japanese Psychological Counseling Dialogue Dataset (ACL 2025)](https://drive.google.com/file/d/1T6XgvZii8rZ1kKLgOUGqm3BMvqQAvxEM/view?usp=sharing)