---
license: llama2
language:
- ja
tags:
- moe
---

# youri-2x7b_dev

This model is a Mixture of Experts (MoE) merge of the following two models:
- [rinna/youri-7b-instruction](https://huggingface.co/rinna/youri-7b-instruction)
- [rinna/youri-7b-chat](https://huggingface.co/rinna/youri-7b-chat)

## 🏆 Evaluation

All benchmark scores were obtained with the [Stability-AI/lm-evaluation-harness](https://github.com/Stability-AI/lm-evaluation-harness/tree/jp-stable) (`jp-stable` branch).
The raw results are stored in [benchmark_scores](https://huggingface.co/HachiML/youri-2x7b_dev/tree/main/benchmark_scores), which also records the conditions under which each score was obtained.

|                             Model                              |JCommonsenseQA(3-shot,acc.)|JNLI(3-shot,balanced acc.)|MARC-ja(0-shot,balanced acc.)|JSQuAD(2-shot,F1)|4-AVERAGE|
|----------------------------------------------------------------|------:|------:|---------:|-------:|------:|
|[**youri-2x7b_dev**](https://huggingface.co/HachiML/youri-2x7b_dev)|   **91.15**|  **71.03**|     **95.90**|   **91.30**|  **87.34**|
|[youri-7b-instruction](https://huggingface.co/rinna/youri-7b-instruction) *1|  88.83|  63.56|     93.78|    92.19|  84.59|
|[youri-7b-chat](https://huggingface.co/rinna/youri-7b-chat) *1|  91.78|  70.35|     96.69|    79.62|  84.61|

|                             Model                              |jaqket-v2(1-shot,F1)|xlsum(1-shot,ROUGE 2) *2|6-AVERAGE|
|----------------------------------------------------------------|------:|------:|------:|
|[**youri-2x7b_dev**](https://huggingface.co/HachiML/youri-2x7b_dev)|   **84.59**|  **25.62**|  **76.59**|
|[youri-7b-instruction](https://huggingface.co/rinna/youri-7b-instruction) *1|  83.92|  24.67|  75.13|
|[youri-7b-chat](https://huggingface.co/rinna/youri-7b-chat) *1|  83.71|  24.21|  75.33|

|                             Model                              |xwinograd(0-shot,acc.) *2|mgsm(5-shot,acc.) *2|JCoLA(2-shot,balanced acc.) *2|9-AVERAGE|
|----------------------------------------------------------------|------:|------:|---------:|------:|
|[**youri-2x7b_dev**](https://huggingface.co/HachiML/youri-2x7b_dev)|   **81.43**|  **24.80**|     **59.09**|  **69.43**|
|[youri-7b-instruction](https://huggingface.co/rinna/youri-7b-instruction) *1|  78.94|  17.20|     54.04|  66.35|
|[youri-7b-chat](https://huggingface.co/rinna/youri-7b-chat) *1|  80.92|  25.20|     53.78|  67.36|

*1 From [rinna's LM Benchmark](https://rinnakk.github.io/research/benchmarks/lm/index.html).  
*2 rinna's LM Benchmark does not state which template version was used for these tasks, so the scores were computed without specifying a template.
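
The `4-AVERAGE`, `6-AVERAGE`, and `9-AVERAGE` columns appear to be cumulative means over the first 4, 6, and 9 tasks in table order. That reading is an assumption, not something the card states. The sketch below recomputes the averages for youri-2x7b_dev from the rounded values in the tables; small differences in the last digit are expected, presumably because the published averages were computed from unrounded scores.

```python
# Scores for youri-2x7b_dev, copied from the tables above (rounded to 2 decimals).
scores = [
    ("JCommonsenseQA", 91.15),
    ("JNLI", 71.03),
    ("MARC-ja", 95.90),
    ("JSQuAD", 91.30),
    ("jaqket-v2", 84.59),
    ("xlsum", 25.62),
    ("xwinograd", 81.43),
    ("mgsm", 24.80),
    ("JCoLA", 59.09),
]


def running_average(n):
    """Mean of the first n task scores, in table order (assumed definition)."""
    values = [value for _, value in scores[:n]]
    return sum(values) / len(values)


# Averages as reported in the tables, keyed by the number of tasks included.
reported = {4: 87.34, 6: 76.59, 9: 69.43}

for n, expected in reported.items():
    print(f"{n}-AVERAGE: recomputed {running_average(n):.3f}, reported {expected}")
```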

## 🧩 Configuration

The model was built with a custom version of the [mergekit](https://github.com/cg123/mergekit) library (mixtral branch) using the following configuration:

```yaml
base_model: rinna/youri-7b-chat
gate_mode: hidden # one of "hidden", "cheap_embed", or "random"
dtype: bfloat16 # output dtype (float32, float16, or bfloat16)
experts:
  - source_model: rinna/youri-7b-chat
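    positive_prompts:
      # "Take a question and answer choices as input, and select the answer from the choices."
      - "質問と回答の選択肢を入力として受け取り、選択肢から回答を選択してください。"
      # "Answer the relation between the premise and the hypothesis: entailment, contradiction, or neutral."
      - "前提と仮説の関係を含意、矛盾、中立の中から回答してください。"
      # "Classify the following text into either the positive or the negative sentiment class."
      - "以下のテキストを、ポジティブまたはネガティブの感情クラスのいずれかに分類してください。"
      # "Below is an instruction describing a task, paired with an input providing context. Write a response that appropriately fulfills the request."
      - "以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。"
  - source_model: rinna/youri-7b-instruction
    positive_prompts:
      # "Extract a short answer to the question from the title and passage. Answer with a noun."
      - "質問に対する回答を題名と文章から一言で抽出してください。回答は名詞で答えてください。"
      # "Summarize the given news article."
      - "与えられたニュース記事を要約してください。"
      # "Answer whether the given sentence is grammatical."
      - "与えられた文が文法的であるかを回答してください。"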
```

The `positive_prompts` above are taken from the instructions of the benchmarks on which each source model scores best.
Per-model benchmark results are listed in [rinna's LM Benchmark](https://rinnakk.github.io/research/benchmarks/lm/index.html).
Those results show where each individual model performs particularly well, which guided the assignment of prompts to experts.
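
To give an intuition for how `positive_prompts` can steer routing under `gate_mode: hidden`, here is a toy sketch in plain Python. It assumes each expert's gate vector is the average hidden state of its positive prompts and that a token is routed by a softmax over dot products with those vectors. The 3-dimensional vectors and the function names (`init_gate`, `route`) are hypothetical, and the details of mergekit's real implementation are not reproduced here.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_gate(prompt_states_per_expert):
    """One gate vector per expert: the mean hidden state of its positive prompts."""
    return [mean_vector(states) for states in prompt_states_per_expert]


def route(gate, token_state):
    """Routing weights for one token: softmax of gate-vector dot products."""
    logits = [sum(w * h for w, h in zip(row, token_state)) for row in gate]
    return softmax(logits)


# Made-up hidden states standing in for the encoded positive prompts.
chat_states = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
instruction_states = [[0.0, 1.0, 0.0], [0.0, 0.8, 0.2]]

gate = init_gate([chat_states, instruction_states])

# A token whose hidden state resembles the chat expert's prompts.
weights = route(gate, [0.9, 0.1, 0.0])
print(weights)
```

In this toy setup the first expert receives the larger weight because the token's hidden state is closer to the prompts assigned to it.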

## 💻 Usage

```python
# Install dependencies first (in a notebook cell, prefix the command with "!"):
#   pip install -q --upgrade transformers einops accelerate bitsandbytes

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HachiML/youri-2x7b_dev"
torch.set_default_device("cuda")

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name, 
    torch_dtype="auto", 
    load_in_4bit=True, 
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    model_name, 
    trust_remote_code=True
)

# Create input (named input_text to avoid shadowing the built-in input())
instruction = "次の日本語を英語に翻訳してください。"
input_text = "大規模言語モデル（だいきぼげんごモデル、英: large language model、LLM）は、多数のパラメータ（数千万から数十億）を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使用して自己教師あり学習または半教師あり学習によって訓練が行われる。"
prompt = f"""
以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。

### 指示:
{instruction}

### 入力:
{input_text}

### 応答:
"""

# Tokenize the input string
token_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")

# Generate text using the model
with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_new_tokens=200,
        do_sample=True,
        temperature=0.5,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode and print the output
output = tokenizer.decode(output_ids.tolist()[0])
print(output)
```
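
The decoded output from the example above contains the full prompt followed by the generated text. As a minimal sketch, the helpers below rebuild the same prompt layout and keep only the text after the response marker. `build_prompt` and `extract_response` are hypothetical names introduced here for illustration and are not part of the model or of `transformers`.

```python
PROMPT_HEADER = "以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。"
RESPONSE_MARKER = "### 応答:"


def build_prompt(instruction, input_text):
    """Assemble the prompt in the same layout as the usage example."""
    return (
        f"\n{PROMPT_HEADER}\n\n"
        f"### 指示:\n{instruction}\n\n"
        f"### 入力:\n{input_text}\n\n"
        f"{RESPONSE_MARKER}\n"
    )


def extract_response(decoded):
    """Return the text after the first response marker, or all text if absent."""
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    if not marker:
        return decoded.strip()
    return tail.strip()


# Stand-in for decoded model output: the prompt followed by a made-up reply.
example_prompt = build_prompt("次の日本語を英語に翻訳してください。", "こんにちは")
example_output = example_prompt + "Hello"
print(extract_response(example_output))
```

Special tokens such as the end-of-sequence marker may still appear in the extracted text; passing `skip_special_tokens=True` to `tokenizer.decode` removes them.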