HachiML committed 3fdb3e2 (verified · 1 parent: 0a5f59c)

Update README.md

Files changed (1): README.md (+62, −0)

README.md CHANGED
@@ -33,3 +33,65 @@ experts:
The `positive_prompts` in the above configuration are extracted from the instructions of the benchmarks on which each model excels.
For reference on the benchmarks for each model, see the LM Benchmark at [rinnakk's LM Benchmark](https://rinnakk.github.io/research/benchmarks/lm/index.html).
These benchmarks provide a detailed overview of the areas where each individual model performs particularly well, guiding the effective use of the merged model in various natural language processing tasks.
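For readers unfamiliar with the format, such a configuration follows the mergekit-MoE layout, pairing each expert with the `positive_prompts` that initialize its gate. A minimal sketch of the shape only; every model path and prompt below is a hypothetical placeholder, not the actual configuration used for this merge:

```yaml
# Hypothetical sketch of a mergekit-MoE config; names are placeholders.
base_model: path/to/base-model
gate_mode: hidden
experts:
  - source_model: path/to/expert-a
    positive_prompts:
      - "An instruction drawn from a benchmark this expert excels at"
  - source_model: path/to/expert-b
    positive_prompts:
      - "An instruction drawn from a different benchmark"
```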
## 💻 Usage

Here's a [Colab notebook](https://colab.research.google.com/drive/1k6C_oJfEKUq0mtuWKisvoeMHxTcIxWRa?usp=sharing) to run youri-2x7b_dev in 4-bit precision on a free T4 GPU.

```python
!pip install -q --upgrade transformers einops accelerate bitsandbytes

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "HachiML/youri-2x7b_dev"
torch.set_default_device("cuda")

# Load the model in 4-bit precision and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    load_in_4bit=True,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True
)

# Create the input
# instruction: "Translate the following Japanese into English."
instruction = "次の日本語を英語に翻訳してください。"
# input_text: a Japanese paragraph defining "large language model (LLM)"
input_text = "大規模言語モデル(だいきぼげんごモデル、英: large language model、LLM)は、多数のパラメータ(数千万から数十億)を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使用して自己教師あり学習または半教師あり学習によって訓練が行われる。"
# Alpaca-style template: "Below is an instruction that describes a task,
# paired with an input that provides context. Write a response that
# appropriately completes the request." Section markers: 指示 = Instruction,
# 入力 = Input, 応答 = Response.
prompt = f"""
以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。

### 指示:
{instruction}

### 入力:
{input_text}

### 応答:
"""

# Tokenize the input string
token_ids = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")

# Generate text using the model
with torch.no_grad():
    output_ids = model.generate(
        token_ids.to(model.device),
        max_new_tokens=200,
        do_sample=True,
        temperature=0.5,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id
    )

# Decode and print the output
output = tokenizer.decode(output_ids.tolist()[0])
print(output)
```
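The decoded output still contains the full prompt template, not just the model's answer. A small helper like the one below (an assumption for illustration, not part of the original notebook) can strip everything up to the response marker and remove special tokens:

```python
def extract_response(decoded: str, marker: str = "### 応答:") -> str:
    """Return only the text after the response marker, with special tokens removed.

    If the marker is absent, the whole string is returned unchanged (minus tokens).
    """
    # Keep everything after the last occurrence of the response marker
    response = decoded.split(marker)[-1]
    # Drop common Llama-style special tokens left in by tokenizer.decode
    for token in ("<s>", "</s>"):
        response = response.replace(token, "")
    return response.strip()

print(extract_response("### 指示:\n翻訳してください。\n\n### 応答:\nLarge language models are ...</s>"))
```

Alternatively, passing `skip_special_tokens=True` to `tokenizer.decode` removes the special tokens at decode time, leaving only the template to strip.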