saucam committed 22ca2c0 (verified) · 1 parent: 1385939

Create README.md

Files changed: README.md (+139 lines)
---
license: apache-2.0
language:
- en
- ja
---

![](https://raw.githubusercontent.com/saucam/models/main/hikari.png)

# 🏮 Hikari-4x7B

Hikari-4x7B is a MoE franken-model generated by merging the following models:

- [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
- [NousResearch/Hermes-2-Pro-Mistral-7B](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B)
- [Rakuten/RakutenAI-7B-chat](https://huggingface.co/Rakuten/RakutenAI-7B-chat)
- [NeuralNovel/Mistral-7B-Instruct-v0.2-Neural-Story](https://huggingface.co/NeuralNovel/Mistral-7B-Instruct-v0.2-Neural-Story)
### Config used for merging

```yaml
base_model: Rakuten/RakutenAI-7B-chat
experts:
  - source_model: mistralai/Mistral-7B-Instruct-v0.2
    positive_prompts:
      - "chat"
      - "assistant"
      - "tell me"
      - "explain"
      - "I want"
  - source_model: NousResearch/Hermes-2-Pro-Mistral-7B
    positive_prompts:
      - "json"
      - "function calling"
      - "call a function"
      - "story"
      - "roleplay"
      - "logic"
      - "rationality"
      - "mathematics"
      - "solve"
  - source_model: Rakuten/RakutenAI-7B-chat
    positive_prompts:
      - "japanese"
      - "日本語"
      - "話"
      - "日本語で答えて"
      - "日本語版"
      - "日本語で"
      - "Japanese translation"
  - source_model: NeuralNovel/Mistral-7B-Instruct-v0.2-Neural-Story
    positive_prompts:
      - "story"
      - "storywriting"
      - "character"
      - "narrative"
      - "creative"
```
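Roughly speaking, the `positive_prompts` steer the router: each expert's gate is derived from representations of its prompts, so inputs that resemble those prompts get routed to that expert. The toy sketch below illustrates the idea with a bag-of-characters "embedding" as a stand-in for real hidden states; the expert names and routing function are purely illustrative and are not mergekit's actual implementation.

```python
import math

# Toy "embedding": bag-of-characters vector, a purely illustrative
# stand-in for hidden states produced by the base model.
def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha() and ch.isascii():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# Hypothetical expert names with a few positive prompts each,
# mirroring the structure of the config above.
experts = {
    "chat": ["chat", "assistant", "explain"],
    "functions": ["json", "function calling", "solve"],
    "story": ["storywriting", "narrative", "creative"],
}

# Each expert's gate vector is the mean embedding of its positive prompts.
gates = {}
for name, prompts in experts.items():
    embs = [embed(p) for p in prompts]
    gates[name] = [sum(col) / len(embs) for col in zip(*embs)]

def route(text):
    """Return softmax routing weights over experts for an input."""
    e = embed(text)
    scores = {n: sum(a * b for a, b in zip(g, e)) for n, g in gates.items()}
    m = max(scores.values())
    exps = {n: math.exp(s - m) for n, s in scores.items()}
    z = sum(exps.values())
    return {n: v / z for n, v in exps.items()}

print(route("tell me a story"))
```

The real merge computes gate parameters per layer from the base model's hidden states, but the shape of the computation (prompt-derived gate vectors, similarity scores, softmax weights) is the same.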
### Inference

Via transformers:

```python
# pip install -qU transformers bitsandbytes accelerate
from transformers import AutoTokenizer
import transformers
import torch

model = "saucam/Hikari-4x7B"

tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": False},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
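The snippet above keeps 4-bit loading disabled. If GPU memory is tight, the 4x7B weights can instead be loaded in 4-bit via bitsandbytes; a sketch (assuming a CUDA GPU with bitsandbytes installed; the compute dtype here is a common default, not tuned for this model):

```python
import torch
import transformers
from transformers import BitsAndBytesConfig

# Quantization config: load weights in 4-bit, compute in fp16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

pipeline = transformers.pipeline(
    "text-generation",
    model="saucam/Hikari-4x7B",
    model_kwargs={"quantization_config": bnb_config},
)
```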
Sample responses:

```
No chat template is defined for this tokenizer - using the default template for the LlamaTokenizerFast class. If the default is not appropriate for your model, please set `tokenizer.chat_template` to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information.


<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Explain what a Mixture of Experts is in less than 100 words. [/INST] A Mixture of Experts (MoE) is a neural network architecture that was introduced in the 1990s. It is based on the idea of having a set of expert networks, each specialized in a particular part of the input space, and then combining their outputs to produce a final prediction.

The Mixture of Experts architecture is well-suited for tasks such as speech recognition and natural language processing, where the input data can be divided into distinct categories or classes, and each expert network can be trained to specialize in one of these categories.

The combining mechanism in a MoE network can be a simple weighted average or a more complex function, such as a Bayesian mixture model. The weights assigned to each expert network are learned during training, based on the accuracy of their predictions.

In summary, a Mixture of Experts is a neural network architecture that combines the strengths of multiple specialized networks to produce a more accurate prediction than any individual network alone.

I hope this helps! Let me know if you have any other questions. 😊

```
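The gate-weighted combination the model describes is simple enough to write down directly. A minimal sketch with two toy scalar experts and a softmax gate (purely illustrative; the gate and expert parameters are made up):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

# Two toy "experts": each is just a linear function of the scalar input.
experts = [lambda x: 2.0 * x, lambda x: -1.0 * x]

# Toy gate: scores each expert on the input, then normalizes with softmax.
def gate(x):
    return softmax([0.5 * x, -0.5 * x])

def moe(x):
    """Mixture output: gate-weighted sum of the expert outputs."""
    w = gate(x)
    return sum(wi * f(x) for wi, f in zip(w, experts))

print(moe(1.0))
```

In a real MoE LLM like this one, the experts are feed-forward blocks, the gate is a learned linear layer over hidden states, and often only the top-k experts are evaluated per token, but the weighted-sum structure is the same.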
Interestingly, when asked to answer in Japanese, the model responds in a mix of Japanese and English:

```
<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Explain what a Mixture of Experts is in less than 100 words. Answer in japanese [/INST] 「Mixture of Experts」とは、機械学習アルGorithmの一種である「Experts」に複数の専門知識を持つモデルをを組み合わせる学習方法である。

この手法は especially useful in situations where、各分野 has a high degree of specialization and the data set is large、 and it can be used to improve the performance of the model by combining the knowledge of various experts.

The「Mixture of Experts」alGorithm works by dividing the input space into multiple partitions and assigning an "expert" to each partition. The "experts" are then trained on their respective partitions of the data, and at prediction time, the input is passed to all the "experts," and their predictions are combined to produce the final output.

The「Mixture of Experts」alGorithm is used in various applications, such as speech recognition、object recognition、 and natural language processing.

[1] Jordan, M. I. (1986). The use of mixtures of experts for adaptive parametric recognition. Cognitive Science, 10(2), 171-206.

```