| A very capable chat model built on top of the new Mistral MoE model, fine-tuned with QLoRA on the SlimOrca dataset for 1 epoch. | |
| Inference: | |
| ```py | |
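| import torch | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| # Load the model; device_map="auto" places the weights on the available devices | |
| model = AutoModelForCausalLM.from_pretrained("mattshumer/mistral-8x7b-chat", low_cpu_mem_usage=True, device_map="auto", trust_remote_code=True) | |
| tok = AutoTokenizer.from_pretrained("mattshumer/mistral-8x7b-chat") | |
| # Replace PROMPT_GOES_HERE with a prompt string that follows the template below | |
| x = tok.encode(PROMPT_GOES_HERE, return_tensors="pt").to(model.device) | |
| # Generate up to 512 new tokens, then move the result back to the CPU for decoding | |
| x = model.generate(x, max_new_tokens=512).cpu() | |
| print(tok.batch_decode(x)) | |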
| ``` | |
| Prompt Template: | |
| ``` | |
| <|im_start|>system | |
| You are an AI assistant.<|im_end|> | |
| <|im_start|>user | |
| Hi, how are you?<|im_end|> | |
| <|im_start|>assistant | |
| I'm doing well, thanks for asking!<|im_end|> | |
| <|im_start|>user | |
| Write me a poem about AI.<|im_end|> | |
| ``` | |
| Trained with Axolotl on 6x H100s for nine hours. |
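
The prompt template above can be assembled programmatically rather than by hand. The following is a minimal sketch, not the author's code: `build_prompt`, `IM_START`, `IM_END`, and the `add_generation_prompt` flag are hypothetical names introduced here for illustration. It assumes each turn ends with a newline after `<|im_end|>`, and that a trailing `<|im_start|>assistant` header is appended to cue the model's reply. The template does not show that trailing header, so treat it as an assumption.

```python
from typing import Dict, List

# Special tokens copied from the prompt template above.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_prompt(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Format a list of {"role", "content"} dicts into the template shown above.

    Hypothetical helper: the trailing assistant header is an assumption,
    not something the model card states.
    """
    # One block per turn: <|im_start|>role, newline, content, <|im_end|>, newline.
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    prompt = "".join(parts)
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply (assumed).
        prompt += f"{IM_START}assistant\n"
    return prompt


# The conversation from the template above.
messages = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "Hi, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thanks for asking!"},
    {"role": "user", "content": "Write me a poem about AI."},
]

prompt = build_prompt(messages)
print(prompt)
```

The resulting `prompt` string can be passed in place of `PROMPT_GOES_HERE` in the inference snippet. If generation behaves oddly, try `add_generation_prompt=False` to reproduce the template exactly as written.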