
	---
	base_model: unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit
	language:
	- en
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- mistral
	- trl
	---

	# Uploaded model

	- Developed by: EpistemeAI2
	- License: apache-2.0
	- Finetuned from model: unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit

	This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

	# Model Card for EpistemeAI2's Fireball-Mistral-Nemo-Instruct-emo-PHD, a fine-tuned Mistral-Nemo-Instruct-2407

	EpistemeAI2's Fireball-Mistral-Nemo-Instruct-emo-PHD is a fine-tuned version of the Mistral-Nemo-Instruct-2407 Large Language Model (LLM), which is itself an instruct fine-tuned version of [Mistral-Nemo-Base-2407](https://huggingface.co/mistralai/Mistral-Nemo-Base-2407). Mistral Nemo was trained jointly by Mistral AI and NVIDIA, and it significantly outperforms existing models of smaller or similar size.

	For more details about the underlying Mistral Nemo model, please refer to the Mistral AI release [blog post](https://mistral.ai/news/mistral-nemo/).

	## Key features
	- Released under the Apache 2.0 License
	- Pre-trained and instructed versions
	- Trained with a 128k context window
	- Trained on a large proportion of multilingual and code data
	- Drop-in replacement for Mistral 7B

	## Model Architecture
	Mistral Nemo is a transformer model, with the following architecture choices:
	- Layers: 40
	- Dim: 5,120
	- Head dim: 128
	- Hidden dim: 14,336
	- Activation Function: SwiGLU
	- Number of heads: 32
	- Number of kv-heads: 8 (GQA)
	- Vocabulary size: 2**17 ~= 128k
	- Rotary embeddings (theta = 1M)
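
The figures above imply a few derived quantities that are easy to check. The following is a minimal sketch in plain Python; the 2-byte element size (bf16) and the reading of "128k" as 131,072 tokens are assumptions made for illustration, not values stated in this card, and the KV-cache figure is a rough estimate only.

```python
# Architecture values copied from the list above.
n_layers = 40
dim = 5120
head_dim = 128
n_heads = 32
n_kv_heads = 8
vocab_size = 2**17

# Assumptions for illustration only (not stated in this card).
bytes_per_value = 2          # bf16 / fp16 storage
context_tokens = 128 * 1024  # reading "128k" as 131,072 tokens

# With GQA, several query heads share one key/value head.
queries_per_kv_head = n_heads // n_kv_heads

# Total width of the attention heads; here it differs from the model dim.
attn_width = n_heads * head_dim

# KV cache per token: one key and one value vector per layer per kv-head.
kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
kv_gib_full_context = kv_bytes_per_token * context_tokens / 2**30

print(f"vocabulary size:         {vocab_size}")
print(f"query heads per kv-head: {queries_per_kv_head}")
print(f"attention width vs dim:  {attn_width} vs {dim}")
print(f"KV cache per token:      {kv_bytes_per_token} bytes")
print(f"KV cache, full context:  {kv_gib_full_context} GiB")
```

Under these assumptions, grouped-query attention with 8 kv-heads keeps the cache at a quarter of what 32 full key/value heads would need.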

	## Training data

	Fireball-Mistral-Nemo-Instruct-emo-PHD is fine-tuned on simulated-emotion data and on philosophy data covering deductive reasoning, math, and science.


	### Mistral Inference

	#### Install

	```sh
	pip install mistral_inference
	```

	#### Download

	```py
	from pathlib import Path
	from huggingface_hub import snapshot_download
	# Local folder that will hold the downloaded model files
	mistral_models_path = Path.home().joinpath('mistral_models', 'Nemo-Instruct')
	mistral_models_path.mkdir(parents=True, exist_ok=True)
	snapshot_download(
	    repo_id="EpistemeAI2/Fireball-Mistral-Nemo-Instruct-emo-PHD",
	    allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"],
	    local_dir=mistral_models_path,
	)
	```
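
After the download finishes, it can be useful to confirm that the three files requested above are actually present before loading the model. This is a minimal sketch using only the standard library; `missing_files` is a hypothetical helper written for this example, not part of `mistral_inference` or `huggingface_hub`.

```python
from pathlib import Path

# File names taken from the allow_patterns list in the download step.
EXPECTED_FILES = ("params.json", "consolidated.safetensors", "tekken.json")


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Same folder as in the download step.
    model_dir = Path.home() / "mistral_models" / "Nemo-Instruct"
    missing = missing_files(model_dir)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```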



	### Transformers

	> [!IMPORTANT]
	> NOTE: Until a new release has been made, you need to install transformers from source:
	> ```sh
	> pip install git+https://github.com/huggingface/transformers.git
	> ```

	If you want to use Hugging Face `transformers` to generate text, you can do something like this:

	```py
	from transformers import pipeline

	messages = [
	    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
	    {"role": "user", "content": "Who are you?"},
	]
	# The text-generation pipeline accepts a list of chat messages directly
	chatbot = pipeline("text-generation", model="EpistemeAI2/Fireball-Mistral-Nemo-Instruct-emo-PHD", max_new_tokens=128)
	chatbot(messages)
	```
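
When a list of chat messages is passed in, recent `transformers` releases return the whole conversation under `generated_text`, with the model's reply as the last message. The sketch below shows one way to pull out that reply. `last_assistant_reply` is a hypothetical helper, and `sample_output` is a hand-written stand-in for the result of `chatbot(messages)`; its reply text is a placeholder, not real model output.

```python
def last_assistant_reply(pipeline_output):
    """Return the content of the final assistant message in a chat pipeline result."""
    conversation = pipeline_output[0]["generated_text"]
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    raise ValueError("no assistant message found in pipeline output")


# Hand-written stand-in with the same shape as the chat pipeline output.
sample_output = [
    {
        "generated_text": [
            {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "Arr, placeholder reply."},
        ]
    }
]

print(last_assistant_reply(sample_output))
```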

	## Function calling with `transformers`

	To use this example, you'll need `transformers` version 4.42.0 or higher. Please see the
	[function calling guide](https://huggingface.co/docs/transformers/main/chat_templating#advanced-tool-use--function-calling)
	in the `transformers` docs for more information.

	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "EpistemeAI2/Fireball-Mistral-Nemo-Instruct-emo-PHD"
	tokenizer = AutoTokenizer.from_pretrained(model_id)


	def get_current_weather(location: str, format: str):
	    """
	    Get the current weather

	    Args:
	        location: The city and state, e.g. San Francisco, CA
	        format: The temperature unit to use. Infer this from the user's location. (choices: ["celsius", "fahrenheit"])
	    """
	    pass


	conversation = [{"role": "user", "content": "What's the weather like in Paris?"}]
	tools = [get_current_weather]

	# format and tokenize the tool use prompt
	inputs = tokenizer.apply_chat_template(
	    conversation,
	    tools=tools,
	    add_generation_prompt=True,
	    return_dict=True,
	    return_tensors="pt",
	)

	model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
	# move the input tensors to the same device as the model
	inputs = inputs.to(model.device)
	outputs = model.generate(**inputs, max_new_tokens=1000)
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```

	Note that, for reasons of space, this example does not show a complete cycle of calling a tool and adding the tool call and tool
	results to the chat history so that the model can use them in its next generation. For a full tool calling example, please
	see the [function calling guide](https://huggingface.co/docs/transformers/main/chat_templating#advanced-tool-use--function-calling),
	and note that Mistral does use tool call IDs, so these must be included in your tool calls and tool results. They should be
	exactly 9 alphanumeric characters.
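
As a rough illustration of the steps left out, the sketch below builds the two chat-history entries described above from plain dictionaries, following the message layout used in the `transformers` function calling guide. `new_tool_call_id` is a hypothetical helper, and the tool arguments and the `"22.0"` result are made-up placeholder values rather than model or API output.

```python
import secrets
import string

ID_ALPHABET = string.ascii_letters + string.digits


def new_tool_call_id(length=9):
    """Generate a random alphanumeric ID (9 characters by default)."""
    return "".join(secrets.choice(ID_ALPHABET) for _ in range(length))


conversation = [{"role": "user", "content": "What's the weather like in Paris?"}]

# Placeholder for a tool call parsed from the model's output.
tool_call_id = new_tool_call_id()
tool_call = {
    "name": "get_current_weather",
    "arguments": {"location": "Paris, France", "format": "celsius"},
}

# 1. Record the assistant's tool call in the chat history.
conversation.append(
    {
        "role": "assistant",
        "tool_calls": [{"id": tool_call_id, "type": "function", "function": tool_call}],
    }
)

# 2. Record the tool result, reusing the same ID so the two can be matched.
conversation.append(
    {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "name": "get_current_weather",
        "content": "22.0",  # placeholder result
    }
)

print(conversation[-2:])
```

The extended `conversation` can then be passed to `tokenizer.apply_chat_template` again, with the same `tools` list, so the model can produce its final answer.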

	> [!TIP]
	> Unlike previous Mistral models, Mistral Nemo requires smaller temperatures. We recommend using a temperature of 0.3.
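
To see what a lower temperature does in practice, the sketch below applies temperature scaling to a softmax over three toy logits. The logits are arbitrary illustrative values, not taken from the model, and `softmax_with_temperature` is a hypothetical helper written for this example.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # arbitrary values for illustration

for temperature in (1.0, 0.3):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At 0.3 the probability mass concentrates on the highest-scoring token, so sampling becomes less random. With `transformers`, the recommended value can be passed at generation time together with sampling enabled, for example `model.generate(**inputs, do_sample=True, temperature=0.3)`.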