|
|
--- |
|
|
base_model: unsloth/phi-3.5-mini-instruct-bnb-4bit |
|
|
tags: |
|
|
- text-generation-inference |
|
|
- transformers |
|
|
- unsloth |
|
|
- llama |
|
|
- trl |
|
|
license: mit |
|
|
language: |
|
|
- en |
|
|
--- |
|
|
# Model Summary |
|
|
DeepPhi-3.5-mini-instruct is a reasoning-focused Phi model and a top performer for its size of 3.8B parameters.
|
|
Phi-3 was trained on synthetic data and filtered publicly available website data, with a focus on very high-quality, reasoning-dense data. The model belongs to the Phi-3 model family and supports a 128K token context length.
|
|
|
|
|
# Run locally |
|
|
|
|
|
### 4bit |
|
|
After obtaining the model checkpoint, users can run inference with the following sample code, which loads the model in 4-bit using bitsandbytes.
|
|
|
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig

torch.random.manual_seed(0)

model_path = "EpistemeAI/DeepPhi-3.5-mini-instruct"

# Configure 4-bit quantization using bitsandbytes
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # You can also try "fp4" if desired.
    bnb_4bit_compute_dtype=torch.float16,  # Or torch.bfloat16 depending on your hardware.
)

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
    quantization_config=quantization_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

messages = [
    {"role": "system", "content": """
You are a helpful AI assistant. Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>"""},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"},
]

# Flatten the chat messages into a single plain-text prompt.
def format_messages(messages):
    prompt = ""
    for msg in messages:
        role = msg["role"].capitalize()
        prompt += f"{role}: {msg['content']}\n"
    return prompt.strip()

prompt = format_messages(messages)

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,  # Greedy decoding; temperature is ignored when sampling is off.
}

output = pipe(prompt, **generation_args)
print(output[0]['generated_text'])
```
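The helper above flattens the conversation into a plain `Role: content` prompt. The base Phi-3.5-mini-instruct tokenizer ships a chat template, so, assuming this fine-tune preserves it, you can let the tokenizer build the prompt instead. This is a minimal sketch that reuses the `tokenizer`, `pipe`, `messages`, and `generation_args` objects defined above:

```python
# Optional: build the prompt with the tokenizer's chat template instead of format_messages.
# Assumes the fine-tune keeps the base model's chat template (roles: system/user/assistant).
chat_prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # Return a formatted string rather than token ids.
    add_generation_prompt=True,  # Append the assistant turn marker so the model continues.
)

output = pipe(chat_prompt, **generation_args)
print(output[0]["generated_text"])
```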
|
|
|
|
|
# Uploaded model |
|
|
|
|
|
- **Developed by:** EpistemeAI |
|
|
- **License:** apache-2.0 |
|
|
- **Finetuned from model:** unsloth/phi-3.5-mini-instruct-bnb-4bit
|
|
|
|
|
This Phi-3.5 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
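For readers who want to reproduce or extend the fine-tune, here is a minimal sketch of loading the base checkpoint with Unsloth and wrapping it in TRL's `SFTTrainer`. The dataset file (`train.jsonl` with a `text` column) and all hyperparameters are placeholders, exact `SFTTrainer` arguments vary by TRL version, and this is not the exact training recipe used for this model.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit base model with Unsloth's fast kernels.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/phi-3.5-mini-instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Placeholder dataset: a local JSONL file where each record has a "text" field.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```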
|
|
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |