3a9c3fc5dd2478d36f41dc0609b6521fd14e3ae4d1aad35baa1c159e2c539240
- pytorch_model.bin/p150.model.layers.16.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p151.model.layers.16.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p152.model.layers.16.input_layernorm.weight +3 -0
- pytorch_model.bin/p153.model.layers.16.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p154.model.layers.17.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p155.model.layers.17.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p156.model.layers.17.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p157.model.layers.17.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p158.model.layers.17.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p159.model.layers.17.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p16.model.layers.1.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p160.model.layers.17.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p161.model.layers.17.input_layernorm.weight +3 -0
- pytorch_model.bin/p162.model.layers.17.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p163.model.layers.18.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p164.model.layers.18.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p165.model.layers.18.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p166.model.layers.18.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p167.model.layers.18.mlp.gate_proj.weight +3 -0
pytorch_model.bin/p150.model.layers.16.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:025794f5c32d8898a68da81d565db205e223d0ce2eaaabd9084c35c3f9bd5dd4
+size 234881910
pytorch_model.bin/p151.model.layers.16.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:63018e4557735d5747947cf78a39d66b0378d1d9bda89a970e5d909658f66e5b
+size 234881916
pytorch_model.bin/p152.model.layers.16.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba14717df190ed288576d405060aac43348be63720131e713ae6cb8e70d438a3
+size 17282
pytorch_model.bin/p153.model.layers.16.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:074e9422861c9f91c697ff8cfd36ebe99cc2df028dbc562cc0ca34c44a7ba44e
+size 17309
pytorch_model.bin/p154.model.layers.17.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82586c82e6097e86bfc6718cb68d8f8e8292eae323d8c056a5e04b8cdfa809cd
+size 67109765
pytorch_model.bin/p155.model.layers.17.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f3fe7444ecbba47b1d751d6bf2c1d62680110e233547c132bdfa31c2bacd8b27
+size 16778117
pytorch_model.bin/p156.model.layers.17.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7089b6f60df8a5a3329009d961f043636382f3466c96bc3ef913e3fa2f715768
+size 16778117
pytorch_model.bin/p157.model.layers.17.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6490ae5bf7385966e131b239ed0087daa2c2a57e93433e242ac714f889d9352c
+size 67109765
pytorch_model.bin/p158.model.layers.17.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c1b5a90a3416b757a22fbeee007406f5ecf0af86547cbcdec9dc3c944e515023
+size 234881916
pytorch_model.bin/p159.model.layers.17.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:93a768630dd036f50548283f4c991dc1a87a499e4f9a1e098cb1ba75061c1bb3
+size 234881910
pytorch_model.bin/p16.model.layers.1.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ef3f917b7b5dcf8041d4bd835516bff2636d2368490323f016537127b28df841
+size 234881910
pytorch_model.bin/p160.model.layers.17.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ad87b46bf02f172a418577c47c1356ed0c3826a1fe4b5e1cdf7e26921894fc28
+size 234881916
pytorch_model.bin/p161.model.layers.17.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:80f7433b570ee97438c33ae07b8634042ae6b6d68dedc315403bc7effdbb7da0
+size 17282
pytorch_model.bin/p162.model.layers.17.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ca2d4d840ce5f3a0b8615010aff98917ff663e45ea6c4e2815707c2659f6818c
+size 17309
pytorch_model.bin/p163.model.layers.18.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:85062d58e29f3f2ae295d688ae4184690bc18cd7c9ce6e2265bface9a3ddf989
+size 67109765
pytorch_model.bin/p164.model.layers.18.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:62fb8501dfddd6037b7f189485042a88ee9367dfcbea88ae2e1db3c1a63ffbc1
+size 16778117
pytorch_model.bin/p165.model.layers.18.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2a6fba6d56fb36bea878c6c0ba66ea10c1f2b8551345aaea90b18cb0156d87df
+size 16778117
pytorch_model.bin/p166.model.layers.18.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25c4370ccc9a1357c36e7c8d6009017576453f7c0134d4f31cbb38b021af0b06
+size 67109765
pytorch_model.bin/p167.model.layers.18.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5995cd642c3af096eb0dde7b664584b79cb436121bb757065ca3f8d811f51417
+size 234881916
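Each file added in this commit is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the tensor data itself. As a minimal sketch of how such a pointer could be read, the Python below parses the pointer text for the last file in this diff. The names `LfsPointer` and `parse_lfs_pointer` are hypothetical helpers introduced for illustration; they are not part of this repository or of any Git LFS tooling.

```python
from typing import NamedTuple


class LfsPointer(NamedTuple):
    """Fields of a Git LFS pointer file (hypothetical helper type)."""
    version: str
    hash_algo: str
    digest: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer into an LfsPointer."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is a key, a single space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid value has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


# Pointer content copied from the p167 gate_proj entry in this diff.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:5995cd642c3af096eb0dde7b664584b79cb436121bb757065ca3f8d811f51417
size 234881916
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer.hash_algo, pointer.size)
```

The `size` field is the byte length of the real object stored in LFS, so comparing it with the size of a downloaded file is a cheap sanity check before hashing the file and comparing against `digest`.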