e5bbc6ac1518183adffe256ac9df49f53b150c0d395ea45f14d8d97b684e7541
- pytorch_model.bin/p187.model.layers.20.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p188.model.layers.20.input_layernorm.weight +3 -0
- pytorch_model.bin/p189.model.layers.20.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p19.model.layers.2.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p190.model.layers.21.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p191.model.layers.21.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p192.model.layers.21.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p193.model.layers.21.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p194.model.layers.21.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p195.model.layers.21.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p196.model.layers.21.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p197.model.layers.21.input_layernorm.weight +3 -0
- pytorch_model.bin/p198.model.layers.21.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p199.model.layers.22.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p2.model.layers.0.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p20.model.layers.2.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p200.model.layers.22.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p201.model.layers.22.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p202.model.layers.22.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p203.model.layers.22.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p204.model.layers.22.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p205.model.layers.22.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p206.model.layers.22.input_layernorm.weight +3 -0
- pytorch_model.bin/p207.model.layers.22.post_attention_layernorm.weight +3 -0
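The entries above appear in lexicographic order, which is why p19 sits between p189 and p190 and p2 between p199 and p20. The following is a minimal sketch for listing such names in numeric order instead. It assumes the `pN` prefix is an integer index; the source does not state this, and `shard_index` is a hypothetical helper name.

```python
import re

# A few paths copied from the file list above (subset, for brevity).
paths = [
    "pytorch_model.bin/p187.model.layers.20.mlp.down_proj.weight",
    "pytorch_model.bin/p19.model.layers.2.self_attn.q_proj.weight",
    "pytorch_model.bin/p190.model.layers.21.self_attn.q_proj.weight",
    "pytorch_model.bin/p2.model.layers.0.self_attn.k_proj.weight",
    "pytorch_model.bin/p20.model.layers.2.self_attn.k_proj.weight",
    "pytorch_model.bin/p200.model.layers.22.self_attn.k_proj.weight",
]


def shard_index(path: str) -> int:
    """Return N from a file name of the form 'pN.<parameter path>'.

    Assumption: the leading 'pN' is an integer index.
    """
    basename = path.rsplit("/", 1)[-1]
    match = re.match(r"p(\d+)\.", basename)
    if match is None:
        raise ValueError(f"unexpected file name: {path!r}")
    return int(match.group(1))


# Sort by the numeric index rather than by the raw string.
ordered = sorted(paths, key=shard_index)
for path in ordered:
    print(path)
```

Sorting on the extracted integer avoids the string comparison that places "p19" before "p190".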
pytorch_model.bin/p187.model.layers.20.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2b35b3d21e9d152f4e379513beb1f118477915187fdafc241b95f68362f4cdda
+size 234881916
pytorch_model.bin/p188.model.layers.20.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e602df310ed864b0fef4e102d2c27729a934554406bad2a47b36becaa8133860
+size 17282
pytorch_model.bin/p189.model.layers.20.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:36b08bffc6a977caecac5a9b75a915484c313e28327327526cf1391ac4b401e0
+size 17309
pytorch_model.bin/p19.model.layers.2.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:34faf042dc3e0e764d535021c3a5baf5b180f792b05240641cce28e8071ea551
+size 67109759
pytorch_model.bin/p190.model.layers.21.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:203b554798af6a3ba5159ed483a01d5749e245f5b63a58de6da1562a794e0aa8
+size 67109765
pytorch_model.bin/p191.model.layers.21.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dcb370d37571f461d3e0e10e02edb23a1125f56725dab8945dba880b5fe7977e
+size 16778117
pytorch_model.bin/p192.model.layers.21.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:789731477e4ba2b4fe00b384dacb1dd9808b13b50917a476842e14ac8187233c
+size 16778117
pytorch_model.bin/p193.model.layers.21.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:74719cda63dd0bf6c5d8bc8ebe1fed0b036a415c2587151a6bb878b56ef6fb6d
+size 67109765
pytorch_model.bin/p194.model.layers.21.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:91f63290ff49ec1afeaa5b20f65d084cd3d58dc19ce9bc2f46b9e0911d67419c
+size 234881916
pytorch_model.bin/p195.model.layers.21.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ea439ab881b26523ab7335105ab5e11d90f22c2d839e2ac979a62d91cdc8e9e3
+size 234881910
pytorch_model.bin/p196.model.layers.21.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c68ecc52d9c96e8934062b62a6f65a21e095f0209c8fb6495f0e8ec9c772dcab
+size 234881916
pytorch_model.bin/p197.model.layers.21.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:62239a789e338e0dc3b2402adb49a723881601babec999bed7b89b8a7da2e17b
+size 17282
pytorch_model.bin/p198.model.layers.21.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:871e7fedb488f26423694d5df3457a315d19bd9337c3ec9604c87f6388f0446e
+size 17309
pytorch_model.bin/p199.model.layers.22.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d723fd506dd34e5aaa60a93ead530c2828ddb7ec92f76061bd2b92b46f6e2872
+size 67109765
pytorch_model.bin/p2.model.layers.0.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:532904415395bce99a219cc7c1f4f100796ad2ae36e9de6b7752643cd43628f3
+size 16778108
pytorch_model.bin/p20.model.layers.2.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ee8a3b701e64834bf45b7f2949bc2c5e5ad6421a15e2052f366df73b7c84b520
+size 16778111
pytorch_model.bin/p200.model.layers.22.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:57d304a40597e76bee146d22a5ea7079e490a2b8bc7216dba270994b630ed2ad
+size 16778117
pytorch_model.bin/p201.model.layers.22.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a73c0f25b29dc2882b91d2e6a1c6ec3474e3242c577725a89e5bc3eb51d9463
+size 16778117
pytorch_model.bin/p202.model.layers.22.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:636ee3a06d8ca2571ddd3ad7dfea0d7ed3666391eab84fb29ad1f86f8c7c97f9
+size 67109765
pytorch_model.bin/p203.model.layers.22.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:049fb82ee7a6c0ce7c9084968d4b9912dc775e4c66c992d9aeb928e13778dcbe
+size 234881916
pytorch_model.bin/p204.model.layers.22.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0eb240a5b15390d8c35a43c163adb7a0e97c41d9a4ad729c71464cdc3c9862d4
+size 234881910
pytorch_model.bin/p205.model.layers.22.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:276d9a2160c4139e4299cad90be99cea3ac337039d6b72d85aa5a4a7e70dd54f
+size 234881916
pytorch_model.bin/p206.model.layers.22.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3254f17c8c94ba733963ad697eaee36394718c60ce1fbf049252b9de351e916d
+size 17282
pytorch_model.bin/p207.model.layers.22.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3275fe61e10ca5cb81e0a71d9bd48c5cec859295813b0f3913c3a58bdf98b047
+size 17309
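Each file added in this commit is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the weight data itself. As a rough sketch of how such a pointer might be read, the snippet below parses the first pointer shown above. The function name `parse_lfs_pointer` and the returned dictionary layout are illustrative assumptions, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file.

    Hypothetical helper for illustration; it handles only the three
    keys that appear in this commit (version, oid, size).
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        fields[key] = value

    # The oid has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer contents copied from the first diff in this commit.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:2b35b3d21e9d152f4e379513beb1f118477915187fdafc241b95f68362f4cdda
size 234881916
"""

info = parse_lfs_pointer(pointer_text)
print(info["hash_algorithm"], info["size"])
```

The `size` field gives the byte length of the stored object, so the pointers alone are enough to total the download size without fetching any weights.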