337de27f9a06315532f84748ac3f0a75f8f7683d88f5ca7e3659b7b538c07980
- pytorch_model.bin/p75.model.layers.8.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p76.model.layers.8.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p77.model.layers.8.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p78.model.layers.8.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p79.model.layers.8.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p8.model.layers.0.input_layernorm.weight +3 -0
- pytorch_model.bin/p80.model.layers.8.input_layernorm.weight +3 -0
- pytorch_model.bin/p81.model.layers.8.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p82.model.layers.9.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p83.model.layers.9.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p84.model.layers.9.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p85.model.layers.9.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p86.model.layers.9.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p87.model.layers.9.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p88.model.layers.9.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p89.model.layers.9.input_layernorm.weight +3 -0
- pytorch_model.bin/p9.model.layers.0.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p90.model.layers.9.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p91.model.layers.10.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p92.model.layers.10.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p93.model.layers.10.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p94.model.layers.10.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p95.model.layers.10.mlp.gate_proj.weight +3 -0
pytorch_model.bin/p75.model.layers.8.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:505db63590afe38c252ad6332f4140aced32f753528c5c1a53ed08aeea38d11d
+size 16778111
pytorch_model.bin/p76.model.layers.8.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:70545f73c2b19587a3ebf0caae2fd05d39fa9eeb7348981f429a182ff81b0d68
+size 67109759
pytorch_model.bin/p77.model.layers.8.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:985e73bb265822d62292fe099cae1c20f0d0e7641ea6a8f3437eb0ee05a8757f
+size 234881910
pytorch_model.bin/p78.model.layers.8.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:af1b2e46d25d988be3bcbadec63b0a85ec3aaa8c71eccb9196b51133f798d2ef
+size 234881904
pytorch_model.bin/p79.model.layers.8.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:983bc3628a8c4d15f401c66c869c3da1bdedf5f5ee62eb3cba16e36fdc5035ff
+size 234881910
pytorch_model.bin/p8.model.layers.0.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:42ddaea281402d4d0544c0130cd0522ee71c43778b458b9b1b20f44ac43940ea
+size 17273
pytorch_model.bin/p80.model.layers.8.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:60d09446b96046be239e180153b89133c921e0ad92aff958ac59b1d40887c0d7
+size 17276
pytorch_model.bin/p81.model.layers.8.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f8b9cfb2c196337ed8ac9e19f98f412586f1bc5964cf9e544c846267c94090a0
+size 17303
pytorch_model.bin/p82.model.layers.9.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b4623f5bc4dba6e98238e7b5b0536b9a291742b736cebda6cb50cfa7cee4f7ee
+size 67109759
pytorch_model.bin/p83.model.layers.9.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6d5ded2a52ac1dce1022ca5446ee720d4b692844f211d16f5f6458aa271c210d
+size 16778111
pytorch_model.bin/p84.model.layers.9.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:181b6be65114cf78260ddfab87d1b4c42e9cc292934fb8de3addd23aa4d3feb6
+size 16778111
pytorch_model.bin/p85.model.layers.9.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c701c5cb4d21556498ceaf86395917e8e6bc07bd6a327413d0835e44f25f010c
+size 67109759
pytorch_model.bin/p86.model.layers.9.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2fdd719fb7e550544b201bff47ba66a1851cc4d6dea6a29cd119dde5ba7c509e
+size 234881910
pytorch_model.bin/p87.model.layers.9.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:303a0a2edd14201c7d6458646bfea20a9d2613080ab37f496dc6cf3467f6728c
+size 234881904
pytorch_model.bin/p88.model.layers.9.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8ef038d286671cdceeace6250aae7f87ef0c0c17897e24a45e409ca73eb768e1
+size 234881910
pytorch_model.bin/p89.model.layers.9.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2246eea7cab2f07cd25e72c7ee758a9360f2aa2c15383cc143ebd95cfbe55211
+size 17276
pytorch_model.bin/p9.model.layers.0.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2baec96240f28a5df670607960f1f92e44a2a43485100b4a671d0cfa65276f8e
+size 17300
pytorch_model.bin/p90.model.layers.9.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:43618f69ee0ecabf32cf4542c1e810d1597fc37443c83fafefefa40ba58470c1
+size 17303
pytorch_model.bin/p91.model.layers.10.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ba75e1b72d108d28bbf78ea6ef47fe847ffb71265ec37583fac094e5d42be058
+size 67109762
pytorch_model.bin/p92.model.layers.10.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:092998a473969813be8fb26393f7694d7492866e2e08956300abb29c9fb09aef
+size 16778114
pytorch_model.bin/p93.model.layers.10.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e1368de0fbe1dab0602328ec25a5e674072a374cabee90991a2b2e2f03b7fd86
+size 16778114
pytorch_model.bin/p94.model.layers.10.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:15791f868e1df1537da510ea19d4c3134a8b23532d422be3ad4221ba19c7826d
+size 67109762
pytorch_model.bin/p95.model.layers.10.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f26a91712194caf8ceb23761dfd1e2ebce66bdf9e4ca1745ba7e77558db95c2c
+size 234881913
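Each file added in this commit is a three-line Git LFS pointer (version, oid, size) rather than the weight data itself. As a minimal sketch of how such a pointer could be read and used to check a downloaded shard, the Python below parses the pointer text and compares a byte string against the recorded size and SHA-256 digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library; the pointer text is copied from the p8 entry above.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict.

    Hypothetical helper for illustration; assumes the three fields
    shown in the diffs above (version, oid, size) are present.
    """
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and SHA-256 digest the pointer records."""
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer text taken from pytorch_model.bin/p8.model.layers.0.input_layernorm.weight.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:42ddaea281402d4d0544c0130cd0522ee71c43778b458b9b1b20f44ac43940ea
size 17273
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["algo"], pointer["size"])
```

To check a real shard, one would read the downloaded file in binary mode and pass its bytes to `matches_pointer`; for the larger shards in this commit, hashing in chunks would likely be preferable to loading the whole file at once.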