Instructions to use aws-neuron/Mistral-neuron with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use aws-neuron/Mistral-neuron with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="aws-neuron/Mistral-neuron")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("aws-neuron/Mistral-neuron")
model = AutoModelForCausalLM.from_pretrained("aws-neuron/Mistral-neuron")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use aws-neuron/Mistral-neuron with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "aws-neuron/Mistral-neuron"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aws-neuron/Mistral-neuron",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/aws-neuron/Mistral-neuron
- SGLang
How to use aws-neuron/Mistral-neuron with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "aws-neuron/Mistral-neuron" \
  --host 0.0.0.0 \
  --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aws-neuron/Mistral-neuron",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "aws-neuron/Mistral-neuron" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "aws-neuron/Mistral-neuron",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use aws-neuron/Mistral-neuron with Docker Model Runner:
docker model run hf.co/aws-neuron/Mistral-neuron
74f98ed453da674f39ed8a82f7aa3e784d48170b05c3171ee74e6d8151561251
- pytorch_model.bin/p59.model.layers.6.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p6.model.layers.0.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p60.model.layers.6.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p61.model.layers.6.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p62.model.layers.6.input_layernorm.weight +3 -0
- pytorch_model.bin/p63.model.layers.6.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p64.model.layers.7.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p65.model.layers.7.self_attn.k_proj.weight +3 -0
- pytorch_model.bin/p66.model.layers.7.self_attn.v_proj.weight +3 -0
- pytorch_model.bin/p67.model.layers.7.self_attn.o_proj.weight +3 -0
- pytorch_model.bin/p68.model.layers.7.mlp.gate_proj.weight +3 -0
- pytorch_model.bin/p69.model.layers.7.mlp.up_proj.weight +3 -0
- pytorch_model.bin/p7.model.layers.0.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p70.model.layers.7.mlp.down_proj.weight +3 -0
- pytorch_model.bin/p71.model.layers.7.input_layernorm.weight +3 -0
- pytorch_model.bin/p72.model.layers.7.post_attention_layernorm.weight +3 -0
- pytorch_model.bin/p73.model.layers.8.self_attn.q_proj.weight +3 -0
- pytorch_model.bin/p74.model.layers.8.self_attn.k_proj.weight +3 -0
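The file list above is in plain string order, which is why p6 sits between p59 and p60 and p7 between p69 and p70. The sketch below shows one way to split such a name into its parts and sort by the numeric shard index instead. The regular expression and the helper name `parse_shard_name` are illustrative assumptions based only on the names visible here, not a documented naming scheme.

```python
import re

# A subset of the shard names from the file list above, in the listed order.
listed = [
    "p59.model.layers.6.mlp.gate_proj.weight",
    "p6.model.layers.0.mlp.up_proj.weight",
    "p60.model.layers.6.mlp.up_proj.weight",
    "p7.model.layers.0.mlp.down_proj.weight",
    "p70.model.layers.7.mlp.down_proj.weight",
]

# Assumed pattern: p<index>.model.layers.<layer>.<parameter path>
SHARD_RE = re.compile(
    r"^p(?P<index>\d+)\.model\.layers\.(?P<layer>\d+)\.(?P<param>.+)$"
)


def parse_shard_name(name):
    """Split a shard name into (shard index, layer number, parameter path)."""
    match = SHARD_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized shard name: {name!r}")
    return int(match["index"]), int(match["layer"]), match["param"]


# Sorting on the parsed integer index gives numeric rather than string order.
by_index = sorted(listed, key=lambda name: parse_shard_name(name)[0])

for name in by_index:
    print(parse_shard_name(name))
```

Sorting on the integer index keeps each layer's shards together, which is easier to scan than the string order shown in the list.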
pytorch_model.bin/p59.model.layers.6.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6c6d1e7c823c62f89ab6e1209e1fdd1766f30a81a181358f8fe68b344b777be8
+size 234881910
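Each added file in this commit is a Git LFS pointer rather than the tensor itself: three key/value lines giving the spec version, the SHA-256 object ID, and the size in bytes. As a minimal sketch, the pointer can be parsed and a downloaded blob checked against it as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and SHA-256 the pointer records."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError(f"unsupported hash: {pointer['hash_algo']!r}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer content copied from the p59 hunk above.
pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:6c6d1e7c823c62f89ab6e1209e1fdd1766f30a81a181358f8fe68b344b777be8\n"
    "size 234881910\n"
)
pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algo"], pointer["size"])
```

For files of this size it would be more practical to hash in chunks while reading from disk; the in-memory check is kept here only for brevity.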
pytorch_model.bin/p6.model.layers.0.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:93bcc81ff55db4515577bedab9cb6fb8aa946edb9abe77f66aa71f28eeb0c608
+size 234881901
pytorch_model.bin/p60.model.layers.6.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:22505305a75ae0e434215721a5dadda3d0c58ab37a8af2cb2edb33f6ccd3c4d4
+size 234881904
pytorch_model.bin/p61.model.layers.6.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5e6c2af01e7f59982ba2749243dd3ef611adec5ffe5ea8743c861d93f0df2373
+size 234881910
pytorch_model.bin/p62.model.layers.6.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e85898fdf4baa06c3cca43a2703bbcad2080d657deec63eaeb4b48a57c2051ab
+size 17276
pytorch_model.bin/p63.model.layers.6.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95a687a8cf2bd6ddbb16a1cb71b42758a6c3cdc7adf3eb5d5a9ce86b53706cfc
+size 17303
pytorch_model.bin/p64.model.layers.7.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aaec2b83f991c746cdda381e4fb29b4f8b710d956d1a172ea0c3006718672729
+size 67109759
pytorch_model.bin/p65.model.layers.7.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0b8082bf28ff599393d549629e8ac487fca17270829c607bda9322c5dcf398b5
+size 16778111
pytorch_model.bin/p66.model.layers.7.self_attn.v_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4ce55d7590441cbf8e16fe2a075e80a7f1179645dda3428d329c7eadc73ac1b3
+size 16778111
pytorch_model.bin/p67.model.layers.7.self_attn.o_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:061570b4066e5cde3534d0516cdc702870c1c09febb1be9a53d362ddd4284b1e
+size 67109759
pytorch_model.bin/p68.model.layers.7.mlp.gate_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d045c1191053281a3e00373a853918c2ecc7e1de103e32742f9ced484a7cdc3
+size 234881910
pytorch_model.bin/p69.model.layers.7.mlp.up_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0e47abbf908391b362e5280cd302038223fb4702dd0c7164c1fb54aa7dd77061
+size 234881904
pytorch_model.bin/p7.model.layers.0.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2735559ac7dc8c15f446e5c98a68d4533b6cc8b6b3c9f5e443740a01e65fc8fd
+size 234881907
pytorch_model.bin/p70.model.layers.7.mlp.down_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9c373df847f3cb8c72af964c731cfebd4e0ee137903c3d1ea6854c3694afe9f9
+size 234881910
pytorch_model.bin/p71.model.layers.7.input_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:44b685d549fe641d8f38dd918a0b48a6afd10f51859bf0dc931d5c8b2345ee04
+size 17276
pytorch_model.bin/p72.model.layers.7.post_attention_layernorm.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:51c50efaaaca07acfe15ce4c99967d9ec9e4294f5fff15738741b88a0d65a848
+size 17303
pytorch_model.bin/p73.model.layers.8.self_attn.q_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bf873632de22c477573e0827d092aeed54811327fd7b1bc9f974d5347789d983
+size 67109759
pytorch_model.bin/p74.model.layers.8.self_attn.k_proj.weight
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9d0766dff7a231a33e5bc2eb418d2881e47925e22a13295b0e2abc8820751775
+size 16778111
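The pointer sizes in this diff can be sanity-checked with a little arithmetic. The sketch below assumes, without confirmation from the source, that each shard holds one float32 tensor and that the model uses a hidden size of 4096, an MLP intermediate size of 14336, and a key/value projection width of 1024. Under those assumptions the sizes are the raw tensor bytes plus a few hundred bytes of serialization overhead, which is suggestive but does not prove the dtype or the shapes.

```python
BYTES_PER_FLOAT32 = 4

# Assumed dimensions; none of these are stated in the source.
HIDDEN, INTERMEDIATE, KV_DIM = 4096, 14336, 1024

# name -> (assumed tensor shape, pointer size in bytes taken from the diff)
observed = {
    "self_attn.q_proj": ((HIDDEN, HIDDEN), 67109759),
    "self_attn.k_proj": ((KV_DIM, HIDDEN), 16778111),
    "mlp.gate_proj": ((INTERMEDIATE, HIDDEN), 234881910),
    "input_layernorm": ((HIDDEN,), 17276),
}


def raw_bytes(shape):
    """Bytes needed for a float32 tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * BYTES_PER_FLOAT32


# Difference between the recorded file size and the raw tensor payload.
overhead = {
    name: size - raw_bytes(shape) for name, (shape, size) in observed.items()
}

for name, extra in overhead.items():
    print(f"{name}: {extra} bytes beyond the raw tensor")
```

If the assumptions hold, every shard carries less than 1 KB beyond its tensor payload, so the file sizes are dominated by the weights themselves.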