Instructions to use WonGrifferousAI/MisTraXLLM with libraries, inference providers, notebooks, and local apps.

Libraries

How to use WonGrifferousAI/MisTraXLLM with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="WonGrifferousAI/MisTraXLLM")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("WonGrifferousAI/MisTraXLLM")
model = AutoModelForCausalLM.from_pretrained("WonGrifferousAI/MisTraXLLM")

Local Apps

vLLM

How to use WonGrifferousAI/MisTraXLLM with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "WonGrifferousAI/MisTraXLLM"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "WonGrifferousAI/MisTraXLLM",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
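
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000` and returns the usual OpenAI-style completions response (`choices[0].text`). The helper names `build_completion_request` and `complete` are hypothetical, chosen for illustration only.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",  # assumption: default vLLM port
    model: str = "WonGrifferousAI/MisTraXLLM",
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that the curl command issues."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the generated continuation. Passing `base_url="http://localhost:30000"` points the same helper at a server on a different port.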

Use Docker

docker model run hf.co/WonGrifferousAI/MisTraXLLM

SGLang

How to use WonGrifferousAI/MisTraXLLM with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "WonGrifferousAI/MisTraXLLM" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "WonGrifferousAI/MisTraXLLM",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "WonGrifferousAI/MisTraXLLM" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use WonGrifferousAI/MisTraXLLM with Docker Model Runner:
```
docker model run hf.co/WonGrifferousAI/MisTraXLLM
```

Wonder-Griffin committed on Sep 12, 2024

Commit ff5f5cf · verified · 1 parent: 6e987a4

Update config.json

Files changed (1)

config.json +37 -37

@@ -1,37 +1,37 @@
-{
-  "_name_or_path": "Wonder-Griffin/TraXLMistral",
-  "architectures": [
-    "TraXLMistralForCausalLM"
-  ],
-  "dropout": 0.1,
-  "dynamic_routing": true,
-  "ff_expansion_factor": 4,
-  "hidden_size": 768,
-  "id2label": {
-    "0": "LABEL_0",
-    "1": "LABEL_1",
-    "2": "LABEL_2",
-    "3": "LABEL_3",
-    "4": "LABEL_4"
-  },
-  "is_decoder": true,
-  "label2id": {
-    "LABEL_0": 0,
-    "LABEL_1": 1,
-    "LABEL_2": 2,
-    "LABEL_3": 3,
-    "LABEL_4": 4
-  },
-  "max_computation_steps": 5,
-  "max_len": 256,
-  "memory_size": 256,
-  "model_type": "TraXLMistral",
-  "n_embd": 128,
-  "n_head": 4,
-  "n_layer": 4,
-  "rnn_units": 128,
-  "sparse_attention": true,
-  "torch_dtype": "float32",
-  "transformers_version": "4.44.2",
-  "vocab_size": 50257
-}

+{
+  "_name_or_path": "Wonder-Griffin/TraXLMistral",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "dropout": 0.1,
+  "dynamic_routing": true,
+  "ff_expansion_factor": 4,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0",
+    "1": "LABEL_1",
+    "2": "LABEL_2",
+    "3": "LABEL_3",
+    "4": "LABEL_4"
+  },
+  "is_decoder": true,
+  "label2id": {
+    "LABEL_0": 0,
+    "LABEL_1": 1,
+    "LABEL_2": 2,
+    "LABEL_3": 3,
+    "LABEL_4": 4
+  },
+  "max_computation_steps": 5,
+  "max_len": 256,
+  "memory_size": 256,
+  "model_type": "gpt2",
+  "n_embd": 128,
+  "n_head": 4,
+  "n_layer": 4,
+  "rnn_units": 128,
+  "sparse_attention": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.44.2",
+  "vocab_size": 50257
+}
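
Although the hunk is reported as +37 -37, the two sides differ in only two values as rendered here: `architectures` and `model_type`. The sketch below compares the parsed JSON key by key instead of line by line to show this. It is an illustration, not part of the commit: the two dictionaries are abbreviated excerpts of the old and new `config.json`, and `changed_keys` is a hypothetical helper name.

```python
import json

MISSING = "<missing>"  # placeholder for a key present on only one side


def changed_keys(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    result = {}
    for key in sorted(set(old) | set(new)):
        before = old.get(key, MISSING)
        after = new.get(key, MISSING)
        if before != after:
            result[key] = (before, after)
    return result


# Abbreviated excerpt of the removed ("-") side of the diff.
OLD_CONFIG = json.loads("""
{
  "_name_or_path": "Wonder-Griffin/TraXLMistral",
  "architectures": ["TraXLMistralForCausalLM"],
  "hidden_size": 768,
  "model_type": "TraXLMistral",
  "n_embd": 128,
  "n_head": 4,
  "n_layer": 4,
  "vocab_size": 50257
}
""")

# Abbreviated excerpt of the added ("+") side of the diff.
NEW_CONFIG = json.loads("""
{
  "_name_or_path": "Wonder-Griffin/TraXLMistral",
  "architectures": ["GPT2LMHeadModel"],
  "hidden_size": 768,
  "model_type": "gpt2",
  "n_embd": 128,
  "n_head": 4,
  "n_layer": 4,
  "vocab_size": 50257
}
""")

if __name__ == "__main__":
    for key, (before, after) in changed_keys(OLD_CONFIG, NEW_CONFIG).items():
        print(f"{key}: {before!r} -> {after!r}")
```

In Transformers, the `model_type` value is what `AutoConfig` uses to select a configuration class, and `architectures` names the model class the checkpoint was saved with, so these two fields determine how the `Auto*` loaders resolve this repository.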