Instructions for using QuixiAI/WizardLM-7B-Uncensored with libraries, inference providers, notebooks, and local apps.

Libraries

How to use QuixiAI/WizardLM-7B-Uncensored with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="QuixiAI/WizardLM-7B-Uncensored")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("QuixiAI/WizardLM-7B-Uncensored")
model = AutoModelForCausalLM.from_pretrained("QuixiAI/WizardLM-7B-Uncensored")
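
The two snippets above only load the model. As a minimal sketch of the next step, the block below tokenizes a prompt, calls `generate`, and decodes the result. The helper names `build_generation_kwargs` and `generate` are hypothetical, and the sampling values (`max_new_tokens`, `temperature`) are illustrative assumptions, not settings recommended by the model card. The `transformers` import is deferred so the helper for building settings can be used without loading anything.

```python
MODEL_ID = "QuixiAI/WizardLM-7B-Uncensored"


def build_generation_kwargs(max_new_tokens=64, temperature=0.5):
    """Collect keyword arguments for model.generate() (illustrative defaults)."""
    kwargs = {"max_new_tokens": max_new_tokens}
    if temperature > 0:
        # Sampling is only meaningful with a positive temperature.
        kwargs.update(do_sample=True, temperature=temperature)
    else:
        # Fall back to greedy decoding and leave temperature unset.
        kwargs["do_sample"] = False
    return kwargs


def generate(prompt, **overrides):
    """Load the tokenizer and model, then return the decoded continuation."""
    # Imported lazily: loading a 7B model is expensive and needs PyTorch.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **build_generation_kwargs(**overrides),
    )
    # The output contains the prompt tokens followed by the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Once upon a time,")` would download the weights on first use, so it is best tried on a machine with enough memory for a 7B model.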

Local Apps

vLLM

How to use QuixiAI/WizardLM-7B-Uncensored with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "QuixiAI/WizardLM-7B-Uncensored"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "QuixiAI/WizardLM-7B-Uncensored",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
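
The same request can be sent from Python. The sketch below mirrors the curl call above using only the standard library; the function names `build_completion_request` and `complete` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style completions body with a `choices` list.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # vLLM address used in the curl call
MODEL_ID = "QuixiAI/WizardLM-7B-Uncensored"


def build_completion_request(prompt, max_tokens=512, temperature=0.5,
                             base_url=BASE_URL):
    """Build the POST request for the /completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.load(response)
    # Assumes an OpenAI-style body: {"choices": [{"text": ...}, ...]}
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` would return the generated continuation. Passing `base_url="http://localhost:30000/v1"` points the same helper at a server on another port.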

Use Docker

docker model run hf.co/QuixiAI/WizardLM-7B-Uncensored

SGLang

How to use QuixiAI/WizardLM-7B-Uncensored with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "QuixiAI/WizardLM-7B-Uncensored" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "QuixiAI/WizardLM-7B-Uncensored",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "QuixiAI/WizardLM-7B-Uncensored" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "QuixiAI/WizardLM-7B-Uncensored",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
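
A freshly started server may take a while to load the weights, so a curl call issued right away can fail with a connection error. The sketch below polls the server until it answers. It assumes the OpenAI-compatible API also exposes `GET /v1/models`, which is not shown in the commands above; the names `http_probe` and `wait_until_ready`, the timeout, and the polling interval are all hypothetical choices.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return True if a GET on url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_ready(base_url="http://localhost:30000", timeout_s=600.0,
                     interval_s=2.0, probe=http_probe, sleep=time.sleep):
    """Poll base_url/v1/models until it responds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(f"{base_url}/v1/models"):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The `probe` and `sleep` parameters exist only so the loop can be exercised without a running server. In normal use, `wait_until_ready()` is called once before sending the first completion request.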

Docker Model Runner
How to use QuixiAI/WizardLM-7B-Uncensored with Docker Model Runner:
```
docker model run hf.co/QuixiAI/WizardLM-7B-Uncensored
```

WizardLM-7B-Uncensored

Commit History

Update README.md
7f64046 (verified), Kearm committed on Jan 30, 2024

Update README.md
14c23f9, ehartford committed on May 12, 2023

Update README.md
2688b37, ehartford committed on May 12, 2023

Update README.md
c54a3f2, ehartford committed on May 12, 2023

update license (#12)
6ebb002, ehartford and Ezi committed on May 11, 2023

Update README.md
f9a91fd, ehartford committed on May 11, 2023

Update README.md
a108228, ehartford committed on May 9, 2023

cleanup
162b0f1, ehartford committed on May 6, 2023

Change use_cache to True which significantly speeds up inference (#2)
ca45eff, ehartford and TheBloke committed on May 5, 2023

Upload fast tokenizer, which is the recommended and default for transformers now (#1)
278bf3c, ehartford and TheBloke committed on May 5, 2023

Update README.md
3534761, ehartford committed on May 5, 2023

model
acb7132, Ubuntu committed on May 4, 2023

Merge branch 'main' of https://huggingface.co/ehartford/WizardLM-7B-Uncensored into main
f871962, Ubuntu committed on May 4, 2023

initial
b73e530, Ubuntu committed on May 4, 2023

Upload LlamaForCausalLM
eab85e8, ehartford committed on May 4, 2023

Upload LlamaForCausalLM
634ee91, ehartford committed on May 4, 2023

Upload LlamaForCausalLM
b250cdb, ehartford committed on May 4, 2023

initial commit
4979024, ehartford committed on May 4, 2023
