How to use GreenBitAI/yi-34b-w4a16g32 with libraries and local apps.

Libraries

How to use GreenBitAI/yi-34b-w4a16g32 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="GreenBitAI/yi-34b-w4a16g32")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("GreenBitAI/yi-34b-w4a16g32")
model = AutoModelForCausalLM.from_pretrained("GreenBitAI/yi-34b-w4a16g32")
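
Neither snippet above goes as far as producing text. The following is a minimal sketch of that step, assuming the `model` and `tokenizer` loaded above behave like standard Transformers causal-LM objects (this page does not say whether the quantized checkpoint needs extra packages to load). `generate_text` is a hypothetical helper name, not part of any library.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Tokenize a prompt, generate a continuation, and decode it to a string."""
    # Put the input tensors on the same device as the model.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # generate() returns one sequence of token ids per input prompt.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage, with `model` and `tokenizer` loaded as shown above:
# print(generate_text(model, tokenizer, "Once upon a time,"))
#
# The pipeline object can be called directly instead:
# print(pipe("Once upon a time,", max_new_tokens=64)[0]["generated_text"])
```

The helper takes the model and tokenizer as arguments so that it can be tried with any causal-LM checkpoint before committing to a 34B download.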

Local Apps

vLLM

How to use GreenBitAI/yi-34b-w4a16g32 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "GreenBitAI/yi-34b-w4a16g32"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GreenBitAI/yi-34b-w4a16g32",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
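
The same request can be sent from Python. This is a sketch using only the standard library, assuming the server started above is reachable at `http://localhost:8000`. The helper names `build_completion_request` and `complete` are illustrative, not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="GreenBitAI/yi-34b-w4a16g32",
                             max_tokens=512, temperature=0.5):
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, timeout=120, **kwargs):
    """Send the request and return the decoded JSON response.

    Requires a running server; raises urllib.error.URLError otherwise.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (with the server running):
# result = complete("Once upon a time,")
```

Passing a different `base_url` (for example one ending in port 30000) points the same helper at another OpenAI-compatible server.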

Use Docker

docker model run hf.co/GreenBitAI/yi-34b-w4a16g32

SGLang

How to use GreenBitAI/yi-34b-w4a16g32 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "GreenBitAI/yi-34b-w4a16g32" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GreenBitAI/yi-34b-w4a16g32",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
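
The curl call prints a JSON body. Below is a sketch of pulling the generated text out of it, assuming the server follows the OpenAI completions response shape (a `choices` list whose entries carry `text` and `finish_reason`). The sample body is made up for illustration and is not real model output; `extract_completion` is a hypothetical helper name.

```python
import json


def extract_completion(response_body):
    """Return (text, finish_reason) for the first choice of a completions response."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        # Error responses typically carry no choices; surface the whole body.
        raise ValueError(f"response has no choices: {data}")
    first = choices[0]
    return first["text"], first.get("finish_reason")


# Illustrative response body (not real model output).
sample_body = json.dumps({
    "object": "text_completion",
    "model": "GreenBitAI/yi-34b-w4a16g32",
    "choices": [
        {"index": 0, "text": " there was a model.", "finish_reason": "length"}
    ],
})

text, reason = extract_completion(sample_body)
print(text)
print(reason)
```

In the OpenAI response format, a `finish_reason` of `length` means generation stopped at the `max_tokens` limit rather than at a natural end of text.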

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "GreenBitAI/yi-34b-w4a16g32" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GreenBitAI/yi-34b-w4a16g32",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use GreenBitAI/yi-34b-w4a16g32 with Docker Model Runner:
```
docker model run hf.co/GreenBitAI/yi-34b-w4a16g32
```

yi-34b-w4a16g32

Commit History

Delete special_tokens_map.json 102f920
NicoNico committed on Jan 8, 2024

Delete tokenization_yi.py 0f5d5a1
NicoNico committed on Jan 8, 2024

Delete modeling_yi.py 6d3ef82
NicoNico committed on Jan 8, 2024

Delete configuration_yi.py 23edf07
NicoNico committed on Jan 8, 2024

Upload 4 files 1f39b05
NicoNico committed on Jan 8, 2024

Update README.md b55fe0b
NicoNico committed on Dec 25, 2023

Update README.md e8fff0b
yanghaojin committed on Dec 20, 2023

Update README.md 5e16bc4
NicoNico committed on Dec 15, 2023

Update README.md a42dd8a
NicoNico committed on Dec 14, 2023

Update README.md 341bdd5
NicoNico committed on Dec 14, 2023

Update README.md bf78d16
NicoNico committed on Dec 1, 2023

Update README.md d6fa456
NicoNico committed on Dec 1, 2023

update 3d04270
NicoNico6 committed on Nov 16, 2023

initial commit 71bd3d5
NicoNico committed on Nov 16, 2023
