license: apache-2.0
GreenBit Yi
This is GreenBitAI's pretrained 2-bit Yi 6B model. It is compressed aggressively (2.5 GiB versus 12.12 GiB for FP16) while still retaining strong zero-shot performance.
Please refer to our GitHub page for the code to run the model and for more information.
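To give a feel for what 2-bit, group-wise weight storage involves, here is a minimal sketch of generic min-max uniform quantization applied per group. It is not GreenBitAI's method, which is documented on their GitHub page. It also assumes that the `w2a16g32` suffix denotes 2-bit weights, 16-bit activations, and a group size of 32. The function names `quantize_groupwise` and `dequantize_groupwise` are hypothetical.

```python
import random


def quantize_groupwise(weights, bits=2, group_size=32):
    """Generic min-max uniform quantization, applied independently per group.

    Illustrative only: this is NOT GreenBitAI's algorithm.
    """
    levels = (1 << bits) - 1  # 2 bits -> codes 0..3, i.e. 3 steps
    groups = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Guard against a constant group, which would give a zero step size.
        scale = (hi - lo) / levels if hi > lo else 1.0
        codes = [round((w - lo) / scale) for w in group]
        # Each group keeps its own scale and offset alongside the codes.
        groups.append((codes, scale, lo))
    return groups


def dequantize_groupwise(groups):
    """Map the integer codes back to approximate floating-point weights."""
    restored = []
    for codes, scale, lo in groups:
        restored.extend(lo + c * scale for c in codes)
    return restored


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(128)]  # toy weight vector
groups = quantize_groupwise(weights, bits=2, group_size=32)
restored = dequantize_groupwise(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"{len(groups)} groups, max reconstruction error {max_error:.5f}")
```

Smaller groups mean each scale and offset covers fewer weights, which tends to lower the rounding error at the cost of storing more per-group metadata. That trade-off is consistent with the `g8` variant in the table below being both larger and more accurate than `g32`.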
Model Description
- Developed by: GreenBitAI
- Model type: Causal (Llama 2/Yi 6B)
- Language(s) (NLP): English
- License: Apache 2.0, Llama 2 license agreement
Zero-Shot Evaluation
| Task | Metric | FP16 | Yi-6B w4a16g32 | Yi-6B w2a16g8 | Yi-6B w2a16g32 |
|---|---|---|---|---|---|
| Openbookqa | acc | 0.314 | 0.324 | 0.26 | 0.228 |
| | acc_norm | 0.408 | 0.42 | 0.394 | 0.352 |
| arc_challenge | acc | 0.462 | 0.4573 | 0.4082 | 0.337 |
| | acc_norm | 0.504 | 0.483 | 0.4249 | 0.3523 |
| hellaswag | acc | 0.553 | 0.5447 | 0.5083 | 0.4326 |
| | acc_norm | 0.749 | 0.7327 | 0.6909 | 0.58 |
| piqa | acc | 0.777 | 0.7709 | 0.7535 | 0.7051 |
| | acc_norm | 0.787 | 0.7894 | 0.7655 | 0.7143 |
| arc_easy | acc | 0.777 | 0.7697 | 0.7373 | 0.6523 |
| | acc_norm | 0.774 | 0.7659 | 0.7314 | 0.6115 |
| Winogrande | acc | 0.707 | 0.7095 | 0.6803 | 0.6219 |
| boolq | acc | 0.755 | 0.7648 | 0.7507 | 0.732 |
| truthfulqa_mc | mc1 | 0.29 | 0.2729 | 0.2753 | 0.219 |
| | mc2 | 0.419 | 0.4033 | 0.4156 | 0.3479 |
| anli_r1 | acc | 0.423 | 0.416 | 0.383 | 0.38 |
| anli_r2 | acc | 0.409 | 0.409 | 0.387 | 0.374 |
| anli_r3 | acc | 0.411 | 0.393 | 0.38 | 0.3475 |
| wic | acc | 0.529 | 0.545 | 0.515 | 0.5 |
| rte | acc | 0.685 | 0.7039 | 0.7436 | 0.6787 |
| record | f1 | 0.904 | 0.9011 | 0.8906 | 0.8521 |
| | em | 0.8962 | 0.8927 | 0.8819 | 0.8429 |
| Average | | 0.596 | 0.5937 | 0.5703 | 0.517 |
| wikitext2 (2048) | ppl | 5.841 | 6.01 | 6.57 | 8.06 |
| ptb (2048) | ppl | 18.93 | 19.76 | 25.9 | 35.3 |
| Model Size | GiB | 12.12 | 3.76 | 3.09 | 2.5 |
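
The size and accuracy trade-off in the table can be summarized with a little arithmetic. The sketch below copies the "Model Size" and "Average" rows from the table and derives, for each variant, how many times smaller it is than FP16 and what fraction of the FP16 average score it keeps. The helper name `summarize` and the dictionary layout are illustrative choices, not part of any GreenBitAI tooling.

```python
# Values copied from the "Model Size" and "Average" rows of the table above.
sizes_gib = {"FP16": 12.12, "w4a16g32": 3.76, "w2a16g8": 3.09, "w2a16g32": 2.5}
averages = {"FP16": 0.596, "w4a16g32": 0.5937, "w2a16g8": 0.5703, "w2a16g32": 0.517}


def summarize(sizes, scores, baseline="FP16"):
    """Return the compression ratio and retained score relative to the baseline."""
    summary = {}
    for name in sizes:
        summary[name] = {
            "compression": sizes[baseline] / sizes[name],
            "retained": scores[name] / scores[baseline],
        }
    return summary


summary = summarize(sizes_gib, averages)
for name, row in summary.items():
    print(f"{name:10s} {row['compression']:.2f}x smaller, "
          f"{row['retained']:.1%} of the FP16 average")
```

For this model (w2a16g32) the figures work out to roughly 4.8 times smaller than FP16 while keeping about 87% of the FP16 average score.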