Tokenizer class does not exist
I am trying to benchmark the model with LM-Eval-Harness. It loads the tokenizer through Hugging Face's AutoTokenizer class, which fails with this error:
Traceback (most recent call last):
  File "/home/teknium/dakota/lm-evaluation-harness/main.py", line 89, in <module>
    main()
  File "/home/teknium/dakota/lm-evaluation-harness/main.py", line 57, in main
    results = evaluator.simple_evaluate(
  File "/home/teknium/dakota/lm-evaluation-harness/lm_eval/utils.py", line 242, in _wrapper
    return fn(*args, **kwargs)
  File "/home/teknium/dakota/lm-evaluation-harness/lm_eval/evaluator.py", line 69, in simple_evaluate
    lm = lm_eval.models.get_model(model).create_from_arg_string(
  File "/home/teknium/dakota/lm-evaluation-harness/lm_eval/base.py", line 115, in create_from_arg_string
    return cls(**args, **args2)
  File "/home/teknium/dakota/lm-evaluation-harness/lm_eval/models/huggingface.py", line 189, in __init__
    self.tokenizer = self._create_auto_tokenizer(
  File "/home/teknium/dakota/lm-evaluation-harness/lm_eval/models/huggingface.py", line 492, in _create_auto_tokenizer
    tokenizer = super()._create_auto_tokenizer(
  File "/home/teknium/dakota/lm-evaluation-harness/lm_eval/models/huggingface.py", line 313, in _create_auto_tokenizer
    tokenizer = self.AUTO_TOKENIZER_CLASS.from_pretrained(
  File "/home/teknium/lm-evaluation-harness/venv/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 748, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class YiTokenizer does not exist or is not currently imported.
Running tasks: openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq with batch size: 14 and output path: ./benchmark_logs/01-ai/Yi-6B_float16_GPT4All.json
Yi models now use custom model code, so you should pass trust_remote_code=True to from_pretrained.
I already have trust_remote_code set, and it still isn't accepted.
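One way to narrow this down is to load the tokenizer directly with transformers, outside the harness. The snippet below is a minimal sketch, not a confirmed fix: `installed_version` and `try_load_tokenizer` are hypothetical helper names, and the only library calls used are `AutoTokenizer.from_pretrained` and the standard library's `importlib.metadata.version`.

```python
import importlib.metadata


def installed_version(package="transformers"):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


def try_load_tokenizer(model_id="01-ai/Yi-6B", trust_remote_code=True):
    """Try to load a tokenizer directly, bypassing the evaluation harness.

    Only ValueError is caught, since that is the error type in the traceback;
    network or cache errors are left to propagate.
    """
    # Imported lazily so defining this helper does not require transformers.
    from transformers import AutoTokenizer

    try:
        tokenizer = AutoTokenizer.from_pretrained(
            model_id, trust_remote_code=trust_remote_code
        )
    except ValueError as err:
        return f"failed: {err}"
    return f"loaded: {type(tokenizer).__name__}"
```

Running `print(installed_version())` followed by `print(try_load_tokenizer())` in the same venv as the harness would show whether the same ValueError appears without the harness involved, and which transformers version is in use.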
Can you provide a minimal reproducible example so we can check what the problem is?
Yes:
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
then
python3 main.py --model hf-causal-experimental --model_args pretrained="01-ai/Yi-6B",dtype="bfloat16",trust_remote_code=True,use_accelerate=True --tasks truthfulqa_mc --batch_size 1
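For reference, the --model_args value above is a comma-separated list of key=value pairs that ends up as keyword arguments (the traceback shows create_from_arg_string calling cls(**args, **args2)). The sketch below is an illustrative parser only, not the harness's own implementation; `parse_model_args` is a hypothetical name, and the conversion of "True"/"False" to booleans is an assumption made for the example.

```python
def parse_model_args(arg_string):
    """Split 'k1=v1,k2=v2' into a dict of keyword arguments.

    Assumption for this sketch: the literal strings 'True' and 'False'
    are converted to booleans; everything else stays a string.
    """
    parsed = {}
    for item in arg_string.split(","):
        if not item.strip():
            continue  # skip empty segments such as a trailing comma
        key, _, value = item.partition("=")
        value = value.strip().strip('"')  # drop quotes the shell did not remove
        if value in ("True", "False"):
            parsed[key.strip()] = value == "True"
        else:
            parsed[key.strip()] = value
    return parsed


args = parse_model_args(
    'pretrained="01-ai/Yi-6B",dtype="bfloat16",'
    "trust_remote_code=True,use_accelerate=True"
)
print(args)
```

Printing the parsed dictionary is a quick way to confirm that trust_remote_code is present among the arguments before they are passed on.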
Fresh venv still crashes it for me :/
Maybe you can try our Docker image (to be released soon: https://github.com/01-ai/Yi/issues/3).
Could you try our latest instructions at https://github.com/01-ai/Yi#1-prepare-development-environment ?
Same issue here. Any news on this?
