Instructions to use CohereLabs/c4ai-command-r-plus with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use CohereLabs/c4ai-command-r-plus with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="CohereLabs/c4ai-command-r-plus") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("CohereLabs/c4ai-command-r-plus") model = AutoModelForCausalLM.from_pretrained("CohereLabs/c4ai-command-r-plus") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use CohereLabs/c4ai-command-r-plus with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "CohereLabs/c4ai-command-r-plus" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "CohereLabs/c4ai-command-r-plus", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/CohereLabs/c4ai-command-r-plus
- SGLang
How to use CohereLabs/c4ai-command-r-plus with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "CohereLabs/c4ai-command-r-plus" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "CohereLabs/c4ai-command-r-plus", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "CohereLabs/c4ai-command-r-plus" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "CohereLabs/c4ai-command-r-plus", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use CohereLabs/c4ai-command-r-plus with Docker Model Runner:
docker model run hf.co/CohereLabs/c4ai-command-r-plus
Cannot load model in transformers.pipeline
Hi everyone, this is my first time posting to this platform. Please correct me if I'm wrong on any part.
I've tried this boilerplate code to test the predictions of this model through the Python SDK on my local machine:
from transformers import pipeline
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="CohereForAI/c4ai-command-r-plus")
pipe(messages)
It throws this error:
ValueError: Could not load model CohereForAI/c4ai-command-r-plus with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForCausalLM'>,). See the original errors:
while loading with TFAutoModelForCausalLM, an error is thrown:
Traceback (most recent call last):
File "/home/dev/anaconda3/lib/python3.9/site-packages/transformers/pipelines/base.py", line 283, in infer_framework_load_model
model = model_class.from_pretrained(model, **kwargs)
File "/home/dev/anaconda3/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 567, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.cohere.configuration_cohere.CohereConfig'> for this kind of AutoModel: TFAutoModelForCausalLM.
Model type should be one of BertConfig, CamembertConfig, CTRLConfig, GPT2Config, GPT2Config, GPTJConfig, MistralConfig, OpenAIGPTConfig, OPTConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, TransfoXLConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLNetConfig.
Before this, I've always used pretrained models which are based on some architecture (RoBERT, GPT, BART, etc.). And transformers would find an appropriate PreTrainedModel class for the given model and everything was smooth. This time, however, I guess since c4ai-command-r-plus is a new architecture, this issue pops up. I tried upgrading transformers to the latest version but it didn't help. (I'm using Python 3.9 btw, and transformers==4.42.3).
Am I missing something obvious?
Thanks in advance for your inputs.
pip install torch - you have TF but not Torch installed, and it's trying to initialize a TensorFlow pipeline
@Rocketknight1 thanks for pointing out! I didn't see it that way cuz all the earlier models would work with the TF backend. I guess its time to move to torch for this one.
Hi this issue looks resolved so closing it but feel free to reopen in case you're still facing any issues related to this @DevBhuyan !