Instructions to use CohereLabs/c4ai-command-r-plus with libraries, inference providers, notebooks, and local apps.

Libraries

How to use CohereLabs/c4ai-command-r-plus with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="CohereLabs/c4ai-command-r-plus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CohereLabs/c4ai-command-r-plus")
model = AutoModelForCausalLM.from_pretrained("CohereLabs/c4ai-command-r-plus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use CohereLabs/c4ai-command-r-plus with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "CohereLabs/c4ai-command-r-plus"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CohereLabs/c4ai-command-r-plus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/CohereLabs/c4ai-command-r-plus

SGLang

How to use CohereLabs/c4ai-command-r-plus with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "CohereLabs/c4ai-command-r-plus" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CohereLabs/c4ai-command-r-plus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "CohereLabs/c4ai-command-r-plus" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CohereLabs/c4ai-command-r-plus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use CohereLabs/c4ai-command-r-plus with Docker Model Runner:
```
docker model run hf.co/CohereLabs/c4ai-command-r-plus
```

Cannot load model in transformers.pipeline

#51

by DevBhuyan - opened Jun 28, 2024

Discussion

DevBhuyan

Jun 28, 2024

Hi everyone, this is my first time posting to this platform. Please correct me if I'm wrong on any part.

I've tried this boilerplate code to test the predictions of this model through the Python SDK on my local machine:

from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="CohereForAI/c4ai-command-r-plus")
pipe(messages)

It throws this error:

ValueError: Could not load model CohereForAI/c4ai-command-r-plus with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForCausalLM'>,). See the original errors:

while loading with TFAutoModelForCausalLM, an error is thrown:
Traceback (most recent call last):
  File "/home/dev/anaconda3/lib/python3.9/site-packages/transformers/pipelines/base.py", line 283, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/home/dev/anaconda3/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 567, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.cohere.configuration_cohere.CohereConfig'> for this kind of AutoModel: TFAutoModelForCausalLM.
Model type should be one of BertConfig, CamembertConfig, CTRLConfig, GPT2Config, GPT2Config, GPTJConfig, MistralConfig, OpenAIGPTConfig, OPTConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoFormerConfig, TransfoXLConfig, XGLMConfig, XLMConfig, XLMRobertaConfig, XLNetConfig.

Before this, I've always used pretrained models based on established architectures (RoBERTa, GPT, BART, etc.), and transformers would find an appropriate PreTrainedModel class for the given model, so everything went smoothly. This time, however, I guess the issue pops up because c4ai-command-r-plus is a new architecture. I tried upgrading transformers to the latest version, but it didn't help. (I'm using Python 3.9 btw, and transformers==4.42.3.)

Am I missing something obvious?

Thanks in advance for your inputs.

Rocketknight1

Jul 2, 2024

pip install torch - you have TF but not Torch installed, and it's trying to initialize a TensorFlow pipeline
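
To check this on your own machine before loading anything, a minimal sketch like the one below can help. It only asks Python whether the `torch` and `tensorflow` packages are importable; the helper names `installed_backends` and `diagnose` are made up for illustration and are not part of transformers.

```python
import importlib.util


def installed_backends(names=("torch", "tensorflow")):
    """Report which backend packages are importable in the current environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def diagnose(backends):
    """Turn the availability map into a short hint (hypothetical helper)."""
    has_torch = backends.get("torch", False)
    has_tf = backends.get("tensorflow", False)
    if has_torch:
        return "PyTorch is available; the pipeline can load a PyTorch model class."
    if has_tf:
        # This is the situation in the traceback above: only TF classes were tried.
        return "Only TensorFlow is installed; run `pip install torch` for PyTorch-only models."
    return "No backend found; run `pip install torch`."


if __name__ == "__main__":
    print(diagnose(installed_backends()))
```

If the script reports that only TensorFlow is installed, that matches the error above, where the only class tried was `TFAutoModelForCausalLM`. In transformers 4.x, `pipeline` also accepts `framework="pt"` to make the choice explicit, but that still requires torch to be installed.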

DevBhuyan

Jul 2, 2024

@Rocketknight1 thanks for pointing that out! I didn't see it that way cuz all the earlier models worked with the TF backend. I guess it's time to move to torch for this one.

shivalikasingh

Aug 1, 2024

Hi, this issue looks resolved, so I'm closing it, but feel free to reopen it if you're still facing any issues related to this, @DevBhuyan!

shivalikasingh changed discussion status to closed Aug 1, 2024
