alpayariyak committed on Apr 9, 2025

Commit 26a4129 (verified) · 1 Parent(s): 70db380

Add usage recommendations

Files changed (1)

README.md +7 -0

@@ -102,6 +102,13 @@ Our model can be served using popular high-performance inference systems:
 All these systems support the OpenAI Chat Completions API format.
+### Usage Recommendations
+Our usage recommendations are similar to those of R1 and R1 Distill series:
+1. Avoid adding a system prompt; all instructions should be contained within the user prompt.
+2. `temperature = 0.6`
+3. `top_p = 0.95`
+4. This model performs best with `max_tokens` set to at least `64000`
 ## License
 This project is released under the MIT License, reflecting our commitment to open and accessible AI development.
 We believe in democratizing AI technology by making our work freely available for anyone to use, modify, and build upon.
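
The recommendations added in this commit could be applied to a request roughly as in the sketch below. It assumes an OpenAI-compatible Chat Completions endpoint, which the README says the supported serving systems expose. The helper name `build_request` and the example prompt are illustrative only and are not part of the repository. The snippet only builds and prints the JSON payload; it does not contact any server.

```python
import json

MODEL_ID = "agentica-org/DeepCoder-14B-Preview"


def build_request(user_prompt: str, max_tokens: int = 64000) -> dict:
    """Build a Chat Completions payload that follows the README's recommendations.

    `build_request` is a hypothetical helper written for this example.
    """
    return {
        "model": MODEL_ID,
        # Recommendation 1: no system message; all instructions go in the user prompt.
        "messages": [{"role": "user", "content": user_prompt}],
        # Recommendations 2 and 3: sampling settings.
        "temperature": 0.6,
        "top_p": 0.95,
        # Recommendation 4: at least 64000 tokens of output budget.
        "max_tokens": max_tokens,
    }


payload = build_request("Write a Python function that reverses a linked list.")
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent as the JSON body of a `POST` to the server's `/v1/chat/completions` route. Whether a given deployment can honor a `max_tokens` of `64000` depends on the context length the server was started with.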