Instructions to use CognitiveKernel/Qwen3-8B-CK-Pro with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use CognitiveKernel/Qwen3-8B-CK-Pro with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="CognitiveKernel/Qwen3-8B-CK-Pro")

# Load model directly
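from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("CognitiveKernel/Qwen3-8B-CK-Pro")
model = AutoModelForCausalLM.from_pretrained("CognitiveKernel/Qwen3-8B-CK-Pro", dtype="auto")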
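
The pipeline above can also be called with a chat-style list of messages. The following is a minimal sketch, assuming a recent `transformers` release in which a text-generation pipeline given chat input returns the whole conversation under `generated_text`. The helper names `extract_reply` and `generate_reply` are hypothetical and not part of the library.

```python
from typing import Any


def extract_reply(outputs: list[dict[str, Any]]) -> str:
    """Return the assistant text from a text-generation pipeline result.

    Assumes chat-style input, where `generated_text` holds the whole
    conversation and the last message is the newly generated reply.
    """
    conversation = outputs[0]["generated_text"]
    return conversation[-1]["content"]


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and answer one user prompt (downloads the weights)."""
    # Imported lazily so extract_reply can be used without transformers.
    from transformers import pipeline

    pipe = pipeline("text-generation", model="CognitiveKernel/Qwen3-8B-CK-Pro")
    messages = [{"role": "user", "content": prompt}]
    return extract_reply(pipe(messages, max_new_tokens=max_new_tokens))
```

Calling `generate_reply("...")` loads the full 8B model, so it is only practical on a machine with enough GPU or CPU memory.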

Local Apps

vLLM

How to use CognitiveKernel/Qwen3-8B-CK-Pro with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "CognitiveKernel/Qwen3-8B-CK-Pro"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CognitiveKernel/Qwen3-8B-CK-Pro",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
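
The same request can be sent from Python. This is a sketch using only the standard library, assuming the server is reachable at the URL from the curl command and returns an OpenAI-style completions response. The function names `build_completion_request` and `complete` are hypothetical helpers for illustration.

```python
import json
import urllib.request


def build_completion_request(
    prompt: str,
    base_url: str = "http://localhost:8000",
    model: str = "CognitiveKernel/Qwen3-8B-CK-Pro",
    max_tokens: int = 512,
    temperature: float = 0.5,
) -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, **kwargs) -> str:
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style completions responses list results under "choices".
    return body["choices"][0]["text"]
```

`complete("Once upon a time,")` needs a running server. Pass `base_url="http://localhost:30000"` to target a server listening on port 30000 instead.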

Use Docker

docker model run hf.co/CognitiveKernel/Qwen3-8B-CK-Pro

SGLang

How to use CognitiveKernel/Qwen3-8B-CK-Pro with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "CognitiveKernel/Qwen3-8B-CK-Pro" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CognitiveKernel/Qwen3-8B-CK-Pro",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "CognitiveKernel/Qwen3-8B-CK-Pro" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CognitiveKernel/Qwen3-8B-CK-Pro",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
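
A freshly started container may take a while to load the weights before it accepts requests. The sketch below polls the server until it responds, assuming the OpenAI-compatible API exposes a `/v1/models` route that returns HTTP 200 once the server is ready. The names `http_probe` and `wait_for_server` are hypothetical helpers, and the timeout values are arbitrary.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: not ready yet.
        return False


def wait_for_server(
    base_url: str = "http://localhost:30000",
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    probe: Callable[[str], bool] = http_probe,
) -> bool:
    """Poll `<base_url>/v1/models` until it responds or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(f"{base_url}/v1/models"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

The `probe` parameter exists so the polling loop can be exercised without a live server, for example by passing a function that always returns `True`.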

Docker Model Runner
How to use CognitiveKernel/Qwen3-8B-CK-Pro with Docker Model Runner:
```
docker model run hf.co/CognitiveKernel/Qwen3-8B-CK-Pro
```

Improve model card: Add comprehensive information and usage

by nielsr HF Staff - opened Aug 5, 2025

base: refs/heads/main ← from: refs/pr/1

README.md +105 -3

@@ -1,3 +1,105 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:e06a4870fe02aa52095717ce69d4dea985e6f10849ffbdb472864ce8ba43b259
-size 85

+---
+license: other
+license_name: cognitive-kernel-pro
+license_link: LICENSE
+pipeline_tag: image-text-to-text
+library_name: transformers
+---
+# Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
+This repository hosts the **Qwen3-8B-CK-Pro** model, an 8B-parameter open-source agent foundation model developed as part of the **Cognitive Kernel-Pro** framework. Cognitive Kernel-Pro is designed to democratize the development and evaluation of advanced AI agents, focusing on open-source and free tools to enable complex reasoning, web interaction, coding, and autonomous research capabilities. It explores high-quality training data curation for Agent Foundation Models and novel strategies for agent test-time reflection and voting, achieving state-of-the-art results among open-source and free agents on the GAIA benchmark.
+- 📚 **Paper**: [Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training](https://huggingface.co/papers/2508.00414)
+- 🌐 **Project Page**: [https://osatlas.github.io/](https://osatlas.github.io/)
+- 💻 **Code**: [https://github.com/OS-Copilot/OS-Atlas](https://github.com/OS-Copilot/OS-Atlas)
+<p align="center"><img src="https://github.com/OS-Copilot/OS-Atlas/raw/main/results.png" alt="Cognitive Kernel-Pro Overview" width="90%"/></p>
+## Quick Start
+This model processes GUI screenshots along with text instructions to produce grounded actions or text responses. It is compatible with the Hugging Face `transformers` library.
+First, ensure you have the necessary dependencies installed:
+```bash
+pip install transformers torch Pillow
+```
+Here is a Python code snippet demonstrating how to perform inference with the model:
+```python
+import torch
+from PIL import Image
+from transformers import AutoModelForCausalLM, AutoProcessor
+# Load the model and processor
+model_id = "CognitiveKernel/Qwen3-8B-CK-Pro"
+model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True)
+processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
+# Example image and question
+# Replace with your actual image path or use a dummy image for testing
+# image_path = "./examples/images/web_dfacd48d-d2c2-492f-b94c-41e6a34ea99f.png" # Example from GitHub repo
+# image = Image.open(image_path).convert('RGB')
+# Or use a dummy image:
+dummy_image = Image.new('RGB', (500, 500), color = 'red') # For testing without a file
+image = dummy_image
+question = "In the screenshot of this web page, please give me the coordinates of the element I want to click on according to my instructions(with point).\"'Champions League' link\""
+# Prepare messages for chat template
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": image},
+            {"type": "text", "text": question},
+        ],
+    }
+]
+# Apply chat template and process inputs
+text = processor.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True
+)
+inputs = processor(
+    text=[text],
+    images=[image],
+    padding=True,
+    return_tensors="pt"
+)
+inputs = inputs.to(model.device)  # keep the BatchFeature so inputs.input_ids stays available
+# Generate response
+generated_ids = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
+# Decode and print the output
+generated_ids_trimmed = [
+    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+]
+output_text = processor.batch_decode(generated_ids_trimmed, skip_special_tokens=False, clean_up_tokenization_spaces=False)[0]
+print(f"User: {question}\nAssistant: {output_text}")
+```
+## Citation
+If you find this work helpful, please cite our paper:
+```bibtex
+@misc{fang2025cognitivekernelpro,
+      title={Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training},
+      author={Tianqing Fang and Zhisong Zhang and Xiaoyang Wang and Rui Wang and Can Qin and Yuxuan Wan and Jun-Yu Ma and Ce Zhang and Jiaqi Chen and Xiyun Li and Hongming Zhang and Haitao Mi and Dong Yu},
+      year={2025},
+      eprint={2508.00414},
+      archivePrefix={arXiv},
+      primaryClass={cs.AI},
+      url={https://arxiv.org/abs/2508.00414},
+}
+```