CARDS-Qwen3.5-4B
Fine-tuned Qwen3.5-4B for classification of climate-contrarian claims using the CARDS taxonomy from Coan et al. (2025).
This is a merged checkpoint: a LoRA adapter (rank 16) trained on the CARDS SFT dataset has been merged back into the base weights for direct loading with transformers, vLLM, or any standard inference engine.
Results
Evaluated on the held-out CARDS test set (1,436 samples, Level 1, min_support ≥ 3):
| Metric | Qwen3.5-4B (base) | Qwen3.5-4B FT | Qwen3.5-9B FT | Qwen3.5-27B FT | Claude Opus 4.6 |
|---|---|---|---|---|---|
| Samples F1 | 0.621 | 0.838 | 0.872 | 0.884 | 0.893 |
| Macro F1 | 0.473 | 0.632 | 0.663 | 0.766 | 0.751 |
| Micro F1 | 0.696 | 0.828 | 0.862 | 0.877 | 0.881 |
| Precision | 0.829 | 0.840 | 0.875 | 0.879 | 0.863 |
| Recall | 0.600 | 0.816 | 0.849 | 0.874 | 0.900 |
| Parse failures | 376 / 1436 | 1 / 1436 | 0 / 1436 | 0 / 1436 | 0 / 1436 |
- Fine-tuning lifts samples F1 from 0.621 (base) to 0.838 (+0.217).
- Parse failures collapse from 26% to <1% — the model reliably emits the YAML format.
- Trails larger siblings on absolute accuracy but stays within 0.05 samples F1 of the 27B FT at a fraction of the deployment cost.
- Per-level breakdown: L1 0.838 / L2 0.809 / L3 0.781 samples F1.
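As a quick sanity check, the headline deltas claimed above can be recomputed directly from the results table (a minimal sketch; all figures are copied from the table, nothing is re-measured):

```python
# Recompute the headline deltas from the results table.
base_f1, ft_f1 = 0.621, 0.838

lift = round(ft_f1 - base_f1, 3)      # samples-F1 lift from fine-tuning
parse_fail_base = 376 / 1436          # base-model parse-failure rate
parse_fail_ft = 1 / 1436              # fine-tuned parse-failure rate

print(lift)                           # 0.217
print(f"{parse_fail_base:.0%}")       # 26%
print(f"{parse_fail_ft:.2%}")         # 0.07%
```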
Usage
With vLLM
vllm serve C3DS/CARDS-Qwen3.5-4B \
--port 8000 \
--max-model-len 4096 \
--dtype bfloat16 \
--enable-prefix-caching \
--served-model-name CARDS-Qwen3.5-4B
The system prompt (slim_system_instruction) and the user-message suffix (cot_trigger) the model was trained with are bundled in this repo as cards_prompts.json — self-contained, with the CARDS taxonomy already inlined.
import json
from huggingface_hub import hf_hub_download
from openai import OpenAI
prompts = json.load(open(hf_hub_download("C3DS/CARDS-Qwen3.5-4B", "cards_prompts.json")))
slim_system_instruction = prompts["slim_system_instruction"]
cot_trigger = prompts["cot_trigger"]
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
def classify(text):
resp = client.chat.completions.create(
model="CARDS-Qwen3.5-4B",
messages=[
{"role": "system", "content": slim_system_instruction},
{"role": "user", "content": f"### Text:\n{text}\n\n{cot_trigger}"},
],
temperature=0,
max_tokens=4000,
)
return resp.choices[0].message.content
print(classify("These are only a few renewable energy technologies at work"))
The model produces a reasoning trace inside <think>…</think> followed by a YAML categories: block listing predicted CARDS codes. To parse: take the content after </think> and read the categories: list.
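The parsing step described above can be sketched as follows (a minimal dependency-free parser, assuming the completion is a `<think>…</think>` block followed by a flat YAML `categories:` list; the function name and the example codes are illustrative):

```python
def parse_cards_output(raw: str) -> list[str]:
    """Extract predicted CARDS codes from a model completion.

    Takes everything after the closing </think> tag, then reads the
    YAML-style `categories:` block as a list of `- code` items.
    """
    # Drop the reasoning trace, if present.
    answer = raw.split("</think>", 1)[-1]
    codes = []
    in_block = False
    for line in answer.splitlines():
        stripped = line.strip()
        if stripped.startswith("categories:"):
            in_block = True
            continue
        if in_block:
            if stripped.startswith("- "):
                codes.append(stripped[2:].strip())
            elif stripped:  # any other non-blank line ends the block
                break
    return codes

example = "<think>The text promotes renewables, no contrarian claim.</think>\ncategories:\n- 0_0\n"
print(parse_cards_output(example))  # ['0_0']
```

A YAML library works just as well here; the manual loop only exists to make the expected output shape explicit.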
For an FP8-quantized variant (~4 GB on disk, no measurable accuracy loss) see C3DS/CARDS-Qwen3.5-4B-FP8.
Multimodal — image + text
The base Qwen3.5/3.6 family supports image inputs via the OpenAI-compatible
image_url content part, and this fine-tune preserves that capability — pass
the system prompt below alongside an image (with or without caption text) and
the model will classify the depicted claim under the CARDS taxonomy.
Serve vLLM with multimodal flags enabled:
vllm serve C3DS/CARDS-Qwen3.5-4B \
--port 8000 \
--max-model-len 8192 \
--trust-remote-code \
--limit-mm-per-prompt image=4 \
--enable-prefix-caching \
--served-model-name CARDS-Qwen3.5-4B
import base64, json, mimetypes
from pathlib import Path
from huggingface_hub import hf_hub_download
from openai import OpenAI
prompts = json.load(open(hf_hub_download("C3DS/CARDS-Qwen3.5-4B", "cards_prompts.json")))
slim_system_instruction = prompts["slim_system_instruction"]
cot_trigger = prompts["cot_trigger"]
def image_part(path):
p = Path(path)
mime = mimetypes.guess_type(p)[0] or "image/png"
b64 = base64.b64encode(p.read_bytes()).decode()
return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}}
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
resp = client.chat.completions.create(
model="CARDS-Qwen3.5-4B",
messages=[
{"role": "system", "content": slim_system_instruction},
{"role": "user", "content": [
{"type": "text", "text": "Read the image (and any caption below) and classify the climate claim it makes."},
image_part("screenshot.png"),
{"type": "text", "text": f"### Caption:\n<optional caption>\n\n{cot_trigger}"},
]},
],
temperature=0,
max_tokens=4000,
)
print(resp.choices[0].message.content)
Training
- Base model: Qwen/Qwen3.5-4B
- Method: LoRA (rank 16, α 16, dropout 0) on q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, then merged into the base weights
- Dataset: C3DS/cards_sft_dataset (sft config, RECoT chat messages)
- Framework: Unsloth + TRL SFTTrainer
- Hyperparameters: 3 epochs, per_device_train_batch_size=1, gradient_accumulation_steps=8, lr=2e-4, cosine schedule, 10 warmup steps, max_seq_length=4096, adamw_8bit, bf16
- Hardware: 1× NVIDIA H200
- Checkpoint selection: best checkpoint kept via load_best_model_at_end=True
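The "merged into the base weights" step can be illustrated in isolation (a toy numpy sketch of a standard LoRA merge, not the actual training code; the dimensions are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 16, 16                  # toy hidden size, LoRA rank and alpha

W = rng.standard_normal((d, d))           # frozen base projection weight
A = rng.standard_normal((r, d)) * 0.01    # LoRA down-projection
B = rng.standard_normal((d, r)) * 0.01    # LoRA up-projection

# Inference with the adapter kept separate: y = Wx + (alpha/r) * B(Ax)
x = rng.standard_normal(d)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merging folds the low-rank update into W, so a plain matmul suffices.
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x

assert np.allclose(y_adapter, y_merged)   # identical outputs, adapter no longer needed
```

Because the update is folded into the dense weights, the merged checkpoint loads like an ordinary model and needs no PEFT/LoRA runtime at inference time.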
Limitations
- Macro F1 on rare labels. Rare level-3 claims (under 10 training examples) trail Claude Opus by a wider margin than common claims, reflecting the long-tailed CARDS distribution.
- Thinking tokens. Training used enable_thinking=True. Either parse output after </think>, or disable thinking at inference via chat_template_kwargs={"enable_thinking": false}. Reserve token budget for the reasoning trace before the final YAML block.
Citation
@article{coan2025cards,
title = {Large language model reveals an increase in climate contrarian speech in the United States Congress},
author = {Coan, Travis G. and Malla, Ranadheer and Nanko, Mirjam O. and Kattrup, William and Roberts, J. Timmons and Cook, John and Boussalis, Constantine},
journal = {Communications Sustainability},
volume = {1},
pages = {37},
year = {2025},
doi = {10.1038/s44458-025-00029-z}
}
License
Apache 2.0, inherited from Qwen3.5-4B.