CARDS-Qwen3.5-4B
Fine-tuned Qwen3.5-4B for classification of climate-contrarian claims using the CARDS taxonomy from Coan et al. (2025).
This is a merged checkpoint: a LoRA adapter (rank 16) trained on the CARDS SFT dataset has been merged back into the base weights for direct loading with transformers, vLLM, or any standard inference engine.
Results
Evaluated on the held-out CARDS test set (1,436 samples, Level 1, min_support ≥ 3):
| Metric | Qwen3.5-4B (base) | Qwen3.5-4B FT | Qwen3.5-9B FT | Qwen3.5-27B FT | Claude Opus 4.6 |
|---|---|---|---|---|---|
| Samples F1 | 0.621 | 0.838 | 0.872 | 0.884 | 0.893 |
| Macro F1 | 0.473 | 0.632 | 0.663 | 0.766 | 0.751 |
| Micro F1 | 0.696 | 0.828 | 0.862 | 0.877 | 0.881 |
| Precision | 0.829 | 0.840 | 0.875 | 0.879 | 0.863 |
| Recall | 0.600 | 0.816 | 0.849 | 0.874 | 0.900 |
| Parse failures | 376 / 1436 | 1 / 1436 | 0 / 1436 | 0 / 1436 | 0 / 1436 |
- Fine-tuning lifts samples F1 from 0.621 (base) to 0.838 (+0.217).
- Parse failures collapse from 26% to <1% — the model reliably emits the YAML format.
- Trails larger siblings on absolute accuracy but stays within 0.05 samples F1 of the 27B FT at a fraction of the deployment cost.
- Per-level breakdown: L1 0.838 / L2 0.809 / L3 0.781 samples F1.
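As a quick sanity check, the headline deltas claimed above can be recomputed directly from the results table (a minimal sketch; all figures are copied from the table, nothing is re-measured):

```python
# Recompute the headline deltas from the results table.
base_f1, ft_f1 = 0.621, 0.838

lift = round(ft_f1 - base_f1, 3)      # samples-F1 lift from fine-tuning
parse_fail_base = 376 / 1436          # base-model parse-failure rate
parse_fail_ft = 1 / 1436              # fine-tuned parse-failure rate

print(lift)                           # 0.217
print(f"{parse_fail_base:.0%}")       # 26%
print(f"{parse_fail_ft:.2%}")         # 0.07%
```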
Usage
With vLLM
vllm serve C3DS/CARDS-Qwen3.5-4B \
--port 8000 \
--max-model-len 4096 \
--dtype bfloat16 \
--enable-prefix-caching \
--served-model-name CARDS-Qwen3.5-4B
The system prompt (slim_system_instruction) and the user-message suffix (cot_trigger) the model was trained with are bundled in this repo as cards_prompts.json — self-contained, with the CARDS taxonomy already inlined.
import json
from huggingface_hub import hf_hub_download
from openai import OpenAI
prompts = json.load(open(hf_hub_download("C3DS/CARDS-Qwen3.5-4B", "cards_prompts.json")))
slim_system_instruction = prompts["slim_system_instruction"]
cot_trigger = prompts["cot_trigger"]
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
def classify(text):
resp = client.chat.completions.create(
model="CARDS-Qwen3.5-4B",
messages=[
{"role": "system", "content": slim_system_instruction},
{"role": "user", "content": f"### Text:\n{text}\n\n{cot_trigger}"},
],
temperature=0,
max_tokens=4000,
)
return resp.choices[0].message.content
print(classify("These are only a few renewable energy technologies at work"))
The model produces a reasoning trace inside <think>…</think> followed by a YAML categories: block listing predicted CARDS codes. To parse: take the content after </think> and read the categories: list.
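The parsing step described above can be sketched as follows (a minimal dependency-free parser, assuming the completion is a `<think>…</think>` block followed by a flat YAML `categories:` list; the function name and the example codes are illustrative):

```python
def parse_cards_output(raw: str) -> list[str]:
    """Extract predicted CARDS codes from a model completion.

    Takes everything after the closing </think> tag, then reads the
    YAML-style `categories:` block as a list of `- code` items.
    """
    # Drop the reasoning trace, if present.
    answer = raw.split("</think>", 1)[-1]
    codes = []
    in_block = False
    for line in answer.splitlines():
        stripped = line.strip()
        if stripped.startswith("categories:"):
            in_block = True
            continue
        if in_block:
            if stripped.startswith("- "):
                codes.append(stripped[2:].strip())
            elif stripped:  # any other non-blank line ends the block
                break
    return codes

example = "<think>The text promotes renewables, no contrarian claim.</think>\ncategories:\n- 0_0\n"
print(parse_cards_output(example))  # ['0_0']
```

A YAML library works just as well here; the manual loop only exists to make the expected output shape explicit.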
For an FP8-quantized variant (~4 GB on disk, no measurable accuracy loss) see C3DS/CARDS-Qwen3.5-4B-FP8.
Multimodal — image + text
The base Qwen3.5/3.6 family supports image inputs via the OpenAI-compatible
image_url content part, and this fine-tune preserves that capability — pass
the system prompt below alongside an image (with or without caption text) and
the model will classify the depicted claim under the CARDS taxonomy.
Serve vLLM with multimodal flags enabled:
vllm serve C3DS/CARDS-Qwen3.5-4B \
--port 8000 \
--max-model-len 8192 \
--trust-remote-code \
--limit-mm-per-prompt image=4 \
--enable-prefix-caching \
--served-model-name CARDS-Qwen3.5-4B
import base64, json, mimetypes
from pathlib import Path
from huggingface_hub import hf_hub_download
from openai import OpenAI
prompts = json.load(open(hf_hub_download("C3DS/CARDS-Qwen3.5-4B", "cards_prompts.json")))
slim_system_instruction = prompts["slim_system_instruction"]
cot_trigger = prompts["cot_trigger"]
def image_part(path):
p = Path(path)
mime = mimetypes.guess_type(p)[0] or "image/png"
b64 = base64.b64encode(p.read_bytes()).decode()
return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}}
client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
resp = client.chat.completions.create(
model="CARDS-Qwen3.5-4B",
messages=[
{"role": "system", "content": slim_system_instruction},
{"role": "user", "content": [
{"type": "text", "text": "Read the image (and any caption below) and classify the climate claim it makes."},
image_part("screenshot.png"),
{"type": "text", "text": f"### Caption:\n<optional caption>\n\n{cot_trigger}"},
]},
],
temperature=0,
max_tokens=4000,
)
print(resp.choices[0].message.content)
Training
- Base model: Qwen/Qwen3.5-4B
- Method: LoRA (rank 16, α 16, dropout 0) on q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, then merged into the base weights
- Dataset: C3DS/cards_sft_dataset (sft config, RECoT chat messages)
- Framework: Unsloth + TRL SFTTrainer
- Hyperparameters: 3 epochs, per_device_train_batch_size=1, gradient_accumulation_steps=8, lr=2e-4, cosine schedule, 10 warmup steps, max_seq_length=4096, adamw_8bit, bf16
- Hardware: 1× NVIDIA H200
- Checkpoint selection: best checkpoint kept via load_best_model_at_end=True
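The "merged into the base weights" step can be illustrated in isolation (a toy numpy sketch of a standard LoRA merge, not the actual training code; the dimensions are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 16, 16                  # toy hidden size, LoRA rank and alpha

W = rng.standard_normal((d, d))           # frozen base projection weight
A = rng.standard_normal((r, d)) * 0.01    # LoRA down-projection
B = rng.standard_normal((d, r)) * 0.01    # LoRA up-projection

# Inference with the adapter kept separate: y = Wx + (alpha/r) * B(Ax)
x = rng.standard_normal(d)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merging folds the low-rank update into W, so a plain matmul suffices.
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x

assert np.allclose(y_adapter, y_merged)   # identical outputs, adapter no longer needed
```

Because the update is folded into the dense weights, the merged checkpoint loads like an ordinary model and needs no PEFT/LoRA runtime at inference time.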
Limitations
- Macro F1 on rare labels. Rare level-3 claims (under 10 training examples) trail Claude Opus by a wider margin than common claims, reflecting the long-tailed CARDS distribution.
- Thinking tokens. Training used enable_thinking=True. Either parse output after </think>, or disable thinking at inference via chat_template_kwargs={"enable_thinking": false}. Reserve token budget for the reasoning trace before the final YAML block.
Citation
@article{coan2025cards,
title = {Large language model reveals an increase in climate contrarian speech in the United States Congress},
author = {Coan, Travis G. and Malla, Ranadheer and Nanko, Mirjam O. and Kattrup, William and Roberts, J. Timmons and Cook, John and Boussalis, Constantine},
journal = {Communications Sustainability},
volume = {1},
pages = {37},
year = {2025},
doi = {10.1038/s44458-025-00029-z}
}
License
Apache 2.0, inherited from Qwen3.5-4B.