Instructions to use jmrodri/Llama-3.2_voight-kampff_beta_005 with libraries and local apps.

Libraries

How to use jmrodri/Llama-3.2_voight-kampff_beta_005 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="jmrodri/Llama-3.2_voight-kampff_beta_005")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("jmrodri/Llama-3.2_voight-kampff_beta_005")
model = AutoModelForCausalLM.from_pretrained("jmrodri/Llama-3.2_voight-kampff_beta_005")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use jmrodri/Llama-3.2_voight-kampff_beta_005 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "jmrodri/Llama-3.2_voight-kampff_beta_005"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jmrodri/Llama-3.2_voight-kampff_beta_005",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/jmrodri/Llama-3.2_voight-kampff_beta_005

SGLang

How to use jmrodri/Llama-3.2_voight-kampff_beta_005 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "jmrodri/Llama-3.2_voight-kampff_beta_005" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jmrodri/Llama-3.2_voight-kampff_beta_005",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "jmrodri/Llama-3.2_voight-kampff_beta_005" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jmrodri/Llama-3.2_voight-kampff_beta_005",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use jmrodri/Llama-3.2_voight-kampff_beta_005 with Docker Model Runner:
```
docker model run hf.co/jmrodri/Llama-3.2_voight-kampff_beta_005
```

Dargk: Llama-3.2-3B-Instruct GRPO LoRA (β=0.05)

LoRA adapter fine-tuned from meta-llama/Llama-3.2-3B-Instruct using GRPO (Group Relative Policy Optimization) with a KL-divergence penalty coefficient of β=0.05.

This model was developed as part of the Dargk team's submission to the Voight-Kampff task at ELOQUENT Lab 2026, CLEF 2026. The task asks: can text generated by a language model be distinguished from text written by a human? Systems are scored by how often their outputs fool an AI-detection classifier into believing they are human-authored.

Model Details

Developed by: Dargk Team — Antonela Tommasel & Juan Manuel Rodriguez
Base model: meta-llama/Llama-3.2-3B-Instruct
Model type: Causal LM — Fine-tuned, decoder-only transformer, 3B parameters
Language: English
License: Llama 3.2 Community License
Task: Text generation with human-like stylistic properties

Training

Objective

The model was fine-tuned to generate text that is classified as human-written by an AI-detection classifier. The reward signal is 1 − p(AI), where p(AI) is the probability assigned by Mdok2 — our fine-tuned AI-detection classifier (described below) — that a generated text is AI-authored. This is not RLHF: there is no human feedback. The signal comes entirely from Mdok2, which was itself trained on a labeled corpus of human-written and AI-generated text.
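The reward described above can be sketched in a few lines. This is a minimal illustration, not the team's training code: it assumes the classifier emits one logit per class and that index 1 is the AI-generated class. The function name `reward_from_logits` and the label order are hypothetical.

```python
import math

def reward_from_logits(logits, ai_index=1):
    """Return 1 - p(AI) for one text, given the classifier's raw logits.

    Assumption: `logits` holds one score per class and `ai_index` marks the
    AI-generated class; the real label order of Mdok2 may differ.
    """
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    p_ai = exps[ai_index] / sum(exps)
    return 1.0 - p_ai
```

A text the classifier is confident is human-written gets a reward close to 1, and one it is confident is AI-generated gets a reward close to 0.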

Reward model — Mdok2

Mdok2 is a binary sequence classifier (human-written vs. AI-generated) built on FacebookAI/roberta-large (355M parameters, encoder-only), fine-tuned with LoRA (r=64, α=16, dropout=0.1) on the PAN25 AI-generated text detection dataset (Task 1). It is inspired by but distinct from the original Mdok system. Text is preprocessed before classification: lowercased, with emails, @-mentions, and phone numbers replaced by placeholder tokens.
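The preprocessing step could look roughly like the sketch below. The regular expressions and the placeholder strings `[EMAIL]`, `[USER]`, and `[PHONE]` are assumptions made for illustration; the card does not specify the actual patterns or tokens, and the simple phone pattern here would also catch other long digit runs such as ISO dates.

```python
import re

# Hypothetical patterns and placeholder tokens; the card does not specify them.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{6,}\d(?!\w)")

def preprocess(text):
    """Lowercase the text, then mask emails, @-mentions, and phone numbers."""
    text = text.lower()
    text = EMAIL_RE.sub("[EMAIL]", text)   # emails first, since they contain "@"
    text = MENTION_RE.sub("[USER]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text
```

Lowercasing runs before substitution so that the placeholder tokens keep their casing.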

Training data

Prompts were drawn from the Voight-Kampff task datasets for 2024, 2025, and 2026. Each prompt combines the task's suggested base prompt, a Content field (bullet-point description of a ~500-word text), and a Genre and Style field.
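As a rough sketch of how the three parts might be assembled into a single prompt string: the helper name `build_prompt`, the field labels, and the layout below are illustrative assumptions, since the exact wording is defined by the task datasets.

```python
def build_prompt(base_prompt, content_items, genre_and_style):
    """Join the base prompt, Content bullets, and Genre and Style field.

    Assumption: labels and layout are illustrative; the task datasets
    define the actual wording.
    """
    bullets = "\n".join(f"- {item}" for item in content_items)
    return (
        f"{base_prompt}\n\n"
        f"Content:\n{bullets}\n\n"
        f"Genre and Style: {genre_and_style}"
    )
```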

Training configuration

Parameter	Value
Algorithm	GRPO (TRL)
KL penalty β	0.05
GRPO group size G	8
Epochs	10
Learning rate	5e-5
Batch size	1 (grad. accum. 4)
Max completion length	1000 tokens
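
For orientation, the group-relative part of GRPO can be sketched as follows. With G = 8, eight completions are sampled per prompt, each is scored by the reward, and the rewards are standardized within that group. This is the textbook formulation with an assumed small epsilon and population standard deviation, not a copy of TRL's implementation; the reward values are made up.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-4):
    """Standardize rewards within one group of completions for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]

# Eight hypothetical rewards (1 - p(AI)) for one prompt, matching G = 8.
rewards = [0.10, 0.20, 0.15, 0.90, 0.05, 0.30, 0.25, 0.45]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean get a positive advantage and are reinforced. The KL penalty (β = 0.05) separately discourages the policy from drifting far from the base model.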

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
model_id = "jmrodri/Llama-3.2_voight-kampff_beta_005"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
model.eval()

prompt = "Write a text of about 500 words which covers the following items: ..."

chat = [
    {"role": "system", "content": "You are a helpful assistant that generates helpful answers. "
                                  "You will avoid pleasantries and small talk, focusing on the task at hand."},
    {"role": "system", "content": "You will avoid short paragraphs and bullet points."},
    {"role": "user", "content": prompt},
    {"role": "assistant", "content": ""},
]

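# return_dict=True returns input_ids and attention_mask, so **inputs can be unpacked by generate()
inputs = tokenizer.apply_chat_template(chat, return_tensors="pt", return_dict=True).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=600,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
    )

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
decoded = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(decoded)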

Intended use

This model was developed for participation in the ELOQUENT Lab 2026 Voight-Kampff shared task. It is intended for research into generative text quality, human-likeness evaluation, and AI-detection robustness.

For more information, see the repository Dargk Eloquent 2026.

Contact

Dargk Team

Antonela Tommasel — antonela.tommasel@isistan.unicen.edu.ar
Juan Manuel Rodriguez — jmro@cs.aau.dk


Safetensors

Model size: 3B params
Tensor type: BF16