Dargk — Llama-3.2-3B-instruct GRPO LoRA (β=0.05)
A LoRA adapter fine-tuned from meta-llama/Llama-3.2-3B-Instruct using GRPO (Group Relative Policy Optimization) with a KL-divergence penalty of β=0.05.
This model was developed as part of the Dargk team's submission to the Voight-Kampff task at ELOQUENT Lab 2026, CLEF 2026. The task asks: can text generated by a language model be distinguished from text written by a human? Systems are scored by how often their outputs fool an AI-detection classifier into believing they are human-authored.
Model Details
- Developed by: Dargk Team — Antonela Tommasel & Juan Manuel Rodriguez
- Base model: meta-llama/Llama-3.2-3B-Instruct
- Model type: Causal LM (fine-tuned, decoder-only transformer, 3B parameters)
- Language: English
- License: Llama 3.2 Community License
- Task: Text generation with human-like stylistic properties
Training
Objective
The model was fine-tuned to generate text that is classified as human-written by an AI-detection classifier. The reward signal is 1 − p(AI), where p(AI) is the probability assigned by Mdok2 — our fine-tuned AI-detection classifier (described below) — that a generated text is AI-authored. This is not RLHF: there is no human feedback. The signal comes entirely from Mdok2, which was itself trained on a labeled corpus of human-written and AI-generated text.
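The reward described above can be sketched in a few lines. This is a minimal illustration, not the team's training code: it assumes the classifier returns two raw logits per text and that index 1 is the AI-generated class (`ai_index` is a hypothetical parameter; the actual label order of Mdok2 is not stated here).

```python
import math

def human_likeness_reward(logits, ai_index=1):
    """Return 1 - p(AI) for one text, given the classifier's raw logits.

    Assumption: `logits` holds one score per class and `ai_index`
    points at the AI-generated class.
    """
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    p_ai = exps[ai_index] / sum(exps)
    return 1.0 - p_ai

# An undecided classifier (equal logits) gives a reward of 0.5.
print(human_likeness_reward([0.0, 0.0]))
```

A text the classifier confidently labels as human gets a reward close to 1, and one it confidently labels as AI-generated gets a reward close to 0.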
Reward model — Mdok2
Mdok2 is a binary sequence classifier (human-written vs. AI-generated) built on FacebookAI/roberta-large (355M parameters, encoder-only), fine-tuned with LoRA (r=64, α=16, dropout=0.1) on the PAN25 AI-generated text detection dataset (Task 1). It is inspired by but distinct from the original Mdok system. Text is preprocessed before classification: lowercased, with emails, @-mentions, and phone numbers replaced by placeholder tokens.
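The preprocessing step could look roughly like the sketch below. The regular expressions, the placeholder strings (`<email>`, `<user>`, `<phone>`), and the order of operations are all assumptions made for illustration; the patterns actually used by Mdok2 are not given in this card.

```python
import re

# Hypothetical, deliberately loose patterns; they will miss some cases
# and over-match others (for example long runs of digits and spaces).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
MENTION_RE = re.compile(r"(?<!\w)@\w+")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d\s().-]{6,}\d(?!\w)")

def preprocess(text):
    """Lowercase the text and mask emails, @-mentions, and phone numbers."""
    text = text.lower()
    # Emails go first so their "@domain" part is not taken for a mention.
    text = EMAIL_RE.sub("<email>", text)
    text = MENTION_RE.sub("<user>", text)
    text = PHONE_RE.sub("<phone>", text)
    return text

print(preprocess("Mail Bob@Example.com or @Alice, call +1 (555) 123-4567."))
```

Whatever the exact patterns, the same preprocessing should be applied to generated text before it is scored, so that the reward is computed on the input distribution the classifier was trained on.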
Training data
Prompts were drawn from the Voight-Kampff task datasets for 2024, 2025, and 2026. Each prompt combines the task's suggested base prompt, a Content field (bullet-point description of a ~500-word text), and a Genre and Style field.
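As a rough illustration of how the three parts might be combined, the sketch below assembles one prompt string. The base prompt wording is taken from the Usage section of this card; the field label `Genre and style:` and the line layout are assumptions, and `build_prompt` is a hypothetical helper rather than part of the released code.

```python
BASE_PROMPT = "Write a text of about 500 words which covers the following items:"

def build_prompt(content_items, genre_and_style, base_prompt=BASE_PROMPT):
    """Join the base prompt, the Content bullets, and the Genre and Style field."""
    # One bullet per content item, mirroring the bullet-point Content field.
    bullets = "\n".join(f"- {item}" for item in content_items)
    return f"{base_prompt}\n{bullets}\nGenre and style: {genre_and_style}"

print(build_prompt(
    ["a town votes on a new bridge", "the budget is disputed"],
    "local newspaper report",
))
```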
Training configuration
| Parameter | Value |
|---|---|
| Algorithm | GRPO (TRL) |
| KL penalty β | 0.05 |
| GRPO group size G | 8 |
| Epochs | 10 |
| Learning rate | 5e-5 |
| Batch size | 1 (grad. accum. 4) |
| Max completion length | 1000 tokens |
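To make the group size concrete: for each prompt, GRPO samples G = 8 completions, scores each one, and normalises the rewards within the group to obtain advantages. The sketch below shows that normalisation in plain Python. The epsilon value and the use of the population standard deviation are assumptions; TRL's implementation may differ in these details.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-4):
    """Normalise rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Eight made-up rewards (1 - p(AI)) for eight completions of a single prompt.
rewards = [0.9, 0.1, 0.5, 0.7, 0.3, 0.6, 0.2, 0.8]
advantages = group_relative_advantages(rewards)
print([round(a, 2) for a in advantages])
```

Completions scoring above the group mean receive positive advantages and are reinforced; those below are discouraged. The KL penalty (β = 0.05) is applied separately and limits how far the policy drifts from the base model.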
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
model_id = "jmrodri/Llama-3.2_voight-kampff_beta_005"

# The tokenizer comes from the base model; Llama has no pad token, so reuse EOS.
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
tokenizer.pad_token = tokenizer.eos_token

# If the repository holds only LoRA adapter weights, transformers needs the
# `peft` package installed to load it this way.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
model.eval()

prompt = "Write a text of about 500 words which covers the following items: ..."
chat = [
    {"role": "system", "content": "You are a helpful assistant that generates helpful answers. "
                                  "You will avoid pleasantries and small talk, focusing on the task at hand."},
    {"role": "system", "content": "You will avoid short paragraphs and bullet points."},
    {"role": "user", "content": prompt},
    {"role": "assistant", "content": ""},
]

# return_dict=True yields a mapping (input_ids, attention_mask) that can be
# unpacked into generate(); without it a bare tensor may be returned.
inputs = tokenizer.apply_chat_template(chat, return_dict=True, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=600,
        do_sample=True,
        temperature=0.8,
        top_p=0.9,
    )

# The decoded string contains the prompt followed by the generated text.
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
Intended use
This model was developed for participation in the ELOQUENT Lab 2026 Voight-Kampff shared task. It is intended for research into generative text quality, human-likeness evaluation, and AI-detection robustness.
For more information, see the repository Dargk Eloquent 2026.
Contact
Dargk Team
- Antonela Tommasel — antonela.tommasel@isistan.unicen.edu.ar
- Juan Manuel Rodriguez — jmro@cs.aau.dk