GRPO Humanizer DE

A Qwen3-8B model fine-tuned with Group Relative Policy Optimization (GRPO) to rewrite AI-generated German academic text so that it is classified as human-written by GPTZero while preserving semantic content.

Training details

Parameter               Value
----------------------  -----------------
Base model              Qwen/Qwen3-8B
Method                  GRPO (TRL) + LoRA
Learning rate           5e-06
Batch size              2
Gradient accumulation   8
Max steps               50
Precision               bf16
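The table above maps directly onto TRL's GRPOTrainer with a PEFT/LoRA adapter. The sketch below is illustrative only: the reward function, the LoRA hyperparameters (r, alpha), and the training dataset are not documented on this card and are placeholder assumptions.

```python
# Sketch of the training setup from the table above (TRL GRPO + LoRA).
# Reward function, LoRA ranks, and dataset are placeholders, not the
# actual values used to train this model.
from datasets import Dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer

def reward_evades_detector(completions, **kwargs):
    # Placeholder: the real reward presumably scores each completion with
    # an AI-text detector such as GPTZero. One float per completion.
    return [0.0 for _ in completions]

training_args = GRPOConfig(
    output_dir="grpo-humanizer-de",
    learning_rate=5e-6,             # from the table
    per_device_train_batch_size=2,  # batch size 2
    gradient_accumulation_steps=8,
    max_steps=50,
    bf16=True,
)

# LoRA settings are assumed; the card only says "LoRA" was used.
peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)

# Dummy one-row dataset; the real training prompts are not published.
train_dataset = Dataset.from_dict({"prompt": ["Formuliere diesen Absatz um: ..."]})

trainer = GRPOTrainer(
    model="Qwen/Qwen3-8B",
    reward_funcs=reward_evades_detector,
    args=training_args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```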

Intended use

Academic text humanisation for German-language content. The model is designed to be called via the Hugging Face Inference API from the GhostWriter application.
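A client call along those lines can be sketched with the standard serverless Inference API request shape. This is a minimal sketch, not GhostWriter's actual client code: the token variable, the response shape (`[{"generated_text": ...}]` for text generation), and availability of a serverless endpoint for this model are assumptions.

```python
# Minimal sketch of calling the model through the Hugging Face serverless
# Inference API with only the standard library. Token and response shape
# are assumptions; the actual GhostWriter client is not shown on this card.
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/LevArtesa/grpo-humanizer-de"

def build_request(text: str, token: str) -> urllib.request.Request:
    """Build the POST request the Inference API expects."""
    payload = json.dumps({"inputs": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def humanize(text: str, token: str) -> str:
    """Send German text to the model and return the rewritten version."""
    with urllib.request.urlopen(build_request(text, token)) as resp:
        result = json.loads(resp.read())
    # Text-generation endpoints typically return [{"generated_text": ...}].
    return result[0]["generated_text"]
```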

Licence

Apache-2.0

