alexchapin/portkit-coder-8b-grpo8

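A GRPO (Group Relative Policy Optimization) fine-tune of Qwen/Qwen3-8B for converting Minecraft Java Edition (Forge) mods to Bedrock Edition Add-ons. The reward design focuses on suppressing hallucinated APIs.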

Model Details

  • Base model: Qwen/Qwen3-8B
  • Training method: GRPO with anti-hallucination focus
  • Checkpoint: GRPO8 final (50 steps, group_size=16-20)
  • Learning rate: 5e-7
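
The card does not spell out the exact training objective, so the following is only a minimal sketch of the group-relative step that GRPO is generally built on: rewards for completions sampled from the same prompt are normalised against their own group, and the policy update uses a PPO-style clipped surrogate. The function names, the `clip_eps=0.2` value, and the example rewards are assumptions made for illustration, not values from this training run.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalise each reward against its own sampling group (zero mean, unit scale)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def clipped_surrogate(logp_new, logp_old, advantage, clip_eps=0.2):
    """PPO-style clipped objective for one sample. clip_eps is an assumed value."""
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


# One prompt, a group of 16 sampled completions with made-up rewards.
rewards = [0.9, 0.4, 0.7, 0.55] * 4
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's mean get a positive advantage and are reinforced; the clip keeps a single update from moving the policy too far from the sampling policy.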

Reward Components

Component           Weight  Description
manifest            0.10    UUID v4, version array, module type validation
real_api            0.25    Event subscription + API chain depth
anti_hallucination  0.25    Hallucination penalties (shifted to 0.5-1.0)
concise             0.15    Output length optimization
code_bleu           0.25    Token overlap in code blocks
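
The weights above sum to 1.0, which suggests the components are combined as a weighted sum. A minimal sketch of that combination follows, assuming each component score lies in [0, 1]; `combine_rewards` and the example scores are hypothetical and are not taken from the training code.

```python
# Weights copied from the reward component table.
REWARD_WEIGHTS = {
    "manifest": 0.10,
    "real_api": 0.25,
    "anti_hallucination": 0.25,
    "concise": 0.15,
    "code_bleu": 0.25,
}


def combine_rewards(scores, weights=REWARD_WEIGHTS):
    """Weighted sum of per-component scores (each assumed to lie in [0, 1])."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Made-up component scores for a single completion.
example_scores = {
    "manifest": 1.0,
    "real_api": 0.8,
    "anti_hallucination": 0.5,
    "concise": 0.6,
    "code_bleu": 0.4,
}
total = combine_rewards(example_scores)
```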

Anti-Hallucination Fixes

  • Issue #1582: Hardened hallucination detection with comprehensive fake API patterns
  • Issue #1583: Real API fixes (event subscription, API chain depth)
  • Issue #1584: GRPO Stabilization (group_size 16, lr 5e-7, clipped surrogate)
  • Issue #1586: Manifest fixes (UUID v4, version array, module type)
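
The manifest checks named in Issue #1586 (UUID v4, version array, module type) could be approximated as below. This is a sketch under stated assumptions rather than the reward code itself: the helper names are hypothetical, the version check assumes a three-integer array, and `ASSUMED_MODULE_TYPES` is an illustrative subset that should be checked against the Bedrock manifest documentation.

```python
import uuid

# Assumed subset of module types; consult the Bedrock manifest docs for the full list.
ASSUMED_MODULE_TYPES = {"data", "resources", "script"}


def is_uuid_v4(value):
    """True if value is a canonical, lowercase-comparable UUID version 4 string."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return parsed.version == 4 and str(parsed) == value.lower()


def is_version_array(value):
    """True if value looks like [major, minor, patch] with non-negative integers."""
    return (
        isinstance(value, list)
        and len(value) == 3
        and all(
            isinstance(part, int) and not isinstance(part, bool) and part >= 0
            for part in value
        )
    )


def manifest_problems(manifest):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    header = manifest.get("header", {})
    if not is_uuid_v4(header.get("uuid")):
        problems.append("header.uuid is not a UUID v4")
    if not is_version_array(header.get("version")):
        problems.append("header.version is not a [major, minor, patch] array")
    for index, module in enumerate(manifest.get("modules", [])):
        if module.get("type") not in ASSUMED_MODULE_TYPES:
            problems.append(f"modules[{index}].type is not a recognised module type")
        if not is_uuid_v4(module.get("uuid")):
            problems.append(f"modules[{index}].uuid is not a UUID v4")
    return problems
```

Returning a list of problems instead of a boolean makes it straightforward to derive a graded score, for example from the fraction of checks that pass.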

Hard Hallucination Patterns (NEVER valid in Bedrock)

  • ServerPlayerAPI, ServerPlayer, PlayerAPI
  • WorldEvent, BlockEntityAPI, EntityPlayerAPI, WorldAPI
  • require('@minecraft/server')
  • registerMod(), getServer(), Server.getInstance()
  • event.level, server.getWorld
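
A detector for the patterns listed above can be sketched with regular expressions. The regexes are one possible encoding of the list, and the score mapping is only one possible reading of "shifted to 0.5-1.0" from the reward table (a linear shift of a raw [0, 1] score into [0.5, 1.0]); both are assumptions, not the hardened detector from Issue #1582.

```python
import re

# One regex per listed pattern; word boundaries stop ServerPlayer from
# also matching inside ServerPlayerAPI.
HARD_HALLUCINATION_PATTERNS = [
    r"\bServerPlayerAPI\b",
    r"\bServerPlayer\b",
    r"\bPlayerAPI\b",
    r"\bWorldEvent\b",
    r"\bBlockEntityAPI\b",
    r"\bEntityPlayerAPI\b",
    r"\bWorldAPI\b",
    r"""require\(\s*['"]@minecraft/server['"]\s*\)""",
    r"\bregisterMod\s*\(",
    r"\bgetServer\s*\(",
    r"\bServer\.getInstance\s*\(",
    r"\bevent\.level\b",
    r"\bserver\.getWorld\b",
]
COMPILED_PATTERNS = [re.compile(pattern) for pattern in HARD_HALLUCINATION_PATTERNS]


def find_hallucinations(code):
    """Return the regex source of every listed pattern that occurs in the code."""
    return [regex.pattern for regex in COMPILED_PATTERNS if regex.search(code)]


def anti_hallucination_score(code):
    """Assumed mapping: raw clean fraction in [0, 1], shifted linearly into [0.5, 1.0]."""
    hits = len(find_hallucinations(code))
    raw = 1.0 - hits / len(COMPILED_PATTERNS)
    return 0.5 + 0.5 * raw
```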

Real Bedrock APIs (ALWAYS valid)

  • world.afterEvents, world.beforeEvents
  • system.runInterval, system.runTimeout
  • player.sendMessage, player.getComponent
  • { world, system, player } imports from @minecraft/server
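
The two signals named for the real_api component (event subscription and API chain depth) could be measured roughly as follows. How they are turned into a score is not described in this card, so the sketch only extracts the raw signals; the regexes and function names are assumptions.

```python
import re

# Matches e.g. world.afterEvents.playerSpawn.subscribe(
SUBSCRIPTION_RE = re.compile(
    r"\bworld\.(?:afterEvents|beforeEvents)\.\w+\.subscribe\s*\("
)
# Dotted access chains rooted at world, system, or player.
CHAIN_RE = re.compile(r"\b(?:world|system|player)(?:\.\w+)+")


def count_event_subscriptions(code):
    """Number of subscribe calls on world.afterEvents / world.beforeEvents."""
    return len(SUBSCRIPTION_RE.findall(code))


def max_chain_depth(code):
    """Largest number of '.' hops in any chain rooted at world, system, or player."""
    return max((m.group(0).count(".") for m in CHAIN_RE.finditer(code)), default=0)


# A small Bedrock script used as sample model output.
sample = (
    'import { world, system } from "@minecraft/server";\n'
    "world.afterEvents.playerSpawn.subscribe((event) => {\n"
    '  system.runTimeout(() => event.player.sendMessage("Welcome"), 20);\n'
    "});"
)
subscriptions = count_event_subscriptions(sample)
depth = max_chain_depth(sample)
```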

Usage

from transformers import AutoModelForCausalLM

# Download the checkpoint from the Hugging Face Hub and load it as a causal LM.
model = AutoModelForCausalLM.from_pretrained("alexchapin/portkit-coder-8b-grpo8")

Framework Versions

  • tinker-cookbook: 0.4.1
  • transformers: 5.5.3
  • torch: 2.11.0+rocm7.2