Model Card for akkikiki/LLaDA-8B-Instruct-judge-fs

This model is a fine-tuned version of GSAI-ML/LLaDA-8B-Instruct. It has been trained using TRL.

Quick start

```python
from transformers import pipeline

# Judge prompt template (Prometheus-style). Fill the {orig_*} placeholders
# with a concrete instruction, response, reference answer, and rubric
# before passing the prompt to the model.
prompt = """###Task Description:
An instruction (might include an Input inside it), a response to evaluate, a reference answer that gets a score of 5, and a score rubric representing a evaluation criteria are given.
1. Write a detailed feedback that assess the quality of the response strictly based on the given score rubric, not evaluating in general.
2. After writing a feedback, write a score that is an integer between 1 and 5. You should refer to the score rubric.
3. The output format should look as follows: "Feedback: (write a feedback for criteria) [RESULT] (an integer number between 1 and 5)"
4. Please do not generate any other opening, closing, and explanations.

###The instruction to evaluate:
{orig_instruction}

###Response to evaluate:
{orig_response}

###Reference Answer (Score 5):
{orig_reference_answer}

###Score Rubrics:
[{orig_criteria}]
Score 1: {orig_score1_description}
Score 2: {orig_score2_description}
Score 3: {orig_score3_description}
Score 4: {orig_score4_description}
Score 5: {orig_score5_description}

###Feedback: """

# LLaDA is a custom (diffusion) architecture, so trust_remote_code=True
# may be required to load it with transformers.
generator = pipeline(
    "text-generation",
    model="akkikiki/LLaDA-8B-Instruct-judge-fs",
    device="cuda",
    trust_remote_code=True,
)
output = generator([{"role": "user", "content": prompt}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```
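The `{orig_*}` placeholders in the template above are meant to be filled with one evaluation instance before generation. A minimal sketch with `str.format`, using made-up field values (not from the Feedback-Collection dataset) and a trimmed copy of the template for brevity:

```python
# Trimmed copy of the judge template above; in practice use the full
# `prompt` string from the quick-start snippet.
template = """###The instruction to evaluate:
{orig_instruction}

###Response to evaluate:
{orig_response}

###Reference Answer (Score 5):
{orig_reference_answer}

###Score Rubrics:
[{orig_criteria}]
Score 5: {orig_score5_description}

###Feedback: """

# Made-up evaluation instance (illustrative values only).
fields = {
    "orig_instruction": "Explain what a hash table is.",
    "orig_response": "A hash table maps keys to values via a hash function.",
    "orig_reference_answer": "A hash table stores key-value pairs and uses a hash function for fast lookup.",
    "orig_criteria": "Is the explanation technically accurate?",
    "orig_score5_description": "Accurate and complete.",
}

# Substitute every placeholder; the filled string is what gets sent
# as the user message to the judge model.
filled_prompt = template.format(**fields)
print(filled_prompt)
```

The filled string replaces the raw template in the `generator(...)` call; the model is expected to reply with feedback followed by `[RESULT] <score>`.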

Training procedure

This model was trained with SFT on 95% of the prometheus-eval/Feedback-Collection dataset, with the remaining 5% held out as a validation set.
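As a rough illustration of that 95/5 split (the card does not state the seed or shuffling procedure, so this is only a sketch with assumed values; the integer list stands in for Feedback-Collection examples):

```python
import random

# Stand-in for the Feedback-Collection examples (illustrative only).
examples = list(range(100))

# Shuffle with a fixed seed for reproducibility; the seed is an
# assumption, not something reported in the model card.
rng = random.Random(42)
rng.shuffle(examples)

# 95% for SFT training, 5% held out for validation.
cut = int(0.95 * len(examples))
train_set, val_set = examples[:cut], examples[cut:]
```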

Framework versions

  • TRL: 0.23.0
  • Transformers: 4.56.2
  • PyTorch: 2.8.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citations

@misc{fujinuma2026unlockingpromptinfillingcapability,
      title={Unlocking Prompt Infilling Capability for Diffusion Language Models}, 
      author={Yoshinari Fujinuma and Keisuke Sakaguchi},
      year={2026},
      eprint={2604.03677},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2604.03677}, 
}