✨ Note: For all FineInstructions resources, please visit: https://huggingface.co/fineinstructions

This model converts a query, instruction, or prompt into a generic instruction template in the FineTemplates format.

The output is a JSON object that contains the template along with metadata fields such as the task type and difficulty.

Simple Usage Example

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained('fineinstructions/query_templatizer', revision=None)
tokenizer.padding_side = 'left'
model = AutoModelForCausalLM.from_pretrained('fineinstructions/query_templatizer', revision=None)
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, pad_token_id=tokenizer.pad_token_id, return_full_text=False)

# Run inference to templatize the query
inputs = ["What volleyball exercises should I do I'm almost in high school and i do volleyball excellence five times a week (basically an advanced class in school with experienced volleyball coaches) , we have 2-3 skill training sessions a week which i feel like isn't enough for me as I would like to improve my skills almost every day.\n\n&amp;#x200B;\n\nWhat i wanted to know was what setting, digging, serving and spiking exercises could i do that would help me improve all of my skills (I have a large area to practice all these things so space isn't an issue)."]
prompts = [tokenizer.apply_chat_template([{'role': 'user', 'content': i}], tokenize=False, add_generation_prompt=True) for i in inputs]
generations = pipe(prompts, max_length=131072, truncation=True, temperature=None, top_p=None, do_sample=False)
output = generations[0][0]['generated_text']
print(output)

##### Output:
# {
#   ...,
#   "template": "What <fi>name of sport or activity</fi> exercises should I do I'm almost <fi>level of experience or age</fi> and I do <fi>name of sport or activity</fi> <fi>frequency of training sessions</fi> (basically an <fi>level of experience or age</fi> class with experienced <fi>instructors or coaches</fi>), we have <fi>number of training sessions per week</fi> which I feel like isn't enough for me as I would like to improve my skills almost every <fi>unit of time</fi>. What I wanted to know was what <fi>specific skills or techniques</fi> exercises could I do that would help me improve all of my skills (I have a <fi>description of the area for practice</fi> so space isn't an issue).",
#   "compatible_document_description": "A document that provides guidance on a specific sport or activity, including training plans and exercises tailored to different levels of experience or age, would be suitable. The document should contain information on the frequency of training sessions, the role of experienced instructors or coaches, and the importance of regular practice to improve skills. It should also offer specific exercises and techniques for various skills or techniques relevant to the sport or activity, taking into account the individual's current level of experience and the available training space. Additionally, the document should cover the benefits of practicing a particular area of the sport or activity, such as a specific skill or technique, and provide a plan for improving skills at a rate of almost every unit of time. The document could be a training manual, a coaching guide, a blog post, or an article from a reputable source in the field of sports or fitness, and could be written for beginners, intermediate, or advanced practitioners. The document should be detailed enough to provide actionable advice and exercises for improving skills in a specific sport or activity, and should be relevant to the individual's current level of experience and training frequency. Overall, the document should be a comprehensive resource that addresses the needs of individuals seeking to improve their skills in a particular sport or activity.",
#   "qa_or_tasky": "qa",
#   "realistic": true,
#   "conversational": false,
#   "task_type_open": "Provide exercises for improving skills",
#   "task_type_closed": "text_generation",
#   "difficulty": 0.2,
#   "compatibility": 0.05,
#   "query_frequency": 0.01,
#   "is_knowledge_recall": true,
#   "is_reasoning": false,
#   "is_code": false,
#   "is_math": false,
#   "is_science": false,
#   "is_medicine": false,
#   "is_personal_life": true,
#   "is_agenty": false,
#   "is_planning": false,
#   "is_few_shot": false,
#   ...
# }
#
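The generated text can be post-processed with the standard library. The following is a minimal sketch, not part of the model's own tooling: it assumes the model returned valid JSON with a "template" key, as in the output above. `sample_output` is a shortened, hypothetical stand-in for the `output` string, and `parse_template` and `fill_template` are illustrative helper names.

```python
import json
import re

# Hypothetical, shortened stand-in for the `output` string printed above.
sample_output = json.dumps({
    "template": "What <fi>name of sport or activity</fi> exercises should I do "
                "to improve my <fi>specific skills or techniques</fi>?",
    "qa_or_tasky": "qa",
    "difficulty": 0.2,
})

# Matches one <fi>...</fi> placeholder and captures its description.
FI_TAG = re.compile(r"<fi>(.*?)</fi>")


def parse_template(raw):
    """Parse the generated JSON and list the <fi> placeholder descriptions in order."""
    record = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    placeholders = FI_TAG.findall(record["template"])
    return record, placeholders


def fill_template(template, values):
    """Replace each <fi>...</fi> tag, left to right, with the matching value."""
    expected = len(FI_TAG.findall(template))
    if expected != len(values):
        raise ValueError(f"expected {expected} values, got {len(values)}")
    remaining = iter(values)
    return FI_TAG.sub(lambda _match: next(remaining), template)


record, placeholders = parse_template(sample_output)
filled = fill_template(record["template"], ["tennis", "serve and volley"])
print(placeholders)
print(filled)
```

In practice, `output` from the usage example would be passed to `parse_template` in place of `sample_output`. Generation can be truncated or malformed, so wrapping the call in a `try`/`except json.JSONDecodeError` is a reasonable precaution.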

This model was trained on a synthetic dataset using DataDreamer 🤖💤. The synthetic dataset card and model card can be found here. The training arguments can be found here.

If you use this project in your research, please cite:

@article{patel2026fineinstructions,
  title={FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale},
  author={Patel, Ajay and Raffel, Colin and Callison-Burch, Chris},
  journal={arXiv preprint arXiv:2601.22146},
  year={2026},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  doi={10.48550/arXiv.2601.22146}
}


Model size: 1B params

Tensor type: BF16 (Safetensors)

Base model: meta-llama/Llama-3.2-1B-Instruct (this model is a fine-tune of it)

Paper: FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale (arXiv:2601.22146, published Jan 29)