fineinstructions/template_instantiator

@@ -2,21 +2,26 @@
 base_model: meta-llama/Llama-3.2-1B-Instruct
 datasets:
 - fineinstructions/template_instantiator_training
+library_name: transformers
+pipeline_tag: text-generation
 tags:
 - datadreamer
 - datadreamer-0.46.0
 - synthetic
 - text-generation
-pipeline_tag: text-generation
 ---
 [![FineInstructionsCoverImage](https://cdn-uploads.huggingface.co/production/uploads/61c40eeb727d1257bf3cf5ba/jSiXJ8FaogflCSRt_YirX.png)](https://huggingface.co/fineinstructions)
 **✨ Note:** For all FineInstructions resources please visit: https://huggingface.co/fineinstructions
 ----
-This model will take a instruction template in the format of [FineTemplates](https://huggingface.co/datasets/fineinstructions/finetemplates) and a document and return an instantiated instruction and answer pair.
-The output will be a JSON object.
+This model is the Template Instantiator described in the paper [FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale](https://huggingface.co/papers/2601.22146).
+It takes an instruction template in the format of [FineTemplates](https://huggingface.co/datasets/fineinstructions/finetemplates) and a document and returns an instantiated instruction and answer pair.
+The output is a JSON object.
 ## Simple Usage Example
@@ -63,7 +68,7 @@ inputs = [json.dumps({
 prompts = [tokenizer.apply_chat_template([{'role': 'user', 'content': i}], tokenize=False, add_generation_prompt=True) for i in inputs]
 generations = pipe(prompts, max_length=131072, truncation=True, temperature=None, top_p=None, do_sample=False)
 output = generations[0][0]['generated_text']
-output_json = json.loads()
+output_json = json.loads(output)
 # Expand the answer
 output_json["answer"] = expand(document=inputs[0]["document"], text=output_json["answer"])
@@ -95,5 +100,4 @@ If you use this project in your research please cite:
   primaryClass={cs.CL},
   doi={10.48550/arXiv.2601.22146}
 }
-```
+```
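
The corrected `json.loads(output)` call assumes the generation is always a valid JSON object. Below is a minimal sketch of a more defensive parse, assuming only that the object carries an `answer` key, as the `expand` call in the diff implies. The helper name `parse_generation`, the `instruction` key, and the sample strings are hypothetical and are not part of the model card.

```python
import json


def parse_generation(text):
    """Parse one generated string into a dict, or return None if unusable.

    Hypothetical helper: not part of the model card or of any library.
    """
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None  # the generation was not valid JSON
    # The usage example indexes output_json["answer"], so require that key.
    if not isinstance(parsed, dict) or "answer" not in parsed:
        return None
    return parsed


# Illustrative strings only; real model output may contain other keys.
sample = '{"instruction": "Summarize the text.", "answer": "A short summary."}'
print(parse_generation(sample)["answer"])
print(parse_generation("not json"))
```

Returning `None` instead of raising lets a batch loop skip malformed generations; raising may be preferable if a single bad output should stop the run.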