Add library metadata and link to paper
#1
by nielsr HF Staff - opened
README.md CHANGED

@@ -2,21 +2,26 @@
 base_model: meta-llama/Llama-3.2-1B-Instruct
 datasets:
 - fineinstructions/template_instantiator_training
+library_name: transformers
+pipeline_tag: text-generation
 tags:
 - datadreamer
 - datadreamer-0.46.0
 - synthetic
 - text-generation
-pipeline_tag: text-generation
 ---
+
 [](https://huggingface.co/fineinstructions)

 **✨ Note:** For all FineInstructions resources please visit: https://huggingface.co/fineinstructions

 ----
-This model will take a instruction template in the format of [FineTemplates](https://huggingface.co/datasets/fineinstructions/finetemplates) and a document and return an instantiated instruction and answer pair.

-
+This model is the Template Instantiator described in the paper [FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale](https://huggingface.co/papers/2601.22146).
+
+It takes an instruction template in the format of [FineTemplates](https://huggingface.co/datasets/fineinstructions/finetemplates) and a document and returns an instantiated instruction and answer pair.
+
+The output is a JSON object.

 ## Simple Usage Example

@@ -63,7 +68,7 @@ inputs = [json.dumps({
 prompts = [tokenizer.apply_chat_template([{'role': 'user', 'content': i}], tokenize=False, add_generation_prompt=True) for i in inputs]
 generations = pipe(prompts, max_length=131072, truncation=True, temperature=None, top_p=None, do_sample=False)
 output = generations[0][0]['generated_text']
-output_json = json.loads()
+output_json = json.loads(output)

 # Expand the answer
 output_json["answer"] = expand(document=inputs[0]["document"], text=output_json["answer"])
@@ -95,5 +100,4 @@ If you use this project in your research please cite:
 primaryClass={cs.CL},
 doi={10.48550/arXiv.2601.22146}
 }
-```
-
+```
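
For reviewers: the functional change in this PR is the `json.loads` call in the usage example. The sketch below shows why the old line could not run and what the fixed line does. It is a minimal illustration, not the README's code: the `output` string is a hypothetical stand-in for the model's generated text, and the `instruction` key is an assumption (the README itself only reads `output_json["answer"]`).

```python
import json

# Hypothetical stand-in for the model's generated text. The real value comes
# from the pipeline call in the README; the "instruction" key is an assumption.
output = '{"instruction": "Summarize the document.", "answer": "A short summary."}'

# Old line: json.loads() is called with no argument, which raises TypeError
# because the string to parse is a required positional parameter.
try:
    json.loads()
    old_call_failed = False
except TypeError:
    old_call_failed = True

# Fixed line: pass the generated text so it is parsed into a dict.
output_json = json.loads(output)

print(old_call_failed)
print(output_json["answer"])
```

With the fix in place, the later `output_json["answer"]` lookup in the README has a parsed object to read from.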