Add library metadata and link to paper

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +10 -6
README.md CHANGED
@@ -2,21 +2,26 @@
  base_model: meta-llama/Llama-3.2-1B-Instruct
  datasets:
  - fineinstructions/template_instantiator_training
+ library_name: transformers
+ pipeline_tag: text-generation
  tags:
  - datadreamer
  - datadreamer-0.46.0
  - synthetic
  - text-generation
- pipeline_tag: text-generation
  ---
+
  [![FineInstructionsCoverImage](https://cdn-uploads.huggingface.co/production/uploads/61c40eeb727d1257bf3cf5ba/jSiXJ8FaogflCSRt_YirX.png)](https://huggingface.co/fineinstructions)

  **✨ Note:** For all FineInstructions resources please visit: https://huggingface.co/fineinstructions

  ----
- This model will take a instruction template in the format of [FineTemplates](https://huggingface.co/datasets/fineinstructions/finetemplates) and a document and return an instantiated instruction and answer pair.

- The output will be a JSON object.
+ This model is the Template Instantiator described in the paper [FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale](https://huggingface.co/papers/2601.22146).
+
+ It takes an instruction template in the format of [FineTemplates](https://huggingface.co/datasets/fineinstructions/finetemplates) and a document and returns an instantiated instruction and answer pair.
+
+ The output is a JSON object.

  ## Simple Usage Example

@@ -63,7 +68,7 @@ inputs = [json.dumps({
  prompts = [tokenizer.apply_chat_template([{'role': 'user', 'content': i}], tokenize=False, add_generation_prompt=True) for i in inputs]
  generations = pipe(prompts, max_length=131072, truncation=True, temperature=None, top_p=None, do_sample=False)
  output = generations[0][0]['generated_text']
- output_json = json.loads()
+ output_json = json.loads(output)

  # Expand the answer
  output_json["answer"] = expand(document=inputs[0]["document"], text=output_json["answer"])
@@ -95,5 +100,4 @@ If you use this project in your research please cite:
  primaryClass={cs.CL},
  doi={10.48550/arXiv.2601.22146}
  }
- ```
-
+ ```
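
Note on the `json.loads` change: the removed line called `json.loads()` with no argument, which raises a `TypeError`. The corrected call still assumes the generated text is valid JSON. A minimal sketch of more defensive parsing is below. It is not part of this PR or the README: `parse_instantiation` is a hypothetical helper name, and the check for an `"answer"` key is an assumption based only on the usage example reading `output_json["answer"]`.

```python
import json
from typing import Optional


def parse_instantiation(generated_text: str) -> Optional[dict]:
    """Parse generated text as a JSON object; return None if it is unusable.

    Hypothetical helper for illustration, not part of the model's README.
    """
    try:
        parsed = json.loads(generated_text)
    except json.JSONDecodeError:
        # The generation was not valid JSON (for example, it was truncated).
        return None
    # Assumption: a usable result is a JSON object carrying an "answer" key,
    # since the README's usage example reads output_json["answer"].
    if not isinstance(parsed, dict) or "answer" not in parsed:
        return None
    return parsed
```

In the usage example, `output_json = parse_instantiation(output)` could stand in for the direct `json.loads(output)` call, with the caller skipping the `expand(...)` step when the result is `None`.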