andrewzamai commited on
Commit
6a9e84a
·
verified ·
1 Parent(s): 1e12620

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +18 -8
README.md CHANGED
@@ -310,9 +310,8 @@ An inverse trend can be observed, with SLIMER emerging as the most effective in
310
  <div class="description">JSON SLIMER prompt</div>
311
  <div class="template">
312
  <pre>{
313
- "description": "SLIMER prompt",
314
- "prompt_input": "<|start_header_id|>system<|end_header_id|>\n\nYou are an expert in Named Entity Recognition designed to output JSON only.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nYou are given a text chunk (delimited by triple quotes) and an instruction.\nRead the text and answer to the instruction in the end.\n\"\"\"\n{<span class="highlight-orange">input</span>}\n\"\"\"\nInstruction: Extract the Named Entities of type {<span class="highlight-orange">NE_name</span>} from the text chunk you have read. You are given a DEFINITION and some GUIDELINES.\nDEFINITION: {<span class="highlight-orange">definition</span>}\nGUIDELINES: {<span class="highlight-orange">guidelines</span>}\nReturn a JSON list of instances of this Named Entity type (for example [\"text_span_1\", \"text_span_2\"]. Return an empty list [] if no instances are present. Return only the JSON list, no further motivations or introduction to the answer.<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n"
315
- }</pre>
316
  </div>
317
  </body>
318
  </html>
@@ -321,14 +320,25 @@ An inverse trend can be observed, with SLIMER emerging as the most effective in
321
  ```python
322
  from vllm import LLM, SamplingParams
323
 
324
- vllm_model = LLM(model="expertai/SLIMER-LLaMA3")
 
325
 
326
- sampling_params = SamplingParams(temperature=0, max_tokens=128)
327
 
328
- prompts = [prompter.generate_prompt(instruction, input) for instruction, input in instruction_input_pairs]
329
- responses = vllm_model.generate(prompts, sampling_params)
330
- ```
 
 
 
 
 
 
 
 
331
 
 
 
332
 
333
  ## Citation
334
 
 
310
  <div class="description">JSON SLIMER prompt</div>
311
  <div class="template">
312
  <pre>{
313
+ "description": "SLIMER PARALLEL 3 prompt",
314
+ "prompt_input": "<|start_header_id|>system<|end_header_id|>\n\nYou are a helpful NER assistant designed to output JSON.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nYou are given a text chunk (delimited by triple quotes) and an instruction.\nRead the text and answer to the instruction in the end.\n\"\"\"\n{<span class="highlight-orange">input</span>}\n\"\"\"\nInstruction: Extract the entities of type {ne_tags} from the text chunk you have read. Be aware that not all of these entities are necessarily present. Do not extract entities that do not exist in the text, return an empty list for that tag. Ensure each entity is assigned to only one appropriate class.\nTo help you, here are dedicated Definition and Guidelines for each entity tag.\n{Def_and_Guidelines}\nReturn only a JSON object. The JSON should strictly follow this format:\n{expected_json_format}.\nDO NOT output anything else, just the JSON itself."}</pre>
 
315
  </div>
316
  </body>
317
  </html>
 
320
  ```python
321
  from vllm import LLM, SamplingParams
322
 
323
+ vllm_model = LLM(model="expertai/SLIMER-PARALLEL-LLaMA3")
324
+ tokenizer = vllm_model.get_tokenizer()
325
 
326
+ sampling_params = SamplingParams(temperature=0, max_tokens=1000, stop=tokenizer.eos_token)
327
 
328
+ # create a dictionary of dictionaries
329
+ # each NE_type as key should have a {Definition: str, Guidelines: str} value
330
+ # this prompter formats the input text to analyze with the SLIMER instruction
331
+ input_instruction_prompter = Prompter('LLaMA3-chat-NOheaders', template_path='./src/SFT_finetuning/templates')
332
+
333
+ system_message = "You are a helpful NER assistant designed to output JSON."
334
+ conversation = [
335
+ {"role": "system", "content": system_message},
336
+ {"role": "user", "content": input_instruction_prompter.generate_prompt(input=row["input"], instruction=row["instruction"])}, # the input_text + instruction
337
+ ]
338
+ prompt = tokenizer.apply_chat_template(conversation, tokenize=False, truncation=True, max_length=cutoff_len, add_generation_prompt=True)
339
 
340
+ responses = vllm_model.generate(prompt, sampling_params)
341
+ ```
342
 
343
  ## Citation
344