Model Card for yaser-j/qwen2.5-3b-syn-adapter

A LoRA adapter built for the ORKG Ask synthesis use case. It expects a research question and a list of five (5) abstracts.

The model also supports 13 languages and 4 different language tones.

Model Details

Fine-tuned using unsloth on a dataset created with larger LLMs such as GPT-4o.

Languages supported are:

  • English
  • Spanish
  • German
  • Dutch
  • French
  • Italian
  • Portuguese
  • Russian
  • Chinese
  • Japanese
  • Korean
  • Arabic
  • Farsi

Language tones supported:

  • Researcher
  • Adult
  • Teenager
  • Child

Model Description

The language tones are described as follows:

  1. Child (10–11 years old):

    • Simple, short sentences and basic accurate explanations.
    • No advanced jargon.
    • Everyday examples that tie into the research findings.
  2. Teenager:

    • Casual, engaging manner; relevant slang used in moderation.
    • Emphasis on interesting or emotionally engaging aspects of the research findings.
    • Relatable explanations, referencing everyday scenarios or pop culture where applicable.
  3. Adult:

    • Concise details yet with a polished, clear tone.
    • Moderate, non-technical vocabulary where possible.
    • Essential context and logical flow, focusing on practical applications of research.
  4. Researcher:

    • Formal, precise language with clear references to methodologies or data.
    • Discipline-specific terminology as needed.
    • Balanced, objective presentation of research complexities.

The system prompt of the model is:

Generate a comprehensive answer to the given research question (but no more than three/four sentences)
solely based on the content provided.
Cite the number of the content referenced for each claim like this:
[1] for a single reference or [2][3] for multiple references.
Generate the synthesis in the "{language}" language, and phrase the complexity of the text to be suitable for a/an {level}.

The user prompt should look like this:

# Research Question: {{ question }}
# Abstracts:
Abstract #1:
 Title Here
Abstract text here

Abstract #2:
 Title Here
Abstract text here

Abstract #3:
 Title Here
Abstract text here

Abstract #4:
 Title Here
Abstract text here

Abstract #5:
 Title Here
Abstract text here

# Answer with inline-citations:
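The user prompt above can be assembled programmatically; a minimal sketch (the function name and `(title, text)` pair layout are illustrative, not part of the released code):

```python
def build_user_prompt(question, abstracts):
    """Assemble the ORKG Ask user prompt from a research question and
    five (title, text) abstract pairs, following the template above."""
    lines = [f"# Research Question: {question}", "# Abstracts:"]
    for i, (title, text) in enumerate(abstracts, start=1):
        lines.append(f"Abstract #{i}:")
        lines.append(f" {title}")
        lines.append(text)
        lines.append("")  # blank line between abstracts
    lines.append("# Answer with inline-citations:")
    return "\n".join(lines)
```

For example, `build_user_prompt("How does X affect Y?", five_abstracts)` yields a single string in the format shown above.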

Use the model in chat mode, or apply the chat template (see the tokenizer config) and feed the rendered prompt to a standard text-generation endpoint.
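In chat mode, the system prompt (with its `{language}` and `{level}` slots filled) and the user prompt form a standard two-message conversation; a minimal sketch (the helper name is illustrative):

```python
def build_messages(system_template, user_prompt, language, level):
    """Fill the system prompt's {language} and {level} placeholders and
    pair it with the user prompt as a two-turn chat."""
    return [
        {"role": "system",
         "content": system_template.format(language=language, level=level)},
        {"role": "user", "content": user_prompt},
    ]
```

The resulting list is the shape expected by `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` before sending the rendered text to a generation endpoint.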

Training Details

LoRA details

r=16
target_modules=["q_proj", "k_proj", "v_proj", "o_proj","gate_proj", "up_proj", "down_proj"]
lora_alpha=32
lora_dropout=0
seed=42
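The adapter settings above can be expressed as a `peft.LoraConfig` (a sketch: the run itself used unsloth's wrapper with these hyperparameters, and `task_type` is an assumption not listed above):

```python
from peft import LoraConfig

# LoRA hyperparameters as documented in this card; task_type is assumed.
lora_config = LoraConfig(
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,
    lora_dropout=0,
    task_type="CAUSAL_LM",
)
```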

SFT details

per_device_train_batch_size=16
gradient_accumulation_steps=8
warmup_steps=5
num_train_epochs=3
learning_rate=2e-4
bf16=True
optim="adamw_torch_fused"
weight_decay=0.01
lr_scheduler_type="linear"
seed=42
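The SFT hyperparameters above map onto `transformers.TrainingArguments` (a sketch; `output_dir` is a placeholder, and the original run used unsloth/trl wrappers around these same values):

```python
from transformers import TrainingArguments

# SFT hyperparameters as documented in this card.
training_args = TrainingArguments(
    output_dir="outputs",  # placeholder, not from the original run
    per_device_train_batch_size=16,
    gradient_accumulation_steps=8,
    warmup_steps=5,
    num_train_epochs=3,
    learning_rate=2e-4,
    bf16=True,
    optim="adamw_torch_fused",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=42,
)
```

With these values the effective batch size is 16 × 8 = 128 examples per optimizer step (per device).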

Trained on responses only, i.e. the loss was computed on the assistant outputs, not on the prompts.

Model Card Contact

ORKG Ask Team - info@orkg.org

Framework versions

  • PEFT 0.14.0
Model tree for yaser-j/qwen2.5-3b-syn-adapter

Base model

Qwen/Qwen2.5-3B