Model Card for yaser-j/qwen2.5-3b-syn-adapter

A LoRA adapter built for the ORKG Ask synthesis use case. It expects a research question and a list of five (5) abstracts.

The model also supports 13 languages and 4 different language tones.

Model Details

Fine-tuned using unsloth on a dataset created with larger LLMs such as GPT-4o.

Languages supported are:

  • English
  • Spanish
  • German
  • Dutch
  • French
  • Italian
  • Portuguese
  • Russian
  • Chinese
  • Japanese
  • Korean
  • Arabic
  • Farsi

Language tones supported:

  • Researcher
  • Adult
  • Teenager
  • Child

Model Description

The language tones are described as follows:

  1. Child (10–11 years old):

    • Simple, short sentences and basic accurate explanations.
    • No advanced jargon.
    • Everyday examples that tie into the research findings.
  2. Teenager:

    • Casual, engaging manner; relevant slang used in moderation.
    • Emphasis on interesting or emotionally engaging aspects of the research findings.
    • Relatable explanations, referencing everyday scenarios or pop culture where applicable.
  3. Adult:

    • Concise details yet with a polished, clear tone.
    • Moderate, non-technical vocabulary where possible.
    • Essential context and logical flow, focusing on practical applications of research.
  4. Researcher:

    • Formal, precise language with clear references to methodologies or data.
    • Discipline-specific terminology as needed.
    • Balanced, objective presentation of research complexities.

The system prompt of the model is:

Generate a comprehensive answer to the given research question (but no more than three/four sentences)
solely based on the content provided.
Cite the number of the content referenced for each claim like this:
[1] for a single reference or [2][3] for multiple references.
Generate the synthesis in the "{language}" language, and phrase the complexity of the text to be suitable for a/an {level}.

The user prompt should look like this:

# Research Question: {{ question }}
# Abstracts:
Abstract #1:
 Title Here
Abstract text here

Abstract #2:
 Title Here
Abstract text here

Abstract #3:
 Title Here
Abstract text here

Abstract #4:
 Title Here
Abstract text here

Abstract #5:
 Title Here
Abstract text here

# Answer with inline-citations:
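The user prompt above can be assembled programmatically; a minimal sketch (the function name and `(title, text)` pair layout are illustrative, not part of the released code):

```python
def build_user_prompt(question, abstracts):
    """Assemble the ORKG Ask user prompt from a research question and
    five (title, text) abstract pairs, following the template above."""
    lines = [f"# Research Question: {question}", "# Abstracts:"]
    for i, (title, text) in enumerate(abstracts, start=1):
        lines.append(f"Abstract #{i}:")
        lines.append(f" {title}")
        lines.append(text)
        lines.append("")  # blank line between abstracts
    lines.append("# Answer with inline-citations:")
    return "\n".join(lines)
```

For example, `build_user_prompt("How does X affect Y?", five_abstracts)` yields a single string in the format shown above.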

Use the model in chat mode, or apply the chat template (see the tokenizer config) and feed the rendered prompt to a standard text-generation endpoint.
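In chat mode, the system prompt (with its `{language}` and `{level}` slots filled) and the user prompt form a standard two-message conversation; a minimal sketch (the helper name is illustrative):

```python
def build_messages(system_template, user_prompt, language, level):
    """Fill the system prompt's {language} and {level} placeholders and
    pair it with the user prompt as a two-turn chat."""
    return [
        {"role": "system",
         "content": system_template.format(language=language, level=level)},
        {"role": "user", "content": user_prompt},
    ]
```

The resulting list is the shape expected by `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` before sending the rendered text to a generation endpoint.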

Training Details

LoRA details

r=16
target_modules=["q_proj", "k_proj", "v_proj", "o_proj","gate_proj", "up_proj", "down_proj"]
lora_alpha=32
lora_dropout=0
seed=42
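The adapter settings above can be expressed as a `peft.LoraConfig` (a sketch: the run itself used unsloth's wrapper with these hyperparameters, and `task_type` is an assumption not listed above):

```python
from peft import LoraConfig

# LoRA hyperparameters as documented in this card; task_type is assumed.
lora_config = LoraConfig(
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=32,
    lora_dropout=0,
    task_type="CAUSAL_LM",
)
```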

SFT details

per_device_train_batch_size=16
gradient_accumulation_steps=8
warmup_steps=5
num_train_epochs=3
learning_rate=2e-4
bf16=True
optim="adamw_torch_fused"
weight_decay=0.01
lr_scheduler_type="linear"
seed=42
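The SFT hyperparameters above map onto `transformers.TrainingArguments` (a sketch; `output_dir` is a placeholder, and the original run used unsloth/trl wrappers around these same values):

```python
from transformers import TrainingArguments

# SFT hyperparameters as documented in this card.
training_args = TrainingArguments(
    output_dir="outputs",  # placeholder, not from the original run
    per_device_train_batch_size=16,
    gradient_accumulation_steps=8,
    warmup_steps=5,
    num_train_epochs=3,
    learning_rate=2e-4,
    bf16=True,
    optim="adamw_torch_fused",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=42,
)
```

With these values the effective batch size is 16 × 8 = 128 examples per optimizer step (per device).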

Trained on responses only, i.e. the loss was computed on the assistant outputs, not on the prompts.

Model Card Contact

ORKG Ask Team - info@orkg.org

Framework versions

  • PEFT 0.14.0
Model tree for yaser-j/qwen2.5-3b-syn-adapter

Base model

Qwen/Qwen2.5-3B