| --- |
| base_model: Qwen/Qwen3.5-4B |
| license: apache-2.0 |
| library_name: peft |
| tags: |
| - base_model:adapter:Qwen/Qwen3.5-4B |
| - lora |
| - sft |
| - transformers |
| - knowledge-graph |
| - fine-tuning |
| - medical |
| - financial |
| pipeline_tag: text-generation |
| datasets: |
| - likhithv/knowledgemesh-benchmark-eval |
| --- |
| |
| # KnowledgeMesh Full Model: LoRA Adapter |
|
|
| LoRA adapter for `Qwen/Qwen3.5-4B` fine-tuned on **4,361 knowledge graph-guided training samples** generated by the KnowledgeMesh pipeline from financial (Apple 10-K) and medical (PubMed abstracts) documents. |
|
|
| This is the **KM (full)** model from the paper *"Knowledge Graph-Guided Fine-Tuning Data Generation: A Rigorous Benchmark"*. |
|
|
| ## Benchmark Results |
|
|
| Evaluated by a Gemini 2.5 Flash pointwise judge (1-5 scale, 4 dimensions):
|
|
| | Eval Set | Base | Meta SDK | **This Model** | Delta vs. Meta SDK |
| |---|---|---|---|---| |
| | Primary (n=473, KM-generated) | 1.79 | 1.93 | **2.47** | **+0.54** | |
| | Independent (n=955, Gemini-generated) | 1.96 | 2.17 | **2.90** | **+0.72** | |
|
|
| The independent eval set (+0.72, p < 0.0001, Cohen's d = 0.57) is the primary claim: its questions were generated by a different model (Gemini) with no access to the KG structure, which rules out question-style bias as an explanation for the gain.
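
For readers who want to run the same kind of comparison on their own judge scores, the sketch below computes a mean difference and Cohen's d. It assumes the pooled-standard-deviation (independent-samples) form of Cohen's d; this card does not state which variant the paper used. The score lists are invented toy values, not benchmark data.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Cohen's d using a pooled standard deviation (assumed variant)."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Toy 1-5 judge scores, made up for illustration only.
model_scores = [3, 4, 2, 3, 3, 4]
baseline_scores = [2, 3, 2, 2, 3, 2]

delta = mean(model_scores) - mean(baseline_scores)
d = cohens_d(model_scores, baseline_scores)
print(f"delta={delta:+.2f}, d={d:.2f}")
```

If both models are scored on the same questions, a paired effect size (mean of per-question differences divided by their standard deviation) may be the more appropriate choice.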
|
|
| ## Usage |
|
|
| ```python |
| from transformers import AutoTokenizer, AutoModelForCausalLM |
| from peft import PeftModel |
| import torch |
| |
| base_model_id = "Qwen/Qwen3.5-4B" |
| adapter_id = "likhithv/km-full-model" |
| |
| tokenizer = AutoTokenizer.from_pretrained(base_model_id) |
| # Load the base model, then attach the LoRA adapter on top of it |
| base_model = AutoModelForCausalLM.from_pretrained( |
|     base_model_id, |
|     torch_dtype=torch.bfloat16, |
|     device_map="auto", |
| ) |
| model = PeftModel.from_pretrained(base_model, adapter_id) |
| |
| messages = [{"role": "user", "content": "What are the main risk factors for type 2 diabetes?"}] |
| # return_dict=True returns input_ids and attention_mask regardless of library defaults |
| inputs = tokenizer.apply_chat_template( |
|     messages, add_generation_prompt=True, return_tensors="pt", return_dict=True |
| ).to(model.device) |
| outputs = model.generate(**inputs, max_new_tokens=256) |
| # Decode only the newly generated tokens, skipping the prompt |
| print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)) |
| ``` |
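
If a standalone checkpoint is more convenient than loading the adapter at runtime, the LoRA weights can be folded into the base model. The following is a minimal sketch, not part of the original release: the function name `merge_adapter` and the output directory `km-full-merged` are hypothetical. It assumes the base model is loaded in bfloat16, as in the usage example above, rather than in the 4-bit form used during training, so merged outputs may differ slightly from the training-time setup.

```python
def merge_adapter(
    base_model_id="Qwen/Qwen3.5-4B",
    adapter_id="likhithv/km-full-model",
    output_dir="km-full-merged",  # hypothetical local path
):
    """Fold the LoRA weights into the base model and save a standalone copy."""
    # Heavy imports live inside the function so that defining it stays cheap.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.bfloat16
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    # merge_and_unload() applies the LoRA deltas and returns the plain base model.
    merged = model.merge_and_unload()
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)
    return output_dir
```

Calling `merge_adapter()` downloads both repositories and writes the merged weights to disk, after which the output directory can be loaded with `AutoModelForCausalLM.from_pretrained` alone.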
|
|
| ## Training Details |
|
|
| | Parameter | Value | |
| |---|---| |
| | Base model | Qwen/Qwen3.5-4B (4-bit quantized via bitsandbytes) | |
| | Fine-tuning method | LoRA (rank=16, alpha=16) | |
| | Training samples | 4,361 (KG-guided: atomic, aggregated, multihop, chain-of-thought) | |
| | Epochs | 3 | |
| | Learning rate | 2e-4 | |
| | Effective batch size | 8 | |
| | Hardware | Kaggle T4 GPU (16 GB) | |
| | Domains | Financial (Apple 10-K 2023), Medical (PubMed abstracts) | |
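
As a rough sanity check on the table above, the hyperparameters imply the following optimizer step count and LoRA scaling factor. This is a back-of-the-envelope sketch under two assumptions not stated in the card: the final partial batch of each epoch is kept rather than dropped, and the adapter uses the standard `alpha / rank` scaling rather than a rank-stabilized variant.

```python
import math

# Values taken from the training table above.
num_samples = 4361
effective_batch_size = 8
epochs = 3
lora_rank = 16
lora_alpha = 16

# Assumes the last, smaller batch of each epoch still counts as one step.
steps_per_epoch = math.ceil(num_samples / effective_batch_size)  # 546
total_steps = steps_per_epoch * epochs                           # 1638

# Standard LoRA multiplies the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank                            # 1.0

print(steps_per_epoch, total_steps, lora_scaling)
```

With `alpha` equal to the rank, the low-rank update is added to the frozen weights unscaled. The exact step count in a real run depends on how the trainer handles gradient accumulation at epoch boundaries.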
|
|
| ## Eval Datasets |
|
|
| - [`likhithv/knowledgemesh-benchmark-eval`](https://huggingface.co/datasets/likhithv/knowledgemesh-benchmark-eval): contains both the primary (n=473) and independent (n=955) eval sets
|
|
| ## Compared Models |
|
|
| - This model: trained on 4,361 KG-guided samples |
| - [`likhithv/meta-sdk-baseline`](https://huggingface.co/likhithv/meta-sdk-baseline): trained on 1,209 chunk-based samples (Meta Synthetic Data Kit)
|
|
| ## Citation |
|
|
| ```bibtex |
| @misc{knowledgemesh2026, |
| title={Knowledge Graph-Guided Fine-Tuning Data Generation: A Rigorous Benchmark}, |
| author={Likhith V}, |
| year={2026}, |
| howpublished={https://huggingface.co/likhithv/km-full-model} |
| } |
| ``` |
|
|