likhithv/km-full-model

@@ -1,63 +1,91 @@
 ---
 base_model: Qwen/Qwen3.5-4B
 library_name: peft
-model_name: km_full_model
 tags:
-- base_model:adapter:Qwen/Qwen3.5-4B
-- lora
-- sft
-- transformers
-- trl
-- unsloth
-licence: license
 pipeline_tag: text-generation
 ---
-# Model Card for km_full_model
-This model is a fine-tuned version of [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B).
-It has been trained using [TRL](https://github.com/huggingface/trl).
-## Quick start
 ```python
-from transformers import pipeline
-question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
-generator = pipeline("text-generation", model="None", device="cuda")
-output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
-print(output["generated_text"])
-```
-## Training procedure
-This model was trained with SFT.
-### Framework versions
-- PEFT 0.18.1
-- TRL: 0.24.0
-- Transformers: 5.2.0
-- Pytorch: 2.9.0+cu126
-- Datasets: 4.3.0
-- Tokenizers: 0.22.2
-## Citations
-Cite TRL as:
 ```bibtex
-@misc{vonwerra2022trl,
-	title        = {{TRL: Transformer Reinforcement Learning}},
-	author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
-	year         = 2020,
-	journal      = {GitHub repository},
-	publisher    = {GitHub},
-	howpublished = {\url{https://github.com/huggingface/trl}}
 }
-```

 ---
 base_model: Qwen/Qwen3.5-4B
+license: apache-2.0
 library_name: peft
 tags:
+  - base_model:adapter:Qwen/Qwen3.5-4B
+  - lora
+  - sft
+  - transformers
+  - knowledge-graph
+  - fine-tuning
+  - medical
+  - financial
 pipeline_tag: text-generation
+datasets:
+  - likhithv/knowledgemesh-benchmark-eval
 ---
+# KnowledgeMesh Full Model — LoRA Adapter
+LoRA adapter for `Qwen/Qwen3.5-4B` fine-tuned on **4,361 knowledge graph-guided training samples** generated by the KnowledgeMesh pipeline from financial (Apple 10-K) and medical (PubMed abstracts) documents.
+This is the **KM (full)** model from the paper *"Knowledge Graph-Guided Fine-Tuning Data Generation: A Rigorous Benchmark"*.
+## Benchmark Results
+Scores come from a Gemini 2.5 Flash pointwise judge (1–5 scale, 4 dimensions):
+| Eval Set | Base | Meta SDK | **This Model** | Delta vs. Meta SDK |
+|---|---|---|---|---|
+| Primary (n=473, KM-generated) | 1.79 | 1.93 | **2.47** | **+0.54** |
+| Independent (n=955, Gemini-generated) | 1.96 | 2.17 | **2.90** | **+0.72** |
+The independent eval set result (+0.72 over the Meta SDK baseline, p < 0.0001, Cohen's d = 0.57) is the primary claim: its questions were generated by a different model (Gemini) with no access to the KG structure, which rules out question-style bias as an explanation for the gain.
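For readers unfamiliar with the effect-size figure quoted above, the following is a minimal sketch of how a mean delta and Cohen's d can be computed from two lists of judge scores. It assumes the pooled-standard-deviation form for independent samples; the paper may have used a paired variant, since both models answer the same questions. The score lists are hypothetical and are not taken from the benchmark.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Cohen's d with a pooled standard deviation (independent-samples form)."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical 1-5 judge scores, for illustration only.
km_scores = [3, 4, 2, 3, 3, 4]
baseline_scores = [2, 3, 2, 2, 3, 2]

delta = mean(km_scores) - mean(baseline_scores)
d = cohens_d(km_scores, baseline_scores)
print(f"delta = {delta:+.2f}, Cohen's d = {d:.2f}")
```

By the usual convention, d near 0.5 is read as a medium effect, which is the range the reported 0.57 falls in.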
+## Usage
 ```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+import torch
+base_model_id = "Qwen/Qwen3.5-4B"
+adapter_id = "likhithv/km-full-model"
+tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+base_model = AutoModelForCausalLM.from_pretrained(
+    base_model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+model = PeftModel.from_pretrained(base_model, adapter_id)
+messages = [{"role": "user", "content": "What are the main risk factors for type 2 diabetes?"}]
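+# return_dict=True yields input_ids and attention_mask regardless of the library's default return type
+inputs = tokenizer.apply_chat_template(
+    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
+).to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=256)
+# Decode only the newly generated tokens, skipping the prompt
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))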
+```
+## Training Details
+| Parameter | Value |
+|---|---|
+| Base model | Qwen/Qwen3.5-4B (4-bit quantized via bitsandbytes) |
+| Fine-tuning method | LoRA (rank=16, alpha=16) |
+| Training samples | 4,361 (KG-guided: atomic, aggregated, multihop, chain-of-thought) |
+| Epochs | 3 |
+| Learning rate | 2e-4 |
+| Effective batch size | 8 |
+| Hardware | Kaggle T4 GPU (16 GB) |
+| Domains | Financial (Apple 10-K 2023), Medical (PubMed abstracts) |
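The table's numbers imply a rough training budget, which the sketch below works out. It assumes the final partial batch of each epoch is kept rather than dropped, and that the adapter uses the standard LoRA scaling of alpha divided by rank (not the rank-stabilized variant); neither detail is stated in the card.

```python
from math import ceil

# Values taken from the Training Details table.
num_samples = 4361
epochs = 3
effective_batch_size = 8
lora_rank = 16
lora_alpha = 16

# Assumption: the last, smaller batch in each epoch still counts as one step.
steps_per_epoch = ceil(num_samples / effective_batch_size)
total_steps = steps_per_epoch * epochs

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scaling = lora_alpha / lora_rank

print(steps_per_epoch)  # 546
print(total_steps)      # 1638
print(lora_scaling)     # 1.0
```

With alpha equal to rank, the scaling factor is 1.0, so the low-rank update is added to the frozen weights without further rescaling.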
+## Eval Datasets
+- [`likhithv/knowledgemesh-benchmark-eval`](https://huggingface.co/datasets/likhithv/knowledgemesh-benchmark-eval) — both primary (n=473) and independent (n=955) eval sets
+## Compared Models
+- This model: trained on 4,361 KG-guided samples
+- [`likhithv/meta-sdk-baseline`](https://huggingface.co/likhithv/meta-sdk-baseline) — trained on 1,209 chunk-based samples (Meta Synthetic Data Kit)
+## Citation
 ```bibtex
+@misc{knowledgemesh2026,
+  title={Knowledge Graph-Guided Fine-Tuning Data Generation: A Rigorous Benchmark},
+  author={Likhith V},
+  year={2026},
+  howpublished={https://huggingface.co/likhithv/km-full-model}
 }
+```