likhithv committed
Commit 6ebfc07 · verified · 1 Parent(s): 84d98c3

Update model card with benchmark results and dataset links

Files changed (1):
  1. README.md +67 -39
README.md CHANGED
@@ -1,63 +1,91 @@
 ---
 base_model: Qwen/Qwen3.5-4B
 library_name: peft
-model_name: km_full_model
 tags:
-- base_model:adapter:Qwen/Qwen3.5-4B
-- lora
-- sft
-- transformers
-- trl
-- unsloth
-licence: license
 pipeline_tag: text-generation
 ---

-# Model Card for km_full_model

-This model is a fine-tuned version of [Qwen/Qwen3.5-4B](https://huggingface.co/Qwen/Qwen3.5-4B).
-It has been trained using [TRL](https://github.com/huggingface/trl).

-## Quick start

 ```python
-from transformers import pipeline

-question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
-generator = pipeline("text-generation", model="None", device="cuda")
-output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
-print(output["generated_text"])
-```

-## Training procedure
-
-This model was trained with SFT.

-### Framework versions

-- PEFT 0.18.1
-- TRL: 0.24.0
-- Transformers: 5.2.0
-- Pytorch: 2.9.0+cu126
-- Datasets: 4.3.0
-- Tokenizers: 0.22.2

-## Citations

-Cite TRL as:
-
 ```bibtex
-@misc{vonwerra2022trl,
-  title = {{TRL: Transformer Reinforcement Learning}},
-  author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
-  year = 2020,
-  journal = {GitHub repository},
-  publisher = {GitHub},
-  howpublished = {\url{https://github.com/huggingface/trl}}
 }
-```
 
 ---
 base_model: Qwen/Qwen3.5-4B
+license: apache-2.0
 library_name: peft
 tags:
+- base_model:adapter:Qwen/Qwen3.5-4B
+- lora
+- sft
+- transformers
+- knowledge-graph
+- fine-tuning
+- medical
+- financial
 pipeline_tag: text-generation
+datasets:
+- likhithv/knowledgemesh-benchmark-eval
 ---

+# KnowledgeMesh Full Model LoRA Adapter

+LoRA adapter for `Qwen/Qwen3.5-4B`, fine-tuned on **4,361 knowledge-graph-guided training samples** generated by the KnowledgeMesh pipeline from financial (Apple 10-K) and medical (PubMed abstracts) documents.

+This is the **KM (full)** model from the paper *"Knowledge Graph-Guided Fine-Tuning Data Generation: A Rigorous Benchmark"*.
+
+## Benchmark Results
+
+Scored by a Gemini 2.5 Flash pointwise judge (1–5 scale, 4 dimensions):
+
+| Eval Set | Base | Meta SDK | **This Model** | Δ vs Meta SDK |
+|---|---|---|---|---|
+| Primary (n=473, KM-generated) | 1.79 | 1.93 | **2.47** | **+0.54** |
+| Independent (n=955, Gemini-generated) | 1.96 | 2.17 | **2.90** | **+0.72** |
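As a quick arithmetic cross-check (a derived number, not one reported in the paper), the two per-set uplifts over the Meta SDK baseline can be pooled by sample size:

```python
# Per-set uplift of this model over the Meta SDK baseline, read off the table above.
evals = {
    "primary": (473, 2.47 - 1.93),      # (n, uplift)
    "independent": (955, 2.90 - 2.17),
}

n_total = sum(n for n, _ in evals.values())               # total judged answers
pooled = sum(n * d for n, d in evals.values()) / n_total  # sample-size-weighted mean
print(f"pooled uplift: {pooled:+.2f}")
```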
+
+The independent eval set result (+0.72, p < 0.0001, Cohen's d = 0.57) is the primary claim: its questions were generated by a different model (Gemini) with no access to the KG structure, ruling out question-style bias as an explanation for the gain.
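Cohen's d here is presumably the standard pooled-standard-deviation effect size for two independent score samples; a minimal sketch (the score arrays below are hypothetical placeholders, not the actual judge outputs):

```python
import math

def cohens_d(a, b):
    """Effect size for two independent samples, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (mean_a - mean_b) / pooled_sd

# Hypothetical 1-5 judge scores; the real arrays have n=955 entries per model.
tuned = [3.0, 2.5, 3.5, 2.0, 3.0]
base = [2.0, 1.5, 2.5, 1.5, 2.0]
print(round(cohens_d(tuned, base), 2))
```

Under the usual rule of thumb, the reported d = 0.57 is a medium effect.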
+
+## Usage

 ```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+from peft import PeftModel
+import torch

+base_model_id = "Qwen/Qwen3.5-4B"
+adapter_id = "likhithv/km-full-model"

+tokenizer = AutoTokenizer.from_pretrained(base_model_id)
+base_model = AutoModelForCausalLM.from_pretrained(
+    base_model_id,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+)
+model = PeftModel.from_pretrained(base_model, adapter_id)

+messages = [{"role": "user", "content": "What are the main risk factors for type 2 diabetes?"}]
+inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
+outputs = model.generate(inputs.to(model.device), max_new_tokens=256)
+print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
 ```

+## Training Details

+| Parameter | Value |
+|---|---|
+| Base model | Qwen/Qwen3.5-4B (4-bit quantized via bitsandbytes) |
+| Fine-tuning method | LoRA (rank=16, alpha=16) |
+| Training samples | 4,361 (KG-guided: atomic, aggregated, multihop, chain-of-thought) |
+| Epochs | 3 |
+| Learning rate | 2e-4 |
+| Effective batch size | 8 |
+| Hardware | Kaggle T4 GPU (16 GB) |
+| Domains | Financial (Apple 10-K 2023), Medical (PubMed abstracts) |
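The table maps directly onto a `peft`/`trl` configuration. The exact training script is not published here, so the following is only a sketch under assumptions: the per-device batch size / gradient-accumulation split and the fp16 compute dtype (the T4 has no bfloat16 support) are guesses, not values from the card.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig
from trl import SFTConfig

# 4-bit base-model quantization; fp16 compute assumed for the T4
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

# LoRA rank and alpha from the table
peft_config = LoraConfig(r=16, lora_alpha=16, task_type="CAUSAL_LM")

# Effective batch size 8, assumed as 2 per device x 4 accumulation steps
training_args = SFTConfig(
    output_dir="km_full_model",
    num_train_epochs=3,
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
)
```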

+## Eval Datasets

+- [`likhithv/knowledgemesh-benchmark-eval`](https://huggingface.co/datasets/likhithv/knowledgemesh-benchmark-eval) — both the primary (n=473) and independent (n=955) eval sets
 
 
 
 
 
76
 
77
+ ## Compared Models
78
 
79
+ - This model: trained on 4,361 KG-guided samples
80
+ - [`likhithv/meta-sdk-baseline`](https://huggingface.co/likhithv/meta-sdk-baseline) — trained on 1,209 chunk-based samples (Meta Synthetic Data Kit)
81
 
82
+ ## Citation
83
 
 
 
84
  ```bibtex
85
+ @misc{knowledgemesh2026,
86
+ title={Knowledge Graph-Guided Fine-Tuning Data Generation: A Rigorous Benchmark},
87
+ author={Likhith V},
88
+ year={2026},
89
+ howpublished={https://huggingface.co/likhithv/km-full-model}
 
 
90
  }
91
+ ```