SutanRifkyt committed on
Commit
111fa56
·
verified ·
1 Parent(s): ea87070

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +114 -26
README.md CHANGED
@@ -1,12 +1,54 @@
1
- # K2-Inhale (LoRA Adapter)
2
-
3
- **Base model:** `LLM360/K2-Think`
4
- **Author fine-tune:** `SutanRifkyt`
5
- **Method:** QLoRA (4-bit)
6
- **Domain:** Lung CT & PET/CT findings (nodule, consolidation, FDG uptake, staging hints)
7
- **Use case:** Explain findings in plain language for patients, say how concerning for lung cancer, and suggest next step (follow-up CT, PET-CT, biopsy, urgent oncologist etc).
8
-
9
- ## How to use
10
 
11
  ```python
12
  import torch
@@ -29,7 +71,12 @@ model = AutoModelForCausalLM.from_pretrained(
29
  trust_remote_code=False
30
  )
31
 
32
- model = PeftModel.from_pretrained(model, adapter_id)
33
 
34
  prompt = """<|user|>:
35
  Explain this chest CT finding in simple language for the patient, assess how concerning it is for lung cancer, and say what should happen next.
@@ -45,33 +92,74 @@ with torch.no_grad():
45
  output = model.generate(
46
  **inputs,
47
  max_new_tokens=300,
48
- do_sample=False,
49
- temperature=0.0,
50
  )
51
 
52
  print(tokenizer.decode(output[0], skip_special_tokens=True))
53
- Training data
54
 
55
- Merged ~8k instruction-style samples from:
56
 
57
- ReXGroundingCT style CT findings (free-text localized abnormalities)
58
 
59
- Lung-PET-CT-Dx (TCIA) PET/CT cases with histopathology labels and staging clues
60
 
61
- Each sample is turned into:
62
 
63
- instruction: ask model to explain for patient, assess cancer concern, propose next step
64
 
65
- input: actual radiology-style finding text
66
 
67
- output: chain-of-thought style reasoning + final recommendation
68
 
69
- Safety
70
 
71
- This model is not a doctor. It's a triage / education assistant.
72
- It should not replace radiologist, oncologist, or clinical decision-making.
73
 
74
- License
 
75
 
76
- LoRA weights are provided for research use.
77
- Source data includes CC BY 4.0 material from The Cancer Imaging Archive (TCIA) and academic datasets.
 
1
+ ---
2
+ library_name: peft
3
+ base_model: LLM360/K2-Think
4
+ pipeline_tag: text-generation
5
+ license: apache-2.0
6
+ datasets:
7
+ - TCIA
8
+ - internal_synthetic_clinical_like_reports
9
+ language:
10
+ - en
11
+ - id
12
+ tags:
13
+ - medical
14
+ - lung-ct
15
+ - pet-ct
16
+ - oncology
17
+ - triage
18
+ - peft
19
+ - qlora
20
+ model_name: K2-Inhale (QLoRA adapter)
21
+ inference: false
22
+ ---
23
+
24
+ # K2-Inhale 🫁 (LoRA Adapter for LLM360/K2-Think)
25
+
26
+ **Author / Fine-tune:** Sutan Rifky Tedjasukmana (@SutanRifkyt)
27
+ **Base model:** `LLM360/K2-Think` (credit to LLM360)
28
+ **Method:** QLoRA (4-bit base) with PEFT adapters
29
+ **Domain:** Lung CT & PET/CT findings — nodules, consolidation, FDG uptake, possible staging hints
30
+ **Languages:** English + Bahasa Indonesia (output is patient-friendly, non-radiologist tone)
31
+ **Intended use:** Patient-facing explanation + triage suggestion
32
+ **Not intended for:** Final diagnosis, treatment planning, or replacing licensed clinicians.
33
+
34
+ ---
35
+
36
+ ## 🔍 What this model does
37
+
38
+ K2-Inhale is a lightweight LoRA adapter trained on top of `LLM360/K2-Think` to:
39
+ 1. Rewrite lung CT / PET-CT findings into a patient-friendly explanation.
40
+ 2. Give a plain-language assessment of how worrying the finding is for lung cancer.
41
+ 3. Suggest a next step (follow-up CT, PET-CT, tissue biopsy, urgent oncologist, etc.).
42
+
43
+ Target audience is:
44
+ - patients who just got an imaging report and are anxious,
45
+ - junior clinicians who want a patient-facing summary first draft.
46
+
47
+ ⚠️ This model is **NOT** a medical device and should **NOT** be used for autonomous diagnosis.
48
+
49
+ ---
50
+
51
+ ## 🧠 How to load (recommended path = base model + LoRA)
52
 
53
  ```python
54
  import torch
 
71
  trust_remote_code=False
72
  )
73
 
74
+ model = PeftModel.from_pretrained(
75
+ model,
76
+ adapter_id,
77
+ torch_dtype=torch.bfloat16,
78
+ device_map="auto",
79
+ )
80
 
81
  prompt = """<|user|>:
82
  Explain this chest CT finding in simple language for the patient, assess how concerning it is for lung cancer, and say what should happen next.
 
92
  output = model.generate(
93
  **inputs,
94
  max_new_tokens=300,
95
+ temperature=0.2,
96
+ do_sample=True,
97
  )
98
 
99
  print(tokenizer.decode(output[0], skip_special_tokens=True))
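This commit switches generation from greedy decoding (`do_sample=False`) to low-temperature sampling (`temperature=0.2`, `do_sample=True`). The following is a minimal sketch of keeping both modes available. `generation_kwargs` is a hypothetical helper, not part of the README. In Transformers, `temperature` only takes effect when `do_sample=True` and must be strictly positive, so the greedy branch leaves it out.

```python
def generation_kwargs(deterministic: bool,
                      max_new_tokens: int = 300,
                      temperature: float = 0.2) -> dict:
    """Build keyword arguments for model.generate().

    deterministic=True  -> greedy decoding, reproducible output.
    deterministic=False -> sampling at the given temperature.
    """
    kwargs = {"max_new_tokens": max_new_tokens}
    if deterministic:
        # Greedy decoding: temperature is ignored, so it is not passed at all.
        kwargs["do_sample"] = False
    else:
        if temperature <= 0:
            raise ValueError("temperature must be > 0 when sampling")
        kwargs["do_sample"] = True
        kwargs["temperature"] = temperature
    return kwargs
```

With the README's variables this would be called as `model.generate(**inputs, **generation_kwargs(deterministic=False))`. Greedy decoding may be preferable when the same report should always yield the same explanation.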
100
+ Quantized version
101
+
102
+ For easier inference on smaller GPUs / single consumer cards, a quantized export is included under quantized/.
103
+
104
+ quantized/ is an experimental merged model snapshot intended for local testing / demo.
105
+ Quality may be lower than with the full base + LoRA setup above.
106
+
107
+ Basic usage (example, adjust to your runtime):
108
+
109
+ import torch
110
+ from transformers import AutoTokenizer, AutoModelForCausalLM
111
+
112
+ quantized_id = "SutanRifkyt/K2-Inhale/quantized"
113
+
114
+ tokenizer = AutoTokenizer.from_pretrained(
115
+ quantized_id,
116
+ use_fast=False,
117
+ trust_remote_code=False
118
+ )
119
+
120
+ model = AutoModelForCausalLM.from_pretrained(
121
+ quantized_id,
122
+ torch_dtype=torch.float16,
123
+ device_map="auto",
124
+ trust_remote_code=False
125
+ )
126
+
127
+
128
+ Note: If you see GGUF / AWQ / bitsandbytes entries, load with the correct loader for that format.
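One caveat on the snippet above: Hub repository ids have the form `namespace/name`, so a three-part string such as `SutanRifkyt/K2-Inhale/quantized` only works as a local directory path. The following is a minimal sketch of loading from a subfolder of a Hub repo instead. It assumes the repo id is `SutanRifkyt/K2-Inhale` (inferred from the path, not confirmed by the README) and that `quantized/` holds a standard Transformers checkpoint. `split_repo_and_subfolder` and `load_quantized` are hypothetical helpers.

```python
from typing import Optional, Tuple


def split_repo_and_subfolder(path: str) -> Tuple[str, Optional[str]]:
    """Split 'namespace/name/sub/dir' into ('namespace/name', 'sub/dir')."""
    parts = path.strip("/").split("/")
    repo_id = "/".join(parts[:2])
    subfolder = "/".join(parts[2:]) or None
    return repo_id, subfolder


def load_quantized(path: str = "SutanRifkyt/K2-Inhale/quantized"):
    """Load tokenizer and model from a subfolder of a Hub repo (assumed layout)."""
    # Imported here so the path helper works without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id, subfolder = split_repo_and_subfolder(path)
    sub = subfolder or ""  # from_pretrained uses "" for the repo root

    tokenizer = AutoTokenizer.from_pretrained(
        repo_id, subfolder=sub, use_fast=False, trust_remote_code=False
    )
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        subfolder=sub,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=False,
    )
    return tokenizer, model
```

If the folder has been downloaded locally, passing that local directory path directly to `from_pretrained` should work without the `subfolder` argument.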
129
+
130
+ 📚 Training data (high-level)
131
+
132
+ ~8k supervised instruction-style pairs constructed from:
133
+
134
+ public lung CT and PET/CT descriptions (incl. TCIA-like oncology cohorts),
135
+
136
+ synthetic expansions of impression/assessment text,
137
+
138
+ staged "what happens next" counseling scripts.
139
+
140
+ Each sample looks like:
141
+
142
+ instruction: "Explain this finding for the patient, include cancer concern level, and next step"
143
 
144
+ input: actual CT/PET-CT style text (nodule size, FDG uptake, etc.)
145
 
146
+ output: step-by-step reasoning and final recommendation in plain language.
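The three-field layout above maps naturally onto one JSON object per line. The following is a minimal sketch, assuming JSONL storage (the README does not say how the samples are stored). The `TrainingSample` class and the placeholder `input` / `output` text are hypothetical and not taken from the dataset.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TrainingSample:
    instruction: str  # what the model is asked to do
    input: str        # radiology-style finding text
    output: str       # reasoning plus final recommendation


def to_jsonl_line(sample: TrainingSample) -> str:
    """Serialize one sample as a single JSON line."""
    # ensure_ascii=False keeps any non-ASCII characters readable in the file.
    return json.dumps(asdict(sample), ensure_ascii=False)


example = TrainingSample(
    instruction=("Explain this finding for the patient, include cancer "
                 "concern level, and next step"),
    input="<placeholder: CT/PET-CT style finding text>",
    output="<placeholder: step-by-step reasoning and final recommendation>",
)

line = to_jsonl_line(example)
print(line)
```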
147
 
148
+ 🚨 Safety & limitations
149
 
150
+ This model is for triage / education, not diagnosis.
151
 
152
+ It may sound confident even when uncertain.
153
 
154
+ It has not been clinically validated.
155
 
156
+ Always involve a radiologist / oncologist for real decisions.
157
 
158
+ ✍️ Citation / credit
159
 
160
+ Base model LLM360/K2-Think is released by the LLM360 team.
161
+ This repository only publishes LoRA/PEFT adapter weights and an optional quantized snapshot, fine-tuned by Sutan Rifky Tedjasukmana (@SutanRifkyt) for lung imaging triage.
162
 
163
+ License: Apache-2.0 for adapter weights.
164
+ Underlying medical text sources may include portions of CC BY 4.0 datasets and synthetic expansions derived from them.
165