---
library_name: peft
base_model: LLM360/K2-Think
pipeline_tag: text-generation
license: apache-2.0
datasets:
  - TCIA
  - internal_synthetic_clinical_like_reports
language:
  - en
  - id
tags:
  - medical
  - lung-ct
  - pet-ct
  - oncology
  - triage
  - peft
  - qlora
model_name: K2-Inhale (QLoRA adapter)
inference: false
---

# K2-Inhale 🫁 (LoRA Adapter for LLM360/K2-Think)

**Author / Fine-tune:** Sutan Rifky Tedjasukmana (@SutanRifkyt)  
**Base model:** `LLM360/K2-Think` (credit to LLM360)  
**Method:** QLoRA (4-bit base) with PEFT adapters  
**Domain:** Lung CT & PET/CT findings: nodules, consolidation, FDG uptake, possible staging hints  
**Languages:** English + Bahasa Indonesia (output is patient-friendly, non-radiologist tone)  
**Intended use:** Patient-facing explanation + triage suggestion  
**Not intended for:** Final diagnosis, treatment planning, or replacing licensed clinicians.

---

## πŸ” What this model does

K2-Inhale is a lightweight LoRA adapter trained on top of `LLM360/K2-Think` to:
1. Rewrite lung CT / PET-CT findings into a patient-friendly explanation.
2. Give a plain-language assessment of how worrying the finding is for lung cancer.
3. Suggest a next step (follow-up CT, PET-CT, tissue biopsy, urgent oncologist referral, etc.).

Target audience:
- patients who just received an imaging report and are anxious,
- junior clinicians who want a first draft of a patient-facing summary.

⚠️ This model is **NOT** a medical device and should **NOT** be used for autonomous diagnosis.
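In a downstream app, the model's free-text answer would still need to be mapped to a coarse triage bucket before any routing decision. One hypothetical post-processing sketch (not part of this model or its training; the keyword lists are purely illustrative) is simple keyword matching on the generated explanation:

```python
# Hypothetical post-processing: map the model's free-text answer to a coarse
# triage bucket. Keyword lists are illustrative, not shipped with the model.
TRIAGE_KEYWORDS = {
    "urgent": ["urgent", "biopsy", "oncologist", "highly concerning"],
    "follow-up": ["follow-up", "repeat ct", "surveillance"],
    "routine": ["benign", "low concern", "no further action"],
}

def triage_bucket(answer: str) -> str:
    """Return the first bucket whose keywords appear in the answer."""
    text = answer.lower()
    for bucket, keywords in TRIAGE_KEYWORDS.items():
        if any(k in text for k in keywords):
            return bucket  # dict order gives "urgent" priority
    return "unclassified"
```

For example, `triage_bucket("An urgent biopsy is advised.")` returns `"urgent"`. A real deployment would want something far more robust than keyword matching, with a human in the loop.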

---

## 🧠 How to load (recommended path = base model + LoRA)

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_id = "LLM360/K2-Think"
adapter_id    = "SutanRifkyt/K2-Inhale"

tokenizer = AutoTokenizer.from_pretrained(
    base_model_id,
    use_fast=False,
    trust_remote_code=False
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=False
)

model = PeftModel.from_pretrained(
    model,
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = """<|user|>:
Explain this chest CT finding in simple language for the patient, assess how concerning it is for lung cancer, and say what should happen next.

Clinical findings:
Spiculated 1.8 cm nodule in the right upper lobe with irregular margins and increased FDG uptake on PET.
<|assistant|>:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # safe with device_map="auto"

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        temperature=0.2,
        do_sample=True,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## ⚡ Quantized version

For easier inference on smaller GPUs / single consumer cards, a quantized export is included under `quantized/`.

`quantized/` is an experimental merged model snapshot intended for local testing / demos; quality may be lower than with the full base+LoRA path above.

Basic usage (example, adjust to your runtime):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "SutanRifkyt/K2-Inhale"

# The quantized snapshot lives in the repo's quantized/ subfolder,
# so pass subfolder= rather than appending it to the repo id.
tokenizer = AutoTokenizer.from_pretrained(
    repo_id,
    subfolder="quantized",
    use_fast=False,
    trust_remote_code=False
)

model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="quantized",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=False
)
```

Note: If you see GGUF / AWQ / bitsandbytes entries, load with the correct loader for that format.
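Since the adapter was trained with QLoRA on a 4-bit base, another memory-saving option is to load the base model itself in 4-bit with bitsandbytes and attach the adapter on top. A sketch, assuming a CUDA GPU with `bitsandbytes` installed (the NF4 settings below are a typical QLoRA configuration, not necessarily the exact one used in training):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# NF4 4-bit quantization, matching a typical QLoRA setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("LLM360/K2-Think", use_fast=False)

base = AutoModelForCausalLM.from_pretrained(
    "LLM360/K2-Think",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "SutanRifkyt/K2-Inhale")
```

This keeps the adapter separate (unlike the merged `quantized/` snapshot), so it stays closer to the full base+LoRA path in quality.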

## 📚 Training data (high-level)

~8k supervised instruction-style pairs constructed from:

- public lung CT and PET/CT descriptions (incl. TCIA-like oncology cohorts),
- synthetic expansions of impression/assessment text,
- staged "what happens next" counseling scripts.

Each sample looks like:

- **instruction:** "Explain this finding for the patient, include cancer concern level, and next step"
- **input:** actual CT/PET-CT-style text (nodule size, FDG uptake, etc.)
- **output:** step-by-step reasoning and a final recommendation in plain language.
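For illustration, such a pair can be rendered into the same `<|user|>:` / `<|assistant|>:` template used in the loading example above. A minimal sketch (field names and the sample text are illustrative, not the exact training schema):

```python
# Sketch: render one instruction-style sample into the chat-style prompt
# format shown in the usage example. Field names are illustrative.
def build_prompt(sample: dict) -> str:
    return (
        "<|user|>:\n"
        f"{sample['instruction']}\n\n"
        "Clinical findings:\n"
        f"{sample['input']}\n"
        "<|assistant|>:\n"
        f"{sample['output']}"
    )

sample = {
    "instruction": "Explain this finding for the patient, include cancer "
                   "concern level, and next step",
    "input": "Spiculated 1.8 cm nodule in the right upper lobe with "
             "increased FDG uptake on PET.",
    "output": "This scan shows a small, irregular spot in the upper right "
              "lung that needs further testing soon.",
}

prompt = build_prompt(sample)
```

At inference time, everything up to and including the `<|assistant|>:` line is fed to the model and the `output` portion is what it learns to generate.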

## 🚨 Safety & limitations

- This model is for triage / education, not diagnosis.
- It may sound confident even when uncertain.
- It has not been clinically validated.
- Always involve a radiologist / oncologist for real decisions.

## 📚 Citation / credit

Base model `LLM360/K2-Think` is released by the LLM360 team. This repository publishes only LoRA/PEFT adapter weights and an optional quantized snapshot, fine-tuned by Sutan Rifky Tedjasukmana (@SutanRifkyt) for lung imaging triage.

- Li, P., Wang, S., Li, T., Lu, J., HuangFu, Y., & Wang, D. (2020). A Large-Scale CT and PET/CT Dataset for Lung Cancer Diagnosis (Lung-PET-CT-Dx) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2020.NNC2-0461
- Montagna, S., et al. (2024). LLM-based Solutions for Healthcare Chatbots: a Comparative Analysis. 2024 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), Biarritz, France, pp. 346-351. doi: 10.1109/PerComWorkshops59983.2024.10503257
- Baharoon, M., Luo, L., Moritz, M., Kumar, A., Kim, S. E., Zhang, X., Zhu, M., Alabbad, M. H., Alhazmi, M. S., Mistry, N. P., & Kleinschmidt, K. R. (2025). ReXGroundingCT: A 3D Chest CT Dataset for Segmentation of Findings from Free-Text Reports. arXiv preprint arXiv:2507.22030.
- Faiyazuddin, M., Rahman, S. J. Q., Anand, G., Siddiqui, R. K., Mehta, R., Khatib, M. N., Gaidhane, S., Zahiruddin, Q. S., Hussain, A., & Sah, R. (2025). The Impact of Artificial Intelligence on Healthcare: A Comprehensive Review of Advancements in Diagnostics, Treatment, and Operational Efficiency. Health Science Reports, 8: e70312. https://doi.org/10.1002/hsr2.70312
- Bedi, S., Liu, Y., Orr-Ewing, L., Dash, D., Koyejo, S., Callahan, A., Fries, J. A., Wornow, M., Swaminathan, A., Lehmann, L. S., Hong, H. J., Kashyap, M., Chaurasia, A. R., Shah, N. R., Singh, K., Tazbaz, T., Milstein, A., Pfeffer, M. A., & Shah, N. H.
- Shool, S., Adimi, S., Saboori Amleshi, R., et al. (2025). A systematic review of large language model (LLM) evaluations in clinical medicine. BMC Med Inform Decis Mak, 25, 117. https://doi.org/10.1186/s12911-025-02954-4
- Cheng, Z., Fan, R., Hao, S., Killian, T. W., Li, H., Sun, S., Ren, H., Moreno, A., Zhang, D., Zhong, T., & Xiong, Y. (2025). K2-Think: A Parameter-Efficient Reasoning System. arXiv preprint arXiv:2509.07604.

**License:** Apache-2.0 for adapter weights. Underlying medical text sources may include portions of CC BY 4.0 datasets and synthetic expansions derived from them.