---
license: apache-2.0
language:
- id
tags:
- health
- medical
- indonesia
- native-llm
- from-scratch
- klinexa
pipeline_tag: text-generation
---

# KLINEXA-EL1: Kei Local Intelligence for Nexus Expert Analysis (Edition Level 1)

> **A native Indonesian medical LLM, built entirely from scratch**
> Created by **Emylton Leunufna** in Langgur, Southeast Maluku Regency (Kabupaten Maluku Tenggara), Maluku Province, Indonesia.

---

## About KLINEXA

**KLINEXA** (**K**ei **L**ocal **I**ntelligence for **N**exus **Ex**pert **A**nalysis) is a native Indonesian LLM project built entirely from scratch, not a fine-tune of another model. The whole architecture, tokenizer, and training pipeline were designed by Emylton Leunufna.

### KLINEXA Variants

| Model | Domain | Description |
|-------|--------|-------------|
| **KLINEXA-EL1** | Indonesian healthcare (general) | The main model. Trained on 500K+ Indonesian medical examples: clinical reasoning, diagnosis, treatment, pharmacology, SOAP notes, drug interactions, and more. |
| **KLINEXA-EL1-Malra** | Southeast Maluku Regency healthcare | Specialist version for Southeast Maluku (Malra). Contains data on local community health centers (puskesmas), hospitals, endemic diseases, health statistics, the health workforce, and Malra-specific geography. |
| **KLINEXA-EL2** *(in development)* | Multi-domain + Crystal Architecture | Advanced architecture with Crystal ARM (modular knowledge storage). Target: 1.55B parameters. |

> **Important note:** KLINEXA-EL1 originally covered only the Southeast Maluku healthcare domain. As of v10, its scope was **expanded to Indonesian healthcare in general** using a 500K clinical-reasoning dataset. The Southeast Maluku-specific version is now **KLINEXA-EL1-Malra**.

---

## Model Specifications

| Parameter | Value |
|-----------|-------|
| **Name** | KLINEXA-EL1 |
| **Version** | v10 (Full Fix) |
| **Total Parameters** | 271.1M |
| **Layers** | 16 |
| **Hidden Dimension** | 1024 |
| **Attention Heads** | 16 |
| **Head Dimension** | 64 |
| **FFN Dimension** | 2816 |
| **Vocabulary** | 32,000 (BPE) |
| **Context Length** | 1,024 tokens |
| **Activation** | SwiGLU |
| **Normalization** | RMSNorm |
| **Position Encoding** | RoPE (Rotary Position Embedding) |
| **Weight Tying** | Yes (tok_emb = lm_head) |
| **Dropout** | 0.1 |
| **Gradient Checkpointing** | Yes |
| **Precision** | FP16 (training), FP32 (inference) |
| **Built From** | Scratch (not a fine-tune) |
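As a sanity check, the 271.1M parameter count can be reproduced from the dimensions in the table. This is a back-of-the-envelope sketch under standard assumptions (bias-free linear layers, four attention projections, three SwiGLU matrices); it is not taken from the training code.

```python
# Rough parameter count from the spec table above.
# Assumes bias-free linears, 4 attention projections (Q, K, V, O),
# and 3 SwiGLU matrices (W1, W2, W3) -- standard for this architecture family.
vocab, d_model, d_ff, n_layers = 32_000, 1024, 2816, 16

embedding = vocab * d_model              # tok_emb: 32000 x 1024
attention = 4 * d_model * d_model        # Q, K, V, O projections
ffn       = 3 * d_model * d_ff           # SwiGLU: W1, W2, W3
per_layer = attention + ffn              # (each RMSNorm adds only ~1K more)

total = n_layers * per_layer + 2 * embedding  # embedding + LM head
print(f"{total / 1e6:.1f}M")                  # 271.1M
```

Note that the 271.1M figure is reproduced when the tied embedding and LM head are counted as two matrices; the number of uniquely stored weights is correspondingly smaller.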

---

## Architecture

```
Input Tokens
      ↓
[Embedding] (tok_emb: 32000 × 1024) + Dropout(0.1)
      ↓
[Transformer Block × 16]
 ├── RMSNorm → CausalSelfAttention (16 heads, RoPE) → Dropout
 └── RMSNorm → SwiGLU FFN (1024 → 2816 → 1024) → Dropout
      ↓
[RMSNorm]
      ↓
[LM Head] (1024 → 32000, weight tied with embedding)
      ↓
Output Logits
```

Each Transformer block uses **gradient checkpointing** to save VRAM during training.

### Key Components

- **RMSNorm**: a more efficient normalization than LayerNorm
- **RoPE**: Rotary Position Embedding, positional encoding with no learned parameters
- **SwiGLU**: `W2 · (SiLU(W1·x) * W3·x)`, an FFN with gating
- **Weight Tying**: the embedding and LM head share the same weight matrix
- **Causal Mask**: lower-triangular mask for autoregressive generation
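The RMSNorm and SwiGLU blocks are only named above (their full definitions live in the source code). As a reference point, a minimal sketch of the standard formulations, using the dimensions from the spec table, might look like this; class names and details here are illustrative, not the actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """x / rms(x) * g -- LayerNorm without mean-centering or bias."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return x * rms * self.weight

class SwiGLUFFN(nn.Module):
    """W2 @ (SiLU(W1 x) * W3 x): gated FFN, 1024 -> 2816 -> 1024."""
    def __init__(self, d_model=1024, d_ff=2816):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff, bias=False)
        self.w3 = nn.Linear(d_model, d_ff, bias=False)
        self.w2 = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x):
        return self.w2(F.silu(self.w1(x)) * self.w3(x))

x = torch.randn(2, 8, 1024)          # (batch, seq, d_model)
y = SwiGLUFFN()(RMSNorm(1024)(x))
print(y.shape)                       # torch.Size([2, 8, 1024])
```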

---

## Token Format (CRITICAL: must be followed!)

### Training Format

```
[BOS][USER] question_tokens [ASST] answer_tokens [EOS] [PAD...]
```

With these token IDs:

| Token | ID | Description |
|-------|-----|------------|
| `<pad>` | 0 | Padding |
| `<bos>` | 2 | Begin of Sequence |
| `<eos>` | 3 | End of Sequence |
| `<user>` | 5 | Start of the user question (SINGLE TOKEN) |
| `<assistant>` | 6 | Start of the assistant answer (SINGLE TOKEN) |

### Label Masking

```
Input:  [BOS][USER] q1 q2 q3 [ASST]  a1 a2 a3 [EOS] [PAD] [PAD]
Labels: [-100][-100][-100][-100][-100][-100]  a1 a2 a3 [EOS] [-100] [-100]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^
        prompt (masked, not trained)          answer (trained)
```

- **Prompt** (BOS + USER + question + ASST) is masked with `-100`
- **Answer** (answer tokens + EOS) is what gets trained
- **Padding** is masked with `-100`

### Inference Format

```python
# CORRECT:
input_ids = [BOS_ID, USER_ID] + tokenize(question) + [ASST_ID]

# WRONG (DO NOT USE):
# "<user>\nquestion\n</user>\n<assistant>\n"  -> this format is NOT recognized by the model
# "[BOS][SYS]system_prompt[USER]..."          -> the model was not trained with [SYS]
```

> **WARNING**: The model ONLY recognizes the `[BOS][USER]...[ASST]` format. Using any other format (including `<user>...</user><assistant>` as literal text, or adding a system prompt) will produce broken/random output.
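The masking rule above can be expressed as a small helper. This is an illustrative sketch only (the function name and padding logic are not from the training code), using the token IDs from the table:

```python
# Illustrative: build input_ids and labels for one training example,
# following the masking rule above.
PAD_ID, BOS_ID, EOS_ID, USER_ID, ASST_ID = 0, 2, 3, 5, 6
IGNORE = -100  # positions with this label are skipped by the loss

def build_example(q_ids, a_ids, max_len=16):
    prompt = [BOS_ID, USER_ID] + q_ids + [ASST_ID]
    answer = a_ids + [EOS_ID]
    input_ids = (prompt + answer)[:max_len]
    # prompt is masked; answer (+ EOS) is trained; padding is masked
    labels = ([IGNORE] * len(prompt) + answer)[:max_len]
    pad = max_len - len(input_ids)
    return input_ids + [PAD_ID] * pad, labels + [IGNORE] * pad

ids, labels = build_example(q_ids=[101, 102, 103], a_ids=[201, 202, 203])
# ids:    [2, 5, 101, 102, 103, 6, 201, 202, 203, 3, 0, 0, 0, 0, 0, 0]
# labels: [-100]*6 + [201, 202, 203, 3] + [-100]*6
```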

---

## Tokenizer

- **Type**: BPE (Byte-Pair Encoding)
- **Library**: `tokenizers` (HuggingFace tokenizers, NOT `transformers.PreTrainedTokenizerFast`)
- **Vocab size**: 32,000
- **File**: `klinexa_tokenizer.json`

### How to Load

```python
from tokenizers import Tokenizer as TokLoader

tok = TokLoader.from_file("klinexa_tokenizer.json")
PAD_ID  = tok.token_to_id("<pad>")        # 0
BOS_ID  = tok.token_to_id("<bos>")        # 2
EOS_ID  = tok.token_to_id("<eos>")        # 3
USER_ID = tok.token_to_id("<user>")       # 5
ASST_ID = tok.token_to_id("<assistant>")  # 6
```

> **IMPORTANT**: Use `tokenizers.Tokenizer`, NOT `transformers.PreTrainedTokenizerFast`. Reason: `PreTrainedTokenizerFast` encodes `</user>` as 3 tokens `[10587, 7722, 37]` (not as a special token), which breaks label masking.

---

## Training Dataset

### KLINEXA-EL1 v10 (Indonesian Healthcare Domain)

| Source | Count | Description |
|--------|-------|-------------|
| **combined_dataset_v4** | 500,000 | Clinical reasoning, SOAP, diagnosis, treatment, pharmacology, drug interactions, complex scenarios |

Data categories:
- **Clinical Reasoning**: case analysis, differential diagnosis, evidence-based treatment
- **SOAP Notes**: structured clinical documentation (Subjective, Objective, Assessment, Plan)
- **Drug Interactions**: interaction mechanisms, severity, recommendations
- **Complex Scenarios**: multi-disease management, challenging cases (HIV+TB, geriatrics, pediatrics)
- **Disease Knowledge**: pathophysiology, epidemiology, prevention
- **Integrated Clinical**: cases from diagnosis through integrated treatment

### KLINEXA-EL1-Malra (Southeast Maluku Domain)

| Source | Count | Description |
|--------|-------|-------------|
| malra_llm_training_v5 | 5,667 | Reasoning + 26 Malra health modules |
| natural_qa_12k | 12,000 | Hard facts, identity, refusal |
| engine_training_qa | 452 | Analytical, recommendation, prediction |
| v5_augmented_4k | 3,170 | Reasoning reinforcement |
| sft_v8_rebalance | 7,622 | Behavior fixes |

---

## How to Use

### 1. Load the Model

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import math

# Architecture definition (REQUIRED: this is a native model, not HuggingFace format)
class KlinexaConfig:
    vocab_size = 32000
    max_seq_len = 1024
    n_layers = 16
    n_heads = 16
    d_model = 1024
    d_ff = 2816
    dropout = 0.1
    pad_id = 0
    bos_id = 2
    eos_id = 3

    @property
    def head_dim(self):
        return self.d_model // self.n_heads

# ... (definitions of RMSNorm, CausalSelfAttention, SwiGLUFFN,
#      TransformerBlock, KlinexaEL1 -- see the notebook or source code)

cfg = KlinexaConfig()
model = KlinexaEL1(cfg)

ckpt = torch.load("klinexa_el1_model.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()
```

### 2. Chat / Inference

```python
from tokenizers import Tokenizer as TokLoader

tok = TokLoader.from_file("klinexa_tokenizer.json")
BOS_ID  = tok.token_to_id("<bos>")
USER_ID = tok.token_to_id("<user>")
ASST_ID = tok.token_to_id("<assistant>")
EOS_ID  = tok.token_to_id("<eos>")
PAD_ID  = tok.token_to_id("<pad>")

def chat(question, max_tokens=300, temperature=0.7):
    u_ids = tok.encode(question).ids
    inp = [BOS_ID, USER_ID] + u_ids + [ASST_ID]
    inp_t = torch.tensor([inp], dtype=torch.long)

    with torch.no_grad():
        out = model.generate(inp_t, max_new_tokens=max_tokens,
                             temperature=temperature)

    # Keep only the newly generated tokens, truncated at the first EOS/PAD
    gen = out[0].tolist()[len(inp):]
    clean = gen
    for i, t in enumerate(gen):
        if t in (EOS_ID, PAD_ID):
            clean = gen[:i]
            break

    return tok.decode(clean)

# Examples
print(chat("Apa gejala demam berdarah dengue?"))   # "What are the symptoms of dengue fever?"
print(chat("Jelaskan mekanisme kerja metformin"))  # "Explain the mechanism of action of metformin"
```
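`model.generate` above is a method of the custom `KlinexaEL1` class and is not shown in this README. For orientation, a minimal temperature-sampling loop of the kind it implies might look like the sketch below; the signature and details are assumptions, not the actual implementation.

```python
import torch

@torch.no_grad()
def generate(model, input_ids, max_new_tokens=300, temperature=0.7,
             eos_id=3, max_seq_len=1024):
    """Illustrative autoregressive sampling loop (not the real method)."""
    ids = input_ids
    for _ in range(max_new_tokens):
        # Respect the 1,024-token context limit: crop from the left if needed
        ctx = ids[:, -max_seq_len:]
        logits, _ = model(ctx)                        # model returns (logits, ...)
        next_logits = logits[:, -1, :] / max(temperature, 1e-5)
        probs = torch.softmax(next_logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
        if next_id.item() == eos_id:                  # stop at EOS
            break
    return ids
```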

---

## Loss & Training

### Causal Language Modeling with Shift

```python
logits, _ = model(input_ids)

# CRITICAL: causal shift -- logits[i] predicts token[i+1]
shift_logits = logits[:, :-1, :].contiguous()
shift_labels = labels[:, 1:].contiguous()

loss = F.cross_entropy(
    shift_logits.view(-1, shift_logits.size(-1)),
    shift_labels.view(-1),
    ignore_index=-100
)
```

> **DO NOT** compute the loss without the shift. Without it, the model learns a "copy task" (predicting the current token instead of the next one), producing an artificially low loss and broken output.
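A toy check (illustrative only, not from the training code) confirms that with `ignore_index=-100` only the unmasked, shifted positions contribute to the loss:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab = 10
logits = torch.randn(1, 6, vocab)                     # (batch, seq, vocab)
labels = torch.tensor([[-100, -100, 4, 7, 3, -100]])  # prompt & pad masked

shift_logits = logits[:, :-1, :].contiguous()
shift_labels = labels[:, 1:].contiguous()

loss = F.cross_entropy(shift_logits.view(-1, vocab),
                       shift_labels.view(-1), ignore_index=-100)

# The mean over only the unmasked positions matches exactly
mask = shift_labels.view(-1) != -100
manual = F.cross_entropy(shift_logits.view(-1, vocab)[mask],
                         shift_labels.view(-1)[mask])
assert torch.allclose(loss, manual)
```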

### Training Hyperparameters

| Parameter | Value |
|-----------|-------|
| Optimizer | AdamW (β1=0.9, β2=0.95) |
| Learning Rate | 2e-5 → 1e-6 (cosine decay) |
| Warmup | 200 steps |
| Weight Decay | 0.01 |
| Gradient Clipping | 1.0 |
| Batch Size | 4 × 8 = 32 (effective) |
| AMP | FP16 |
| Epochs | 1 |

---

## Files in the Repository

| File | Size | Description |
|------|------|-------------|
| `klinexa_el1_model.pt` | ~1 GB | Model weights (PyTorch state_dict + config + training info) |
| `klinexa_tokenizer.json` | ~1 MB | BPE tokenizer (tokenizers library format) |
| `config.json` | ~1 KB | Model configuration |
| `README.md` | - | This documentation |

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| v1-v7 | 2026 | Early iterations, Southeast Maluku domain only |
| v8 | Mar 2026 | Rebalanced dataset + behavior fixes |
| v9 | Mar 2026 | Causal shift fix, single GPU |
| **v10** | **Mar 2026** | **Full fix**: exact architecture match, correct token format, 500K Indonesian healthcare dataset, correct label masking, correct tokenizer |

### Bugs Fixed in v10

1. **Architecture mismatch**: layer name `embed` renamed to `tok_emb` (to match the checkpoint). Missing dropout and gradient checkpointing restored.
2. **Wrong token format**: `<user>...\n</user>\n<assistant>` replaced with `[BOS][USER]...[ASST]...[EOS]` (matching training).
3. **Loss without causal shift**: added `logits[:, :-1]` vs `labels[:, 1:]`.
4. **Wrong tokenizer**: `PreTrainedTokenizerFast` replaced with `tokenizers.Tokenizer` (so special tokens are recognized correctly).
5. **Dual GPU bugs**: removed; single GPU only.

---

## Limitations & Warnings

- This is **NOT** a standard HuggingFace model. It cannot be loaded with `AutoModelForCausalLM`; you must use the custom `KlinexaEL1` class.
- Context length is limited to **1,024 tokens**. Input + output must fit within this limit.
- The model is trained for the **healthcare domain**. For questions outside this domain, it will try to refuse or add a disclaimer.
- The model must **NOT** be used as a substitute for professional medical diagnosis.
- Model output **MUST** be verified by qualified medical personnel.

---

## License

Apache 2.0

---

## Credits

**Created by:** Emylton Leunufna
**Location:** Langgur, Southeast Maluku Regency (Kabupaten Maluku Tenggara), Maluku Province, Indonesia
**Project:** KLINEXA, Kei Local Intelligence for Nexus Expert Analysis

*The entire architecture, tokenizer, dataset pipeline, and training code were designed and built from scratch by Emylton Leunufna.*