emylton/KLINEXA-EL1

@@ -1,17 +1,80 @@
-# KLINEXA-EL1 v10 — Full Fix
-## Critical Fixes in v10
-- Architecture: EXACT match with original (tok_emb, dropout, grad checkpoint, weight tying)
-- Token format: [BOS][USER]question[ASST]answer[EOS] (matches training format)
-- Loss: causal shift + ignore_index=-100
-- Dataset: 500K samples converted to correct format
-## Inference
-Format: `[BOS][USER]question[ASST]` → generate
-DO NOT use `<user>...</user><assistant>` format — model was NOT trained with that.
-## Architecture
-16 layers, 1024 hidden, 16 heads, SwiGLU, RoPE, RMSNorm
-Vocab 32K, max seq 1024
-WARNING: Native model, use KlinexaConfig + KlinexaEL1 classes

+---
+license: apache-2.0
+language:
+  - id
+tags:
+  - health
+  - maluku-tenggara
+  - native-llm
+  - from-scratch
+  - adaptive-learning
+pipeline_tag: text-generation
+---
+# KLINEXA-EL1 — Full Engine v5.1
+**Kei Local Intelligence for Nexus Expert Analysis — Edition Level 1**
+A native LLM built from scratch by **Emylton Leunufna** in Kota Langgur,
+Kabupaten Maluku Tenggara, for the health, statistics, climate, and geography domains.
+## Model
+| Parameter | Value |
+|---|---|
+| Total Parameters | 238.3M |
+| Layers | 16 |
+| Heads | 16 |
+| Dimension | 1024 |
+| FFN | 2816 |
+| Context Length | 1024 tokens |
+| Tokenizer | BPE 32K vocab (trained from scratch) |
+| Components | RoPE, SwiGLU, RMSNorm, Gradient Checkpointing |
+## 23 Unique Features (Full Engine v5.1)
+1. **Persistent Memory**: stores new facts across sessions
+2. **Knowledge Graph**: structured entity relations
+3. **URL Reader**: reads and remembers content from links
+4. **File Reader**: reads uploaded PDF, CSV, and Excel files
+5. **Online Learning**: updates weights from user corrections in real time
+6. **Multi-turn Memory**: remembers the entire conversation within a session
+7. **Confidence Scoring**: the model knows the limits of its knowledge
+8. **Auto-Summary**: summarizes long conversations
+9. **Domain-Aware Auto-Learn**: learns automatically from interactions about Malra health and government
+10. **API Health Data Fetcher**: real-time data from BPS, Kemenkes, and the Malra portal
+11. **Geo-Temporal Context**: aware of season, weather, health risks, and the local calendar
+12. **Uncertainty Quantification**: explicit doubt system with 5 confidence levels
+13. **Fragmented Processing**: automatically splits complex multi-topic questions
+14. **Medical Protocol Compliance**: hard-coded medical guardrails
+15. **Recursive Self-Correction**: micro-updates to weights + decay + domain priority
+16. **Dual-Core Check-Before-Speak**: generator + verifier audit loop
+17. **Native Geospatial Memory**: GPS coordinates of 24 health facilities, haversine distance, sea/land routes
+18. **Self-Auditing Medical Law**: DOEN, health-facility authority, Kemenkes regulations and the Health Law (UU Kesehatan)
+19. **Ethno-Medical Mapping**: 9 local medicinal plants + active compounds + scientific evidence
+20. **Cultural Tone-Switcher**: Doctor / Local ("Anak Daerah") / Casual modes, chosen to fit the Kei cultural context
+21. **Drug-Herb Interaction**: 6 dangerous drug-herb interactions detected automatically
+22. **Placebo & Empathy Layer**: respects spiritual and customary beliefs while inserting medical advice
+23. **Myth vs Fact Classifier**: 8 Malra health myths corrected gently but firmly
+## Training
+- Pre-train corpus: Indonesian Wikipedia + BPS statistics + climate/geography + public health
+- SFT data: 4,795 samples from 7 official Maluku Tenggara documents
+- Identity & key facts: hard-coded (engine layer, not in model weights)
+- Pre-train steps: 2880
+- SFT steps: 1710
+- GPU: NVIDIA A100 80GB
+## Inference Format
+The model uses the SFT format: `[BOS][USER]question[ASST]` → generate.
+DO NOT use the `[SYS]` token: the model was never trained with that format.
+## Engine Architecture (v5.1)
+- Model generation = clean, with no context injection
+- Identity/greeting = hard-coded responses (13 categories)
+- Post-processing: drug-herb interaction, myth check, uncertainty
+- Disabled (noise): verifier, legal audit, non-critical medical guardrails, auto-learn
+- Slash commands: 25 interactive commands
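
The inference format stated in the README hunk (`[BOS][USER]question[ASST]`, then generate) can be sketched as plain string handling. This is a minimal illustration, not the repository's code: the helper names `build_prompt` and `extract_answer` are hypothetical, and it assumes the markers are written as literal strings. How the tokenizer maps them to special-token IDs is not shown in this diff.

```python
# Hypothetical helpers; only the marker layout comes from the README.
BOS, USER, ASST, EOS = "[BOS]", "[USER]", "[ASST]", "[EOS]"


def build_prompt(question: str) -> str:
    """Wrap a user question as [BOS][USER]question[ASST]."""
    return f"{BOS}{USER}{question.strip()}{ASST}"


def extract_answer(generated: str) -> str:
    """Return the text after [ASST], cut at the first [EOS] if present."""
    _, _, tail = generated.partition(ASST)
    answer, _, _ = tail.partition(EOS)
    return answer.strip()


prompt = build_prompt("  Apa itu stunting?  ")
print(prompt)  # [BOS][USER]Apa itu stunting?[ASST]
```

Note that no `[SYS]` marker is ever emitted, in line with the README's warning that the model was never trained with it.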

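The 238.3M total in the README's model table can be roughly reproduced from the listed dimensions. The sketch below rests on assumptions the diff does not confirm: no bias terms, a SwiGLU block with three projection matrices, two RMSNorm gain vectors per layer plus one final norm, and an output head tied to the token embedding (the removed v10 README mentions weight tying). RoPE adds no learned parameters.

```python
def count_params(vocab=32_000, dim=1024, layers=16, ffn=2816, tied=True):
    """Estimate parameters of a decoder-only transformer (assumed layout)."""
    embed = vocab * dim                # token embedding
    attn = 4 * dim * dim               # Q, K, V, and output projections
    mlp = 3 * dim * ffn                # SwiGLU: gate, up, and down projections
    norms = 2 * dim                    # two RMSNorm gain vectors per layer
    per_layer = attn + mlp + norms
    head = 0 if tied else vocab * dim  # tied head reuses the embedding
    return embed + layers * per_layer + dim + head  # + dim: final RMSNorm


total = count_params()
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
# 238,322,688 parameters (~238.3M)
```

Under these assumptions the estimate agrees with the table to the stated precision; an untied output head would add another 32.8M parameters.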
config.json

@@ -1,7 +1,7 @@
 {
   "model_name": "KLINEXA-EL1",
   "architecture": "GPT-style Decoder-Only Transformer",
-  "version": "v7.0 (SFT v7 — v6 polish + stunting/disease/tual_kec patches)",
+  "version": "Full Engine v5.1",
   "config": {
     "vocab_size": 32000,
     "max_seq_len": 1024,
@@ -18,7 +18,31 @@
     "<assistant>": 6,
     "<system>": 7
   },
-  "inference_format": "[BOS][USER]question[ASST] -> generate",
+  "engine_features": [
+    "persistent_memory",
+    "knowledge_graph",
+    "url_reader",
+    "file_reader",
+    "online_learning",
+    "multi_turn_memory",
+    "confidence_scoring",
+    "auto_summary",
+    "domain_aware_auto_learn",
+    "api_health_data_fetcher",
+    "geo_temporal_context",
+    "uncertainty_quantification",
+    "fragmented_processing",
+    "medical_protocol_compliance",
+    "recursive_self_correction",
+    "dual_core_verifier",
+    "native_geospatial_memory",
+    "medical_law_audit",
+    "ethno_medical_mapping",
+    "cultural_tone_switcher",
+    "drug_herb_interaction",
+    "placebo_empathy_layer",
+    "myth_fact_classifier"
+  ],
   "creator": "Emylton Leunufna",
   "location": "Kota Langgur, Kabupaten Maluku Tenggara"
 }
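
This `config.json` change drops the `inference_format` key and adds an `engine_features` list, so code that reads the file may want to tolerate either revision. A minimal sketch, assuming only the top-level keys visible in the diff. The function name `summarize_config` is hypothetical, and the sample JSON is deliberately abbreviated to three of the listed features.

```python
import json


def summarize_config(raw: str) -> dict:
    """Summarize a KLINEXA-EL1 config without assuming optional keys exist."""
    cfg = json.loads(raw)
    features = cfg.get("engine_features", [])  # absent before this commit
    return {
        "version": cfg.get("version", "unknown"),
        "feature_count": len(features),
        "has_inference_format": "inference_format" in cfg,  # removed here
    }


# Abbreviated sample: the real list in the diff has 23 entries.
sample = """{
  "model_name": "KLINEXA-EL1",
  "version": "Full Engine v5.1",
  "engine_features": ["persistent_memory", "knowledge_graph", "url_reader"]
}"""

summary = summarize_config(sample)
print(summary)
```

Against the full file, `feature_count` would be expected to equal the 23 features enumerated in the README.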

klinexa_el1_model.pt

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f2939191b2c5f9aa31c379ae4e4c988b86fcb9d800a4db9391afe0e9283e3ba1
-size 953349771
+oid sha256:5d680a8908a542622e32aa428cd83c5eb0005c98f334ecb20f98954e3529d0f0
+size 953349707
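
The Git LFS pointer above records only the checkpoint's SHA-256 digest and byte size, so a downloaded `klinexa_el1_model.pt` can be checked against it. The following is a sketch using only the Python standard library; the function names `parse_lfs_pointer` and `stream_matches` are hypothetical.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def stream_matches(fileobj, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Hash a binary file object in chunks and compare size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    while chunk := fileobj.read(chunk_size):
        hasher.update(chunk)
        size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:5d680a8908a542622e32aa428cd83c5eb0005c98f334ecb20f98954e3529d0f0
size 953349707""")
print(pointer["algo"], pointer["size"])  # sha256 953349707
```

To verify a local file, one could call `stream_matches(open(path, "rb"), pointer)`. As a side observation, 953,349,707 bytes is about 4 bytes per parameter for a 238.3M-parameter model, which would be consistent with 32-bit weights, though the diff does not state the dtype.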