πŸ₯ Skin Disease Knowledge Base Embeddings v2

A complete knowledge base for a skin-health chatbot in Indonesian and English. The content is scraped from multiple trusted sources to give comprehensive coverage.

📊 Statistics

  • Total Documents: 1346
  • Embedding Dimension: 384
  • Diseases Covered: 16
  • Sources: wikipedia, alodokter, halodoc, sehatq
  • Model: sentence-transformers/all-MiniLM-L6-v2
  • Last Updated: 2026-01-05 04:30:00.329293
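
The figures above can be checked against a loaded knowledge base. The following is a minimal sketch, assuming the pickle holds a dict with parallel `documents` and `embeddings` entries (as the Quick Start suggests); `check_kb` is a hypothetical helper and not part of this repository.

```python
def check_kb(kb, expected_docs=1346, expected_dim=384):
    """Return a list of problems found; an empty list means the KB looks consistent.

    Assumes kb["documents"] and kb["embeddings"] are parallel sequences
    (an assumption based on the Quick Start, not a documented schema).
    """
    problems = []
    documents = kb["documents"]
    embeddings = kb["embeddings"]

    # The document count should match the published statistics
    if len(documents) != expected_docs:
        problems.append(f"expected {expected_docs} documents, found {len(documents)}")

    # Every document needs exactly one embedding
    if len(embeddings) != len(documents):
        problems.append(
            f"{len(documents)} documents but {len(embeddings)} embeddings"
        )

    # Every embedding should have the advertised dimension
    bad_rows = [i for i, row in enumerate(embeddings) if len(row) != expected_dim]
    if bad_rows:
        problems.append(
            f"{len(bad_rows)} embeddings do not have dimension {expected_dim}"
        )

    return problems


# Toy data standing in for the real knowledge base
toy_kb = {
    "documents": [{"text": "a"}, {"text": "b"}],
    "embeddings": [[0.0] * 4, [1.0] * 4],
}
print(check_kb(toy_kb, expected_docs=2, expected_dim=4))
```

With the real file, calling `check_kb(kb)` with the defaults compares against the numbers listed in the statistics.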

πŸ₯ Diseases Included

dermatitis_seboroik, eksim, herpes, impetigo, jerawat, kulit_berminyak, kulit_kering, kulit_sensitif, kurap, kutil ... (+6 more)

🌐 Data Sources

  1. Wikipedia (EN & ID) - Authoritative medical information
  2. Halodoc - Indonesian health platform
  3. Alodokter - Trusted Indonesian medical site
  4. SehatQ - Health information portal

🚀 Quick Start

from sentence_transformers import SentenceTransformer
import pickle
import numpy as np
from huggingface_hub import hf_hub_download

# Download KB
kb_path = hf_hub_download(
    repo_id="Ardian122/skin-embeddings-v2",
    filename="skin_kb.pkl"
)

# Load KB
with open(kb_path, "rb") as f:
    kb = pickle.load(f)

print(f"Loaded {len(kb['documents'])} documents")

# Load embedder
embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

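# Convert the stored embeddings to an array once, not on every query
kb_embeddings = np.array(kb['embeddings'])

# Search: the dot product equals cosine similarity only if the stored
# embeddings are L2-normalized (assumed here; the query is normalized below)
def search_similar(query, top_k=5):
    q_emb = embedder.encode(query, normalize_embeddings=True)
    sims = np.dot(kb_embeddings, q_emb)
    # Indices of the top_k highest scores, best match first
    top_idx = np.argsort(sims)[-top_k:][::-1]
    return [kb['documents'][i] for i in top_idx]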

# Example
results = search_similar("penyebab jerawat", top_k=3)
for result in results:
    print(result['text'][:200])
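
The search above ranks documents by dot product, which matches cosine similarity only when both the stored vectors and the query are unit length. Whether the stored embeddings are normalized is not stated here, so the sketch below normalizes both sides itself and also returns the scores. `top_k_cosine` is a hypothetical helper, shown on toy vectors rather than the real knowledge base.

```python
import numpy as np


def top_k_cosine(doc_embeddings, query_embedding, top_k=5):
    """Return (index, cosine_similarity) pairs for the best matches, best first."""
    docs = np.asarray(doc_embeddings, dtype=np.float32)
    query = np.asarray(query_embedding, dtype=np.float32)

    # Normalize both sides; the small floor avoids division by zero
    doc_norms = np.maximum(np.linalg.norm(docs, axis=1, keepdims=True), 1e-12)
    query_norm = max(float(np.linalg.norm(query)), 1e-12)
    sims = (docs / doc_norms) @ (query / query_norm)

    # Never ask for more results than there are documents
    top_k = min(top_k, len(sims))
    order = np.argsort(sims)[::-1][:top_k]
    return [(int(i), float(sims[i])) for i in order]


# Toy 2-D vectors with different lengths, to show that scale does not matter
toy_docs = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
for index, score in top_k_cosine(toy_docs, [1.0, 0.0], top_k=2):
    print(index, round(score, 3))
```

With the real data, the same function could be called as `top_k_cosine(kb['embeddings'], embedder.encode(query), top_k=3)`, and the returned indices used to look up entries in `kb['documents']`.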

📄 License

Apache 2.0

⚠️ Disclaimer

This knowledge base is for educational purposes only. Always consult healthcare professionals for medical advice.
