license: mit
datasets:
- Lyon28/datasets-caca-3500
language:
- id
tags:
- retrieval
- qa
- indonesian
- bm25
- tfidf
title: Chatbot Caca
emoji: 💬
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
Chatbot Caca - Retrieval-Based QA
A BM25 + TF-IDF based chatbot for question answering in Indonesian.
Model Details
- Type: Retrieval-based QA System
- Size: 2.69 MB
- Algorithm: Hybrid BM25 + TF-IDF + Fuzzy Matching
- Dataset: datasets-caca-3500 (3,500 QA pairs)
- Language: Indonesian
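The hybrid scoring named above can be illustrated with a small standard-library sketch. This is not the code shipped in `chatbot_caca.pkl`: the class name `HybridRetriever`, the score weights `(0.5, 0.3, 0.2)`, the BM25 parameters `k1` and `b`, and the three toy QA pairs are all assumptions made for illustration. It combines a normalized BM25 score, a TF-IDF cosine similarity, and a `difflib` fuzzy ratio into one ranking score.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


class HybridRetriever:
    """Toy hybrid scorer: BM25 + TF-IDF cosine + fuzzy ratio (weights are assumed)."""

    def __init__(self, qa_pairs, k1=1.5, b=0.75, weights=(0.5, 0.3, 0.2)):
        self.qa_pairs = qa_pairs
        self.k1, self.b, self.weights = k1, b, weights
        self.docs = [tokenize(question) for question, _ in qa_pairs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        df = Counter(term for d in self.docs for term in set(d))
        # BM25-style idf with +1 inside the log so it is never negative.
        self.idf = {t: math.log(1 + (n - c + 0.5) / (c + 0.5)) for t, c in df.items()}
        self.doc_vecs = [self._tfidf(d) for d in self.docs]

    def _tfidf(self, tokens):
        return {t: f * self.idf.get(t, 0.0) for t, f in Counter(tokens).items()}

    def _bm25(self, query_tokens, doc):
        tf = Counter(doc)
        score = 0.0
        for t in query_tokens:
            if t in tf:
                norm = tf[t] + self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
                score += self.idf[t] * tf[t] * (self.k1 + 1) / norm
        return score

    @staticmethod
    def _cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def search(self, query):
        """Return (answer, score) for the best-matching stored question."""
        q_tokens = tokenize(query)
        q_vec = self._tfidf(q_tokens)
        bm25 = [self._bm25(q_tokens, d) for d in self.docs]
        top = max(bm25) or 1.0  # scale BM25 into [0, 1]; avoid dividing by zero
        w_bm25, w_tfidf, w_fuzzy = self.weights
        scored = []
        for i, (question, _) in enumerate(self.qa_pairs):
            fuzzy = SequenceMatcher(None, query.lower(), question.lower()).ratio()
            cosine = self._cosine(q_vec, self.doc_vecs[i])
            scored.append((w_bm25 * bm25[i] / top + w_tfidf * cosine + w_fuzzy * fuzzy, i))
        score, idx = max(scored)
        return self.qa_pairs[idx][1], score


# Made-up Indonesian QA pairs, not taken from the dataset.
pairs = [
    ("Siapa nama kamu?", "Nama aku Caca."),
    ("Apa ibu kota Indonesia?", "Ibu kota Indonesia adalah Jakarta."),
    ("Bagaimana cuaca hari ini?", "Aku tidak bisa melihat cuaca, maaf."),
]
retriever = HybridRetriever(pairs)
answer, score = retriever.search("ibu kota indonesia apa")
print(answer)
```

The query reorders the words of the second stored question, so the lexical scores still match it even though the strings differ. A real deployment would likely also apply a minimum score threshold before answering, which this sketch omits.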
Usage
# Install dependencies (notebook syntax; drop the leading "!" in a regular shell)
!pip install rank-bm25 scikit-learn huggingface-hub

# Download model
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="Lyon28/caca-based-chatbot",
    filename="chatbot_caca.pkl",
)

# Load model. pickle can run arbitrary code on load, so only open files you trust.
import pickle

with open(model_path, "rb") as f:
    data = pickle.load(f)

print(f"Loaded {len(data['qa_pairs'])} QA pairs!")
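Only the `qa_pairs` key is confirmed by the snippet above, so it may help to inspect the loaded object before relying on any other structure. The helper below is a hypothetical sketch (the name `describe_payload` and the `vectorizer` key in the toy payload are assumptions, not part of the published file); it summarizes a payload without assuming its schema.

```python
def describe_payload(data, sample=1):
    """Summarize a loaded payload without assuming anything beyond 'qa_pairs'."""
    summary = {"type": type(data).__name__}
    if isinstance(data, dict):
        summary["keys"] = sorted(str(k) for k in data)
        pairs = data.get("qa_pairs")
        if pairs is not None:
            summary["n_qa_pairs"] = len(pairs)
            # Show the first few entries so their shape can be checked by eye.
            summary["sample"] = list(pairs)[:sample]
    return summary


# Stand-in for the object returned by pickle.load; its contents are invented.
toy = {"qa_pairs": [("Halo", "Hai juga!")], "vectorizer": None}
print(describe_payload(toy))
```

With the real file, calling `describe_payload(data)` after loading would show which keys are actually present.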
Performance
- Query speed: < 10 ms
- Accuracy: High for paraphrase matching
- Memory: ~3 MB
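The query-speed figure can be checked on your own hardware with a small timing helper. This is a sketch under assumptions: `median_latency_ms` is a hypothetical name, and `str.lower` stands in for the real query function, which this card does not show.

```python
import statistics
import time


def median_latency_ms(fn, query, repeats=100):
    """Call fn(query) repeatedly and return the median wall-clock time in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(query)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Placeholder callable; replace it with the chatbot's actual query function.
latency = median_latency_ms(str.lower, "Apa kabar?")
print(f"median latency: {latency:.4f} ms")
```

The median is used instead of the mean so that a single slow call, for example the first one, does not dominate the result.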
Credits
Created by Lyon28