---
license: mit
datasets:
- Lyon28/datasets-caca-3500
language:
- id
tags:
- retrieval
- qa
- indonesian
- bm25
- tfidf
---
title: Chatbot Caca
emoji: 💬
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---
# Chatbot Caca - Retrieval-Based QA
A retrieval-based chatbot that combines BM25 and TF-IDF for Indonesian-language question answering.
## Model Details
- **Type:** Retrieval-based QA System
- **Size:** 2.69 MB
- **Algorithm:** Hybrid BM25 + TF-IDF + Fuzzy Matching
- **Dataset:** datasets-caca-3500 (3,500 QA pairs)
- **Language:** Indonesian
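The hybrid scoring listed above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the author's implementation: the toy `QA_PAIRS`, the helper names, the `weights` tuple, and the BM25 parameters `k1` and `b` are all assumptions chosen for the example, and `difflib.SequenceMatcher` stands in for whatever fuzzy matcher the model actually uses.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy data; the real dataset holds 3,500 QA pairs.
QA_PAIRS = [
    ("apa ibu kota indonesia", "Jakarta"),
    ("siapa presiden pertama indonesia", "Soekarno"),
    ("berapa jumlah provinsi di indonesia", "38"),
]


def tokenize(text):
    return text.lower().split()


def idf_table(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log(1 + (n - c + 0.5) / (c + 0.5)) for t, c in df.items()}


def bm25(query, doc, idf, avg_len, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document for a tokenized query."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if term in tf:
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf[term] * tf[term] * (k1 + 1) / norm
    return score


def tfidf_cosine(query, doc, idf):
    """Cosine similarity between TF-IDF vectors of query and document."""
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(query).items()}
    d = {t: c * idf.get(t, 0.0) for t, c in Counter(doc).items()}
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm = math.sqrt(sum(w * w for w in q.values()))
    norm *= math.sqrt(sum(w * w for w in d.values()))
    return dot / norm if norm else 0.0


def answer(query, qa_pairs=QA_PAIRS, weights=(0.5, 0.3, 0.2)):
    """Return (answer, score) for the stored question that best matches."""
    docs = [tokenize(question) for question, _ in qa_pairs]
    idf = idf_table(docs)
    avg_len = sum(len(d) for d in docs) / len(docs)
    q_tokens = tokenize(query)
    bm = [bm25(q_tokens, d, idf, avg_len) for d in docs]
    top = max(bm) or 1.0  # rescale BM25 so the best document scores 1
    best_answer, best_score = None, -1.0
    for (question, ans), doc, raw in zip(qa_pairs, docs, bm):
        fuzzy = SequenceMatcher(None, query.lower(), question.lower()).ratio()
        score = (weights[0] * raw / top
                 + weights[1] * tfidf_cosine(q_tokens, doc, idf)
                 + weights[2] * fuzzy)
        if score > best_score:
            best_answer, best_score = ans, score
    return best_answer, best_score


print(answer("ibu kota indonesia apa"))
```

Rescaling BM25 by its maximum keeps all three signals in the range 0 to 1, so the weights express relative importance directly; a real system would likely tune them on held-out questions.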
## Usage
```python
# Install dependencies first (in a notebook, prefix the command with "!"):
#   pip install rank-bm25 scikit-learn huggingface-hub

import pickle

from huggingface_hub import hf_hub_download

# Download the model file from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="Lyon28/caca-based-chatbot",
    filename="chatbot_caca.pkl",
)

# Load the model (only unpickle files from a source you trust)
with open(model_path, "rb") as f:
    data = pickle.load(f)

print(f"Loaded {len(data['qa_pairs'])} QA pairs!")
```
## Performance
- Query speed: < 10 ms per query
- Accuracy: high on paraphrased questions
- Memory: ~3 MB
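One way to check the query-speed figure on your own hardware is to time repeated calls and take the median. The sketch below is an assumption-laden illustration: `median_latency_ms` and `toy_search` are hypothetical names, and the toy linear scan is only a stand-in for the real retrieval call, which should be passed in as `search` instead.

```python
import statistics
import time


def median_latency_ms(search, queries, repeats=20):
    """Median wall-clock time per call of search(query), in milliseconds."""
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            search(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Hypothetical stand-in for the real retrieval call: a linear substring scan
# over 3,500 synthetic questions.
corpus = [f"pertanyaan contoh nomor {i}" for i in range(3500)]


def toy_search(query):
    return [text for text in corpus if query in text]


latency = median_latency_ms(toy_search, ["nomor 42", "contoh"])
print(f"median latency: {latency:.3f} ms")
```

The median is less sensitive than the mean to one-off stalls such as garbage collection or a cold cache.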
## Credits
Created by Lyon28