---
license: mit
datasets:
- Lyon28/Caca-Behavior
language:
- id
tags:
- retrieval
- qa
- indonesian
- bm25
- tfidf
---

# Chatbot Caca - Retrieval-Based QA

A retrieval-based chatbot that combines BM25 and TF-IDF for question answering in Indonesian.

## Model Details

- **Type:** Retrieval-based QA system
- **Size:** 2.69 MB
- **Algorithm:** Hybrid BM25 + TF-IDF + fuzzy matching
- **Dataset:** Caca-Behavior (4,079 QA pairs)
- **Language:** Indonesian

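
The hybrid scoring idea listed above can be illustrated with a minimal sketch. This is not the code shipped in `chatbot_caca.pkl`: the toy QA pairs, the function names (`bm25`, `tfidf_vector`, `cosine`, `answer`), and the `0.5 / 0.3 / 0.2` weights are all assumptions made for illustration. It uses only the Python standard library instead of `rank-bm25` and `scikit-learn`, so it runs without extra dependencies.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy data; the real QA pairs come from the Caca-Behavior dataset.
QA_PAIRS = [
    ("siapa nama kamu", "Namaku Caca."),
    ("apa kabar hari ini", "Kabarku baik, terima kasih!"),
    ("kamu bisa bantu apa", "Aku bisa menjawab pertanyaan sederhana."),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


DOCS = [tokenize(question) for question, _ in QA_PAIRS]
N = len(DOCS)
AVG_LEN = sum(len(doc) for doc in DOCS) / N
DF = Counter(term for doc in DOCS for term in set(doc))  # document frequency


def bm25(query, doc, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized question for a tokenized query."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if term not in tf:
            continue
        idf = math.log((N - DF[term] + 0.5) / (DF[term] + 0.5) + 1.0)
        norm = tf[term] + k1 * (1 - b + b * len(doc) / AVG_LEN)
        score += idf * tf[term] * (k1 + 1) / norm
    return score


def tfidf_vector(tokens):
    """Sparse TF-IDF vector (term -> weight) using a smoothed IDF."""
    tf = Counter(tokens)
    return {t: c * (math.log((1 + N) / (1 + DF[t])) + 1.0) for t, c in tf.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def answer(query, weights=(0.5, 0.3, 0.2)):
    """Return (best_answer, score) using a weighted mix of the three signals."""
    q_tokens = tokenize(query)
    bm_scores = [bm25(q_tokens, doc) for doc in DOCS]
    bm_max = max(bm_scores) or 1.0  # scale BM25 into [0, 1]; avoid dividing by 0
    q_vec = tfidf_vector(q_tokens)
    best_idx, best_score = 0, -1.0
    for i, doc in enumerate(DOCS):
        fuzzy = SequenceMatcher(None, query.lower(), QA_PAIRS[i][0]).ratio()
        score = (
            weights[0] * bm_scores[i] / bm_max
            + weights[1] * cosine(q_vec, tfidf_vector(doc))
            + weights[2] * fuzzy
        )
        if score > best_score:
            best_idx, best_score = i, score
    return QA_PAIRS[best_idx][1], best_score


reply, score = answer("nama kamu siapa")
print(reply, round(score, 3))
```

In this sketch BM25 and TF-IDF reward shared words regardless of order, while the character-level fuzzy ratio adds some tolerance for typos. A real system would likely tune the weights and reject matches below a score threshold.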
## Usage

```python
# Install dependencies first (shell command; prefix with "!" in a notebook):
#   pip install rank-bm25 scikit-learn huggingface-hub

import pickle

from huggingface_hub import hf_hub_download

# Download the pickled model from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="Lyon28/Caca-Chatbot",
    filename="chatbot_caca.pkl",
)

# Load the model (only unpickle files from sources you trust)
with open(model_path, "rb") as f:
    data = pickle.load(f)

print(f"Loaded {len(data['qa_pairs'])} QA pairs!")
```

## Performance

- Query speed: < 10 ms
- Accuracy: High for paraphrase matching
- Memory: ~3 MB

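
To check the query-speed figure on your own hardware, a small timing helper along these lines may be useful. It is only a sketch: `measure_latency_ms` and `dummy_answer` are hypothetical names, and `dummy_answer` is a stand-in that should be replaced with whatever function actually answers a query in your setup.

```python
import statistics
import time


def measure_latency_ms(fn, queries, repeats=100):
    """Return (median, worst) latency in milliseconds of fn over the queries."""
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            fn(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


def dummy_answer(query):
    """Hypothetical stand-in for the chatbot's real query function."""
    return query.lower().split()


median_ms, worst_ms = measure_latency_ms(
    dummy_answer, ["apa kabar", "siapa nama kamu"]
)
print(f"median {median_ms:.3f} ms, worst {worst_ms:.3f} ms")
```

Reporting both the median and the worst case gives a fuller picture than a single average, since the first calls are often slower than later ones.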
## Credits

Created by Lyon28