---
license: mit
datasets:
- Lyon28/datasets-caca-3500
language:
- id
tags:
- retrieval
- qa
- indonesian
- bm25
- tfidf
---
title: Chatbot Caca
emoji: 💬
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# Chatbot Caca - Retrieval-Based QA

A retrieval-based chatbot that combines BM25 and TF-IDF for question answering (QA) in Indonesian.

## Model Details

- **Type:** Retrieval-based QA system
- **Size:** 2.69 MB
- **Algorithm:** Hybrid BM25 + TF-IDF + fuzzy matching
- **Dataset:** datasets-caca-3500 (3,500 QA pairs)
- **Language:** Indonesian

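The card names the three signals (BM25, TF-IDF, fuzzy matching) but does not say how they are combined. The following is a minimal sketch of one possible hybrid scorer, written with the standard library only so that it runs anywhere. The toy QA pairs, the `answer` function, and the 0.5/0.3/0.2 weights are all assumptions made for illustration, not the model's actual data or parameters.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Made-up toy data (assumption); the real dataset holds 3,500 QA pairs.
QA_PAIRS = [
    ("siapa nama kamu", "Nama saya Caca."),
    ("apa ibu kota indonesia", "Ibu kota Indonesia adalah Jakarta."),
    ("bagaimana cara memasak nasi goreng", "Tumis bumbu, lalu masukkan nasi."),
]

def tokenize(text):
    return text.lower().split()

DOCS = [tokenize(question) for question, _ in QA_PAIRS]
N = len(DOCS)
AVG_LEN = sum(len(doc) for doc in DOCS) / N
DF = Counter(term for doc in DOCS for term in set(doc))  # document frequency

def bm25(query, doc, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document for a tokenized query."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        if term not in tf:
            continue
        idf = math.log((N - DF[term] + 0.5) / (DF[term] + 0.5) + 1.0)
        norm = tf[term] + k1 * (1 - b + b * len(doc) / AVG_LEN)
        score += idf * tf[term] * (k1 + 1) / norm
    return score

def tfidf_vector(tokens):
    """Sparse TF-IDF vector with smoothed IDF."""
    tf = Counter(tokens)
    return {t: c * (math.log((1 + N) / (1 + DF[t])) + 1.0) for t, c in tf.items()}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def answer(query, weights=(0.5, 0.3, 0.2)):
    """Return (best_answer, score) using a weighted sum of the three signals."""
    q_tokens = tokenize(query)
    bm_scores = [bm25(q_tokens, doc) for doc in DOCS]
    bm_max = max(bm_scores) or 1.0  # scale BM25 into [0, 1]
    q_vec = tfidf_vector(q_tokens)
    best, best_score = None, -1.0
    for i, (question, reply) in enumerate(QA_PAIRS):
        fuzzy = SequenceMatcher(None, query.lower(), question).ratio()
        score = (weights[0] * bm_scores[i] / bm_max
                 + weights[1] * cosine(q_vec, tfidf_vector(DOCS[i]))
                 + weights[2] * fuzzy)
        if score > best_score:
            best, best_score = reply, score
    return best, best_score

print(answer("ibu kota indonesia apa")[0])
```

The lexical scores (BM25, TF-IDF) reward shared words regardless of order, while the character-level fuzzy ratio gives some tolerance to typos such as `kmu` for `kamu`. In practice the `rank-bm25` and `scikit-learn` packages listed under Usage would likely replace the hand-written scoring functions.
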
## Usage

```python
# Install dependencies first (shell command; prefix with "!" in a notebook):
#   pip install rank-bm25 scikit-learn huggingface-hub

import pickle

from huggingface_hub import hf_hub_download

# Download the pickled model from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="Lyon28/caca-based-chatbot",
    filename="chatbot_caca.pkl",
)

# Load the model. pickle can execute arbitrary code while loading,
# so only unpickle files from a source you trust.
with open(model_path, "rb") as f:
    data = pickle.load(f)

print(f"Loaded {len(data['qa_pairs'])} QA pairs!")
```

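The only key of the loaded object that this card documents is `qa_pairs`. To explore what else the pickle contains, a small helper like the one below may be useful. It is a sketch: `summarize_payload` is a hypothetical name, and the stand-in payload (including its `bm25` key) is invented for the example.

```python
def summarize_payload(data):
    """Map each top-level key to its value's type name and, if sized, its length."""
    if not isinstance(data, dict):
        raise TypeError(f"expected a dict, got {type(data).__name__}")
    summary = {}
    for key, value in data.items():
        size = f" (len={len(value)})" if hasattr(value, "__len__") else ""
        summary[key] = type(value).__name__ + size
    return summary

# Stand-in payload (assumption); pass the real `data` object instead.
fake_payload = {"qa_pairs": [("halo", "Halo juga!")], "bm25": object()}
for key, description in summarize_payload(fake_payload).items():
    print(f"{key}: {description}")
```
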
## Performance

- Query latency: < 10 ms
- Accuracy: high for paraphrase matching
- Memory: ~3 MB

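To check the latency figure on your own hardware, a timing loop along these lines is one option. This is a sketch under assumptions: `median_latency_ms` is a hypothetical helper, and `dummy_answer` is a placeholder for whatever lookup function you build on top of the loaded model.

```python
import statistics
import time

def median_latency_ms(answer_fn, queries, repeats=20):
    """Median wall-clock time per call to answer_fn, in milliseconds."""
    samples = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            answer_fn(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

def dummy_answer(query):
    # Placeholder (assumption); swap in the real retrieval function.
    return query.lower()

latency = median_latency_ms(dummy_answer, ["siapa kamu", "apa kabar"])
print(f"median latency: {latency:.3f} ms")
```

The median is used rather than the mean so that a single slow call, for example the first one after a cold start, does not dominate the result.
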
## Credits

Created by Lyon28