---
title: Chatbot Caca
emoji: 💬
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
datasets:
- Lyon28/datasets-caca-3500
language:
- id
tags:
- retrieval
- qa
- indonesian
- bm25
- tfidf
---

   # Chatbot Caca - Retrieval-Based QA
   
   A chatbot based on BM25 + TF-IDF for question answering in Indonesian.
   
   ## Model Details
   
   - **Type:** Retrieval-based QA System
   - **Size:** 2.69 MB
   - **Algorithm:** Hybrid BM25 + TF-IDF + Fuzzy Matching
   - **Dataset:** datasets-caca-3500 (3,500 QA pairs)
   - **Language:** Indonesian
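
The hybrid scoring named above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code shipped in `chatbot_caca.pkl`: the `HybridRetriever` class, the whitespace tokenizer, the weights `(0.5, 0.3, 0.2)`, the BM25 parameters `k1` and `b`, the `(question, answer)` tuple layout, and the sample QA pairs are all assumptions made for the example.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    # Naive lowercase whitespace tokenizer (assumption; the real one is not documented).
    return text.lower().split()


class HybridRetriever:
    """Hypothetical hybrid scorer: BM25 + TF-IDF cosine + fuzzy string ratio."""

    def __init__(self, qa_pairs, k1=1.5, b=0.75, weights=(0.5, 0.3, 0.2)):
        self.qa_pairs = qa_pairs  # assumed layout: list of (question, answer) tuples
        self.k1, self.b, self.weights = k1, b, weights
        self.docs = [tokenize(q) for q, _ in qa_pairs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        df = Counter(term for d in self.docs for term in set(d))
        # BM25-style idf; the +1 inside the log keeps it positive.
        self.idf = {t: math.log(1 + (n - c + 0.5) / (c + 0.5)) for t, c in df.items()}
        self.doc_vecs = [self._tfidf(d) for d in self.docs]

    def _tfidf(self, tokens):
        # L2-normalised TF-IDF vector stored as a sparse dict.
        tf = Counter(tokens)
        vec = {t: f * self.idf.get(t, 0.0) for t, f in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        return {t: v / norm for t, v in vec.items()}

    def _bm25(self, query, doc):
        tf = Counter(doc)
        score = 0.0
        for t in query:
            if t in tf:
                f = tf[t]
                denom = f + self.k1 * (1 - self.b + self.b * len(doc) / self.avgdl)
                score += self.idf[t] * f * (self.k1 + 1) / denom
        return score

    def answer(self, question):
        query = tokenize(question)
        qvec = self._tfidf(query)
        bm25 = [self._bm25(query, d) for d in self.docs]
        top = max(bm25) or 1.0  # scale BM25 scores into [0, 1]
        w_bm25, w_cos, w_fuzzy = self.weights
        best_idx, best_score = 0, -1.0
        for i, (stored_q, _) in enumerate(self.qa_pairs):
            cosine = sum(w * self.doc_vecs[i].get(t, 0.0) for t, w in qvec.items())
            fuzzy = SequenceMatcher(None, question.lower(), stored_q.lower()).ratio()
            score = w_bm25 * bm25[i] / top + w_cos * cosine + w_fuzzy * fuzzy
            if score > best_score:
                best_idx, best_score = i, score
        return self.qa_pairs[best_idx][1], best_score


# Made-up sample data, for illustration only.
pairs = [
    ("siapa nama kamu", "Nama saya Caca."),
    ("apa kabar hari ini", "Kabar saya baik, terima kasih."),
    ("jam berapa sekarang", "Maaf, saya tidak tahu jam berapa sekarang."),
]
bot = HybridRetriever(pairs)
reply, score = bot.answer("nama kamu siapa")
print(reply, round(score, 3))
```

Combining the three signals lets a reordered or slightly misspelled question still match: BM25 and TF-IDF reward shared terms, while the fuzzy ratio tolerates character-level differences.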
   
   ## Usage
   
   ```python
   # Install dependencies first (shell command; prefix with "!" in a notebook):
   #   pip install rank-bm25 scikit-learn huggingface-hub
   import pickle

   from huggingface_hub import hf_hub_download

   # Download the pickled model from the Hugging Face Hub
   model_path = hf_hub_download(
       repo_id="Lyon28/caca-based-chatbot",
       filename="chatbot_caca.pkl"
   )

   # Load the model. pickle can execute arbitrary code, so only load trusted files.
   with open(model_path, 'rb') as f:
       data = pickle.load(f)

   print(f"Loaded {len(data['qa_pairs'])} QA pairs!")
   ```
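
Only the `qa_pairs` key is documented here, so it can be useful to inspect the loaded object before relying on anything else in it. The helper below is a small sketch for that purpose. `describe_payload` is a hypothetical name, and the `example` dict is a stand-in with invented contents, not the real pickle.

```python
def describe_payload(data):
    """Return {key: (type name, length or None)} for a loaded dict."""
    summary = {}
    for key, value in data.items():
        size = len(value) if hasattr(value, "__len__") else None
        summary[key] = (type(value).__name__, size)
    return summary


# Stand-in for the object returned by pickle.load (hypothetical contents).
example = {"qa_pairs": [("halo", "Halo juga!")], "version": 1}
summary = describe_payload(example)
for key, (kind, size) in summary.items():
    print(f"{key}: {kind}, len={size}")
```

With the real model, pass the `data` object from the snippet above in place of `example`.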
   
   ## Performance
   
   - Query speed: < 10 ms
   - Accuracy: High for paraphrase matching
   - Memory: ~3 MB
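
To check the query-speed figure on your own hardware, a timing loop along these lines may help. It is only a sketch: `median_latency_ms` and `dummy_answer` are hypothetical names, and `dummy_answer` stands in for whatever function answers a question in your setup.

```python
import statistics
import time


def median_latency_ms(fn, queries, repeats=5):
    """Call fn on every query `repeats` times; return the median latency in ms."""
    samples = []
    for _ in range(repeats):
        for q in queries:
            start = time.perf_counter()
            fn(q)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def dummy_answer(question):
    # Placeholder for the real retrieval call (assumption).
    return question.upper()


latency = median_latency_ms(dummy_answer, ["apa kabar", "siapa kamu"])
print(f"median latency: {latency:.3f} ms")
```

The median is used rather than the mean so that a single slow call (for example, a cold cache on the first query) does not dominate the result.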
   
   ## Credits
   
   Created by Lyon28