Lyon28
/

Caca-Chatbot

Model card Files Files and versions

Caca-Chatbot / model_card.md

Lyon28's picture

Create model_card.md

d646643 verified 3 months ago

|

history blame contribute delete

1 kB

	# Model Card: Chatbot Caca Retrieval

	## Model Description

	Lightweight retrieval-based QA system untuk Bahasa Indonesia.

	### Training Data

	- Source: datasets-caca-3500
	- Size: 3,500 conversational QA pairs
	- Language: Indonesian
	- Format: User-Assistant conversations

	### Architecture

	- Algorithm: Hybrid scoring system
	- BM25 (40% weight) - keyword matching
	- TF-IDF + Cosine Similarity (50% weight) - semantic matching
	- Fuzzy String Matching (10% weight) - typo tolerance

	### Performance

	\| Metric \| Value \|
	\|--------\|-------\|
	\| Model Size \| 2.69 MB \|
	\| Query Latency \| <10 ms \|
	\| Memory Usage \| ~5 MB RAM \|
	\| Paraphrase Accuracy \| High \|

	### Limitations

	- Only works for questions in dataset or similar paraphrases
	- No generative capability
	- Limited to Indonesian language

	### Ethical Considerations

	- Responses reflect training data (datasets-caca-3500)
	- Personality may include sarcasm/humor
	- Not suitable for critical applications

	### License

	MIT License