---
title: Wiki Tools
emoji: 😻
colorFrom: purple
colorTo: blue
sdk: gradio
sdk_version: 5.28.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Conversational space enhanced with Viquipedia RAG
---
# Wiki Tools
An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
This space demonstrates how to build a conversational AI application enhanced with Retrieval-Augmented Generation (RAG) using a Vector Database (VectorDB) built from Viquipedia articles.
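The RAG flow can be sketched in a few lines: retrieve relevant passages, then prepend them to the chat prompt. The sketch below stubs the retriever in place of the real VectorDB lookup; all names (`retrieve`, `build_prompt`, the sample corpus) are illustrative, not this app's actual API.

```python
# Minimal sketch of the RAG flow, with a stubbed retriever standing in
# for the real embedding + VectorDB search (illustrative names only).

def retrieve(query: str, k: int = 2) -> list[str]:
    """Stub: the real app would embed `query` and search the Viquipedia VectorDB."""
    corpus = [
        "Barcelona és una ciutat de Catalunya.",
        "El Montseny és un massís del Sistema Prelitoral.",
    ]
    return corpus[:k]

def build_prompt(query: str) -> list[dict]:
    """Assemble a chat prompt whose system message carries the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    return [
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": query},
    ]

messages = build_prompt("Què és el Montseny?")
print(len(messages))  # system message with context + user message
```

In the actual Space, the resulting message list would be sent to the LLM endpoint listed below.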
### Current resources
- **VectorDB mRoBERTA**: 9.9 GB / 2.4M vectors — [langtech-innovation/mRoberta_experimental_ViquipediaVectorStore](https://huggingface.co/langtech-innovation/mRoberta_experimental_ViquipediaVectorStore)
- **Embedding model mRoBERTA**: [langtech-innovation/sentence-mRoBERTa-v0](https://huggingface.co/langtech-innovation/sentence-mRoBERTa-v0)
- **LLM endpoint model**: [Salamandra-7B-Instruct-Tools-16k](https://huggingface.co/BSC-LT/salamandra-7b-instruct-tools-16k)

The model is served through an HF Inference Endpoint: https://endpoints.huggingface.co/BSC-LT/endpoints/salamandra-7b-instruct-tools-16k
<!-- Disclaimer -->
> [!WARNING]
> **DISCLAIMER:** This model is an **experimental version** and is provided for **research purposes only**.
> Access is **not public**.
> Please do not share.
### Alternative resources
Alternative embedding and VectorDB resources can also be configured. Set the `VS_HF_PATH=langtech-innovation/vdb-cawiki-v3` and `EMBEDDINGS_MODEL=BAAI/bge-m3` variables to switch; the system will then use:
- **VectorDB BGE-M3**: 12.3 GB / 2.4M vectors — [langtech-innovation/vdb-cawiki-v3](https://huggingface.co/langtech-innovation/vdb-cawiki-v3)
- **Embedding model BGE-M3**: [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
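For example, in the Space's environment settings (or a local shell before launching `app.py`), the switch might look like this; the variable names are the ones documented above, while the `echo` line is just a sanity check:

```shell
# Point the app at the BGE-M3 vector store and embedding model
export VS_HF_PATH=langtech-innovation/vdb-cawiki-v3
export EMBEDDINGS_MODEL=BAAI/bge-m3

echo "Using $EMBEDDINGS_MODEL against $VS_HF_PATH"
```

Unset both variables (or restore the defaults) to return to the mRoBERTa resources.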