🔧 Troubleshooting Guide
Common Issues & Solutions
1. Setup Issues
"ModuleNotFoundError: No module named 'streamlit'"
Problem: Dependencies not installed
Solution:
source .venv/bin/activate
pip install -r requirements.txt
"python3: command not found"
Problem: Python not installed or not in PATH
Solution:
# Install Python 3.8+
# macOS: brew install python3
# Ubuntu/Debian: sudo apt install python3
# Windows: Download from python.org
# Verify:
python3 --version
"virtualenv not found"
Problem: venv module missing
Solution:
# Install it:
# macOS: brew install python3-venv
# Ubuntu: sudo apt install python3-venv
# Then recreate venv:
python3 -m venv .venv
2. Dataset Building Issues
"No article URLs found"
Problem: Website structure changed or connection failed
Solution:
# Check internet connection
ping community.sap.com
# Try rebuilding with debug
python tools/build_dataset.py
# Check if data directory exists
ls -la data/
"Connection timeout"
Problem: Website taking too long to respond
Solution:
# Modify timeout in tools/build_dataset.py:
# Change: timeout=10
# To: timeout=30
# Or add delay
import time
time.sleep(5) # Between requests
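The timeout-and-delay advice above can be combined into one retry helper. This is a minimal sketch using only the standard library; the function name and defaults are illustrative, not part of tools/build_dataset.py:

```python
import time
import urllib.error
import urllib.request

def fetch_with_retry(url, timeout=30, retries=3, delay=5):
    """Fetch a URL, retrying with a fixed delay between attempts."""
    last_err = None
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except urllib.error.URLError as err:
            last_err = err
            if attempt < retries - 1:
                time.sleep(delay)  # be polite between requests
    raise last_err
```

The delay between attempts doubles as rate limiting, so the scraper is less likely to be blocked by the site in the first place.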
"Permission denied" error
Problem: Can't write to data directory
Solution:
# Fix permissions
mkdir -p data
chmod 755 data/
# Or run with sudo (not recommended)
sudo python tools/build_dataset.py
3. Embeddings/Index Issues
"ModuleNotFoundError: No module named 'faiss'"
Problem: FAISS not installed correctly
Solution:
pip uninstall faiss-cpu
pip install faiss-cpu --no-cache-dir
# Or use GPU version if available:
# pip install faiss-gpu
"CUDA error" / "GPU not found"
Problem: GPU version installed but no GPU available
Solution:
# Use CPU version instead
pip uninstall faiss-gpu
pip install faiss-cpu
"MemoryError during embeddings"
Problem: System ran out of memory
Solution:
# In tools/embeddings.py, reduce batch size:
# Change: batch_size=32
# To: batch_size=8 or 4
# Or use a smaller embeddings model:
# Change: model_name="all-MiniLM-L6-v2"
# To: model_name="paraphrase-MiniLM-L3-v2"  # fewer layers, lower memory
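The batch-size change above amounts to encoding the corpus in smaller chunks so only one small batch is in memory at a time. A sketch of that loop, assuming an `encode` callable in the style of SentenceTransformer.encode (the helper name is illustrative, not from tools/embeddings.py):

```python
def embed_in_batches(texts, encode, batch_size=8):
    """Encode texts in small batches to cap peak memory usage."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        # Only batch_size texts are embedded at once
        vectors.extend(encode(texts[start:start + batch_size]))
    return vectors
```

Halving batch_size roughly halves the peak activation memory of the encoder at the cost of more forward passes, which is usually the right trade on a memory-constrained machine.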
"Index not found" error
Problem: RAG index not built
Solution:
# Rebuild the index
python tools/embeddings.py
# Verify files exist
ls -la data/rag_index.faiss
ls -la data/rag_metadata.pkl
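The same existence check can run at app startup so a missing index fails fast with a hint instead of a cryptic load error. A sketch (the `check_index` helper is hypothetical; the file paths come from the ls commands above):

```python
from pathlib import Path

REQUIRED_INDEX_FILES = ("data/rag_index.faiss", "data/rag_metadata.pkl")

def check_index(paths=REQUIRED_INDEX_FILES):
    """Raise a helpful error if any RAG index file is missing."""
    missing = [p for p in paths if not Path(p).exists()]
    if missing:
        raise FileNotFoundError(
            "Missing index files: %s. Run 'python tools/embeddings.py' to rebuild."
            % ", ".join(missing)
        )
```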
4. LLM Provider Issues
Ollama
"ConnectionRefusedError: [Errno 111] Connection refused"
# Ollama server not running
# Start it in a new terminal:
ollama serve
# Or use nohup to background it:
nohup ollama serve &
"Model not found"
# Pull the model first:
ollama pull mistral
# Or
ollama pull neural-chat
ollama pull dolphin-mixtral
# List available models:
ollama list
"Out of memory"
# Use a smaller model:
ollama pull neural-chat  # or another lighter model
# Or configure in config.py:
DEFAULT_MODEL = "neural-chat"
Replicate
"REPLICATE_API_TOKEN not set"
# Set token in terminal:
export REPLICATE_API_TOKEN="your_token_here"
# Or add to .env:
REPLICATE_API_TOKEN=your_token_here
# Verify:
echo $REPLICATE_API_TOKEN
"401 Unauthorized"
# Token is invalid or expired
# 1. Get new token from https://replicate.com/account
# 2. Update environment variable
# 3. Try again
"Rate limit exceeded"
# Wait a bit, then try again
# Or use Ollama/HuggingFace instead
HuggingFace
"HF_API_TOKEN not set"
# Set token:
export HF_API_TOKEN="your_token_here"
# Or add to .env:
HF_API_TOKEN=your_token_here
# Verify:
echo $HF_API_TOKEN
"Model not found" on HuggingFace
# Verify model ID exists:
# Go to https://huggingface.co/models
# Find a text-generation model
# Example: mistralai/Mistral-7B-Instruct-v0.1
# Update config:
LLM_MODEL="mistralai/Mistral-7B-Instruct-v0.1"
5. Streamlit Issues
"streamlit: command not found"
Problem: Streamlit not installed
Solution:
source .venv/bin/activate
pip install "streamlit>=1.28.0"
Port 8501 already in use
Problem: Another app is using port 8501
Solution:
# Use different port:
streamlit run app.py --server.port 8502
# Or kill the process using 8501:
lsof -i :8501 # See what's using it
kill -9 <PID> # Kill it
"Cache resource initialization failed"
Problem: Session state issue
Solution:
# Clear Streamlit cache:
rm -rf ~/.streamlit/cache/
# Restart the app:
streamlit run app.py
App not responding / frozen
Problem: Long-running operation blocking the UI
Solution:
# Wait for current operation to complete
# Or restart:
# 1. Press Ctrl+C
# 2. Run: streamlit run app.py again
6. Runtime Issues
"Empty search results"
Problem: No relevant documents found
Solution:
# 1. Verify dataset exists:
ls -la data/sap_dataset.json
# 2. Verify index exists:
ls -la data/rag_index.faiss
# 3. Try a different query:
# "SAP Basis administration" instead of "help"
# 4. Rebuild dataset:
python tools/build_dataset.py
python tools/embeddings.py
"Very slow responses"
Problem: LLM taking too long
Solution:
# Use a faster model in config.py:
DEFAULT_MODEL = "neural-chat"  # lighter models respond faster
# Or use cloud provider (usually faster):
LLM_PROVIDER = "replicate"
"Inaccurate or irrelevant answers"
Problem: RAG not finding good sources, or LLM quality too low
Solution:
# 1. Improve RAG:
# In config.py, increase sources:
RAG_TOP_K = 10 # From 5
# 2. Use better embeddings:
EMBEDDINGS_MODEL = "all-mpnet-base-v2" # Better quality
# 3. Use better LLM:
DEFAULT_MODEL = "mistral" # From neural-chat
# 4. Rebuild index:
python tools/embeddings.py
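To see what RAG_TOP_K controls: retrieval scores every chunk against the query embedding and keeps only the k best. A pure-Python sketch of that ranking (the real pipeline uses FAISS; these function names are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, doc_vecs, k=10):
    """Return the k (index, score) pairs most similar to the query."""
    scored = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Raising k gives the LLM more context to draw on, at the cost of a longer prompt and more chance of off-topic chunks sneaking in.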
"API rate limit exceeded"
Problem: Using a cloud provider too frequently
Solution:
# 1. Wait a bit
# 2. Use Ollama (no rate limits)
# 3. Or try different cloud provider
7. Configuration Issues
"Settings not taking effect"
Problem: Configuration changes not applied
Solution:
# 1. Make sure you edited the right file:
cat .env
# 2. Restart the app:
# Ctrl+C and run again
# 3. Clear cache:
rm -rf ~/.streamlit/cache/
streamlit run app.py
"Environment variables not loading"
Problem: .env file not being read
Solution:
# Verify in app.py or config.py:
# from dotenv import load_dotenv
# load_dotenv() # Must be called
# Or set manually:
export VAR_NAME="value"
streamlit run app.py
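If you want to see what load_dotenv() actually does, here is a minimal standard-library stand-in (illustrative only; use python-dotenv in the real app — it handles quoting, export prefixes, and interpolation that this sketch skips):

```python
import os

def load_env_file(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv()."""
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue  # skip blanks, comments, malformed lines
                key, _, value = line.partition("=")
                # Existing environment variables win, as with load_dotenv()
                os.environ.setdefault(key.strip(), value.strip().strip('"'))
    except FileNotFoundError:
        pass  # a missing .env file is not an error
```

Note the setdefault: values already exported in your shell take precedence over the file, which is also python-dotenv's default behavior.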
8. Performance Issues
"High CPU usage"
Problem: Embeddings or search consuming CPU
Solution:
# Use batch processing in embeddings.py:
# Already optimized with batch_size=32
# Or use pre-built index (don't rebuild often)
"High memory usage"
Problem: Large dataset or model in memory
Solution:
# Use lighter model in config.py:
EMBEDDINGS_MODEL = "all-MiniLM-L6-v2"
# Reduce chunk size:
RAG_CHUNK_SIZE = 256 # From 512
# Use a lighter Ollama model:
ollama pull neural-chat
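To see why reducing RAG_CHUNK_SIZE lowers memory: smaller chunks mean shorter texts per embedding and a shorter prompt per retrieved source. A sketch of word-based chunking with overlap (the helper name and overlap default are illustrative, not from the project code):

```python
def chunk_text(text, chunk_size=256, overlap=32):
    """Split text into overlapping word-window chunks."""
    words = text.split()
    step = max(1, chunk_size - overlap)  # advance by chunk minus overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), step)]
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.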
"Slow search"
Problem: FAISS search taking too long
Solution:
# Should be fast already, but:
# 1. Reduce results:
RAG_TOP_K = 3 # From 5
# 2. Check if index is corrupted:
# Rebuild it:
python tools/embeddings.py
9. Deployment Issues
Streamlit Cloud deployment fails
Problem: Missing secrets or dependencies
Solution:
# 1. Add secrets in Streamlit Cloud:
# Settings → Secrets
# LLM_PROVIDER=replicate
# REPLICATE_API_TOKEN=xxx
# 2. Make sure requirements.txt is in repo
# 3. Commit data files or download on deploy
# 4. Check build logs:
# Deploy → Manage app → Logs
Docker container issues
Problem: Can't build or run the Docker image
Solution:
# Create a Dockerfile (if one doesn't exist), then:
docker build -t sap-chatbot .
docker run -p 8501:8501 sap-chatbot
10. Data Issues
"Dataset is outdated"
Problem: Knowledge base needs a refresh
Solution:
# Rebuild dataset:
rm data/sap_dataset.json
python tools/build_dataset.py
python tools/embeddings.py
# Takes 10-15 minutes but gets latest content
"Too much data (slow startup)"
Problem: Large dataset causing slow startup
Solution:
# Limit dataset in build_dataset.py:
# Change: for repo in repos (all repos)
# To: for repo in repos[:10] (first 10 only)
# Or reduce sources scraped
"Data format error"
Problem: JSON file corrupted
Solution:
# Verify JSON:
python -c "import json; json.load(open('data/sap_dataset.json'))"
# If error, rebuild:
rm data/sap_dataset.json
python tools/build_dataset.py
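The one-liner above can be expanded into a small validator that also reports how many records loaded. This sketch assumes the dataset file is a JSON list of records (the helper name is illustrative):

```python
import json

def validate_dataset(path="data/sap_dataset.json"):
    """Load the dataset and return the record count; raise on bad data."""
    with open(path, encoding="utf-8") as fh:
        records = json.load(fh)  # raises json.JSONDecodeError if corrupted
    if not isinstance(records, list) or not records:
        raise ValueError("Expected a non-empty JSON list in %s" % path)
    return len(records)
```

A zero or surprisingly small count is a useful early warning that a rebuild was cut short, even when the JSON itself parses cleanly.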
Quick Diagnosis
System Check Script
#!/bin/bash
echo "SAP Chatbot System Check"
echo "========================"
echo ""
echo "1. Python:"
python3 --version
echo ""
echo "2. Virtual Environment:"
if [ -d ".venv" ]; then
    echo "✅ Exists"
else
    echo "❌ Missing"
fi
echo ""
echo "3. Dependencies:"
pip list | grep -E "streamlit|transformers|faiss|ollama"
echo ""
echo "4. Dataset:"
ls -lh data/sap_dataset.json 2>/dev/null || echo "❌ Not found"
echo ""
echo "5. Index:"
ls -lh data/rag_index.faiss 2>/dev/null || echo "❌ Not found"
echo ""
echo "6. .env file:"
[ -f ".env" ] && echo "✅ Exists" || echo "❌ Missing"
echo ""
echo "7. Ollama:"
curl -s http://localhost:11434/ > /dev/null && echo "✅ Running" || echo "❌ Not running"
echo ""
echo "Check complete!"
Save as check_system.sh and run:
bash check_system.sh
Getting Help
- Check this guide - Most issues documented
- Read GETTING_STARTED.md - Step-by-step setup
- Check README.md - Architecture & concepts
- Check config.py - All configuration options
- Look at code - Well-commented Python files
- Open GitHub issue - Report bugs with details
Debug Mode
Enable debug logging:
# In app.py or any module:
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
logger.debug("Debug message here")
Then run:
streamlit run app.py --logger.level=debug
Still stuck? Check the GitHub issues or create a new one with:
- Python version
- OS (Windows/Mac/Linux)
- Error message (full traceback)
- Steps to reproduce
- What you've already tried
Good luck! 🚀