# Quick Launch Guide
## ✅ System Status

Your HPMOR chatbot is fully configured and ready to use!

- ✅ Ollama installed with both models (llama3.2:3b & llama3.1:8b)
- ✅ Groq API configured
- ✅ HPMOR document processed (69 chunks)
- ✅ Vector database created
- ✅ Harry Potter personality integrated
## 🚀 Launch the Chat Interface

Simply run:

```bash
uv run python main.py chat
```

The interface will be available at: http://localhost:7860
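If you want to confirm from a script that the interface is actually serving, a minimal stdlib check like this works (it assumes the default port above; the helper name is ours, not part of the project):

```python
import urllib.request
import urllib.error

def server_is_up(url: str = "http://localhost:7860", timeout: float = 2.0) -> bool:
    """Return True if something responds at `url` within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or DNS failure: server not reachable.
        return False
```

Run it after `main.py chat` starts; a `False` result usually means the server is still starting or the port is taken.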
## 💬 How to Chat with Harry
Harry will respond in character as Harry James Potter-Evans-Verres from HPMOR. He will:
- Apply rational thinking and the scientific method
- Reference his experiments and discoveries
- Use precise, analytical language
- Question assumptions and explore possibilities
- Cite relevant context from the book
**Example Questions:**
- "Harry, what do you think about magic?"
- "Can you explain your approach to learning spells?"
- "What's your opinion on Dumbledore?"
- "Tell me about your friendship with Hermione"
- "How do you apply rationality to solve problems?"
## 🎯 Model Selection
The system automatically chooses the best model:
| Complexity | Model Used | Use Case |
|---|---|---|
| Simple | Llama 3.2 3B (local) | Quick factual questions |
| Moderate | Llama 3.1 8B (local) | Analysis and reasoning |
| Complex | Llama 3.3 70B (Groq) | Deep reasoning, creativity |
You can also manually select models in the UI!
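In code, the routing in the table amounts to a lookup from a complexity label to a model name. This is an illustrative sketch only (the real heuristic lives in `main.py`; model identifiers here follow the table, and the Groq model ID is an assumption):

```python
def pick_model(complexity: str) -> str:
    """Map a complexity label to a model, per the routing table above."""
    routes = {
        "simple": "llama3.2:3b",      # local: quick factual questions
        "moderate": "llama3.1:8b",    # local: analysis and reasoning
        "complex": "llama-3.3-70b",   # Groq: deep reasoning, creativity
    }
    # Unknown labels fall back to the fastest local model.
    return routes.get(complexity.lower(), "llama3.2:3b")
```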
## 🔧 Troubleshooting

**Ollama not working?**

```bash
# Start Ollama service
ollama serve
```

**Want to rebuild the index?**

```bash
uv run python main.py setup --force
```

**Check system status:**

```bash
uv run python main.py check
```
## 📊 Performance
- Local (3B): ~50 tokens/sec, instant startup
- Local (8B): ~25 tokens/sec, 1-2s startup
- Groq (70B): ~150+ tokens/sec, network latency
With your M4 Max (48GB RAM), both local models run smoothly!
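As a rough back-of-envelope, the throughput figures above translate into response times like this (a hypothetical helper; the numbers are ballpark and ignore Groq's network latency):

```python
def est_response_seconds(tokens: int, tokens_per_sec: float, startup_s: float = 0.0) -> float:
    """Rough response time: startup overhead plus generation time."""
    return startup_s + tokens / tokens_per_sec

# A ~300-token answer:
#   local 3B:  est_response_seconds(300, 50)     -> 6.0 s
#   local 8B:  est_response_seconds(300, 25, 2)  -> 14.0 s
#   Groq 70B:  est_response_seconds(300, 150)    -> 2.0 s (+ network latency)
```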
Have fun chatting with Harry! 🧙‍♂️✨