
Quick Launch Guide

✅ System Status

Your HPMOR chatbot is fully configured and ready to use!

  • ✅ Ollama installed with both models (llama3.2:3b & llama3.1:8b)
  • ✅ Groq API configured
  • ✅ HPMOR document processed (69 chunks)
  • ✅ Vector database created
  • ✅ Harry Potter personality integrated
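At query time, the vector database is what lets the bot pull relevant HPMOR passages: the question is embedded and compared against the 69 stored chunk embeddings. A minimal toy illustration of that retrieval step, using made-up 3-dimensional vectors rather than the project's actual embeddings or vector store:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, chunk_vecs, top_k=3):
    """Return indices of the top_k chunks most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:top_k]

# Toy stand-ins for chunk embeddings; real embeddings have hundreds of dimensions.
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(retrieve([1.0, 0.05, 0.0], chunks, top_k=2))  # → [0, 1]
```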

🚀 Launch the Chat Interface

Simply run:

uv run python main.py chat

The interface will be available at: http://localhost:7860

💬 How to Chat with Harry

Harry will respond in character as Harry James Potter-Evans-Verres from HPMOR. He will:

  • Apply rational thinking and the scientific method
  • Reference his experiments and discoveries
  • Use precise, analytical language
  • Question assumptions and explore possibilities
  • Cite relevant context from the book
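Traits like these are typically injected through a system prompt that is combined with the retrieved book passages. A hypothetical sketch of how that assembly might look (the function name and wording are illustrative, not the project's actual prompt):

```python
# In-character behaviors, mirroring the list above.
TRAITS = [
    "Apply rational thinking and the scientific method",
    "Reference your experiments and discoveries",
    "Use precise, analytical language",
    "Question assumptions and explore possibilities",
]

def build_system_prompt(context_chunks):
    """Assemble a character prompt plus retrieved HPMOR context."""
    traits = "\n".join(f"- {t}" for t in TRAITS)
    context = "\n\n".join(context_chunks)
    return (
        "You are Harry James Potter-Evans-Verres from HPMOR.\n"
        f"Stay in character and:\n{traits}\n\n"
        f"Relevant passages from the book:\n{context}"
    )

prompt = build_system_prompt(["(retrieved chunk text goes here)"])
```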

Example Questions:

  • "Harry, what do you think about magic?"
  • "Can you explain your approach to learning spells?"
  • "What's your opinion on Dumbledore?"
  • "Tell me about your friendship with Hermione"
  • "How do you apply rationality to solve problems?"

🎯 Model Selection

The system automatically chooses the best model:

| Complexity | Model Used            | Use Case                    |
|------------|-----------------------|-----------------------------|
| Simple     | Llama 3.2 3B (local)  | Quick factual questions     |
| Moderate   | Llama 3.1 8B (local)  | Analysis and reasoning      |
| Complex    | Llama 3.3 70B (Groq)  | Deep reasoning, creativity  |

You can also manually select models in the UI!
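The routing can be pictured as a small classifier in front of the model table. The sketch below uses question length as a stand-in heuristic; the real classifier, and the exact Groq model identifier, are assumptions:

```python
def classify_complexity(question: str) -> str:
    """Crude heuristic: longer questions get bigger models.
    Illustrative only; the real router may use different signals."""
    words = len(question.split())
    if words <= 8:
        return "simple"
    if words <= 20:
        return "moderate"
    return "complex"

MODEL_FOR = {
    "simple": "llama3.2:3b",                # local via Ollama
    "moderate": "llama3.1:8b",              # local via Ollama
    "complex": "llama-3.3-70b-versatile",   # hosted on Groq (ID assumed)
}

def pick_model(question: str) -> str:
    """Map a question to the model tier from the table above."""
    return MODEL_FOR[classify_complexity(question)]

print(pick_model("What is a Patronus?"))  # → llama3.2:3b
```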

🔧 Troubleshooting

Ollama not working?

# Start Ollama service
ollama serve

Want to rebuild the index?

uv run python main.py setup --force

Check system status:

uv run python main.py check

📊 Performance

  • Local (3B): ~50 tokens/sec, instant startup
  • Local (8B): ~25 tokens/sec, 1-2s startup
  • Groq (70B): ~150+ tokens/sec, network latency
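These numbers translate directly into expected response times. A back-of-the-envelope calculation using the figures above (generation time only; Groq adds network latency on top):

```python
def estimated_seconds(tokens: int, rate_tps: float, startup_s: float = 0.0) -> float:
    """Rough response-time estimate: startup cost plus generation time."""
    return startup_s + tokens / rate_tps

# A 300-token answer on each backend, using the rates listed above.
print(round(estimated_seconds(300, 50), 1))       # local 3B  → 6.0 s
print(round(estimated_seconds(300, 25, 1.5), 1))  # local 8B  → 13.5 s
print(round(estimated_seconds(300, 150), 1))      # Groq 70B  → 2.0 s + network latency
```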

With your M4 Max (48GB RAM), both local models run smoothly!


Have fun chatting with Harry! 🧙‍♂️✨