
Quick Launch Guide

✅ System Status

Your HPMOR chatbot is fully configured and ready to use!

  • ✅ Ollama installed with both models (llama3.2:3b & llama3.1:8b)
  • ✅ Groq API configured
  • ✅ HPMOR document processed (69 chunks)
  • ✅ Vector database created
  • ✅ Harry Potter personality integrated
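At query time, the vector database is what lets the bot pull relevant HPMOR passages: the question is embedded and compared against the 69 stored chunk embeddings. A minimal toy illustration of that retrieval step, using made-up 3-dimensional vectors rather than the project's actual embeddings or vector store:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, chunk_vecs, top_k=3):
    """Return indices of the top_k chunks most similar to the query."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:top_k]

# Toy stand-ins for chunk embeddings; real embeddings have hundreds of dimensions.
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
print(retrieve([1.0, 0.05, 0.0], chunks, top_k=2))  # → [0, 1]
```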

🚀 Launch the Chat Interface

Simply run:

uv run python main.py chat

The interface will be available at: http://localhost:7860

💬 How to Chat with Harry

Harry will respond in character as Harry James Potter-Evans-Verres from HPMOR. He will:

  • Apply rational thinking and the scientific method
  • Reference his experiments and discoveries
  • Use precise, analytical language
  • Question assumptions and explore possibilities
  • Cite relevant context from the book
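Traits like these are typically injected through a system prompt that is combined with the retrieved book passages. A hypothetical sketch of how that assembly might look (the function name and wording are illustrative, not the project's actual prompt):

```python
# In-character behaviors, mirroring the list above.
TRAITS = [
    "Apply rational thinking and the scientific method",
    "Reference your experiments and discoveries",
    "Use precise, analytical language",
    "Question assumptions and explore possibilities",
]

def build_system_prompt(context_chunks):
    """Assemble a character prompt plus retrieved HPMOR context."""
    traits = "\n".join(f"- {t}" for t in TRAITS)
    context = "\n\n".join(context_chunks)
    return (
        "You are Harry James Potter-Evans-Verres from HPMOR.\n"
        f"Stay in character and:\n{traits}\n\n"
        f"Relevant passages from the book:\n{context}"
    )

prompt = build_system_prompt(["(retrieved chunk text goes here)"])
```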

Example Questions:

  • "Harry, what do you think about magic?"
  • "Can you explain your approach to learning spells?"
  • "What's your opinion on Dumbledore?"
  • "Tell me about your friendship with Hermione"
  • "How do you apply rationality to solve problems?"

🎯 Model Selection

The system automatically chooses the best model:

| Complexity | Model Used            | Use Case                    |
|------------|-----------------------|-----------------------------|
| Simple     | Llama 3.2 3B (local)  | Quick factual questions     |
| Moderate   | Llama 3.1 8B (local)  | Analysis and reasoning      |
| Complex    | Llama 3.3 70B (Groq)  | Deep reasoning, creativity  |

You can also manually select models in the UI!
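The routing can be pictured as a small classifier in front of the model table. The sketch below uses question length as a stand-in heuristic; the real classifier, and the exact Groq model identifier, are assumptions:

```python
def classify_complexity(question: str) -> str:
    """Crude heuristic: longer questions get bigger models.
    Illustrative only; the real router may use different signals."""
    words = len(question.split())
    if words <= 8:
        return "simple"
    if words <= 20:
        return "moderate"
    return "complex"

MODEL_FOR = {
    "simple": "llama3.2:3b",                # local via Ollama
    "moderate": "llama3.1:8b",              # local via Ollama
    "complex": "llama-3.3-70b-versatile",   # hosted on Groq (ID assumed)
}

def pick_model(question: str) -> str:
    """Map a question to the model tier from the table above."""
    return MODEL_FOR[classify_complexity(question)]

print(pick_model("What is a Patronus?"))  # → llama3.2:3b
```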

🔧 Troubleshooting

Ollama not working?

# Start Ollama service
ollama serve

Want to rebuild the index?

uv run python main.py setup --force

Check system status:

uv run python main.py check

📊 Performance

  • Local (3B): ~50 tokens/sec, instant startup
  • Local (8B): ~25 tokens/sec, 1-2s startup
  • Groq (70B): ~150+ tokens/sec, network latency
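These numbers translate directly into expected response times. A back-of-the-envelope calculation using the figures above (generation time only; Groq adds network latency on top):

```python
def estimated_seconds(tokens: int, rate_tps: float, startup_s: float = 0.0) -> float:
    """Rough response-time estimate: startup cost plus generation time."""
    return startup_s + tokens / rate_tps

# A 300-token answer on each backend, using the rates listed above.
print(round(estimated_seconds(300, 50), 1))       # local 3B  → 6.0 s
print(round(estimated_seconds(300, 25, 1.5), 1))  # local 8B  → 13.5 s
print(round(estimated_seconds(300, 150), 1))      # Groq 70B  → 2.0 s + network latency
```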

With your M4 Max (48GB RAM), both local models run smoothly!


Have fun chatting with Harry! 🧙‍♂️✨