---
title: "Phi-3 Mini 128K Chat"
emoji: "💬"
colorFrom: "blue"
colorTo: "purple"
sdk: "gradio"
python_version: "3.10"
app_file: "app.py"
suggested_hardware: "a10g-small"
suggested_storage: "medium"
short_description: "A demo of the Phi-3-mini-128k-instruct model."
tags:
  - phi-3
  - microsoft
  - chatbot
  - long-context
  - transformers
  - gradio
models:
  - microsoft/Phi-3-mini-128k-instruct
preload_from_hub:
  - microsoft/Phi-3-mini-128k-instruct
disable_embedding: false
fullscreen: true
---

# Phi-3 Mini 128K Instruct Chat Demo

This Space demonstrates **Microsoft's [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)**, a powerful small language model with support for **up to 128,000 tokens of context**.

🔧 Built with:
- 🤗 Transformers
- Gradio (for UI)
- Hugging Face Hub

🚀 Features:
- Long-context understanding
- Instruction-tuned chat format
- Fast response generation
- Optimized for clarity and reasoning

💡 Try asking it to:
- Summarize long texts
- Explain complex topics
- Write code
- Answer questions with context

> ⏳ **Note**: First load may take 2–3 minutes as the model initializes. Subsequent runs are faster thanks to `preload_from_hub`.

---

## How It Works

The model is preloaded during build time using `preload_from_hub`, ensuring minimal cold-start delay. The chat interface uses the official chat template:
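A minimal sketch of how the conversation is laid out for the model (this is illustrative, not the Space's actual `app.py`). Phi-3's chat template wraps each turn in `<|role|>` / `<|end|>` markers; in practice the app would call `tokenizer.apply_chat_template()`, which applies this format automatically — the string-building below only shows the assumed layout:

```python
def build_phi3_prompt(messages):
    """Format a list of {"role", "content"} dicts in Phi-3's chat layout.

    Illustrative only: real code should use tokenizer.apply_chat_template(),
    which reads the template shipped with the model.
    """
    prompt = ""
    for m in messages:
        # Each turn: <|user|> or <|assistant|> tag, the content, then <|end|>.
        prompt += f"<|{m['role']}|>\n{m['content']}<|end|>\n"
    # A trailing assistant tag cues the model to generate its reply.
    prompt += "<|assistant|>\n"
    return prompt

messages = [{"role": "user", "content": "Summarize this article in one paragraph."}]
print(build_phi3_prompt(messages))
```

With the tokenizer loaded, the equivalent call would be `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)`.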