---
title: "Phi-3 Mini 128K Chat"
emoji: "💬"
colorFrom: "blue"
colorTo: "purple"
sdk: "gradio"
python_version: "3.10"
app_file: "app.py"
suggested_hardware: "a10g-small"
suggested_storage: "medium"
short_description: "A demo of the Phi-3-mini-128k-instruct model."
tags:
  - phi-3
  - microsoft
  - chatbot
  - long-context
  - transformers
  - gradio
models:
  - microsoft/Phi-3-mini-128k-instruct
preload_from_hub:
  - microsoft/Phi-3-mini-128k-instruct
disable_embedding: false
fullscreen: true
---

# Phi-3 Mini 128K Instruct Chat Demo

This Space demonstrates **Microsoft's [Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)**, a powerful small language model with support for **up to 128,000 tokens of context**.

🔧 Built with:
- 🤗 Transformers
- Gradio (for UI)
- Hugging Face Hub

🚀 Features:
- Long-context understanding
- Instruction-tuned chat format
- Fast response generation
- Optimized for clarity and reasoning

💡 Try asking it to:
- Summarize long texts
- Explain complex topics
- Write code
- Answer questions with context

> ⏳ **Note**: First load may take 2–3 minutes as the model initializes. Subsequent runs are faster thanks to `preload_from_hub`.

---

## How It Works

The model is preloaded during build time using `preload_from_hub`, ensuring minimal cold-start delay. The chat interface uses the official chat template:
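A minimal sketch of how the conversation is laid out for the model (this is illustrative, not the Space's actual `app.py`). Phi-3's chat template wraps each turn in `<|role|>` / `<|end|>` markers; in practice the app would call `tokenizer.apply_chat_template()`, which applies this format automatically — the string-building below only shows the assumed layout:

```python
def build_phi3_prompt(messages):
    """Format a list of {"role", "content"} dicts in Phi-3's chat layout.

    Illustrative only: real code should use tokenizer.apply_chat_template(),
    which reads the template shipped with the model.
    """
    prompt = ""
    for m in messages:
        # Each turn: <|user|> or <|assistant|> tag, the content, then <|end|>.
        prompt += f"<|{m['role']}|>\n{m['content']}<|end|>\n"
    # A trailing assistant tag cues the model to generate its reply.
    prompt += "<|assistant|>\n"
    return prompt

messages = [{"role": "user", "content": "Summarize this article in one paragraph."}]
print(build_phi3_prompt(messages))
```

With the tokenizer loaded, the equivalent call would be `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)`.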