Instructions to use microsoft/Phi-3-medium-4k-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use microsoft/Phi-3-medium-4k-instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="microsoft/Phi-3-medium-4k-instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-medium-4k-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-medium-4k-instruct", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use microsoft/Phi-3-medium-4k-instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "microsoft/Phi-3-medium-4k-instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Phi-3-medium-4k-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/microsoft/Phi-3-medium-4k-instruct

SGLang

How to use microsoft/Phi-3-medium-4k-instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "microsoft/Phi-3-medium-4k-instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Phi-3-medium-4k-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "microsoft/Phi-3-medium-4k-instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "microsoft/Phi-3-medium-4k-instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use microsoft/Phi-3-medium-4k-instruct with Docker Model Runner:
```
docker model run hf.co/microsoft/Phi-3-medium-4k-instruct
```

Something seriously wrong with this model

by rombodawg - opened May 29, 2024

Discussion

rombodawg, May 29, 2024 (edited May 29, 2024)

Whether I run this model quantized (GGUF) or unquantized (raw bf16 files), I get the same janky outputs. I asked it to code a very simple game of Pong in Python, and it outputs complete jank after the first few lines of code.

Prompt:
Code the classic game "pong" in python, using pygame.

Output:
Here's a simple implementation of Pong using Python and Pygame excluding comments:

import sys
import random
import pygame as pg
from pygame import Rect
pg.init()

SCREEN_WIDTH = 640
SCREEN_HEIGHT = 320
WINDOW_TITLE = "Pong"
FPS = 60

# Colors (RGB)
WHITE = (255, 255, 255)
BLACK = (  0,   0,   0)
RED = (255,   0,   0)
GREEN = (  0, 128,   0)
YELLOW = (255, 255,   0)
DARK_BLUE = ( 64,  64, 192)
LIME = (37, 185, 44 ) # Lime Green color for the ball.
PURPLE = ( 185 , 44 , 37 )# Purple Color For Paddle
BLUE_GREY= (92,  96, 104)
ORANGE=(255,   69,    0) # Orange color for the ball.
LIME =(80, 200 , 70 )# Lime Green Color For Paddle
DARK_GREEN= (  34,139, 34 )
BLUE=(56,   10,    15) # Blue color for the ball.
PURPLERED =(208 , 70 , 132)# Purple Color For Paddle
LIGHT_RED= (  255 , 94 , 94 )
BLUEYELLOW=(6,   214,    84 ) # Blue color for the ball.
PURPLERED =(70 , 30 , 180)# Purple Color For Paddle
LIGHT_RED= (  95 , 15 , 15 )
BLUE=(26,   140,    255 ) # Blue color for the ball.
PURPLERED =(73 , 8 , 160)# Purple Color For Paddle
...

Like "PURPLERED"? What??? Microsoft, what is this product you've delivered? You acted in your press conference like this was an LLM you could use for development. You can't use this to write one of the simplest code files.
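
For anyone who wants to quantify the looping rather than eyeball it, here is a minimal sketch that counts names assigned more than once in a chunk of generated code. The `GENERATED` string is copied from the constants in the output quoted above (whitespace normalized); the helper name `repeated_assignments` is made up for illustration and is not part of any library.

```python
import ast
from collections import Counter

# Excerpt of the color constants from the generated output quoted above.
GENERATED = """
LIME = (37, 185, 44)
PURPLE = (185, 44, 37)
ORANGE = (255, 69, 0)
LIME = (80, 200, 70)
BLUE = (56, 10, 15)
PURPLERED = (208, 70, 132)
LIGHT_RED = (255, 94, 94)
PURPLERED = (70, 30, 180)
LIGHT_RED = (95, 15, 15)
BLUE = (26, 140, 255)
PURPLERED = (73, 8, 160)
"""


def repeated_assignments(source: str) -> dict[str, int]:
    """Return module-level names that are assigned more than once, with counts."""
    counts = Counter()
    for node in ast.parse(source).body:
        # Only simple `NAME = value` statements are counted (hypothetical heuristic).
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    counts[target.id] += 1
    return {name: n for name, n in counts.items() if n > 1}


if __name__ == "__main__":
    print(repeated_assignments(GENERATED))
```

A high count of reassigned constants is only a rough signal of degenerate repetition, but it makes it easy to compare runs across quantizations or sampling settings.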

DarkJanissary, Jun 27, 2024

I got the same results. Phi is total garbage, no matter which version.
