Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ tags:
 ---
 ## Model Summary
 
-The language model Phi-1 is a Transformer with 1.3 billion parameters, specialized for basic Python coding. Its training involved a variety of data sources, including subsets of Python codes from [The Stack v1.2](https://huggingface.co/datasets/bigcode/the-stack), Q&A content from [StackOverflow](https://archive.org/download/stackexchange), competition code from [code_contests](https://github.com/deepmind/code_contests), and synthetic Python textbooks and exercises generated by [gpt-3.5-turbo-0301](https://platform.openai.com/docs/models/gpt-3-5). Even though the model and the datasets are relatively small compared to contemporary Large Language Models (LLMs),
+The language model Phi-1 is a Transformer with 1.3 billion parameters, specialized for basic Python coding. Its training involved a variety of data sources, including subsets of Python codes from [The Stack v1.2](https://huggingface.co/datasets/bigcode/the-stack), Q&A content from [StackOverflow](https://archive.org/download/stackexchange), competition code from [code_contests](https://github.com/deepmind/code_contests), and synthetic Python textbooks and exercises generated by [gpt-3.5-turbo-0301](https://platform.openai.com/docs/models/gpt-3-5). Even though the model and the datasets are relatively small compared to contemporary Large Language Models (LLMs), Phi-1 has demonstrated an impressive accuracy rate exceeding 50% on the simple Python coding benchmark, HumanEval.
 
 ## Intended Uses
 Given the nature of the training data, Phi-1 is best suited for prompts using the code format:
@@ -37,6 +37,7 @@ where the model generates the code after the comments. (Note: This is a legitima
 * If you are using `transformers>=4.36.0`, always load the model with `trust_remote_code=True` to prevent side-effects.
 
 ## Sample Code
+
 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -56,10 +57,9 @@ text = tokenizer.batch_decode(outputs)[0]
 print(text)
 ```
 
-**Remark.** In the generation function, our model currently does not support beam search (`num_beams
+**Remark.** In the generation function, our model currently does not support beam search (`num_beams > 1`).
 Furthermore, in the forward pass of the model, we currently do not support outputting hidden states or attention values, or using custom input embeddings.
 
-
 ## Limitations of Phi-1
 
 * Limited Scope: 99.8% of the Python scripts in our fine-tuning dataset use only the packages "typing, math, random, collections, datetime, itertools". If the model generates Python scripts that utilize other packages, we strongly recommend users manually verify all API uses.
@@ -93,7 +93,7 @@ Given these potential pitfalls, and others not explicitly mentioned, it's essent
 ### Software
 * [PyTorch](https://github.com/pytorch/pytorch)
 * [DeepSpeed](https://github.com/microsoft/DeepSpeed)
-* [
+* [Flash-Attention](https://github.com/HazyResearch/flash-attention)
 
 ### License
 The model is licensed under the [Research License](https://huggingface.co/microsoft/phi-1/resolve/main/Research%20License.docx).