Instructions to use susnato/phi-2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use susnato/phi-2 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="susnato/phi-2")
```
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("susnato/phi-2")
model = AutoModelForCausalLM.from_pretrained("susnato/phi-2")
```
- Notebooks
- Google Colab
- Kaggle
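As a quick usage sketch for the Transformers snippet in the Libraries section above: greedy decoding echoes the prompt in the decoded text, so a small `strip_prompt` helper (a hypothetical name, not part of the Transformers API) is included; the generation call itself downloads the checkpoint, so it is kept behind a main guard.

```python
def strip_prompt(generated: str, prompt: str) -> str:
    """Drop the echoed prompt from a decoded generation, if present."""
    if generated.startswith(prompt):
        return generated[len(prompt):].lstrip()
    return generated

if __name__ == "__main__":
    # Requires `pip install transformers torch` and a network connection
    # to fetch the susnato/phi-2 weights.
    from transformers import pipeline

    pipe = pipeline("text-generation", model="susnato/phi-2")
    prompt = "Once upon a time,"
    out = pipe(prompt, max_new_tokens=64)[0]["generated_text"]
    print(strip_prompt(out, prompt))
```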
- Local Apps
- vLLM
How to use susnato/phi-2 with vLLM:
Install from pip and serve the model:
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "susnato/phi-2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "susnato/phi-2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker:
```shell
docker model run hf.co/susnato/phi-2
```
- SGLang
How to use susnato/phi-2 with SGLang:
Install from pip and serve the model:
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "susnato/phi-2" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "susnato/phi-2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images:
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "susnato/phi-2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "susnato/phi-2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use susnato/phi-2 with Docker Model Runner:
docker model run hf.co/susnato/phi-2
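The vLLM and SGLang servers above expose the same OpenAI-compatible `/v1/completions` endpoint, so a single stdlib Python client works against either. This is a minimal sketch; the base URL, ports, and sampling values mirror the curl examples above, and the helper name is an assumption, not part of either project's API.

```python
import json
import urllib.request

def build_completion_request(base_url: str, model: str, prompt: str,
                             max_tokens: int = 512, temperature: float = 0.5):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

if __name__ == "__main__":
    # Port 8000 for the vLLM server above, 30000 for the SGLang server.
    req = build_completion_request("http://localhost:8000", "susnato/phi-2",
                                   "Once upon a time,")
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["text"])
```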
Different results from this phi-2 and microsoft/phi-2
I ran inference on phi-2 with transformers==4.36.2, but I got different results between this phi-2 and microsoft/phi-2 (with trust_remote_code=True).
Here is the corresponding code.
For this phi-2:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("susnato/phi-2")
tokenizer = AutoTokenizer.from_pretrained("susnato/phi-2")

inputs = tokenizer('Can you help me write a formal email to a potential business partner proposing a joint venture?', return_tensors="pt", return_attention_mask=False)
outputs = model.generate(**inputs, max_length=256)
text = tokenizer.batch_decode(outputs)[0]
print(text)
```
The results were:
```
Can you help me write a formal email to a potential business partner proposing a joint venture?
## INSTRUCTION
## INPUT
We are looking to expand our business relationship
##OUTPUT
##OUTPUT
##OUTPUT
## INSTRUCTIONS
#
```
For microsoft/phi-2:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('microsoft/phi-2', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('microsoft/phi-2', trust_remote_code=True)

prompt = 'Can you help me write a formal email to a potential business partner proposing a joint venture?'
inputs = tokenizer(prompt, return_tensors="pt")
generate_ids = model.generate(inputs.input_ids, max_length=256)
output = tokenizer.batch_decode(generate_ids)[0]
print(output)
```
while the results were:
```
Can you help me write a formal email to a potential business partner proposing a joint venture?
Input: Company A: ABC Inc.
Company B: XYZ Ltd.
Joint Venture: A new online platform for e-commerce
Output: Dear Mr. Smith,
I am writing to you on behalf of ABC Inc., a leading provider of e-commerce solutions. I am interested in exploring the possibility of a joint venture with XYZ Ltd., a reputable online retailer.
We have been following your company's impressive growth and innovation in the e-commerce sector, and we believe that we have a lot to offer each other. We have extensive experience and expertise in developing and managing online platforms for e-commerce, with a proven track record of delivering high-quality products and services to our clients. We also have a strong network of suppliers, distributors, and customers, as well as a dedicated team of professionals.
We propose to create a new online platform for e-commerce that combines our strengths and resources, and leverages our mutual opportunities and goals. The platform would offer a wide range of products and services, from various categories and brands, at competitive prices and with fast delivery. The platform would also provide features such as secure payment,
```
Obviously, the microsoft/phi-2 results are better.
Hi @EnmingZhang, yes, the results are different because the weights were not properly converted... I am looking into it. I will ping you once they match.
@susnato Thanks for such a quick reply!
Could you please provide further details on why the weights were not converted correctly? I have compared the source code of modeling_phi.py in microsoft/phi-1_5 and microsoft/phi-2, and I noticed only two modifications (at lines 500 and 867).
Hi @EnmingZhang, it's because the script I wrote to convert the Phi weights predates this commit, while phi-2 follows the new codebase (introduced with that commit); that's why it's failing. I am looking into it.
(Also, you cannot use this script directly; you need to make some minor modifications to the keys dict.)
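Since the suspected cause is an incorrect weight conversion, one way to pinpoint it is to diff the two checkpoints' parameters name by name. This is a hedged sketch, not code from the thread: the helper works on dicts of flattened float lists so it stays framework-agnostic and the function name and tolerance are assumptions; in practice you would feed it `{name: t.flatten().tolist() for name, t in model.state_dict().items()}` from each model, after mapping the converted key names onto each other.

```python
import math

def diff_state_dicts(a: dict, b: dict, tol: float = 1e-5):
    """Return parameter names whose values differ, or that exist in only one dict.

    `a` and `b` map parameter names to flat sequences of floats.
    """
    mismatched = set(a) ^ set(b)  # keys present in only one of the two dicts
    for name in set(a) & set(b):
        va, vb = a[name], b[name]
        if len(va) != len(vb) or any(
            not math.isclose(x, y, abs_tol=tol) for x, y in zip(va, vb)
        ):
            mismatched.add(name)
    return sorted(mismatched)
```

An empty result would mean the conversion is faithful; otherwise the returned names point at exactly the layers the conversion script mangled.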
Hi @EnmingZhang, can you please try it now and let me know whether it is working as expected?
I am getting quite good results here now.