Why Do So Many Chinese Companies Cheat On Tests?
For example, this model has a reported SimpleQA score of 27, but its real score is ~6. And Qwen3.5 35b, from which this model was derived, has a SimpleQA score of ~8, not the reported 20.
There's a huge difference between a SimpleQA score of 10 and one of 20. To date, the smallest model that legitimately achieved a SimpleQA score of 20 is Llama 3.1 70b, and Qwen3.5's top English competitor in the same size category, Gemma 4, scored 9.
For Alibaba to try to claim that Qwen3.5 scored vastly higher in broad English knowledge than the top English competitor in the same size category (20 vs 9) is nothing short of insane. Why cheat so egregiously that everyone in the industry knows you cheated?
Note: The English SimpleQA is a non-multiple-choice test of broad English knowledge, so unlike many other tests you can't fine-tune your way to a higher score. You either need more parameters or to train for a very long time on a very large and broad English corpus.
There may be some misunderstanding here. The SimpleQA test used here is SimpleQA Verified provided by Google. You can reproduce the evaluation results using OpenCompass (the dataset configuration file is opencompass/configs/datasets/SimpleQA/simpleqa_verified_rawprompt_gen.py).
Thank you for pointing this out. We have updated the benchmark table and adopted more rigorous wording.
@mzr1996 Thanks for clarifying. But as Google explains, SimpleQA Verified is the same SimpleQA released by OpenAI, only improved (e.g. redundancy and other issues were removed). It's still a hard, non-multiple-choice test of broad English knowledge, graded by GPT 4.1, with scores that correlate with the original test, and Google's Gemma 4 scored ~9.
There's simply no way the comparably sized Qwen3.5 scored 20, or this model scored 27, unless the models were contaminated by the test. The only legitimate way to climb 11 and 18 points on a non-multiple-choice test of broad English knowledge is to train a much larger model on a huge amount of broad English text, which these models did not do (I verified this myself by asking sister questions). These models without a doubt do not have more broad English knowledge than Gemma 4, let alone the vastly greater knowledge needed to achieve the astonishingly high scores of 20 and 27, respectively.
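For anyone checking the arithmetic, here is a quick sketch of where the 11 and 18 point gaps come from. It uses only the scores quoted in this thread (nothing is measured here), and the variable names are just illustrative:

```python
# Scores as quoted in this thread; they are not independently measured here.
reported = {"Qwen3.5": 20, "Intern-S2-Preview": 27}
gemma4_baseline = 9  # Gemma 4 score cited above

# Gap between each reported score and the Gemma 4 figure.
gaps = {name: score - gemma4_baseline for name, score in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: +{gap} points over Gemma 4")
```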