Instructions to use internlm/internlm2-base-20b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use internlm/internlm2-base-20b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="internlm/internlm2-base-20b", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-base-20b", trust_remote_code=True, dtype="auto")
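
The pipeline above is created but never called. A minimal sketch of one way to generate text with it follows; the helper name `generate_text`, the `MODEL_ID` constant, and the sampling settings are illustrative assumptions, not part of the model card.

```python
MODEL_ID = "internlm/internlm2-base-20b"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Return the prompt plus a sampled continuation (hypothetical helper)."""
    # Imported lazily so that defining this helper does not load the library.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID, trust_remote_code=True)
    # Sampling settings are assumptions chosen to mirror the curl examples.
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.5,
    )
    # The text-generation pipeline returns one dict per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_text("Once upon a time,")` downloads the weights on first use, so it needs enough disk space and accelerator memory for a 20B-parameter model.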

Local Apps

vLLM

How to use internlm/internlm2-base-20b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
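# Start the vLLM server (--trust-remote-code allows the repository's custom code, as in the Transformers example):
vllm serve "internlm/internlm2-base-20b" --trust-remote-code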
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "internlm/internlm2-base-20b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
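
The same request can also be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on `localhost:8000`; the function names `build_completion_request` and `complete` are hypothetical helpers, not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(
    prompt,
    base_url="http://localhost:8000",
    model="internlm/internlm2-base-20b",
    max_tokens=512,
    temperature=0.5,
):
    """Build a POST request equivalent to the curl call (hypothetical helper)."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first generated continuation."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style completion responses carry the text under choices[i].text.
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` returns the generated continuation as a string. Passing a different `base_url` points the same helper at another OpenAI-compatible server.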

Use Docker

docker model run hf.co/internlm/internlm2-base-20b

SGLang

How to use internlm/internlm2-base-20b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
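# Start the SGLang server (--trust-remote-code allows the repository's custom code, as in the Transformers example):
python3 -m sglang.launch_server \
    --model-path "internlm/internlm2-base-20b" \
    --trust-remote-code \
    --host 0.0.0.0 \
    --port 30000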
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "internlm/internlm2-base-20b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
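
The curl call prints raw JSON. As a rough sketch, the generated text can be pulled out of an OpenAI-style completion response as shown below; `extract_completions` is a hypothetical helper, and the sample body is a hand-written stand-in for a server reply, not real model output.

```python
import json


def extract_completions(response_body: str) -> list[str]:
    """Return the text of every choice in an OpenAI-style completion response."""
    data = json.loads(response_body)
    return [choice["text"] for choice in data.get("choices", [])]


# Hand-written placeholder response (assumption: not produced by the model).
sample = json.dumps(
    {
        "object": "text_completion",
        "model": "internlm/internlm2-base-20b",
        "choices": [
            {"index": 0, "text": " <generated text>", "finish_reason": "length"}
        ],
    }
)

print(extract_completions(sample))
```

A `finish_reason` of `length` indicates that generation stopped at the `max_tokens` limit rather than at an end-of-sequence token.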

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "internlm/internlm2-base-20b" \
        --trust-remote-code \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "internlm/internlm2-base-20b",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use internlm/internlm2-base-20b with Docker Model Runner:
```
docker model run hf.co/internlm/internlm2-base-20b
```

internlm2-base-20b

Commit History

Adding `safetensors` variant of this model (948f334, verified): SFconvertbot committed on Jan 19, 2025

update modeling file to newest (e873130): x54-729 committed on Aug 20, 2024

update opencompass url (88d8bc4): x54-729 committed on Jul 3, 2024

transforer version (6f7849f): x54-729 committed on Jul 2, 2024

update opencompass leaderboard url (b77090b): x54-729 committed on Jul 2, 2024

fix flash attention import (f213bf5): x54-729 committed on Jun 21, 2024

small update (b4e6b30): x54-729 committed on Jun 19, 2024

update for new version (6097ed8): x54-729 committed on Jun 17, 2024

fix eos & update README for tech report (7e419e8): x54-729 committed on May 14, 2024

upload new weights (24e1e91, verified): x54-729 committed on Mar 20, 2024

fix no white space when using stream_chat with fast tokenizer (edf2879): x54-729 committed on Feb 28, 2024

small (1857a15): x54-729 committed on Jan 24, 2024

fast tokenizer and stream_chat fix (#3) (ca3f2c7, verified): RangiLyu, x54-729 committed on Jan 24, 2024

update rope_scaling (9305163): x54-729 committed on Jan 19, 2024

remove unnecessary attention_drop (8586def): x54-729 committed on Jan 19, 2024

update chat template (d1913f2): x54-729 committed on Jan 19, 2024

update readme info (f7374de): x54-729 committed on Jan 18, 2024

Update license in README.md (a40ae5d, verified): ZwwWayne committed on Jan 18, 2024

fix example prompt (70d038d): x54-729 committed on Jan 17, 2024

update readme (a9418b3): x54-729 committed on Jan 17, 2024

fix import error (9c9e595): x54-729 committed on Jan 16, 2024

support flash attn 2 (5966e12): x54-729 committed on Jan 16, 2024

Create README.md (1611f45, verified): ZwwWayne committed on Jan 15, 2024

update model weights (c790826): ZwwWayne committed on Jan 12, 2024

initial commit (5587603, verified): ZwwWayne committed on Jan 12, 2024
