SystemAdmin123 committed on Feb 4, 2025

Commit 401f307 (verified) · 1 Parent(s): 6dded20

Training in progress, step 1600, checkpoint

Files changed (5)

last-checkpoint/model.safetensors +1 -1
last-checkpoint/optimizer.pt +1 -1
last-checkpoint/rng_state.pth +1 -1
last-checkpoint/scheduler.pt +1 -1
last-checkpoint/trainer_state.json +299 -3

last-checkpoint/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:97f5a3fd042daf5f3fb5e84e1ea4890a0bd863a4173a97a2f729a192dac1c8fc
+oid sha256:29a427d6545dc0e8ef1d0145e8423f41bb3a50913f2d1b921e04445da575506d
 size 723674912

last-checkpoint/optimizer.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c005d723746a5ad45697ed3df203b1df0a1b923778c5cc24835c266eb0b5fa45
+oid sha256:a4b78a335a1b8fefe3ad448d5bcce0a3fe0dc366c6e8b16bb8f35e1d94b22307
 size 735625626

last-checkpoint/rng_state.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a85b7ee4e3e06f8b21d4d23e7eb8bbe5510e7f25d23cfc2ffc16d97845a1be25
+oid sha256:ced0ac0d077b41bd2987add3782b7ce1140142ac3cddaf433babda96674c50fb
 size 14244

last-checkpoint/scheduler.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:427276ae77d918ee2b880ea4152618640d39ea76588856ca2cd62fe2ab8b83d7
+oid sha256:9ed6aad8025a80b776f2d50234fd05b8c1e2e758d3d427458fe15ed9bc7f733a
 size 1064
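
The four binary files above are stored as Git LFS pointers, so each diff only changes the `oid sha256:` line while the `size` stays the same. As a minimal sketch (the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, not part of any library), a downloaded checkpoint file could be checked against its pointer like this:

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file has the size and SHA-256 digest the pointer records."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text for the new model.safetensors, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:29a427d6545dc0e8ef1d0145e8423f41bb3a50913f2d1b921e04445da575506d\n"
    "size 723674912\n"
)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("last-checkpoint/model.safetensors", pointer)` on a local copy would then report whether the file corresponds to this commit; the local path is an assumption about where the file was downloaded.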

last-checkpoint/trainer_state.json CHANGED

@@ -1,9 +1,9 @@
 {
   "best_metric": null,
   "best_model_checkpoint": null,
-  "epoch": 0.3551346552234389,
+  "epoch": 0.47351287363125183,
   "eval_steps": 200,
-  "global_step": 1200,
+  "global_step": 1600,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
@@ -903,6 +903,302 @@
       "eval_samples_per_second": 39.862,
       "eval_steps_per_second": 9.979,
       "step": 1200
+    },
+    {
+      "epoch": 0.3580941106836342,
+      "grad_norm": 2.46875,
+      "learning_rate": 0.0001135169494631497,
+      "loss": 2.2422,
+      "step": 1210
+    },
+    {
+      "epoch": 0.36105356614382955,
+      "grad_norm": 2.46875,
+      "learning_rate": 0.00011220516908034601,
+      "loss": 2.1766,
+      "step": 1220
+    },
+    {
+      "epoch": 0.3640130216040249,
+      "grad_norm": 4.40625,
+      "learning_rate": 0.00011089125314635726,
+      "loss": 2.2057,
+      "step": 1230
+    },
+    {
+      "epoch": 0.3669724770642202,
+      "grad_norm": 5.09375,
+      "learning_rate": 0.00010957543155842702,
+      "loss": 1.7583,
+      "step": 1240
+    },
+    {
+      "epoch": 0.36993193252441553,
+      "grad_norm": 16.125,
+      "learning_rate": 0.00010825793454723325,
+      "loss": 1.9914,
+      "step": 1250
+    },
+    {
+      "epoch": 0.37289138798461086,
+      "grad_norm": 2.046875,
+      "learning_rate": 0.00010693899263660441,
+      "loss": 2.4002,
+      "step": 1260
+    },
+    {
+      "epoch": 0.3758508434448062,
+      "grad_norm": 3.0,
+      "learning_rate": 0.00010561883660318455,
+      "loss": 2.2105,
+      "step": 1270
+    },
+    {
+      "epoch": 0.37881029890500145,
+      "grad_norm": 3.953125,
+      "learning_rate": 0.00010429769743605407,
+      "loss": 1.9458,
+      "step": 1280
+    },
+    {
+      "epoch": 0.3817697543651968,
+      "grad_norm": 3.953125,
+      "learning_rate": 0.00010297580629631325,
+      "loss": 1.7066,
+      "step": 1290
+    },
+    {
+      "epoch": 0.3847292098253921,
+      "grad_norm": 20.5,
+      "learning_rate": 0.00010165339447663587,
+      "loss": 1.9038,
+      "step": 1300
+    },
+    {
+      "epoch": 0.38768866528558743,
+      "grad_norm": 2.109375,
+      "learning_rate": 0.00010033069336079952,
+      "loss": 2.2051,
+      "step": 1310
+    },
+    {
+      "epoch": 0.39064812074578276,
+      "grad_norm": 4.15625,
+      "learning_rate": 9.900793438320037e-05,
+      "loss": 2.137,
+      "step": 1320
+    },
+    {
+      "epoch": 0.3936075762059781,
+      "grad_norm": 3.40625,
+      "learning_rate": 9.768534898835862e-05,
+      "loss": 2.072,
+      "step": 1330
+    },
+    {
+      "epoch": 0.3965670316661734,
+      "grad_norm": 4.125,
+      "learning_rate": 9.636316859042259e-05,
+      "loss": 2.2351,
+      "step": 1340
+    },
+    {
+      "epoch": 0.39952648712636873,
+      "grad_norm": 12.5625,
+      "learning_rate": 9.504162453267777e-05,
+      "loss": 2.0439,
+      "step": 1350
+    },
+    {
+      "epoch": 0.40248594258656406,
+      "grad_norm": 2.875,
+      "learning_rate": 9.372094804706867e-05,
+      "loss": 2.5911,
+      "step": 1360
+    },
+    {
+      "epoch": 0.4054453980467594,
+      "grad_norm": 3.75,
+      "learning_rate": 9.24013702137397e-05,
+      "loss": 2.1565,
+      "step": 1370
+    },
+    {
+      "epoch": 0.4084048535069547,
+      "grad_norm": 4.0,
+      "learning_rate": 9.108312192060298e-05,
+      "loss": 2.2613,
+      "step": 1380
+    },
+    {
+      "epoch": 0.41136430896715004,
+      "grad_norm": 4.875,
+      "learning_rate": 8.97664338229395e-05,
+      "loss": 1.8462,
+      "step": 1390
+    },
+    {
+      "epoch": 0.41432376442734536,
+      "grad_norm": 24.375,
+      "learning_rate": 8.845153630304139e-05,
+      "loss": 2.1576,
+      "step": 1400
+    },
+    {
+      "epoch": 0.41432376442734536,
+      "eval_loss": 2.132373571395874,
+      "eval_runtime": 38.4807,
+      "eval_samples_per_second": 39.033,
+      "eval_steps_per_second": 9.771,
+      "step": 1400
+    },
+    {
+      "epoch": 0.4172832198875407,
+      "grad_norm": 2.046875,
+      "learning_rate": 8.713865942990141e-05,
+      "loss": 2.4016,
+      "step": 1410
+    },
+    {
+      "epoch": 0.420242675347736,
+      "grad_norm": 3.515625,
+      "learning_rate": 8.582803291895758e-05,
+      "loss": 2.3257,
+      "step": 1420
+    },
+    {
+      "epoch": 0.42320213080793134,
+      "grad_norm": 6.3125,
+      "learning_rate": 8.451988609189987e-05,
+      "loss": 2.0979,
+      "step": 1430
+    },
+    {
+      "epoch": 0.42616158626812667,
+      "grad_norm": 4.25,
+      "learning_rate": 8.321444783654524e-05,
+      "loss": 1.8707,
+      "step": 1440
+    },
+    {
+      "epoch": 0.429121041728322,
+      "grad_norm": 13.4375,
+      "learning_rate": 8.191194656678904e-05,
+      "loss": 2.05,
+      "step": 1450
+    },
+    {
+      "epoch": 0.4320804971885173,
+      "grad_norm": 1.8671875,
+      "learning_rate": 8.061261018263919e-05,
+      "loss": 2.3737,
+      "step": 1460
+    },
+    {
+      "epoch": 0.43503995264871265,
+      "grad_norm": 3.015625,
+      "learning_rate": 7.931666603034033e-05,
+      "loss": 2.3148,
+      "step": 1470
+    },
+    {
+      "epoch": 0.437999408108908,
+      "grad_norm": 3.296875,
+      "learning_rate": 7.80243408625947e-05,
+      "loss": 2.3303,
+      "step": 1480
+    },
+    {
+      "epoch": 0.4409588635691033,
+      "grad_norm": 4.15625,
+      "learning_rate": 7.673586079888698e-05,
+      "loss": 2.2395,
+      "step": 1490
+    },
+    {
+      "epoch": 0.4439183190292986,
+      "grad_norm": 18.375,
+      "learning_rate": 7.54514512859201e-05,
+      "loss": 2.1118,
+      "step": 1500
+    },
+    {
+      "epoch": 0.44687777448949395,
+      "grad_norm": 1.78125,
+      "learning_rate": 7.417133705816837e-05,
+      "loss": 2.2514,
+      "step": 1510
+    },
+    {
+      "epoch": 0.4498372299496893,
+      "grad_norm": 2.359375,
+      "learning_rate": 7.289574209855559e-05,
+      "loss": 2.3813,
+      "step": 1520
+    },
+    {
+      "epoch": 0.4527966854098846,
+      "grad_norm": 3.25,
+      "learning_rate": 7.16248895992645e-05,
+      "loss": 2.0406,
+      "step": 1530
+    },
+    {
+      "epoch": 0.45575614087007993,
+      "grad_norm": 6.15625,
+      "learning_rate": 7.035900192268464e-05,
+      "loss": 1.8527,
+      "step": 1540
+    },
+    {
+      "epoch": 0.45871559633027525,
+      "grad_norm": 24.125,
+      "learning_rate": 6.909830056250527e-05,
+      "loss": 1.8146,
+      "step": 1550
+    },
+    {
+      "epoch": 0.4616750517904706,
+      "grad_norm": 1.9453125,
+      "learning_rate": 6.784300610496048e-05,
+      "loss": 2.4665,
+      "step": 1560
+    },
+    {
+      "epoch": 0.46463450725066585,
+      "grad_norm": 2.859375,
+      "learning_rate": 6.65933381902329e-05,
+      "loss": 2.2625,
+      "step": 1570
+    },
+    {
+      "epoch": 0.4675939627108612,
+      "grad_norm": 3.609375,
+      "learning_rate": 6.534951547402322e-05,
+      "loss": 2.2715,
+      "step": 1580
+    },
+    {
+      "epoch": 0.4705534181710565,
+      "grad_norm": 4.5625,
+      "learning_rate": 6.411175558929152e-05,
+      "loss": 1.9711,
+      "step": 1590
+    },
+    {
+      "epoch": 0.47351287363125183,
+      "grad_norm": 11.125,
+      "learning_rate": 6.28802751081779e-05,
+      "loss": 2.1731,
+      "step": 1600
+    },
+    {
+      "epoch": 0.47351287363125183,
+      "eval_loss": 2.12680983543396,
+      "eval_runtime": 39.4155,
+      "eval_samples_per_second": 38.107,
+      "eval_steps_per_second": 9.539,
+      "step": 1600
     }
   ],
   "logging_steps": 10,
@@ -922,7 +1218,7 @@
       "attributes": {}
     }
   },
-  "total_flos": 1.85888691191808e+16,
+  "total_flos": 2.47748488593408e+16,
   "train_batch_size": 4,
   "trial_name": null,
   "trial_params": null
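
The entries added to trainer_state.json are easier to compare once they are summarized. The sketch below is one possible way to do that, assuming the entries live under the `log_history` key that the Transformers `Trainer` writes (the key itself falls outside the hunks shown in this diff). The names `load_state` and `summarize` are hypothetical helpers, and the sample dict is a small excerpt of values copied from the diff rather than the full file.

```python
import json
from pathlib import Path


def load_state(path) -> dict:
    """Load a trainer_state.json file into a dict."""
    return json.loads(Path(path).read_text())


def summarize(state: dict) -> dict:
    """Summarize a trainer state: steps per epoch, eval loss by step, mean train loss."""
    logs = state.get("log_history", [])
    # Evaluation entries carry "eval_loss"; training entries carry "loss".
    eval_loss = {entry["step"]: entry["eval_loss"] for entry in logs if "eval_loss" in entry}
    train_loss = [entry["loss"] for entry in logs if "loss" in entry]
    return {
        "steps_per_epoch": round(state["global_step"] / state["epoch"]),
        "eval_loss": eval_loss,
        "mean_train_loss": sum(train_loss) / len(train_loss) if train_loss else None,
    }


# Excerpt of the new state, using values from the diff above.
sample = {
    "epoch": 0.47351287363125183,
    "global_step": 1600,
    "log_history": [
        {"epoch": 0.41432376442734536, "loss": 2.1576, "step": 1400},
        {"epoch": 0.41432376442734536, "eval_loss": 2.132373571395874, "step": 1400},
        {"epoch": 0.47351287363125183, "loss": 2.1731, "step": 1600},
        {"epoch": 0.47351287363125183, "eval_loss": 2.12680983543396, "step": 1600},
    ],
}
summary = summarize(sample)
print(summary["steps_per_epoch"])
print(summary["eval_loss"])
```

On a local checkout, `summarize(load_state("last-checkpoint/trainer_state.json"))` would cover the whole history; the path is an assumption about where the checkpoint was downloaded. In the excerpt, the evaluation loss moves from about 2.1324 at step 1400 to about 2.1268 at step 1600.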