SystemAdmin123/SmolLM-360M

SystemAdmin123 committed on Feb 7, 2025

Commit 6230b87 (verified) · 1 Parent(s): cb610ac

Training in progress, step 20, checkpoint

Files changed (9)

last-checkpoint/model.safetensors +1 -1
last-checkpoint/optimizer.pt +1 -1
last-checkpoint/rng_state_0.pth +1 -1
last-checkpoint/rng_state_1.pth +1 -1
last-checkpoint/rng_state_2.pth +1 -1
last-checkpoint/rng_state_3.pth +1 -1
last-checkpoint/scheduler.pt +1 -1
last-checkpoint/trainer_state.json +15 -141
last-checkpoint/training_args.bin +1 -1

last-checkpoint/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6ca635966f2128b90695cdcf1b450ff9388c9812f95f690192973e5b7eefd3c9
 size 723674912

 version https://git-lfs.github.com/spec/v1
+oid sha256:b999a352785b09d15860c1320662d8ec817d298e02812f21d68f01e900838dea
 size 723674912
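
The binary files in this commit are tracked with Git LFS, so each diff shows only a three-line pointer (spec version, `oid sha256:...`, and `size` in bytes) rather than the file contents. As a minimal sketch, a pointer such as the new one for `model.safetensors` could be parsed and checked against a downloaded file roughly as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the demo verifies a small temporary file rather than the real checkpoint.

```python
import hashlib
import os
import tempfile


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file has the size and SHA-256 digest named by the pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for last-checkpoint/model.safetensors, copied from this commit.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b999a352785b09d15860c1320662d8ec817d298e02812f21d68f01e900838dea
size 723674912
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])

# Demo on a small temporary file (the real checkpoint is roughly 724 MB).
payload = b"example bytes"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_pointer = {"digest": hashlib.sha256(payload).hexdigest(), "size": len(payload)}
demo_ok = matches_pointer(tmp.name, demo_pointer)
os.remove(tmp.name)
print(demo_ok)
```

Because the `size` line is unchanged on both sides of every LFS diff here, only the digest distinguishes the old and new versions of each file.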

last-checkpoint/optimizer.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ef2ab014f8101a1dbbd4564e14dff1cbf3c43dda56a1b19089771de3e5eb2e2f
 size 735625370

 version https://git-lfs.github.com/spec/v1
+oid sha256:2899f09cd2c54fdf5ef96e88d4218eefc8c7ef1b0315b44f21f9a89576e1bb98
 size 735625370

last-checkpoint/rng_state_0.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d497aa3968cd2f05db0d0e8c5e1be496a8a5348df0a825e18ed3fdbaa24257ad
 size 15024

 version https://git-lfs.github.com/spec/v1
+oid sha256:69a04a1208f7a0d6f51f37a136b5c2e55bf3f53b3d0fd57164c5b83ca47a2645
 size 15024

last-checkpoint/rng_state_1.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5dda0d87dad890add5a8f3995389ff6a597895845903171a363aa580fa07ac30
 size 15024

 version https://git-lfs.github.com/spec/v1
+oid sha256:080a7e72d6be938a9418e60003db90412af8a61e6434f9e9f1b598cca861dbcd
 size 15024

last-checkpoint/rng_state_2.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:de656d8e54adb9fa6e0b2ddbe69d4325a775f7e1580ed51c58a759ad9c7520d4
 size 15024

 version https://git-lfs.github.com/spec/v1
+oid sha256:c3d114a75d37be476b865187eb2b3d29d9343b131614a08f42be0014f110ce6f
 size 15024

last-checkpoint/rng_state_3.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9eaa87e4309aa1a00b739cd637a2ec444ea6c757388c653064a1906e4d8dfb2e
 size 15024

 version https://git-lfs.github.com/spec/v1
+oid sha256:4fc5a0f78838743362c5d5378dff81ea2f7d0039da53a423f1759e861bc6b233
 size 15024

last-checkpoint/scheduler.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ca9a25c72339c898b564e0c464a3f6fc75bbeec408008928b7ed05533156b98c
 size 1064

 version https://git-lfs.github.com/spec/v1
+oid sha256:3bb34b6c62864960ca7a1a2bf6005b33b4420cc8055506432b79e0fe18bca2cd
 size 1064

last-checkpoint/trainer_state.json CHANGED

@@ -1,9 +1,9 @@
 {
   "best_metric": null,
   "best_model_checkpoint": null,
-  "epoch": 25.0,
-  "eval_steps": 200,
-  "global_step": 200,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
@@ -11,9 +11,9 @@
     {
       "epoch": 0.125,
       "eval_loss": 2.5584230422973633,
-      "eval_runtime": 4.9063,
-      "eval_samples_per_second": 305.933,
-      "eval_steps_per_second": 3.465,
       "step": 1
     },
     {
@@ -31,145 +31,19 @@
       "step": 20
     },
     {
-      "epoch": 3.75,
-      "grad_norm": 0.1630859375,
-      "learning_rate": 0.00019458172417006347,
-      "loss": 2.1785,
-      "step": 30
-    },
-    {
-      "epoch": 5.0,
-      "grad_norm": 0.1376953125,
-      "learning_rate": 0.0001879473751206489,
-      "loss": 2.1359,
-      "step": 40
-    },
-    {
-      "epoch": 6.25,
-      "grad_norm": 0.1376953125,
-      "learning_rate": 0.00017891405093963938,
-      "loss": 2.1125,
-      "step": 50
-    },
-    {
-      "epoch": 7.5,
-      "grad_norm": 0.1279296875,
-      "learning_rate": 0.00016772815716257412,
-      "loss": 2.0939,
-      "step": 60
-    },
-    {
-      "epoch": 8.75,
-      "grad_norm": 0.138671875,
-      "learning_rate": 0.00015469481581224272,
-      "loss": 2.0638,
-      "step": 70
-    },
-    {
-      "epoch": 10.0,
-      "grad_norm": 0.140625,
-      "learning_rate": 0.00014016954246529696,
-      "loss": 2.0632,
-      "step": 80
-    },
-    {
-      "epoch": 11.25,
-      "grad_norm": 0.1337890625,
-      "learning_rate": 0.00012454854871407994,
-      "loss": 2.055,
-      "step": 90
-    },
-    {
-      "epoch": 12.5,
-      "grad_norm": 0.14453125,
-      "learning_rate": 0.00010825793454723325,
-      "loss": 2.0298,
-      "step": 100
-    },
-    {
-      "epoch": 13.75,
-      "grad_norm": 0.142578125,
-      "learning_rate": 9.174206545276677e-05,
-      "loss": 2.0271,
-      "step": 110
-    },
-    {
-      "epoch": 15.0,
-      "grad_norm": 0.1357421875,
-      "learning_rate": 7.54514512859201e-05,
-      "loss": 2.0168,
-      "step": 120
-    },
-    {
-      "epoch": 16.25,
-      "grad_norm": 0.1318359375,
-      "learning_rate": 5.983045753470308e-05,
-      "loss": 2.0126,
-      "step": 130
-    },
-    {
-      "epoch": 17.5,
-      "grad_norm": 0.1376953125,
-      "learning_rate": 4.530518418775733e-05,
-      "loss": 2.0188,
-      "step": 140
-    },
-    {
-      "epoch": 18.75,
-      "grad_norm": 0.134765625,
-      "learning_rate": 3.227184283742591e-05,
-      "loss": 2.009,
-      "step": 150
-    },
-    {
-      "epoch": 20.0,
-      "grad_norm": 0.1357421875,
-      "learning_rate": 2.1085949060360654e-05,
-      "loss": 2.0108,
-      "step": 160
-    },
-    {
-      "epoch": 21.25,
-      "grad_norm": 0.130859375,
-      "learning_rate": 1.2052624879351104e-05,
-      "loss": 2.0101,
-      "step": 170
-    },
-    {
-      "epoch": 22.5,
-      "grad_norm": 0.146484375,
-      "learning_rate": 5.418275829936537e-06,
-      "loss": 2.0168,
-      "step": 180
-    },
-    {
-      "epoch": 23.75,
-      "grad_norm": 0.146484375,
-      "learning_rate": 1.3638696597277679e-06,
-      "loss": 2.0059,
-      "step": 190
-    },
-    {
-      "epoch": 25.0,
-      "grad_norm": 0.1328125,
-      "learning_rate": 0.0,
-      "loss": 2.011,
-      "step": 200
-    },
-    {
-      "epoch": 25.0,
-      "eval_loss": 2.0616559982299805,
-      "eval_runtime": 4.9912,
-      "eval_samples_per_second": 300.731,
-      "eval_steps_per_second": 3.406,
-      "step": 200
     }
   ],
   "logging_steps": 10,
   "max_steps": 200,
   "num_input_tokens_seen": 0,
   "num_train_epochs": 25,
-  "save_steps": 200,
   "stateful_callbacks": {
     "TrainerControl": {
       "args": {
@@ -177,12 +51,12 @@
         "should_evaluate": false,
         "should_log": false,
         "should_save": true,
-        "should_training_stop": true
       },
       "attributes": {}
     }
   },
-  "total_flos": 7.113876738932736e+16,
   "train_batch_size": 23,
   "trial_name": null,
   "trial_params": null

 {
   "best_metric": null,
   "best_model_checkpoint": null,
+  "epoch": 2.5,
+  "eval_steps": 20,
+  "global_step": 20,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
     {
       "epoch": 0.125,
       "eval_loss": 2.5584230422973633,
+      "eval_runtime": 4.8815,
+      "eval_samples_per_second": 307.488,
+      "eval_steps_per_second": 3.483,
       "step": 1
     },
     {
       "step": 20
     },
     {
+      "epoch": 2.5,
+      "eval_loss": 2.1561896800994873,
+      "eval_runtime": 4.8534,
+      "eval_samples_per_second": 309.27,
+      "eval_steps_per_second": 3.503,
+      "step": 20
     }
   ],
   "logging_steps": 10,
   "max_steps": 200,
   "num_input_tokens_seen": 0,
   "num_train_epochs": 25,
+  "save_steps": 20,
   "stateful_callbacks": {
     "TrainerControl": {
       "args": {
         "should_evaluate": false,
         "should_log": false,
         "should_save": true,
+        "should_training_stop": false
       },
       "attributes": {}
     }
   },
+  "total_flos": 7113876738932736.0,
   "train_batch_size": 23,
   "trial_name": null,
   "trial_params": null
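
Read as numbers, the `trainer_state.json` diff replaces the state of a finished run (step 200, epoch 25.0, `should_training_stop: true`) with an in-progress one (step 20, epoch 2.5). The sketch below inlines only fields visible in the diff and checks the arithmetic they imply. The helper names are hypothetical, and placing the records under a `log_history` key is an assumption based on the usual Hugging Face Trainer layout, since the key itself falls outside the hunks shown.

```python
import json

# Fields copied from the new side of the diff; the real file has more keys.
# "log_history" is an assumed key name (not visible in the hunks above).
NEW_STATE = json.loads("""
{
  "epoch": 2.5,
  "eval_steps": 20,
  "global_step": 20,
  "max_steps": 200,
  "num_train_epochs": 25,
  "save_steps": 20,
  "total_flos": 7113876738932736.0,
  "log_history": [
    {"epoch": 0.125, "eval_loss": 2.5584230422973633, "step": 1},
    {"epoch": 2.5, "eval_loss": 2.1561896800994873, "step": 20}
  ]
}
""")

# Value removed by this commit; it was recorded at step 200.
OLD_TOTAL_FLOS = 7.113876738932736e+16


def steps_per_epoch(state):
    """Optimizer steps per epoch implied by the current step and epoch."""
    return state["global_step"] / state["epoch"]


def eval_losses(state):
    """(step, eval_loss) pairs for every record that carries an eval loss."""
    return [
        (record["step"], record["eval_loss"])
        for record in state["log_history"]
        if "eval_loss" in record
    ]


spe = steps_per_epoch(NEW_STATE)
implied_epochs = NEW_STATE["max_steps"] / spe
flos_ratio = OLD_TOTAL_FLOS / NEW_STATE["total_flos"]

print("steps per epoch:", spe)
print("epochs implied by max_steps:", implied_epochs)
print("old/new total_flos ratio:", flos_ratio)
print("eval losses:", eval_losses(NEW_STATE))
```

Under these inputs the step-to-epoch ratio agrees with `num_train_epochs`, and the `total_flos` ratio matches the 200-to-20 step ratio, which suggests the two checkpoints come from the same training configuration at different points.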

last-checkpoint/training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3156bde561d7a483929e0f1d8c097a973dfeb26f4690b823508131f70e6df615
 size 6840

 version https://git-lfs.github.com/spec/v1
+oid sha256:2d4c566c5121c40bd8158f38c2b438572e5e9a4a274e5b8c6fc9ce3ebb93d224
 size 6840
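
`training_args.bin` is a binary file, so the commit does not show the optimizer settings directly. The learning rates in the removed `trainer_state.json` entries are, however, consistent with a linear warmup followed by cosine decay to zero. The sketch below checks that reading against a few of the logged values. The peak rate of 2e-4 and the 10 warmup steps are inferred from the numbers, not stated anywhere in the commit, and `cosine_lr` is a hypothetical helper rather than the scheduler the run actually used.

```python
import math

PEAK_LR = 2e-4     # assumed: inferred from the logged values
WARMUP_STEPS = 10  # assumed: inferred from the logged values
MAX_STEPS = 200    # "max_steps" in trainer_state.json


def cosine_lr(step):
    """Linear warmup to PEAK_LR, then cosine decay to zero at MAX_STEPS."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# step -> learning_rate, copied from the removed log entries.
LOGGED = {
    30: 0.00019458172417006347,
    50: 0.00017891405093963938,
    100: 0.00010825793454723325,
    110: 9.174206545276677e-05,
    190: 1.3638696597277679e-06,
    200: 0.0,
}

for step, logged in LOGGED.items():
    predicted = cosine_lr(step)
    agrees = math.isclose(predicted, logged, rel_tol=1e-6, abs_tol=1e-12)
    print(step, logged, predicted, agrees)
```

A match on these points supports the inferred schedule but does not rule out other settings that happen to produce the same curve.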