SystemAdmin123 committed on Feb 4, 2025

Commit 616f3f5 (verified) · 1 parent: edca1cb

Training in progress, step 2000, checkpoint

Files changed (5)

last-checkpoint/model.safetensors +1 -1
last-checkpoint/optimizer.pt +1 -1
last-checkpoint/rng_state.pth +1 -1
last-checkpoint/scheduler.pt +1 -1
last-checkpoint/trainer_state.json +299 -3
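
Four of the five files listed above (model.safetensors, optimizer.pt, rng_state.pth, scheduler.pt) are stored through Git LFS, so their diffs show a three-line pointer (version, oid, size) rather than the binary contents. As a rough sketch, a downloaded file could be checked against its pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are illustrative and not part of any library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Check a local file against the pointer's size and SHA-256 digest."""
    if path.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in 1 MiB chunks so large checkpoints need not fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# New pointer for last-checkpoint/scheduler.pt, copied from this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c88b3aeb8ec2bf995149291b90b69667d3f268ff2f13afbeab1a220b8cc27590\n"
    "size 1064\n"
)
print(pointer["algo"], pointer["size"])
```

With a local copy of the checkpoint, `matches_pointer(Path("last-checkpoint/scheduler.pt"), pointer)` would report whether the file on disk is the one this commit refers to.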

last-checkpoint/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:29a427d6545dc0e8ef1d0145e8423f41bb3a50913f2d1b921e04445da575506d
+oid sha256:65a553bd7056176e2288025437ccdeff8d47868eefca4ade3457c24cca308444
 size 723674912

last-checkpoint/optimizer.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a4b78a335a1b8fefe3ad448d5bcce0a3fe0dc366c6e8b16bb8f35e1d94b22307
+oid sha256:1bfbfd154cb90e557e9d1d28cf0a383f1d2890c6a73e5538416e046c8ee482fe
 size 735625626

last-checkpoint/rng_state.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ced0ac0d077b41bd2987add3782b7ce1140142ac3cddaf433babda96674c50fb
+oid sha256:19ab3d6cfcb43de67f16e412d0cb4f86309db602f8242d16f2b203a0212d6cbb
 size 14244

last-checkpoint/scheduler.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9ed6aad8025a80b776f2d50234fd05b8c1e2e758d3d427458fe15ed9bc7f733a
+oid sha256:c88b3aeb8ec2bf995149291b90b69667d3f268ff2f13afbeab1a220b8cc27590
 size 1064
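
The trainer_state.json diff that follows appends the training log entries for steps 1610 to 2000 and the evaluation entries for steps 1800 and 2000. A minimal sketch for summarizing such a file is given here, assuming the entries live under the `log_history` key that the Hugging Face Trainer normally writes (the key itself falls outside the visible hunks). The inline JSON is a short excerpt of values copied from this commit, and `split_log_history` is a hypothetical helper.

```python
import json


def split_log_history(state: dict):
    """Separate training-loss entries from evaluation entries."""
    train = [e for e in state["log_history"] if "loss" in e]
    evals = [e for e in state["log_history"] if "eval_loss" in e]
    return train, evals


# Excerpt copied from this commit's trainer_state.json; the real file holds
# every logged step. The "log_history" key name is an assumption based on
# the usual Trainer layout.
state = json.loads("""
{
  "epoch": 0.5918910920390648,
  "global_step": 2000,
  "log_history": [
    {"epoch": 0.5327019828351583, "grad_norm": 38.25,
     "learning_rate": 3.9891885271697496e-05, "loss": 2.149, "step": 1800},
    {"epoch": 0.5327019828351583, "eval_loss": 2.1312239170074463, "step": 1800},
    {"epoch": 0.5918910920390648, "grad_norm": 18.375,
     "learning_rate": 2.1085949060360654e-05, "loss": 2.7263, "step": 2000},
    {"epoch": 0.5918910920390648, "eval_loss": 2.1315317153930664, "step": 2000}
  ]
}
""")

train, evals = split_log_history(state)
for e in evals:
    print(f"step {e['step']}: eval_loss={e['eval_loss']:.4f}")

# Steps per epoch implied by the header fields (global_step / epoch).
steps_per_epoch = round(state["global_step"] / state["epoch"])
print("steps per epoch (approx):", steps_per_epoch)
```

To run this against the full checkpoint, replace the inline excerpt with `json.load(open("last-checkpoint/trainer_state.json"))`. In the values recorded by this commit the evaluation loss is essentially unchanged between steps 1800 and 2000 (2.1312 vs 2.1315).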

last-checkpoint/trainer_state.json CHANGED

@@ -1,9 +1,9 @@
 {
   "best_metric": null,
   "best_model_checkpoint": null,
-  "epoch": 0.47351287363125183,
+  "epoch": 0.5918910920390648,
   "eval_steps": 200,
-  "global_step": 1600,
+  "global_step": 2000,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
@@ -1199,6 +1199,302 @@
       "eval_samples_per_second": 38.107,
       "eval_steps_per_second": 9.539,
       "step": 1600
+    },
+    {
+      "epoch": 0.47647232909144716,
+      "grad_norm": 1.96875,
+      "learning_rate": 6.165528950410884e-05,
+      "loss": 2.1656,
+      "step": 1610
+    },
+    {
+      "epoch": 0.4794317845516425,
+      "grad_norm": 3.1875,
+      "learning_rate": 6.0437013114095195e-05,
+      "loss": 2.2137,
+      "step": 1620
+    },
+    {
+      "epoch": 0.4823912400118378,
+      "grad_norm": 3.71875,
+      "learning_rate": 5.922565910122967e-05,
+      "loss": 2.253,
+      "step": 1630
+    },
+    {
+      "epoch": 0.48535069547203313,
+      "grad_norm": 7.21875,
+      "learning_rate": 5.8021439417389444e-05,
+      "loss": 2.077,
+      "step": 1640
+    },
+    {
+      "epoch": 0.48831015093222846,
+      "grad_norm": 18.375,
+      "learning_rate": 5.6824564766150726e-05,
+      "loss": 1.8215,
+      "step": 1650
+    },
+    {
+      "epoch": 0.4912696063924238,
+      "grad_norm": 1.953125,
+      "learning_rate": 5.563524456592163e-05,
+      "loss": 2.2989,
+      "step": 1660
+    },
+    {
+      "epoch": 0.4942290618526191,
+      "grad_norm": 3.0625,
+      "learning_rate": 5.4453686913300074e-05,
+      "loss": 2.1334,
+      "step": 1670
+    },
+    {
+      "epoch": 0.49718851731281444,
+      "grad_norm": 4.125,
+      "learning_rate": 5.328009854666303e-05,
+      "loss": 2.3074,
+      "step": 1680
+    },
+    {
+      "epoch": 0.5001479727730098,
+      "grad_norm": 4.3125,
+      "learning_rate": 5.2114684809993044e-05,
+      "loss": 2.1466,
+      "step": 1690
+    },
+    {
+      "epoch": 0.5031074282332051,
+      "grad_norm": 21.375,
+      "learning_rate": 5.095764961694922e-05,
+      "loss": 1.9702,
+      "step": 1700
+    },
+    {
+      "epoch": 0.5060668836934004,
+      "grad_norm": 2.671875,
+      "learning_rate": 4.980919541518796e-05,
+      "loss": 2.3311,
+      "step": 1710
+    },
+    {
+      "epoch": 0.5090263391535957,
+      "grad_norm": 3.546875,
+      "learning_rate": 4.866952315094088e-05,
+      "loss": 2.1625,
+      "step": 1720
+    },
+    {
+      "epoch": 0.511985794613791,
+      "grad_norm": 4.9375,
+      "learning_rate": 4.753883223385467e-05,
+      "loss": 2.1047,
+      "step": 1730
+    },
+    {
+      "epoch": 0.5149452500739864,
+      "grad_norm": 5.40625,
+      "learning_rate": 4.6417320502100316e-05,
+      "loss": 1.9118,
+      "step": 1740
+    },
+    {
+      "epoch": 0.5179047055341817,
+      "grad_norm": 22.5,
+      "learning_rate": 4.530518418775733e-05,
+      "loss": 2.3183,
+      "step": 1750
+    },
+    {
+      "epoch": 0.520864160994377,
+      "grad_norm": 2.09375,
+      "learning_rate": 4.4202617882478405e-05,
+      "loss": 2.3082,
+      "step": 1760
+    },
+    {
+      "epoch": 0.5238236164545723,
+      "grad_norm": 2.46875,
+      "learning_rate": 4.310981450344189e-05,
+      "loss": 2.1393,
+      "step": 1770
+    },
+    {
+      "epoch": 0.5267830719147677,
+      "grad_norm": 3.109375,
+      "learning_rate": 4.2026965259596666e-05,
+      "loss": 2.0223,
+      "step": 1780
+    },
+    {
+      "epoch": 0.529742527374963,
+      "grad_norm": 7.875,
+      "learning_rate": 4.0954259618206295e-05,
+      "loss": 1.9122,
+      "step": 1790
+    },
+    {
+      "epoch": 0.5327019828351583,
+      "grad_norm": 38.25,
+      "learning_rate": 3.9891885271697496e-05,
+      "loss": 2.149,
+      "step": 1800
+    },
+    {
+      "epoch": 0.5327019828351583,
+      "eval_loss": 2.1312239170074463,
+      "eval_runtime": 37.8217,
+      "eval_samples_per_second": 39.713,
+      "eval_steps_per_second": 9.941,
+      "step": 1800
+    },
+    {
+      "epoch": 0.5356614382953536,
+      "grad_norm": 1.7890625,
+      "learning_rate": 3.884002810481958e-05,
+      "loss": 2.3824,
+      "step": 1810
+    },
+    {
+      "epoch": 0.538620893755549,
+      "grad_norm": 2.859375,
+      "learning_rate": 3.779887216211995e-05,
+      "loss": 2.1831,
+      "step": 1820
+    },
+    {
+      "epoch": 0.5415803492157443,
+      "grad_norm": 4.25,
+      "learning_rate": 3.676859961574162e-05,
+      "loss": 2.0926,
+      "step": 1830
+    },
+    {
+      "epoch": 0.5445398046759397,
+      "grad_norm": 5.4375,
+      "learning_rate": 3.574939073354838e-05,
+      "loss": 2.107,
+      "step": 1840
+    },
+    {
+      "epoch": 0.5474992601361349,
+      "grad_norm": 22.75,
+      "learning_rate": 3.4741423847583134e-05,
+      "loss": 2.1866,
+      "step": 1850
+    },
+    {
+      "epoch": 0.5504587155963303,
+      "grad_norm": 2.046875,
+      "learning_rate": 3.3744875322865034e-05,
+      "loss": 2.3345,
+      "step": 1860
+    },
+    {
+      "epoch": 0.5534181710565256,
+      "grad_norm": 2.734375,
+      "learning_rate": 3.275991952653054e-05,
+      "loss": 2.2518,
+      "step": 1870
+    },
+    {
+      "epoch": 0.556377626516721,
+      "grad_norm": 6.0,
+      "learning_rate": 3.178672879732435e-05,
+      "loss": 2.0092,
+      "step": 1880
+    },
+    {
+      "epoch": 0.5593370819769162,
+      "grad_norm": 7.0625,
+      "learning_rate": 3.0825473415445074e-05,
+      "loss": 1.7693,
+      "step": 1890
+    },
+    {
+      "epoch": 0.5622965374371116,
+      "grad_norm": 34.25,
+      "learning_rate": 2.9876321572751144e-05,
+      "loss": 2.3584,
+      "step": 1900
+    },
+    {
+      "epoch": 0.5652559928973069,
+      "grad_norm": 1.8828125,
+      "learning_rate": 2.8939439343332086e-05,
+      "loss": 2.4875,
+      "step": 1910
+    },
+    {
+      "epoch": 0.5682154483575023,
+      "grad_norm": 2.625,
+      "learning_rate": 2.8014990654450325e-05,
+      "loss": 2.3285,
+      "step": 1920
+    },
+    {
+      "epoch": 0.5711749038176975,
+      "grad_norm": 3.5,
+      "learning_rate": 2.7103137257858868e-05,
+      "loss": 2.2535,
+      "step": 1930
+    },
+    {
+      "epoch": 0.5741343592778929,
+      "grad_norm": 5.875,
+      "learning_rate": 2.6204038701499056e-05,
+      "loss": 2.0563,
+      "step": 1940
+    },
+    {
+      "epoch": 0.5770938147380882,
+      "grad_norm": 16.5,
+      "learning_rate": 2.5317852301584643e-05,
+      "loss": 1.8811,
+      "step": 1950
+    },
+    {
+      "epoch": 0.5800532701982836,
+      "grad_norm": 1.796875,
+      "learning_rate": 2.4444733115075823e-05,
+      "loss": 2.3993,
+      "step": 1960
+    },
+    {
+      "epoch": 0.5830127256584788,
+      "grad_norm": 2.359375,
+      "learning_rate": 2.3584833912548888e-05,
+      "loss": 2.1668,
+      "step": 1970
+    },
+    {
+      "epoch": 0.5859721811186742,
+      "grad_norm": 3.71875,
+      "learning_rate": 2.2738305151465645e-05,
+      "loss": 2.1702,
+      "step": 1980
+    },
+    {
+      "epoch": 0.5889316365788695,
+      "grad_norm": 6.53125,
+      "learning_rate": 2.190529494984782e-05,
+      "loss": 2.2042,
+      "step": 1990
+    },
+    {
+      "epoch": 0.5918910920390648,
+      "grad_norm": 18.375,
+      "learning_rate": 2.1085949060360654e-05,
+      "loss": 2.7263,
+      "step": 2000
+    },
+    {
+      "epoch": 0.5918910920390648,
+      "eval_loss": 2.1315317153930664,
+      "eval_runtime": 38.3028,
+      "eval_samples_per_second": 39.214,
+      "eval_steps_per_second": 9.817,
+      "step": 2000
     }
   ],
   "logging_steps": 10,
@@ -1218,7 +1514,7 @@
       "attributes": {}
     }
   },
-  "total_flos": 2.47748488593408e+16,
+  "total_flos": 3.09608285995008e+16,
   "train_batch_size": 4,
   "trial_name": null,
   "trial_params": null