Thunderbolts123 committed 7 days ago

Commit 47ad5fd (verified) · 1 Parent(s): 91e4de1

Training in progress, step 200, checkpoint

Files changed (7)

last-checkpoint/adapter_config.json +4 -4
last-checkpoint/adapter_model.safetensors +1 -1
last-checkpoint/optimizer.pt +1 -1
last-checkpoint/scaler.pt +1 -1
last-checkpoint/scheduler.pt +1 -1
last-checkpoint/trainer_state.json +147 -6
last-checkpoint/training_args.bin +1 -1

last-checkpoint/adapter_config.json CHANGED

@@ -34,13 +34,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q_proj",
-    "gate_proj",
+    "up_proj",
     "v_proj",
-    "o_proj",
     "down_proj",
+    "gate_proj",
     "k_proj",
-    "up_proj"
+    "q_proj",
+    "o_proj"
   ],
   "target_parameters": null,
   "task_type": "CAUSAL_LM",
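
The `target_modules` hunk only reorders entries; no module is added or dropped. A minimal Python sketch of that check follows, with both lists copied from the diff (the variable names are illustrative). A reordering like this is likely a side effect of the list being serialized from an unordered set, though the commit itself does not say so.

```python
# Module lists copied from the two sides of the adapter_config.json diff.
old_modules = ["q_proj", "gate_proj", "v_proj", "o_proj", "down_proj", "k_proj", "up_proj"]
new_modules = ["up_proj", "v_proj", "down_proj", "gate_proj", "k_proj", "q_proj", "o_proj"]

# Compare as sets so that ordering is ignored.
added = sorted(set(new_modules) - set(old_modules))
removed = sorted(set(old_modules) - set(new_modules))
same_modules = not added and not removed

print(f"added={added} removed={removed} same_modules={same_modules}")
```

Both difference lists come out empty, so the adapter still targets the same seven projection layers.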

last-checkpoint/adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c072b8006de91fd16554375d116f410546a24756b76486e7ff3f6165d5cd1c01
+oid sha256:b32b5959735aa57b511b7726bfbc2ef5de45ff1ebf9d62c5f60891e02815697b
 size 479005064
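
The binary files in this commit appear only as Git LFS pointers: a spec version, a SHA-256 object ID, and a byte size. The size is unchanged while the hash differs, which is what a retrained adapter of the same shape would produce. The sketch below parses such a pointer and checks a downloaded file against it; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in 1 MiB chunks so large checkpoints are not read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# New-side pointer for adapter_model.safetensors, copied from the diff.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b32b5959735aa57b511b7726bfbc2ef5de45ff1ebf9d62c5f60891e02815697b\n"
    "size 479005064\n"
)
print(pointer["algo"], pointer["size"])
```

After downloading the checkpoint, a call such as `matches_pointer(Path("last-checkpoint/adapter_model.safetensors"), pointer)` would confirm that the local file is the one this commit refers to.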

last-checkpoint/optimizer.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:37ee5112fd5a35910b9a64c3a2c317dfb07eb43033e8df25ee5003f288101d96
+oid sha256:c7bb1fceeac45ca2476246b3d206978cdd4cb987a38fb6a0552e774331666a57
 size 243807941

last-checkpoint/scaler.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:25e591b6ccdc9dcb49a29bd97e2c898e0e2dc4799b75694557b3955730633d8b
+oid sha256:317b5a305b2a9e21e527e7f85fdb3c6126a0ca02234bcb93021996746c86138a
 size 1383

last-checkpoint/scheduler.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:90daa5cc6fd25d912a2841b492510679aba1d1fd92344153762534657395f260
+oid sha256:b8bf6871cccebbd8019e51a8751deebfdc1a27237b371091ed859a0e2e1ce5c9
 size 1465
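
`scheduler.pt` is a binary file, so its diff reveals nothing about the schedule itself. The learning rates logged in `trainer_state.json` in this same commit do fit a familiar shape, though: linear warmup followed by cosine decay. The sketch below checks that fit under assumptions that are inferred from the logged numbers and not stated anywhere in the commit: a peak rate of 2e-4, 50 warmup steps, decay over the 1000 `max_steps`, and a scheduler that falls two updates further behind the step counter between steps 40 and 50. One possible cause of that lag is skipped optimizer steps under mixed precision, but the commit does not confirm this.

```python
import math

PEAK_LR = 2e-4        # assumption: inferred from the log, not stated in the commit
WARMUP_STEPS = 50     # assumption: inferred from the log
TOTAL_STEPS = 1000    # matches "max_steps" in trainer_state.json


def cosine_with_warmup(scheduler_step: int) -> float:
    """Linear warmup to PEAK_LR, then half-cosine decay towards zero."""
    if scheduler_step < WARMUP_STEPS:
        return PEAK_LR * scheduler_step / WARMUP_STEPS
    progress = (scheduler_step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


def scheduler_step_at(logged_step: int) -> int:
    """Scheduler updates assumed to have been applied when a step was logged."""
    # Assumed lag: one update up to step 40, three updates from step 50 onward.
    lag = 1 if logged_step <= 40 else 3
    return logged_step - lag


# (step, learning_rate) pairs copied from log_history in the diff.
logged = [(10, 3.6e-05), (20, 7.6e-05), (40, 0.00015600000000000002),
          (50, 0.000188), (60, 0.0001999732083645129),
          (100, 0.00019879456499925614), (200, 0.00018841521819342236)]

max_rel_err = max(abs(cosine_with_warmup(scheduler_step_at(s)) - lr) / lr
                  for s, lr in logged)
print(f"max relative error: {max_rel_err:.2e}")
```

A small relative error here only shows that the logged values are consistent with this schedule; the actual configuration lives in `training_args.bin`, which the diff does not expose.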

last-checkpoint/trainer_state.json CHANGED

@@ -2,15 +2,156 @@
   "best_global_step": null,
   "best_metric": null,
   "best_model_checkpoint": null,
-  "epoch": 0.0008,
+  "epoch": 0.16,
   "eval_steps": 500,
-  "global_step": 1,
+  "global_step": 200,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
-  "log_history": [],
+  "log_history": [
+    {
+      "epoch": 0.008,
+      "grad_norm": 0.800916850566864,
+      "learning_rate": 3.6e-05,
+      "loss": 1.431671142578125,
+      "step": 10
+    },
+    {
+      "epoch": 0.016,
+      "grad_norm": 0.4533734917640686,
+      "learning_rate": 7.6e-05,
+      "loss": 1.1412681579589843,
+      "step": 20
+    },
+    {
+      "epoch": 0.024,
+      "grad_norm": 0.4183805584907532,
+      "learning_rate": 0.000116,
+      "loss": 1.0069389343261719,
+      "step": 30
+    },
+    {
+      "epoch": 0.032,
+      "grad_norm": 2.1650190353393555,
+      "learning_rate": 0.00015600000000000002,
+      "loss": 0.9767860412597656,
+      "step": 40
+    },
+    {
+      "epoch": 0.04,
+      "grad_norm": 3.925295352935791,
+      "learning_rate": 0.000188,
+      "loss": 0.9355104446411133,
+      "step": 50
+    },
+    {
+      "epoch": 0.048,
+      "grad_norm": 0.5158158540725708,
+      "learning_rate": 0.0001999732083645129,
+      "loss": 0.9155762672424317,
+      "step": 60
+    },
+    {
+      "epoch": 0.056,
+      "grad_norm": 0.3576229214668274,
+      "learning_rate": 0.00019984201858549693,
+      "loss": 0.8876156806945801,
+      "step": 70
+    },
+    {
+      "epoch": 0.064,
+      "grad_norm": 0.45568719506263733,
+      "learning_rate": 0.0001996016530250235,
+      "loss": 0.9064264297485352,
+      "step": 80
+    },
+    {
+      "epoch": 0.072,
+      "grad_norm": 0.35059407353401184,
+      "learning_rate": 0.0001992523745193039,
+      "loss": 0.9449616432189941,
+      "step": 90
+    },
+    {
+      "epoch": 0.08,
+      "grad_norm": 0.3109589219093323,
+      "learning_rate": 0.00019879456499925614,
+      "loss": 0.9112279891967774,
+      "step": 100
+    },
+    {
+      "epoch": 0.088,
+      "grad_norm": 0.3478052616119385,
+      "learning_rate": 0.0001982287250728689,
+      "loss": 0.9097712516784668,
+      "step": 110
+    },
+    {
+      "epoch": 0.096,
+      "grad_norm": 0.3412795960903168,
+      "learning_rate": 0.00019755547347779403,
+      "loss": 0.9231362342834473,
+      "step": 120
+    },
+    {
+      "epoch": 0.104,
+      "grad_norm": 0.75782710313797,
+      "learning_rate": 0.00019677554640476624,
+      "loss": 0.9049114227294922,
+      "step": 130
+    },
+    {
+      "epoch": 0.112,
+      "grad_norm": 0.35266366600990295,
+      "learning_rate": 0.0001958897966925891,
+      "loss": 0.8955144882202148,
+      "step": 140
+    },
+    {
+      "epoch": 0.12,
+      "grad_norm": 0.3340380787849426,
+      "learning_rate": 0.00019489919289556845,
+      "loss": 0.9052764892578125,
+      "step": 150
+    },
+    {
+      "epoch": 0.128,
+      "grad_norm": 0.3489997982978821,
+      "learning_rate": 0.00019380481822441235,
+      "loss": 0.918581199645996,
+      "step": 160
+    },
+    {
+      "epoch": 0.136,
+      "grad_norm": 0.3366387188434601,
+      "learning_rate": 0.00019260786936175635,
+      "loss": 0.8691808700561523,
+      "step": 170
+    },
+    {
+      "epoch": 0.144,
+      "grad_norm": 0.35365116596221924,
+      "learning_rate": 0.0001913096551536083,
+      "loss": 0.9018807411193848,
+      "step": 180
+    },
+    {
+      "epoch": 0.152,
+      "grad_norm": 0.3102000653743744,
+      "learning_rate": 0.0001899115951781446,
+      "loss": 0.8774255752563477,
+      "step": 190
+    },
+    {
+      "epoch": 0.16,
+      "grad_norm": 0.41139864921569824,
+      "learning_rate": 0.00018841521819342236,
+      "loss": 0.8466087341308594,
+      "step": 200
+    }
+  ],
   "logging_steps": 10,
-  "max_steps": 1,
+  "max_steps": 1000,
   "num_input_tokens_seen": 0,
   "num_train_epochs": 1,
   "save_steps": 200,
@@ -21,12 +162,12 @@
         "should_evaluate": false,
         "should_log": false,
         "should_save": true,
-        "should_training_stop": true
+        "should_training_stop": false
       },
       "attributes": {}
     }
   },
-  "total_flos": 112399535898624.0,
+  "total_flos": 2.8420601253617664e+16,
   "train_batch_size": 1,
   "trial_name": null,
   "trial_params": null
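
The new `trainer_state.json` values are enough for a rough progress estimate. The sketch below copies them from the diff and assumes that `epoch` grows in proportion to `global_step`, which the old values (epoch 0.0008 at step 1) and the new ones (epoch 0.16 at step 200) both agree with.

```python
# Values copied from the new side of the trainer_state.json diff.
epoch = 0.16
global_step = 200
max_steps = 1000
first_loss, last_loss = 1.431671142578125, 0.8466087341308594  # steps 10 and 200

steps_per_epoch = round(global_step / epoch)   # optimizer steps in one full epoch
planned_epochs = max_steps / steps_per_epoch   # epochs covered if the run reaches max_steps
fraction_done = global_step / max_steps        # share of the planned steps completed
loss_drop = first_loss - last_loss             # change in logged training loss

print(f"steps per epoch: {steps_per_epoch}")
print(f"planned epochs:  {planned_epochs:.2f}")
print(f"fraction done:   {fraction_done:.0%}")
print(f"loss drop:       {loss_drop:.3f}")
```

Under that assumption one epoch is 1250 steps, so the 1000-step run would stop at 0.8 of an epoch, and this checkpoint sits at 20% of the planned steps.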

last-checkpoint/training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f64590a7b7803a555a01a39aba443c022199cbe4883538827236b8875f588e15
+oid sha256:f6eed1a57e558818ad765c74a1b5250d3193d3f5a9f2c99695dab37df84671b2
 size 5649