Sim4Rec/inter-play-sim-assistant-sft
jeromeramos committed on Feb 3, 2025

Commit a99133b (verified) · 1 Parent(s): 27e0090

Model save

Files changed (14)

README.md +1 -1
all_results.json +5 -5
model-00001-of-00004.safetensors +1 -1
model-00002-of-00004.safetensors +1 -1
model-00003-of-00004.safetensors +1 -1
model-00004-of-00004.safetensors +1 -1
runs/Feb03_13-28-29_w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-84rg227/events.out.tfevents.1738589574.w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-84rg227.97164.0 +3 -0
runs/Feb03_19-31-47_w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-6ckztwz/events.out.tfevents.1738611496.w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-6ckztwz.6260.0 +3 -0
special_tokens_map.json +37 -0
tokenizer.json +2 -2
tokenizer_config.json +47 -0
train_results.json +5 -5
trainer_state.json +156 -156
training_args.bin +1 -1

README.md CHANGED

@@ -27,7 +27,7 @@ print(output["generated_text"])
 ## Training procedure
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/jerome-ramos-20/huggingface/runs/rdaw49f9)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/jerome-ramos-20/huggingface/runs/qm0bt6vo)
 This model was trained with SFT.

all_results.json CHANGED

@@ -5,10 +5,10 @@
     "eval_samples": 2071,
     "eval_samples_per_second": 86.635,
     "eval_steps_per_second": 2.725,
-    "total_flos": 1.74045731487744e+18,
-    "train_loss": 0.8234024724801822,
-    "train_runtime": 2385.6161,
+    "total_flos": 1.7115790489220547e+18,
+    "train_loss": 0.861595592175164,
+    "train_runtime": 2353.9448,
     "train_samples": 46269,
-    "train_samples_per_second": 19.395,
-    "train_steps_per_second": 0.151
+    "train_samples_per_second": 19.656,
+    "train_steps_per_second": 0.153
 }
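
As a quick sanity check on the figures in this hunk, the throughput values can be recomputed from `train_samples` and `train_runtime`. The sketch below is not part of the commit. The step count of 361 is taken from the final `"step": 361` entry in `trainer_state.json`, and the assumption that throughput is simply samples (or steps) divided by runtime, rounded to three decimals, is inferred from the numbers rather than confirmed by the source.

```python
from typing import Tuple

# Values copied from the old and new sides of the all_results.json hunk.
TRAIN_SAMPLES = 46269
TOTAL_STEPS = 361  # assumption: final "step" value logged in trainer_state.json

RUNTIMES = {
    "old": 2385.6161,  # seconds, removed side
    "new": 2353.9448,  # seconds, added side
}


def recompute_throughput(train_runtime: float) -> Tuple[float, float]:
    """Return (samples_per_second, steps_per_second), rounded to 3 decimals."""
    samples_per_second = round(TRAIN_SAMPLES / train_runtime, 3)
    steps_per_second = round(TOTAL_STEPS / train_runtime, 3)
    return samples_per_second, steps_per_second


for side, runtime in RUNTIMES.items():
    samples_ps, steps_ps = recompute_throughput(runtime)
    print(f"{side}: {samples_ps} samples/s, {steps_ps} steps/s")
```

Under these assumptions the recomputed values agree with the reported `train_samples_per_second` and `train_steps_per_second` on both sides of the diff, so the change in throughput follows from the shorter runtime alone.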

model-00001-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:96a3f25cdf50508cedb9141a645a1b95248e26d20ef5fe3d2de30857075f9ee2
+oid sha256:5f3646a85996025cdaed773e7201ee3e3320349d66731b2b77492ae1a5d14add
 size 4977222960

model-00002-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:675a20a1c8cb7ef8957fa7e3549f80d43b42e7bb023aa7c1a6c3b159e495bc67
+oid sha256:6f2743e048959efde7f2379dd20a4fe9079ab98f6b125b32edd2f0c912d96d3e
 size 4999802720

model-00003-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bc06566be8c2f403026d162424f82153270a6a0d04b0b40e6e14ad4c2ea5332c
+oid sha256:ca317cebfba4e27a3e92f9eb2fd21f5695e0e5a514d71d78f5f2e240dd728ae2
 size 4915916176

model-00004-of-00004.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2986a57de29725fc6544bc073fd543bfb7c0fe517bc6ef006cfebc4c12bbb8e5
+oid sha256:7e888911d0c982b08e2ac6b084e7de4a4d500111b730a8ef9c8c43b1c4e83ad2
 size 1168663096
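
Each safetensors hunk above changes only the `oid` line of a Git LFS pointer file: the shard contents were replaced while every shard kept its exact byte size. A minimal sketch of reading these pointers follows. The helper name `parse_lfs_pointer` is hypothetical, and the parser assumes only the three `key value` lines that appear in this diff.

```python
from typing import Any, Dict, List

# New-side pointer files, copied from the four model-*.safetensors hunks.
NEW_POINTERS: List[str] = [
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:5f3646a85996025cdaed773e7201ee3e3320349d66731b2b77492ae1a5d14add\n"
    "size 4977222960\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:6f2743e048959efde7f2379dd20a4fe9079ab98f6b125b32edd2f0c912d96d3e\n"
    "size 4999802720\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:ca317cebfba4e27a3e92f9eb2fd21f5695e0e5a514d71d78f5f2e240dd728ae2\n"
    "size 4915916176\n",
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:7e888911d0c982b08e2ac6b084e7de4a4d500111b730a8ef9c8c43b1c4e83ad2\n"
    "size 1168663096\n",
]


def parse_lfs_pointer(text: str) -> Dict[str, Any]:
    """Split a pointer file into version URL, hash algorithm, digest, and size."""
    # Every pointer line has the form "<key> <value>".
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


parsed = [parse_lfs_pointer(pointer) for pointer in NEW_POINTERS]
total_bytes = sum(entry["size"] for entry in parsed)
print(f"{len(parsed)} shards, {total_bytes} bytes ({total_bytes / 1e9:.2f} GB)")
```

Summing the four `size` fields gives the total checkpoint size on disk, which this commit leaves unchanged.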

runs/Feb03_13-28-29_w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-84rg227/events.out.tfevents.1738589574.w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-84rg227.97164.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:12bf47e94d4936149802c609d62553a93fc9ac6a80eaa55ea0dd4c4a86390310
+size 18230

runs/Feb03_19-31-47_w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-6ckztwz/events.out.tfevents.1738611496.w-jerom-inter-play-sim-94c6890b9ccf44ea86f033a3db8a5dbd-6ckztwz.6260.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:598f7bfb6190b6462598690862205d1d99f7df666c9d5b8a3087f43e28244b69
+size 21809

special_tokens_map.json CHANGED

@@ -1,4 +1,41 @@
 {
+  "additional_special_tokens": [
+    {
+      "content": "<response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    },
+    {
+      "content": "</response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    },
+    {
+      "content": "<answer>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    },
+    {
+      "content": "</answer>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    },
+    {
+      "content": "<inquire>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false
+    }
+  ],
   "bos_token": {
     "content": "<|im_start|>",
     "lstrip": false,

tokenizer.json CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:635e16753749bb3465bdf9e00f68e8b29c9e4884d9ee55eb27705bd8f1318cf4
-size 17210395
+oid sha256:3919c1e7bfa558ff525a618a3d463929a238acaba668d7ef6da432fcd6cd7fad
+size 17211327

tokenizer_config.json CHANGED

@@ -2063,8 +2063,55 @@
       "rstrip": false,
       "single_word": false,
       "special": true
+    },
+    "128258": {
+      "content": "<response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128259": {
+      "content": "</response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128260": {
+      "content": "<answer>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128261": {
+      "content": "</answer>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "128262": {
+      "content": "<inquire>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
     }
   },
+  "additional_special_tokens": [
+    "<response>",
+    "</response>",
+    "<answer>",
+    "</answer>",
+    "<inquire>"
+  ],
   "bos_token": "<|im_start|>",
   "chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
   "clean_up_tokenization_spaces": true,
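
The `chat_template` shown as context in this hunk is unchanged by the commit. To make its output concrete, here is a plain-Python sketch of what the Jinja template renders. The function name `render_chat` and the sample message are illustrative assumptions. The real tokenizer applies the Jinja string itself, so this is only an equivalent reading of it.

```python
from typing import Dict, List


def render_chat(messages: List[Dict[str, str]], add_generation_prompt: bool = False) -> str:
    """Mirror the template: wrap each message as <|im_start|>role\\ncontent<|im_end|>\\n."""
    parts = []
    for message in messages:
        parts.append(
            "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>" + "\n"
        )
    # The template defaults add_generation_prompt to false when it is undefined.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Hypothetical single-turn conversation.
prompt = render_chat([{"role": "user", "content": "Hello"}], add_generation_prompt=True)
print(repr(prompt))
```

The five tokens added in this hunk (`<response>`, `</response>`, `<answer>`, `</answer>`, `<inquire>`) do not appear in the template. It only wraps whatever `content` each message carries.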

train_results.json CHANGED

@@ -1,9 +1,9 @@
 {
     "epoch": 0.9986168741355463,
-    "total_flos": 1.74045731487744e+18,
-    "train_loss": 0.8234024724801822,
-    "train_runtime": 2385.6161,
+    "total_flos": 1.7115790489220547e+18,
+    "train_loss": 0.861595592175164,
+    "train_runtime": 2353.9448,
     "train_samples": 46269,
-    "train_samples_per_second": 19.395,
-    "train_steps_per_second": 0.151
+    "train_samples_per_second": 19.656,
+    "train_steps_per_second": 0.153
 }

trainer_state.json CHANGED

@@ -10,531 +10,531 @@
   "log_history": [
     {
       "epoch": 0.0027662517289073307,
-      "grad_norm": 22.881450653076172,
       "learning_rate": 5.405405405405406e-06,
-      "loss": 1.6158,
       "step": 1
     },
     {
       "epoch": 0.013831258644536652,
-      "grad_norm": 2.178889274597168,
       "learning_rate": 2.702702702702703e-05,
-      "loss": 1.3807,
       "step": 5
     },
     {
       "epoch": 0.027662517289073305,
-      "grad_norm": 14.400589942932129,
       "learning_rate": 5.405405405405406e-05,
-      "loss": 1.3352,
       "step": 10
     },
     {
       "epoch": 0.04149377593360996,
-      "grad_norm": 2.756945848464966,
       "learning_rate": 8.108108108108109e-05,
-      "loss": 1.2203,
       "step": 15
     },
     {
       "epoch": 0.05532503457814661,
-      "grad_norm": 1.3922957181930542,
       "learning_rate": 0.00010810810810810812,
-      "loss": 1.0964,
       "step": 20
     },
     {
       "epoch": 0.06915629322268327,
-      "grad_norm": 1.0261996984481812,
       "learning_rate": 0.00013513513513513514,
-      "loss": 1.2033,
       "step": 25
     },
     {
       "epoch": 0.08298755186721991,
-      "grad_norm": 1.6099579334259033,
       "learning_rate": 0.00016216216216216218,
-      "loss": 1.2005,
       "step": 30
     },
     {
       "epoch": 0.09681881051175657,
-      "grad_norm": 1.77192223072052,
       "learning_rate": 0.0001891891891891892,
-      "loss": 1.4161,
       "step": 35
     },
     {
       "epoch": 0.11065006915629322,
-      "grad_norm": 1.0772837400436401,
       "learning_rate": 0.0001999576950082201,
-      "loss": 1.4553,
       "step": 40
     },
     {
       "epoch": 0.12448132780082988,
-      "grad_norm": 1.4605121612548828,
       "learning_rate": 0.0001996992941167792,
-      "loss": 1.2175,
       "step": 45
     },
     {
       "epoch": 0.13831258644536654,
-      "grad_norm": 1.0822768211364746,
       "learning_rate": 0.00019920660160815422,
-      "loss": 1.0378,
       "step": 50
     },
     {
       "epoch": 0.15214384508990317,
-      "grad_norm": 0.9796843528747559,
       "learning_rate": 0.00019848077530122083,
-      "loss": 1.0451,
       "step": 55
     },
     {
       "epoch": 0.16597510373443983,
-      "grad_norm": 1.1945514678955078,
       "learning_rate": 0.00019752352087524933,
-      "loss": 1.4266,
       "step": 60
     },
     {
       "epoch": 0.1798063623789765,
-      "grad_norm": 0.8683685064315796,
       "learning_rate": 0.00019633708786158806,
-      "loss": 1.0347,
       "step": 65
     },
     {
       "epoch": 0.19363762102351315,
-      "grad_norm": 0.25568732619285583,
       "learning_rate": 0.0001949242643573034,
-      "loss": 0.9376,
       "step": 70
     },
     {
       "epoch": 0.2074688796680498,
-      "grad_norm": 0.26001420617103577,
       "learning_rate": 0.0001932883704732001,
-      "loss": 0.9132,
       "step": 75
     },
     {
       "epoch": 0.22130013831258644,
-      "grad_norm": 0.2598419189453125,
       "learning_rate": 0.00019143325053161796,
-      "loss": 0.8958,
       "step": 80
     },
     {
       "epoch": 0.2351313969571231,
-      "grad_norm": 0.20231448113918304,
       "learning_rate": 0.00018936326403234125,
-      "loss": 0.8734,
       "step": 85
     },
     {
       "epoch": 0.24896265560165975,
-      "grad_norm": 0.17383822798728943,
       "learning_rate": 0.00018708327540784922,
-      "loss": 0.8701,
       "step": 90
     },
     {
       "epoch": 0.2627939142461964,
-      "grad_norm": 0.17745399475097656,
       "learning_rate": 0.0001845986425919841,
-      "loss": 0.8499,
       "step": 95
     },
     {
       "epoch": 0.2766251728907331,
-      "grad_norm": 0.17801660299301147,
       "learning_rate": 0.0001819152044288992,
-      "loss": 0.8512,
       "step": 100
     },
     {
       "epoch": 0.29045643153526973,
-      "grad_norm": 0.18566825985908508,
       "learning_rate": 0.00017903926695187595,
-      "loss": 0.8361,
       "step": 105
     },
     {
       "epoch": 0.30428769017980634,
-      "grad_norm": 0.18012060225009918,
       "learning_rate": 0.00017597758856425494,
-      "loss": 0.834,
       "step": 110
     },
     {
       "epoch": 0.318118948824343,
-      "grad_norm": 0.16151954233646393,
       "learning_rate": 0.00017273736415730488,
-      "loss": 0.8114,
       "step": 115
     },
     {
       "epoch": 0.33195020746887965,
-      "grad_norm": 0.16563855111598969,
       "learning_rate": 0.00016932620820235244,
-      "loss": 0.8191,
       "step": 120
     },
     {
       "epoch": 0.3457814661134163,
-      "grad_norm": 0.16186057031154633,
       "learning_rate": 0.0001657521368569064,
-      "loss": 0.7887,
       "step": 125
     },
     {
       "epoch": 0.359612724757953,
-      "grad_norm": 0.1734704077243805,
       "learning_rate": 0.000162023549126826,
-      "loss": 0.7946,
       "step": 130
     },
     {
       "epoch": 0.37344398340248963,
-      "grad_norm": 0.17336814105510712,
       "learning_rate": 0.00015814920712880267,
-      "loss": 0.7974,
       "step": 135
     },
     {
       "epoch": 0.3872752420470263,
-      "grad_norm": 0.15509486198425293,
       "learning_rate": 0.00015413821549953698,
-      "loss": 0.7866,
       "step": 140
     },
     {
       "epoch": 0.40110650069156295,
-      "grad_norm": 0.18101590871810913,
       "learning_rate": 0.00015000000000000001,
-      "loss": 0.7927,
       "step": 145
     },
     {
       "epoch": 0.4149377593360996,
-      "grad_norm": 0.14941518008708954,
       "learning_rate": 0.0001457442853650581,
-      "loss": 0.7698,
       "step": 150
     },
     {
       "epoch": 0.4287690179806362,
-      "grad_norm": 0.15677104890346527,
       "learning_rate": 0.00014138107245051392,
-      "loss": 0.7721,
       "step": 155
     },
     {
       "epoch": 0.4426002766251729,
-      "grad_norm": 0.14607611298561096,
       "learning_rate": 0.00013692061473126845,
-      "loss": 0.7516,
       "step": 160
     },
     {
       "epoch": 0.45643153526970953,
-      "grad_norm": 0.16472630202770233,
       "learning_rate": 0.00013237339420583212,
-      "loss": 0.7554,
       "step": 165
     },
     {
       "epoch": 0.4702627939142462,
-      "grad_norm": 0.13666489720344543,
       "learning_rate": 0.00012775009676380957,
-      "loss": 0.7515,
       "step": 170
     },
     {
       "epoch": 0.48409405255878285,
-      "grad_norm": 0.1362183392047882,
       "learning_rate": 0.00012306158707424403,
-      "loss": 0.7513,
       "step": 175
     },
     {
       "epoch": 0.4979253112033195,
-      "grad_norm": 0.12810291349887848,
       "learning_rate": 0.00011831888305383268,
-      "loss": 0.7385,
       "step": 180
     },
     {
       "epoch": 0.5117565698478561,
-      "grad_norm": 0.14311975240707397,
       "learning_rate": 0.00011353312997501313,
-      "loss": 0.7469,
       "step": 185
     },
     {
       "epoch": 0.5255878284923928,
-      "grad_norm": 0.129547581076622,
       "learning_rate": 0.00010871557427476583,
-      "loss": 0.7423,
       "step": 190
     },
     {
       "epoch": 0.5394190871369294,
-      "grad_norm": 0.1447523832321167,
       "learning_rate": 0.0001038775371256817,
-      "loss": 0.7351,
       "step": 195
     },
     {
       "epoch": 0.5532503457814661,
-      "grad_norm": 0.1369813233613968,
       "learning_rate": 9.903038783140216e-05,
-      "loss": 0.7202,
       "step": 200
     },
     {
       "epoch": 0.5670816044260027,
-      "grad_norm": 0.12533989548683167,
       "learning_rate": 9.418551710895243e-05,
-      "loss": 0.722,
       "step": 205
     },
     {
       "epoch": 0.5809128630705395,
-      "grad_norm": 0.12739399075508118,
       "learning_rate": 8.935431032075318e-05,
-      "loss": 0.7173,
       "step": 210
     },
     {
       "epoch": 0.5947441217150761,
-      "grad_norm": 0.13596710562705994,
       "learning_rate": 8.454812071921596e-05,
-      "loss": 0.7194,
       "step": 215
     },
     {
       "epoch": 0.6085753803596127,
-      "grad_norm": 0.12327581644058228,
       "learning_rate": 7.977824276679623e-05,
-      "loss": 0.7095,
       "step": 220
     },
     {
       "epoch": 0.6224066390041494,
-      "grad_norm": 0.1317676603794098,
       "learning_rate": 7.505588559420189e-05,
-      "loss": 0.713,
       "step": 225
     },
     {
       "epoch": 0.636237897648686,
-      "grad_norm": 0.13516183197498322,
       "learning_rate": 7.039214665913003e-05,
-      "loss": 0.7048,
       "step": 230
     },
     {
       "epoch": 0.6500691562932227,
-      "grad_norm": 0.12717784941196442,
       "learning_rate": 6.579798566743314e-05,
-      "loss": 0.7088,
       "step": 235
     },
     {
       "epoch": 0.6639004149377593,
-      "grad_norm": 0.12097220122814178,
       "learning_rate": 6.128419881799996e-05,
-      "loss": 0.6939,
       "step": 240
     },
     {
       "epoch": 0.677731673582296,
-      "grad_norm": 0.1216357946395874,
       "learning_rate": 5.6861393431874675e-05,
-      "loss": 0.6943,
       "step": 245
     },
     {
       "epoch": 0.6915629322268326,
-      "grad_norm": 0.12578962743282318,
       "learning_rate": 5.253996302523596e-05,
-      "loss": 0.6832,
       "step": 250
     },
     {
       "epoch": 0.7053941908713693,
-      "grad_norm": 0.1288958042860031,
       "learning_rate": 4.833006288481371e-05,
-      "loss": 0.6786,
       "step": 255
     },
     {
       "epoch": 0.719225449515906,
-      "grad_norm": 0.13444924354553223,
       "learning_rate": 4.424158620314073e-05,
-      "loss": 0.6861,
       "step": 260
     },
     {
       "epoch": 0.7330567081604425,
-      "grad_norm": 0.15658161044120789,
       "learning_rate": 4.028414082972141e-05,
-      "loss": 0.6829,
       "step": 265
     },
     {
       "epoch": 0.7468879668049793,
-      "grad_norm": 0.13638462126255035,
       "learning_rate": 3.646702669275151e-05,
-      "loss": 0.6811,
       "step": 270
     },
     {
       "epoch": 0.7607192254495159,
-      "grad_norm": 0.11960398405790329,
       "learning_rate": 3.279921394444776e-05,
-      "loss": 0.6645,
       "step": 275
     },
     {
       "epoch": 0.7745504840940526,
-      "grad_norm": 0.12005037814378738,
       "learning_rate": 2.9289321881345254e-05,
-      "loss": 0.6709,
       "step": 280
     },
     {
       "epoch": 0.7883817427385892,
-      "grad_norm": 0.12300828844308853,
       "learning_rate": 2.594559868909956e-05,
-      "loss": 0.6629,
       "step": 285
     },
     {
       "epoch": 0.8022130013831259,
-      "grad_norm": 0.11922738701105118,
       "learning_rate": 2.2775902059393085e-05,
-      "loss": 0.6613,
       "step": 290
     },
     {
       "epoch": 0.8160442600276625,
-      "grad_norm": 0.11143971979618073,
       "learning_rate": 1.9787680724495617e-05,
-      "loss": 0.6546,
       "step": 295
     },
     {
       "epoch": 0.8298755186721992,
-      "grad_norm": 0.11601640284061432,
       "learning_rate": 1.698795695287212e-05,
-      "loss": 0.6567,
       "step": 300
     },
     {
       "epoch": 0.8437067773167358,
-      "grad_norm": 0.11989685148000717,
       "learning_rate": 1.4383310046973365e-05,
-      "loss": 0.657,
       "step": 305
     },
     {
       "epoch": 0.8575380359612724,
-      "grad_norm": 0.11077902466058731,
       "learning_rate": 1.1979860881988902e-05,
-      "loss": 0.6555,
       "step": 310
     },
     {
       "epoch": 0.8713692946058091,
-      "grad_norm": 0.11324643343687057,
       "learning_rate": 9.783257521896227e-06,
-      "loss": 0.6468,
       "step": 315
     },
     {
       "epoch": 0.8852005532503457,
-      "grad_norm": 0.11370333284139633,
       "learning_rate": 7.798661946608166e-06,
-      "loss": 0.648,
       "step": 320
     },
     {
       "epoch": 0.8990318118948825,
-      "grad_norm": 0.10991474986076355,
       "learning_rate": 6.030737921409169e-06,
-      "loss": 0.6446,
       "step": 325
     },
     {
       "epoch": 0.9128630705394191,
-      "grad_norm": 0.11461606621742249,
       "learning_rate": 4.4836400371876974e-06,
-      "loss": 0.6387,
       "step": 330
     },
     {
       "epoch": 0.9266943291839558,
-      "grad_norm": 0.1137213185429573,
       "learning_rate": 3.161003947219421e-06,
-      "loss": 0.6329,
       "step": 335
     },
     {
       "epoch": 0.9405255878284924,
-      "grad_norm": 0.10857342928647995,
       "learning_rate": 2.0659378234448525e-06,
-      "loss": 0.6627,
       "step": 340
     },
     {
       "epoch": 0.9543568464730291,
-      "grad_norm": 0.10978103429079056,
       "learning_rate": 1.201015052319099e-06,
-      "loss": 0.6435,
       "step": 345
     },
     {
       "epoch": 0.9681881051175657,
-      "grad_norm": 0.1058996319770813,
       "learning_rate": 5.682681873981577e-07,
-      "loss": 0.6388,
       "step": 350
     },
     {
       "epoch": 0.9820193637621023,
-      "grad_norm": 0.10548459738492966,
       "learning_rate": 1.6918417287318245e-07,
-      "loss": 0.6382,
       "step": 355
     },
     {
       "epoch": 0.995850622406639,
-      "grad_norm": 0.11099706590175629,
       "learning_rate": 4.700849277383679e-09,
-      "loss": 0.6424,
       "step": 360
     },
     {
       "epoch": 0.9986168741355463,
-      "eval_loss": 0.658014178276062,
-      "eval_runtime": 53.9504,
-      "eval_samples_per_second": 85.449,
-      "eval_steps_per_second": 2.688,
       "step": 361
     },
     {
       "epoch": 0.9986168741355463,
       "step": 361,
-      "total_flos": 1.74045731487744e+18,
-      "train_loss": 0.8234024724801822,
-      "train_runtime": 2385.6161,
-      "train_samples_per_second": 19.395,
-      "train_steps_per_second": 0.151
     }
   ],
   "logging_steps": 5,
@@ -554,7 +554,7 @@
       "attributes": {}
     }
   },
-  "total_flos": 1.74045731487744e+18,
   "train_batch_size": 4,
   "trial_name": null,
   "trial_params": null
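
The `learning_rate` values in this log are unchanged context on both sides of the diff, and they appear to follow a linear warmup followed by cosine decay to zero. The sketch below reconstructs that schedule. The peak rate of 2e-4 and the 37 warmup steps are not stated anywhere in the commit. They are inferred from the logged values (for example, 5.405405e-06 at step 1 is 2e-4 / 37), and the total of 361 steps comes from the final log entry. The formula has the same shape as the cosine-with-warmup schedule in Transformers, though the commit does not confirm which scheduler was configured.

```python
import math

PEAK_LR = 2e-4      # assumption: inferred from the logged values
WARMUP_STEPS = 37   # assumption: inferred from the logged values
TOTAL_STEPS = 361   # final "step" value in the log


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then half-cosine decay to zero at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# A few (step, learning_rate) pairs copied from log_history.
LOGGED = {
    1: 5.405405405405406e-06,
    35: 0.0001891891891891892,
    40: 0.0001999576950082201,
    145: 0.00015000000000000001,
    280: 2.9289321881345254e-05,
    360: 4.700849277383679e-09,
}

for step, logged in LOGGED.items():
    match = math.isclose(lr_at(step), logged, rel_tol=1e-6)
    print(f"step {step:>3}: logged={logged:.6e} reconstructed={lr_at(step):.6e} match={match}")
```

Because the schedule is identical in both runs, the differences in `loss` and `grad_norm` between the old and new sides cannot be attributed to the learning rate.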

   "log_history": [
     {
       "epoch": 0.0027662517289073307,
+      "grad_norm": 22.1127872467041,
       "learning_rate": 5.405405405405406e-06,
+      "loss": 2.6011,
       "step": 1
     },
     {
       "epoch": 0.013831258644536652,
+      "grad_norm": 3.526327610015869,
       "learning_rate": 2.702702702702703e-05,
+      "loss": 2.2001,
       "step": 5
     },
     {
       "epoch": 0.027662517289073305,
+      "grad_norm": 2.0694353580474854,
       "learning_rate": 5.405405405405406e-05,
+      "loss": 1.8786,
       "step": 10
     },
     {
       "epoch": 0.04149377593360996,
+      "grad_norm": 1.429513931274414,
       "learning_rate": 8.108108108108109e-05,
+      "loss": 1.6509,
       "step": 15
     },
     {
       "epoch": 0.05532503457814661,
+      "grad_norm": 3.6957762241363525,
       "learning_rate": 0.00010810810810810812,
+      "loss": 1.4395,
       "step": 20
     },
     {
       "epoch": 0.06915629322268327,
+      "grad_norm": 3.5924487113952637,
       "learning_rate": 0.00013513513513513514,
+      "loss": 1.1714,
       "step": 25
     },
     {
       "epoch": 0.08298755186721991,
+      "grad_norm": 1.092515468597412,
       "learning_rate": 0.00016216216216216218,
+      "loss": 1.2197,
       "step": 30
     },
     {
       "epoch": 0.09681881051175657,
+      "grad_norm": 4.442113876342773,
       "learning_rate": 0.0001891891891891892,
+      "loss": 1.204,
       "step": 35
     },
     {
       "epoch": 0.11065006915629322,
+      "grad_norm": 6.686959266662598,
       "learning_rate": 0.0001999576950082201,
+      "loss": 1.6501,
       "step": 40
     },
     {
       "epoch": 0.12448132780082988,
+      "grad_norm": 4.45343017578125,
       "learning_rate": 0.0001996992941167792,
+      "loss": 1.4183,
       "step": 45
     },
     {
       "epoch": 0.13831258644536654,
+      "grad_norm": 5.694210052490234,
       "learning_rate": 0.00019920660160815422,
+      "loss": 1.5559,
       "step": 50
     },
     {
       "epoch": 0.15214384508990317,
+      "grad_norm": 2.5626814365386963,
       "learning_rate": 0.00019848077530122083,
+      "loss": 1.2451,
       "step": 55
     },
     {
       "epoch": 0.16597510373443983,
+      "grad_norm": 0.5388926863670349,
       "learning_rate": 0.00019752352087524933,
+      "loss": 1.0484,
       "step": 60
     },
     {
       "epoch": 0.1798063623789765,
+      "grad_norm": 0.3036655783653259,
       "learning_rate": 0.00019633708786158806,
+      "loss": 0.9605,
       "step": 65
     },
     {
       "epoch": 0.19363762102351315,
+      "grad_norm": 0.25516265630722046,
       "learning_rate": 0.0001949242643573034,
+      "loss": 0.9121,
       "step": 70
     },
     {
       "epoch": 0.2074688796680498,
+      "grad_norm": 0.22290439903736115,
       "learning_rate": 0.0001932883704732001,
+      "loss": 0.9072,
       "step": 75
     },
     {
       "epoch": 0.22130013831258644,
+      "grad_norm": 0.23729199171066284,
       "learning_rate": 0.00019143325053161796,
+      "loss": 0.8938,
       "step": 80
     },
     {
       "epoch": 0.2351313969571231,
+      "grad_norm": 0.2355058640241623,
       "learning_rate": 0.00018936326403234125,
+      "loss": 0.8759,
       "step": 85
     },
     {
       "epoch": 0.24896265560165975,
+      "grad_norm": 0.20683090388774872,
       "learning_rate": 0.00018708327540784922,
+      "loss": 0.8758,
       "step": 90
     },
     {
       "epoch": 0.2627939142461964,
+      "grad_norm": 0.21719329059123993,
       "learning_rate": 0.0001845986425919841,
+      "loss": 0.8571,
       "step": 95
     },
     {
       "epoch": 0.2766251728907331,
+      "grad_norm": 0.20208917558193207,
       "learning_rate": 0.0001819152044288992,
+      "loss": 0.859,
       "step": 100
     },
     {
       "epoch": 0.29045643153526973,
+      "grad_norm": 0.18826699256896973,
       "learning_rate": 0.00017903926695187595,
+      "loss": 0.8427,
       "step": 105
     },
     {
       "epoch": 0.30428769017980634,
+      "grad_norm": 0.18175852298736572,
       "learning_rate": 0.00017597758856425494,
+      "loss": 0.8389,
       "step": 110
     },
     {
       "epoch": 0.318118948824343,
+      "grad_norm": 0.17405715584754944,
       "learning_rate": 0.00017273736415730488,
+      "loss": 0.8185,
       "step": 115
     },
     {
       "epoch": 0.33195020746887965,
+      "grad_norm": 0.15530933439731598,
       "learning_rate": 0.00016932620820235244,
+      "loss": 0.8249,
       "step": 120
     },
     {
       "epoch": 0.3457814661134163,
+      "grad_norm": 0.17757271230220795,
       "learning_rate": 0.0001657521368569064,
+      "loss": 0.7947,
       "step": 125
     },
     {
       "epoch": 0.359612724757953,
+      "grad_norm": 0.18264907598495483,
       "learning_rate": 0.000162023549126826,
+      "loss": 0.8021,
       "step": 130
     },
     {
       "epoch": 0.37344398340248963,
+      "grad_norm": 0.18304209411144257,
       "learning_rate": 0.00015814920712880267,
+      "loss": 0.8039,
       "step": 135
     },
     {
       "epoch": 0.3872752420470263,
+      "grad_norm": 0.16061393916606903,
       "learning_rate": 0.00015413821549953698,
+      "loss": 0.792,
       "step": 140
     },
     {
       "epoch": 0.40110650069156295,
+      "grad_norm": 0.1555311381816864,
       "learning_rate": 0.00015000000000000001,
+      "loss": 0.7948,
       "step": 145
     },
     {
       "epoch": 0.4149377593360996,
+      "grad_norm": 0.15761056542396545,
       "learning_rate": 0.0001457442853650581,
+      "loss": 0.7768,
       "step": 150
     },
     {
       "epoch": 0.4287690179806362,
+      "grad_norm": 0.1716078668832779,
       "learning_rate": 0.00014138107245051392,
+      "loss": 0.7758,
       "step": 155
     },
     {
       "epoch": 0.4426002766251729,
+      "grad_norm": 0.1470308154821396,
       "learning_rate": 0.00013692061473126845,
+      "loss": 0.7578,
       "step": 160
     },
     {
       "epoch": 0.45643153526970953,
+      "grad_norm": 0.15690156817436218,
       "learning_rate": 0.00013237339420583212,
+      "loss": 0.7619,
       "step": 165
     },
     {
       "epoch": 0.4702627939142462,
+      "grad_norm": 0.17660725116729736,
       "learning_rate": 0.00012775009676380957,
+      "loss": 0.7567,
       "step": 170
     },
     {
       "epoch": 0.48409405255878285,
+      "grad_norm": 0.13694822788238525,
       "learning_rate": 0.00012306158707424403,
+      "loss": 0.7569,
       "step": 175
     },
     {
       "epoch": 0.4979253112033195,
+      "grad_norm": 0.12447214871644974,
       "learning_rate": 0.00011831888305383268,
+      "loss": 0.7414,
       "step": 180
     },
     {
       "epoch": 0.5117565698478561,
+      "grad_norm": 0.13208778202533722,
       "learning_rate": 0.00011353312997501313,
+      "loss": 0.7495,
       "step": 185
     },
     {
       "epoch": 0.5255878284923928,
+      "grad_norm": 0.13374905288219452,
       "learning_rate": 0.00010871557427476583,
+      "loss": 0.7467,
       "step": 190
     },
     {
       "epoch": 0.5394190871369294,
+      "grad_norm": 0.14392955601215363,
       "learning_rate": 0.0001038775371256817,
+      "loss": 0.7388,
       "step": 195
     },
     {
       "epoch": 0.5532503457814661,
+      "grad_norm": 0.13033545017242432,
       "learning_rate": 9.903038783140216e-05,
+      "loss": 0.7239,
       "step": 200
     },
     {
       "epoch": 0.5670816044260027,
+      "grad_norm": 0.12652400135993958,
       "learning_rate": 9.418551710895243e-05,
+      "loss": 0.7251,
       "step": 205
     },
     {
       "epoch": 0.5809128630705395,
+      "grad_norm": 0.12813538312911987,
       "learning_rate": 8.935431032075318e-05,
+      "loss": 0.7206,
       "step": 210
     },
     {
       "epoch": 0.5947441217150761,
+      "grad_norm": 0.13136501610279083,
       "learning_rate": 8.454812071921596e-05,
+      "loss": 0.721,
       "step": 215
     },
     {
       "epoch": 0.6085753803596127,
+      "grad_norm": 0.13638000190258026,
       "learning_rate": 7.977824276679623e-05,
+      "loss": 0.7134,
       "step": 220
     },
     {
       "epoch": 0.6224066390041494,
+      "grad_norm": 0.13380198180675507,
       "learning_rate": 7.505588559420189e-05,
+      "loss": 0.7158,
       "step": 225
     },
     {
       "epoch": 0.636237897648686,
+      "grad_norm": 0.13291427493095398,
       "learning_rate": 7.039214665913003e-05,
+      "loss": 0.7068,
       "step": 230
     },
     {
       "epoch": 0.6500691562932227,
+      "grad_norm": 0.12505605816841125,
       "learning_rate": 6.579798566743314e-05,
+      "loss": 0.7109,
       "step": 235
     },
     {
       "epoch": 0.6639004149377593,
+      "grad_norm": 0.11483744531869888,
       "learning_rate": 6.128419881799996e-05,
+      "loss": 0.6962,
       "step": 240
     },
     {
       "epoch": 0.677731673582296,
+      "grad_norm": 0.1254301220178604,
       "learning_rate": 5.6861393431874675e-05,
+      "loss": 0.6944,
       "step": 245
     },
     {
       "epoch": 0.6915629322268326,
+      "grad_norm": 0.13567984104156494,
       "learning_rate": 5.253996302523596e-05,
+      "loss": 0.6865,
       "step": 250
     },
     {
       "epoch": 0.7053941908713693,
+      "grad_norm": 0.1235489696264267,
       "learning_rate": 4.833006288481371e-05,
+      "loss": 0.6807,
       "step": 255
     },
     {
       "epoch": 0.719225449515906,
+      "grad_norm": 0.13388977944850922,
       "learning_rate": 4.424158620314073e-05,
+      "loss": 0.6881,
       "step": 260
     },
     {
       "epoch": 0.7330567081604425,
+      "grad_norm": 0.12815245985984802,
       "learning_rate": 4.028414082972141e-05,
+      "loss": 0.6842,
       "step": 265
     },
     {
       "epoch": 0.7468879668049793,
+      "grad_norm": 0.1258043646812439,
       "learning_rate": 3.646702669275151e-05,
+      "loss": 0.6832,
       "step": 270
     },
     {
       "epoch": 0.7607192254495159,
+      "grad_norm": 0.11947453022003174,
       "learning_rate": 3.279921394444776e-05,
+      "loss": 0.6672,
       "step": 275
     },
     {
       "epoch": 0.7745504840940526,
+      "grad_norm": 0.12488783895969391,
       "learning_rate": 2.9289321881345254e-05,
+      "loss": 0.6729,
       "step": 280
     },
     {
       "epoch": 0.7883817427385892,
+      "grad_norm": 0.11996188759803772,
       "learning_rate": 2.594559868909956e-05,
+      "loss": 0.6641,
       "step": 285
     },
     {
       "epoch": 0.8022130013831259,
+      "grad_norm": 0.12338840216398239,
       "learning_rate": 2.2775902059393085e-05,
+      "loss": 0.6618,
       "step": 290
     },
     {
       "epoch": 0.8160442600276625,
+      "grad_norm": 0.11500907689332962,
       "learning_rate": 1.9787680724495617e-05,
+      "loss": 0.6576,
       "step": 295
     },
     {
       "epoch": 0.8298755186721992,
+      "grad_norm": 0.1203397586941719,
       "learning_rate": 1.698795695287212e-05,
+      "loss": 0.6579,
       "step": 300
     },
     {
       "epoch": 0.8437067773167358,
+      "grad_norm": 0.11593286693096161,
       "learning_rate": 1.4383310046973365e-05,
+      "loss": 0.659,
       "step": 305
     },
     {
       "epoch": 0.8575380359612724,
+      "grad_norm": 0.10674016922712326,
       "learning_rate": 1.1979860881988902e-05,
+      "loss": 0.6581,
       "step": 310
     },
     {
       "epoch": 0.8713692946058091,
+      "grad_norm": 0.1114317774772644,
       "learning_rate": 9.783257521896227e-06,
+      "loss": 0.6489,
       "step": 315
     },
     {
       "epoch": 0.8852005532503457,
+      "grad_norm": 0.11088614910840988,
       "learning_rate": 7.798661946608166e-06,
+      "loss": 0.6485,
       "step": 320
     },
     {
       "epoch": 0.8990318118948825,
+      "grad_norm": 0.10715563595294952,
       "learning_rate": 6.030737921409169e-06,
+      "loss": 0.645,
       "step": 325
     },
     {
       "epoch": 0.9128630705394191,
+      "grad_norm": 0.11442163586616516,
       "learning_rate": 4.4836400371876974e-06,
+      "loss": 0.64,
       "step": 330
     },
     {
       "epoch": 0.9266943291839558,
+      "grad_norm": 0.1089484840631485,
       "learning_rate": 3.161003947219421e-06,
+      "loss": 0.6336,
       "step": 335
     },
     {
       "epoch": 0.9405255878284924,
+      "grad_norm": 0.10584916174411774,
       "learning_rate": 2.0659378234448525e-06,
+      "loss": 0.665,
       "step": 340
     },
     {
       "epoch": 0.9543568464730291,
+      "grad_norm": 0.10534138232469559,
       "learning_rate": 1.201015052319099e-06,
+      "loss": 0.6455,
       "step": 345
     },
     {
       "epoch": 0.9681881051175657,
+      "grad_norm": 0.1038522943854332,
       "learning_rate": 5.682681873981577e-07,
+      "loss": 0.6406,
       "step": 350
     },
     {
       "epoch": 0.9820193637621023,
+      "grad_norm": 0.10471897572278976,
       "learning_rate": 1.6918417287318245e-07,
+      "loss": 0.6396,
       "step": 355
     },
     {
       "epoch": 0.995850622406639,
+      "grad_norm": 0.10800525546073914,
       "learning_rate": 4.700849277383679e-09,
+      "loss": 0.6434,
       "step": 360
     },
     {
       "epoch": 0.9986168741355463,
+      "eval_loss": 0.6599090695381165,
+      "eval_runtime": 53.0431,
+      "eval_samples_per_second": 86.91,
+      "eval_steps_per_second": 2.734,
       "step": 361
     },
     {
       "epoch": 0.9986168741355463,
       "step": 361,
+      "total_flos": 1.7115790489220547e+18,
+      "train_loss": 0.861595592175164,
+      "train_runtime": 2353.9448,
+      "train_samples_per_second": 19.656,
+      "train_steps_per_second": 0.153
     }
   ],
   "logging_steps": 5,
       "attributes": {}
     }
   },
+  "total_flos": 1.7115790489220547e+18,
   "train_batch_size": 4,
   "trial_name": null,
   "trial_params": null
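
Note on the logged learning rates: the values in this hunk are consistent with a linear-warmup plus cosine-decay schedule. The sketch below reproduces them under assumptions inferred from the numbers, not read from the training config: a peak learning rate of 2e-4, 37 warmup steps, and 361 total steps. The function name `cosine_lr` and its parameters are hypothetical and chosen for illustration only.

```python
import math


def cosine_lr(step, peak_lr=2e-4, warmup_steps=37, total_steps=361):
    """Linear warmup followed by cosine decay to zero.

    All defaults are assumptions inferred from the logged values,
    not settings confirmed by the repository.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# A few (step, learning_rate) pairs copied from the log above.
logged = {
    145: 0.00015000000000000001,
    200: 9.903038783140216e-05,
    280: 2.9289321881345254e-05,
    360: 4.700849277383679e-09,
}

for step, lr in logged.items():
    # Relative difference between the reconstructed and logged value.
    rel_diff = abs(cosine_lr(step) - lr) / lr
    print(step, cosine_lr(step), lr, rel_diff)
```

If the reconstruction holds, step 145 sits one third of the way through the decay phase and step 280 three quarters of the way, which is why those two entries land on round fractions of the assumed peak.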

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:475f29db775c3fa3ebae0c3997a227d93f50e4e631d281907a24df8c23250da0
+oid sha256:4394984455d4ffe3e51e3b2431658cf9b616f4718e0ca4da0047bdbe4ff3859e
 size 7096