Instructions to use Alphatao/Qwen-sft with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Alphatao/Qwen-sft with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Alphatao/Qwen-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Alphatao/Qwen-sft")
model = AutoModelForCausalLM.from_pretrained("Alphatao/Qwen-sft")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Alphatao/Qwen-sft with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Alphatao/Qwen-sft"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Alphatao/Qwen-sft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/Alphatao/Qwen-sft
- SGLang
How to use Alphatao/Qwen-sft with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Alphatao/Qwen-sft" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Alphatao/Qwen-sft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Alphatao/Qwen-sft" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Alphatao/Qwen-sft",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use Alphatao/Qwen-sft with Docker Model Runner:
docker model run hf.co/Alphatao/Qwen-sft
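The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes a server started with one of the commands above is listening locally, and the helper names `build_chat_request` and `ask` are illustrative, not part of vLLM or SGLang.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, model, user_message, timeout=60):
    """Send the request and return the text of the first returned choice."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server; port 8000 matches the vLLM example,
# port 30000 matches the SGLang examples):
# print(ask("http://localhost:8000", "Alphatao/Qwen-sft", "What is the capital of France?"))
```

Building the request separately from sending it keeps the payload easy to inspect without a running server.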
Model save
This view is limited to 50 files because it contains too many changes.
- README.md +2 -2
- best/global_step200/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt +3 -0
- best/global_step200/zero_pp_rank_0_mp_rank_00_model_states.pt +3 -0
- best/global_step200/zero_pp_rank_1_mp_rank_00_model_states.pt +3 -0
- best/global_step200/zero_pp_rank_2_mp_rank_00_model_states.pt +3 -0
- best/global_step200/zero_pp_rank_3_mp_rank_00_model_states.pt +3 -0
- best/global_step200/zero_pp_rank_4_mp_rank_00_model_states.pt +3 -0
- best/global_step200/zero_pp_rank_5_mp_rank_00_model_states.pt +3 -0
- best/global_step200/zero_pp_rank_6_mp_rank_00_model_states.pt +3 -0
- best/global_step200/zero_pp_rank_7_mp_rank_00_model_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt +3 -0
- best/global_step400/zero_pp_rank_0_mp_rank_00_model_states.pt +3 -0
- best/global_step400/zero_pp_rank_1_mp_rank_00_model_states.pt +3 -0
- best/global_step400/zero_pp_rank_2_mp_rank_00_model_states.pt +3 -0
- best/global_step400/zero_pp_rank_3_mp_rank_00_model_states.pt +3 -0
- best/global_step400/zero_pp_rank_4_mp_rank_00_model_states.pt +3 -0
- best/global_step400/zero_pp_rank_5_mp_rank_00_model_states.pt +3 -0
- best/global_step400/zero_pp_rank_6_mp_rank_00_model_states.pt +3 -0
- best/global_step400/zero_pp_rank_7_mp_rank_00_model_states.pt +3 -0
- best/latest +1 -1
- best/monitor.json +5 -5
- best/training_args.bin +1 -1
- global_step200/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt +3 -0
- global_step200/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt +3 -0
- global_step200/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt +3 -0
- global_step200/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt +3 -0
- global_step200/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt +3 -0
- global_step200/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt +3 -0
- global_step200/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt +3 -0
- global_step200/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt +3 -0
- global_step200/zero_pp_rank_0_mp_rank_00_model_states.pt +3 -0
- global_step200/zero_pp_rank_1_mp_rank_00_model_states.pt +3 -0
- global_step200/zero_pp_rank_2_mp_rank_00_model_states.pt +3 -0
- global_step200/zero_pp_rank_3_mp_rank_00_model_states.pt +3 -0
- global_step200/zero_pp_rank_4_mp_rank_00_model_states.pt +3 -0
- global_step200/zero_pp_rank_5_mp_rank_00_model_states.pt +3 -0
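The shard names in the list above follow a regular pattern, so they can be grouped programmatically. Here is a small sketch that parses them; the regular expression is inferred only from the names shown in this list, and `parse_shard_name` and `count_shards` are hypothetical helpers, not DeepSpeed APIs. Treating the number after `zero_pp_rank_` as a per-process rank is an assumption.

```python
import re
from collections import Counter

# Pattern inferred from the file names in this commit (an assumption, not a spec).
SHARD_PATTERN = re.compile(
    r"^(?P<bf16>bf16_)?zero_pp_rank_(?P<rank>\d+)_mp_rank_(?P<mp_rank>\d+)"
    r"_(?P<kind>optim|model)_states\.pt$"
)


def parse_shard_name(path):
    """Return the parts of a checkpoint shard path, or None if it is not a shard."""
    directory, _, name = path.rpartition("/")
    match = SHARD_PATTERN.match(name)
    if match is None:
        return None
    return {
        "directory": directory,
        "rank": int(match.group("rank")),
        "mp_rank": int(match.group("mp_rank")),
        "kind": match.group("kind"),
        "bf16": match.group("bf16") is not None,
    }


def count_shards(paths):
    """Count shards per (directory, kind), ignoring paths that are not shards."""
    parsed = (parse_shard_name(p) for p in paths)
    return Counter((p["directory"], p["kind"]) for p in parsed if p is not None)
```

Applied to the list above, such a count makes it easy to spot a checkpoint directory whose shard set is incomplete in the truncated 50-file view.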
README.md
CHANGED
@@ -3,8 +3,8 @@ library_name: transformers
 model_name: Qwen-sft
 tags:
 - generated_from_trainer
-- trl
 - sft
+- trl
 licence: license
 ---

@@ -26,7 +26,7 @@ print(output["generated_text"])

 ## Training procedure

-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/alphatao-alphatao/Gradients-On-Demand/runs/
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/alphatao-alphatao/Gradients-On-Demand/runs/ghwtvbzp)


 This model was trained with SFT.
best/global_step200/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:469e1bb6e9632b1494555b5f32439a44857fe3cd415a7d4a87e02460f5f24589
+size 12286638307

best/global_step200/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e02a28c096e41bc72cd4997909578f24de1c734310d2f4a9b174f8aa4a601ae8
+size 12286638307

best/global_step200/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:445eede74d5503a6083df422a8b0da580240dcf027095932a31235c0700aecf5
+size 12286638307

best/global_step200/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b33a346b60bd372f94bdbc17b2d077f521e3f26861ef02ed61c2c4c9e31c2737
+size 12286638307

best/global_step200/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bbb542e3a080adc84c56218b403cbbc8bf5fbb744fc1d862d7b3ef5761d58691
+size 12286638307

best/global_step200/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b973ffc311cff9c5ca641ced3dc1773eb210b7c7a62940fc33210c3d66934692
+size 12286638307

best/global_step200/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fe700d997cb26ad73cd4e0d08ff1fac82f6f1075a10d2d83b7917fa9acc7c030
+size 12286638307

best/global_step200/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d5fee71665931f0dafa4f4f6c3c06f38a676830d5b0d27fc87721ce6de6d77f4
+size 12286638307
best/global_step200/zero_pp_rank_0_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1342d4dc7a4628d877abf125c3f613a182eafb101b5e7f7b86a5a7cef175f0af
+size 206444

best/global_step200/zero_pp_rank_1_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9dfb1286b375cdebbb10946e259bdb10c194cc4c7739fde1ebe2a5b9c0931101
+size 206444

best/global_step200/zero_pp_rank_2_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:26d2b81375bae7ff93776b351edc5a7d3905352cf863170ac67939eb30b019bf
+size 206444

best/global_step200/zero_pp_rank_3_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1775ce4acf9c030a131d2ce2727d243c14094ec098699993786082316eb43c9c
+size 206444

best/global_step200/zero_pp_rank_4_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82d4559b5c1bf131c94346a460b0d689044f4998ea94d90b1fc983861365f4b7
+size 206444

best/global_step200/zero_pp_rank_5_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0068af7d59e510134a707bf9e04711c1efbaf1eb8288eaef739836c5e72a7ab3
+size 206444

best/global_step200/zero_pp_rank_6_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d245754563e09c58f14a256494bacd640fb22f19872aa29e0c8e0fe690b89ffa
+size 206444

best/global_step200/zero_pp_rank_7_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf6175e0cab44eabd60a29b8eac80ca684dbc24c7e0a21721ddad15c5850a129
+size 206444
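Each added `.pt` file in this diff is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the binary itself. As a rough illustration, the sketch below parses one pointer and totals the declared sizes for the eight optimizer shards of a checkpoint. `parse_lfs_pointer` is a hypothetical helper that assumes the simple `key value` layout seen here; the sample pointer is copied from the rank 0 optimizer shard in this commit.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a small dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real object in bytes
    }


# Pointer text for best/global_step200/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:469e1bb6e9632b1494555b5f32439a44857fe3cd415a7d4a87e02460f5f24589
size 12286638307
"""

pointer = parse_lfs_pointer(POINTER)
# All eight optimizer shards in a checkpoint declare the same size in this commit.
total_optimizer_bytes = 8 * pointer["size"]
print(total_optimizer_bytes)  # 98293106456
```

By the same arithmetic, the eight model-state shards at 206444 bytes each add only 1651552 bytes, so the optimizer states account for nearly all of a checkpoint's size.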
best/global_step400/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2bdb4fccfd367cc09ef997ccbc2dc8a3050d8cd68b0a8174b4f9460b2cbdefc1
+size 12286638307

best/global_step400/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8c9b87bf6f188098e523e45cb0d8c03b32953bdf47694ac709993d10ca2f63a6
+size 12286638307

best/global_step400/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ec3119cb5fbfe33480d777832b57253231005662916c0f22b2a49c55a1ace1e8
+size 12286638307

best/global_step400/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d3cc21c19e09cb2f588a2b804a95ef4395b1b9841b0ee0ab39426f3d94f27a1d
+size 12286638307

best/global_step400/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:effa5c55f7ea8249c6aea9ddc89ac9c297a155731ae4e5848903367f08e6d91d
+size 12286638307

best/global_step400/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7354a9ebc2b3023c4e94c426035f9e63e817d64c2943b9722b04bb0fc653be49
+size 12286638307

best/global_step400/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e54ecd6a1a1122fff73dbe1700271827a87043a2533bb94b19902be4dcc6a6af
+size 12286638307

best/global_step400/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:15247f8fae2e4cad91c403f0c02f2422fcda69b3e4a416805ec5049b05e4312b
+size 12286638307
best/global_step400/zero_pp_rank_0_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6b870f2f4baeb7a2fdfc2a4e93b8db1f9ddc8f8f4c92c0112abcedf99c2cca2a
+size 206444

best/global_step400/zero_pp_rank_1_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:35ef2a12c409ee0f640576170a918519c5fe5c9e3e53577762f7949150b1c116
+size 206444

best/global_step400/zero_pp_rank_2_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6ecffb5bc7362f7c6b7080d1a7afd1be2cd193861e0d7e6810f43ee361ff103b
+size 206444

best/global_step400/zero_pp_rank_3_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:5570a60df26914e4604c4db34973bbab7ae5596b6475685c13989709aba5cb64
+size 206444

best/global_step400/zero_pp_rank_4_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4b22e39e7fde70ff2976af39286f69f42a82e8cdce47796c551557a077abceca
+size 206444

best/global_step400/zero_pp_rank_5_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:caab3a7de0a470fd89541b02334ef4583e79d16c5bf1b216a6c8dce447820638
+size 206444

best/global_step400/zero_pp_rank_6_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c77262ec4d9d2bfaf31980bb6bd7403fd122e3e4d5f8fb5df6bd0033edcd81fa
+size 206444

best/global_step400/zero_pp_rank_7_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:809045676b4efa3eb66f271d58500fc59a2abd7204a17316185499bcb4589fa0
+size 206444
best/latest
CHANGED
@@ -1 +1 @@
-
+global_step400
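After this commit, `best/latest` contains the tag `global_step400`, which matches the name of a checkpoint directory under `best/`. A loader could follow that tag as sketched below. This is an illustrative helper only: `resolve_latest_checkpoint` is a hypothetical name, and the assumption that the file holds a single directory name is taken from the one value visible in this diff.

```python
from pathlib import Path


def resolve_latest_checkpoint(checkpoint_root):
    """Read <root>/latest and return the checkpoint directory it names."""
    root = Path(checkpoint_root)
    tag = (root / "latest").read_text(encoding="utf-8").strip()
    if not tag:
        raise ValueError(f"{root / 'latest'} is empty")
    candidate = root / tag
    if not candidate.is_dir():
        raise FileNotFoundError(f"latest points at missing directory: {candidate}")
    return candidate


# Example (assumes the repository has been downloaded locally):
# print(resolve_latest_checkpoint("best"))
```

Checking that the directory exists guards against a `latest` file that was updated before the shard upload finished.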
best/monitor.json
CHANGED
@@ -1,7 +1,7 @@
 {
-"global_step":
-"test1_loss": 0.
-"test2_loss": 0.
-"test3_loss": 0.
-"combined_test_loss": 0.
+"global_step": 400,
+"test1_loss": 0.5453783056361186,
+"test2_loss": 0.6494426287417849,
+"test3_loss": 0.3703578654676676,
+"combined_test_loss": 0.6494426287417849
 }
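In the new `best/monitor.json`, `combined_test_loss` has the same value as `test2_loss`, which is the largest of the three per-test losses. The sketch below reads the values and checks that relationship. It reports an observation about this one snapshot; how the training code actually computes the combined loss is not stated in the source, and `summarize_monitor` is a hypothetical helper.

```python
import json

# Contents of best/monitor.json after this commit.
MONITOR_JSON = """
{
  "global_step": 400,
  "test1_loss": 0.5453783056361186,
  "test2_loss": 0.6494426287417849,
  "test3_loss": 0.3703578654676676,
  "combined_test_loss": 0.6494426287417849
}
"""


def summarize_monitor(text):
    """Find the worst per-test loss and compare it with the combined loss."""
    data = json.loads(text)
    per_test = {
        key: value
        for key, value in data.items()
        if key.startswith("test") and key.endswith("_loss")
    }
    worst = max(per_test, key=per_test.get)
    return {
        "global_step": data["global_step"],
        "worst_test": worst,
        "worst_loss": per_test[worst],
        "combined_matches_worst": data["combined_test_loss"] == per_test[worst],
    }


summary = summarize_monitor(MONITOR_JSON)
print(summary["worst_test"], summary["combined_matches_worst"])
```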
best/training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:358bf0a7ae023dadd4b1324f65438f9ae4887b0c804696199d43876e58313b2d
 size 7633
global_step200/bf16_zero_pp_rank_0_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:469e1bb6e9632b1494555b5f32439a44857fe3cd415a7d4a87e02460f5f24589
+size 12286638307

global_step200/bf16_zero_pp_rank_1_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e02a28c096e41bc72cd4997909578f24de1c734310d2f4a9b174f8aa4a601ae8
+size 12286638307

global_step200/bf16_zero_pp_rank_2_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:445eede74d5503a6083df422a8b0da580240dcf027095932a31235c0700aecf5
+size 12286638307

global_step200/bf16_zero_pp_rank_3_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b33a346b60bd372f94bdbc17b2d077f521e3f26861ef02ed61c2c4c9e31c2737
+size 12286638307

global_step200/bf16_zero_pp_rank_4_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bbb542e3a080adc84c56218b403cbbc8bf5fbb744fc1d862d7b3ef5761d58691
+size 12286638307

global_step200/bf16_zero_pp_rank_5_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b973ffc311cff9c5ca641ced3dc1773eb210b7c7a62940fc33210c3d66934692
+size 12286638307

global_step200/bf16_zero_pp_rank_6_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fe700d997cb26ad73cd4e0d08ff1fac82f6f1075a10d2d83b7917fa9acc7c030
+size 12286638307

global_step200/bf16_zero_pp_rank_7_mp_rank_00_optim_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d5fee71665931f0dafa4f4f6c3c06f38a676830d5b0d27fc87721ce6de6d77f4
+size 12286638307
global_step200/zero_pp_rank_0_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1342d4dc7a4628d877abf125c3f613a182eafb101b5e7f7b86a5a7cef175f0af
+size 206444

global_step200/zero_pp_rank_1_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:9dfb1286b375cdebbb10946e259bdb10c194cc4c7739fde1ebe2a5b9c0931101
+size 206444

global_step200/zero_pp_rank_2_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:26d2b81375bae7ff93776b351edc5a7d3905352cf863170ac67939eb30b019bf
+size 206444

global_step200/zero_pp_rank_3_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1775ce4acf9c030a131d2ce2727d243c14094ec098699993786082316eb43c9c
+size 206444

global_step200/zero_pp_rank_4_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:82d4559b5c1bf131c94346a460b0d689044f4998ea94d90b1fc983861365f4b7
+size 206444

global_step200/zero_pp_rank_5_mp_rank_00_model_states.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0068af7d59e510134a707bf9e04711c1efbaf1eb8288eaef739836c5e72a7ab3
+size 206444