Ba2han committed
Commit 2acd93a · verified · 1 Parent(s): 960149a

Training in progress, step 175

Files changed (4)
  1. README.md +4 -4
  2. config.json +1 -1
  3. model.safetensors +2 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -1,18 +1,18 @@
 ---
-base_model: Ba2han/checkpoint-10398
+base_model: Ba2han/qwen_test_residual-attn
 library_name: transformers
 model_name: qwen-test-3
 tags:
 - generated_from_trainer
-- sft
 - unsloth
 - trl
+- sft
 licence: license
 ---
 
 # Model Card for qwen-test-3
 
-This model is a fine-tuned version of [Ba2han/checkpoint-10398](https://huggingface.co/Ba2han/checkpoint-10398).
+This model is a fine-tuned version of [Ba2han/qwen_test_residual-attn](https://huggingface.co/Ba2han/qwen_test_residual-attn).
 It has been trained using [TRL](https://github.com/huggingface/trl).
 
 ## Quick start
@@ -28,7 +28,7 @@ print(output["generated_text"])
 
 ## Training procedure
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/batuhan409/huggingface/runs/v4ciay17)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/batuhan409/huggingface/runs/kng3dzth)
 
 
 This model was trained with SFT.
config.json CHANGED
@@ -56,7 +56,7 @@
   ],
   "max_position_embeddings": 8192,
   "max_window_layers": 40,
-  "model_name": "Ba2han/checkpoint-10398",
+  "model_name": "Ba2han/qwen_test_residual-attn",
   "model_type": "qwen3",
   "num_attention_heads": 8,
   "num_hidden_layers": 40,
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6a56297574b4044672a052e5e09d1309c287bc5ce144a4f544242b90c842ba35
-size 1310251320
+oid sha256:1866ec4bd5092b8d4898c2449a7271f11d8607a9dfd140824bb6c8b6cdd33dbc
+size 1311381296
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0a76cd6d33964d0fcbd52452a6f79fd3cf5d98df925136542bca5f615279d8e6
+oid sha256:625fd9bf830aafe7a1352d1dcc32d6415d22211279fdb29e6da043d6f7675a8c
 size 5713
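
The two binary files in this commit are stored as Git LFS pointers, so the diff shows only a `version` line, a SHA-256 `oid`, and a byte `size` rather than the file contents. The following is a minimal sketch of how such a pointer could be parsed and checked against a downloaded file. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or any Hugging Face library; the pointer strings are copied from the `model.safetensors` diff in this commit.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if `data` has the size and SHA-256 digest the pointer records."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (
        len(data) == pointer["size"]
        and hashlib.sha256(data).hexdigest() == pointer["digest"]
    )


# Pointer contents before and after this commit (model.safetensors).
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:6a56297574b4044672a052e5e09d1309c287bc5ce144a4f544242b90c842ba35
size 1310251320
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:1866ec4bd5092b8d4898c2449a7271f11d8607a9dfd140824bb6c8b6cdd33dbc
size 1311381296
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
size_delta = new["size"] - old["size"]
print(size_delta)  # 1129976 bytes larger after this commit
```

For a real check, `matches_pointer` would be called with the bytes of the downloaded `model.safetensors`; for a file of this size, hashing in chunks rather than reading it all into memory would be the more practical choice.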