picard47at committed (verified)
Commit 3323898 · 1 Parent(s): 8667cec

Training in progress, step 100

README.md CHANGED
@@ -1,7 +1,7 @@
  ---
- base_model: unsloth/qwen3-0.6b-unsloth-bnb-4bit
+ base_model: unsloth/qwen3-1.7b-unsloth-bnb-4bit
  library_name: transformers
- model_name: punctuation_128
+ model_name: punctuation_512
  tags:
  - generated_from_trainer
  - unsloth
@@ -10,9 +10,9 @@ tags:
  licence: license
  ---

- # Model Card for punctuation_128
+ # Model Card for punctuation_512

- This model is a fine-tuned version of [unsloth/qwen3-0.6b-unsloth-bnb-4bit](https://huggingface.co/unsloth/qwen3-0.6b-unsloth-bnb-4bit).
+ This model is a fine-tuned version of [unsloth/qwen3-1.7b-unsloth-bnb-4bit](https://huggingface.co/unsloth/qwen3-1.7b-unsloth-bnb-4bit).
  It has been trained using [TRL](https://github.com/huggingface/trl).

  ## Quick start
@@ -21,14 +21,14 @@ It has been trained using [TRL](https://github.com/huggingface/trl).
  from transformers import pipeline

  question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="picard47at/punctuation_128", device="cuda")
+ generator = pipeline("text-generation", model="picard47at/punctuation_512", device="cuda")
  output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
  print(output["generated_text"])
  ```

  ## Training procedure

- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/picardtseng-pesi/punctuation/runs/u1163x5c)
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/picardtseng-pesi/punctuation/runs/gghra5tk)


  This model was trained with SFT.
@@ -39,7 +39,7 @@ This model was trained with SFT.
  - Transformers: 4.51.3
  - Pytorch: 2.7.0
  - Datasets: 3.6.0
- - Tokenizers: 0.21.0
+ - Tokenizers: 0.21.1

  ## Citations
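The README states the model was trained with SFT using TRL on an Unsloth 4-bit base, and this commit was pushed mid-training ("Training in progress, step 100"). Below is a rough, hypothetical sketch of what such a run might look like; the dataset file, sequence length, LoRA rank, batch size, and save interval are assumptions or inferences, not values recorded in this repo.

```python
# Hypothetical sketch of the SFT run described in the card: Unsloth 4-bit base + TRL SFTTrainer.
# Dataset file, max_seq_length, LoRA rank, and batch size are assumptions, not values from this commit.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen3-1.7b-unsloth-bnb-4bit",  # base_model from the README diff
    max_seq_length=512,  # assumption; the "punctuation_512" name suggests a 512-token window
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,  # assumption; the actual rank is recorded in adapter_config.json
    target_modules=["q_proj", "v_proj"],  # matches adapter_config.json in this commit
)

dataset = load_dataset("json", data_files="punctuation_sft.jsonl", split="train")  # hypothetical dataset

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="punctuation_512",
        per_device_train_batch_size=8,  # assumption
        save_steps=100,                 # the commit message shows a push at step 100
        push_to_hub=True,
        hub_model_id="picard47at/punctuation_512",
        report_to="wandb",              # matches the W&B badge in the README
    ),
)
trainer.train()
```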
adapter_config.json CHANGED
@@ -24,8 +24,8 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "q_proj",
- "v_proj"
+ "v_proj",
+ "q_proj"
  ],
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,
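The only change here is the ordering of the two entries in target_modules; the adapter still wraps q_proj and v_proj. If you want to attach the adapter to the base model yourself rather than loading the repo through pipeline() as in the Quick start, a minimal sketch with PEFT might look like this (device and dtype choices are assumptions):

```python
# Hypothetical: load the base model and attach the LoRA adapter from this repo with PEFT.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "unsloth/qwen3-1.7b-unsloth-bnb-4bit",  # base_model from the README diff
    device_map="auto",
    torch_dtype=torch.bfloat16,  # assumption; pick what your hardware supports
)
tokenizer = AutoTokenizer.from_pretrained("picard47at/punctuation_512")

# PeftModel reads adapter_config.json and adapter_model.safetensors from the repo.
model = PeftModel.from_pretrained(base, "picard47at/punctuation_512")
model.eval()
```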
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1e147ba60049109b0f9031d1807461a6d2060239ff4c192104f14b46a1de70a3
+ oid sha256:5a32446a7acfc33ccd63119008b43f052d18bac0c83f5fe79cc4b60873353f11
  size 9189904
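adapter_model.safetensors is tracked with Git LFS, so the diff only shows the pointer file: the sha256 object id changed while the size stayed at 9,189,904 bytes. A small sketch for checking that a downloaded copy matches a pointer like the one above (repo id and filename are taken from this page; by default this fetches the latest revision, so pin revision= to a specific commit if you need this exact file):

```python
# Verify a downloaded LFS file against the sha256 oid shown in the pointer above.
import hashlib
from huggingface_hub import hf_hub_download

path = hf_hub_download("picard47at/punctuation_512", "adapter_model.safetensors")

sha256 = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha256.update(chunk)

# For the file in this commit, this should print the new oid (5a32446a7acf...).
print(sha256.hexdigest())
```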
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d6856b5b1ca5318169bd21f8ff7c0c0c9cf7021de2d4e6fa2e71bf644a16bbb1
+ oid sha256:84aff1c1c0f8723aad093d37ba912fc737efb2d4c96545539999f474102470b6
  size 6033
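training_args.bin is also an LFS pointer; the underlying file is a pickled TrainingArguments/SFTConfig object rather than readable text. A sketch for inspecting it, assuming you trust the repo (unpickling runs arbitrary code, and PyTorch 2.7 only loads it if weights_only=False is passed explicitly):

```python
# Inspect the serialized training arguments stored with this checkpoint.
# Note: this unpickles arbitrary objects; only run it on repos you trust.
import torch
from huggingface_hub import hf_hub_download

path = hf_hub_download("picard47at/punctuation_512", "training_args.bin")
args = torch.load(path, weights_only=False)

print(type(args).__name__)  # e.g. SFTConfig / TrainingArguments
print(args.learning_rate, args.per_device_train_batch_size, args.max_steps)
```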