da1ch812 committed
Commit 9229f5d · verified · 1 Parent(s): 01a2b30

Upload merged Qwen3-4B-Instruct-2507 model (auto-generated README)
README.md CHANGED
@@ -1,11 +1,11 @@
 ---
-base_model: Qwen/Qwen3-4B-Instruct-2507
+base_model: unsloth/Qwen3-4B-Instruct-2507
 datasets:
-- ALFWorld1
+- u-10bei/dbbench_sft_dataset_react
 language:
 - en
 license: apache-2.0
-library_name: peft
+library_name: transformers
 pipeline_tag: text-generation
 tags:
 - lora
@@ -17,11 +17,8 @@ tags:
 
 # <qwen3-4b-agent-trajectory-lora>
 
-This repository provides a **LoRA adapter** fine-tuned from
-**Qwen/Qwen3-4B-Instruct-2507** using **LoRA + Unsloth**.
-
-This repository contains **LoRA adapter weights only**.
-The base model must be loaded separately.
+This repository provides a merged model that includes both the base model
+**unsloth/Qwen3-4B-Instruct-2507** and the LoRA adapter. No separate LoRA loading is required.
 
 ## Training Objective
 
@@ -34,8 +31,10 @@ tool use, and recovery from errors.
 
 ## Training Configuration
 
-- Base model: Qwen/Qwen3-4B-Instruct-2507
-- Method: LoRA (full precision base)
+- Base model: unsloth/Qwen3-4B-Instruct-2507
+- Method: LoRA
+- dtype: torch.bfloat16
+- load_in_4bit: False
 - Max sequence length: 2048
 - Epochs: 2
 - Learning rate: 2e-06
@@ -45,24 +44,22 @@ tool use, and recovery from errors.
 
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-from peft import PeftModel
 import torch
 
-base = "Qwen/Qwen3-4B-Instruct-2507"
-adapter = "your_id/your-repo"
+model_id = "da1ch812/advanced-comp-model"
 
-tokenizer = AutoTokenizer.from_pretrained(base)
+tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(
-    base,
+    model_id,
     torch_dtype=torch.float16,
     device_map="auto",
 )
-model = PeftModel.from_pretrained(model, adapter)
 ```
 
 ## Sources & Terms (IMPORTANT)
 
-Training data: ALFWorld1
+Training data:
+- u-10bei/dbbench_sft_dataset_react
 
 Dataset License: MIT License. This dataset is used and distributed under the terms of the MIT License.
 Compliance: Users must comply with the MIT license (including copyright notice) and the base model's original terms of use.
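The README change above replaces adapter-only loading with a pre-merged checkpoint. What "merging" means can be sketched with plain tensors; this is a minimal illustration (dimensions, rank, and scaling are hypothetical values, not this model's actual config) — in practice the merge is performed once per linear layer, e.g. via peft's `merge_and_unload()`:

```python
import torch

# Minimal sketch of folding a LoRA adapter into a base weight matrix.
# All shapes/values here are hypothetical illustrations.
torch.manual_seed(0)
d, r, alpha = 8, 2, 4                 # hidden size, LoRA rank, LoRA alpha
W = torch.randn(d, d)                 # frozen base weight
A = torch.randn(r, d) * 0.01          # LoRA down-projection
B = torch.randn(d, r) * 0.01          # LoRA up-projection (nonzero after training)

# Merged weight: W' = W + (alpha / r) * B @ A
W_merged = W + (alpha / r) * (B @ A)

x = torch.randn(3, d)
y_adapter = x @ W.T + (alpha / r) * ((x @ A.T) @ B.T)  # base path + adapter path
y_merged = x @ W_merged.T                              # single merged matmul
print(torch.allclose(y_adapter, y_merged, atol=1e-5))  # True
```

Because the two forward passes are mathematically identical, the merged repository can be loaded with plain `AutoModelForCausalLM` and no peft dependency, exactly as the updated README shows.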
config.json CHANGED
@@ -4,7 +4,7 @@
 ],
 "attention_bias": false,
 "attention_dropout": 0.0,
-"dtype": "float16",
+"dtype": "bfloat16",
 "eos_token_id": 151645,
 "head_dim": 128,
 "hidden_act": "silu",
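The config change above flips the checkpoint dtype from float16 to bfloat16 (matching the `dtype: torch.bfloat16` entry in the new README). The practical difference is dynamic range: bfloat16 keeps float32's 8-bit exponent at the cost of mantissa precision, so large activations that overflow fp16 stay finite. A quick check, assuming torch is available:

```python
import torch

# fp16 saturates just above 65504; bf16 reaches ~3.39e38 (float32's range).
print(torch.finfo(torch.float16).max)           # 65504.0
print(torch.finfo(torch.bfloat16).max)          # ~3.39e38
print(torch.tensor(70000.0).to(torch.float16))  # inf (overflows fp16)
```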
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8a2766b91404181283f47a2560bc92500452b6242762b116ca0f9e6c65794d26
-size 4967215128
+oid sha256:b548ca350ccfd7e6b50af77cd5cb296374c029cb94acba85cae8854fa357735f
+size 4967215360
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3a94c6c210b6ce9aa28a39026d4d59f240071dc216f7da1ca2fd0603c8b77ea0
-size 3077766464
+oid sha256:c06842070760200934a1818592a41c844bb6b9d8455328c7a3c395b4a6398b59
+size 3077766632
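The safetensors diffs above change git-lfs pointer files, not the weights themselves: what git tracks is a small text stub naming the blob's sha256 and byte size. A minimal parser for that pointer format (a sketch; `parse_lfs_pointer` is a hypothetical helper, with the oid and size taken from the first shard in this commit):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a git-lfs pointer file into its key/value fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,                  # hash algorithm, e.g. "sha256"
        "oid": digest,                 # content hash of the real blob
        "size": int(fields["size"]),   # blob size in bytes
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:b548ca350ccfd7e6b50af77cd5cb296374c029cb94acba85cae8854fa357735f
size 4967215360"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])  # sha256 4967215360
```

The 232-byte size growth of shard 1 (4967215128 → 4967215360) reflects the changed tensor payload after the merge, which in turn changes the sha256 and therefore the pointer.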