hara-CU committed · Commit 46fd128 · verified · 1 Parent(s): f4ffc73

Update README.md

Files changed (1): README.md (+5 −22)
README.md CHANGED
@@ -17,36 +17,19 @@ tags:
 
 # Qwen3-4B-DBbase_AW_345NoEAd_ALFformat_QH5L4R5_1392-r16a32-B16-2ep-5e6
 
-This repository provides a **LoRA adapter** fine-tuned from
-**Qwen/Qwen3-4B-Instruct-2507** using **LoRA + Unsloth**.
+This repository contains the **full-merged 16-bit weights** fine-tuned from
+**Qwen/Qwen3-4B-Instruct-2507** using **LoRA + Unsloth**.
+No adapter loading is required.
 
 
-## Training Objective
-
-This model is trained to improve **multi-turn agent task performance**
-on ALFWorld (household tasks) and DBBench (database operations).
-
-Loss is applied to **all assistant turns** in the multi-turn trajectory,
-enabling the model to learn environment observation, action selection,
-tool use, and recovery from errors.
-
-## Training Configuration
-
-- Base model: Qwen/Qwen3-4B-Instruct-2507
-- Method: LoRA (full precision base)
-- Max sequence length: 8192
-- Epochs: 2
-- Learning rate: 5e-06
-- LoRA: r=16, alpha=32, use_rslora=False
-- TOTAL_BATCH_SIZE: 16
 
 ## Usage
-
+Since this is a merged model, you can use it directly with `transformers`.
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
 
-model_id = "hara-CU/Qwen3-4B-DBbase_AW_345NoEAd_ALFformat_QH5L4R5_1392-r16a32-B16-2ep-5e6"
+model_id = "hara-CU/Advanced_FinalCandidate_482"
 
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(
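The "full-merged weights, no adapter loading required" claim in the new README follows from how a LoRA merge works: the low-rank update is folded directly into the base weight matrix. A minimal NumPy sketch of that arithmetic, using the r=16, alpha=32, use_rslora=False values from the removed Training Configuration section (the matrix shapes here are illustrative stand-ins, not the model's real dimensions):

```python
import numpy as np

# LoRA hyperparameters from the old Training Configuration section.
r, alpha = 16, 32
scaling = alpha / r  # use_rslora=False, so the scale is alpha / r

rng = np.random.default_rng(0)
d_out, d_in = 64, 48                      # illustrative shapes only
W = rng.standard_normal((d_out, d_in))    # frozen base weight
B = rng.standard_normal((d_out, r))       # LoRA "up" projection
A = rng.standard_normal((r, d_in))        # LoRA "down" projection

# Merging folds the low-rank update into the base weight once, offline,
# so inference needs only the merged checkpoint and no adapter files.
W_merged = W + scaling * (B @ A)

# The merged weight reproduces the base-plus-adapter forward pass exactly.
x = rng.standard_normal(d_in)
adapter_out = W @ x + scaling * (B @ (A @ x))
assert np.allclose(W_merged @ x, adapter_out)
```

This is why the usage snippet above loads the repository with plain `AutoModelForCausalLM.from_pretrained` and nothing else: after the merge, the checkpoint is an ordinary dense model.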