hara-CU/Advanced_FinalCandidate_482

Text Generation


hara-CU committed 2 days ago

Commit 46fd128 · verified · 1 parent: f4ffc73

Update README.md

Files changed (1)

README.md +5 -22

@@ -17,36 +17,19 @@ tags:
 # Qwen3-4B-DBbase_AW_345NoEAd_ALFformat_QH5L4R5_1392-r16a32-B16-2ep-5e6
-This repository provides a **LoRA adapter** fine-tuned from
-**Qwen/Qwen3-4B-Instruct-2507** using **LoRA + Unsloth**.
-## Training Objective
-This model is trained to improve **multi-turn agent task performance**
-on ALFWorld (household tasks) and DBBench (database operations).
-Loss is applied to **all assistant turns** in the multi-turn trajectory,
-enabling the model to learn environment observation, action selection,
-tool use, and recovery from errors.
-## Training Configuration
-- Base model: Qwen/Qwen3-4B-Instruct-2507
-- Method: LoRA (full precision base)
-- Max sequence length: 8192
-- Epochs: 2
-- Learning rate: 5e-06
-- LoRA: r=16, alpha=32, use_rslora=False
-- TOTAL_BATCH_SIZE: 16
 ## Usage
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
-model_id = "hara-CU/Qwen3-4B-DBbase_AW_345NoEAd_ALFformat_QH5L4R5_1392-r16a32-B16-2ep-5e6"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(

 # Qwen3-4B-DBbase_AW_345NoEAd_ALFformat_QH5L4R5_1392-r16a32-B16-2ep-5e6
+This repository contains the **full-merged 16-bit weights** fine-tuned from
+**Qwen/Qwen3-4B-Instruct-2507** using **LoRA + Unsloth**.
+No adapter loading is required.
 ## Usage
+Since this is a merged model, you can use it directly with `transformers`.
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
+model_id = "hara-CU/Advanced_FinalCandidate_482"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = AutoModelForCausalLM.from_pretrained(