---
base_model: unsloth/Qwen3-14B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- trl
license: apache-2.0
language:
- en
---

I edited this `README.md` locally, but `unsloth` overwrote it on upload. That's not good.

A LoRA adapter fine-tuned from `unsloth/Qwen3-14B-unsloth-bnb-4bit` with `unsloth`.

Based on [this tutorial](https://docs.unsloth.ai/basics/qwen3-how-to-run-and-fine-tune).

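This card doesn't record the LoRA configuration; the tutorial's defaults look roughly like the following sketch, which is not necessarily the exact setup used for this adapter:

```python
from unsloth import FastLanguageModel

# Tutorial-default LoRA setup; the ranks and target modules below are
# assumptions, not settings recorded in this repo.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)
```
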
## Data

The training data consists of 237 scenarios about dependent eligibility under [26 U.S.C. § 152(a)-(d)](https://www.law.cornell.edu/uscode/text/26/152), generated with `gemini-2.5-pro-preview-03-25` but **not** checked for correctness.

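Each scenario pairs a question with a reference answer, which is then rendered into the Alpaca-style prompt shown under Usage. The exact schema isn't reproduced here; a hypothetical record might look like:

```python
# Hypothetical record; field names and content are illustrative only,
# not taken from the actual dataset.
example = {
    "input": "Can I claim my 19 year old daughter? She is a full-time student.",
    "output": "Likely yes. Under 26 U.S.C. 152(c), a full-time student "
              "under age 24 can be a qualifying child, provided the "
              "residency and support tests are also met.",
}
```
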
Training arguments, run on an A100 (40GB):

```python
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

training_args = TrainingArguments(
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,  # effective batch size: 8 * 4 = 32
    num_train_epochs=16,
    warmup_steps=16,
    learning_rate=2e-4,
    fp16=not is_bfloat16_supported(),  # fall back to fp16 on pre-Ampere GPUs
    bf16=is_bfloat16_supported(),
    logging_steps=10,
    optim="adamw_8bit",
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=3407,  # https://arxiv.org/abs/2109.08203
    output_dir="outputs",
    report_to="none",
)
```

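In the Unsloth workflow these arguments are passed to a `trl` `SFTTrainer`. A minimal sketch, assuming a `dataset` whose `"text"` column already contains prompts rendered with the template below (the dataset wiring is an assumption, and the API details vary across `trl` versions):

```python
from trl import SFTTrainer

# Sketch only: `dataset` and its "text" column are assumptions.
trainer = SFTTrainer(
    model=model,                # the LoRA model returned by unsloth
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # column holding fully formatted prompts
    max_seq_length=2048,
    args=training_args,         # the TrainingArguments above
)
trainer.train()
```
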
## Usage

To use:

```python
from unsloth import FastLanguageModel

max_seq_length = 2048
dtype = None         # autodetect: bfloat16 on Ampere+, else float16
load_in_4bit = True  # load the base model in 4-bit to save memory

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="doabell/dependent-qlora",  # this adapter
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)
FastLanguageModel.for_inference(model)  # enable faster inference
```

Template (the second slot is left empty at inference so the model writes the response):

```python
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
You are an experienced lawyer in dealing with dependents for US tax purposes.

### Input:
{}

### Response:
{}"""
```

Streaming:

```python
from transformers import TextStreamer

inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Can I claim my 7 year old son? He is an instagram influencer and earned $505 last year.",
            "",  # response left blank for the model to fill in
        )
    ],
    return_tensors="pt",
).to("cuda")

text_streamer = TextStreamer(tokenizer)
_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=256)
```

No streaming:

```python
inputs = tokenizer(
    [
        alpaca_prompt.format(
            "Can I claim my 7 year old son? He is an instagram influencer and earned $5050 last year.",
            "",  # response left blank for the model to fill in
        )
    ],
    return_tensors="pt",
).to("cuda")

outputs = model.generate(**inputs, max_new_tokens=256, use_cache=True)
tokenizer.batch_decode(outputs)
```

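`batch_decode` returns the prompt plus the generated text. To keep only the response, one option (an assumption, not from the tutorial) is to slice off the prompt tokens before decoding:

```python
# Drop the prompt tokens so only the newly generated text remains.
prompt_length = inputs["input_ids"].shape[1]
response = tokenizer.batch_decode(
    outputs[:, prompt_length:], skip_special_tokens=True
)[0]
print(response)
```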