# Qwen3 4B Tax
A fine-tuned version of Qwen3-4B specialized for U.S. tax law reasoning with IRC citation support.
## Model Details
- Base model: Qwen/Qwen3-4B
- Architecture: Qwen3ForCausalLM
- Parameters: ~4B
- Precision: bfloat16
- Fine-tuned with: Unsloth
## Training
Fine-tuned using Unsloth with LoRA (rank=16, alpha=16) on synthetic U.S. tax law Q&A data covering:
- Individual taxation
- Business entity taxation
- Estate and gift tax
- International tax (CFCs, GILTI, FDII)
- Tax procedure and compliance
LoRA target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
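A minimal sketch of how this adapter configuration might be reproduced with Unsloth. The rank, alpha, and target modules come from the list above; `max_seq_length`, `lora_dropout`, and the 16-bit loading choice are assumptions, not published settings:

```python
from unsloth import FastLanguageModel

# Load the base model. max_seq_length is an assumed value; bf16 LoRA
# (load_in_4bit=False) is assumed given the bfloat16 precision listed above.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-4B",
    max_seq_length=2048,
    load_in_4bit=False,
)

# Attach LoRA adapters with the rank/alpha and target modules listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,  # assumed; Unsloth's fast path expects 0
    bias="none",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)
```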
Training details:
- 3 epochs, 9 training steps (packed sequences)
- Effective batch size: 16 (per-device batch size 4 × 4 gradient accumulation steps)
- Learning rate: 2e-4 with cosine schedule
- Optimizer: adamw_8bit
- Final training loss: 1.64
- Trainable parameters: 33M / 4B (0.81%)
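The hyperparameters above map onto a TRL `SFTConfig` roughly as follows. This is a sketch, not the exact training script; `train_dataset` and `output_dir` are placeholders:

```python
from trl import SFTConfig, SFTTrainer

# Hyperparameters mirror the training details listed above.
config = SFTConfig(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size 16
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="adamw_8bit",
    packing=True,                   # packed sequences, as noted above
    output_dir="outputs",           # placeholder
)

trainer = SFTTrainer(
    model=model,                    # the LoRA-wrapped model from the sketch above
    train_dataset=train_dataset,    # hypothetical: the synthetic tax Q&A set
    args=config,
)
trainer.train()
```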
The model was trained using SFT with structured reasoning in IRAC format (Issue, Rule, Application, Conclusion) and IRC citation grounding.
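The exact data schema is not published; below is a plausible sketch of a single training example in chat format, with the assistant response following IRAC and grounding its rule in the IRC:

```python
# Hypothetical training example; the real synthetic data is not published.
example = {
    "messages": [
        {"role": "user", "content": "Can a single-member LLC elect S corporation status?"},
        {"role": "assistant", "content": (
            "Issue: Whether a single-member LLC may elect S corporation status.\n"
            "Rule: Under IRC § 1361(b), an S corporation must be a domestic "
            "corporation (or an entity electing corporate treatment) with only "
            "eligible shareholders.\n"
            "Application: The LLC can elect corporate classification and then "
            "make the election under IRC § 1362(a).\n"
            "Conclusion: Yes, provided all § 1361 eligibility requirements are met."
        )},
    ]
}
```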
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "DJLougen/qwen3-4b-tax"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": "You are a tax law assistant. Provide accurate analysis with IRC citations."},
    {"role": "user", "content": "What are the requirements for a corporation to elect S corporation status under IRC § 1362?"},
]

# add_generation_prompt=True makes the model start a fresh assistant turn.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=512)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
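Qwen3 bases support toggling the built-in reasoning mode through the chat template. Whether this fine-tune was trained with thinking traces is not stated, so treat the flag below as something to experiment with:

```python
# enable_thinking is a Qwen3 chat-template option; behavior on this
# fine-tune is not documented, so try both settings.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=False,  # set True to allow <think> reasoning blocks
    return_tensors="pt",
).to(model.device)
```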
## Limitations
- Trained on synthetic data; not a substitute for professional tax advice
- Coverage is focused on U.S. federal tax law
- Small training set (100 examples); intended as an experiment to see how well fine-tuning alone could handle tax research
- The model was fine-tuned and evaluated inside a hybrid RAG pipeline with rule-based section forcing, code-computed tax calculations, disambiguation chunks for complex statutes, and an agentic self-correction loop; reported behavior depends on that scaffolding and may differ when the model is used standalone. Evaluation covered complex tax scenarios including SSTB phase-outs, passive loss exceptions, and nonqualified use proration.
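The pipeline internals are not published. As an illustration only, rule-based section forcing could look like a keyword-to-section map whose hits are always injected ahead of the retrieved context; every name below, including `load_statute_text`, is hypothetical:

```python
# Hypothetical sketch of rule-based section forcing; not the actual pipeline.
FORCED_SECTIONS = {
    "s corporation": ["IRC § 1361", "IRC § 1362"],
    "gilti": ["IRC § 951A"],
    "qbi": ["IRC § 199A"],
    "home sale exclusion": ["IRC § 121"],
}

def force_sections(query: str, retrieved_chunks: list[str]) -> list[str]:
    """Prepend statute chunks for any keyword the query matches,
    so the model always sees the controlling section."""
    forced = [
        section
        for keyword, sections in FORCED_SECTIONS.items()
        if keyword in query.lower()
        for section in sections
    ]
    # load_statute_text is a hypothetical lookup into the statute store.
    return [load_statute_text(s) for s in forced] + retrieved_chunks
```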