# Model Card for ryomac/lora_sft_full_ep2

## Model Details
- Model type: LoRA adapter (PEFT)
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Goal: improve structured-output performance on JSON-style competition prompts
- Author: ryomac
## Intended Use
- This repository contains adapter weights only.
- Use together with the base model for inference or continued fine-tuning.
- Suitable for controlled structured generation tasks.
## Out-of-Scope Use
- Medical/legal/financial high-stakes decisions.
- Fully automated decision pipelines without human validation.
## Training Data
- Main SFT dataset: u-10bei/structured_data_with_cot_dataset_512_v2
- Additional local preprocessing/filtering was applied before training.
- The public evaluation set was used only as an evaluation/inference target.
## Training Procedure
- Method: Supervised Fine-Tuning (SFT) with LoRA
- Run profile: full training variant (full_ep2)
- Precision: fp16 (Mac MPS environment)
## Evaluation
- Evaluated with a local competition-style inference/evaluation workflow.
- Performance varies with prompt template and decoding parameters.
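Since results depend on decoding settings, it can help to pin them explicitly and reuse them across runs. A minimal sketch, where the parameter values are illustrative examples rather than the settings used during training:

```python
# Illustrative decoding parameters for strict JSON-style generation.
# Values are examples only; tune them per task and prompt template.
decoding_params = {
    "max_new_tokens": 512,
    "do_sample": False,          # greedy decoding is often safer for strict JSON
    "repetition_penalty": 1.05,  # mild penalty to discourage key repetition
}

# These would then be passed to generation as:
#   model.generate(**inputs, **decoding_params)
```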
## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-4B-Instruct-2507"
adapter_id = "ryomac/lora_sft_full_ep2"

# Load the base model and tokenizer, then attach the LoRA adapter weights.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
```

## Limitations
- JSON consistency can degrade on unseen prompt distributions.
- Output quality depends on decoding settings and prompt design.
- Task-specific validation is required before production use.
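Because JSON consistency can degrade on unseen prompts, outputs should be parsed and checked rather than trusted directly. A minimal validation sketch (the helper name and fence-stripping heuristic are illustrative, not part of this project's scripts):

```python
import json
from typing import Optional


def validate_json_output(text: str) -> Optional[dict]:
    """Return the parsed object if the model output is valid JSON, else None.

    Models sometimes wrap JSON in markdown code fences; strip them first.
    Assumes the expected output is a JSON object.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (possibly "```json") and the closing fence.
        cleaned = cleaned.split("\n", 1)[-1]
        cleaned = cleaned.rsplit("```", 1)[0]
    try:
        parsed = json.loads(cleaned)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None
```

Downstream code can then branch on `None` to retry generation or flag the sample for human review.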
## Contact
For reproducibility details, refer to the training/inference scripts under `scripts/` in this project.