Model Card for ryomac/lora_sft_full_ep2

Model Details

  • Model type: LoRA adapter (PEFT)
  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Goal: improve structured-output performance on JSON-style competition prompts
  • Author: ryomac

Intended Use

  • This repository contains adapter weights only.
  • Use together with the base model for inference or continued fine-tuning.
  • Suitable for controlled structured generation tasks.

Out-of-Scope Use

  • Medical/legal/financial high-stakes decisions.
  • Fully automated decision pipelines without human validation.

Training Data

  • Main SFT dataset: u-10bei/structured_data_with_cot_dataset_512_v2
  • Additional local preprocessing/filtering was applied for training.
  • The public evaluation set was used only as an evaluation/inference target, not for training.

Training Procedure

  • Method: Supervised Fine-Tuning (SFT) with LoRA
  • Run profile: full training variant (full_ep2)
  • Precision: fp16 (Mac MPS environment)

Evaluation

  • Evaluated with a local, competition-style inference/evaluation workflow.
  • Performance varies with prompt template and decoding parameters.
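For JSON-style outputs, a simple validity check is a natural part of such a workflow. A minimal, standard-library-only sketch follows; the metric and function name are illustrative, not part of the actual evaluation scripts:

```python
import json

def json_valid_rate(outputs: list[str]) -> float:
    """Fraction of model outputs that parse as valid JSON (illustrative metric)."""
    if not outputs:
        return 0.0
    ok = 0
    for text in outputs:
        try:
            json.loads(text)
            ok += 1
        except json.JSONDecodeError:
            pass
    return ok / len(outputs)

# Two of these three outputs parse as valid JSON.
rate = json_valid_rate(['{"a": 1}', "not json", "[1, 2]"])
print(rate)
```

A stricter variant could additionally validate the parsed object against an expected schema rather than accepting any well-formed JSON.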

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-4B-Instruct-2507"
adapter_id = "ryomac/lora_sft_full_ep2"

# Load the base model first, then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Example generation via the model's chat template (prompt is illustrative):
messages = [{"role": "user", "content": "Return the answer as JSON."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Limitations

  • JSON consistency can degrade on unseen prompt distributions.
  • Output quality depends on decoding settings and prompt design.
  • Task-specific validation is required before production usage.

Contact

For reproducibility details, refer to the training/inference scripts under scripts/ in this project.
