Model Card for ryomac/lora_sft_v2_weak650_ep2

Model Details

  • Model type: LoRA adapter (PEFT) for causal language modeling
  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Purpose: Structured output tuning for JSON-style answer formatting in competition-style prompts
  • Developer: ryomac

Intended Use

  • This repository provides adapter weights only (adapter_model.safetensors).
  • Load this adapter on top of the base model for inference/fine-tuning.
  • Recommended use: controlled generation tasks requiring stable structured responses.

Out-of-Scope Use

  • High-stakes decision making (medical/legal/financial).
  • Safety-critical or fully autonomous production workflows without human review.

Training Data

  • Main SFT source: u-10bei/structured_data_with_cot_dataset_512_v2
  • Additional local preprocessing/filtering was applied for training convenience.
  • Evaluation and public benchmark files were used only as inference/evaluation targets and were excluded from training.

Training Procedure

  • Method: Supervised Fine-Tuning (SFT) with LoRA
  • Settings used in this run:
    • Epochs: 2
    • Learning rate: low range (tuned for the competition setting)
    • LoRA rank/alpha: medium-to-high rank setup
    • Mixed precision: fp16 (Apple MPS environment)
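The settings above can be sketched as a PEFT configuration. The exact rank, alpha, target modules, and learning rate for this run are not published in this card, so every value below is an illustrative placeholder, not the actual training configuration:

```python
from peft import LoraConfig

# Illustrative placeholders only: the card describes a "medium-to-high rank"
# LoRA setup but does not publish the concrete values used in this run.
lora_config = LoraConfig(
    r=32,                # placeholder rank
    lora_alpha=64,       # placeholder alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    lora_dropout=0.05,   # placeholder
    task_type="CAUSAL_LM",
)
```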

Evaluation

  • This adapter was validated in a local, competition-style evaluation.
  • A reference score is reported in the project documentation (docs/report_v2_weak650_ep2.md).
  • Results can vary with the prompt template, decoding parameters, and inference implementation.

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-4B-Instruct-2507"
adapter_id = "ryomac/lora_sft_v2_weak650_ep2"

# Load the base model, then attach the LoRA adapter on top of it.
tok = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, adapter_id)

# Example generation using the base model's chat template.
messages = [{"role": "user", "content": "Return the answer as JSON."}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256)
print(tok.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Limitations

  • Structured output quality is sensitive to prompt format and decoding settings.
  • May produce malformed JSON or schema mismatches on out-of-distribution tasks.
  • Performance claims are task-specific and should be independently verified.
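Because the adapter can still emit malformed JSON on out-of-distribution inputs, a post-generation validity check is worth adding. A minimal sketch using only the standard library; `extract_json` is a hypothetical helper written for this card, not part of this project:

```python
import json
import re


def extract_json(text: str):
    """Try to parse a model response as JSON.

    Falls back to the outermost {...} block if the full text is not valid
    JSON; returns None when nothing parseable is found.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    return None


print(extract_json('The answer is {"score": 42}.'))  # → {'score': 42}
print(extract_json("no json here"))                  # → None
```

Rejecting or re-prompting on a `None` result is a cheap guard against schema mismatches downstream.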

Contact

For reproducibility details, see the project repository and the training scripts under scripts/.
