satoyutaka/llm2025_main_0

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using the PEFT library on Apple Silicon (M4).

This repository contains LoRA adapter weights only. The base model must be loaded separately.

Training Objective

This adapter is specifically trained to improve structured output accuracy (JSON / XML) for high-difficulty tasks. A key feature of this model is the "No-Preamble" training: the model is trained to output JSON/XML starting directly with { or < characters, minimizing irrelevant introductory text ("Here is your JSON...") to ensure compatibility with strict parsers.

Training Configuration

Base model: Qwen/Qwen3-4B-Instruct-2507
Method: LoRA (Standard PEFT)
Precision: bfloat16 (trained on M4 MPS)
Max steps: 1000
Learning rate: 5e-5 (Cosine scheduler)
LoRA Parameters: r=16, alpha=32
Target Modules: All major projections (q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj)
Data Augmentation: LLM-free BERT-based substitution

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "satoyutaka/llm2025_main_0" 

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

Sources & License (IMPORTANT)

Training Data: daichira/structevalt-3k-mix-sft (with BERT-based augmentation)
Dataset License: Creative Commons Attribution (CC-BY-4.0). This dataset is used and redistributed under the terms of the CC-BY-4.0 license.
Compliance: Users must comply with both the dataset's attribution requirements and the base model's original terms of use.

＜日本語訳＞

qwen3-4b-struct-evaluation-lora-v1

このリポジトリは、Qwen/Qwen3-4B-Instruct-2507 をベースモデルとし、Apple Silicon (M4) 環境の PEFT ライブラリを用いてファインチューニングされた LoRA アダプターを提供します。

【重要】本リポジトリには LoRA アダプターの重みのみが含まれています。ベースモデルは別途ロードする必要があります。

学習の目的

このアダプターは、高難度なタスクにおける構造化出力（JSON / XML）の精度向上を目的としてトレーニングされています。本モデルの大きな特徴は、「前置きの排除（No-Preamble）」学習です。AIが回答の冒頭に不要な文章（例：「はい、承知しました。以下がJSONです：」）を書かず、直接 { や < から書き始めるように訓練されており、厳格なパースを必要とするシステムとの互換性を最大化しています。

学習設定

ベースモデル: Qwen/Qwen3-4B-Instruct-2507
手法: LoRA (Standard PEFT)
精度: bfloat16 (M4 MPS デバイスにて学習)
最大ステップ数: 1000
学習率: 5e-5 (Cosine スケジューラ)
LoRA パラメータ: r=16, α=32
ターゲットモジュール: 主要な全プロジェクション層 (q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj)
データ拡張: LLMを使用しないBERTベースの単語置換

使い方

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "satoyutaka/llm2025_main_0" 

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)

ソースおよびライセンス（重要）

学習データ: daichira/structevalt-3k-mix-sft (BERTベースの拡張を追加)
データセットライセンス: Creative Commons Attribution (CC-BY-4.0) 本データセットは、CC-BY-4.0 ライセンスの条項に基づき、使用および再配布が可能です。
遵守事項: 利用者は、データセットの帰属表記（クレジット）に関する要件、およびベースモデルの元の利用規約の両方を遵守する必要があります。

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support