Upload LoRA adapter (README written by author)

Files changed (9)

.gitattributes +1 -0
README.md +314 -0
adapter_config.json +46 -0
adapter_model.safetensors +3 -0
added_tokens.json +28 -0
special_tokens_map.json +31 -0
tokenizer.json +3 -0
tokenizer_config.json +239 -0
vocab.json +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,314 @@

+---
+base_model: Qwen/Qwen3-4B-Instruct-2507
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
+- lora
+- transformers
+---
+# Qwen3-4B-Instruct LoRA Fine-tuned Model
+A LoRA adapter for Qwen3-4B-Instruct-2507, fine-tuned on structured-data datasets that include Chain-of-Thought (CoT) reasoning.
+## Model Details
+### Model Description
+This LoRA adapter was trained with SFT (Supervised Fine-Tuning) on multiple structured datasets (including CoT reasoning), using Qwen3-4B-Instruct-2507 as the base model. Training combines 4-bit quantization (NF4) of the base model with LoRA for efficient fine-tuning.
+- **Developed by:** u-10bei
+- **Model type:** Causal Language Model (LoRA Adapter)
+- **Language(s) (NLP):** Japanese, English
+- **License:** Follows the base model's license
+- **Finetuned from model:** Qwen/Qwen3-4B-Instruct-2507
+### Model Sources
+- **Repository:** [GitHub Repository URL]
+- **Base Model:** [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
+## Uses
+### Direct Use
+This model can be used for:
+- Understanding and generating structured data
+- Complex problem-solving including Chain-of-Thought reasoning
+- Conversational tasks in Japanese and English
+### Recommended Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+# Load base model and LoRA adapter
+base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
+model = PeftModel.from_pretrained(base_model, "path/to/adapter")
+tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
+# Inference
+messages = [{"role": "user", "content": "Your question"}]
+text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
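+# Move the input tensors to the same device as the model
+inputs = tokenizer(text, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs, max_new_tokens=512)
+# Decode only the newly generated tokens, skipping the echoed prompt
+print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))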
+```
+## Bias, Risks, and Limitations
+This model has the following known limitations:
+- **Training Data Bias:** The model is trained on specific structured datasets and may not generalize well to domains outside the training distribution
+- **Language Limitations:** While supporting Japanese and English, performance may vary between languages
+- **Sequence Length:** Trained with a maximum sequence length of 512 tokens, which may be insufficient for very long contexts
+- **Quantization Effects:** 4-bit quantization may introduce minor accuracy degradation compared to full-precision models
+- **CoT Reasoning:** Chain-of-Thought capabilities are limited to patterns seen in training data
+### Recommendations
+Users should:
+- Validate model outputs for their specific use case before production deployment
+- Be aware of potential biases in structured data generation tasks
+- Consider the 512-token training limit when designing prompts and applications
+- Test thoroughly with domain-specific data to ensure adequate performance
+- Monitor for hallucinations or incorrect reasoning in CoT tasks
+## How to Get Started with the Model
+### Configuration via Environment Variables
+The training script (train.py) can be configured using the following environment variables:
+#### Required Settings
+- `SM_MODEL_DIR`: Model output directory (default: /opt/ml/model)
+- `SM_HPS`: Hyperparameters JSON string
+#### MLflow Settings
+- `MLFLOW_TRACKING_URI`: MLflow tracking server URI
+- `MLFLOW_EXPERIMENT_NAME`: Experiment name (default: qwen3-sft-grpo)
+#### Hyperparameters (JSON in SM_HPS)
+```json
+{
+  "base_model": "Qwen/Qwen3-4B-Instruct-2507",
+  "dataset_id": "u-10bei/structured_data_with_cot_dataset_512_v2",
+  "max_seq_len": "512",
+  "seed": "3407",
+  "lora_r": "64",
+  "lora_alpha": "128",
+  "sft_epochs": "1",
+  "sft_batch_size": "2",
+  "sft_lr": "1e-6",
+  "grpo_epochs": "1",
+  "grpo_batch_size": "1",
+  "grpo_lr": "5e-7",
+  "sft_val_ratio": "0.05",
+  "upsample_enable": "false",
+  "upsample_rules_json": ""
+}
+```
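Every value in the JSON above is a string, so a consumer has to cast each one before use. The following is a minimal sketch of that parsing step, not the code in train.py: the helper name `load_hyperparameters` and the `CASTERS` type table are assumptions introduced for illustration.

```python
import json
import os


def _to_bool(value):
    """Interpret common string spellings of a boolean flag."""
    return str(value).strip().lower() in ("1", "true", "yes")


# Hypothetical type table: one caster per key shown in the JSON above.
CASTERS = {
    "max_seq_len": int, "seed": int, "lora_r": int, "lora_alpha": int,
    "sft_epochs": int, "sft_batch_size": int, "sft_lr": float,
    "grpo_epochs": int, "grpo_batch_size": int, "grpo_lr": float,
    "sft_val_ratio": float, "upsample_enable": _to_bool,
}


def load_hyperparameters(env=None):
    """Read SM_HPS from the environment and cast string values to typed ones.

    Keys without an entry in CASTERS (e.g. base_model) are kept as strings.
    """
    env = os.environ if env is None else env
    raw = json.loads(env.get("SM_HPS", "{}"))
    return {key: CASTERS.get(key, str)(value) for key, value in raw.items()}


if __name__ == "__main__":
    demo_env = {"SM_HPS": '{"sft_lr": "1e-6", "lora_r": "64", "upsample_enable": "false"}'}
    print(load_hyperparameters(demo_env))
```

Passing a plain dict as `env` keeps the function testable without touching the real process environment.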
+### Running Training
+```bash
+# Set environment variables and run
+export SM_MODEL_DIR="./output"
+export SM_HPS='{"base_model":"Qwen/Qwen3-4B-Instruct-2507","sft_epochs":"1"}'
+python train.py
+```
+## Training Details
+### Training Data
+Training combines five structured datasets (including CoT reasoning):
+- u-10bei/structured_data_with_cot_dataset_512_v2
+- u-10bei/structured_data_with_cot_dataset_512_v5
+- u-10bei/structured_data_with_cot_dataset_512_v4
+- u-10bei/structured_data_with_cot_dataset_512
+- u-10bei/structured_data_with_cot_dataset_v2
+Data preprocessing includes:
+- Conversion to OpenAI Chat format (messages: [{role, content}, ...])
+- Filtering out samples with empty Assistant responses
+- Keeping only samples that end with an Assistant turn
+- Train/Validation split (default 95:5)
+- Optional upsampling functionality
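The filtering and splitting steps listed above can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not the author's pipeline: the function names `is_valid` and `split_train_val` are hypothetical, samples are assumed to already be in the `messages` format, and the shuffle-then-slice split is one possible way to obtain a 95:5 ratio.

```python
import random


def is_valid(sample):
    """Keep a sample only if it ends with an Assistant turn and
    none of its Assistant turns is empty."""
    messages = sample.get("messages") or []
    if not messages or messages[-1].get("role") != "assistant":
        return False
    return all(
        (m.get("content") or "").strip()
        for m in messages
        if m.get("role") == "assistant"
    )


def split_train_val(samples, val_ratio=0.05, seed=3407):
    """Filter samples, shuffle deterministically, and split into (train, val)."""
    kept = [s for s in samples if is_valid(s)]
    random.Random(seed).shuffle(kept)
    n_val = int(len(kept) * val_ratio)
    return kept[n_val:], kept[:n_val]


if __name__ == "__main__":
    good = {"messages": [{"role": "user", "content": "hi"},
                         {"role": "assistant", "content": "hello"}]}
    empty = {"messages": [{"role": "user", "content": "hi"},
                          {"role": "assistant", "content": ""}]}
    train, val = split_train_val([good] * 40 + [empty])
    print(len(train), len(val))
```

The default `seed` and `val_ratio` reuse the values from the hyperparameter JSON (`seed`, `sft_val_ratio`).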
+### Training Procedure
+#### Quantization Configuration
+- **Quantization Method:** 4-bit NF4 quantization (BitsAndBytes)
+- **Compute Precision:** float16 (optimized for T4 GPU)
+- **Double Quantization:** Enabled
+#### LoRA Configuration
+- **LoRA Rank (r):** 64 (default)
+- **LoRA Alpha:** 128 (default)
+- **Target Modules:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+- **LoRA Dropout:** 0
+- **Task Type:** CAUSAL_LM
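To make the LoRA settings above concrete, the sketch below counts the trainable adapter parameters and computes the LoRA scaling factor `lora_alpha / r`. The base-model dimensions (hidden size, MLP width, layer count, head counts) are assumptions not stated in this README and should be checked against the base model's `config.json`.

```python
R = 64
LORA_ALPHA = 128

# Assumed base-model dimensions (verify against the base model's config.json).
HIDDEN = 2560          # hidden size
INTERMEDIATE = 9728    # MLP intermediate size
NUM_LAYERS = 36        # number of transformer layers
Q_OUT = 32 * 128       # query heads * head dim
KV_OUT = 8 * 128       # key/value heads * head dim

# (in_features, out_features) for each targeted linear layer.
SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params_per_layer(r=R):
    """Each adapted layer adds A (r x in) and B (out x r): r * (in + out) weights."""
    return sum(r * (n_in + n_out) for n_in, n_out in SHAPES.values())


def lora_total_params(r=R, num_layers=NUM_LAYERS):
    return num_layers * lora_params_per_layer(r)


if __name__ == "__main__":
    total = lora_total_params()
    print("scaling (alpha / r):", LORA_ALPHA / R)
    print("adapter parameters:", total)
    print("size if stored as float32 (bytes):", total * 4)
```

Under these assumptions the adapter holds about 132M parameters, and float32 storage would take about 528 MB, which is close to the 528550256-byte size recorded in the LFS pointer for `adapter_model.safetensors`.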
+#### Training Hyperparameters
+- **Training regime:** fp16 mixed precision
+- **Epochs:** 1 (default)
+- **Batch Size:** 2 per device (default)
+- **Gradient Accumulation Steps:** 8
+- **Learning Rate:** 1e-6 (default)
+- **LR Scheduler:** Cosine
+- **Warmup Ratio:** 0.1
+- **Weight Decay:** 0.05
+- **Max Sequence Length:** 512 (default)
+- **Optimizer:** AdamW (Transformers standard)
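Two quantities follow directly from the hyperparameters above: the effective batch size per optimizer step, and the learning rate at a given step under linear warmup followed by cosine decay. The sketch below shows the usual shape of such a schedule; it is not taken from train.py, and `num_devices` and `total_steps` are hypothetical inputs since the README does not state them.

```python
import math


def effective_batch_size(per_device=2, grad_accum=8, num_devices=1):
    """Samples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def lr_at_step(step, total_steps, base_lr=1e-6, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at_step(step, total_steps=1000))
```

With the defaults listed above and a single GPU, each optimizer step sees 2 x 8 = 16 samples.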
+#### Loss Calculation Method
+- **Assistant-Only Loss:** Loss is computed only on Assistant response tokens; User input tokens are masked with the label -100
+- **Padding Mask:** Padding tokens are also excluded from the loss
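The masking described above can be illustrated with plain lists. This is a minimal sketch assuming a hypothetical helper `build_labels` and a precomputed per-token `assistant_mask`; how train.py actually locates Assistant spans is not shown in this README.

```python
IGNORE_INDEX = -100  # label value skipped by the cross-entropy loss


def build_labels(input_ids, assistant_mask, attention_mask):
    """Return labels that keep the token id only where the token belongs to
    an Assistant response and is not padding; everything else is ignored.

    input_ids:      list[int], token ids of the full conversation
    assistant_mask: list[bool], True for tokens inside an Assistant response
    attention_mask: list[int], 1 for real tokens, 0 for padding
    """
    if not (len(input_ids) == len(assistant_mask) == len(attention_mask)):
        raise ValueError("all inputs must have the same length")
    return [
        tok if is_assistant and attended == 1 else IGNORE_INDEX
        for tok, is_assistant, attended in zip(input_ids, assistant_mask, attention_mask)
    ]


if __name__ == "__main__":
    ids = [11, 12, 13, 14, 0, 0]
    asst = [False, False, True, True, False, False]
    attn = [1, 1, 1, 1, 0, 0]
    print(build_labels(ids, asst, attn))
```

Using the attention mask rather than the pad token id avoids accidentally masking a real token that happens to share the pad id.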
+#### Evaluation and Saving Settings
+- **Evaluation Strategy:** steps
+- **Eval Steps:** 50
+- **Save Strategy:** steps
+- **Save Steps:** 100
+- **Save Total Limit:** 2
+- **Logging Steps:** 10
+#### MLflow Integration
+- Automatically logs training parameters, metrics, and models
+- Experiment name: qwen3-sft-grpo (default)
+## Evaluation
+### Testing Data, Factors & Metrics
+#### Testing Data
+The model uses a validation split (5% by default) from the combined training datasets for evaluation during training. No separate held-out test set is currently defined.
+#### Factors
+Evaluation considers:
+- Loss convergence across training steps
+- Performance on validation set samples
+- Assistant response generation quality
+#### Metrics
+- **Training Loss:** Cross-entropy loss on Assistant-only tokens
+- **Validation Loss:** Evaluated every 50 steps to monitor overfitting
+- **Perplexity:** Computed as exp(validation loss); lower values mean the model assigns higher probability to the reference Assistant tokens
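As a small worked example of the relationship between validation loss and perplexity, assuming the loss is the mean per-token cross-entropy in nats (the loss values below are illustrative, not results from this model):

```python
import math


def perplexity(mean_cross_entropy_loss):
    """Perplexity is the exponential of the mean per-token cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy_loss)


if __name__ == "__main__":
    # Illustrative loss values only.
    for loss in (0.0, 0.5, 1.0, 2.0):
        print(f"loss={loss:.2f} -> perplexity={perplexity(loss):.3f}")
```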
+### Results
+Results vary based on hyperparameters and training duration. With default settings (1 epoch, lr=1e-6):
+- Training converges within the single epoch
+- Validation loss typically stabilizes after initial warmup phase
+- Model demonstrates improved structured data understanding compared to base model
+#### Summary
+The fine-tuned model shows enhanced capabilities in:
+- Structured data generation and parsing
+- Chain-of-Thought reasoning patterns
+- Task-specific response formatting
+## Model Examination
+The model architecture consists of:
+- **Base Model:** Qwen3-4B-Instruct-2507 with 4B parameters
+- **LoRA Adapters:** Low-rank matrices (rank 64) applied to attention and MLP layers
+- **Quantization:** 4-bit NF4 quantization reduces the base model's memory footprint, at the cost of possible minor accuracy degradation
+- **Training Focus:** Assistant-only loss ensures the model learns to generate appropriate responses without overfitting to user inputs
+Key design decisions:
+- Using multiple related datasets improves generalization across structured data tasks
+- The 512-token limit balances training efficiency with practical use cases
+- FP16 precision was chosen to suit T4 GPUs, for their availability and cost-effectiveness
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications
+### Model Architecture and Objective
+- **Base Architecture:** Qwen3-4B-Instruct-2507
+- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
+- **Quantization:** 4-bit NF4 with double quantization
+- **Training Objective:** Causal Language Modeling with Assistant-Only Loss
+### Compute Infrastructure
+#### Hardware
+- **GPU:** NVIDIA T4 (recommended)
+- **Precision:** FP16 mixed precision training
+#### Software
+- **Framework:** Transformers, PEFT, TRL
+- **Quantization:** BitsAndBytes
+- **Experiment Tracking:** MLflow
+- **Key Dependencies:**
+  - transformers
+  - peft
+  - trl
+  - bitsandbytes
+  - mlflow
+  - datasets
+  - torch
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.18.1

adapter_config.json ADDED

	@@ -0,0 +1,46 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.1",
+  "qalora_group_size": 16,
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "up_proj",
+    "o_proj",
+    "down_proj",
+    "gate_proj",
+    "v_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:124b524fc7b8db42fc239f9d04f15b16f789e39d69ddaf7ca838702c30674e4b
+size 528550256

added_tokens.json ADDED

	@@ -0,0 +1,28 @@

+{
+  "</think>": 151668,
+  "</tool_call>": 151658,
+  "</tool_response>": 151666,
+  "<think>": 151667,
+  "<tool_call>": 151657,
+  "<tool_response>": 151665,
+  "<|box_end|>": 151649,
+  "<|box_start|>": 151648,
+  "<|endoftext|>": 151643,
+  "<|file_sep|>": 151664,
+  "<|fim_middle|>": 151660,
+  "<|fim_pad|>": 151662,
+  "<|fim_prefix|>": 151659,
+  "<|fim_suffix|>": 151661,
+  "<|im_end|>": 151645,
+  "<|im_start|>": 151644,
+  "<|image_pad|>": 151655,
+  "<|object_ref_end|>": 151647,
+  "<|object_ref_start|>": 151646,
+  "<|quad_end|>": 151651,
+  "<|quad_start|>": 151650,
+  "<|repo_name|>": 151663,
+  "<|video_pad|>": 151656,
+  "<|vision_end|>": 151653,
+  "<|vision_pad|>": 151654,
+  "<|vision_start|>": 151652
+}

special_tokens_map.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "eos_token": {
+    "content": "<|im_end|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1574cf58b63a2a56db9bc28f6ddcac4ece87690840939153189077692486f4ee
+size 11422920

tokenizer_config.json ADDED

	@@ -0,0 +1,239 @@

+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "151643": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151644": {
+      "content": "<|im_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151645": {
+      "content": "<|im_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151646": {
+      "content": "<|object_ref_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151647": {
+      "content": "<|object_ref_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151648": {
+      "content": "<|box_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151649": {
+      "content": "<|box_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151650": {
+      "content": "<|quad_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151651": {
+      "content": "<|quad_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151652": {
+      "content": "<|vision_start|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151653": {
+      "content": "<|vision_end|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151654": {
+      "content": "<|vision_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151655": {
+      "content": "<|image_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151656": {
+      "content": "<|video_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "151657": {
+      "content": "<tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151658": {
+      "content": "</tool_call>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151659": {
+      "content": "<|fim_prefix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151660": {
+      "content": "<|fim_middle|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151661": {
+      "content": "<|fim_suffix|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151662": {
+      "content": "<|fim_pad|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151663": {
+      "content": "<|repo_name|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151664": {
+      "content": "<|file_sep|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151665": {
+      "content": "<tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151666": {
+      "content": "</tool_response>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151667": {
+      "content": "<think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    },
+    "151668": {
+      "content": "</think>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": false
+    }
+  },
+  "additional_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": {},
+  "model_max_length": 1010000,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}

vocab.json ADDED

The diff for this file is too large to render. See raw diff