---
library_name: peft
license: apache-2.0
base_model: Qwen/Qwen2.5-32B
tags:
- generated_from_trainer
datasets:
- StudyPal/education
model-index:
- name: outputs/qwen25-croatian-education
  results: []
---

[Built with Axolotl](https://github.com/axolotl-ai-cloud/axolotl)
See axolotl config

axolotl version: `0.8.0.dev0`
```yaml
base_model: Qwen/Qwen2.5-32B

datasets:
  - path: StudyPal/education
    type: chat_template
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value

output_dir: ./outputs/qwen25-croatian-education

sequence_len: 1024

adapter: lora
lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
  - k_proj
  - o_proj
  - gate_proj
  - down_proj
  - up_proj

gradient_accumulation_steps: 8
micro_batch_size: 2
max_steps: 1500
optimizer: paged_adamw_8bit
learning_rate: 0.0001

load_in_4bit: true
train_on_inputs: false
bf16: auto

# Additional settings for 32B model
trust_remote_code: true
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: true
bnb_4bit_compute_dtype: bfloat16
gradient_checkpointing: true

sample_packing: true
pad_to_sequence_len: true

val_set_size: 0.001
warmup_steps: 75
save_safetensors: true
flash_attention: true
dataloader_num_workers: 8
dataloader_pin_memory: true

eval_steps: 999999
eval_strategy: steps
save_steps: 750
logging_steps: 50

special_tokens:
  bos_token: "<|im_start|>"
  eos_token: "<|im_end|>"
  pad_token: "<|endoftext|>"
```
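The `datasets` entry in the config above reads each conversation from the `conversations` field and maps the dataset's `from`/`value` keys onto the `role`/`content` keys that a chat template expects. The sketch below illustrates that renaming in plain Python. It is not Axolotl's implementation: the example record, the `user`/`assistant` role values, and the helper names `normalize_messages` and `render_chatml` are assumptions made for illustration, and the rendering is a simplified ChatML layout built from the `<|im_start|>` and `<|im_end|>` tokens listed under `special_tokens`, not the tokenizer's actual template.

```python
# Values copied from the config above.
FIELD_MESSAGES = "conversations"
PROPERTY_MAPPINGS = {"role": "from", "content": "value"}


def normalize_messages(record):
    """Rename dataset-specific keys to the role/content keys a chat template expects."""
    return [
        {target: turn[source] for target, source in PROPERTY_MAPPINGS.items()}
        for turn in record[FIELD_MESSAGES]
    ]


def render_chatml(messages):
    """Simplified ChatML rendering (assumption: the real template may add more)."""
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )


# Hypothetical record; the real dataset's contents and role names are not shown in this card.
example = {
    "conversations": [
        {"from": "user", "value": "What is photosynthesis?"},
        {"from": "assistant", "value": "It is how plants turn light into chemical energy."},
    ]
}

messages = normalize_messages(example)
print(render_chatml(messages))
```

Because `train_on_inputs` is `false` in the config, the loss would normally be computed only on the assistant turns of such a conversation, not on the user prompt.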

# outputs/qwen25-croatian-education

This model is a fine-tuned version of [Qwen/Qwen2.5-32B](https://huggingface.co/Qwen/Qwen2.5-32B) on the StudyPal/education dataset.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: Use OptimizerNames.PAGED_ADAMW_8BIT with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 75
- training_steps: 1500

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| No log        | 0.0003 | 1    | 1.4631          |

### Framework versions

- PEFT 0.14.0
- Transformers 4.49.0
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
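
The batch size and schedule figures listed under Training hyperparameters can be reproduced with a little arithmetic. The sketch below is illustrative and rests on two assumptions that the card does not state outright: that training ran on a single device (inferred from 2 × 8 = 16), and that the `cosine` scheduler uses linear warmup followed by a half-cosine decay to zero. The helper names `effective_batch_size` and `cosine_lr` are hypothetical.

```python
import math

# Values copied from the hyperparameter list and config in this card.
MICRO_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 8
SEQUENCE_LEN = 1024
LEARNING_RATE = 1e-4
WARMUP_STEPS = 75
TRAINING_STEPS = 1500
NUM_DEVICES = 1  # assumption: inferred from total_train_batch_size = 16


def effective_batch_size(micro_batch, grad_accum, num_devices):
    """Sequences contributing to one optimizer step."""
    return micro_batch * grad_accum * num_devices


def cosine_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Linear warmup, then half-cosine decay to zero (assumed schedule shape)."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / (total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_batch = effective_batch_size(MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
# With sample packing and padding to sequence_len, each sequence holds 1024 token positions.
tokens_per_step = total_batch * SEQUENCE_LEN

print(total_batch)      # 16
print(tokens_per_step)  # 16384
for step in (0, 75, 750, 1500):
    print(step, cosine_lr(step))
```

Under these assumptions, the learning rate peaks at 0.0001 when warmup ends at step 75 and decays toward zero by step 1500.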