---
library_name: transformers
license: other
base_model: inclusionAI/Ling-mini-2.0
tags:
- llama-factory
- full
model-index:
- name: outputs
  results: []
pipeline_tag: text-generation
---

# Ling Mini 2.0 Identity

This model is a fine-tuned version of [inclusionAI/Ling-mini-2.0](https://huggingface.co/inclusionAI/Ling-mini-2.0) on the identity dataset that ships with [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory).

## Training procedure

Full fine-tuning with DeepSpeed ZeRO-3 offloading on 4x A100 80GB GPUs. For a faster setup, you can use the `qingy1337/llamafactory-cu128:latest` Docker image.

### Training hyperparameters

The following hyperparameters were used during training:

```yaml
### model
model_name_or_path: inclusionAI/Ling-mini-2.0
trust_remote_code: true

### method
stage: sft
do_train: true
finetuning_type: full
deepspeed: examples/deepspeed/ds_z3_config.json

### dataset
dataset: identity
template: bailing_v2
cutoff_len: 8192
max_samples: 10000000000
overwrite_cache: true
preprocessing_num_workers: 16

### output
output_dir: ./outputs/
logging_steps: 1
save_steps: 10000000000
save_only_model: true
plot_loss: true
overwrite_output_dir: true
report_to: wandb
run_name: Test-FT

### train
per_device_train_batch_size: 2
gradient_accumulation_steps: 1
learning_rate: 1.0e-6
num_train_epochs: 10.0
lr_scheduler_type: cosine
warmup_ratio: 0.2
bf16: true
ddp_timeout: 180000000
resume_from_checkpoint: null
```

### Framework versions

- Transformers 4.56.1
- PyTorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.1
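
## Customizing the identity dataset

In the stock LLaMA-Factory repo, the `identity` dataset is an alpaca-style JSON file whose responses contain `{{name}}` and `{{author}}` placeholders meant to be filled in before training. The snippet below is a minimal sketch of that substitution; the `data/identity.json` path, the placeholder names, and the chosen name/author values are assumptions based on the upstream repo and may differ in your checkout.

```python
# Sketch only: fill the {{name}} / {{author}} placeholders in
# LLaMA-Factory's bundled identity dataset before launching SFT.
import json
from pathlib import Path

path = Path("data/identity.json")  # assumed location in the LLaMA-Factory repo
records = json.loads(path.read_text(encoding="utf-8"))

for record in records:
    # The placeholders live in the assistant responses of each record.
    record["output"] = (
        record["output"]
        .replace("{{name}}", "Ling Mini 2.0")   # hypothetical name value
        .replace("{{author}}", "inclusionAI")   # hypothetical author value
    )

path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
```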
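
## Example usage

A minimal inference sketch, not an official example: it assumes the tokenizer ships a chat template matching the `bailing_v2` template used above, and mirrors the `trust_remote_code: true` and `bf16: true` settings from the training config. Replace `model_id` with this repo's Hub id or a local checkpoint path.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "./outputs"  # or this repo's id on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # training ran in bf16
    device_map="auto",
    trust_remote_code=True,      # matches trust_remote_code: true above
)

# A single chat turn; identity fine-tunes are usually checked with this prompt.
messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```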