---
license: apache-2.0
language:
- en
---
# Model Card: LoRA Configuration Causal Language Model

## Training Specifications
### LoRA Configuration
- Configuration: `LoraConfig`
- Parameters:
  - `r`: 16
  - `lora_alpha`: 16
  - `lora_dropout`: 0.05
  - `bias`: "none"
  - `task_type`: "CAUSAL_LM"
  - `target_modules`: `['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']`
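Assembled as code, the configuration above corresponds to the following sketch using the `peft` library:

```python
from peft import LoraConfig

# LoRA adapter configuration targeting all attention and MLP projections
peft_config = LoraConfig(
    r=16,                  # rank of the low-rank update matrices
    lora_alpha=16,         # scaling factor (alpha / r = 1.0 here)
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "gate_proj", "v_proj", "up_proj",
                    "q_proj", "o_proj", "down_proj"],
)
```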
### Model to Fine-Tune
- Function: `AutoModelForCausalLM.from_pretrained`
- Parameters:
  - `model_name`
  - `torch_dtype`: `torch.float16`
  - `load_in_4bit`: True
- Configuration:
  - `use_cache`: False
### Reference Model
- Function: `AutoModelForCausalLM.from_pretrained`
- Parameters:
  - `model_name`
  - `torch_dtype`: `torch.float16`
  - `load_in_4bit`: True
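The two `from_pretrained` calls above can be sketched as follows. The card does not specify a concrete `model_name`, so the checkpoint below is a placeholder; any causal LM would slot in:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # placeholder; the card leaves this unspecified

# Policy model to fine-tune: half precision, 4-bit quantized weights
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    load_in_4bit=True,
)
model.config.use_cache = False  # required when gradient checkpointing is enabled

# Frozen reference model for DPO's implicit KL constraint, loaded identically
ref_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    load_in_4bit=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)  # used later by the trainer
```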
### Training Arguments
- Function: `TrainingArguments`
- Parameters:
  - `per_device_train_batch_size`: 4
  - `gradient_accumulation_steps`: 4
  - `gradient_checkpointing`: True
  - `learning_rate`: 5e-5
  - `lr_scheduler_type`: "cosine"
  - `max_steps`: 200
  - `save_strategy`: "no"
  - `logging_steps`: 1
  - `output_dir`: `new_model`
  - `optim`: "paged_adamw_32bit"
  - `warmup_steps`: 100
  - `bf16`: True
  - `report_to`: "wandb"
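As a `transformers.TrainingArguments` sketch (note the effective batch size of 4 × 4 = 16 per device from batching plus gradient accumulation):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size of 16 per device
    gradient_checkpointing=True,     # trade compute for activation memory
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    save_strategy="no",              # no intermediate checkpoints
    logging_steps=1,
    output_dir="new_model",
    optim="paged_adamw_32bit",       # paged optimizer suited to quantized training
    warmup_steps=100,
    bf16=True,
    report_to="wandb",
)
```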
### Create DPO Trainer
- Function: `DPOTrainer`
- Parameters:
  - `model`
  - `ref_model`
  - `args`: `training_args`
  - `train_dataset`: `dataset`
  - `tokenizer`: `tokenizer`
  - `peft_config`: `peft_config`
  - `beta`: 0.1
  - `max_prompt_length`: 1024
  - `max_length`: 1536
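Putting the pieces together, a sketch matching the `trl` `DPOTrainer` signature used in older releases (newer `trl` versions move `beta` and the length limits into a `DPOConfig`). It assumes `model`, `ref_model`, `training_args`, `tokenizer`, and `peft_config` from the steps above, plus a preference `dataset` with prompt/chosen/rejected columns:

```python
from trl import DPOTrainer

dpo_trainer = DPOTrainer(
    model,
    ref_model,
    args=training_args,
    train_dataset=dataset,     # preference pairs: prompt, chosen, rejected
    tokenizer=tokenizer,
    peft_config=peft_config,   # LoRA adapters are attached to `model` only
    beta=0.1,                  # strength of the implicit KL penalty vs. ref_model
    max_prompt_length=1024,
    max_length=1536,           # prompt + completion token budget
)
dpo_trainer.train()
```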