Model Card: Causal Language Model Fine-Tuned with LoRA and DPO
Training Specifications
LoRA Configuration
- Configuration: LoraConfig
- Parameters:
  - r: 16
  - lora_alpha: 16
  - lora_dropout: 0.05
  - bias: none
  - task_type: CAUSAL_LM
  - target_modules: ['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
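The configuration above can be sketched with the Hugging Face peft library, whose LoraConfig class takes exactly these parameter names:

```python
from peft import LoraConfig

peft_config = LoraConfig(
    r=16,                  # rank of the low-rank update matrices
    lora_alpha=16,         # scaling factor applied to the LoRA updates
    lora_dropout=0.05,     # dropout on the LoRA layers during training
    bias="none",           # bias parameters are left frozen
    task_type="CAUSAL_LM",
    target_modules=[
        "k_proj", "gate_proj", "v_proj", "up_proj",
        "q_proj", "o_proj", "down_proj",
    ],
)
```

With all attention and MLP projections listed in target_modules, adapters are attached to every linear layer of a typical Llama-style block.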
Model to Fine-Tune
- Function: AutoModelForCausalLM.from_pretrained
- Parameters:
  - model_name
  - torch_dtype: torch.float16
  - load_in_4bit: True
Reference Model
- Function: AutoModelForCausalLM.from_pretrained
- Parameters:
  - model_name
  - torch_dtype: torch.float16
  - load_in_4bit: True
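The two from_pretrained calls above can be sketched as follows, assuming the Hugging Face transformers library; the card does not name the base checkpoint, so model_name is a placeholder. Note that load_in_4bit requires bitsandbytes, and newer transformers releases prefer passing quantization_config=BitsAndBytesConfig(load_in_4bit=True) instead of the bare flag:

```python
import torch
from transformers import AutoModelForCausalLM

model_name = "base-model-id"  # placeholder: the card does not specify the base model

# Model to fine-tune; LoRA adapters are attached later via peft_config.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    load_in_4bit=True,  # 4-bit quantization via bitsandbytes
)

# Frozen reference model for DPO, loaded with identical settings.
ref_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    load_in_4bit=True,
)
```

Loading the reference model identically to the trained model keeps the DPO reward comparison on equal footing; only the fine-tuned model's LoRA weights are updated.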
Training Arguments
- Function: TrainingArguments
- Parameters:
  - per_device_train_batch_size: 4
  - gradient_accumulation_steps: 4
  - gradient_checkpointing: True
  - learning_rate: 5e-5
  - lr_scheduler_type: "cosine"
  - max_steps: 200
  - save_strategy: "no"
  - logging_steps: 1
  - output_dir: new_model
  - optim: "paged_adamw_32bit"
  - warmup_steps: 100
  - bf16: True
  - report_to: "wandb"
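These arguments map directly onto the transformers TrainingArguments class; a sketch, assuming bf16-capable hardware is available (the models themselves are loaded in float16 above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size of 16 per device
    gradient_checkpointing=True,     # trade extra compute for lower memory
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    save_strategy="no",              # no intermediate checkpoints are written
    logging_steps=1,                 # log every step
    output_dir="new_model",
    optim="paged_adamw_32bit",       # paged AdamW from bitsandbytes
    warmup_steps=100,                # half of max_steps spent warming up
    bf16=True,                       # requires bf16-capable hardware
    report_to="wandb",               # stream metrics to Weights & Biases
)
```

The paged 32-bit AdamW optimizer avoids out-of-memory spikes when training quantized models, which is why it is commonly paired with load_in_4bit.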
Create DPO Trainer
- Function: DPOTrainer
- Parameters:
  - model
  - ref_model
  - args: training_args
  - train_dataset: dataset
  - tokenizer: tokenizer
  - peft_config: peft_config
  - beta: 0.1
  - max_prompt_length: 1024
  - max_length: 1536
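Wiring the pieces together can be sketched with trl's DPOTrainer. Older trl releases accept beta, max_prompt_length, and max_length directly as shown here; newer releases move them into a DPOConfig. The names dataset and tokenizer are assumed to be a preference dataset (prompt/chosen/rejected columns) and the base model's tokenizer:

```python
from trl import DPOTrainer

# model, ref_model, training_args, and peft_config are defined in the
# sections above; dataset and tokenizer are assumed to be prepared elsewhere.
dpo_trainer = DPOTrainer(
    model,
    ref_model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,                # strength of the implicit KL penalty vs. the reference
    max_prompt_length=1024,  # prompts truncated to this many tokens
    max_length=1536,         # cap on prompt + completion length
)
dpo_trainer.train()
```

Because peft_config is passed in, the trainer attaches the LoRA adapters itself, so only the adapter weights are optimized during DPO training.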