About this model
This model was fine-tuned for code translation. For the training datasets, see the contents of the YAML config below.
Model Card for Model ID
See axolotl config
axolotl version: 0.8.0.dev0
```yaml
# Base-model settings
# Use the previously SFT-trained model as the base
base_model: kazuyamaa/gemma-2-2b-sft-merged
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

# Settings for uploading the trained model to the Hugging Face Hub
hub_model_id: kazuyamaa/gemma-2-2b-code-translate-dpo-merged
hub_strategy: "end"
push_dataset_to_hub:
hf_use_auth_token: true

# Liger Kernel settings (lighter and faster training)
plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_cross_entropy: false
liger_rope: true
liger_rms_norm: true
liger_swiglu: true
liger_fused_linear_cross_entropy: true

load_in_8bit: false
load_in_4bit: true
strict: false

chat_template: tokenizer_default
rl: dpo
rl_beta: 10.0
dpo_use_weighting: true # Optional[bool]. Whether to perform weighting.
rpo_alpha: 1.0 # Optional[float]. Weighting of NLL term in loss from RPO paper.
max_prompt_length: 512
max_length: 2048

datasets:
  - path: kazuyamaa/java-to-python-rlhf-dataset-ver01
    type: gemma.custom
    train_on_split: train
  - path: kazuyamaa/java-to-cpp-rlhf-dataset-ver01
    type: gemma.custom
    train_on_split: train
  - path: kazuyamaa/cpp-to-python-rlhf-dataset-ver01
    type: gemma.custom
    train_on_split: train
shuffle_merged_datasets: true

dataset_prepared_path: /workspace/data/fft-dpo-data-gemma-2
output_dir: /workspace/data/models/gemma-2-2b-code-translate-dpo-merged

sequence_len: 2048
sample_packing: false
eval_sample_packing: false
pad_to_sequence_len: true

# LoRA settings (leave all of these blank for full fine-tuning)
adapter: qlora
lora_model_dir:
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

wandb_project: 2b-dpo
wandb_entity: kazukitakayamas051-securities-companies
wandb_watch:
wandb_name: dpo-attempt-01
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: paged_adamw_8bit
lr_scheduler: cosine
cosine_min_lr_ratio: 0.1
learning_rate: 3e-7

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: true
early_stopping_patience:
auto_resume_from_checkpoints: true
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

save_strategy: steps
save_steps: 100
save_total_limit: 1
warmup_steps: 20
eval_steps:
eval_batch_size:
eval_table_size:
eval_max_new_tokens:
debug:
deepspeed: /workspace/axolotl/deepspeed_configs/zero3_bf16.json
weight_decay: 0.01
fsdp:
fsdp_config:
special_tokens:
  pad_token: <pad>
```
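As a rough sketch of what the `rl: dpo` settings above mean (this is not axolotl's actual implementation; the function and the toy log-probabilities are illustrative), the training objective combines the DPO preference loss, scaled by `rl_beta`, with an NLL term on the chosen completion weighted by `rpo_alpha`, following the RPO paper:

```python
import math

def dpo_rpo_loss(policy_chosen_logp, policy_rejected_logp,
                 ref_chosen_logp, ref_rejected_logp,
                 beta=10.0, rpo_alpha=1.0):
    """Illustrative per-example DPO loss with an added RPO-style NLL term.

    Log-probabilities are assumed summed over the completion tokens,
    mirroring rl_beta / rpo_alpha in the config above.
    """
    # Log-ratios of the policy vs. the frozen reference model
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_ratio - rejected_ratio)
    # Standard DPO term: -log sigmoid(margin), written as a
    # numerically stable softplus(-margin)
    if margin > 0:
        dpo_loss = math.log1p(math.exp(-margin))
    else:
        dpo_loss = -margin + math.log1p(math.exp(margin))
    # RPO adds the NLL of the chosen completion, weighted by rpo_alpha
    nll = -policy_chosen_logp
    return dpo_loss + rpo_alpha * nll

# Toy numbers: the policy prefers the chosen completion more than
# the reference does, so the DPO term is near zero and the NLL dominates
loss = dpo_rpo_loss(-10.0, -12.0, -10.5, -11.5, beta=10.0, rpo_alpha=1.0)
```

With `rpo_alpha: 0` this reduces to plain DPO; the large `rl_beta: 10.0` makes the preference margin steep, so most of the gradient signal here comes from the NLL term.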
Model Details
gemma-2-2b-sft-lora
This model is a fine-tuned version of google/gemma-2-2b on the kazuyamaa/multi-language-messages-01, kazuyamaa/code-translate-google_messages, kazuyamaa/code_x_glue_cc_code_refinement_messages, and kazuyamaa/CodeTranslatorLLM-Code-Translation_messages datasets.
Model Description
- Developed by: [More Information Needed]
- Funded by [optional]: [More Information Needed]
- Shared by [optional]: [More Information Needed]
- Model type: [More Information Needed]
- Language(s) (NLP): [More Information Needed]
- License: [More Information Needed]
- Finetuned from model [optional]: [More Information Needed]
Model Sources [optional]
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
Uses
Direct Use
[More Information Needed]
Downstream Use [optional]
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
[More Information Needed]
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
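Pending official instructions, here is a minimal, hedged sketch assuming the `hub_model_id` from the config above and standard `transformers` usage; the prompt wording is illustrative, and the turn markers follow Gemma-2's chat format:

```python
def build_gemma_prompt(user_message: str) -> str:
    """Format a single-turn prompt using Gemma-2's chat turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt(
    "Translate the following Java code to Python:\n"
    "int add(int a, int b) { return a + b; }"
)

# Generation (requires downloading the model; shown for reference only):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# repo = "kazuyamaa/gemma-2-2b-code-translate-dpo-merged"
# tok = AutoTokenizer.from_pretrained(repo)
# model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")
# inputs = tok(prompt, return_tensors="pt").to(model.device)
# out = model.generate(**inputs, max_new_tokens=256)
# print(tok.decode(out[0][inputs["input_ids"].shape[1]:],
#                  skip_special_tokens=True))
```

In practice, prefer `tokenizer.apply_chat_template(...)`, which the config's `chat_template: tokenizer_default` setting relies on, rather than hand-building the prompt string.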
Training Details
Training Data
[More Information Needed]
Training Procedure
Preprocessing [optional]
[More Information Needed]
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
Technical Specifications [optional]
Model Architecture and Objective
[More Information Needed]
Compute Infrastructure
[More Information Needed]
Hardware
[More Information Needed]
Software
[More Information Needed]
Citation [optional]
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
Glossary [optional]
[More Information Needed]
More Information [optional]
[More Information Needed]
Model Card Authors [optional]
[More Information Needed]
Model Card Contact
[More Information Needed]
Framework versions
- PEFT 0.15.0