This model was fine-tuned on a combination of two datasets: the 100k Adjusted Caption & Paragraph dataset and the Adjusted Total QA dataset.
During fine-tuning, three parts of the model were updated: the Vision Encoder, the Project Layer, and the LM (through LoRA 128).
- Vision Encoder: directly updated
- Project Layer: directly updated
- LM LoRA 128:
  - lora_target_modules:
    - 'attention.wqkv'
    - 'attention.wo'
    - 'feed_forward.w1'
    - 'feed_forward.w2'
    - 'feed_forward.w3'
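
The update scheme described above can be illustrated with a minimal, dependency-free Python sketch. It is not the training code used for this model. It only shows how a module could be classified as fully trained, LoRA-adapted, or frozen from its dotted name, using the `lora_target_modules` listed above. The prefixes `vit.` and `vision_proj.`, the example module names, the helper function names, and the reading of "LoRA 128" as rank 128 are all assumptions made for illustration.

```python
# Illustrative sketch only; not the author's training script.
# The target list is copied from the README. Everything else is an assumption.
LORA_TARGET_MODULES = [
    "attention.wqkv",
    "attention.wo",
    "feed_forward.w1",
    "feed_forward.w2",
    "feed_forward.w3",
]

# Assumption: "LoRA 128" is read here as LoRA rank r = 128.
LORA_RANK = 128

# Assumption: hypothetical name prefixes for the Vision Encoder and Project Layer.
FULLY_TRAINED_PREFIXES = ("vit.", "vision_proj.")


def training_mode(module_name: str) -> str:
    """Classify a module as 'full', 'lora', or 'frozen' from its dotted name."""
    if module_name.startswith(FULLY_TRAINED_PREFIXES):
        return "full"  # all weights are updated directly
    # A module is a LoRA target if its name ends with one of the listed suffixes,
    # matched on whole dotted components.
    for target in LORA_TARGET_MODULES:
        if module_name == target or module_name.endswith("." + target):
            return "lora"
    return "frozen"


def lora_extra_params(in_features: int, out_features: int, rank: int = LORA_RANK) -> int:
    """Trainable parameters added by one LoRA adapter on a linear layer.

    The adapter holds A with shape (rank, in_features) and
    B with shape (out_features, rank).
    """
    return rank * (in_features + out_features)
```

For example, under these assumptions `training_mode("model.layers.0.attention.wqkv")` returns `"lora"`, while a normalization layer such as `model.layers.0.attention_norm` is classified as `"frozen"`. The layer sizes passed to `lora_extra_params` depend on the base model and are not stated in this README.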