AdaptLLM
/

Adapt-MLLM-to-Domains

Model card Files Files and versions

AdaptLLM commited on Dec 2, 2024

Commit

898e89b

·

verified ·

1 Parent(s): daefbe9

Update README.md

Files changed (1) hide show

README.md +2 -2

README.md CHANGED Viewed

@@ -4,7 +4,7 @@ language:
 ---
 # Adapting Multimodal Large Language Models to Domains via Post-Training
-This repository provides an implementation preview of our paper, **On Domain-Specific Post-Training for Multimodal Large Language Models**.
 We investigate domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation.
 **(1) Data Synthesis**: Using open-source models, we develop a visual instruction synthesizer that effectively generates diverse visual instruction tasks from domain-specific image-caption pairs. **Our synthetic tasks surpass those generated by manual rules, GPT-4, and GPT-4V in enhancing the domain-specific performance of MLLMs.**
@@ -37,7 +37,7 @@ AdaMLLM represents our latest advancement in building domain-specific foundation
 - **[AdaptLLM](https://huggingface.co/papers/2309.09530): Adapt LLM to domains**
   We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.
-- **AdaMLLM: Adapt Multimodal LLM to domains**
   We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from domain-specific image-caption pairs. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.

 ---
 # Adapting Multimodal Large Language Models to Domains via Post-Training
+This repository provides an implementation preview of our paper: [On Domain-Specific Post-Training for Multimodal Large Language Models](https://huggingface.co/papers/2411.19930).
 We investigate domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation.
 **(1) Data Synthesis**: Using open-source models, we develop a visual instruction synthesizer that effectively generates diverse visual instruction tasks from domain-specific image-caption pairs. **Our synthetic tasks surpass those generated by manual rules, GPT-4, and GPT-4V in enhancing the domain-specific performance of MLLMs.**
 - **[AdaptLLM](https://huggingface.co/papers/2309.09530): Adapt LLM to domains**
   We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.
+- **[AdaMLLM](https://huggingface.co/papers/2411.19930): Adapt Multimodal LLM to domains**
   We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from domain-specific image-caption pairs. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.