Update README.md
Browse files
README.md
CHANGED
|
@@ -6,6 +6,9 @@ language:
|
|
| 6 |
|
| 7 |
This repository provides an implementation preview of our paper: [On Domain-Specific Post-Training for Multimodal Large Language Models](https://huggingface.co/papers/2411.19930).
|
| 8 |
|
|
|
|
|
|
|
|
|
|
| 9 |
We investigate domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation.
|
| 10 |
**(1) Data Synthesis**: Using open-source models, we develop a visual instruction synthesizer that effectively generates diverse visual instruction tasks from domain-specific image-caption pairs. **Our synthetic tasks surpass those generated by manual rules, GPT-4, and GPT-4V in enhancing the domain-specific performance of MLLMs.**
|
| 11 |
**(2) Training Pipeline**: While the two-stage training--initially on image-caption pairs followed by visual instruction tasks--is commonly adopted for developing general MLLMs, we apply a single-stage training pipeline to enhance task diversity for domain-specific post-training.
|
|
|
|
| 6 |
|
| 7 |
This repository provides an implementation preview of our paper: [On Domain-Specific Post-Training for Multimodal Large Language Models](https://huggingface.co/papers/2411.19930).
|
| 8 |
|
| 9 |
+
Our code will be available at [https://github.com/bigai-ai/QA-Synthesizer](https://github.com/bigai-ai/QA-Synthesizer)
|
| 10 |
+
|
| 11 |
+
|
| 12 |
We investigate domain adaptation of MLLMs through post-training, focusing on data synthesis, training pipelines, and task evaluation.
|
| 13 |
**(1) Data Synthesis**: Using open-source models, we develop a visual instruction synthesizer that effectively generates diverse visual instruction tasks from domain-specific image-caption pairs. **Our synthetic tasks surpass those generated by manual rules, GPT-4, and GPT-4V in enhancing the domain-specific performance of MLLMs.**
|
| 14 |
**(2) Training Pipeline**: While the two-stage training--initially on image-caption pairs followed by visual instruction tasks--is commonly adopted for developing general MLLMs, we apply a single-stage training pipeline to enhance task diversity for domain-specific post-training.
|