Update README.md

README.md

@@ -41,25 +41,28 @@

**Code**: [https://github.com/bigai-ai/QA-Synthesizer](https://github.com/bigai-ai/QA-Synthesizer)

## Contact
Daixuan Cheng: `daixuancheng6@gmail.com`

## About

AdaMLLM is our latest effort to enhance task generalization of (M)LLMs by scaling synthetic supervised tasks based on unsupervised contexts.

<p align='left'>
<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/HUN3Cr66w_xpj5_c7QQaI.png" width="1000">
</p>

- [AdaptLLM](https://huggingface.co/papers/2309.09530)
  We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training (see the sketch after this list). Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.

- [Instruction Pre-Training](https://huggingface.co/papers/2406.14491)
  We develop a general-purpose instruction synthesizer which significantly increases task diversity for LM pre-training, outperforming vanilla pre-training in both general pre-training from scratch and domain-adaptive continual pre-training.

- AdaMLLM
  We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from image-caption data. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.

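The recipes above are described only at a high level; as a purely illustrative sketch (not code from the QA-Synthesizer release, with hypothetical rules and function names), the snippet below shows the pattern shared by all three efforts: hand-written rules turn an unlabeled domain passage into instruction-response pairs, so supervision is synthesized from the corpus itself rather than annotated by hand.

```python
import re
from typing import Dict, List


def make_reading_comprehension_tasks(passage: str) -> List[Dict[str, str]]:
    """Toy rule-based task synthesis: derive QA-style tasks from raw domain text."""
    tasks: List[Dict[str, str]] = []

    # Rule 1: definition-style sentences ("X is Y.") become "What is X?" tasks.
    for match in re.finditer(r"([A-Z][\w\- ]+?) is ([^.]+)\.", passage):
        term, definition = match.group(1).strip(), match.group(2).strip()
        tasks.append({
            "instruction": f"According to the text, what is {term}?",
            "response": f"{term} is {definition}.",
        })

    # Rule 2: every passage also yields a one-sentence summarization task.
    first_sentence = passage.split(".")[0].strip()
    if first_sentence:
        tasks.append({
            "instruction": "Summarize the passage in one sentence.",
            "response": first_sentence + ".",
        })
    return tasks


if __name__ == "__main__":
    corpus = (
        "Tier 1 capital is the core capital a bank holds in reserve. "
        "It absorbs losses without forcing the bank to cease trading."
    )
    for task in make_reading_comprehension_tasks(corpus):
        print(task)
```

AdaptLLM relies on such hand-written rules, while the later efforts replace them with learned synthesizers (an instruction synthesizer for text, a visual instruction synthesizer for image-caption data); the input-output contract stays the same: unsupervised context in, supervised instruction-response pairs out.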

Looking ahead, we envision further broadening the scope of supervised task synthesis, efficiently enhancing the general capabilities of trained models.

## Citation
If you find our work helpful, please cite us.

@@ -74,14 +77,24 @@

[Instruction Pre-Training](https://huggingface.co/papers/2406.14491) (EMNLP 2024)
```bibtex
@article{cheng2024instruction,
  title={Instruction Pre-Training: Language Models are Supervised Multitask Learners},
  author={Cheng, Daixuan and Gu, Yuxian and Huang, Shaohan and Bi, Junyu and Huang, Minlie and Wei, Furu},
  journal={arXiv preprint arXiv:2406.14491},
  year={2024}
}
```

[Adapt LLM to Domains](https://huggingface.co/papers/2309.09530) (ICLR 2024)
```bibtex
@inproceedings{cheng2024adapting,
  title={Adapting Large Language Models via Reading Comprehension},
  author={Daixuan Cheng and Shaohan Huang and Furu Wei},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=y886UXPEZ0}
}
```