Update README.md

README.md

@@ -41,25 +41,28 @@

**Code**: [https://github.com/bigai-ai/QA-Synthesizer](https://github.com/bigai-ai/QA-Synthesizer)

## Contact
Daixuan Cheng: `daixuancheng6@gmail.com`

## About

AdaMLLM is our latest effort to enhance task generalization of (M)LLMs by scaling synthetic supervised tasks based on unsupervised contexts.

<p align='left'>
<img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/HUN3Cr66w_xpj5_c7QQaI.png" width="1000">
</p>

- [AdaptLLM](https://huggingface.co/papers/2309.09530)
  We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training (see the sketch after this list). Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.

- [Instruction Pre-Training](https://huggingface.co/papers/2406.14491)
  We develop a general-purpose instruction synthesizer which significantly increases task diversity for LM pre-training, outperforming vanilla pre-training in both general pre-training from scratch and domain-adaptive continual pre-training.

- AdaMLLM
  We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from image-caption data. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.

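The recipes above are described only at a high level; as a purely illustrative sketch (not code from the QA-Synthesizer release, with hypothetical rules and function names), the snippet below shows the pattern shared by all three efforts: hand-written rules turn an unlabeled domain passage into instruction-response pairs, so supervision is synthesized from the corpus itself rather than annotated by hand.

```python
import re
from typing import Dict, List


def make_reading_comprehension_tasks(passage: str) -> List[Dict[str, str]]:
    """Toy rule-based task synthesis: derive QA-style tasks from raw domain text."""
    tasks: List[Dict[str, str]] = []

    # Rule 1: definition-style sentences ("X is Y.") become "What is X?" tasks.
    for match in re.finditer(r"([A-Z][\w\- ]+?) is ([^.]+)\.", passage):
        term, definition = match.group(1).strip(), match.group(2).strip()
        tasks.append({
            "instruction": f"According to the text, what is {term}?",
            "response": f"{term} is {definition}.",
        })

    # Rule 2: every passage also yields a one-sentence summarization task.
    first_sentence = passage.split(".")[0].strip()
    if first_sentence:
        tasks.append({
            "instruction": "Summarize the passage in one sentence.",
            "response": first_sentence + ".",
        })
    return tasks


if __name__ == "__main__":
    corpus = (
        "Tier 1 capital is the core capital a bank holds in reserve. "
        "It absorbs losses without forcing the bank to cease trading."
    )
    for task in make_reading_comprehension_tasks(corpus):
        print(task)
```

AdaptLLM relies on such hand-written rules, while the later efforts replace them with learned synthesizers (an instruction synthesizer for text, a visual instruction synthesizer for image-caption data); the input-output contract stays the same: unsupervised context in, supervised instruction-response pairs out.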

Looking ahead, we envision further broadening the scope of supervised task synthesis, efficiently enhancing the general capabilities of trained models.

## Citation
If you find our work helpful, please cite us.

@@ -74,14 +77,24 @@

[Instruction Pre-Training](https://huggingface.co/papers/2406.14491) (EMNLP 2024)
```bibtex
@article{cheng2024instruction,
  title={Instruction Pre-Training: Language Models are Supervised Multitask Learners},
  author={Cheng, Daixuan and Gu, Yuxian and Huang, Shaohan and Bi, Junyu and Huang, Minlie and Wei, Furu},
  journal={arXiv preprint arXiv:2406.14491},
  year={2024}
}
```

[Adapt LLM to Domains](https://huggingface.co/papers/2309.09530) (ICLR 2024)
```bibtex
@inproceedings{cheng2024adapting,
  title={Adapting Large Language Models via Reading Comprehension},
  author={Daixuan Cheng and Shaohan Huang and Furu Wei},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=y886UXPEZ0}
}
```