AdaptLLM committed
Commit 858c699 · verified · 1 Parent(s): 1e83a2b

Update README.md

Files changed (1): README.md (+12, -20)
README.md CHANGED
@@ -30,27 +30,24 @@ We investigate domain adaptation of MLLMs through post-training, focusing on dat
 
 ## About
 
-AdaMLLM is our latest effort to enhance task generalization of (M)LLMs by scaling synthetic supervised tasks based on unsupervised contexts.
+AdaMLLM represents our latest advancement in building domain-specific foundation models through post-training on synthetic supervised tasks derived from unsupervised contexts.
 
 <p align='left'>
-  <img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/HUN3Cr66w_xpj5_c7QQaI.png" width="1000">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/650801ced5578ef7e20b33d4/2aPl6mKIyHeQp8SO4TXAk.png" width="700">
 </p>
 
-- [AdaptLLM](https://huggingface.co/papers/2309.09530)
-We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.
-
-- [Instruction Pre-Training](https://huggingface.co/papers/2406.14491)
-We develop a general-purpose instruction synthesizer which significantly increases task diversity for LM pre-training, outperforming vanilla pre-training in both general pre-training from scratch and domain-adaptive continual pre-training.
 
-- AdaMLLM
-We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from image-caption data. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.
+- **[AdaptLLM](https://huggingface.co/papers/2309.09530): Adapt LLM to domains**
+We employ rule-based methods to extract tasks from domain-specific corpora, reformatting them into reading comprehension tasks for continued pre-training. Our 7B finance model outperforms domain-specific models of much larger scales, such as BloombergGPT-50B.
 
-Looking ahead, we envision further broadening the scope of supervised task synthesis, efficiently enhancing the general capabilities of trained models.
+- **[AdaMLLM](https://huggingface.co/papers/2411.19930): Adapt Multimodal LLM to domains**
+We extend supervised task synthesis to multimodality, introducing a unified visual instruction synthesizer to extract instruction-response pairs from domain-specific image-caption pairs. Our synthetic tasks outperform those generated by manual rules, GPT-4, and GPT-4V in improving domain-specific performance for MLLMs.
 
 
 ## Citation
-If you find our work helpful, please consider citing us:
+If you find our work helpful, please cite us.
 
+AdaMLLM
 ```bibtex
 @article{adamllm,
 title={On Domain-Specific Post-Training for Multimodal Large Language Models},
@@ -58,14 +55,10 @@ If you find our work helpful, please consider citing us:
 journal={arXiv preprint arXiv:2411.19930},
 year={2024}
 }
+```
 
-@article{instructPT,
-title={Instruction Pre-Training: Language Models are Supervised Multitask Learners},
-author={Cheng, Daixuan and Gu, Yuxian and Huang, Shaohan and Bi, Junyu and Huang, Minlie and Wei, Furu},
-journal={arXiv preprint arXiv:2406.14491},
-year={2024}
-}
-
+[AdaptLLM](https://huggingface.co/papers/2309.09530) (ICLR 2024)
+```bibtex
 @inproceedings{
 adaptllm,
 title={Adapting Large Language Models via Reading Comprehension},
@@ -74,5 +67,4 @@ booktitle={The Twelfth International Conference on Learning Representations},
 year={2024},
 url={https://openreview.net/forum?id=y886UXPEZ0}
 }
-
-```
+```
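
The AdaptLLM entry above describes rule-based reformatting of raw domain text into reading-comprehension tasks for continued pre-training. As a rough illustration of that idea, here is a minimal, hypothetical Python sketch; the mining rule, function names, and output format are assumptions for illustration, not the paper's released pipeline:

```python
# Hypothetical sketch of rule-based task synthesis in the spirit of AdaptLLM's
# reading-comprehension reformatting; all rules and names here are illustrative.
import re

def mine_definition_tasks(text: str) -> list[dict]:
    """Turn simple 'X is/are Y.' sentences into question-answer pairs."""
    tasks = []
    for sent in re.split(r"(?<=[.!?])\s+", text):
        m = re.match(r"^([A-Z][\w -]{2,40}?)\s+(is|are)\s+(.+)[.!?]$", sent)
        if m:
            term, verb, definition = m.groups()
            tasks.append({
                "question": f"What {verb} {term.strip().lower()}?",
                "answer": definition.strip(),
            })
    return tasks

def to_reading_comprehension(text: str) -> str:
    """Append the mined tasks after the raw text, so continued pre-training
    sees a context followed by comprehension questions about it."""
    blocks = [text]
    for t in mine_definition_tasks(text):
        blocks.append(f"Question: {t['question']}\nAnswer: {t['answer']}")
    return "\n\n".join(blocks)

doc = ("Collateral is an asset pledged by a borrower to secure a loan. "
       "Lenders may seize it if the borrower defaults.")
print(to_reading_comprehension(doc))
```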
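Similarly, the AdaMLLM entry sketches a unified visual instruction synthesizer that turns domain image-caption pairs into instruction-response training data. The snippet below is only a hedged sketch of that data flow under assumed names; `call_synthesizer` is a stand-in for whatever synthesizer model is used, not an API from the release:

```python
# Hypothetical sketch of the AdaMLLM-style data flow: a synthesizer model turns
# a domain image-caption pair into supervised instruction-response records.
# `call_synthesizer` is a placeholder callable, not an API from the release.
from dataclasses import dataclass
from typing import Callable

@dataclass
class VisualTask:
    image_path: str
    instruction: str
    response: str

Synthesizer = Callable[[str, str], list[tuple[str, str]]]

def synthesize_tasks(image_path: str, caption: str,
                     call_synthesizer: Synthesizer) -> list[VisualTask]:
    """Prompt the synthesizer with the image and caption, then wrap each
    returned (instruction, response) pair as a training record."""
    prompt = (
        "Given this image and its caption, write diverse instruction-response "
        f"pairs that require understanding the image.\nCaption: {caption}"
    )
    pairs = call_synthesizer(image_path, prompt)
    return [VisualTask(image_path, ins, res) for ins, res in pairs]

# Example with a trivial stub standing in for a real synthesizer model:
stub: Synthesizer = lambda img, _: [
    ("Describe the finding in this scan.", "A small nodule in the left lung.")
]
records = synthesize_tasks("scan_001.png", "Chest CT, axial view.", stub)
print(records[0].instruction)
```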