AIDC-AI/Marco-LLM-GLO

StarscreamDeceptions committed on Feb 28

Commit 74eab29 · verified · 1 Parent(s): 0f03d01

Update README.md

Files changed (1)

README.md +48 -1

@@ -32,4 +32,51 @@ language:
 - pl
 base_model:
 - Qwen/Qwen2-7B
----

+---
+# Marco-LLM-GLO
+## Introduction
+Marco-LLM is a series of advanced multilingual language models designed to bridge the performance gap between high-resource and low-resource languages. This repository contains the Marco-LLM base language model with 7 billion parameters.
+The model has undergone extensive multilingual continual pretraining on a diverse dataset containing over 5 trillion tokens, with a particular focus on enhancing performance in low-resource languages while maintaining strong capabilities in high-resource languages like English and Chinese.
+Compared to state-of-the-art open-source language models, Marco-LLM demonstrates significant improvements in multilingual tasks, including machine translation, question answering, and reasoning across multiple languages.
+For more details, please refer to our [Hugging Face page](https://huggingface.co/AIDC-AI/Marco-LLM-GLO).
+## Model Details
+Marco-LLM includes a 7B-parameter model based on the Transformer architecture. Its key features are:
+- Multilingual Training: The model is trained on a large-scale multilingual dataset covering 29 languages, including both high-resource languages (e.g., English, Chinese) and low-resource languages (e.g., Kazakh, Nepali).
+- Enhanced Tokenizer: An improved tokenizer is used to better handle multilingual data, ensuring higher efficiency and accuracy in tokenization.
+- Post-Training: Marco-LLM supports various post-training methods, such as Supervised Fine-tuning (SFT) and Direct Preference Optimization (DPO), to further enhance performance for specific tasks and languages.
+## Usage
+Using the base language model for direct text generation is not advised. Instead, apply post-training methods such as Supervised Fine-tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), or continued pretraining to adapt the model to specific use cases.
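As a starting point for the post-training described above, the following is a minimal sketch of loading the base checkpoint with the Hugging Face `transformers` library. It assumes that `transformers` and PyTorch are installed and that the checkpoint loads through the generic `AutoTokenizer` and `AutoModelForCausalLM` classes, which is an assumption based on the Qwen2-7B base model listed in the metadata rather than something this README states. The helper name `load_base_model` is hypothetical.

```python
# Repository ID taken from the model page URL referenced in this README.
MODEL_ID = "AIDC-AI/Marco-LLM-GLO"


def load_base_model(model_id: str = MODEL_ID):
    """Load the tokenizer and base model as a starting point for post-training.

    Hypothetical helper, not part of the Marco-LLM release. The import is
    deferred so that defining the function does not require transformers.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" uses the dtype stored in the checkpoint config.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    return tokenizer, model
```

Calling `tokenizer, model = load_base_model()` downloads the 7B-parameter weights on first use. The returned objects can then be passed to a fine-tuning pipeline of your choice.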
+## Citation
+If you find our work helpful, please cite it.
+```
+@article{unique_identifier,
+title={Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement},
+journal={arXiv},
+volume={},
+number={2412.04003},
+year={2024},
+url={https://arxiv.org/abs/2412.04003}
+}
+```