FreedomIntelligence
/

Apollo-MoE-1.5B

@@ -55,34 +55,24 @@ pipeline_tag: question-answering
 tags:
 - biology
 - medical
----
 # Democratizing Medical LLMs For Much More Languages
 Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish, Arabic, Russian, Japanese, Korean, German, Italian, and Portuguese, plus 38 Minor Languages so far.
-<center>
 <p align="center">
-   📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>  • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> • 🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a>
 </p>
-![Apollo](assets/apollo_medium_final.png)
-## 🌈 Update
-* **[2024.10.15]** ApolloMoE repo is published! 🎉
-## Languages Coverage
-12 Major Languages and 38 Minor Languages
-<details>
-  <summary>Click to view the Languages Coverage</summary>
-   ![ApolloMoE](assets/languages.png)
-</details>
 ## Architecture
@@ -98,24 +88,29 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
 ### Dense
    🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>
 <details>
   <summary>Click to view the Dense Models Results</summary>
    ![ApolloMoE](assets/dense_results.png)
 </details>
 ### Post-MoE
    🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>
 <details>
   <summary>Click to view the Post-MoE Models Results</summary>
    ![ApolloMoE](assets/post_moe_results.png)
 </details>
 ## Usage Format
 #### Apollo2
@@ -125,7 +120,7 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
 #### Apollo-MoE
 - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
 ## Dataset & Evaluation
 - Dataset
@@ -139,12 +134,12 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
    </details>
 - Evaluation
   🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>
    <details><summary>Click to expand</summary>
      - EN:
        - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
        - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
@@ -180,25 +175,29 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
      - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
      - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)
    </details>
 ## Results reproduction
    <details><summary>Click to expand</summary>
-   We take Apollo2-7B or Apollo-MoE-0.5B as an example
    1. Download the dataset for the project:
       ```
-      bash 0.download_data.sh
       ```
-   2. Prepare test and dev data for a specific model:
-      - Create test data with special tokens
        ```
        bash "1.data_process_test&dev.sh"
@@ -206,21 +205,23 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
    3. Prepare training data for a specific model (create tokenized data in advance):
-      - You can adjust the training data order and the number of training epochs in this step
        ```
        bash 2.data_process_train.sh
        ```
    4. Train the model
-      - To train on multiple nodes, refer to ./src/sft/training_config/zero_multi.yaml
        ```
-       bash 3.single_node_train.sh
        ```
@@ -230,6 +231,12 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
          bash 4.eval.sh
          ```
    </details>

 tags:
 - biology
 - medical
 # Democratizing Medical LLMs For Much More Languages
 Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish, Arabic, Russian, Japanese, Korean, German, Italian, and Portuguese, plus 38 Minor Languages so far.
 <p align="center">
+   📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>  • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> • 🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a> • 🌐 <a href="https://github.com/FreedomIntelligence/ApolloMoE" target="_blank">ApolloMoE</a>
 </p>
+![Apollo](assets/apollo_medium_final.png)
+## 🌈 Update
+* **[2024.10.15]** ApolloMoE repo is published! 🎉
 ## Architecture
 ### Dense
    🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>
 <details>
   <summary>Click to view the Dense Models Results</summary>
    ![ApolloMoE](assets/dense_results.png)
 </details>
 ### Post-MoE
    🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a>  • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>
 <details>
   <summary>Click to view the Post-MoE Models Results</summary>
    ![ApolloMoE](assets/post_moe_results.png)
 </details>
 ## Usage Format
 #### Apollo2
 #### Apollo-MoE
 - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
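As a minimal sketch of the Apollo-MoE template above, the following Python helpers build an inference prompt and a complete training string. The names `build_prompt` and `build_training_text` are hypothetical and chosen only for illustration. Ending the inference prompt at `Assistant:` is an assumption; only the `User:`/`Assistant:` layout and the `<|endoftext|>` terminator come from the format line.

```python
EOS_TOKEN = "<|endoftext|>"  # terminator taken from the format line above


def build_prompt(query: str) -> str:
    """Return the text the model is asked to continue (assumed inference form)."""
    return f"User:{query}\nAssistant:"


def build_training_text(query: str, response: str) -> str:
    """Return one complete example, closed with the end-of-text token."""
    return f"{build_prompt(query)}{response}{EOS_TOKEN}"


if __name__ == "__main__":
    # Hypothetical query, used only to show the resulting layout.
    print(build_prompt("What are common symptoms of anemia?"))
```

Keeping the prompt builder separate from the training-text builder means the same prefix is used for both inference and fine-tuning data.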
 ## Dataset & Evaluation
 - Dataset
    </details>
 - Evaluation
   🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>
    <details><summary>Click to expand</summary>
      - EN:
        - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
        - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
      - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
      - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)
    </details>
 ## Results reproduction
    <details><summary>Click to expand</summary>
+   We take Gemma-2b as an example
    1. Download the dataset for the project:
       ```
+      bash 0.download_data.sh
       ```
+   2. Prepare test and dev data for a specific model:
+      - Create test data with special tokens; you can use ./util/check.ipynb to check a model's special tokens
        ```
        bash "1.data_process_test&dev.sh"
    3. Prepare training data for a specific model (create tokenized data in advance):
+      - You can adjust the training data order and the number of training epochs in this step
        ```
        bash 2.data_process_train.sh
        ```
    4. Train the model
+      - To train on multiple nodes, refer to ./scripts/multi_node_train_*.sh
        ```
+       bash 3.single_node_train_gemma.sh
        ```
          bash 4.eval.sh
          ```
+   6. Evaluate your model: try out your checkpoints from the command line
+         ```
+         python ./src/evaluate/cli_demo.py --model_name='./ckpts/your/path/tfmr'
+         ```
    </details>
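The steps above can be sketched as a small dry-run helper that lists the shell commands in order without running them. This is an illustration under assumptions, not part of the repository: the helper name `reproduction_commands` is hypothetical, and the script names are copied from the steps as shown. One detail worth noting is that `1.data_process_test&dev.sh` contains `&`, which an unquoted shell command line treats as a control operator, so the sketch quotes each argument with `shlex.join`.

```python
import shlex

# Script names copied from the steps above, in the order they appear.
SCRIPTS = [
    "0.download_data.sh",
    "1.data_process_test&dev.sh",
    "2.data_process_train.sh",
    "3.single_node_train_gemma.sh",
    "4.eval.sh",
]


def reproduction_commands(scripts=SCRIPTS):
    """Return shell-safe command lines, one per script, without running them."""
    return [shlex.join(["bash", name]) for name in scripts]


if __name__ == "__main__":
    for command in reproduction_commands():
        print(command)
```

Printing the commands first makes it easy to confirm the quoting before pasting them into a terminal from the repository root.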