allenai/OLMoE-1B-7B-0924-Instruct

Text Generation

Mixture of Experts

Muennighoff committed on Sep 3, 2024

Commit 5280d35 · verified · 1 Parent(s): 8da7e16

Update README.md

Files changed (1)

README.md +3 -3

README.md CHANGED

@@ -21,7 +21,7 @@ co2_eq_emissions: 1
 # Use
-Install the `transformers` & `torch` libraries and run:
+Install the `pip install git+https://github.com/Muennighoff/transformers.git` & `torch` and run:
 ```python
 from transformers import OlmoeForCausalLM, AutoTokenizer
@@ -48,8 +48,8 @@ Here's how it works: imagine you have a bunch of toys, and you want to
 Branches:
 - `main`: Preference tuned via DPO model of https://hf.co/OLMoE/OLMoE-1B-7B-0924-SFT (`main` branch)
-- `load-balancing`: Ablation with load balancing loss during DPO starting from the `load-balancing` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0924-SFT
-- `non-annealed`: Ablation starting from the `non-annealed` branch of https://hf.co/OLMoE/OLMoE-1B-7B-0924-SFT which is an SFT of the pretraining checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/OLMoE/OLMoE-1B-7B-0924)
+- `load-balancing`: Ablation with load balancing loss during DPO starting from the `load-balancing` branch of https://hf.co/allenai/OLMoE-1B-7B-0924-SFT
+- `non-annealed`: Ablation starting from the `non-annealed` branch of https://hf.co/allenai/OLMoE-1B-7B-0924-SFT which is an SFT of the pretraining checkpoint prior to annealing (branch `step1200000-tokens5033B` of https://hf.co/allenai/OLMoE-1B-7B-0924)
 - `kto`: Ablation using KTO instead of DPO. This branch is the checkpoint after 5,000 steps with the RMS optimizer. The other `kto*` branches correspond to the other checkpoints mentioned in the paper.
 # Citation
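
The branches listed in this diff can be selected at load time by passing the branch name as `revision` to `from_pretrained`. The following is a minimal sketch, not the author's snippet: it assumes the repository ID shown at the top of this page and a `transformers` build that includes `OlmoeForCausalLM` (the README change installs one from a fork). The helper names `check_branch` and `load_branch` are hypothetical, and because the other `kto*` branch names are not spelled out here, any name starting with `kto` is accepted.

```python
# Assumption: repository ID taken from the page header of this commit.
REPO_ID = "allenai/OLMoE-1B-7B-0924-Instruct"

# Branch names listed in the README diff.
KNOWN_BRANCHES = ("main", "load-balancing", "non-annealed", "kto")


def check_branch(branch: str) -> str:
    """Return the branch name if it matches one named in the README.

    Any name beginning with "kto" is accepted, since the README mentions
    further `kto*` branches without listing them.
    """
    if branch in KNOWN_BRANCHES or branch.startswith("kto"):
        return branch
    raise ValueError(f"Unknown branch: {branch!r}")


def load_branch(branch: str = "main"):
    """Load model and tokenizer from the given branch (hypothetical helper)."""
    # Imported lazily so that check_branch works without transformers installed.
    from transformers import AutoTokenizer, OlmoeForCausalLM

    revision = check_branch(branch)
    model = OlmoeForCausalLM.from_pretrained(REPO_ID, revision=revision)
    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, revision=revision)
    return model, tokenizer
```

Calling `load_branch("kto")` would download that branch's weights, so it needs network access and enough memory for the full checkpoint; `check_branch` alone can be used to validate a name first.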