# Forked Lingua
## Setup
```bash
bash setup/create_env.sh
```
Once that is done, you can activate the environment:
```bash
source ~/envs/lingua_<date>/bin/activate
```
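Because `<date>` in the path above is a placeholder, it can help to look up the environment directory instead of typing it by hand. The following is a minimal sketch, not part of this repository: the helper name `latest_lingua_env` is hypothetical, and it assumes environments live under `~/envs` with the `lingua_` prefix shown above and that the most recently modified one is the one you want.

```bash
# Hypothetical helper (not part of the repo): print the most recently
# modified lingua_* directory under a base directory (default: ~/envs).
# Prints nothing if no matching directory exists.
latest_lingua_env() {
  local base="${1:-$HOME/envs}"
  local newest="" dir
  for dir in "$base"/lingua_*/; do
    # Skip the unexpanded glob pattern when nothing matches.
    [ -d "$dir" ] || continue
    # -nt is true when $dir has a newer modification time than $newest.
    if [ -z "$newest" ] || [ "$dir" -nt "$newest" ]; then
      newest="$dir"
    fi
  done
  # Strip the trailing slash added by the glob.
  printf '%s' "${newest%/}"
}
```

With the helper defined, activation could look like `source "$(latest_lingua_env)/bin/activate"`. If the function prints nothing, no matching environment was found and the setup script likely needs to be run first.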
## Data
```bash
python setup/download_prepare_hf_data.py dclm_baseline_1.0_10prct <mem> --data_dir /mnt/bn/tiktok-mm-5/aiic/users/linzheng/data/dclm_10prct --seed 42 --nchunks <nchunks>
```
## Training
```bash
torchrun --nproc-per-node 8 -m apps.evabyte.train config=apps/evabyte/configs/evabyte_7b.yaml
```