# Forked Lingua

## Setup
```bash
bash setup/create_env.sh
```
Once that is done, you can activate the environment:
```bash
source ~/envs/lingua_<date>/bin/activate
```
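
Because the environment name embeds a date, it can be handy to look it up instead of typing it. The following is a minimal sketch, assuming the setup script created a virtualenv-style directory under `~/envs` named `lingua_<date>` (as in the command above). The helper `latest_lingua_env` is hypothetical and not part of this repository.

```bash
# Hypothetical helper (not part of the repo): print the most recently
# modified directory matching <base>/lingua_*/, or nothing if none exists.
latest_lingua_env() {
  local base="${1:-$HOME/envs}"
  # -d lists the directories themselves, -t sorts newest first.
  ls -1dt "$base"/lingua_*/ 2>/dev/null | head -n 1
}

env_dir="$(latest_lingua_env)"
if [ -n "$env_dir" ]; then
  # Assumes the directory contains bin/activate, as the README command implies.
  source "${env_dir}bin/activate"
else
  echo "no lingua_* environment found under ~/envs" >&2
fi
```

If several environments exist, this picks the one with the newest modification time, which may not be the one you intend, so check `$env_dir` before relying on it.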

## Data
```bash
python setup/download_prepare_hf_data.py dclm_baseline_1.0_10prct <mem> --data_dir /mnt/bn/tiktok-mm-5/aiic/users/linzheng/data/dclm_10prct --seed 42 --nchunks <nchunks>
```

## Training
```bash
torchrun --nproc-per-node 8 -m apps.evabyte.train config=apps/evabyte/configs/evabyte_7b.yaml
```