Commit 18f80d0 · 1 Parent(s): 871bc9e
Add note on issue with transformers version
README.md
CHANGED
@@ -25,11 +25,15 @@ python gen_vocab.py \
 --cust_vocab ./cust-data/vocab.txt
 ```
 9. Download pretrained weights from https://pan.baidu.com/s/1rARdfadQlQGKGHa3de82BA, password: 0o65
-10. Initialize weights for fine-tuning
+10. Initialize weights for fine-tuning
+Make sure you are using version 4.15.0 of transformers; install it by running `pip install transformers==4.15.0`.
 ```
 python init_custdata_model.py --cust_vocab ./cust-data/vocab.txt --pretrain_model ./weights --cust_data_init_weights_path ./cust-data/weights
 ```
 11. Train
+To enable M1 GPU support, install the dev version of transformers by running `pip install git+https://github.com/huggingface/transformers`.
+As of Dec 21, 2022, the dev version that works for me is `transformers-4.26.0.dev0`. Later stable releases may have M1 GPU support built in, in which case you don't need to install the dev version.
+If you are running the whole procedure again, remember to reinstall the older transformers version as instructed in step 10. Otherwise, the initialized weights will not be in the correct format and you will see very poor accuracy, likely due to breaking changes in how tokenization is done.
 ```
 python train.py --cust_data_init_weights_path ./cust-data/weights --checkpoint_path ./checkpoint/trocr-custdata --dataset_path "./dataset/*.jpg" --per_device_train_batch_size 8
 ```
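Because steps 10 and 11 may need different transformers versions, it can help to check what is installed before initializing weights. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`). The helper names `installed_version` and `check_init_version` are hypothetical and not part of the repository; only the pinned version `4.15.0` comes from the note in step 10.

```python
from importlib import metadata

# Version pinned in step 10 of the README note above.
REQUIRED_FOR_INIT = "4.15.0"


def installed_version(package="transformers"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_init_version(found, required=REQUIRED_FOR_INIT):
    """Return an empty string if `found` equals `required`, else a hint message.

    Hypothetical helper: it only compares version strings and does not
    inspect the weights themselves.
    """
    if found == required:
        return ""
    shown = found if found is not None else "not installed"
    return (
        f"transformers is {shown}, but step 10 expects {required}; "
        f"run `pip install transformers=={required}` first."
    )


if __name__ == "__main__":
    problem = check_init_version(installed_version())
    print(problem or "transformers version OK for weight initialization")
```

Running this before `init_custdata_model.py` would flag the case where a dev build installed for training is still active when the procedure is repeated.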