## A Lossless Syntax Tree Generator with Zero-shot Error Correction
- We follow [jam](https://huggingface.co/apcl/jam)'s pretraining procedure and use the same pretraining data, except that we also use srcML during pretraining.
- In the finetuning stage, we finetune our models for 3 epochs.
- Our [GitHub repo](https://github.com/apcl-research/autorepair) contains the code for reproducing our results with the same [data](https://huggingface.co/datasets/apcl/autorepair). A minimal sketch for downloading a checkpoint follows this list.
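
One quick way to fetch a checkpoint is through `huggingface_hub`. The sketch below is illustrative only; the `repo_id` is an assumption based on this card and the filename refers to the model files listed further down, so adjust both as needed.

```python
# Hedged sketch: download a checkpoint file from the Hugging Face Hub.
# The repo_id below is an assumption based on this card; adjust if needed.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="apcl/autorepair",   # assumed model repo id
    filename="ckpt_base.pt",     # zero-shot model file (see "Model files" below)
)
print(f"Checkpoint downloaded to {ckpt_path}")
```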
## Pretrained model parameters
| Hyperparameter | Description | Value |
| ----------- | ----------- |------------|
| e | embedding dimensions | 1024 |
| L | number of layers | 24 |
| h | attention heads | 16 |
| c | block size / context length | 256 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| r | learning rate | 3e-5 |
| y | weight decay | 1e-5 |
| iter | iterations | 570000 |
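
For reference, the table above can be expressed as a nanoGPT-style configuration. The sketch below only mirrors the table values; the variable names are illustrative and not necessarily the exact ones used in our training code.

```python
# Hedged sketch: pretraining hyperparameters from the table above as a
# nanoGPT-style config dict. Names are illustrative, values match the table.
pretrain_config = dict(
    n_embd=1024,                     # e: embedding dimensions
    n_layer=24,                      # L: number of layers
    n_head=16,                       # h: attention heads
    block_size=256,                  # c: block size / context length
    batch_size=4,                    # b: batch size
    gradient_accumulation_steps=32,  # a: accumulation steps
    learning_rate=3e-5,              # r: learning rate
    weight_decay=1e-5,               # y: weight decay
    max_iters=570000,                # iter: iterations
)
```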
## Model files
| Filename | Description |
| ------- | ------- |
| ckpt.pt | A model file for finetuning |
| ckpt_base.pt | A pretrained model file for generating syntax trees with error correction in the zero-shot setting |
| ckpt_finetune.pt | A model finetuned on the syntactic error dataset |
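
If you want to inspect one of these files before finetuning, a minimal PyTorch sketch is shown below. The key names inside the checkpoint depend on our training code, so treat the output as exploratory.

```python
# Hedged sketch: load a checkpoint file and list its top-level keys.
# The exact contents depend on the training code in our GitHub repo.
import torch

checkpoint = torch.load("ckpt_finetune.pt", map_location="cpu")
print(list(checkpoint.keys()))
```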
- Note that you can adjust the batch size and accumulation steps based on your GPU memory, but the product of batch size and accumulation steps should remain 128.
- If you finetune your models with multiple GPUs, you can turn down the accumulation steps accordingly. For example, if you finetune with 2 GPUs, you need to halve the accumulation steps; see the sketch below.
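
As a sanity check on the relation described above, the following small sketch keeps the effective batch size fixed at 128 (variable names are illustrative):

```python
# Hedged sketch: keep batch size * accumulation steps (* number of GPUs)
# equal to 128 when changing any of the three.
batch_size = 4
accumulation_steps = 32
num_gpus = 1  # with 2 GPUs, halve accumulation_steps (e.g., 16)

effective_batch = batch_size * accumulation_steps * num_gpus
assert effective_batch == 128, "batch size * accumulation steps (* GPUs) should be 128"
```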