## Jam-Contextsum
Jam-Contextsum is a GPT-2-like model fine-tuned to generate summaries that explain why a method exists.
## Jam-Contextsum Training Details
- `ckpt_pretrain.pt` is the pretrained checkpoint that we fine-tune to generate summaries of why a method exists (see the loading sketch below).
- Our [GitHub repo](https://github.com/apcl-research/jam-contextsum) contains the code for reproducing our results, using the same [data](https://huggingface.co/datasets/apcl/jam_contextsum).
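
If you want to inspect the checkpoint before fine-tuning, the sketch below shows one way to do it with PyTorch. It assumes a nanoGPT-style layout (`model` and `model_args` keys), which similar Jam checkpoints use; the exact keys in `ckpt_pretrain.pt` may differ.

```python
# Minimal sketch for inspecting the pretrained checkpoint.
# Assumes a nanoGPT-style layout ("model" and "model_args" keys);
# verify against the actual file before relying on these names.
import torch

checkpoint = torch.load("ckpt_pretrain.pt", map_location="cpu")

# Hyperparameters saved alongside the weights (assumed key name).
print(checkpoint.get("model_args"))

# Raw state dict of the GPT-2-like model.
state_dict = checkpoint["model"]
num_params = sum(t.numel() for t in state_dict.values())
print(f"{len(state_dict)} tensors, {num_params:,} parameters")
```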
## ckpt_pretrain.pt
| Hyperparameter | Description | Value |
| -------------- | ----------- | ----- |
| e | embedding dimensions | 512 |
| L | number of layers | 4 |
| h | attention heads | 4 |
| c | block size / context length | 1,024 |
| b | batch size | 4 |
| a | accumulation steps | 32 |
| d | dropout | 0.20 |
| r | learning rate | 3e-5 |
| y | weight decay | 1e-5 |
| iter | number of iterations after pretraining | 137,900 |
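
For reference, these values map onto a nanoGPT-style training configuration roughly as follows. The variable names follow nanoGPT conventions and are our assumption, not the repo's exact config file:

```python
# Hypothetical nanoGPT-style config mirroring the table above.
# Variable names are assumptions; check the repo's actual config.
n_embd = 512                       # e: embedding dimensions
n_layer = 4                        # L: number of layers
n_head = 4                         # h: attention heads
block_size = 1024                  # c: block size / context length
batch_size = 4                     # b: batch size
gradient_accumulation_steps = 32   # a: accumulation steps
dropout = 0.20                     # d: dropout
learning_rate = 3e-5               # r: learning rate
weight_decay = 1e-5                # y (assumed to be weight decay)
max_iters = 137_900                # iter: fine-tuning iterations
```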