---
language:
- "code"
thumbnail: "https://to-be-updated"
tags:
- code generation
- code translation
- bug fixing
license: "mit"
datasets:
- CodeSearchNet
- CodeXGLUE
metrics:
- EM
- CodeBLEU
---

Pretrained model for NatGen: Generative Pre-training by “Naturalizing” Source Code [[`paper`]](https://dl.acm.org/doi/abs/10.1145/3540250.3549162), [[`code`]](https://github.com/saikat107/NatGen), [[`slide`]](https://docs.google.com/presentation/d/1T6kjiohAAR1YvcNvTASR94HptA3xHGCl/edit?usp=sharing&ouid=111755026725574085503&rtpof=true&sd=true).

To load the model:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("saikatc/NatGen")
model = AutoModelForSeq2SeqLM.from_pretrained("saikatc/NatGen")
```
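Once loaded, the model can be used like any seq2seq transformer: tokenize a source-code snippet, call `generate`, and decode the result. The snippet and generation settings below are illustrative, not from the paper:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("saikatc/NatGen")
model = AutoModelForSeq2SeqLM.from_pretrained("saikatc/NatGen")

# An example "unnatural" snippet; input format and beam settings
# here are assumptions for illustration only.
code = "if (x == 0) { return true; } else { return false; }"

inputs = tokenizer(code, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```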
For citation:
```bibtex
@inproceedings{chakraborty2022natgen,
  author = {Chakraborty, Saikat and Ahmed, Toufique and Ding, Yangruibo and Devanbu, Premkumar T. and Ray, Baishakhi},
  title = {NatGen: Generative Pre-Training by “Naturalizing” Source Code},
  year = {2022},
  isbn = {9781450394130},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3540250.3549162},
  doi = {10.1145/3540250.3549162},
  booktitle = {Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
  pages = {18–30},
  numpages = {13},
  keywords = {Neural Network, Semantic Preserving Transformation, Source Code Transformer, Source Code Pre-training},
  location = {Singapore, Singapore},
  series = {ESEC/FSE 2022}
}
```