cristinaimprota committed
Commit 47ef2e1 · verified · 1 Parent(s): e4561d3

Update README.md

Files changed (1)
  1. README.md +16 -0
README.md CHANGED
@@ -55,3 +55,19 @@ This yields:
  - **`clean_training_set.json` — ~4.2M pairs**
  - Derived from The Stack
  - But with many quality issues and syntax errors filtered out.
+
+ ---
+
+ ## Citation
+
+ If you use this model, please cite the corresponding publication.
+
+ ```bibtex
+ @inproceedings{improta2025quality,
+ title={Quality In, Quality Out: Investigating Training Data's Role in AI Code Generation},
+ author={Improta, Cristina and Tufano, Rosalia and Liguori, Pietro and Cotroneo, Domenico and Bavota, Gabriele},
+ booktitle={2025 IEEE/ACM 33rd International Conference on Program Comprehension (ICPC)},
+ pages={454--465},
+ year={2025},
+ organization={IEEE Computer Society}
+ }