Update README.md
README.md
CHANGED
@@ -7,4 +7,108 @@ sdk: static
pinned: false
---

# OSS-Forge

**OSS-Forge** is an open research initiative focused on *trustworthy, secure, and transparent AI-assisted software engineering*.
We develop and publish:

- **static and dynamic analyzers** for AI-generated code
- **benchmarks and datasets** for software vulnerabilities, defects, exploits, and shellcode
- **evaluation frameworks** for correctness, robustness, and data poisoning
- **models and reproducible pipelines** for secure code generation
- **experimental tools and artifacts** from peer-reviewed scientific publications

Our mission is to build a transparent, verifiable, and secure ecosystem for integrating Large Language Models (LLMs) into software development, especially in safety-critical and security-sensitive contexts.

---

## What You Will Find Here

This organization hosts resources from multiple research projects and publications in AI security, software engineering, and code generation. Current categories include:

### Static Analyzers & Security Tools
- **DeVAIC**: Fast static analysis for detecting vulnerabilities in Python code
- **PatchitPy**: Automated patching of vulnerable Python code via pattern-based transformations
- **ACCA**: Automated correctness assessment of AI-generated code using symbolic execution

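As a rough illustration of what pattern-based detection and patching can look like, here is a minimal sketch. The rule table, the rule IDs, and the `detect` and `patch` helpers are hypothetical and are not taken from DeVAIC or PatchitPy; see the respective repositories for the real rules and interfaces.

```python
import re

# Hypothetical rules for illustration only; NOT the rules used by DeVAIC or PatchitPy.
RULES = [
    {
        "id": "unsafe-yaml-load",
        "pattern": re.compile(r"\byaml\.load\("),
        "replacement": "yaml.safe_load(",  # pattern-based fix
    },
    {
        "id": "use-of-eval",
        "pattern": re.compile(r"\beval\("),
        "replacement": None,  # detection only, no automatic fix
    },
]


def detect(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_id) pairs for every rule match."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule in RULES:
            if rule["pattern"].search(line):
                findings.append((lineno, rule["id"]))
    return findings


def patch(source: str) -> str:
    """Apply the replacement of every rule that defines one."""
    for rule in RULES:
        if rule["replacement"] is not None:
            source = rule["pattern"].sub(rule["replacement"], source)
    return source


snippet = "import yaml\nconfig = yaml.load(open('cfg.yml'))\n"
print(detect(snippet))
print(patch(snippet))
```

Keeping detection and patching in one rule table means a finding and its fix cannot drift apart, at the cost of ignoring any context beyond a single line.
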
### Datasets for Security & Software Engineering
- **PyResBugs**: 5,007 residual Python bugs with natural-language (NL) descriptions
- **Shellcode_IA32**: The largest curated dataset of IA-32 shellcode snippets (3 versions)
- **PoisonPy**: Dataset supporting targeted data-poisoning attacks
- **Human vs AI Code**: Defects, vulnerabilities, and complexity analysis at scale
- **EVIL datasets**: Exploit generation datasets (assembly & Python)

### Robustness, Poisoning & Exploit Generation
- **Offensive Code Generation Robustness**: Data augmentation framework
- **Context-Aware Exploits**: Benchmark for NL-to-exploit generation
- **AI Code Generator Poisoning**: Targeted poisoning pipelines and evaluation

All repositories include code, experimental scripts, datasets, and reproducibility materials.

---

## Research Themes

Our work spans four interconnected areas:

1. **Security of AI-generated Code**
   Vulnerability detection, automated patching, exploit generation, robustness testing.

2. **Trustworthy LLM Evaluation**
   Correctness, equivalence checking, symbolic execution, reproducible benchmarks.

3. **Software Engineering with AI**
   Defect analysis, complexity metrics, orthogonal defect classification (ODC).

4. **Adversarial ML for Code Models**
   Data poisoning, robustness stress-testing, unsafe pattern injection.

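To make "complexity metrics" concrete, the sketch below approximates cyclomatic complexity for a Python snippet with the standard `ast` module. Choosing cyclomatic complexity, the set of node types counted, and the `cyclomatic_complexity` helper are all assumptions made for illustration; the metrics actually used in the published studies are described in the corresponding repositories.

```python
import ast

# Node types counted as decision points; this set is a simplifying assumption.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate complexity as 1 plus the number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" contributes two extra paths
            complexity += len(node.values) - 1
    return complexity


sample = """
def classify(x):
    if x < 0 and x != -1:
        return "negative"
    for i in range(x):
        if i % 2:
            return "odd-found"
    return "done"
"""
print(cyclomatic_complexity(sample))
```
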
All research artifacts are peer-reviewed and associated with publications at DSN, ISSRE, ICPC, IST, EMSE, JSS, AUSE, and other venues.

---

## Publications Powered by These Repositories

A non-exhaustive list includes works presented at:

- **IEEE/IFIP DSN**
- **IEEE ISSRE**
- **IEEE/ACM ICPC**
- **Empirical Software Engineering (EMSE)**
- **Information and Software Technology (IST)**
- **Automated Software Engineering (AUSE)**
- **Journal of Systems and Software (JSS)**
- **NLP4Prog Workshop**

Full references are available inside each corresponding repository.

---

## Contributing

We encourage contributions from the research and practitioner community.

You can contribute by:

- submitting new datasets
- improving static analysis rules
- adding benchmarks or experimental scripts
- reporting issues or proposing new features

Please open discussions or pull requests inside the relevant repository.

---

## Contact

OSS-Forge is developed by a joint research team from the **University of North Carolina at Charlotte (UNCC)** and the **University of Naples Federico II**.

### Scientific Leadership
- Prof. [Domenico Cotroneo](https://webpages.charlotte.edu/dcotrone/), UNCC

### Core Research Contributors
- Dr. [Pietro Liguori](http://wpage.unina.it/pietro.liguori/), University of Naples Federico II
- [Cristina Improta](http://wpage.unina.it/cristina.improta/), University of Naples Federico II
- Ph.D. students, graduate researchers, and contributors from the DESSERT research group, University of Naples Federico II

---