We have compared our model with the following models:

- GPT 4o mini

On the following parameters:

- **Compilation(%)** - Percentage of generated contracts that compile successfully without modification.
- **OpenZeppelin Compliance(%)** - Adherence to OpenZeppelin library usage and standards.
- **Gas Efficiency(%)** - Degree of gas optimization based on Slither’s suggestions.
- **Security(%)** - Percentage of code free from common vulnerabilities detected by Slither.
- **Average Lines of Code** - Average number of non-empty lines, comments included, in generated contracts, indicating verbosity or conciseness.
- **Correctness (OpenAI Evaluation)** - GPT-4o Mini-assessed alignment of generated code with prompt using a structured correctness rubric.
- **Correctness (Human Evaluation)** - Expert-reviewed rating of how well the generated contract fulfills the original prompt and intent.

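To make the aggregation concrete, here is a minimal sketch of how per-contract results could be rolled up into the percentage metrics above. This is illustrative only: the `ContractResult` record, its fields, and the `uses_openzeppelin` heuristic are hypothetical and not part of this repository; real compliance and security checks would come from the actual compiler output and Slither reports.

```python
# Hypothetical aggregation of the evaluation metrics described above.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ContractResult:
    source: str            # generated Solidity source code
    compiled: bool         # True if the compiler accepted it unmodified
    slither_findings: int  # vulnerabilities reported by Slither

def count_loc(source: str) -> int:
    # Non-empty lines, comments included ("Average Lines of Code").
    return sum(1 for line in source.splitlines() if line.strip())

def uses_openzeppelin(source: str) -> bool:
    # Naive proxy for OpenZeppelin compliance: imports an OZ module.
    return "@openzeppelin/" in source

def summarize(results: List[ContractResult]) -> Dict[str, float]:
    n = len(results)
    return {
        "Compilation(%)": 100.0 * sum(r.compiled for r in results) / n,
        "OpenZeppelin Compliance(%)":
            100.0 * sum(uses_openzeppelin(r.source) for r in results) / n,
        "Security(%)": 100.0 * sum(r.slither_findings == 0 for r in results) / n,
        "Average Lines of Code": sum(count_loc(r.source) for r in results) / n,
    }

# Toy run on two hypothetical generations:
ok = ContractResult(
    'import "@openzeppelin/contracts/token/ERC20/ERC20.sol";\n\n'
    "contract T is ERC20 {}",
    compiled=True, slither_findings=0)
bad = ContractResult("contract U {\n  // placeholder\n}",
                     compiled=False, slither_findings=2)
metrics = summarize([ok, bad])
```

In the toy run, each metric comes out to 50% (one of the two contracts passes each check) with an average of 2.5 non-empty lines per contract.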
## Benchmark

Below is a figure summarizing the performance of each model across the four evaluation metrics.