Danny Liu committed
Commit 1ac3f2c · Parent(s): b627e0e
fix some format issue
src/about.py +4 -1
@@ -33,6 +33,7 @@ INTRODUCTION_TEXT = """
 CONCLUSION_TEXT = """
 Evaluations on the VerilogEval Human benchmark reveal a strict empirical ceiling, with frontier models plateauing at a 90.8% initial pass rate.
 The solvability taxonomy exposes that L3U (Unsolvable) errors dominate across all model families, revealing persistent knowledge gaps that inference-time scaling cannot address.
+Our analysis exposes a striking surface convergence gap: optimization drastically reduces syntax errors but concurrently increases functional testbench failures.
 Ultimately, register transfer level (RTL) coding capacity relies heavily upon pretraining knowledge.
 Integrating reward and policy modelling (i.e., GRPO) during the post-training phase amplifies existing competencies by teaching models to compile, while L3S errors (addressable via best-of-N sampling) coexist with L3U errors (requiring model improvement).
 """
@@ -45,6 +46,8 @@ Our four-level error taxonomy evaluates LLM-generated RTL code based on successi
 - **L2 Semantic**: The source string parses into a valid AST but violates at least one static semantic constraint (e.g., detected during elaboration, linting, or synthesis).
 - **L3S Functional-Solvable**: The synthesized model fails to meet the design specification, but the model has demonstrated the ability to solve the problem in at least one other rollout (addressable via inference-time scaling / best-of-N sampling).
 - **L3U Functional-Unsolvable**: The synthesized model fails to meet the design specification, and the model cannot solve the problem in any rollout (requires fundamental model improvement).
+
+## Benchmark
 """
 
 EVALUATION_QUEUE_TEXT = """
@@ -54,7 +57,7 @@ CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
 CITATION_BUTTON_TEXT = r"""@article{liu2026rtlerror,
 title={How Large Language Models Fail and Generalize to Learn RTL Coding for Digital Circuit Design},
 author={Liu, Guan-Ting and Yang, Chao-Han Huck and Deng, Chenhui and Yu, Zhongzhi and Khailany, Brucek and Wang, Yu-Chiang Frank},
-journal={Under Review
+journal={Under Review},
 year={2026}
 }
 """
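Review note: the L3S / L3U split described in the taxonomy text depends only on whether any rollout for the same problem passes. A minimal sketch of that rule follows, assuming each rollout is recorded as a `(problem_id, passed)` pair. The function name `label_functional_failures`, the `"pass"` label, and the problem IDs `mux2` and `fsm3` are hypothetical and are not taken from this repository.

```python
from collections import defaultdict


def label_functional_failures(rollouts):
    """Label each rollout as "pass", "L3S", or "L3U".

    `rollouts` is an iterable of (problem_id, passed) pairs. This record
    format is an assumption made for illustration only.
    """
    # Group the pass/fail outcomes by problem, preserving rollout order.
    by_problem = defaultdict(list)
    for problem_id, passed in rollouts:
        by_problem[problem_id].append(bool(passed))

    labels = {}
    for problem_id, outcomes in by_problem.items():
        # A problem counts as solvable if at least one rollout passed.
        solvable = any(outcomes)
        labels[problem_id] = [
            "pass" if ok else ("L3S" if solvable else "L3U")
            for ok in outcomes
        ]
    return labels


# Hypothetical data: two problems with two rollouts each.
example = [
    ("mux2", True),
    ("mux2", False),
    ("fsm3", False),
    ("fsm3", False),
]
result = label_functional_failures(example)
```

Under this reading, a failed rollout is L3S only because a sibling rollout succeeded, so the label is a property of the whole rollout set and can change as more samples are drawn.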