AscendKernelGen commited on
Commit
796bca4
·
verified ·
1 Parent(s): 754670c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +4 -4
README.md CHANGED
@@ -19,10 +19,10 @@ The Ascend KernelGen Technical Report is published at https://arxiv.org/abs/2601
19
 
20
  Our framework, **Ascend KernelGen (AKGen)**, bridges the gap between general-purpose code generation and hardware-specific programming through a closed-loop system of data construction, training, and evaluation. Key innovations include:
21
 
22
- * [cite_start]**Ascend-CoT Dataset:** A high-quality, domain-specific dataset incorporating **Chain-of-Thought (CoT)** reasoning. It combines documentation-based reasoning, code-centric reasoning derived from real-world kernel implementations, and general reasoning chains to capture the structured logic required for low-level NPU programming.
23
- * [cite_start]**Domain-Adaptive Post-Training:** A two-stage optimization process. We first employ **Supervised Fine-Tuning (SFT)** with error-derived supervision (correcting API misuse and numerical errors). This is followed by **Reinforcement Learning (RL)** using Direct Preference Optimization (DPO), driven by execution-based correctness and performance signals.
24
- * [cite_start]**Hardware-Grounded Evaluation:** Validated using **NPUKernelBench**, a comprehensive benchmark that assesses compilation success, functional correctness, and performance (latency) on real Ascend hardware across varying complexity levels.
25
- * [cite_start]**Performance:** The model demonstrates siginificant improvement on complex Level-2 kernels compared to baselines, and effectively solving tasks where general-purpose models (like Qwen3, Llama3.1) fail completely.
26
 
27
  ## Citation
28
  @article{cao2026ascendkernelgen,
 
19
 
20
  Our framework, **Ascend KernelGen (AKGen)**, bridges the gap between general-purpose code generation and hardware-specific programming through a closed-loop system of data construction, training, and evaluation. Key innovations include:
21
 
22
+ * **Ascend-CoT Dataset:** A high-quality, domain-specific dataset incorporating **Chain-of-Thought (CoT)** reasoning. It combines documentation-based reasoning, code-centric reasoning derived from real-world kernel implementations, and general reasoning chains to capture the structured logic required for low-level NPU programming.
23
+ * **Domain-Adaptive Post-Training:** A two-stage optimization process. We first employ **Supervised Fine-Tuning (SFT)** with error-derived supervision (correcting API misuse and numerical errors). This is followed by **Reinforcement Learning (RL)** using Direct Preference Optimization (DPO), driven by execution-based correctness and performance signals.
24
+ * **Hardware-Grounded Evaluation:** Validated using **NPUKernelBench**, a comprehensive benchmark that assesses compilation success, functional correctness, and performance (latency) on real Ascend hardware across varying complexity levels.
25
+ * **Performance:** The model demonstrates siginificant improvement on complex Level-2 kernels compared to baselines, and effectively solving tasks where general-purpose models (like Qwen3, Llama3.1) fail completely.
26
 
27
  ## Citation
28
  @article{cao2026ascendkernelgen,