TITLE = """
RustMizan Leaderboard
"""
INTRODUCTION_TEXT = """
RustMizan benchmarks large language models (LLMs) on real-world Rust memory-safety vulnerabilities drawn from CVEs and security advisories.
**See the About, Metrics, and Variants tabs for detailed information.**
"""
LLM_BENCHMARKS_TEXT = """
# RustMizan: Evaluating LLMs for Rust Vulnerability Detection
## The Task
Models are given the complete codebase of a Rust crate and must:
1. **Detect**: Determine whether the code contains a vulnerability (binary classification)
2. **Classify**: Identify the CWE type(s) of the vulnerability
3. **Localize**: Pinpoint vulnerable functions and line numbers
"""
METRICS_TEXT = """
# Metrics Explained
- **CVC Accuracy (Binary Accuracy)**: Percentage of samples correctly classified as vulnerable or benign, reported with the count of correct predictions out of all samples.
- **CWE F1/Precision/Recall**: Micro-averaged scores for identifying the correct CWE types. True positives, false positives, and false negatives are summed across all samples before precision, recall, and F1 are computed.
- **Function F1/Precision/Recall**: Micro-averaged scores for localizing vulnerable functions.
- **Line F1/Precision/Recall**: Micro-averaged scores for identifying the exact vulnerable lines.
- **Success@1-Function**: Percentage of vulnerable samples for which at least one correct function was identified, reported with the count out of vulnerable samples only.
- **Success@1-Line**: Percentage of vulnerable samples for which at least one correct line was identified, reported with the count out of vulnerable samples only.
- **Sample Counts**: Total number of samples evaluated and number of vulnerable samples.
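
As a rough illustration of how micro-averaged scores and Success@1 can be computed, here is a minimal Python sketch. The function names, the representation of predictions as sets of labels, and the convention that a vulnerable sample is one with a non-empty ground-truth set are assumptions made for this example. It is not the leaderboard's actual evaluation code.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not the leaderboard's actual evaluation code.

def micro_prf(samples):
    # samples: list of (predicted, gold) pairs, each a set of labels
    # (CWE IDs, function names, or line numbers).
    tp = fp = fn = 0
    for predicted, gold in samples:
        tp += len(predicted & gold)   # correctly predicted labels
        fp += len(predicted - gold)   # predicted but not in the ground truth
        fn += len(gold - predicted)   # ground-truth labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def success_at_1(samples):
    # Fraction of vulnerable samples (non-empty gold set) with at least one hit.
    vulnerable = [(p, g) for p, g in samples if g]
    if not vulnerable:
        return 0.0
    hits = sum(1 for p, g in vulnerable if p & g)
    return hits / len(vulnerable)


function_samples = [
    ({"parse", "read_buf"}, {"read_buf"}),  # one hit, one false positive
    ({"init"}, {"drop_inner"}),             # complete miss
    (set(), set()),                         # benign sample, nothing predicted
]
precision, recall, f1 = micro_prf(function_samples)
rate = success_at_1(function_samples)
```

For these three toy samples the totals are TP = 1, FP = 2, and FN = 1, which gives precision 1/3, recall 1/2, and F1 0.4. Success@1 is 0.5, because one of the two vulnerable samples has at least one correct function.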
"""
VARIANTS_TEXT = """
# Dataset Variants Explained
All mutations are **semantics-preserving**: they change code syntax without altering program behavior. Each dataset variant applies a fixed set of mutations.
## Baseline
- **Vanilla (Baseline)**: Original unmodified code samples from CVEs and security advisories
## Benign (Contamination)
These mutations target token-level memorization to detect training data leakage. They transform surface-level syntax so that memorized snippets no longer match, while preserving the underlying vulnerability.
Mutations applied:
- **remove-comments**: Remove all Rust comments
- **format-compact**: Apply compact `rustfmt` formatting
- **mizan-mut-for-to-while**: Convert `for` loops to `while` loops
- **mizan-mut-while-to-loop**: Convert `while` loops to `loop` blocks with breaks
- **mizan-mut-if-else-reorder**: Reorder if-else branches by negating conditions
- **benign-comments**: Insert neutral comments around vulnerable lines
- **benign-blocks**: Insert neutral code blocks around vulnerable lines
- **benign-rename-fn**: Rename functions to neutral names (e.g., `fn_1_abc123`)
- **benign-rename-var**: Rename variables to neutral names (e.g., `var_1_xyz789`)
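
To make the renaming idea concrete, the following Python sketch applies a text-level rename to a Rust snippet. It is only an approximation under stated assumptions: the helper names and the hash-based naming scheme are hypothetical, modeled on the `var_1_xyz789` example above. A real mutation would presumably operate on the syntax tree and respect scoping, which this sketch does not.

```python
# Illustrative sketch only: the real mutation tool is not shown in this document.
import hashlib
import re


def neutral_name(original, index):
    # Hypothetical naming scheme modeled on the `var_1_xyz789` example:
    # a counter plus a short hash of the original identifier.
    digest = hashlib.sha256(original.encode("utf-8")).hexdigest()[:6]
    return "var_{}_{}".format(index, digest)


def rename_variable(source, old, new):
    # Replace whole-identifier occurrences only, so that renaming
    # `count` leaves `counter` untouched.
    pattern = re.compile(
        "(?<![A-Za-z0-9_])" + re.escape(old) + "(?![A-Za-z0-9_])"
    )
    return pattern.sub(new, source)


rust_line = "let count = items.len(); let counter = count + 1;"
new_name = neutral_name("count", 1)
renamed = rename_variable(rust_line, "count", new_name)
```

Both occurrences of `count` are replaced by the same neutral name, while `counter` and the method call `items.len()` are left alone, so the snippet behaves as before but no longer matches the original token sequence.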
## Malignant (Robustness)
These mutations inject adversarial patterns into the code to test whether a model can resist misleading cues and still identify the vulnerability.
Mutations applied:
- **remove-comments**: Remove all Rust comments
- **malignant-comments**: Insert comments falsely suggesting code is safe
- **malignant-blocks**: Insert code blocks falsely suggesting safety
- **malignant-rename-fn**: Rename functions to names suggesting safety (e.g., `safe_fn_1`)
- **malignant-rename-var**: Rename variables to names suggesting safety (e.g., `secure_var_1`)
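
The sketch below illustrates one way a misleading comment could be inserted above a vulnerable line. The function name, the comment text, and the idea of returning shifted ground-truth line numbers are assumptions made for this example. The point it illustrates is that inserting lines moves the vulnerable line, so line-level labels would need to be adjusted accordingly.

```python
# Illustrative sketch only: comment text and function name are assumptions.

def insert_misleading_comment(lines, vulnerable_lines, comment):
    # lines: source as a list of lines; vulnerable_lines: 1-based line numbers.
    # Returns the mutated lines and the shifted ground-truth line numbers.
    targets = set(vulnerable_lines)
    mutated = []
    shifted = []
    for number, line in enumerate(lines, start=1):
        if number in targets:
            # Match the indentation of the line being annotated.
            indent = line[: len(line) - len(line.lstrip())]
            mutated.append(indent + comment)
            # The vulnerable line lands right after the inserted comment.
            shifted.append(len(mutated) + 1)
        mutated.append(line)
    return mutated, shifted


rust_lines = [
    "fn get(buf: &[u8], i: usize) -> u8 {",
    "    unsafe { *buf.get_unchecked(i) }",
    "}",
]
mutated, shifted = insert_misleading_comment(
    rust_lines, [2], "// SAFETY: index is always bounds-checked by the caller"
)
```

Here the unchecked access originally on line 2 ends up on line 3, directly below a comment that falsely claims the index is validated.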
## Rust-Specific
Structural transformations that leverage Rust-specific syntax features.
Mutations applied:
- **remove-comments**: Remove all Rust comments
- **mizan-mut-derive-reorder**: Randomly reorder traits in `#[derive(...)]` attributes
- **mizan-mut-trait-bound-reorder**: Randomly reorder trait bounds in `where` clauses
- **mizan-mut-use-reorder**: Randomly reorder items in `use` statements
- **mizan-mut-arithmetic-identity**: Wrap integer literals in an arithmetic identity (e.g., `N * 1`)
- **mizan-mut-explicit-where**: Add an explicit `where` clause to function signatures
- **mizan-mut-rename-lifetime**: Rename lifetime parameters of standalone functions
- **mizan-mut-extraneous-unsafe**: Add extraneous `unsafe {...}` blocks around statements inside functions
- **mizan-mut-impl-trait-to-generic**: Convert `impl Trait` bounds into generic parameters
- **mizan-mut-option-wrap**: Wrap expressions in redundant `Some(...).unwrap()` calls
- **mizan-mut-maybeuninit-wrap**: Wrap known-safe values in `MaybeUninit`, automatically dereferencing them
- **mizan-mut-manuallydrop-wrap**: Place owned variables in `ManuallyDrop` structs and later unwrap them
- **mizan-mut-explicit-return**: Convert implicit returns to explicit `return` statements
- **mizan-mut-unreachable-panic**: Add an unreachable `panic!()` to function bodies
- **mizan-mut-repeated-shadowing**: Add redundant repeated shadowing of `let` bindings within a scope
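
As an example of how small these structural changes are, the Python sketch below reorders the traits in a single-line `#[derive(...)]` attribute. It is a simplified stand-in for the derive-reorder idea, not the actual `mizan-mut-derive-reorder` implementation. The function name and the `seed` parameter are assumptions, and multi-line attributes are not handled.

```python
# Illustrative sketch only: not the actual mizan-mut-derive-reorder implementation.
import random


def reorder_derive(line, seed):
    # Shuffle the traits inside a single-line `#[derive(...)]` attribute.
    prefix = "#[derive("
    start = line.find(prefix)
    if start == -1:
        return line  # no derive attribute on this line
    end = line.find(")]", start)
    if end == -1:
        return line  # attribute continues on another line; leave it alone
    inner = line[start + len(prefix):end]
    traits = [t.strip() for t in inner.split(",") if t.strip()]
    # A seeded generator keeps the mutation reproducible.
    random.Random(seed).shuffle(traits)
    return line[:start] + prefix + ", ".join(traits) + line[end:]


original = "#[derive(Debug, Clone, PartialEq, Eq)]"
mutated = reorder_derive(original, seed=7)
```

The mutated attribute lists the same four traits, possibly in a different order, so the token sequence can change while the set of derived implementations stays the same.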
"""
CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
CITATION_BUTTON_TEXT = r"""
"""