RustMizan Leaderboard

TITLE = """<h1 align="center" id="space-title">RustMizan Leaderboard</h1>"""

INTRODUCTION_TEXT = """
RustMizan benchmarks large language models (LLMs) on real-world Rust memory-safety vulnerabilities drawn from CVEs and security advisories.

**See the About, Metrics, and Variants tabs for detailed information.**
"""

LLM_BENCHMARKS_TEXT = """
# RustMizan: Evaluating LLMs for Rust Vulnerability Detection

## The Task

Models are given the complete codebase of a Rust crate and must:
1. **Detect**: Determine whether the code contains a vulnerability (binary classification)
2. **Classify**: Identify the CWE type(s) of the vulnerability
3. **Localize**: Pinpoint the vulnerable functions and line numbers
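
As an illustration of the three outputs, one prediction could be represented as in the sketch below. This is a hypothetical structure: the `Prediction` class, its field names, and the example values are assumptions made for illustration, not the benchmark's actual submission format.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    # Hypothetical record; field names are illustrative, not an official schema.
    is_vulnerable: bool                                       # 1. Detect
    cwe_ids: set[str] = field(default_factory=set)            # 2. Classify
    functions: set[str] = field(default_factory=set)          # 3. Localize (functions)
    lines: set[tuple[str, int]] = field(default_factory=set)  # 3. Localize (file, line)


# Made-up example values for a crate predicted to be vulnerable.
example = Prediction(
    is_vulnerable=True,
    cwe_ids={"CWE-416"},
    functions={"Buffer::release"},
    lines={("src/buffer.rs", 42)},
)

# A benign prediction carries no CWE types or locations.
benign = Prediction(is_vulnerable=False)
```

Storing labels and locations as sets makes it straightforward to compare a prediction against the ground truth with set intersection and difference.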

"""

METRICS_TEXT = """
# Metrics Explained

- **CVC Accuracy (Binary Accuracy)**: Percentage of samples correctly classified as vulnerable or benign. Reported as a percentage together with the count of correct predictions out of all samples.
- **CWE F1/Precision/Recall**: Micro-averaged scores for identifying the correct CWE types. True positives, false positives, and false negatives (TP/FP/FN) are summed across all samples before precision, recall, and F1 are computed.
- **Function F1/Precision/Recall**: Micro-averaged scores for localizing the vulnerable functions.
- **Line F1/Precision/Recall**: Micro-averaged scores for identifying the exact vulnerable lines.
- **Success@1-Function**: Percentage of vulnerable samples where at least one correct function was identified. Reported as a percentage and a count, computed over vulnerable samples only.
- **Success@1-Line**: Percentage of vulnerable samples where at least one correct line was identified. Reported as a percentage and a count, computed over vulnerable samples only.
- **Sample Counts**: Total number of samples evaluated and number of vulnerable samples.
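
The micro-averaged scores and Success@1 described above can be sketched as follows. This is a minimal illustration, not the leaderboard's evaluation code: the function names and toy data are hypothetical. Returning 0.0 when a denominator is zero, and treating a sample as vulnerable when its expected set is non-empty, are both assumptions.

```python
def micro_prf(predicted, expected):
    # predicted / expected: one set of labels per sample, in the same order.
    tp = fp = fn = 0
    for pred, gold in zip(predicted, expected):
        tp += len(pred & gold)   # predicted and correct
        fp += len(pred - gold)   # predicted but wrong
        fn += len(gold - pred)   # correct but missed
    # Assumption: a score with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def success_at_1(predicted, expected):
    # Assumption: a sample counts as vulnerable when its expected set is non-empty.
    vulnerable = [(p, g) for p, g in zip(predicted, expected) if g]
    hits = sum(1 for p, g in vulnerable if p & g)
    return hits, len(vulnerable)


# Toy function-level data: three samples, the last one benign.
predicted = [{"parse"}, {"read_buf", "decode"}, set()]
expected = [{"parse"}, {"decode", "resize"}, set()]

precision, recall, f1 = micro_prf(predicted, expected)  # TP=2, FP=1, FN=1
hits, total = success_at_1(predicted, expected)         # 2 hits out of 2 vulnerable samples
```

Because the counts are pooled before dividing, samples with many labels weigh more than samples with few. The same two functions apply unchanged to CWE labels, function names, or line identifiers.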
"""

VARIANTS_TEXT = """
# Dataset Variants Explained

All mutations are **semantics-preserving**: they change the code's syntax without altering program behavior. Each dataset variant combines a fixed set of mutations.

## Baseline
- **Vanilla (Baseline)**: Original unmodified code samples from CVEs and security advisories

## Benign (Contamination)
These mutations target token-level memorization to detect training data leakage. They transform surface-level syntax so that memorized snippets no longer match, while preserving the underlying vulnerability.

Mutations applied:
- **remove-comments**: Remove all Rust comments
- **format-compact**: Apply compact `rustfmt` formatting
- **mizan-mut-for-to-while**: Convert `for` loops to `while` loops
- **mizan-mut-while-to-loop**: Convert `while` loops to `loop` blocks with breaks
- **mizan-mut-if-else-reorder**: Reorder if-else branches by negating conditions
- **benign-comments**: Insert neutral comments around vulnerable lines
- **benign-blocks**: Insert neutral code blocks around vulnerable lines
- **benign-rename-fn**: Rename functions to neutral names (e.g., `fn_1_abc123`)
- **benign-rename-var**: Rename variables to neutral names (e.g., `var_1_xyz789`)
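
To see why control-flow rewrites of this kind preserve behavior, the sketch below mirrors three of them in Python rather than Rust. It is an analogy only: the function names are hypothetical, Python's `while True` stands in for Rust's `loop`, and the actual mutation tooling is not shown.

```python
def total_for(values):
    # Original form: a for loop with an if-else branch.
    total = 0
    for v in values:
        if v >= 0:
            total += v
        else:
            total -= v
    return total


def total_while(values):
    # Analogue of for-to-while: explicit index and loop condition.
    total = 0
    i = 0
    while i < len(values):
        v = values[i]
        if v >= 0:
            total += v
        else:
            total -= v
        i += 1
    return total


def total_loop(values):
    # Analogue of while-to-loop plus if-else-reorder: an unconditional
    # loop with a break, and swapped branches under a negated condition.
    total = 0
    i = 0
    while True:
        if not (i < len(values)):
            break
        v = values[i]
        if not (v >= 0):
            total -= v
        else:
            total += v
        i += 1
    return total


samples = [3, -4, 0, 7, -1]
results = (total_for(samples), total_while(samples), total_loop(samples))
```

All three functions compute the same sum of absolute values, even though their token sequences differ, which is the property the benign variant relies on.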

## Malignant (Robustness)
These mutations inject adversarial patterns into the code to test whether a model can resist misleading cues and still identify the vulnerability.

Mutations applied:
- **remove-comments**: Remove all Rust comments
- **malignant-comments**: Insert comments falsely suggesting code is safe
- **malignant-blocks**: Insert code blocks falsely suggesting safety
- **malignant-rename-fn**: Rename functions to names suggesting safety (e.g., `safe_fn_1`)
- **malignant-rename-var**: Rename variables to names suggesting safety (e.g., `secure_var_1`)

## Rust-Specific
Structural transformations that leverage Rust-specific syntax features.

Mutations applied:
- **remove-comments**: Remove all Rust comments
- **mizan-mut-derive-reorder**: Randomly reorder traits in `#[derive(...)]` attributes
- **mizan-mut-trait-bound-reorder**: Randomly reorder trait bounds in `where` clauses
- **mizan-mut-use-reorder**: Randomly reorder items in `use` statements
- **mizan-mut-arithmetic-identity**: Wrap integer literals in an arithmetic identity (e.g., `N * 1`)
- **mizan-mut-explicit-where**: Add an explicit `where` clause to function signatures
- **mizan-mut-rename-lifetime**: Rename lifetime parameters in standalone functions
- **mizan-mut-extraneous-unsafe**: Add extraneous `unsafe {...}` blocks around statements inside functions
- **mizan-mut-impl-trait-to-generic**: Convert `impl Trait` bounds into generic parameters
- **mizan-mut-option-wrap**: Wrap expressions in redundant `Some(...).unwrap()` calls
- **mizan-mut-maybeuninit-wrap**: Wrap known-safe values in a `MaybeUninit<T>` and automatically dereference them
- **mizan-mut-manuallydrop-wrap**: Place owned variables in `ManuallyDrop` wrappers and later unwrap them
- **mizan-mut-explicit-return**: Convert implicit (tail-expression) returns to explicit `return` statements
- **mizan-mut-unreachable-panic**: Add an unreachable `panic!()` to function bodies
- **mizan-mut-repeated-shadowing**: Add multiple redundant shadowing rebindings of `let` variables within a scope
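
The variant definitions above can be summarized as a simple mapping. The mutation names are taken from this page, but the dictionary name and the variant keys are hypothetical, and the sketch says nothing about how or in which order the mutations are actually applied.

```python
# Hypothetical summary of the variants; only the mutation names come from this page.
VARIANT_MUTATIONS = {
    "vanilla": [],
    "benign": [
        "remove-comments",
        "format-compact",
        "mizan-mut-for-to-while",
        "mizan-mut-while-to-loop",
        "mizan-mut-if-else-reorder",
        "benign-comments",
        "benign-blocks",
        "benign-rename-fn",
        "benign-rename-var",
    ],
    "malignant": [
        "remove-comments",
        "malignant-comments",
        "malignant-blocks",
        "malignant-rename-fn",
        "malignant-rename-var",
    ],
    "rust-specific": [
        "remove-comments",
        "mizan-mut-derive-reorder",
        "mizan-mut-trait-bound-reorder",
        "mizan-mut-use-reorder",
        "mizan-mut-arithmetic-identity",
        "mizan-mut-explicit-where",
        "mizan-mut-rename-lifetime",
        "mizan-mut-extraneous-unsafe",
        "mizan-mut-impl-trait-to-generic",
        "mizan-mut-option-wrap",
        "mizan-mut-maybeuninit-wrap",
        "mizan-mut-manuallydrop-wrap",
        "mizan-mut-explicit-return",
        "mizan-mut-unreachable-panic",
        "mizan-mut-repeated-shadowing",
    ],
}

# Mutations shared by every mutated (non-baseline) variant.
mutated = [set(names) for variant, names in VARIANT_MUTATIONS.items() if variant != "vanilla"]
shared = set.intersection(*mutated)
```

In this summary, `remove-comments` is the only mutation common to all three mutated variants.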
"""

CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
CITATION_BUTTON_TEXT = r"""
"""