TITLE = """

RustMizan Leaderboard

"""

INTRODUCTION_TEXT = """
RustMizan benchmarks Large Language Models on real-world Rust memory safety vulnerabilities from CVEs and security advisories.

**See the About, Metrics, and Variants tabs for detailed information.**
"""

LLM_BENCHMARKS_TEXT = """
# RustMizan: Evaluating LLMs for Rust Vulnerability Detection

## The Task

Models are given the complete codebase of a Rust crate and must:

1. **Detect**: Determine if the code contains a vulnerability (binary classification)
2. **Classify**: Identify the CWE type(s) of the vulnerability
3. **Localize**: Pinpoint vulnerable functions and line numbers
"""

METRICS_TEXT = """
# Metrics Explained

- **CVC Accuracy (Binary Accuracy)**: Percentage of samples correctly classified as vulnerable/benign. Shows percentage and count of correct predictions out of total samples.
- **CWE F1/Precision/Recall**: Metrics for identifying correct CWE types using micro-averaged scores (summing TP/FP/FN across all samples).
- **Function F1/Precision/Recall**: Metrics for localizing vulnerable functions using micro-averaged scores.
- **Line F1/Precision/Recall**: Metrics for identifying exact vulnerable lines using micro-averaged scores.
- **Success@1-Function**: Percentage of vulnerable samples where at least one correct function was identified. Shows percentage and count out of vulnerable samples only.
- **Success@1-Line**: Percentage of vulnerable samples where at least one correct line was identified. Shows percentage and count out of vulnerable samples only.
- **Sample Counts**: Total number of samples evaluated and number of vulnerable samples.
"""

VARIANTS_TEXT = """
# Dataset Variants Explained

All mutations are **semantically preserving**. They change code syntax without altering program behavior. Each dataset variant aggregates a fixed set of mutations.
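
As a rough illustration of what "semantically preserving" means here, the sketch below shows a `for` loop and a hand-written `while` equivalent. It is not taken from the benchmark's mutation tooling; the function names `sum_below_for` and `sum_below_while` are hypothetical and chosen only for this example.

```rust
// Original form: iterate over a range with a `for` loop.
fn sum_below_for(n: u32) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += i;
    }
    total
}

// Mutated form: the same loop written as `while` with an explicit counter.
// The syntax differs, but the values visited and the result are unchanged.
fn sum_below_while(n: u32) -> u32 {
    let mut total = 0;
    let mut i = 0;
    while i < n {
        total += i;
        i += 1;
    }
    total
}
```

Both functions return the same value for any input that does not overflow `u32`, which is the property a syntax-only mutation is expected to keep.
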

## Baseline

- **Vanilla (Baseline)**: Original unmodified code samples from CVEs and security advisories

## Benign (Contamination)

These mutations target token-level memorization to detect training data leakage. They transform surface-level syntax so that memorized snippets no longer match, while preserving the underlying vulnerability.

Mutations applied:

- **remove-comments**: Remove all Rust comments
- **format-compact**: Apply compact `rustfmt` formatting
- **mizan-mut-for-to-while**: Convert `for` loops to `while` loops
- **mizan-mut-while-to-loop**: Convert `while` loops to `loop` blocks with breaks
- **mizan-mut-if-else-reorder**: Reorder if-else branches by negating conditions
- **benign-comments**: Insert neutral comments around vulnerable lines
- **benign-blocks**: Insert neutral code blocks around vulnerable lines
- **benign-rename-fn**: Rename functions to neutral names (e.g., `fn_1_abc123`)
- **benign-rename-var**: Rename variables to neutral names (e.g., `var_1_xyz789`)

## Malignant (Robustness)

These mutations inject adversarial patterns into the code to test whether an agent can resist misleading cues and still identify the vulnerability.

Mutations applied:

- **remove-comments**: Remove all Rust comments
- **malignant-comments**: Insert comments falsely suggesting code is safe
- **malignant-blocks**: Insert code blocks falsely suggesting safety
- **malignant-rename-fn**: Rename functions to names suggesting safety (e.g., `safe_fn_1`)
- **malignant-rename-var**: Rename variables to names suggesting safety (e.g., `secure_var_1`)

## Rust-Specific

Structural transformations that leverage Rust-specific syntax features.

Mutations applied:

- **remove-comments**: Remove all Rust comments
- **mizan-mut-derive-reorder**: Randomly reorder traits in `#[derive(...)]` attributes
- **mizan-mut-trait-bound-reorder**: Randomly reorder trait bounds in `where` clauses
- **mizan-mut-use-reorder**: Randomly reorder items in `use` statements
- **mizan-mut-arithmetic-identity**: Wrap integer literals in an identity operation (e.g., `N * 1`)
- **mizan-mut-explicit-where**: Add an explicit `where` clause to function signatures
- **mizan-mut-rename-lifetime**: Rename lifetime parameters of standalone functions
- **mizan-mut-extraneous-unsafe**: Add extraneous `unsafe {...}` blocks around statements inside functions
- **mizan-mut-impl-trait-to-generic**: Convert `impl Trait` bounds into generic parameters
- **mizan-mut-option-wrap**: Wrap expressions in redundant `Some(...).unwrap()` calls
- **mizan-mut-maybeuninit-wrap**: Wrap known-safe values in a `MaybeUninit`, automatically dereferencing them
- **mizan-mut-manuallydrop-wrap**: Place owned variables into `ManuallyDrop` structs and later unwrap them
- **mizan-mut-explicit-return**: Convert implicit returns to explicit `return` statements
- **mizan-mut-unreachable-panic**: Add an unreachable `panic!()` to function bodies
- **mizan-mut-repeated-shadowing**: Add redundant repeated shadowing of `let` bindings within a scope
"""

CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
CITATION_BUTTON_TEXT = r""" """