FPEval: A Holistic Evaluation Framework for Functional Programming

Built upon FPBench, the framework includes 721 programming tasks across three difficulty levels. It focuses on three mainstream FP languages (Haskell, OCaml, and Scala) and utilizes Java as an imperative baseline for comparative analysis.


📊 Dataset Structure

The dataset is organized into two primary components:

1. LeetCodeProblem/

This directory contains the core problem specifications and testing infrastructure.

  • Data Collection Period: 2021 – March 2025.
  • meta.json: A comprehensive file for each problem containing:
    • Problem descriptions and I/O constraints.
    • Public test cases.
    • Private Test Cases: High-coverage inputs/outputs synthetically generated using GPT-4o.
  • Test Templates: Pre-configured templates to ensure seamless evaluation:
    • Main.hs (Haskell)
    • main.ml (OCaml)
    • MySuite.scala (Scala)
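A minimal sketch of how one problem directory might be inspected, assuming the file names listed above. The schema of meta.json is not documented here, so the loader returns the parsed JSON as-is and makes no claim about its keys; the helper names `load_meta` and `missing_templates` are hypothetical.

```python
import json
from pathlib import Path

# Template file names as listed in the dataset description.
TEMPLATES = {"haskell": "Main.hs", "ocaml": "main.ml", "scala": "MySuite.scala"}


def load_meta(problem_dir):
    """Parse meta.json for one problem; no particular schema is assumed."""
    with open(Path(problem_dir) / "meta.json", encoding="utf-8") as fh:
        return json.load(fh)


def missing_templates(problem_dir):
    """Return the languages whose test template is absent from problem_dir."""
    root = Path(problem_dir)
    return sorted(
        lang for lang, name in TEMPLATES.items() if not (root / name).is_file()
    )
```

Calling `sorted(load_meta(path))` on a problem directory is a quick way to see which top-level fields a given meta.json actually contains before relying on any of them.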

2. LLMsGeneratedCode/

This directory stores outputs from GPT-3.5-turbo, GPT-4o, and GPT-5, categorized into three distinct refinement stages:

| Stage | Description |
| --- | --- |
| CodeGenerated | Initial zero-shot code outputs. |
| BaselineRepair | Code refined from the previous attempt using compiler feedback. |
| InstructionRepair | Code optimized using static analysis feedback and hand-crafted idiomatic instructions. |

🛠 Static Analysis & Repair Tools

FPEval utilizes industry-standard tools to provide feedback during the InstructionRepair stage, ensuring the generated code adheres to idiomatic patterns:

  • Haskell: HLint (style/idioms) and GHC warnings (correctness).
  • OCaml: dune (build validation) and ocamlformat (formatting).
  • Scala: Scalastyle (functional style enforcement).
  • Java: Checkstyle and PMD (standard imperative quality checks).
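The tool list above can be kept as a small lookup table when scripting around the dataset. The sketch below only records the tool names and roles stated in the list; it does not invoke any tool, and the names `ANALYZERS` and `feedback_sources` are hypothetical.

```python
# Tool names and roles copied from the list above; nothing is executed here.
ANALYZERS = {
    "haskell": [("HLint", "style/idioms"), ("GHC warnings", "correctness")],
    "ocaml": [("dune", "build validation"), ("ocamlformat", "formatting")],
    "scala": [("Scalastyle", "functional style enforcement")],
    "java": [
        ("Checkstyle", "standard imperative quality checks"),
        ("PMD", "standard imperative quality checks"),
    ],
}


def feedback_sources(language):
    """Return a one-line summary of the tools that supply feedback for a language."""
    tools = ANALYZERS[language.lower()]
    listed = ", ".join(f"{name} ({role})" for name, role in tools)
    return f"{language}: {listed}"
```

Keeping the mapping in one place makes it easy to see that Java, the imperative baseline, is checked by its own tools rather than by the FP-specific ones.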

📂 Directory Layout

.
├── LeetCodeProblem/
│   ├── [Problem_NAME]/
│   │   ├── meta.json             # Core problem data & private tests
│   │   ├── Main.hs               # Haskell test template
│   │   ├── main.ml               # OCaml test template
│   │   └── MySuite.scala         # Scala test template
├── LLMsGeneratedCode/
│   ├── [Model_Name]/             # gpt-3.5-turbo, gpt-4o, gpt-5
│   │   ├── CodeGenerated/        # Initial raw outputs
│   │   ├── BaselineRepair/       # Post-compiler feedback attempts
│   │   └── InstructionRepair/    # Post-static analysis refinement
└── README.md
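A short sketch of walking this layout from a local copy of the dataset, assuming the directory names shown in the tree. What sits inside each stage directory is not specified here, so the code stops at the stage level; `stage_dirs` and `problem_dirs` are hypothetical helper names.

```python
from pathlib import Path

# Stage directory names as shown in the layout, in refinement order.
STAGES = ("CodeGenerated", "BaselineRepair", "InstructionRepair")


def stage_dirs(root):
    """Yield (model, stage, path) for each stage directory that exists under root."""
    base = Path(root) / "LLMsGeneratedCode"
    if not base.is_dir():
        return
    for model_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        for stage in STAGES:
            path = model_dir / stage
            if path.is_dir():
                yield model_dir.name, stage, path


def problem_dirs(root):
    """Return the problem directories under LeetCodeProblem/, sorted by name."""
    base = Path(root) / "LeetCodeProblem"
    if not base.is_dir():
        return []
    return sorted(p for p in base.iterdir() if p.is_dir())
```

Iterating `stage_dirs(".")` from the dataset root lists which model and stage combinations are present, which helps when a download is partial.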
