DataRepo committed on
Commit 68ef960 · verified · 1 Parent(s): 47ccf08

Update README.md

Files changed (1)
  1. README.md +59 -0
README.md CHANGED
@@ -8,3 +8,62 @@ pinned: false
  ---
 
  Edit this `README.md` markdown file to author your organization card.
+ # FPEval: A Holistic Evaluation Framework for Functional Programming
+ Built on **FPBench**, the framework comprises **721 programming tasks** across three difficulty levels. It targets three mainstream functional programming (FP) languages (**Haskell, OCaml, and Scala**) and uses **Java** as an imperative baseline for comparison.
+
+ ---
+
+ ## Dataset Structure
+
+ The dataset is organized into two primary components:
+
+ ### 1. `LeetCodeProblem/`
+ This directory contains the core problem specifications and testing infrastructure.
+ * **Data Collection Period:** 2021 – March 2025.
+ * **`meta.json`**: One file per problem, containing:
+   * Problem descriptions and I/O constraints.
+   * Public test cases.
+   * **Private Test Cases**: High-coverage inputs/outputs synthetically generated using **GPT-4o**.
+ * **Test Templates**: Pre-configured test harness files, one per FP language:
+   * `Main.hs` (Haskell)
+   * `main.ml` (OCaml)
+   * `MySuite.scala` (Scala)
+
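As a rough illustration of how the per-problem layout described above could be inspected, the Python sketch below walks a local copy of `LeetCodeProblem/`, loads each `meta.json`, and records which of the three listed templates are present. The helper names `summarize_problem` and `iter_problems` are hypothetical, and because the `meta.json` schema is not documented here, the sketch only reports top-level keys rather than assuming any field names.

```python
import json
from pathlib import Path

# Template file names exactly as listed in this README.
TEMPLATE_FILES = ("Main.hs", "main.ml", "MySuite.scala")


def summarize_problem(problem_dir):
    """Summarize one problem directory (hypothetical helper, not part of FPEval)."""
    problem_dir = Path(problem_dir)
    with (problem_dir / "meta.json").open(encoding="utf-8") as handle:
        meta = json.load(handle)
    # The meta.json schema is an unknown here, so only top-level keys are reported.
    keys = sorted(meta) if isinstance(meta, dict) else []
    templates = [name for name in TEMPLATE_FILES if (problem_dir / name).is_file()]
    return {"name": problem_dir.name, "meta_keys": keys, "templates": templates}


def iter_problems(root):
    """Yield a summary for every subdirectory of `root` that contains a meta.json."""
    for child in sorted(Path(root).iterdir()):
        if child.is_dir() and (child / "meta.json").is_file():
            yield summarize_problem(child)
```

Calling `list(iter_problems("LeetCodeProblem"))` on a local checkout would give one dictionary per problem, which is a quick way to spot problems with a missing template before running any evaluation.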
32
+ ### 2. `LLMsGeneratedCode/`
33
+ This directory stores outputs from **GPT-3.5-turbo**, **GPT-4o**, and **GPT-5**, categorized into three distinct refinement stages:
34
+
35
+ | Stage | Description |
36
+ | :--- | :--- |
37
+ | **`CodeGenerated`** | Initial zero-shot code outputs. |
38
+ | **`BaselineRepair`** | Code refined based on old code. |
39
+ | **`InstructionRepair`** | Code optimized using static analysis feedback and hand-crafted idiomatic instructions. |
40
+
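To compare how much output each model has at each stage, a small script can count the entries under every stage directory. The sketch below is only an assumption-laden example: `stage_counts` is a hypothetical helper, and since the contents of the stage directories are not described here, it counts directory entries without interpreting them.

```python
from pathlib import Path

# Stage directory names as given in the table above.
STAGES = ("CodeGenerated", "BaselineRepair", "InstructionRepair")


def stage_counts(root):
    """Map each model directory under `root` to its entry count per stage.

    Hypothetical helper: a missing stage directory is reported as None
    so that it can be told apart from an empty one (0).
    """
    counts = {}
    for model_dir in sorted(Path(root).iterdir()):
        if not model_dir.is_dir():
            continue
        per_stage = {}
        for stage in STAGES:
            stage_dir = model_dir / stage
            if stage_dir.is_dir():
                per_stage[stage] = sum(1 for _ in stage_dir.iterdir())
            else:
                per_stage[stage] = None
        counts[model_dir.name] = per_stage
    return counts
```

Running `stage_counts("LLMsGeneratedCode")` on a local checkout returns a nested dictionary keyed by model directory name, which makes gaps between stages easy to see.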
41
+ ---
42
+
43
+ ## Static Analysis & Repair Tools
44
+
45
+ FPEval utilizes industry-standard tools to provide feedback during the `InstructionRepair` stage, ensuring the generated code adheres to idiomatic patterns:
46
+
47
+ * **Haskell**: `HLint` (style/idioms) and `GHC` warnings (correctness).
48
+ * **OCaml**: `dune` (build validation) and `ocamlformat` (formatting).
49
+ * **Scala**: `Scalastyle` (functional style enforcement).
50
+ * **Java**: `Checkstyle` and `PMD` (standard imperative quality checks).
51
+
+ ---
+
+ ## 📂 Directory Layout
+
+ ```text
+ .
+ ├── LeetCodeProblem/
+ │   ├── [Problem_Name]/
+ │   │   ├── meta.json            # Core problem data & private tests
+ │   │   ├── Main.hs              # Haskell test template
+ │   │   ├── main.ml              # OCaml test template
+ │   │   └── MySuite.scala        # Scala test template
+ ├── LLMsGeneratedCode/
+ │   ├── [Model_Name]/            # gpt-3.5-turbo, gpt-4o, gpt-5
+ │   │   ├── CodeGenerated/       # Initial raw outputs
+ │   │   ├── BaselineRepair/      # Post-compiler feedback attempts
+ │   │   └── InstructionRepair/   # Post-static analysis refinement
+ └── README.md