Rishi-Jain-27's picture
Created data generator and data and finetune.py
da653f3
|
Raw
History Blame Contribute Delete
4.86 kB

Dataset preview

Random sample of 6 examples (system prompt omitted for brevity).

Example 1 — switch_match (c), 6 nodes

User (input):

1| const char* trim_value(int kind) {
2|   switch (kind) {
3|     case 0:
4|       return "first";
5|     case 1:
6|       return "second";
7|     case 2:
8|       return "third";
9|     default:
10|       return "other";
11|   }
12| }

Assistant (target):

<thinking>
1. Control structures: 1 decision point, 4 terminal branches.
2. Execution nodes in order: A Start: trim_value, B Which case applies?, C Return first label, D Return second label, E Return third label, F Return default label.
3. Source lines: A maps to line 1, B maps to line 2, C maps to line 4, D maps to line 6, E maps to line 8, F maps to line 10.
</thinking>
graph TD
    A[Start: trim_value]
    B{Which case applies?}
    C[Return first label]
    D[Return second label]
    E[Return third label]
    F[Return default label]
    A --> B
    B -- First case --> C
    B -- Second case --> D
    B -- Third case --> E
    B -- Default --> F
<linemap>
A: 1
B: 2
C: 4
D: 6
E: 8
F: 10
</linemap>

Example 2 — guard_return (c), 4 nodes

User (input):

1| int aggregate(int found) {
2|   if (found) {
3|     return 0;
4|   }
5|   return 1;
6| }

Assistant (target):

<thinking>
1. Structural parse: 1 decision point, 2 terminal branches.
2. Execution nodes in order: A Start: aggregate, B Flag is set?, C Return false, D Return true.
3. Source lines: A maps to line 1, B maps to line 2, C maps to line 3, D maps to line 5.
</thinking>
graph TD
    A[Start: aggregate]
    B{Flag is set?}
    C[Return false]
    D[Return true]
    A --> B
    B -- True --> C
    B -- False --> D
<linemap>
A: 1
B: 2
C: 3
D: 5
</linemap>

Example 3 — dict_lookup (javascript), 4 nodes

User (input):

1| function searchList(registry, field) {
2|   if (field in registry) {
3|     return registry[field];
4|   }
5|   return null;
6| }

Assistant (target):

<thinking>
1. Structural parse: 1 decision point, 2 terminal branches.
2. Nodes mapped chronologically: A Start: searchList, B Key exists in map?, C Return the mapped value, D Return nothing.
3. Source lines: A maps to line 1, B maps to line 2, C maps to line 3, D maps to line 5.
</thinking>
graph TD
    A[Start: searchList]
    B{Key exists in map?}
    C[Return the mapped value]
    D[Return nothing]
    A --> B
    B -- Found --> C
    B -- Missing --> D
<linemap>
A: 1
B: 2
C: 3
D: 5
</linemap>

Example 4 — try_except (javascript), 4 nodes

User (input):

1| function formatName(line) {
2|   try {
3|     return parseInt(line);
4|   } catch (err) {
5|     return 0;
6|   }
7| }

Assistant (target):

<thinking>
1. Control structures: 1 decision point, 2 terminal branches.
2. Node sequence: A Start: formatName, B Conversion succeeds?, C Return the value, D Return zero.
3. Source lines: A maps to line 1, B maps to line 2, C maps to line 3, D maps to line 5.
</thinking>
graph TD
    A[Start: formatName]
    B{Conversion succeeds?}
    C[Return the value]
    D[Return zero]
    A --> B
    B -- Success --> C
    B -- Failure --> D
<linemap>
A: 1
B: 2
C: 3
D: 5
</linemap>

Example 5 — for_search (javascript), 5 nodes

User (input):

1| function lookup(results, wanted) {
2|   for (let i = 0; i < results.length; i++) {
3|     if (results[i] === wanted) {
4|       return i;
5|     }
6|   }
7|   return -1;
8| }

Assistant (target):

<thinking>
1. Structural parse: 1 decision point, 1 loop, 2 terminal branches.
2. Execution nodes in order: A Start: lookup, B Iterate over collection, C Item matches target?, D Return the index, E Return not found.
3. Source lines: A is line 1, B is line 2, C is line 3, D is line 4, E is line 7.
</thinking>
graph TD
    A[Start: lookup]
    B[Iterate over collection]
    C{Item matches target?}
    D[Return the index]
    E[Return not found]
    A --> B
    B -- For each --> C
    B -- No more items --> E
    C -- Match --> D
    C -- No match --> B
<linemap>
A: 1
B: 2
C: 3
D: 4
E: 7
</linemap>

Example 6 — dict_lookup (python), 4 nodes

User (input):

1| def lookup(mapping, user_id):
2|     if user_id in mapping:
3|         return mapping[user_id]
4|     return None

Assistant (target):

<thinking>
1. Structural parse: 1 decision point, 2 terminal branches.
2. Node sequence: A Start: lookup, B Key exists in map?, C Return the mapped value, D Return nothing.
3. Source lines: A maps to line 1, B maps to line 2, C maps to line 3, D maps to line 4.
</thinking>
graph TD
    A[Start: lookup]
    B{Key exists in map?}
    C[Return the mapped value]
    D[Return nothing]
    A --> B
    B -- Found --> C
    B -- Missing --> D
<linemap>
A: 1
B: 2
C: 3
D: 4
</linemap>