"""
Task prompts for the multi-step workflow.
Each function returns a formatted prompt string with variables replaced.
"""

def step1_environment_setup_and_tutorial_discovery(github_repo_name, tutorial_filter=""):
    """
    Step 1: Environment Setup & Tutorial Discovery Coordinator

    Args:
        github_repo_name: Repository name
        tutorial_filter: Optional tutorial filter (file path or title matching)
    """
    return f"""# Environment Setup & Tutorial Discovery Coordinator

## Role
Orchestrator agent that coordinates parallel environment setup and tutorial discovery for scientific research codebases. You manage subagent execution, handle errors, validate outputs, and ensure successful completion of both tasks.

## Core Mission
Transform scientific research codebases into reusable tools by coordinating two specialized agents working in parallel to prepare the codebase for tool extraction.

## Subagent Capabilities
- **environment-python-manager**: Comprehensive Python environment setup with uv, pytest configuration, and dependency management
- **tutorial-scanner**: Systematic tutorial identification, classification, and quality assessment for tool extraction

## Input Parameters
- `repo/{github_repo_name}`: Repository codebase directory
- `github_repo_name`: Project name (exact capitalization from context)
- `PROJECT_ROOT`: Absolute path to project directory
- `UV_PYTHON_ENV`: Target uv python environment name
- `tutorial_filter`: Optional tutorial filter (file path or title matching)

## Expected Outputs
- `reports/environment-manager_results.md`: Environment setup summary
- `reports/tutorial-scanner.json`: Complete tutorial analysis
- `reports/tutorial-scanner-include-in-tools.json`: Filtered tutorials for tool creation

---

## Execution Coordination

### Phase 1: Parallel Agent Launch
Execute both agents simultaneously using the Task tool with concurrent calls:

```
Task 1: environment-python-manager
- Mission: Set up {github_repo_name}-env with Python ≥3.10
- Working directory: Current directory (NOT repo/ subfolder)
- Requirements: uv environment, pytest configuration, dependency installation
- Output: reports/environment-manager_results.md

Task 2: tutorial-scanner
- Mission: Scan repo/{github_repo_name}/ for tool-worthy tutorials
- Filter parameter: {tutorial_filter} (if provided)
- Requirements: Strict filtering, quality assessment, JSON output generation
- Output: reports/tutorial-scanner.json + reports/tutorial-scanner-include-in-tools.json
```

### Phase 2: Progress Monitoring & Error Recovery

**Timeout Management:**
- Monitor agent progress with 10-minute timeout per agent
- Implement graceful failure handling for long-running operations

**Error Recovery Strategies:**
- **Environment failures**: Provide alternative Python versions (3.10, 3.11, 3.12)
- **Tutorial scanning failures**: Attempt partial scanning with error reporting
- **Resource conflicts**: Ensure agents don't interfere with shared directories
- **Filter failures**: Validate filter syntax and provide clear error messages

### Phase 3: Output Validation Framework

**Environment Validation:**
- Verify environment-manager_results.md exists and contains required sections
- Confirm environment activation commands are properly documented
- Validate Python version compliance (≥3.10)

**Tutorial Validation:**
- Validate JSON schema compliance for both output files
- Cross-reference tutorial paths with actual repository structure
- Verify filter results match expected criteria
- Ensure no legacy/deprecated content marked as "include-in-tools"

**Quality Checks:**
- Environment: Successful dependency installation, pytest configuration
- Tutorials: Proper classification, quality standards applied consistently

---

## Tutorial Filter Coordination

When `tutorial_filter` is provided:
- Pass exact filter string to tutorial-scanner: `"{tutorial_filter}"`
- Ensure case-insensitive matching for both file paths and tutorial titles
- Validate OR logic: match if EITHER file path OR title matches
- **Strict enforcement**: No fallback to all tutorials if no matches found
- Report match statistics in final summary
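
The matching rule can be sketched in a few lines (illustrative only; the function name and signature are not part of any agent contract):

```python
def matches_filter(file_path, title, tutorial_filter):
    # Case-insensitive OR logic: include the tutorial if EITHER the
    # file path OR the title contains the filter string.
    needle = tutorial_filter.lower()
    return needle in file_path.lower() or needle in title.lower()
```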

---

## Success Criteria & Completion

### Completion Requirements
Both agents must complete successfully before marking the task complete. Use [✓] to confirm success and [✗] to confirm failure, with a one-line reason for each. If there are any failures, fix them and rerun the coordination, up to 3 attempts.

- [ ] **Environment Setup**: Environment setup completed with no critical errors
- [ ] **Tutorial Scanning**: Tutorial scanning completed with valid JSON outputs
- [ ] **Output Generation**: All required output files generated and validated
- [ ] **Quality Control**: No deprecated/legacy content incorrectly classified

### Consolidated Reporting
Generate final summary combining both agent results:
```
Environment Setup & Tutorial Discovery Complete

Environment Status:
- Environment: {github_repo_name}-env
- Python Version: [version]
- Dependencies: [count] packages installed
- Activation: source {github_repo_name}-env/bin/activate

Tutorial Analysis:
- Total tutorials scanned: [count]
- Tutorials included in tools: [count]
- Filter applied: [filter_status]
- Quality assessment: [pass/issues]

Execution Metrics:
- Environment setup time: [duration]
- Tutorial scanning time: [duration]
- Total execution time: [duration]
```

### Error Reporting
If either agent fails:
- Document specific failure points
- Provide actionable remediation steps
- Attempt automatic recovery where possible
- Escalate to user only for unrecoverable failures

---

## Variable Standards
- Use `{github_repo_name}` consistently throughout
- Maintain exact capitalization from input parameters
- Ensure environment paths are relative to current working directory
- Standardize filter parameter passing between supervisor and subagents
"""


def step2_tutorial_execution(github_repo_name, api_key=""):
    """
    Step 2: Tutorial Execution Coordinator

    Args:
        github_repo_name: Repository name
        api_key: Optional API key for tutorials requiring external API access
    """
    return f"""# Tutorial Execution Coordinator

## Role
Orchestrator agent that coordinates tutorial execution by managing the tutorial-executor subagent to generate gold-standard outputs from discovered tutorials. You oversee execution progress, handle errors, validate outputs, and ensure successful completion.

## Core Mission
Transform tutorial materials into executable, validated notebooks with gold-standard outputs for downstream tool extraction by coordinating systematic tutorial execution.

## Subagent Capabilities
- **tutorial-executor**: Comprehensive tutorial execution specialist that handles notebook preparation, environment management, iterative error resolution, and output generation for all tutorials

## Input Requirements
- `reports/tutorial-scanner-include-in-tools.json`: List of tutorials requiring execution
- `{github_repo_name}-env`: Pre-configured Python environment for execution
- Repository structure under `repo/{github_repo_name}/`
- `api_key`: Optional API key for tutorials requiring external API access: "{api_key}"

## Expected Outputs
- `notebooks/{"{tutorial_file_name}"}/{"{tutorial_file_name}"}_execution_final.ipynb`: Final validated notebooks
- `notebooks/{"{tutorial_file_name}"}/images/`: Extracted figures and visualizations
- `reports/executed_notebooks.json`: Complete execution summary with GitHub URLs

---

## Execution Coordination

### Phase 1: Pre-Execution Validation

**Input Validation:**
- Verify `reports/tutorial-scanner-include-in-tools.json` exists and contains valid tutorials
- Confirm `{github_repo_name}-env` environment is available and functional
- Validate repository structure and tutorial file accessibility
- Check for required tools (papermill, jupytext, image extraction scripts)

**Environment Preparation:**
- Test environment activation: `source {github_repo_name}-env/bin/activate`
- Verify essential dependencies are installed (papermill, nbclient, ipykernel, imagehash)
- Ensure repository paths are accessible from current working directory

**API Key Integration:**
- When an API key is provided ("{api_key}"), instruct tutorial-executor to:
  - Detect notebooks requiring API keys (OpenAI, Anthropic, Gemini, AlphaGenome, ESM, etc.)
  - Inject API key assignments at the beginning of notebooks:
    ```python
    # API Configuration
    api_key = "{api_key}"
    openai.api_key = api_key  # For OpenAI
    # client = anthropic.Anthropic(api_key=api_key)  # For Anthropic
    # etc.
    ```
  - Handle common API patterns (openai, anthropic, google-generativeai, etc.)
  - Document API key injection in execution logs
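
A minimal detection heuristic along these lines could decide which notebooks need injection; the marker list is an illustrative assumption, not exhaustive:

```python
def needs_api_key(notebook_source):
    # Heuristic: flag notebooks that mention any of the API clients
    # listed above (marker list is illustrative, not exhaustive).
    markers = ("openai", "anthropic", "google.generativeai", "alphagenome", "esm")
    lowered = notebook_source.lower()
    for marker in markers:
        if marker in lowered:
            return True
    return False
```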

### Phase 2: Tutorial Execution Launch

**Single Agent Coordination:**
```
Task: tutorial-executor
- Mission: Execute all tutorials from tutorial-scanner results
- Input: reports/tutorial-scanner-include-in-tools.json
- Environment: {github_repo_name}-env
- API Key: "{api_key}" (if provided, inject into notebooks requiring API access)
- Requirements: Generate execution notebooks, handle errors, extract images
- Output: notebooks/ directory structure + reports/executed_notebooks.json
```

**Execution Monitoring:**
- Track tutorial-executor progress through status updates
- Monitor for critical failures that require intervention
- Implement timeout handling (30-minute maximum per tutorial)
- Provide progress feedback for long-running executions

### Phase 3: Error Recovery & Quality Assurance

**Error Recovery Strategies:**
- **Environment Issues**: Guide tutorial-executor through dependency installation
- **Data Dependencies**: Assist with data file discovery and path resolution
- **Version Compatibility**: Support Python/package version conflict resolution
- **Execution Failures**: Coordinate retry attempts (up to 5 iterations per tutorial)

**Quality Validation Framework:**
- **Execution Completeness**: Verify all tutorials attempted and status documented
- **Output Integrity**: Confirm final notebooks execute without errors
- **File Organization**: Validate snake_case naming conventions applied consistently
- **Image Extraction**: Ensure figures extracted to proper directory structure

### Phase 4: Output Validation & Reporting

**Output Structure Validation:**
```
Expected Structure:
notebooks/
├── tutorial_file_1/
│   ├── tutorial_file_1_execution_final.ipynb
│   └── images/
│       ├── figure_1.png
│       └── figure_2.png
├── tutorial_file_2/
│   ├── tutorial_file_2_execution_final.ipynb
│   └── images/
└── ...

reports/executed_notebooks.json
```

**JSON Validation:**
- Verify `reports/executed_notebooks.json` contains all successful executions
- Validate GitHub URL generation and accessibility
- Confirm execution_path accuracy for all entries
- Test HTTP URLs with fetch requests to ensure validity
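
A minimal schema check can catch missing fields before URL testing; the field names here are assumptions about the report schema, not a fixed contract:

```python
import json

def validate_executed_notebooks(path):
    # Return (index, field) pairs for entries missing required fields.
    # The required field names are assumed, not a fixed contract.
    required = ("tutorial_file_name", "execution_path", "github_url")
    with open(path) as fh:
        entries = json.load(fh)
    problems = []
    for index, entry in enumerate(entries):
        for key in required:
            if key not in entry:
                problems.append((index, key))
    return problems
```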

**Branch Detection Verification:**
```bash
git -C repo/{github_repo_name} branch --show-current
```

---

## Success Criteria & Completion

### Completion Requirements
Use [✓] to confirm success and [✗] to confirm failure. Provide a one-line reason for success or failure. If there are any failures, coordinate resolution and retry up to 3 attempts.

- [ ] **Input Validation**: Tutorial list and environment successfully validated
- [ ] **Execution Launch**: Tutorial-executor agent launched and completed successfully
- [ ] **Output Generation**: All expected notebooks and images generated
- [ ] **Quality Assurance**: Execution integrity verified and documented
- [ ] **JSON Validation**: executed_notebooks.json created with valid GitHub URLs
- [ ] **File Organization**: Proper directory structure and naming conventions followed

### Consolidated Reporting
Generate final summary of execution results:
```
Tutorial Execution Coordination Complete

Execution Summary:
- Total tutorials processed: [count]
- Successfully executed: [count]
- Failed executions: [count]
- Environment: {github_repo_name}-env

Output Artifacts:
- Final notebooks: notebooks/*/[tutorial_file]_execution_final.ipynb
- Extracted images: notebooks/*/images/
- Execution report: reports/executed_notebooks.json

Quality Metrics:
- Error-free executions: [percentage]
- Image extraction success: [count]
- GitHub URL validation: [pass/fail]
```

### Error Documentation
For any failures encountered:
- Document specific tutorial execution failures with root causes
- Provide actionable remediation steps for manual intervention
- Report environment or dependency issues requiring resolution
- Escalate unrecoverable failures with detailed error analysis

**Iteration Tracking:**
- **Current coordination attempt**: ___ of 3 maximum
- **Tutorial-executor retry cycles**: ___ per tutorial (max 5)
- **Critical issues requiring intervention**: ___

---

## File Naming Standards
- **Snake Case Convention**: Convert all tutorial file names to snake_case format
  - Example: `Data-Processing-Tutorial` → `data_processing_tutorial`
- **Directory Structure**: `notebooks/{"{tutorial_file_name}"}/`
- **Final Notebooks**: `{"{tutorial_file_name}"}_execution_final.ipynb`
- **Image Directory**: `notebooks/{"{tutorial_file_name}"}/images/`
- **Consistent Application**: Apply naming convention throughout all outputs
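
The conversion rule can be sketched as follows (a rough illustration; edge cases such as acronyms are not handled):

```python
import re

def to_snake_case(name):
    # Insert an underscore at lowercase/digit-to-uppercase boundaries,
    # normalize hyphens and whitespace to underscores, then lowercase.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    name = re.sub(r"[\-\s]+", "_", name)
    return name.lower()
```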

## Environment Requirements
- **Primary Environment**: `{github_repo_name}-env` (pre-configured)
- **Required Tools**: papermill, jupytext, nbclient, ipykernel, imagehash
- **Execution Context**: Activated environment for all tutorial operations
- **Path Resolution**: Repository-relative paths for data and file access
"""


def step3_tool_extraction_and_testing(github_repo_name, api_key=""):
    """
    Step 3: Tool Extraction & Testing Coordinator

    Args:
        github_repo_name: Repository name
        api_key: Optional API key for testing tools requiring external API access
    """
    return f"""# Tool Extraction & Testing Coordinator

## Role
Orchestrator agent that coordinates sequential tool extraction and testing by managing specialized subagents to transform tutorial notebooks into production-ready, tested function libraries.

## Core Mission
Convert executed tutorial notebooks into reusable tools with comprehensive test suites through systematic two-phase coordination: extraction followed by verification and improvement.

## Subagent Capabilities
- **tutorial-tool-extractor-implementor**: Systematic tool extraction specialist that analyzes tutorials and implements reusable functions with scientific rigor
- **test-verifier-improver**: Comprehensive testing specialist that creates, executes, and iteratively improves test suites until 100% pass rate

## Input Requirements
- `reports/executed_notebooks.json`: List of successfully executed tutorials requiring tool extraction
- `{github_repo_name}-env`: Pre-configured Python environment with dependencies
- `notebooks/`: Directory containing executed tutorial notebooks and images
- `api_key`: Optional API key for testing tools requiring external API access: "{api_key}"

## Expected Outputs
```
src/tools/{"{tutorial_file_name}"}.py                        # Production-ready tool implementations (file-based)
tests/code/{"{tutorial_file_name}"}/<tool1_name>_test.py     # Individual test file for tool 1
tests/code/{"{tutorial_file_name}"}/<tool2_name>_test.py     # Individual test file for tool 2
tests/code/{"{tutorial_file_name}"}/<toolN_name>_test.py     # Individual test file for tool N
tests/data/{"{tutorial_file_name}"}/                         # Test data fixtures (if needed)
tests/results/{"{tutorial_file_name}"}/                      # Test execution results
tests/logs/{"{tutorial_file_name}"}_<tool_name>_test.log     # Individual test execution logs per tool
tests/logs/{"{tutorial_file_name}"}_test.md                  # Final comprehensive test summary
```

### File-Based Tutorial Organization
**Important**: Tutorial extraction and testing are **file-based**, not individual tutorial-based:
- **Single File, Multiple Tutorials**: One README.md or notebook file may contain multiple tutorial sections (e.g., Tutorial 1, Tutorial 2, ... Tutorial 6)
- **Consolidated Implementation**: All tutorials from the same source file are implemented in a single `src/tools/{"{tutorial_file_name}"}.py`
- **Unified Testing**: All tools from the same source file are tested together under `tests/code/{"{tutorial_file_name}"}/`
- **Example**: If `README.md` contains 6 tutorial sections, all extracted tools go into `src/tools/readme.py` with corresponding tests in `tests/code/readme/`

---

## Parallel Execution Coordination

### Phase 1: Parallel Tool Extraction & Implementation

**Pre-Extraction Validation:**
- Verify `reports/executed_notebooks.json` contains valid tutorial entries
- Confirm all referenced notebook files exist and are accessible
- Validate environment activation: `source {github_repo_name}-env/bin/activate`
- Check prerequisite tools and dependencies are available

**Parallel Extraction Coordination:**
For each tutorial file in `executed_notebooks.json`, launch in parallel:
```
Task: tutorial-tool-extractor-implementor
- Mission: Extract tools from ALL tutorials within SINGLE file {"{tutorial_file_name}"}
- Input: Single file entry from executed_notebooks.json + corresponding notebook file
- Environment: {github_repo_name}-env
- Requirements: Production-quality tools, scientific rigor, real-world applicability
- Critical Rules:
  * NEVER add function parameters not in original tutorial
  * PRESERVE exact tutorial structure - no generalized patterns
  * Basic input file validation only
  * Extract ALL tutorial sections from the same source file into single output
- Output: src/tools/{"{tutorial_file_name}"}.py (containing all tutorials from source file)
```

**Parallel Extraction Monitoring:**
- Track progress through individual implementation log files per tutorial file
- Monitor for critical extraction failures requiring intervention per tutorial file
- Implement timeout handling (45-minute maximum per tutorial file extraction)
- Wait for ALL parallel extractions to complete before proceeding to testing phase
- **Verify Tutorial Fidelity**: Check that function calls exactly match tutorial (no added parameters)
- **Verify Structure Preservation**: Ensure exact tutorial data structures are preserved
- **Count Functions**: For each tutorial file, run `grep "@<tutorial_file_name>_mcp.tool" src/tools/<tutorial_file_name>.py | wc -l` to determine number of test files needed

### Phase 2: Parallel Testing, Verification & Improvement

**Pre-Testing Validation:**
- Verify all expected `src/tools/{"{tutorial_file_name}"}.py` files were generated
- Count decorated functions: `grep "@<tutorial_file_name>_mcp.tool" src/tools/<tutorial_file_name>.py | wc -l`
- Confirm tool implementations follow required patterns and standards
- Validate function decorators and proper tool structure
- Check availability of tutorial execution data for testing

**Parallel Tutorial File Testing Coordination:**
For each tutorial file that completed extraction, launch in parallel:
```
Task: test-verifier-improver
- Mission: Create individual test files for EACH decorated tool function in SINGLE file {"{tutorial_file_name}"}
- Approach: Sequential tool-by-tool testing within file (Tool 1 → Tool 2 → Tool N)
- Input: src/tools/{"{tutorial_file_name}"}.py + notebooks/{"{tutorial_file_name}"}/ + execution data
- Environment: {github_repo_name}-env with pytest infrastructure
- API Key: "{api_key}" (if provided, use for testing tools requiring API access)
- Requirements: One test file per tool, 100% function coverage, tutorial fidelity
- Output Structure:
  * tests/code/{"{tutorial_file_name}"}/<tool1_name>_test.py
  * tests/code/{"{tutorial_file_name}"}/<tool2_name>_test.py
  * tests/code/{"{tutorial_file_name}"}/<toolN_name>_test.py
  * tests/logs/{"{tutorial_file_name}"}_<tool_name>_test.log (per tool)
  * tests/logs/{"{tutorial_file_name}"}_test.md (final summary)
```

**Parallel Tutorial File Testing Monitoring:**
- **Per-File Sequential Order**: Within each tutorial file, process tools one at a time in order
- **Tool 1 Complete Cycle**: Create test → Run → Fix → Pass before Tool 2
- **Tool 2 Complete Cycle**: Create test → Run → Fix → Pass before Tool 3
- **Dependency Management**: Tool N+1 can reference actual outputs from Tool N within same tutorial file
- Monitor iterative improvement cycles (up to 6 attempts per function)
- **Success Tracking**: Each tool passes individually or decorator removed after 6 attempts
- **Cross-File Independence**: Different tutorial files can test in parallel without dependencies

**API Key Testing Guidelines:**
- When an API key is provided ("{api_key}"), instruct test-verifier-improver to:
  - Detect tools requiring API access (OpenAI, Anthropic, Gemini, AlphaGenome, ESM, etc.)
  - Include API key configuration in test files and supply it wherever the tools require it
    ```python
    # API Configuration for testing
    api_key = "{api_key}"
    # Configure appropriate API client based on tool requirements
    ```
  - Document API requirements in test logs for each tool

### Phase 3: Quality Assurance & Validation

**Inter-Phase Validation:**
- **Extraction Completeness**: Verify all parallel tutorial file extractions completed successfully
- **Tool Quality**: Confirm tools follow scientific rigor and real-world applicability standards
- **Tutorial Fidelity**: Verify function calls exactly match original tutorial (no added parameters)
- **Structure Preservation**: Confirm exact tutorial data structures preserved (no generalized patterns)
- **Error Handling**: Verify only basic input file validation implemented
- **Tool-Based Test Coverage**: Ensure 1:1 mapping between decorated functions and individual test files
- **Figure Validation**: Verify generated figures match tutorial execution notebook figures

**Error Recovery Strategies:**
- **Parallel Extraction Failures**: Guide individual tutorial-tool-extractor instances through dependency resolution and code adaptation
- **Parallel Testing Failures**: Support individual test-verifier-improver instances with iterative debugging and improvement cycles
- **Quality Issues**: Coordinate refinement of tools that don't meet production standards across parallel instances
- **Integration Problems**: Resolve conflicts between parallel extraction and testing phases
- **Resource Management**: Handle resource conflicts and timeouts across parallel operations

---

## Success Criteria & Completion

### Completion Requirements
Use [✓] to confirm success and [✗] to confirm failure. Provide a one-line reason for success or failure. If there are any failures, coordinate resolution and retry up to 3 attempts.

- [ ] **Parallel Extraction Phase**: All tutorial files successfully converted to tool implementations in parallel
- [ ] **Tool Quality**: Tools meet scientific rigor and real-world applicability standards
- [ ] **Tutorial Fidelity**: Function calls exactly match original tutorial (no added parameters)
- [ ] **Structure Preservation**: Exact tutorial data structures preserved (no generalized patterns)
- [ ] **Error Handling**: Only basic input file validation implemented
- [ ] **Parallel Testing Phase**: Individual test files created for each decorated function across parallel tutorial files
- [ ] **Per-File Sequential Processing**: Within each tutorial file, all tools tested in order, each passing before next tool creation
- [ ] **Test Coverage**: 1:1 mapping between `@<tutorial_file_name>_mcp.tool` functions and test files
- [ ] **Test Results**: All tools pass tests or failed functions properly marked after 6 attempts
- [ ] **Figure Validation**: Generated figures match tutorial execution notebook figures
- [ ] **Documentation**: Complete logs and documentation generated for all parallel phases
- [ ] **File Structure**: Proper directory organization and naming conventions followed
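
The 1:1 mapping check above can be sketched as follows (the decorator pattern mirrors the grep used during extraction; treat this as an illustration, not the required implementation):

```python
import re
from pathlib import Path

def check_tool_test_mapping(tool_file, test_dir):
    # Collect names of functions decorated with @<name>_mcp.tool and
    # compare them against <tool_name>_test.py files in test_dir.
    source = Path(tool_file).read_text()
    pattern = r"@\w+_mcp\.tool\b[^\n]*\n\s*def (\w+)"
    tools = set(re.findall(pattern, source))
    tests = set()
    for test_path in Path(test_dir).glob("*_test.py"):
        tests.add(test_path.name[: -len("_test.py")])
    missing_tests = sorted(tools - tests)   # tools without a test file
    orphan_tests = sorted(tests - tools)    # test files without a tool
    return missing_tests, orphan_tests
```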

### Consolidated Reporting
Generate final summary of tool extraction and testing:
```
Parallel Tool Extraction & Testing Coordination Complete

Parallel Extraction Summary:
- Total tutorial files processed in parallel: [count]
- Successfully extracted in parallel: [count]
- Tool files generated: src/tools/[count].py files
- Real-world applicability: [assessment]

Parallel Tool-Based Testing Summary:
- Total tutorial files tested in parallel: [count]
- Total functions tested across all tutorial files: [count]
- Individual test files created: [count] (tests/code/<tutorial_file_name>/<tool_name>_test.py)
- Per-file sequential processing completed: [yes/no]
- Functions passing tests: [count]
- Functions marked as failed: [count]
- Per-tool execution logs: tests/logs/<tutorial_file_name>_<tool_name>_test.log
- Final summary documentation: tests/logs/<tutorial_file_name>_test.md

Quality Metrics:
- Figure validation success: [count]/[total]
- Scientific rigor compliance: [assessment]
- Production readiness: [assessment]
- Parallel processing efficiency: [assessment]
```

### Error Documentation
For any coordination failures:
- Document specific phase failures with root causes
- Provide actionable remediation steps for manual intervention
- Report tool quality issues requiring refinement
- Escalate unrecoverable failures with detailed analysis

**Iteration Tracking:**
- **Current coordination attempt**: ___ of 3 maximum
- **Parallel extraction retry cycles**: ___ (if needed)
- **Parallel testing retry cycles**: ___ per function (max 6)
- **Critical parallel coordination issues**: ___

---

## Guiding Principles for Coordination

### 1. Scientific Rigor & Tutorial Fidelity
- **Publication Quality**: Ensure tools meet research-grade standards
- **Conservative Approach**: Surface assumptions, limitations, and uncertainties explicitly
- **No Fabrication**: Never allow invention of inputs, defaults, or examples
- **Real-World Focus**: Tools designed for actual use cases, not just tutorial reproduction
- **Exact Tutorial Preservation**: Function calls must exactly match tutorial (no added parameters)
- **Structure Preservation**: Preserve exact tutorial data structures (no generalized patterns)
- **Minimal Error Handling**: Implement only basic input file validation

### 2. Parallel Dependency Management
- **Phase Dependency**: Testing cannot begin until all parallel extractions are complete
- **Output Validation**: Verify each parallel phase produces required inputs for next phase
- **Error Propagation**: Handle failures gracefully without breaking downstream phases or other parallel instances
- **State Management**: Maintain clear handoff between parallel extraction and parallel testing phases
- **Cross-File Independence**: Ensure parallel tutorial files don't interfere with each other

### 3. Quality Assurance
- **Tool Validation**: Ensure extracted tools meet production standards
- **Test Fidelity**: Verify tests use exact tutorial examples and parameters
- **Figure Accuracy**: Confirm visual outputs match tutorial execution results
- **Documentation Standards**: Maintain comprehensive logs and decision tracking

### 4. File Structure Standards
- **Snake Case Convention**: `Data-Processing-Tutorial` β†’ `data_processing_tutorial`
- **Consistent Organization**: Standardized directory structure across all tutorials
- **Naming Compliance**: Uniform file naming for tools, tests, and logs
- **Path Management**: Absolute paths in all artifacts and references
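
The snake_case conversion above can be sketched as follows (a minimal illustration; `to_snake_case` is a hypothetical helper, not part of the pipeline):

```python
import re

def to_snake_case(name):
    # Replace hyphens/spaces with underscores, then split camelCase boundaries.
    name = re.sub(r"[-\s]+", "_", name)
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    return name.lower()
```

For example, `to_snake_case("Data-Processing-Tutorial")` yields `data_processing_tutorial`.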

---

## Environment Requirements
- **Primary Environment**: `{github_repo_name}-env` (pre-configured with dependencies)
- **Required Tools**: pytest, fastmcp, imagehash, pandas, numpy, matplotlib
- **Execution Context**: Activated environment for all tool and test operations
- **Directory Structure**: Proper src/, tests/, notebooks/ organization
- **Path Resolution**: Repository-relative paths for data and file access
"""


def step4_mcp_integration(github_repo_name):
    """
    Step 4: MCP Integration Implementor

    Args:
        github_repo_name: Repository name
    """
    return f'''# MCP Integration Implementor

## Role
Expert implementor responsible for Model Context Protocol (MCP) integration using the FastMCP package. You analyze extracted tool modules and create unified MCP server implementations that expose all tutorial tools through a single, well-structured interface.

## Core Mission
Transform distributed tool modules into a cohesive MCP server that provides unified access to all extracted tutorial functionalities through systematic analysis, integration, and validation.

## Input Requirements
- `src/tools/`: Directory containing validated tutorial tool modules (`.py` files)
- `{github_repo_name}`: Repository name for proper server naming and identification
- Environment: `{github_repo_name}-env` with FastMCP dependencies

## Expected Outputs
- `src/{github_repo_name}_mcp.py`: Unified MCP server file integrating all tool modules
- Comprehensive tool documentation within server docstring
- Validated, executable MCP server implementation

---

## Implementation Process

### Phase 1: Tool Module Discovery & Analysis

**Pre-Integration Validation:**
- Verify `src/tools/` directory exists and contains tool modules
- Confirm all `.py` files follow expected naming conventions (snake_case)
- Validate environment activation: `source {github_repo_name}-env/bin/activate`
- Check FastMCP package availability and version compatibility

**Module Analysis Process:**
- **Discovery**: Scan `src/tools/` for all `.py` files
- **Structure Analysis**: Extract module names, tool names, and descriptions
- **Dependency Verification**: Confirm all modules can be imported successfully
- **Documentation Extraction**: Parse tool descriptions for comprehensive server documentation
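
The discovery and documentation-extraction steps above can be sketched with the standard library (an illustrative sketch; `discover_tool_modules` is a hypothetical helper, not part of the pipeline):

```python
import ast
from pathlib import Path

def discover_tool_modules(tools_dir):
    """Collect (module_name, docstring) pairs for every tool module."""
    modules = []
    for path in sorted(Path(tools_dir).glob("*.py")):
        if path.name == "__init__.py":
            continue
        # Parse without importing so a broken module cannot halt discovery.
        tree = ast.parse(path.read_text())
        doc = ast.get_docstring(tree) or "(no description)"
        modules.append((path.stem, doc))
    return modules
```

The resulting pairs feed directly into the server docstring generated in Phase 2.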

### Phase 2: MCP Server Generation

**Integration Strategy:**
```
Template-Based Generation:
- Input: Analyzed tool modules and extracted metadata
- Processing: Generate MCP server using standardized template
- Output: src/{github_repo_name}_mcp.py with unified tool access
- Validation: Syntax checking and import verification
```

**Server Template Structure:**
```python
"""
Model Context Protocol (MCP) for {github_repo_name}

[Three-sentence description of codebase functionality]

This MCP Server contains tools extracted from the following tutorial files:
1. tutorial_file_1_name
    - tool1_name: tool1_description
    - tool2_name: tool2_description
2. tutorial_file_2_name
    - tool1_name: tool1_description
    ...
"""

from fastmcp import FastMCP

# Import statements (alphabetical order)
from tools.tutorial_file_1_name import tutorial_file_1_name_mcp
from tools.tutorial_file_2_name import tutorial_file_2_name_mcp

# Server definition and mounting
mcp = FastMCP(name="{github_repo_name}")
mcp.mount(tutorial_file_1_name_mcp)
mcp.mount(tutorial_file_2_name_mcp)

if __name__ == "__main__":
    mcp.run()
```

### Phase 3: Validation & Quality Assurance

**Integration Validation:**
- **Import Verification**: Ensure all tool modules import correctly
- **Mount Verification**: Confirm all discovered tools are properly mounted
- **Documentation Accuracy**: Validate docstring reflects actual available tools
- **Template Compliance**: Verify strict adherence to provided template structure
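
The syntax portion of this validation can be sketched with `ast` (illustrative only; `validate_server_syntax` is a hypothetical helper, not part of the pipeline):

```python
import ast

def validate_server_syntax(server_path):
    """Return True when the generated server file parses as valid Python."""
    with open(server_path) as fh:
        source = fh.read()
    try:
        ast.parse(source)
        return True
    except SyntaxError as exc:
        # Report the offending line so the error-recovery loop can target it.
        print("Syntax error at line", exc.lineno)
        return False
```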

**Functional Testing:**
```bash
# Test server execution
{github_repo_name}-env/bin/python src/{github_repo_name}_mcp.py
```

**Error Recovery Process:**
- **Import Errors**: Handle missing dependencies or malformed modules
- **Template Errors**: Fix formatting and structure issues
- **Execution Errors**: Resolve runtime configuration problems
- **Maximum Iterations**: Up to 6 fix attempts per error type

---

## Success Criteria & Completion

### Completion Requirements
Use [βœ“] to confirm success and [βœ—] to confirm failure. Provide a one-line reason for success or failure. If there are any failures, coordinate resolution and retry up to 3 attempts.

- [ ] **Module Discovery**: All tool modules in src/tools/ successfully identified and analyzed
- [ ] **Server Generation**: MCP server file created following exact template structure
- [ ] **Import Integration**: All tool modules properly imported and mounted
- [ ] **Documentation Completeness**: Server docstring accurately reflects all available tools
- [ ] **Execution Validation**: Server executes without errors in target environment
- [ ] **Template Compliance**: Strict adherence to provided template without additions

### Consolidated Reporting
Generate final summary of MCP integration:
```
MCP Integration Implementation Complete

Discovery Summary:
- Tool modules found: [count]
- Modules successfully analyzed: [count]
- Total tools integrated: [count]
- Server file: src/{github_repo_name}_mcp.py

Integration Summary:
- Import statements: [count] modules
- Mount operations: [count] tools
- Documentation: [complete/incomplete]
- Template compliance: [verified/issues]

Validation Summary:
- Syntax validation: [pass/fail]
- Import validation: [pass/fail]
- Execution test: [pass/fail]
- Error resolution attempts: [count]/6 maximum
```

### Error Documentation
For any integration failures:
- Document specific module import failures with root causes
- Report template compliance issues requiring resolution
- Provide actionable steps for manual intervention when automated fixes fail
- Escalate persistent execution errors with detailed diagnosis

**Iteration Tracking:**
- **Current integration attempt**: ___ of 3 maximum
- **Error resolution cycles**: ___ per error type (max 6)
- **Critical integration issues**: ___

---

## Integration Standards

### File Naming & Structure
- **Server File**: `src/{github_repo_name}_mcp.py` (exact repository name case)
- **Snake Case Convention**: All internal references use snake_case format
- **Template Adherence**: No additions beyond specified template structure
- **Import Order**: FastMCP first, then tool imports alphabetically

### Quality Assurance Framework
- **Module Validation**: Each tool module must import successfully before integration
- **Tool Discovery**: Extract actual tool names and descriptions from module analysis
- **Documentation Accuracy**: Server docstring must reflect real available functionality
- **Execution Verification**: Server must start without errors in target environment

### Error Recovery Strategy
- **Missing Modules**: Document missing tools but continue with available modules
- **Import Failures**: Attempt dependency resolution and retry import
- **Template Errors**: Fix structure/syntax issues systematically
- **Execution Failures**: Debug runtime configuration and environment issues

---

## Environment Requirements
- **Primary Environment**: `{github_repo_name}-env` (pre-configured with dependencies)
- **Required Package**: FastMCP for MCP server implementation
- **Tool Dependencies**: All dependencies required by individual tool modules
- **Execution Context**: Activated environment for server testing and validation
'''


def step5_code_quality_and_coverage_analysis():
    return '''# Code Quality & Coverage Analysis Coordinator

## Role
Quality assurance coordinator that analyzes pre-generated code coverage reports and quantitative code quality metrics (including style analysis via pylint) for all extracted tools, providing actionable insights into test completeness, code style, and overall code quality.

## Core Mission
Analyze pre-generated coverage and pylint reports to extract quantitative metrics on test coverage and code quality, identify gaps in testing and style issues, and compile comprehensive quality assessment reports from the collected data.

## Input Requirements
- `reports/coverage/`: Pre-generated coverage reports from pytest-cov
  - `coverage.xml`: XML coverage report
  - `coverage.json`: JSON coverage report
  - `coverage_summary.txt`: Text summary of coverage
  - `htmlcov/`: HTML coverage dashboard
  - `pytest_output.txt`: Full pytest execution output
- `reports/quality/pylint/`: Pre-generated pylint reports
  - `pylint_report.txt`: Full pylint analysis output
  - `pylint_scores.txt`: Per-file scores summary
- `src/tools/`: Directory containing tool implementations (for reference)
- `tests/code/`: Directory containing test files (for reference)
- `reports/executed_notebooks.json`: List of tutorial files for analysis

## Expected Outputs
```
reports/coverage/
  β”œβ”€β”€ coverage.xml                          # XML coverage report (for CI/CD integration)
  β”œβ”€β”€ coverage.json                          # JSON coverage report (machine-readable)
  β”œβ”€β”€ htmlcov/                               # HTML coverage report (human-readable)
  β”‚   β”œβ”€β”€ index.html                         # Main coverage dashboard
  β”‚   └── ...                                # Per-file coverage details
  β”œβ”€β”€ coverage_summary.txt                   # Text summary of coverage metrics
  └── coverage_report.md                     # Detailed markdown report with quality metrics

reports/quality/
  └── pylint/                                # Pylint code style analysis
      β”œβ”€β”€ pylint_report.txt                  # Text output from pylint
      β”œβ”€β”€ pylint_report.json                 # JSON output (if available)
      β”œβ”€β”€ pylint_scores.txt                  # Per-file scores summary
      └── pylint_issues.md                   # Detailed issues breakdown
reports/coverage_and_quality_report.md        # Combined coverage + style quality report
```

---

## Execution Workflow

### Phase 1: Pre-Analysis Validation

**Note**: Code formatting with `black` and `isort` has already been applied to `src/tools/*.py`. Coverage analysis with pytest-cov and style analysis with pylint have already been executed. This phase focuses on analyzing the generated reports.

**Report File Validation:**
- Verify `reports/coverage/coverage.xml` exists and is readable
- Verify `reports/coverage/coverage.json` exists and is readable
- Verify `reports/coverage/coverage_summary.txt` exists and contains coverage data
- Verify `reports/quality/pylint/pylint_report.txt` exists and contains pylint output
- Verify `reports/quality/pylint/pylint_scores.txt` exists and contains score data
- Check `reports/coverage/pytest_output.txt` for any test execution errors or warnings

### Phase 2: Coverage Metrics Extraction

**Read and Parse Coverage Reports:**
- **Parse JSON Coverage**: Read `reports/coverage/coverage.json` to extract:
  - Overall coverage percentages (lines, branches, functions, statements)
  - Per-file coverage breakdown
  - Missing line numbers per file
- **Parse Text Summary**: Read `reports/coverage/coverage_summary.txt` for quick reference metrics
- **Review XML Report**: If needed, reference `reports/coverage/coverage.xml` for detailed line-by-line coverage

**Coverage Metrics to Extract:**
- **Line Coverage**: Percentage of lines executed by tests
- **Branch Coverage**: Percentage of branches (if/else, try/except) tested
- **Function Coverage**: Percentage of functions/methods called
- **Statement Coverage**: Percentage of statements executed
- **Per-File Coverage**: Individual file coverage percentages
- **Missing Coverage**: Identify functions/lines with 0% coverage
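
Extracting these metrics can be sketched as follows, assuming the JSON schema produced by coverage.py (top-level `totals` and `files` keys, each carrying a `percent_covered` field); `summarize_coverage` is a hypothetical helper, not part of the pipeline:

```python
import json

def summarize_coverage(json_path):
    """Extract overall and per-file line coverage from a coverage.py JSON report."""
    with open(json_path) as fh:
        data = json.load(fh)
    # Overall percentage first, then one row per measured file.
    rows = [("TOTAL", data["totals"]["percent_covered"])]
    for filename in sorted(data["files"]):
        rows.append((filename, data["files"][filename]["summary"]["percent_covered"]))
    return rows
```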

### Phase 3: Coverage Report Generation

**Create Coverage Analysis Report:**
Generate `reports/coverage/coverage_report.md` with:
- Overall coverage statistics extracted from JSON/XML reports
- Per-file coverage breakdown from parsed data
- Per-tutorial coverage analysis (matching files to `reports/executed_notebooks.json`)
- Coverage gaps identification (functions with low/no coverage)
- Quality recommendations based on gaps

**Report Template Structure:**
```markdown
# Code Quality & Coverage Report

## Overall Quality Metrics

### Coverage Metrics
- **Line Coverage**: [percentage]%
- **Branch Coverage**: [percentage]%
- **Function Coverage**: [percentage]%
- **Statement Coverage**: [percentage]%

### Code Style Metrics
- **Overall Pylint Score**: [score]/10
- **Average File Score**: [score]/10
- **Total Issues**: [count]
  - Errors: [count]
  - Warnings: [count]
  - Refactor: [count]
  - Convention: [count]

### Combined Quality Score
- **Overall Quality**: [score]/100
  - Coverage: [score]/40
  - Style: [score]/30
  - Test Completeness: [score]/20
  - Structure: [score]/10

## Per-Tutorial Quality Breakdown

### Tutorial: [tutorial_file_name]
- **Tool File**: `src/tools/[tutorial_file_name].py`
- **Line Coverage**: [percentage]%
- **Functions Tested**: [count]/[total]
- **Coverage Status**: [Excellent/Good/Fair/Poor]
- **Pylint Score**: [score]/10
- **Style Status**: [Excellent/Good/Fair/Poor]
- **Issues**: [count] (E:[count] W:[count] R:[count] C:[count])

### Coverage Gaps
- Functions with low/no coverage:
  - `function_name`: [percentage]% coverage
  - ...

### Style Issues
- Top issues for this tutorial:
  - [Issue type]: [description] (in `function_name`)
  - ...

## Quality Recommendations
- [Recommendation based on coverage gaps]
- [Recommendation based on style issues]
- [Suggestions for improving test coverage]
- [Suggestions for improving code style]
```

### Phase 4: Code Style Analysis (Pylint)

**Read and Parse Pylint Reports:**
- **Parse Pylint Report**: Read `reports/quality/pylint/pylint_report.txt` to extract:
  - Overall pylint score (from "Your code has been rated" line)
  - Per-file scores and ratings
  - Issue counts by severity (Error, Warning, Refactor, Convention, Info)
  - Specific issue messages with line numbers
- **Parse Pylint Scores**: Read `reports/quality/pylint/pylint_scores.txt` for quick score reference

**Pylint Metrics to Extract:**
- **Overall Score**: Pylint score (0-10 scale) from report
- **Per-File Scores**: Individual file ratings extracted from report
- **Issue Categories**: Count issues by type (Errors, Warnings, Refactor, Convention, Info)
- **Issue Counts**: Total issues by severity
- **Code Smells**: Identify complexity, design issues, and style violations
- **Most Problematic Files**: Files with lowest scores or most issues
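
Parsing the rating line and tallying issues by severity prefix can be sketched as follows (illustrative; assumes pylint's default text output format, and `parse_pylint_report` is a hypothetical helper):

```python
import re

def parse_pylint_report(report_text):
    """Pull the overall rating and per-severity issue counts from pylint text output."""
    rating = None
    # Matches the "Your code has been rated at X.XX/10" summary line.
    match = re.search(r"rated at (-?[\d.]+)/10", report_text)
    if match:
        rating = float(match.group(1))
    # Issue lines look like "path:line:col: C0114: message (symbol)".
    counts = dict.fromkeys("EWRC", 0)
    for issue in re.finditer(r":\d+:\d+: ([EWRC])\d+:", report_text):
        counts[issue.group(1)] += 1
    return rating, counts
```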

**Generate Pylint Issues Breakdown:**
Create `reports/quality/pylint/pylint_issues.md` with:
- Per-file score breakdown extracted from reports
- Top issues by category (grouped from parsed report)
- Most problematic files (lowest scores, most issues)
- Style recommendations based on common issues found

### Phase 5: Quality Metrics Analysis & Combined Reporting

**Calculate Additional Metrics from Collected Data:**
- **Test-to-Code Ratio**: Count test files in `tests/code/` vs tool files in `src/tools/`
- **Coverage Distribution**: Categorize files from coverage data as <50%, 50-80%, >80% coverage
- **Critical Coverage Gaps**: Identify functions with 0% coverage from coverage JSON/XML
- **Test Completeness**: Count `@tool` decorated functions in `src/tools/` vs tests in `tests/code/`
- **Style Score**: Calculate average pylint score across all files from parsed scores
- **Issue Density**: Calculate issues per file/lines of code from pylint report
- **Quality Distribution**: Categorize files by pylint scores (excellent >9, good 7-9, fair 5-7, poor <5)
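
Counting decorated functions for the test-completeness metric can be sketched with `ast` (illustrative; `count_tool_functions` is a hypothetical helper that matches both the `@<name>_mcp.tool` and `@<name>_mcp.tool(...)` decorator forms):

```python
import ast

def count_tool_functions(source):
    """Count functions decorated with a .tool attribute, e.g. @demo_mcp.tool."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            for dec in node.decorator_list:
                if isinstance(dec, ast.Call):  # unwrap the @x.tool(...) call form
                    dec = dec.func
                if isinstance(dec, ast.Attribute) and dec.attr == "tool":
                    count += 1
    return count
```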

**Generate Combined Quality Score:**
Calculate weighted quality score:
- Coverage metrics (40% weight): Based on overall coverage percentages from JSON
- Code style score (30% weight): Based on average pylint score from parsed scores
- Test completeness score (20% weight): Based on test-to-code ratio and function coverage
- Code structure score (10% weight): Based on issue density and quality distribution
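
The weighting above can be sketched as follows (a hypothetical helper; it assumes coverage, completeness, and structure components arrive as 0-100 percentages and the style component as a 0-10 pylint score):

```python
def combined_quality_score(coverage_pct, pylint_score, completeness_pct, structure_pct):
    """Weighted 0-100 score: coverage 40%, style 30%, completeness 20%, structure 10%."""
    return round(
        coverage_pct / 100 * 40
        + pylint_score / 10 * 30
        + completeness_pct / 100 * 20
        + structure_pct / 100 * 10,
        1,
    )
```

For example, 85% coverage, an 8.5 pylint score, 90% completeness, and 80% structure combine to 85.5/100.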

**Create Combined Quality Report:**
Generate `reports/coverage_and_quality_report.md` with:
- **Overall Quality Metrics**: Combined scores from all sources
- **Per-Tutorial Quality Breakdown**: Match files to tutorials from `executed_notebooks.json`
  - Coverage metrics per tutorial
  - Pylint scores per tutorial
  - Combined quality score per tutorial
- **Quality Assessment**: Overall quality score and component breakdowns
- **Actionable Recommendations**: 
  - Specific coverage gaps to address
  - Style issues to fix
  - Test improvements needed
  - Code structure improvements

---

## Success Criteria & Completion

### Completion Requirements
Use [βœ“] to confirm success and [βœ—] to confirm failure. Provide a one-line reason for success or failure.

- [ ] **Report Validation**: All required coverage and pylint report files exist and are readable
- [ ] **Coverage Metrics Extracted**: Coverage data parsed from JSON/XML/text reports
- [ ] **Coverage Report**: coverage_report.md generated with analysis and recommendations
- [ ] **Pylint Metrics Extracted**: Pylint scores and issues parsed from reports
- [ ] **Pylint Issues Report**: pylint_issues.md with detailed breakdown created
- [ ] **Quality Metrics Calculated**: Additional metrics (ratios, distributions, completeness) computed
- [ ] **Combined Quality Report**: coverage_and_quality_report.md with integrated metrics and analysis
- [ ] **Quality Recommendations**: Actionable recommendations for coverage and style improvements documented

### Consolidated Reporting
Generate final summary of quality analysis:
```
Code Quality & Coverage Analysis Complete

Report Analysis Summary:
- Coverage reports analyzed: [yes/no]
- Pylint reports analyzed: [yes/no]
- Tool files referenced: [count]
- Test files referenced: [count]

Overall Coverage Metrics (from parsed reports):
- Line Coverage: [percentage]% (from coverage.json)
- Branch Coverage: [percentage]% (from coverage.json)
- Function Coverage: [percentage]% (from coverage.json)
- Statement Coverage: [percentage]% (from coverage.json)

Overall Style Metrics (from parsed reports):
- Overall Pylint Score: [score]/10 (from pylint_report.txt)
- Average File Score: [score]/10 (calculated from parsed scores)
- Total Issues: [count] (from parsed report)
  - Errors: [count]
  - Warnings: [count]
  - Refactor suggestions: [count]
  - Convention issues: [count]

Generated Reports:
- Coverage analysis: reports/coverage/coverage_report.md
- Pylint issues: reports/quality/pylint/pylint_issues.md
- Combined quality report: reports/coverage_and_quality_report.md

Quality Assessment:
- Overall Quality Score: [score]/100
  - Coverage: [score]/40
  - Style: [score]/30
  - Test Completeness: [score]/20
  - Structure: [score]/10
- Files with >80% coverage: [count]
- Files with <50% coverage: [count]
- Files with >9.0 pylint score: [count]
- Files with <5.0 pylint score: [count]
- Critical gaps identified: [count]
```

### Error Documentation
For any analysis failures:
- Document missing or unreadable report files
- Document errors parsing coverage JSON/XML reports
- Document errors parsing pylint text reports
- Report missing test files or tool files (for reference/validation)
- Note any issues found in pytest_output.txt that might affect coverage accuracy
- Provide actionable steps for improving coverage based on gaps identified
- Provide actionable steps for improving style based on pylint issues found
- Escalate unrecoverable analysis failures with detailed diagnosis

**Iteration Tracking:**
- **Current analysis attempt**: ___ of 3 maximum
- **Report parsing errors**: ___
- **Metrics calculation errors**: ___
- **Report generation issues**: ___

---

## Guiding Principles for Quality Analysis

### 1. Comprehensive Metrics Collection
- **Multi-Format Reports**: Use all available formats: XML (CI/CD), JSON (automation), HTML (human review), and text (quick reference)
- **Multiple Coverage Types**: Line, branch, function, and statement coverage for complete picture
- **Code Style Analysis**: Pylint scores and issue categorization for style quality
- **Actionable Insights**: Identify specific gaps and provide improvement recommendations

### 2. Quality Assessment
- **Threshold-Based Scoring**: 
  - Coverage: Excellent (>90%), Good (70-90%), Fair (50-70%), Poor (<50%)
  - Style: Excellent (>9.0), Good (7.0-9.0), Fair (5.0-7.0), Poor (<5.0)
- **Combined Quality Score**: Weighted combination of coverage, style, test completeness, and structure
- **Critical Gap Identification**: Flag functions with 0% coverage and files with critical style issues as high-priority
- **Test Completeness**: Verify all decorated functions have corresponding tests
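
The coverage thresholds above can be sketched as follows (illustrative; `coverage_status` is a hypothetical helper, and a pylint analogue would follow the same pattern with the 9.0/7.0/5.0 cut-offs):

```python
def coverage_status(pct):
    """Map a coverage percentage to the report's status labels."""
    if pct > 90:
        return "Excellent"
    if pct >= 70:
        return "Good"
    if pct >= 50:
        return "Fair"
    return "Poor"
```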

### 3. Reporting Standards
- **Human-Readable**: HTML and markdown reports for manual review
- **Machine-Readable**: XML and JSON for automated analysis and CI/CD integration
- **Comparative Analysis**: Per-tutorial breakdown for targeted improvement
- **Actionable Recommendations**: Specific suggestions for improving coverage and style
- **Combined Reports**: Unified quality report integrating coverage and style metrics

### 4. Integration with Workflow
- **Non-Blocking**: Quality analysis doesn't block pipeline execution
- **Quality Gate**: Provides quantitative metrics for code quality assessment
- **Documentation**: Comprehensive reports for review and improvement tracking
- **Style Guidance**: Pylint provides specific, fixable recommendations for code improvement

---

## Environment Requirements
- **Report Files**: Pre-generated coverage and pylint reports must exist in:
  - `reports/coverage/` directory with all coverage report files
  - `reports/quality/pylint/` directory with pylint reports
- **Reference Files**: Access to source code and test files for context:
  - `src/tools/` for understanding tool structure
  - `tests/code/` for understanding test organization
  - `reports/executed_notebooks.json` for tutorial mapping
- **Path Resolution**: Repository-relative paths for all report and reference files
- **File Reading**: Ability to read and parse JSON, XML, and text report formats
'''