Commit dc84292
Parent(s): f3a50f8
Add About section to methodology report addressing prompt hacking
- Reference Kosch & Feger (2025) "Prompt-Hacking: The New p-Hacking?"
- Explain how CatLLM uses standardized prompts for reproducibility
- Address the concern that prompt variability undermines replicability
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- __pycache__/app.cpython-311.pyc +0 -0
- app.py +12 -1
__pycache__/app.cpython-311.pyc
CHANGED
Binary files a/__pycache__/app.cpython-311.pyc and b/__pycache__/app.cpython-311.pyc differ
app.py
CHANGED
@@ -83,11 +83,22 @@ def generate_methodology_report_pdf(categories, model, column_name, num_rows, mo

     story = []

-    # === PAGE 1: Title, Category Mapping ===
+    # === PAGE 1: Title, About, Category Mapping ===
     story.append(Paragraph("CatLLM Methodology Report", title_style))
     story.append(Paragraph(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", normal_style))
     story.append(Spacer(1, 15))

+    # About CatLLM - addressing prompt hacking
+    story.append(Paragraph("About This Report", heading_style))
+    about_text = """This methodology report documents the classification process for reproducibility and transparency. \
+CatLLM addresses an issue identified by researchers in "Prompt-Hacking: The New p-Hacking?" (Kosch & Feger, 2025): \
+researchers could keep modifying prompts to obtain outputs that support desired conclusions, and this variability \
+in pseudo-natural language poses a challenge for reproducibility since each prompt, even if only slightly altered, \
+can yield different outputs, making it impossible to replicate findings reliably. CatLLM restricts the prompt to a \
+standard template that is impartial to the researcher's hypothesis or inclinations, ensuring consistent and reproducible results."""
+    story.append(Paragraph(about_text, normal_style))
+    story.append(Spacer(1, 15))
+
     # Category mapping
     story.append(Paragraph("Category Mapping", heading_style))
     story.append(Paragraph("Each category column contains binary values: 1 = present, 0 = not present", normal_style))
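Note on the `about_text` literal in the app.py hunk: each source line ends with a backslash inside a triple-quoted string. The following is a minimal sketch of how Python treats that construction, using placeholder sentences rather than the report's wording. The variable `multiline` is hypothetical and exists only for contrast.

```python
# Placeholder text; not the wording used in app.py.
about_text = """First sentence of the paragraph. \
Second sentence on a new source line. \
Third sentence."""

# Each backslash-newline pair is dropped from the string,
# so the value contains no newline characters.
print("\n" in about_text)

# Without the trailing backslash, the newline is kept in the string.
multiline = """First sentence.
Second sentence."""
print(multiline.count("\n"))
```

Any leading whitespace on a continuation line becomes part of the string, so indenting those lines to match the surrounding code would change the text passed to `Paragraph`.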