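def inference_prompt_revise_summary(fulltext, ref_summary, generated_summary, version, missing_subclaims):
    """Build the inference-time prompt that asks a model to revise a summary.

    `version` names the target readability level ("easy", "intermediate", or "hard").
    """
    prompt = f"""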
You are a medical summarization model specialized in readability-controlled text revision.

Your task is to improve the **Generated Summary** by adding back the key missing clinical information listed under **Missing Subclaims**, while keeping the readability style defined for the level **{version}**.

Do not copy the reference summary. Keep the revision coherent, concise, and correct.

---

### INPUT

**Full Text (for context):**
{fulltext}

**Reference Summary (for comparison only):**
{ref_summary}

**Generated Summary (to revise):**
{generated_summary}

**Missing Subclaims (to integrate naturally):**
{missing_subclaims}

---

### READABILITY STYLES

- **easy (FH 70–100, grade 5–7):**
  - Short sentences, familiar vocabulary, concrete ideas.
  - Avoid subordinate clauses and medical jargon.
  - Tone: explanatory, simple, and friendly.

- **intermediate (FH 50–69, grade 8–12):**
  - Moderate sentence complexity and domain vocabulary.
  - Clear and structured explanation.

- **hard (FH 0–49, university/professional):**
  - Use specialized terminology, formal and dense phrasing.
  - Include:
    - precise domain vocabulary;
    - causal or analytical connectors (por consiguiente, sin embargo, dado que…);
    - one definition, one process description, and one implication statement if possible;
    - optional subordinate clauses for academic rhythm.

---

### OUTPUT
Return **only the revised summary text**, coherent and medically correct, matching the {version} readability level.
"""
    return prompt



# Synthetic data generation (https://chatgpt.com/c/68f1c138-5a78-8332-8052-eeb65cca1bde)
# --------------------------------

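def generate_revised_summary_prompt(fulltext, ref_summary, generated_summary, version, missing_subclaims):
    """Build the synthetic-data prompt that restores missing subclaims in a summary.

    `version` names the target readability level ("easy", "intermediate", or "hard").
    """
    prompt = f"""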
You are a medical summarization model that revises simplified summaries to restore important missing information
while keeping the same readability level.

---

### INPUT INFORMATION

**Readability Level:** {version}

**Full Medical Text (for context):**
{fulltext}

**Reference Summary (complete clinical version):**
{ref_summary}

**Generated Summary (current version, missing some information):**
{generated_summary}

**Important Subclaims Missing:**
{missing_subclaims}

---

### READABILITY STYLE GUIDE

- **easy (FH 70–100, grade 5–7):**
  - Short sentences, common vocabulary, concrete ideas.
  - Avoid subordinate clauses and technical terms.
  - Tone: explanatory, lively, and accessible.

- **intermediate (FH 50–69, grade 8–12):**
  - Moderate complexity, suitable for high school readers.

- **hard (FH 0–49, university/professional):**
  - Use specialized terminology, formal register, dense information packaging, and long multi-clause sentences.
  - Incorporate:
    - precise domain vocabulary;
    - causal or analytical connectors (por consiguiente, sin embargo, en virtud de, dado que…);
    - at least one definition, one process description, and one statement of implications or challenges;
    - optional parenthetical clarifications or subordinate relative clauses for academic rhythm.

---

### TASK
Revise the **Generated Summary** to make it more complete by integrating all the **Important Subclaims Missing**,
while preserving the tone, fluency, and readability level defined above.

- Do **not** copy the reference summary directly.
- Use your own phrasing consistent with the given readability level.
- Keep it concise, coherent, and medically accurate.
- Do not add new facts not supported by the text.
- Integrate subclaims *naturally* — not as a list.

---

### OUTPUT
Return **only the revised summary text**, with no explanation, notes, or formatting.
"""
    return prompt
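

# Hedged sketch, not part of the original file: both prompt builders above
# interpolate `version` and `missing_subclaims` exactly as given. The helpers
# below are hypothetical additions. They assume `version` is one of the three
# levels named in the style guides and that subclaims arrive as a list of strings.
READABILITY_LEVELS = ("easy", "intermediate", "hard")


def normalize_version(version):
    """Return the level name lowercased and stripped, or raise ValueError if unknown."""
    level = str(version).strip().lower()
    if level not in READABILITY_LEVELS:
        raise ValueError(f"unknown readability level: {version!r}")
    return level


def format_missing_subclaims(subclaims):
    """Render each non-empty subclaim as its own '- item' line."""
    cleaned = [s.strip() for s in subclaims if s and s.strip()]
    return "\n".join(f"- {s}" for s in cleaned)


# Possible usage, assuming the prompt builders defined earlier in this file:
#   prompt = generate_revised_summary_prompt(
#       fulltext,
#       ref_summary,
#       generated_summary,
#       normalize_version("Hard"),
#       format_missing_subclaims(["claim one", "claim two"]),
#   )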