File size: 12,115 Bytes
5374a2d
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "b0ea2822",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.read_parquet(\"hf://datasets/qiaojin/PubMedQA/pqa_artificial/train-00000-of-00001.parquet\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "532fb572",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Chronic rhinosinusitis (CRS) is a heterogeneous disease with an uncertain pathogenesis. Group 2 innate lymphoid cells (ILC2s) represent a recently discovered cell population which has been implicated in driving Th2 inflammation in CRS; however, their relationship with clinical disease characteristics has yet to be investigated.\\nThe aim of this study was to identify ILC2s in sinus mucosa in patients with CRS and controls and compare ILC2s across characteristics of disease.\\nA cross-sectional study of patients with CRS undergoing endoscopic sinus surgery was conducted. Sinus mucosal biopsies were obtained during surgery and control tissue from patients undergoing pituitary tumour resection through transphenoidal approach. ILC2s were identified as CD45(+) Lin(-) CD127(+) CD4(-) CD8(-) CRTH2(CD294)(+) CD161(+) cells in single cell suspensions through flow cytometry. ILC2 frequencies, measured as a percentage of CD45(+) cells, were compared across CRS phenotype, endotype, inflammatory CRS subtype and other disease characteristics including blood eosinophils, serum IgE, asthma status and nasal symptom score.\\n35 patients (40% female, age 48 ± 17 years) including 13 with eosinophilic CRS (eCRS), 13 with non-eCRS and 9 controls were recruited. ILC2 frequencies were associated with the presence of nasal polyps (P = 0.002) as well as high tissue eosinophilia (P = 0.004) and eosinophil-dominant CRS (P = 0.001) (Mann-Whitney U). They were also associated with increased blood eosinophilia (P = 0.005). There were no significant associations found between ILC2s and serum total IgE and allergic disease. In the CRS with nasal polyps (CRSwNP) population, ILC2s were increased in patients with co-existing asthma (P = 0.03). ILC2s were also correlated with worsening nasal symptom score in CRS (P = 0.04).'"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "'\\n'.join(df.loc[0]['context']['contexts'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "273a2b36",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_json(\"pubmedqa_train.json\", orient='records', indent=4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bbb50950",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "9be45340",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "df = pd.read_parquet(\"hf://datasets/qiaojin/PubMedQA/pqa_labeled/train-00000-of-00001.parquet\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "77730f3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_json(\"pubmedqa_label.json\", orient='records', indent=4)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "c6471fbd",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>pubid</th>\n",
       "      <th>question</th>\n",
       "      <th>context</th>\n",
       "      <th>long_answer</th>\n",
       "      <th>final_decision</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>21645374</td>\n",
       "      <td>Do mitochondria play a role in remodelling lac...</td>\n",
       "      <td>{'contexts': ['Programmed cell death (PCD) is ...</td>\n",
       "      <td>Results depicted mitochondrial dynamics in viv...</td>\n",
       "      <td>yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>16418930</td>\n",
       "      <td>Landolt C and snellen e acuity: differences in...</td>\n",
       "      <td>{'contexts': ['Assessment of visual acuity dep...</td>\n",
       "      <td>Using the charts described, there was only a s...</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>9488747</td>\n",
       "      <td>Syncope during bathing in infants, a pediatric...</td>\n",
       "      <td>{'contexts': ['Apparent life-threatening event...</td>\n",
       "      <td>\"Aquagenic maladies\" could be a pediatric form...</td>\n",
       "      <td>yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>17208539</td>\n",
       "      <td>Are the long-term results of the transanal pul...</td>\n",
       "      <td>{'contexts': ['The transanal endorectal pull-t...</td>\n",
       "      <td>Our long-term study showed significantly bette...</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>10808977</td>\n",
       "      <td>Can tailored interventions increase mammograph...</td>\n",
       "      <td>{'contexts': ['Telephone counseling and tailor...</td>\n",
       "      <td>The effects of the intervention were most pron...</td>\n",
       "      <td>yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>995</th>\n",
       "      <td>8921484</td>\n",
       "      <td>Does gestational age misclassification explain...</td>\n",
       "      <td>{'contexts': ['After 34 weeks gestation, summa...</td>\n",
       "      <td>Gestational age misclassification is an unlike...</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>996</th>\n",
       "      <td>16564683</td>\n",
       "      <td>Is there any interest to perform ultrasonograp...</td>\n",
       "      <td>{'contexts': ['To evaluate the accuracy of ult...</td>\n",
       "      <td>Sonography has no place in the diagnosis of un...</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>997</th>\n",
       "      <td>23147106</td>\n",
       "      <td>Is peak concentration needed in therapeutic dr...</td>\n",
       "      <td>{'contexts': ['We analyzed the pharmacokinetic...</td>\n",
       "      <td>These results suggest little need to use peak ...</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>998</th>\n",
       "      <td>21550158</td>\n",
       "      <td>Can autologous platelet-rich plasma gel enhanc...</td>\n",
       "      <td>{'contexts': ['This investigation assesses the...</td>\n",
       "      <td>The PRP group recorded reduced pain, swelling,...</td>\n",
       "      <td>yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>999</th>\n",
       "      <td>17559449</td>\n",
       "      <td>Are sugars-free medicines more erosive than su...</td>\n",
       "      <td>{'contexts': ['The reduced use of sugars-conta...</td>\n",
       "      <td>Paediatric SF medicines were not more erosive ...</td>\n",
       "      <td>no</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>1000 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "        pubid                                           question  \\\n",
       "0    21645374  Do mitochondria play a role in remodelling lac...   \n",
       "1    16418930  Landolt C and snellen e acuity: differences in...   \n",
       "2     9488747  Syncope during bathing in infants, a pediatric...   \n",
       "3    17208539  Are the long-term results of the transanal pul...   \n",
       "4    10808977  Can tailored interventions increase mammograph...   \n",
       "..        ...                                                ...   \n",
       "995   8921484  Does gestational age misclassification explain...   \n",
       "996  16564683  Is there any interest to perform ultrasonograp...   \n",
       "997  23147106  Is peak concentration needed in therapeutic dr...   \n",
       "998  21550158  Can autologous platelet-rich plasma gel enhanc...   \n",
       "999  17559449  Are sugars-free medicines more erosive than su...   \n",
       "\n",
       "                                               context  \\\n",
       "0    {'contexts': ['Programmed cell death (PCD) is ...   \n",
       "1    {'contexts': ['Assessment of visual acuity dep...   \n",
       "2    {'contexts': ['Apparent life-threatening event...   \n",
       "3    {'contexts': ['The transanal endorectal pull-t...   \n",
       "4    {'contexts': ['Telephone counseling and tailor...   \n",
       "..                                                 ...   \n",
       "995  {'contexts': ['After 34 weeks gestation, summa...   \n",
       "996  {'contexts': ['To evaluate the accuracy of ult...   \n",
       "997  {'contexts': ['We analyzed the pharmacokinetic...   \n",
       "998  {'contexts': ['This investigation assesses the...   \n",
       "999  {'contexts': ['The reduced use of sugars-conta...   \n",
       "\n",
       "                                           long_answer final_decision  \n",
       "0    Results depicted mitochondrial dynamics in viv...            yes  \n",
       "1    Using the charts described, there was only a s...             no  \n",
       "2    \"Aquagenic maladies\" could be a pediatric form...            yes  \n",
       "3    Our long-term study showed significantly bette...             no  \n",
       "4    The effects of the intervention were most pron...            yes  \n",
       "..                                                 ...            ...  \n",
       "995  Gestational age misclassification is an unlike...             no  \n",
       "996  Sonography has no place in the diagnosis of un...             no  \n",
       "997  These results suggest little need to use peak ...             no  \n",
       "998  The PRP group recorded reduced pain, swelling,...            yes  \n",
       "999  Paediatric SF medicines were not more erosive ...             no  \n",
       "\n",
       "[1000 rows x 5 columns]"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "c0f1ab25",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "/gpfs/radev/pi/ying_rex/tl688/selfevolve/EvoAgentX/examples/pubmedqa\r\n"
     ]
    }
   ],
   "source": [
    "!pwd"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}