Spaces: amra-ai/studies

Roland Ding committed on Nov 27, 2023

Commit c769f48

1 Parent(s): d1682ed

11.11.27.77

updated as per request 2023-11-23
+ added pre_view table heading and others extension
+ updated the assessment categories to lowercase.
+ added logic to the article revision.
+ split the task generation into separate input_variables and chat_prompt
generation functions.
+ added the reformat instruction (reformat_inst) to application.py to
support the generalized instruction format.

On branch main
Changes to be committed:
modified: .data/instruction_agg_performance.json
modified: application.py
modified: backend_update.ipynb
modified: chains.py
modified: features.py

Files changed (5)

.data/instruction_agg_performance.json +1 -1
application.py +2 -0
backend_update.ipynb +49 -424
chains.py +19 -8
features.py +44 -21

.data/instruction_agg_performance.json CHANGED

@@ -1 +1 @@

- [{"name":"clin-perfFUtable-FIN","inputs":"Change in Probing Depth\nFFI\nGlobal Rating of Change\nGood to Excellent Odom Score\nGrip Strength\nIncision Length\nJOA\nKaplan-Meier\nLikert\nMcCormick Grade\nNDI\nNeer's Grade\nPatient Improvement\nPSS\nPinch Strength\nPocket Depth\nROM\nStride Length\nSWMT\nTime to Fusion\nVAS Change\nVAS Score\nWalking Velocity\nNew Clinical Symptoms","section_sequence":"","assessment":"~~Clinical Performance~~","chain":["Report ALL of the information of the above text by performing the following tasks.\n\nPerform the following tasks:\n1) REPORT each table with the SAME columns as listed.\n\n\nFor Each Table\n1) Report EACH \"table heading\".\n2) Identify if the Table has Groups or NO groups. \n A) If the Table has \"Group\", Report each Group Name as written in the text. \n B) If the Table has No Group, \n 1. Add a Column 0 named: \"Group\". Column 0 will precede Column 1.\n 2. Report: \"No Group\" for each row.\n\n3) Identify if the Table has a \"Preoperative Value (Units)\" or NO \"Preoperative Value (Units)\" Heading. \n A) If the Table has \"Preoperative Value (Units)\", Report each Preoperative Value as written in the text. \n B) If the Table has NO \"Preoperative Value (Units)\" Heading. \n 1. Add a Column 3a named: \"Group\". Column 3a will follow the \"Outcome\" Column\n 2. Report: \"Not Reported\" for each row.\n\n\n4) Report the corresponding Outcome, Values, and Study N for each \"Clinical Outcome\" Table.\n A) Report each VALUE to only One Decimal Point.\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table.\n\n\n\nOverall rules:\n1) Do not exclude any Groups. \n2) Do not exclude any Tables. 
\n3) Do not exclude any Outcomes.\n4) Do not exclude any Values."]},{"name":"rad-perfFUtable-FIN","inputs":"Adjacent Segment ROM\nCarpal Height Ratio\nCervical Lordosis Profeta Method\nChange in Disc Height\nChange in Segmental Lordosis Angle\nCoronal Vertical Axis\nDeltoid Tuberosity Index\nDisc Height\nEngh grading scale\nFatty Atrophy\nForaminal Area Height\nFusion Assessment\nFusion Events-simplified 2\nParallel Pitch Lines\nScrew Placement Accuracy\nSegmental Lordosis Angle","section_sequence":"","assessment":"~~Radiologic Performance~~","chain":["Report ALL of the information of the above text by performing the following tasks.\n\nPerform the following tasks:\n1) REPORT each table with the SAME columns as listed.\n\n\nFor Each Table\n1) Report EACH \"table heading\".\n2) Identify if the Table has Groups or NO groups. \n A) If the Table has \"Group\", Report each Group Name as written in the text. \n B) If the Table has No Group, \n 1. Add a Column 0 named: \"Group\". Column 0 will precede Column 1.\n 2. Report: \"No Group\" for each row.\n\n3) Identify if the Table has a \"Preoperative Value (Units)\" or NO \"Preoperative Value (Units)\" Heading. \n A) If the Table has \"Preoperative Value (Units)\", Report each Preoperative Value as written in the text. \n B) If the Table has NO \"Preoperative Value (Units)\" Heading. \n 1. Add a Column 3a named: \"Group\". Column 3a will follow the \"Outcome\" Column\n 2. Report: \"Not Reported\" for each row.\n\n\n4) Report the corresponding Outcome, Values, and Study N for each \"Radiologic Outcome\" Table.\n A) Report each VALUE to only One Decimal Point.\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table.\n\n\n\nOverall rules:\n1) Do not exclude any Groups. \n2) Do not exclude any Tables. 
\n3) Do not exclude any Outcomes.\n4) Do not exclude any Values."]},{"name":"saf-Futable-FIN","inputs":"abb Nonunion-NG\nGeneric Combo-NG\nMerged Revision\/Reoperation-NG\nSubsidence-NG","section_sequence":"","assessment":"~~Safety~~","chain":["From the above text, extract information to complete the following sections sequentially to create a TABLE. \nCreate ONE table with the following Heading and Columns.\n\nPerform the following tasks\n1) Create the following \"table heading\" as described. \n2) Create the following columns.\n3) Populate each column as described. \n\n\n\n\n\n1) Combine the above Tables. \n2) Create one Table Heading--\"Safety Outcome Summary\"\n\n3) Report EACH Column heading. Only report what is inside of quotes. Remove the Quotation marks and the terms \"Column\" and \"Heading\". \n NOTE: Report ALL the Column Headings in ONE ROW.\n A) REPLACE the first column heading as: \"Adverse Events\".\n B) Report the second column heading as: \"Event N\".\n C) Report the third column heading as: \"Study N\".\n D) Report the fourth column heading as: \"Event %\".\n 1. Report the Event % Value to One Decimal Point.\n Note: Do Not Report any Levels N column or values.\n\n4) Report the corresponding data for each Column Values for each \"Safety Outcome\" Table.\n A) Report all the Adverse Events in the First Column.\n B) Report all the Event N Values in the SAME Column.\n Note: For Duplicate adverse events chose the Highest Event N. Only report one Event N per Adverse Event.\n C) Report all the Study N Values in the SAME Column.\n D) Report all the Event % Values in the SAME Column.\n Note: For Duplicate adverse events chose the Highest Event %. 
Only report one Event % per Adverse Event.\n\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table."]},{"name":"oth-perfFUtable-FIN","inputs":"Blood Loss\nHospital Charge\nLength of Hospital Stay\nNeed for ICU\nOperation Time\nTransfusion Rate\nTime to Readmission","section_sequence":"","assessment":"~~Other Performance~~","chain":["Report ALL of the information of the above text by performing the following tasks.\n\nPerform the following tasks:\n1) REPORT each table with the SAME columns as listed.\n\n\nFor Each Table\n1) Report EACH \"table heading\".\n2) Identify if the Table has Groups or NO groups. \n A) If the Table has \"Group\", Report each Group Name as written in the text. \n B) If the Table has No Group, \n 1. Add a Column 0 named: \"Group\". Column 0 will precede Column 1.\n 2. Report: \"No Group\" for each row.\n\n3) Identify if the Table has NO \"Preoperative Value (Units)\" Heading. \n A) If the Table has NO \"Preoperative Value (Units)\" Heading:\n 1. INSERT a column named: \"Group\". The \"Group\" Column will follow the \"Outcome\" Column.\n 2. Report: \"Not Reported\" for each row.\n\n\n4) Report the corresponding Outcome, Values, and Study N for each \"Other Outcome\" Table.\n A) Report each VALUE to only One Decimal Point.\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table.\n\n\n\nOverall rules:\n1) Do not exclude any Groups. \n2) Do not exclude any Tables. \n3) Do not exclude any Outcomes.\n4) Do not exclude any Values."]}]

+ [{"name":"clin-perfFUtable-FIN","inputs":"Change in Probing Depth\nFFI\nGlobal Rating of Change\nGood to Excellent Odom Score\nGrip Strength\nIncision Length\nJOA\nKaplan-Meier\nLikert\nMcCormick Grade\nNDI\nNeer's Grade\nPatient Improvement\nPSS\nPinch Strength\nPocket Depth\nROM\nStride Length\nSWMT\nTime to Fusion\nVAS Change\nVAS Score\nWalking Velocity\nNew Clinical Symptoms","section_sequence":"","assessment":"clinical","chain":["Report ALL of the information of the above text by performing the following tasks.\n\nPerform the following tasks:\n1) REPORT each table with the SAME columns as listed.\n\n\nFor Each Table\n1) Report EACH \"table heading\".\n2) Identify if the Table has Groups or NO groups. \n A) If the Table has \"Group\", Report each Group Name as written in the text. \n B) If the Table has No Group, \n 1. Add a Column 0 named: \"Group\". Column 0 will precede Column 1.\n 2. Report: \"No Group\" for each row.\n\n3) Identify if the Table has a \"Preoperative Value (Units)\" or NO \"Preoperative Value (Units)\" Heading. \n A) If the Table has \"Preoperative Value (Units)\", Report each Preoperative Value as written in the text. \n B) If the Table has NO \"Preoperative Value (Units)\" Heading. \n 1. Add a Column 3a named: \"Group\". Column 3a will follow the \"Outcome\" Column\n 2. Report: \"Not Reported\" for each row.\n\n\n4) Report the corresponding Outcome, Values, and Study N for each \"Clinical Outcome\" Table.\n A) Report each VALUE to only One Decimal Point.\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table.\n\n\n\nOverall rules:\n1) Do not exclude any Groups. \n2) Do not exclude any Tables. 
\n3) Do not exclude any Outcomes.\n4) Do not exclude any Values."]},{"name":"rad-perfFUtable-FIN","inputs":"Adjacent Segment ROM\nCarpal Height Ratio\nCervical Lordosis Profeta Method\nChange in Disc Height\nChange in Segmental Lordosis Angle\nCoronal Vertical Axis\nDeltoid Tuberosity Index\nDisc Height\nEngh grading scale\nFatty Atrophy\nForaminal Area Height\nFusion Assessment\nFusion Events-simplified 2\nParallel Pitch Lines\nScrew Placement Accuracy\nSegmental Lordosis Angle","section_sequence":"","assessment":"radiologic","chain":["Report ALL of the information of the above text by performing the following tasks.\n\nPerform the following tasks:\n1) REPORT each table with the SAME columns as listed.\n\n\nFor Each Table\n1) Report EACH \"table heading\".\n2) Identify if the Table has Groups or NO groups. \n A) If the Table has \"Group\", Report each Group Name as written in the text. \n B) If the Table has No Group, \n 1. Add a Column 0 named: \"Group\". Column 0 will precede Column 1.\n 2. Report: \"No Group\" for each row.\n\n3) Identify if the Table has a \"Preoperative Value (Units)\" or NO \"Preoperative Value (Units)\" Heading. \n A) If the Table has \"Preoperative Value (Units)\", Report each Preoperative Value as written in the text. \n B) If the Table has NO \"Preoperative Value (Units)\" Heading. \n 1. Add a Column 3a named: \"Group\". Column 3a will follow the \"Outcome\" Column\n 2. Report: \"Not Reported\" for each row.\n\n\n4) Report the corresponding Outcome, Values, and Study N for each \"Radiologic Outcome\" Table.\n A) Report each VALUE to only One Decimal Point.\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table.\n\n\n\nOverall rules:\n1) Do not exclude any Groups. \n2) Do not exclude any Tables. 
\n3) Do not exclude any Outcomes.\n4) Do not exclude any Values."]},{"name":"saf-Futable-FIN","inputs":"abb Nonunion-NG\nGeneric Combo-NG\nMerged Revision\/Reoperation-NG\nSubsidence-NG","section_sequence":"","assessment":"safety","chain":["From the above text, extract information to complete the following sections sequentially to create a TABLE. \nCreate ONE table with the following Heading and Columns.\n\nPerform the following tasks\n1) Create the following \"table heading\" as described. \n2) Create the following columns.\n3) Populate each column as described. \n\n\n\n\n\n1) Combine the above Tables. \n2) Create one Table Heading--\"Safety Outcome Summary\"\n\n3) Report EACH Column heading. Only report what is inside of quotes. Remove the Quotation marks and the terms \"Column\" and \"Heading\". \n NOTE: Report ALL the Column Headings in ONE ROW.\n A) REPLACE the first column heading as: \"Adverse Events\".\n B) Report the second column heading as: \"Event N\".\n C) Report the third column heading as: \"Study N\".\n D) Report the fourth column heading as: \"Event %\".\n 1. Report the Event % Value to One Decimal Point.\n Note: Do Not Report any Levels N column or values.\n\n4) Report the corresponding data for each Column Values for each \"Safety Outcome\" Table.\n A) Report all the Adverse Events in the First Column.\n B) Report all the Event N Values in the SAME Column.\n Note: For Duplicate adverse events chose the Highest Event N. Only report one Event N per Adverse Event.\n C) Report all the Study N Values in the SAME Column.\n D) Report all the Event % Values in the SAME Column.\n Note: For Duplicate adverse events chose the Highest Event %. 
Only report one Event % per Adverse Event.\n\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table."]},{"name":"oth-perfFUtable-FIN","inputs":"Blood Loss\nHospital Charge\nLength of Hospital Stay\nNeed for ICU\nOperation Time\nTransfusion Rate\nTime to Readmission","section_sequence":"","assessment":"other","chain":["Report ALL of the information of the above text by performing the following tasks.\n\nPerform the following tasks:\n1) REPORT each table with the SAME columns as listed.\n\n\nFor Each Table\n1) Report EACH \"table heading\".\n2) Identify if the Table has Groups or NO groups. \n A) If the Table has \"Group\", Report each Group Name as written in the text. \n B) If the Table has No Group, \n 1. Add a Column 0 named: \"Group\". Column 0 will precede Column 1.\n 2. Report: \"No Group\" for each row.\n\n3) Identify if the Table has NO \"Preoperative Value (Units)\" Heading. \n A) If the Table has NO \"Preoperative Value (Units)\" Heading:\n 1. INSERT a column named: \"Group\". The \"Group\" Column will follow the \"Outcome\" Column.\n 2. Report: \"Not Reported\" for each row.\n\n\n4) Report the corresponding Outcome, Values, and Study N for each \"Other Outcome\" Table.\n A) Report each VALUE to only One Decimal Point.\n\n5) IF the table is Vertical, Transpose each VERTICAL table to a HORIZONTAL table.\n\n\n\nOverall rules:\n1) Do not exclude any Groups. \n2) Do not exclude any Tables. \n3) Do not exclude any Outcomes.\n4) Do not exclude any Values."]}]

application.py CHANGED

@@ -52,6 +52,8 @@ tables_inst=[
     f"include all table titles."
 ]
 article_prompts = {
     "Authors": '''extract all of the authors of the article from the above text.\n
     Return the results on the same line separated by commas as Authors: Author A, Author B...

     f"include all table titles."
 ]
+reformat_inst = f"reformat the returned tables into a markdown table syntax if applicable."
 article_prompts = {
     "Authors": '''extract all of the authors of the article from the above text.\n
     Return the results on the same line separated by commas as Authors: Author A, Author B...

backend_update.ipynb CHANGED

@@ -25,7 +25,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -68,7 +68,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 56,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -234,16 +234,16 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 236,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "dict_keys(['Checklist', 'Revision', 'Overview', 'Overview Command', 'Overview Summary Com.', 'Clinical', 'Clinical Command', 'Clin Summary Com', 'Radiologic', 'Radiologic Command', 'Rad Summary Com', 'Safety', 'Safety Command', 'Safety Summary', 'Safety Summary Com', 'Other', 'Other Command', 'Oth Summary Com', 'Units', 'Notes'])"
       ]
      },
-     "execution_count": 236,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -251,7 +251,7 @@
    "source": [
     "import pandas as pd\n",
     "\n",
-    "raw = pd.read_excel(\"./.data/AMRA AI Reader_Search Term_V14h.xlsx\",sheet_name=None)\n",
     "raw.keys()"
    ]
   },
@@ -259,450 +259,75 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Processing Overview sheet"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 122,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Checklist\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Roland CheckList',\n",
-       " 'Roland to Confirm',\n",
-       " 'Unnamed: 2',\n",
-       " 'Unnamed: 3',\n",
-       " 'Unnamed: 4',\n",
-       " 'Unnamed: 5',\n",
-       " 'Unnamed: 6',\n",
-       " 'Unnamed: 7',\n",
-       " 'Unnamed: 8',\n",
-       " 'Unnamed: 9',\n",
-       " 'Unnamed: 10',\n",
-       " 'Unnamed: 11',\n",
-       " 'Unnamed: 12',\n",
-       " 'Unnamed: 13',\n",
-       " 'Unnamed: 14',\n",
-       " 'Unnamed: 15']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Revision\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Revision History', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Overview\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Assessment Step',\n",
-       " 'Spine',\n",
-       " 'Extermity',\n",
-       " 'Notes for Roland',\n",
-       " 'Unnamed: 4',\n",
-       " 'Unnamed: 5',\n",
-       " 'Unnamed: 6',\n",
-       " 'Unnamed: 7',\n",
-       " 'Unnamed: 8',\n",
-       " 'Unnamed: 9',\n",
-       " 'Notes']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Overview Command\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name', 'inputs', 'instruction', 'Command Depends on IFU']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Overview Summary Com.\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name', 'inputs', 'instruction']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Clinical\n"
      ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Assessment Step',\n",
-       " 'Region',\n",
-       " 'search_term',\n",
-       " 'Command Term',\n",
-       " 'KeyClinPerfOutcome',\n",
-       " 'Command Template with Preop with Group']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Clinical Command\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name',\n",
-       " 'inputs',\n",
-       " 'Command Template with Preop with Group',\n",
-       " 'Command Template with Preop without Group',\n",
-       " 'Command Template without Preop with Group',\n",
-       " 'Command Template without Preop without Group',\n",
-       " 'root',\n",
-       " 'group',\n",
-       " 'outcome',\n",
-       " 'preoperative',\n",
-       " 'postoperative',\n",
-       " 'study_n']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Clin Summary Com\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name', 'inputs', 'search_term', 'instruction', 'Unnamed: 4']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Radiologic\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Assessment Step',\n",
-       " 'Region',\n",
-       " 'search_term',\n",
-       " 'Command Term',\n",
-       " 'KeyRadPerfOutcome',\n",
-       " 'Command Template with Preop with Group']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Radiologic Command\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name',\n",
-       " 'inputs',\n",
-       " 'Command Template with Preop with Group',\n",
-       " 'Command Template with Preop without Group',\n",
-       " 'Command Template without Preop with Group',\n",
-       " 'Command Template without Preop without Group',\n",
-       " 'root',\n",
-       " 'group',\n",
-       " 'outcome',\n",
-       " 'preoperative',\n",
-       " 'postoperative',\n",
-       " 'study_n']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Rad Summary Com\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name', 'inputs', 'search_term', 'Summary Command Template', 'Unnamed: 4']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Safety\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Assessment Step',\n",
-       " 'Region',\n",
-       " 'search_term',\n",
-       " 'Command Term',\n",
-       " 'Clinical Study Table Term',\n",
-       " 'Summary Table Term',\n",
-       " 'Command Template',\n",
-       " 'Input']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Safety Command\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name',\n",
-       " 'inputs',\n",
-       " 'instruction',\n",
-       " 'root',\n",
-       " 'group',\n",
-       " 'outcome',\n",
-       " 'preoperative',\n",
-       " 'postoperative',\n",
-       " 'study_n']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Safety Summary\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Assessment Step',\n",
-       " 'Region',\n",
-       " 'search_term',\n",
-       " 'Summary Replacement Term',\n",
-       " 'Risk Replacement Term']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Safety Summary Com\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name', 'inputs', 'instruction']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Other\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Assessment Step',\n",
-       " 'Region',\n",
-       " 'search_term',\n",
-       " 'Command Term',\n",
-       " 'OtherOutcome',\n",
-       " 'Command Template with Preoperative with Group']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Other Command\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name',\n",
-       " 'inputs',\n",
-       " 'Command Template with Preop with Group',\n",
-       " 'Command Template with Preop without Group',\n",
-       " 'Command Template without Preop with Group',\n",
-       " 'Command Template without Preop without Group',\n",
-       " 'root',\n",
-       " 'group',\n",
-       " 'outcome',\n",
-       " 'preoperative',\n",
-       " 'postoperative',\n",
-       " 'study_n']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Oth Summary Com\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['name', 'inputs', 'search_term', 'instruction', 'Unnamed: 4']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Units\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "['Units']"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
-    },
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
-      "Notes\n"
      ]
     },
     {
      "data": {
       "text/plain": [
-       "['Unnamed: 0',\n",
-       " 'Unnamed: 1',\n",
-       " 'Unnamed: 2',\n",
-       " 'Unnamed: 3',\n",
-       " 'Unnamed: 4',\n",
-       " 'Unnamed: 5',\n",
-       " 'Unnamed: 6',\n",
-       " 'Unnamed: 7',\n",
-       " 'Unnamed: 8',\n",
-       " 'Unnamed: 9',\n",
-       " 'Unnamed: 10',\n",
-       " 'Unnamed: 11',\n",
-       " 'Unnamed: 12',\n",
-       " 'Unnamed: 13',\n",
-       " 'Unnamed: 14',\n",
-       " 'Unnamed: 15',\n",
-       " 'Unnamed: 16',\n",
-       " 'Unnamed: 17',\n",
-       " 'Unnamed: 18',\n",
-       " '\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nGroup\\tN Fusion Events n (%)\\tN Nonunion Events n (%)\\tStudy N\\nNo Group\\t35 (94.5%)\\t2 (5.5%)\\t\\t32\\nOverall Fusion Rate (94.5%)\\tOverall Nonunion Rate (5.5%)\\n\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nGroup\\tN Fusion Events n (%)\\tN Nonunion Events n (%)\\tStudy N\\nGroup A-1\\t16 (94.1%)\\t1 (5.9%)\\t\\t17\\nGroup A-2\\t14 (66.7%)\\t7 (33.3%)\\t\\t21\\nGroup B-1\\t18 (100%)\\t0 (0%)\\t\\t\\t18\\nGroup B-2\\t21 (95.5%)\\t1 (4.5%)\\t\\t22\\nOverall\\t69 (78.9%)\\t9 (21.1%)\\t\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nGroup\\tN Fusion Events n (%)\\tN Nonunion Events n (%)\\tStudy N\\nNo Group\\t18 (72%)\\t6 (24%)\\t\\t25\\nOverall Fusion Rate (72%)  Overall Nonunion Rate (28%)\\t\\t25\\n\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nGroup\\tN Fusion Events n (%)\\tN Nonunion Events n (%)\\tStudy N\\nTitanium\\t32/37 (86.5%)\\t5/37 (13.5%)\\t53\\nPEEK\\t34/34 (100%)\\t0/34 (0%)\\t53\\nOverall\\t66/71 (93.0%)\\t5/71 (7.0%)\\t\\t\\t\\n\\n\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nGroup\\tN Fusion Events n (%)\\tN Nonunion Events n (%)\\tStudy N\\nGroup A\\t27/28 (96.43%)\\t1/28 (3.57%)\\t28\\nGroup B\\t25/26 (96.15%)\\t1/26 (3.85%)\\t26\\nOverall\\t52/54 (96.30%)\\t2/54 (3.70%)\\t54\\n\\n\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nN Fusion Events n (%)\\tN Nonunion Events n (%)\\tStudy N\\nGroup A\\t27/28 (96.43%)\\t1/28 (3.57%)\\t28\\nGroup B\\t25/26 (96.15%)\\t1/26 (3.85%)\\t26\\nOverall\\t52/54 (96.30%)\\t2/54 (3.70%)\\t54\\n\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nFusion\\t\\tN Fusion Events n (%)\\tN Nonunion Events n (%)\\tStudy N\\nGrade I\\t\\t1 (3.3%)\\t0 (0%)\\t\\t\\t\\nGrade II\\t\\t4 (13.3%)\\t0 (0%)\\t\\t\\nGrade III\\t25 (83.3%)\\t0 (0%)\\t\\t\\nOverall\\t\\t30 (100%)\\t0 (0%)\\t30\\n\\n\\n\\n\\n\\nTable Heading--Radiologic Outcome-Fusion Events\\nGroup\\tN Fusion Events n 
(%)\\tN Nonunion Events n (%)\\tStudy N\\nNo Group\\t41 (97%)\\t1 (3%)\\t\\t31\\nOverall Fusion Rate (97%)\\tOverall Nonunion Rate (3%)\\t\\t']"
       ]
      },
      "metadata": {},
-     "output_type": "display_data"
     }
    ],
    "source": [
-    "for k in raw.keys():\n",
-    "    print(k)\n",
-    "    display(list(raw[k].columns))"
    ]
   },
   {

   },
   {
    "cell_type": "code",
+   "execution_count": 1,
    "metadata": {},
    "outputs": [],
    "source": [
   },
   {
    "cell_type": "code",
+   "execution_count": 5,
    "metadata": {},
    "outputs": [],
    "source": [
   },
   {
    "cell_type": "code",
+   "execution_count": 28,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
+       "dict_keys(['Checklist', 'Revision', 'terms', 'prompts', 'prompt_agg', 'prompt_sum', 'Overview', 'Overview Command', 'Overview Summary Com.', 'Clinical', 'Clinical Command', 'Clin Summary Com', 'Radiologic', 'Radiologic Command', 'parser_safety', 'Units', 'Notes', 'Rad Summary Com', 'Safety', 'Safety Command', 'Safety Summary Com', 'Other', 'Other Command', 'Oth Summary Com'])"
       ]
      },
+     "execution_count": 28,
      "metadata": {},
      "output_type": "execute_result"
     }
    "source": [
     "import pandas as pd\n",
     "\n",
+    "raw = pd.read_excel(\"./.data/AMRA AI Reader_Search Term_V14l.xlsx\",sheet_name=None)\n",
     "raw.keys()"
    ]
   },
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "#### Reduced processing"
    ]
   },
   {
    "cell_type": "code",
+   "execution_count": 18,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
+      "(3494, 7) (85, 10)\n"
      ]
+    }
+   ],
+   "source": [
+    "df_terms = raw[\"terms\"]\n",
+    "df_terms.fillna(\"\",inplace=True)\n",
+    "\n",
+    "# create a name for rows with assessment as overview and terms with the format of \"overview - {instruction}\"\n",
+    "df_terms.loc[df_terms.assessment == \"overview\",\"indication_terms\"] = df_terms.loc[df_terms.assessment == \"overview\",\"instruction\"].apply(lambda x: f\"overview - {x}\")\n",
+    "df_terms.instruction = df_terms.instruction.apply(lambda x: [t.strip() for t in x.split(\",\")])\n",
+    "\n",
+    "df_local_to_aws_dynamodb(\"terms\",df_terms)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "metadata": {},
+   "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
+      "85/85\n",
+      "upload erros count: 0\n"
      ]
     },
     {
      "data": {
       "text/plain": [
+       "[]"
       ]
      },
+     "execution_count": 29,
      "metadata": {},
+     "output_type": "execute_result"
     }
    ],
    "source": [
+    "df_prompt = raw[\"prompts\"]\n",
+    "df_prompt.fillna(\"\",inplace=True)\n",
+    "\n",
+    "df_prompt[\"chain\"] = df_prompt.apply(lambda x: [x[\"root\"],x[\"group\"],x[\"outcome\"],x[\"preoperative\"],x[\"postoperative\"],x[\"study_n\"]],axis=1)\n",
+    "df_prompt.loc[df_prompt.assessment == \"overview\",\"chain\"] = df_prompt.loc[df_prompt.assessment == \"overview\",\"chain\"].apply(lambda x: [i for i in x if i])\n",
+    "df_prompt.inputs = df_prompt.inputs.apply(lambda x: [i.strip() for i in x.split(\",\")])\n",
+    "\n",
+    "df_prompt.drop([\"root\",\"group\",\"outcome\",\"preoperative\",\"postoperative\",\"study_n\"],axis=1,inplace=True)\n",
+    "\n",
+    "df_local_to_aws_dynamodb(\"prompts\",df_prompt)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Processing Overview sheet"
    ]
   },
   {
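
Note on the new "prompts" cell above: it folds the six step columns of each sheet row into a single `chain` list, drops empty steps for overview rows only, and splits `inputs` on commas. A minimal plain-Python sketch of that per-row transformation follows. It assumes each row is a dict of strings (as after `fillna("")`); the function name `build_prompt_record` and the sample row are hypothetical and do not appear in the commit.

```python
# Column names taken from the notebook cell; everything else is illustrative.
CHAIN_COLUMNS = ["root", "group", "outcome", "preoperative", "postoperative", "study_n"]


def build_prompt_record(row):
    """Collapse the six step columns of one sheet row into a single 'chain' list."""
    chain = [row.get(col, "") for col in CHAIN_COLUMNS]
    if row.get("assessment") == "overview":
        # Overview rows keep only the steps that are actually filled in.
        chain = [step for step in chain if step]
    # Drop the six source columns, mirroring the df_prompt.drop(...) call.
    record = {k: v for k, v in row.items() if k not in CHAIN_COLUMNS}
    record["chain"] = chain
    # "inputs" is a comma-separated string in the sheet; split and trim it.
    record["inputs"] = [part.strip() for part in row.get("inputs", "").split(",")]
    return record


# Hypothetical sample row, not taken from the spreadsheet.
sample_row = {
    "name": "demo-overview",
    "assessment": "overview",
    "inputs": "Abstract, Methods",
    "root": "Summarize the study design.",
    "group": "",
    "outcome": "List the reported outcomes.",
    "preoperative": "",
    "postoperative": "",
    "study_n": "",
}
sample_record = build_prompt_record(sample_row)
```

For non-overview rows the chain keeps all six positions, including empty strings, so a step index stays aligned with its column.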

chains.py CHANGED

@@ -67,18 +67,30 @@ post_prompt_maping = {}
 post_replace_term = lambda res,map=post_prompt_maping:replace_term(res,map=map)
 def gen_task(prompt,article):
     input_text = "".join([article[s] for s in prompt["inputs"]])
     messages = [
         ("system","You are a helpful AI that can answer questions about clinical trail and operation studies."),
         ("human",input_text)
     ]
-    for msg in prompt["chain"]:
-        messages.append(("human",msg))
-    chat_prompt = ChatPromptTemplate.from_messages(messages=messages)
-    chain = chat_prompt | llm | post_replace_term
     input_variables = {}
     if "term" in prompt:
         app_data["current"]["term"] = prompt["term"][0]["prompting_term"]
@@ -86,9 +98,8 @@ def gen_task(prompt,article):
     if "n_col" in chat_prompt.input_variables:
         input_variables["n_col"] = len(prompt["chain"])
     if "term" in chat_prompt.input_variables:
-        input_variables["term"] = app_data["current"]["term"] # get the first term's prompting term
-    return async_generate(article=article,name=prompt["name"],chain=chain,input_variables=input_variables)
 if __name__ == "__main__":
     # lets try the Blood Loss, Operation Time, and Need for ICU in other folder

 post_replace_term = lambda res,map=post_prompt_maping:replace_term(res,map=map)
 def gen_task(prompt,article):
+    chat_prompt = gen_chat_prompt(prompt,article)
+    chain = chat_prompt | llm | post_replace_term
+    input_variables = gen_input_variables(chat_prompt,prompt)
+    return async_generate(article=article,name=prompt["name"],chain=chain,input_variables=input_variables)
+def gen_chat_prompt(prompt,article):
     input_text = "".join([article[s] for s in prompt["inputs"]])
     messages = [
         ("system","You are a helpful AI that can answer questions about clinical trial and operation studies."),
         ("human",input_text)
     ]
+    if len(prompt["chain"]) > 1:
+        for i in article["logic"]["chain id"]:
+            messages.append(("human",prompt["chain"][i]))
+    else:
+        messages.append(("human",prompt["chain"][0]))
+    messages.append(("system",reformat_inst))
+    return ChatPromptTemplate.from_messages(messages=messages)
+def gen_input_variables(chat_prompt,prompt):
     input_variables = {}
     if "term" in prompt:
         app_data["current"]["term"] = prompt["term"][0]["prompting_term"]
     if "n_col" in chat_prompt.input_variables:
         input_variables["n_col"] = len(prompt["chain"])
     if "term" in chat_prompt.input_variables:
+        input_variables["term"] = app_data["current"]["term"]
+    return input_variables
 if __name__ == "__main__":
     # lets try the Blood Loss, Operation Time, and Need for ICU in other folder
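
Note on the chain selection above: the notebook cell builds each prompt's "chain" in the order root, group, outcome, preoperative, postoperative, study_n, and gen_chat_prompt appends only the entries whose index appears in article["logic"]["chain id"]. A minimal sketch of that interaction follows, using plain dicts and no LangChain objects. The helper name select_chain_messages and the sample strings are hypothetical and exist only for illustration; update_logic is copied in simplified form from features.py.

```python
def update_logic(article):
    """Decide which chain steps apply, based on keyword counts in key_content."""
    text = article["key_content"].lower()
    logic = {
        "group": text.count("group") >= 3,
        "preoperative": text.count("preoperative") >= 2,
        "chain id": list(range(6)),
    }
    # index 1 is the "group" step, index 3 is the "preoperative" step
    if not logic["group"]:
        logic["chain id"].remove(1)
    if not logic["preoperative"]:
        logic["chain id"].remove(3)
    article["logic"] = logic


def select_chain_messages(prompt, article):
    """Hypothetical helper: the human messages gen_chat_prompt would append."""
    chain = prompt["chain"]
    if len(chain) > 1:
        return [chain[i] for i in article["logic"]["chain id"]]
    return [chain[0]]


# Sample data (made up): three mentions of "group", none of "preoperative".
article = {"key_content": "Group A, group B and group C were compared at final follow-up."}
prompt = {"chain": ["root", "group", "outcome", "preoperative", "postoperative", "study_n"]}

update_logic(article)
selected = select_chain_messages(prompt, article)
print(article["logic"]["chain id"])
print(selected)
```

One thing that may be worth checking: overview chains have their empty entries filtered out in the notebook, so a chain with between two and five entries would be indexed with ids up to 5. Whether that case occurs depends on the data, which this diff does not show.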

features.py CHANGED Viewed

@@ -5,9 +5,9 @@ from datetime import datetime
 import gradio as gr
 import asyncio
-from langchain.llms import OpenAI
-from langchain.prompts import PromptTemplate
-from langchain.chains import LLMChain
 # internal packages
 from chains import *
@@ -153,7 +153,7 @@ def update_article_segment(article):
         "key_content":          article["Abstract"] + article["Material and Methods"] + article["Results"],
         })
     # add the recognized logic to the article
-    # article.update(identify_logic(article["key_content"]))
     # one thing to notice here, due to the fact that update_article_segment function perform direct change on the article object,
     # there is no need to re-assign the article object to the same variable name
@@ -207,24 +207,32 @@ def refresh():
 @terminal_print
 def create_overview(article):
-    # md_text = ""
-    assessment = "Overview"
     md_text = f"## Overview\n\n"
-    overview_components = article["extraction"][assessment]
-    for component in overview_components:
-        md_text += f"#### {assessment} - {component}\n\n"
-        if component in article:
-            md_text += article[component] + "\n\n"
-        else:
-            md_text += "No content found\n\n"
-        # md_text += article[component] + "\n\n"
     return gr.update(value=md_text)
 @terminal_print
 def create_detail_views(article):
     md_text = "## Performance\n\n"
-    assessments = ["Clinical Performance","Radiologic Performance","Safety","Other Performance"]
     performance_tables = ["clin-perfFUtable-FIN","rad-perfFUtable-FIN","saf-Futable-FIN","oth-perfFUtable-FIN"]
     # add performance
@@ -448,13 +456,17 @@ def select_overview_prompts(article):
     valid_prompts = set()
     for t in app_data["terms"]:
         # select overview prompts
-        if validate_term(article,t,"Overview"):
             # add the prompts to the memory
             valid_prompts.update(t["instruction"])
     sorted_prompts = sorted(valid_prompts,key=lambda prompt:app_data["prompts"][prompt]["section_sequence"])
-    article["extraction"]["Overview"] = sorted_prompts
     return {p:app_data["prompts"][p] for p in valid_prompts}
 @terminal_print
 def select_performance_prompts(article,performance_assessment):
     valid_terms = []
@@ -484,6 +496,17 @@ def select_performance_prompts(article,performance_assessment):
     # print("valid performance prompts: ",valid_prompts)
     return valid_prompts
 @terminal_print
 def process_prompts(article): # function overly complicated. need to be simplified.
     '''
@@ -508,7 +531,7 @@ def process_prompts(article): # function overly complicated. need to be simplifi
     article["extraction"] = {}
     overview_prompts = select_overview_prompts(article)
-    performance_assessments = ["Clinical Performance","Radiologic Performance","Safety","Other Performance"]
     performance_prompts = {}
     for assessment in performance_assessments:
@@ -528,7 +551,7 @@ def validate_term(article,term,assessment):
     if term["region"].lower() != "all" and term["region"].lower() != article["domain"].lower():
         return False
-    if assessment == "Overview" and term["assessment"] == "Overview":
         return True
     # validate if the term is used for overview
@@ -559,7 +582,7 @@ def keyword_search(keywords,full_text):
 def post_process(article):
     post_inputs = {}
     for assessment,segements in article["extraction"].items():
-        if assessment == "Overview":
             continue
         post_inputs[assessment] = "\n".join([article[s] for s in segements])

 import gradio as gr
 import asyncio
+# from langchain.llms import OpenAI
+# from langchain.prompts import PromptTemplate
+# from langchain.chains import LLMChain
 # internal packages
 from chains import *
         "key_content":          article["Abstract"] + article["Material and Methods"] + article["Results"],
         })
     # add the recognized logic to the article
+    update_logic(article)
     # one thing to notice here, due to the fact that update_article_segment function perform direct change on the article object,
     # there is no need to re-assign the article object to the same variable name
 @terminal_print
 def create_overview(article):
     md_text = f"## Overview\n\n"
+    overview_components = article["extraction"]["overview"]
+    for component in overview_components: # command name removed
+        md_text += article[component] + "\n\n" if component in article else "no content found\n\n"
     return gr.update(value=md_text)
+def pre_view(content):
+    if "Table Heading" in content: # remove table heading
+        content = content.replace("Table Heading","")
+    # remove the line with the article id
+    text = content.split("\n")
+    text = [c for c in text if "article id" not in c.lower()]
+    # get the first line and only keep the alphanumeric characters
+    text[0] = "###" + "".join([c for c in text[0] if c.isalnum()])
+    return "\n".join(text).replace('"', '')
 @terminal_print
 def create_detail_views(article):
     md_text = "## Performance\n\n"
+    assessments = ["clinical","radiologic","safety","other"]
     performance_tables = ["clin-perfFUtable-FIN","rad-perfFUtable-FIN","saf-Futable-FIN","oth-perfFUtable-FIN"]
     # add performance
     valid_prompts = set()
     for t in app_data["terms"]:
         # select overview prompts
+        if validate_term(article,t,"overview"):
             # add the prompts to the memory
             valid_prompts.update(t["instruction"])
+    print(valid_prompts)
     sorted_prompts = sorted(valid_prompts,key=lambda prompt:app_data["prompts"][prompt]["section_sequence"])
+    article["extraction"]["overview"] = sorted_prompts
     return {p:app_data["prompts"][p] for p in valid_prompts}
 @terminal_print
 def select_performance_prompts(article,performance_assessment):
     valid_terms = []
     # print("valid performance prompts: ",valid_prompts)
     return valid_prompts
+def update_logic(article):
+    article["logic"] = {
+        "group":article["key_content"].lower().count("group")>=3,
+        "preoperative":article["key_content"].lower().count("preoperative")>=2,
+        "chain id":list(range(6))
+    }
+    if not article["logic"]["group"]:
+        article["logic"]["chain id"].remove(1)
+    if not article["logic"]["preoperative"]:
+        article["logic"]["chain id"].remove(3)
 @terminal_print
 def process_prompts(article): # function overly complicated. need to be simplified.
     '''
     article["extraction"] = {}
     overview_prompts = select_overview_prompts(article)
+    performance_assessments = ["clinical","radiologic","safety","other"]
     performance_prompts = {}
     for assessment in performance_assessments:
     if term["region"].lower() != "all" and term["region"].lower() != article["domain"].lower():
         return False
+    if assessment == "overview" and term["assessment"] == "overview":
         return True
     # validate if the term is used for overview
 def post_process(article):
     post_inputs = {}
     for assessment,segements in article["extraction"].items():
+        if assessment == "overview":
             continue
         post_inputs[assessment] = "\n".join([article[s] for s in segements])
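
Note on pre_view: the function is meant to strip the "Table Heading" label, drop any line mentioning the article id, turn the first remaining line into a "###" heading made of alphanumeric characters only, and remove double quotes. A self-contained sketch of that behaviour is below, assuming content arrives as a single newline-separated string and is split only once. The sample input is made up for illustration.

```python
def pre_view(content):
    """Clean a model response before display (sketch of the intended behaviour)."""
    # remove the "Table Heading" label
    content = content.replace("Table Heading", "")
    # split once, then drop any line that mentions the article id
    text = [line for line in content.split("\n") if "article id" not in line.lower()]
    # keep only alphanumeric characters in the first line and mark it as a heading
    text[0] = "###" + "".join(c for c in text[0] if c.isalnum())
    return "\n".join(text).replace('"', "")


sample = 'Table Heading: "VAS Score"\nArticle ID: 123\n| Group | Outcome |\n'
cleaned = pre_view(sample)
print(cleaned)
```

Because the first line keeps alphanumeric characters only, spaces inside the heading are removed as well ("VAS Score" becomes "VASScore"), and no space follows "###". If the heading is expected to render as Markdown, that may need a separate look.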