Update app.py
app.py
CHANGED
@@ -62,7 +62,28 @@ def fetch_file_from_s3_file(file_key):
 
 # Function to summarize text using OpenAI GPT
 def summarize_text(text):
-    system_prompt = "You are
+    system_prompt = """You are an assistant designed to summarize extracted OCR text in a JSON format. Follow these instructions:
+
+1.Extract Header and Line Item Information:
+
+Identify the header and line items in the OCR text.
+Only provide the highest-level headers, with corresponding line items underneath them if applicable. Do not nest the data.
+
+2.Multiple Values:
+
+If there are multiple values for a particular header or line item, store them as separate values in a list.
+For instance, if there are multiple line items or multiple values under a header, include all of them as elements in an array.
+
+3.Format:
+
+Do not include nested objects.
+The format should be key-value pairs, where:
+Header represents a main category or section.
+LineItems represent individual points under the headers (if applicable).
+
+4.Ensure clarity:
+
+The resulting JSON should be clear, with no unnecessary nesting or extra objects."""
     try:
         response = openai.ChatCompletion.create(
             model="gpt-4o-mini",
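
The new prompt asks for flat key-value JSON, with repeated values collected into lists and no nested objects. A minimal sketch of how a reply could be checked against that rule before it is used, assuming the model returns a bare JSON object. The helper `is_flat_summary` and the sample keys are hypothetical and not part of app.py:

```python
import json

# JSON scalar types as they appear after json.loads
SCALARS = (str, int, float, bool, type(None))


def is_flat_summary(reply_text):
    """Return True if reply_text is a JSON object whose values are
    scalars or lists of scalars (no nested objects or nested lists)."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    for value in data.values():
        if isinstance(value, SCALARS):
            continue
        if isinstance(value, list) and all(isinstance(v, SCALARS) for v in value):
            continue
        return False
    return True


# Hypothetical replies, for illustration only
flat = '{"Invoice Number": "INV-001", "LineItems": ["Widget", "Gadget"]}'
nested = '{"Header": {"Invoice Number": "INV-001"}}'

print(is_flat_summary(flat))
print(is_flat_summary(nested))
```

A check like this only enforces the shape described in the prompt. It does not verify that the headers or line items match the OCR text.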