Spaces: TransLegal/grading-answers

Fredrik Sitje committed 20 days ago

Commit 6b21293

1 Parent(s): c3069c3

Optimize term-category pair filtering in the Streamlit app. The sorted DataFrame's term and category columns are now converted to a plain list with .values.tolist(), and the filter iterates over that list instead of calling iterrows, which builds a new Series object for every row.

Files changed (1)

src/streamlit_app.py CHANGED

@@ -709,9 +709,12 @@ def get_term_category_pairs(df):
     # Sort by term name and category_index
     all_pairs_df = all_pairs_df.sort_values(['term', 'category_index'])
+    # Convert to list efficiently (avoids slow iterrows)
+    all_pairs_list = all_pairs_df[['term', 'category']].values.tolist()
     # Filter out categories that have no subcategories after filtering Unknown answers
-    filtered_pairs = [(row['term'], row['category']) for _, row in all_pairs_df.iterrows()
-                      if category_has_subcategories(row['term'], row['category'], df)]
+    filtered_pairs = [(term, category) for term, category in all_pairs_list
+                      if category_has_subcategories(term, category, df)]
     return filtered_pairs
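
To check that the change preserves behavior, the two filtering styles can be compared side by side on a small frame. This is a minimal sketch, not the app's code: the sample data, the helper names pairs_via_iterrows and pairs_via_tolist, and the keep_pair predicate (a stand-in for category_has_subcategories, whose body is not shown in this commit) are all assumptions made for illustration.

```python
import pandas as pd


def pairs_via_iterrows(pairs_df, keep):
    # Old style: iterrows yields (index, Series) and builds a Series per row.
    return [(row['term'], row['category']) for _, row in pairs_df.iterrows()
            if keep(row['term'], row['category'])]


def pairs_via_tolist(pairs_df, keep):
    # New style: convert the two needed columns to a list of [term, category] once.
    rows = pairs_df[['term', 'category']].values.tolist()
    return [(term, category) for term, category in rows if keep(term, category)]


# Hypothetical sample data; the real all_pairs_df is built elsewhere in the app.
demo_df = pd.DataFrame({
    'term': ['contract', 'contract', 'tort'],
    'category': ['formation', 'remedies', 'negligence'],
    'category_index': [0, 1, 0],
}).sort_values(['term', 'category_index'])


def keep_pair(term, category):
    # Hypothetical stand-in for category_has_subcategories(term, category, df).
    return category != 'remedies'


old_result = pairs_via_iterrows(demo_df, keep_pair)
new_result = pairs_via_tolist(demo_df, keep_pair)
assert old_result == new_result
```

Both helpers still call the predicate once per pair, so the saving comes only from how rows are iterated. If category_has_subcategories itself scans df on every call, that cost is likely to dominate and is unchanged by this commit.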