Fredrik Sitje committed on
Commit 6b21293 · 1 Parent(s): c3069c3

Optimize term-category pair filtering in the Streamlit app: convert the 'term' and 'category' columns of the sorted DataFrame to a plain list with .values.tolist() and filter that list, instead of iterating over the DataFrame with iterrows, which is slow.
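As a rough illustration of the change described above, the sketch below builds the pair list both ways and checks that the results match. The sample data and the `has_subcategories` helper are hypothetical stand-ins (the real app calls `category_has_subcategories(term, category, df)`, whose body is not shown in this commit), so treat this as a minimal sketch rather than the app's actual code.

```python
import pandas as pd

# Hypothetical data; the real columns come from the app's DataFrame.
all_pairs_df = pd.DataFrame({
    "term": ["alpha", "alpha", "beta"],
    "category": ["color", "size", "color"],
    "category_index": [0, 1, 0],
})


def has_subcategories(term, category):
    # Hypothetical stand-in for category_has_subcategories(term, category, df).
    return not (term == "alpha" and category == "size")


# Old approach: iterrows() builds a pandas Series for every row.
pairs_iterrows = [
    (row["term"], row["category"])
    for _, row in all_pairs_df.iterrows()
    if has_subcategories(row["term"], row["category"])
]

# New approach: pull the two columns out once as plain Python lists.
pairs_list = [
    (term, category)
    for term, category in all_pairs_df[["term", "category"]].values.tolist()
    if has_subcategories(term, category)
]

# Both approaches keep the same pairs in the same order.
assert pairs_iterrows == pairs_list
print(pairs_list)  # [('alpha', 'color'), ('beta', 'color')]
```

The filtering predicate is still called once per pair in both versions, so any speedup comes only from cheaper row access, not from fewer predicate calls.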

Files changed (1)
  1. src/streamlit_app.py +5 -2
src/streamlit_app.py CHANGED
@@ -709,9 +709,12 @@ def get_term_category_pairs(df):
     # Sort by term name and category_index
     all_pairs_df = all_pairs_df.sort_values(['term', 'category_index'])
 
+    # Convert to list efficiently (avoids slow iterrows)
+    all_pairs_list = all_pairs_df[['term', 'category']].values.tolist()
+
     # Filter out categories that have no subcategories after filtering Unknown answers
-    filtered_pairs = [(row['term'], row['category']) for _, row in all_pairs_df.iterrows()
-                      if category_has_subcategories(row['term'], row['category'], df)]
+    filtered_pairs = [(term, category) for term, category in all_pairs_list
+                      if category_has_subcategories(term, category, df)]
 
     return filtered_pairs
 
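For comparison, a possible alternative to `.values.tolist()` is `itertuples(index=False, name=None)`, which also avoids `iterrows` and yields plain tuples directly. This is only a sketch on hypothetical sample data, not something the commit itself does; the two approaches agree here because both columns hold strings.

```python
import pandas as pd

# Hypothetical sample data with the two columns the commit selects.
all_pairs_df = pd.DataFrame({
    "term": ["beta", "alpha"],
    "category": ["color", "size"],
})

# Approach used in the commit: 2-D array of values converted to nested lists.
pairs_values = [
    tuple(pair)
    for pair in all_pairs_df[["term", "category"]].values.tolist()
]

# Alternative: itertuples with name=None yields plain tuples, one per row.
pairs_tuples = list(
    all_pairs_df[["term", "category"]].itertuples(index=False, name=None)
)

assert pairs_values == pairs_tuples
```

One design consideration: `.values` on columns of different dtypes converts everything to a common dtype first, so for mixed numeric and string columns `itertuples` may be the safer choice.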