arbabarshad Claude Opus 4.6 committed on
Commit 9e21e50 · 1 Parent(s): 5d805ea

Simplify RAG prompt, add common name display, add example runner

- Fix dotenv loading with override=True in app.py and app_database_prep.py
- Simplify the LLM prompt to reduce hallucination (remove complex conditionals and hardcoded references)
- Add explicit region descriptions in prompt (Midwest USA, USA, Africa, India)
- Display common name in species availability info
- Add --examples flag to app.py for running predefined queries programmatically
- Add generate_latex.py for converting example results to LaTeX
- Rebuild vector database with all 4 regions (859 documents)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
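
The `--examples` flag listed above is checked in `app.py` with a plain `"--examples" in sys.argv` test. A minimal sketch of the same switch using `argparse` is shown here for comparison; the helper names `parse_cli` and `choose_mode` and the returned mode strings are illustrative assumptions, not part of the commit.

```python
import argparse


def parse_cli(argv):
    """Parse command-line arguments; hypothetical helper, not in app.py."""
    parser = argparse.ArgumentParser(description="AgLLM launcher (sketch)")
    parser.add_argument(
        "--examples",
        action="store_true",
        help="run the predefined example queries instead of launching the UI",
    )
    # parse_known_args tolerates extra flags that a hosting environment may pass
    args, _unknown = parser.parse_known_args(argv)
    return args


def choose_mode(argv):
    """Return 'examples' or 'ui' depending on whether the flag is present."""
    return "examples" if parse_cli(argv).examples else "ui"
```

Compared with the membership test on `sys.argv`, this variant also provides `--help` output, at the cost of a few more lines.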

app.py CHANGED
@@ -20,6 +20,7 @@ from anyio.from_thread import start_blocking_portal #For model callback streamin
20
  from langchain.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate, ChatPromptTemplate
21
  import os
22
  from dotenv import load_dotenv
 
23
 
24
  import streamlit as st
25
  import json
@@ -63,25 +64,27 @@ print("Started")
63
 
64
  def get_species_list_from_db(db_name):
65
  embedding = OpenAIEmbeddings()
66
- vectordb_temp = Chroma(persist_directory=db_name,
67
  embedding_function=embedding)
68
  species_list=[]
 
69
  for meta in vectordb_temp.get()["metadatas"] :
70
  try:
71
  matched_first_species = meta['matched_specie_0']
72
  except KeyError:
73
  continue
74
- # Since each document is considered as a single chunk, the chunk_index is 0 for all
75
  species_list.append( matched_first_species)
 
 
76
 
77
- return species_list
78
 
79
 
80
 
81
  # default_persist_directory = './db5' # For deployment
82
  default_persist_directory='./vector-databases-deployed/db5-agllm-data-isu-field-insects-all-species'
83
 
84
- species_list=get_species_list_from_db(default_persist_directory)
85
  # default_persist_directory = 'vector-databases/db5-pre-completion' # For Development
86
  csv_filepath1 = "./agllm-data/corrected/Corrected_supplemented-insect_data-2500-sorted.xlsx"
87
  csv_filepath2 = "./agllm-data/corrected/Corrected_supplemented-insect_data-remaining.xlsx"
@@ -321,10 +324,12 @@ def initialize_qa_chain(specie_selector, application_mode, model_name, region, d
321
 
322
  if available_regions:
323
  available_regions_list = sorted(list(available_regions))
324
- availability_message = f"Information for **{specie_selector}** is available in region(s): **{', '.join(available_regions_list)}**."
 
325
  else:
326
  available_regions_list = []
327
- availability_message = f"No regional information found for **{specie_selector}** in the database."
 
328
  except Exception as e:
329
  print(f"Error checking region availability: {e}")
330
  available_regions_list = []
@@ -398,69 +403,28 @@ def initialize_qa_chain(specie_selector, application_mode, model_name, region, d
398
  # from langchain.chains.combine_documents import create_stuff_documents_chain
399
  # from langchain.chains import create_retrieval_chain
400
 
401
- # Updated prompt template for multi-part response with region-specific contexts (biosecurity focus)
402
- # Determine if this is a Midwest-focused query
403
- is_midwest_selected = (region == "Midwest USA")
404
 
405
- general_system_template = f"""
406
- You are an AI assistant specialized in agricultural biosecurity and pest management for {specie_selector}. The user is primarily interested in the '{region}' region.
407
 
408
- The following context has been retrieved from a database organized by region:
409
 
410
  {{context}}
411
 
412
  Instructions:
413
- 1. Analyze the user's question in relation to {specie_selector}.
414
- 2. Structure your answer in the following multi-part format:
415
-
416
- **Part 1: Selected Region Information ({region})**
417
- {"If you are providing Midwest USA information, note this is expert-curated data from the ISU Field Guide, which is the primary authoritative source for this region." if is_midwest_selected else ""}
418
-
419
- If relevant information exists in the context for the selected region that answers the user's query:
420
- Based on your selected region ({region}), for {specie_selector}, [summary of information for selected region] [1].
421
-
422
- If no relevant information exists for the selected region:
423
- "Based on the provided documents, there is no specific information for {specie_selector} in your selected region ({region}) regarding your question."
424
-
425
- **Part 2: Other Regions Information** (Only include if information from other regions is available AND relevant to the query)
426
- {"Since you selected Midwest USA, note that broader USA-wide management information may also be available. This can provide additional context on how this species is managed across the United States." if is_midwest_selected else ""}
427
-
428
- If you found relevant information from other regions that answers the user's query, include:
429
-
430
- Additionally, information was found for other regions:
431
- - In [Other Region Name]: [summary of information that directly answers the user's query] [next reference number].
432
- - In [Another Region Name]: [summary of information that directly answers the user's query] [next reference number].
433
-
434
- Only include regions where the information directly addresses the user's question.
435
- Use consecutive reference numbers starting from where Part 1 left off.
436
- If no other regions have relevant information, omit this part entirely.
437
-
438
- **Part 3: General Knowledge** (Only include if context information is insufficient or incomplete)
439
- If the available context does not fully address the query, add:
440
-
441
- Based on my general knowledge as {model_name}: [Your general knowledge insights that directly address the query] [next reference number].
442
-
443
- If the context information is sufficient, omit this part entirely.
444
-
445
- 3. After providing all parts of your answer, include a References section ONLY for information you actually used:
446
 
447
- References:
448
- [1] Based on {"ISU Field Guide (Midwest USA)" if is_midwest_selected else f"Expert Curated information about {specie_selector} in {region}"}
449
- [2] Based on Expert Curated information about {specie_selector} in [Other Region Name]
450
- [3] Based on Expert Curated information about {specie_selector} in [Another Region Name]
451
- [x] {model_name}'s inherent knowledge
452
-
453
- IMPORTANT:
454
- - Only include reference numbers that correspond to information you actually used in your answer.
455
- - Reference numbers should be sequential (1, 2, 3...) based on the order they appear in your answer.
456
- - If you don't use information from a particular region, don't include a reference for it.
457
- - If you don't use general knowledge, don't include a reference for it.
458
- - Every claim with a reference marker [x] must have a corresponding entry in the References section.
459
-
460
- 4. Apply this language constraint: {language_constraint}
461
- 5. Keep your summaries concise and directly related to the user's question.
462
-
463
- User Question about {specie_selector}: {{question}}
464
  """
465
 
466
  class RegionFormattingLLMChain:
@@ -700,6 +664,55 @@ with gr.Blocks(theme=gr.themes.Soft()) as demo:
700
 
701
  # When the user clicks the clear button
702
  # clear.click(lambda: None, None, chatbot, queue=False).success(init_history, [state], [state])
703
- if __name__ == "__main__":
704
  # demo.launch()
705
  demo.queue().launch(allowed_paths=["/"], share=False, show_error=True)
 
20
  from langchain.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate, ChatPromptTemplate
21
  import os
22
  from dotenv import load_dotenv
23
+ load_dotenv(override=True)
24
 
25
  import streamlit as st
26
  import json
 
64
 
65
  def get_species_list_from_db(db_name):
66
  embedding = OpenAIEmbeddings()
67
+ vectordb_temp = Chroma(persist_directory=db_name,
68
  embedding_function=embedding)
69
  species_list=[]
70
+ species_to_common_name={}
71
  for meta in vectordb_temp.get()["metadatas"] :
72
  try:
73
  matched_first_species = meta['matched_specie_0']
74
  except KeyError:
75
  continue
 
76
  species_list.append( matched_first_species)
77
+ if matched_first_species not in species_to_common_name and 'common_name' in meta:
78
+ species_to_common_name[matched_first_species] = meta['common_name']
79
 
80
+ return species_list, species_to_common_name
81
 
82
 
83
 
84
  # default_persist_directory = './db5' # For deployment
85
  default_persist_directory='./vector-databases-deployed/db5-agllm-data-isu-field-insects-all-species'
86
 
87
+ species_list, species_to_common_name = get_species_list_from_db(default_persist_directory)
88
  # default_persist_directory = 'vector-databases/db5-pre-completion' # For Development
89
  csv_filepath1 = "./agllm-data/corrected/Corrected_supplemented-insect_data-2500-sorted.xlsx"
90
  csv_filepath2 = "./agllm-data/corrected/Corrected_supplemented-insect_data-remaining.xlsx"
 
324
 
325
  if available_regions:
326
  available_regions_list = sorted(list(available_regions))
327
+ common_name = species_to_common_name.get(specie_selector, "Unknown")
328
+ availability_message = f"**Common Name:** {common_name}\n\nInformation for **{specie_selector}** is available in region(s): **{', '.join(available_regions_list)}**."
329
  else:
330
  available_regions_list = []
331
+ common_name = species_to_common_name.get(specie_selector, "Unknown")
332
+ availability_message = f"**Common Name:** {common_name}\n\nNo regional information found for **{specie_selector}** in the database."
333
  except Exception as e:
334
  print(f"Error checking region availability: {e}")
335
  available_regions_list = []
 
403
  # from langchain.chains.combine_documents import create_stuff_documents_chain
404
  # from langchain.chains import create_retrieval_chain
405
 
406
+ general_system_template = f"""You are an agricultural pest management assistant answering questions about {specie_selector}.
 
 
407
 
408
+ The user's selected region is '{region}'. Prioritize information from this region.
 
409
 
410
+ The four regions in the database are:
411
+ - **Midwest USA**: Expert-curated IPM data from the ISU Field Guide, specific to the U.S. Midwest.
412
+ - **USA**: General IPM and management information applicable across the United States.
413
+ - **Africa**: IPM practices relevant to African agricultural systems.
414
+ - **India**: IPM practices relevant to Indian agricultural systems.
415
+
416
+ Context retrieved from the database (organized by region):
417
 
418
  {{context}}
419
 
420
  Instructions:
421
+ - Answer based ONLY on the provided context. If no context exists for a region, do not invent information for it.
422
+ - Start with information from the selected region ('{region}') if available. If not available, clearly state that.
423
+ - Then include relevant information from other regions, clearly labeling which region each piece of information comes from.
424
+ - If the context is insufficient to fully answer the question, you may add brief general knowledge, clearly marked as such.
425
+ - {language_constraint}
426
 
427
+ Question: {{question}}
428
  """
429
 
430
  class RegionFormattingLLMChain:
 
664
 
665
  # When the user clicks the clear button
666
  # clear.click(lambda: None, None, chatbot, queue=False).success(init_history, [state], [state])
667
+
668
+
669
+ def run_examples():
670
+ """Run predefined examples and save results for paper."""
671
+ import json
672
+ from datetime import datetime
673
+
674
+ examples = [
675
+ {"species": "Striga asiatica", "region": "USA", "model": "Llama-3 70B", "mode": "Researcher", "question": "What is the most effective way to manage Striga in my maize field?"},
676
+ {"species": "Striga asiatica", "region": "Africa", "model": "Llama-3 70B", "mode": "Researcher", "question": "What is the most effective way to manage Striga in my maize field?"},
677
+ {"species": "Striga asiatica", "region": "Midwest USA", "model": "Llama-3 70B", "mode": "Researcher", "question": "What is the most effective way to manage Striga in my maize field?"},
678
+ ]
679
+
680
+ results = []
681
+ for i, ex in enumerate(examples, 1):
682
+ print(f"\n{'='*60}")
683
+ print(f"Example {i}: {ex['species']} | Region: {ex['region']} | Model: {ex['model']}")
684
+ print(f"Question: {ex['question']}")
685
+ print(f"{'='*60}")
686
+
687
+ qa_chain, availability_msg = initialize_qa_chain(
688
+ ex["species"], ex["mode"], ex["model"], ex["region"]
689
+ )
690
+ print(availability_msg)
691
+
692
+ result = qa_chain({"question": ex["question"], "chat_history": []})
693
+ answer = result["answer"]
694
+ print(f"\n--- Answer ---\n{answer}")
695
+
696
+ results.append({
697
+ "species": ex["species"],
698
+ "region": ex["region"],
699
+ "model": ex["model"],
700
+ "mode": ex["mode"],
701
+ "question": ex["question"],
702
+ "availability": availability_msg,
703
+ "answer": answer,
704
+ })
705
+
706
+ output_file = f"examples-work-out/example_results_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
707
+ os.makedirs("examples-work-out", exist_ok=True)
708
+ with open(output_file, 'w') as f:
709
+ json.dump(results, f, indent=2)
710
+ print(f"\n\nResults saved to {output_file}")
711
+
712
+
713
+ if __name__ == "__main__":
714
+ import sys
715
+ if "--examples" in sys.argv:
716
+ run_examples()
717
+ else:
718
  demo.queue().launch(allowed_paths=["/"], share=False, show_error=True)
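
The common-name lookup added to `get_species_list_from_db` keeps the first `common_name` seen for each species and falls back to "Unknown" at display time. The sketch below mirrors that logic on plain dictionaries so it can be checked without a Chroma database; the name `build_species_index` and the sample metadata records (including the `region` key) are made up for illustration.

```python
def build_species_index(metadatas):
    """Mirror the loop in get_species_list_from_db on a list of metadata dicts."""
    species_list = []
    species_to_common_name = {}
    for meta in metadatas:
        species = meta.get("matched_specie_0")
        if species is None:
            continue  # documents without a matched species are skipped
        species_list.append(species)
        # keep only the first common name encountered for a species
        if species not in species_to_common_name and "common_name" in meta:
            species_to_common_name[species] = meta["common_name"]
    return species_list, species_to_common_name


# Hypothetical metadata records, shaped like the ones the commit reads.
sample = [
    {"matched_specie_0": "Striga asiatica", "common_name": "Witch weed", "region": "USA"},
    {"matched_specie_0": "Striga asiatica", "region": "Africa"},
    {"region": "India"},
]
species, names = build_species_index(sample)
print(names.get("Striga asiatica", "Unknown"))
print(names.get("Some other species", "Unknown"))
```

Because the species list keeps one entry per document, duplicates remain in it; only the name mapping is de-duplicated.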
app_database_prep.py CHANGED
@@ -19,6 +19,7 @@ langchain.debug=True # TODO: DOUBLE CHECK
19
  system_message = {"role": "system", "content": "You are a helpful assistant."} # TODO: double check how this plays out later.
20
  import os
21
  from dotenv import load_dotenv
 
22
 
23
  import streamlit as st
24
 
 
19
  system_message = {"role": "system", "content": "You are a helpful assistant."} # TODO: double check how this plays out later.
20
  import os
21
  from dotenv import load_dotenv
22
+ load_dotenv(override=True)
23
 
24
  import streamlit as st
25
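
By default `load_dotenv()` leaves variables that are already set in the process environment untouched; `override=True` lets the values from `.env` take precedence. The stdlib-only sketch below imitates that difference so it can be run without the package. `apply_env_values` is a hypothetical stand-in, not python-dotenv's implementation, and the key name is invented.

```python
import os


def apply_env_values(values, override=False, environ=None):
    """Copy key/value pairs into an environment mapping.

    Illustrative stand-in for the override semantics of load_dotenv.
    """
    environ = os.environ if environ is None else environ
    for key, value in values.items():
        # without override, an existing variable always wins
        if override or key not in environ:
            environ[key] = value
    return environ


# A stale key exported in the shell versus a fresh one in .env (both invented).
shell_env = {"EXAMPLE_API_KEY": "stale-from-shell"}
dotenv_values = {"EXAMPLE_API_KEY": "fresh-from-dotenv"}

kept = apply_env_values(dotenv_values, override=False, environ=dict(shell_env))
replaced = apply_env_values(dotenv_values, override=True, environ=dict(shell_env))
print(kept["EXAMPLE_API_KEY"])
print(replaced["EXAMPLE_API_KEY"])
```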
 
examples-work-out/example_results_20260212_105524.json ADDED
@@ -0,0 +1,29 @@
1
+ [
2
+ {
3
+ "species": "Striga asiatica",
4
+ "region": "USA",
5
+ "model": "Llama-3 70B",
6
+ "mode": "Researcher",
7
+ "question": "What is the most effective way to manage Striga in my maize field?",
8
+ "availability": "**Common Name:** Witch weed\n\nInformation for **Striga asiatica** is available in region(s): **Africa, USA**.",
9
+ "answer": "Based on the provided context from the USA region, the most effective way to manage Striga in your maize field is through an integrated management strategy that combines cultural, chemical, and biological controls.\n\nCultural control practices that can be employed include crop rotation with non-host crops to reduce Striga seed banks in the soil, deep plowing to bury Striga seeds to depths where they cannot germinate, and using resistant crop varieties to limit the impact of Striga infestations. Maintaining high soil fertility can also help crops outgrow the parasitic effects of Striga.\n\nChemical control options include the use of pre-emergent and post-emergent herbicides, such as imazapyr and glyphosate, which have been effective when used appropriately.\n\nBiological control options, although still in the experimental stage, involve the use of certain fungi and bacteria that can attack Striga seeds or seedlings.\n\nIt is essential to monitor and detect Striga infestations early, as this is crucial for effective management. Regular field inspections can help identify early signs of infestation.\n\nNo specific information is available from the Midwest USA region, Africa, or India that provides more specific guidance for maize fields. However, it is generally known that Striga management requires a long-term, multi-faceted approach that takes into account the specific conditions of the field and the crop being grown.\n\nIn summary, a combination of cultural, chemical, and biological controls, along with early detection and monitoring, is the most effective way to manage Striga in your maize field."
10
+ },
11
+ {
12
+ "species": "Striga asiatica",
13
+ "region": "Africa",
14
+ "model": "Llama-3 70B",
15
+ "mode": "Researcher",
16
+ "question": "What is the most effective way to manage Striga in my maize field?",
17
+ "availability": "**Common Name:** Witch weed\n\nInformation for **Striga asiatica** is available in region(s): **Africa, USA**.",
18
+ "answer": "Based on the provided context, here's a comprehensive answer to your question:\n\n**From Africa:**\nUnfortunately, there is no specific information on managing Striga in maize fields in the African context. The provided documents from Africa primarily focus on the description of Striga asiatica, its morphology, and growth habits.\n\n**From USA:**\nHowever, we can draw insights from the USA context, which provides more information on management strategies. An integrated approach combining cultural, chemical, and biological controls is recommended for effective long-term management of Striga asiatica (Document 1 from USA). \n\n**Cultural Control/Management Practices:**\nCrop rotation with non-host crops can help reduce Striga seed banks in the soil (Document 3 from USA). Deep plowing can bury Striga seeds to depths where they cannot germinate. Using resistant crop varieties can limit the impact of Striga infestations. Maintaining high soil fertility can help crops outgrow the parasitic effects of Striga (Document 3 from USA).\n\n**Chemical Control Options:**\nPre-emergent and post-emergent herbicides can be used to control Striga infestations. Herbicides such as imazapyr and glyphosate have been effective when used appropriately (Document 2 from USA).\n\n**Biological Control Options:**\nThere are ongoing studies into the use of certain fungi and bacteria that can attack Striga seeds or seedlings, though these are not widely implemented yet (Document 2 from USA).\n\n**General Knowledge:**\nIt is essential to monitor your maize field regularly to detect Striga infestations early, as this can help prevent the spread of the parasite and reduce the impact on your crop.\n\nIn summary, while there is no specific information on managing Striga in maize fields in the African context, an integrated approach combining cultural, chemical, and biological controls, as suggested in the USA context, can be a effective way to manage Striga in your maize field. Regular monitoring, crop rotation"
19
+ },
20
+ {
21
+ "species": "Striga asiatica",
22
+ "region": "Midwest USA",
23
+ "model": "Llama-3 70B",
24
+ "mode": "Researcher",
25
+ "question": "What is the most effective way to manage Striga in my maize field?",
26
+ "availability": "**Common Name:** Witch weed\n\nInformation for **Striga asiatica** is available in region(s): **Africa, USA**.",
27
+ "answer": "Based on the provided context, I'll provide an answer tailored to the Midwest USA region.\n\n**Midwest USA:** Unfortunately, there is no specific information available in the provided context for the Midwest USA region that directly addresses the most effective way to manage Striga in a maize field.\n\n**USA:** However, we can draw from general IPM and management information applicable across the United States. According to Document 1 from USA, integrated management strategies combining cultural, chemical, and biological controls are recommended for effective long-term management of Striga asiatica. This suggests that a multi-faceted approach may be the most effective way to manage Striga in a maize field.\n\n**Cultural Control/Management Practices (USA):** Document 3 from USA provides some cultural control and management practices that can be employed to manage Striga infestations. These include:\n\n* Crop rotation with non-host crops to reduce Striga seed banks in the soil\n* Deep plowing to bury Striga seeds to depths where they cannot germinate\n* Use of resistant crop varieties to limit the impact of Striga infestations\n* Maintaining high soil fertility to help crops outgrow the parasitic effects of Striga\n\n**Chemical Control Options (USA):** Document 2 from USA mentions that pre-emergent and post-emergent herbicides can be used to control Striga infestations. Herbicides such as imazapyr and glyphosate have been effective when used appropriately.\n\n**Biological Control Options (USA):** While not widely implemented, there are ongoing studies into the use of certain fungi and bacteria that can attack Striga seeds or seedlings (Document 2 from USA).\n\n**General Knowledge:** It is essential to note that Striga management requires a long-term strategy, as the seeds can remain viable in the soil for many years. A combination of these methods, tailored to the specific field conditions and maize variety, may be the most effective way to manage Striga in a maize"
28
+ }
29
+ ]
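
Once this JSON is parsed, the `\n\n` after the common name in each `availability` string becomes two real newline characters. A small sketch of pulling the common name and region list back out of that field follows; `parse_availability` is an illustrative name, and the sample string is copied from the first record above.

```python
import json
import re


def parse_availability(availability):
    """Extract (common_name, regions) from an availability message."""
    # In a raw-string pattern, \n matches a real newline, which is what
    # json.loads produces from the two-character escape in the file.
    name_match = re.search(r"\*\*Common Name:\*\* (.+?)\n", availability)
    regions_match = re.search(
        r"available in region\(s\): \*\*(.+?)\*\*", availability
    )
    common_name = name_match.group(1) if name_match else "N/A"
    regions = regions_match.group(1).split(", ") if regions_match else []
    return common_name, regions


raw = (
    '{"availability": "**Common Name:** Witch weed\\n\\n'
    'Information for **Striga asiatica** is available in region(s): '
    '**Africa, USA**."}'
)
record = json.loads(raw)
print(parse_availability(record["availability"]))
```

A pattern that looks for a literal backslash followed by `n` would not match the parsed string, so the helper would fall back to "N/A".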
examples-work-out/example_results_20260212_105524.tex ADDED
@@ -0,0 +1,156 @@
1
+ \documentclass[11pt]{article}
2
+ \usepackage[margin=1in]{geometry}
3
+ \usepackage{tcolorbox}
4
+ \usepackage{booktabs}
5
+ \usepackage{hyperref}
6
+
7
+ \title{AgLLM -- Example Queries and Responses}
8
+ \author{}
9
+ \date{}
10
+
11
+ \begin{document}
12
+ \maketitle
13
+
14
+ The following examples demonstrate AgLLM responses for the species \textit{Striga asiatica} (Witch weed) across different regional configurations. Each example shows the user query and the system's response, which retrieves region-specific information from the vector database and generates an answer using the selected LLM.
15
+
16
+
17
+ \subsection*{Example 1}
18
+
19
+ \begin{tcolorbox}[colback=gray!5, colframe=gray!60, title=Configuration]
20
+ \begin{tabular}{@{}ll@{}}
21
+ \textbf{Species:} & Striga asiatica \\
22
+ \textbf{Common Name:} & Witch weed \\
23
+ \textbf{Selected Region:} & USA \\
24
+ \textbf{Available Regions:} & Africa, USA \\
25
+ \textbf{Model:} & Llama-3 70B \\
26
+ \textbf{Mode:} & Researcher \\
27
+ \end{tabular}
28
+ \end{tcolorbox}
29
+
30
+ \begin{tcolorbox}[colback=blue!3, colframe=blue!40, title=User Query]
31
+ What is the most effective way to manage Striga in my maize field?
32
+ \end{tcolorbox}
33
+
34
+ \begin{tcolorbox}[colback=green!3, colframe=green!40, title=AgLLM Response]
35
+ Based on the provided context from the USA region, the most effective way to manage Striga in your maize field is through an integrated management strategy that combines cultural, chemical, and biological controls.
36
+
37
+ \medskip
38
+ Cultural control practices that can be employed include crop rotation with non-host crops to reduce Striga seed banks in the soil, deep plowing to bury Striga seeds to depths where they cannot germinate, and using resistant crop varieties to limit the impact of Striga infestations. Maintaining high soil fertility can also help crops outgrow the parasitic effects of Striga.
39
+
40
+ \medskip
41
+ Chemical control options include the use of pre-emergent and post-emergent herbicides, such as imazapyr and glyphosate, which have been effective when used appropriately.
42
+
43
+ \medskip
44
+ Biological control options, although still in the experimental stage, involve the use of certain fungi and bacteria that can attack Striga seeds or seedlings.
45
+
46
+ \medskip
47
+ It is essential to monitor and detect Striga infestations early, as this is crucial for effective management. Regular field inspections can help identify early signs of infestation.
48
+
49
+ \medskip
50
+ No specific information is available from the Midwest USA region, Africa, or India that provides more specific guidance for maize fields. However, it is generally known that Striga management requires a long-term, multi-faceted approach that takes into account the specific conditions of the field and the crop being grown.
51
+
52
+ \medskip
53
+ In summary, a combination of cultural, chemical, and biological controls, along with early detection and monitoring, is the most effective way to manage Striga in your maize field.
54
+ \end{tcolorbox}
55
+
56
+ \vspace{1em}
57
+
58
+ \subsection*{Example 2}
59
+
60
+ \begin{tcolorbox}[colback=gray!5, colframe=gray!60, title=Configuration]
61
+ \begin{tabular}{@{}ll@{}}
62
+ \textbf{Species:} & Striga asiatica \\
63
+ \textbf{Common Name:} & Witch weed \\
64
+ \textbf{Selected Region:} & Africa \\
65
+ \textbf{Available Regions:} & Africa, USA \\
66
+ \textbf{Model:} & Llama-3 70B \\
67
+ \textbf{Mode:} & Researcher \\
68
+ \end{tabular}
69
+ \end{tcolorbox}
70
+
71
+ \begin{tcolorbox}[colback=blue!3, colframe=blue!40, title=User Query]
72
+ What is the most effective way to manage Striga in my maize field?
73
+ \end{tcolorbox}
74
+
75
+ \begin{tcolorbox}[colback=green!3, colframe=green!40, title=AgLLM Response]
76
+ Based on the provided context, here's a comprehensive answer to your question:
77
+
78
+ \medskip
79
+ \textbf{From Africa:}
80
+ Unfortunately, there is no specific information on managing Striga in maize fields in the African context. The provided documents from Africa primarily focus on the description of Striga asiatica, its morphology, and growth habits.
81
+
82
+ \medskip
83
+ \textbf{From USA:}
84
+ However, we can draw insights from the USA context, which provides more information on management strategies. An integrated approach combining cultural, chemical, and biological controls is recommended for effective long-term management of Striga asiatica (Document 1 from USA).
85
+
86
+ \medskip
87
+ \textbf{Cultural Control/Management Practices:}
88
+ Crop rotation with non-host crops can help reduce Striga seed banks in the soil (Document 3 from USA). Deep plowing can bury Striga seeds to depths where they cannot germinate. Using resistant crop varieties can limit the impact of Striga infestations. Maintaining high soil fertility can help crops outgrow the parasitic effects of Striga (Document 3 from USA).
89
+
90
+ \medskip
91
+ \textbf{Chemical Control Options:}
92
+ Pre-emergent and post-emergent herbicides can be used to control Striga infestations. Herbicides such as imazapyr and glyphosate have been effective when used appropriately (Document 2 from USA).
93
+
94
+ \medskip
95
+ \textbf{Biological Control Options:}
96
+ There are ongoing studies into the use of certain fungi and bacteria that can attack Striga seeds or seedlings, though these are not widely implemented yet (Document 2 from USA).
97
+
98
+ \medskip
99
+ \textbf{General Knowledge:}
100
+ It is essential to monitor your maize field regularly to detect Striga infestations early, as this can help prevent the spread of the parasite and reduce the impact on your crop.
101
+
102
+ \medskip
103
+ In summary, while there is no specific information on managing Striga in maize fields in the African context, an integrated approach combining cultural, chemical, and biological controls, as suggested in the USA context, can be a effective way to manage Striga in your maize field. Regular monitoring, crop rotation
104
+ \end{tcolorbox}
105
+
106
+ \vspace{1em}
107
+
108
+ \subsection*{Example 3}
109
+
110
+ \begin{tcolorbox}[colback=gray!5, colframe=gray!60, title=Configuration]
111
+ \begin{tabular}{@{}ll@{}}
112
+ \textbf{Species:} & Striga asiatica \\
113
+ \textbf{Common Name:} & Witch weed \\
114
+ \textbf{Selected Region:} & Midwest USA \\
115
+ \textbf{Available Regions:} & Africa, USA \\
116
+ \textbf{Model:} & Llama-3 70B \\
117
+ \textbf{Mode:} & Researcher \\
118
+ \end{tabular}
119
+ \end{tcolorbox}
120
+
121
+ \begin{tcolorbox}[colback=blue!3, colframe=blue!40, title=User Query]
122
+ What is the most effective way to manage Striga in my maize field?
123
+ \end{tcolorbox}
124
+
125
+ \begin{tcolorbox}[colback=green!3, colframe=green!40, title=AgLLM Response]
126
+ Based on the provided context, I'll provide an answer tailored to the Midwest USA region.
127
+
128
+ \medskip
129
+ \textbf{Midwest USA:} Unfortunately, there is no specific information available in the provided context for the Midwest USA region that directly addresses the most effective way to manage Striga in a maize field.
130
+
131
+ \medskip
132
+ \textbf{USA:} However, we can draw from general IPM and management information applicable across the United States. According to Document 1 from USA, integrated management strategies combining cultural, chemical, and biological controls are recommended for effective long-term management of Striga asiatica. This suggests that a multi-faceted approach may be the most effective way to manage Striga in a maize field.
133
+
134
+ \medskip
135
+ \textbf{Cultural Control/Management Practices (USA):} Document 3 from USA provides some cultural control and management practices that can be employed to manage Striga infestations. These include:
136
+
137
+ \medskip
138
+ \textbullet\ Crop rotation with non-host crops to reduce Striga seed banks in the soil
139
+ \textbullet\ Deep plowing to bury Striga seeds to depths where they cannot germinate
140
+ \textbullet\ Use of resistant crop varieties to limit the impact of Striga infestations
141
+ \textbullet\ Maintaining high soil fertility to help crops outgrow the parasitic effects of Striga
142
+
143
+ \medskip
144
+ \textbf{Chemical Control Options (USA):} Document 2 from USA mentions that pre-emergent and post-emergent herbicides can be used to control Striga infestations. Herbicides such as imazapyr and glyphosate have been effective when used appropriately.
145
+
146
+ \medskip
147
+ \textbf{Biological Control Options (USA):} While not widely implemented, there are ongoing studies into the use of certain fungi and bacteria that can attack Striga seeds or seedlings (Document 2 from USA).
148
+
149
+ \medskip
150
+ \textbf{General Knowledge:} It is essential to note that Striga management requires a long-term strategy, as the seeds can remain viable in the soil for many years. A combination of these methods, tailored to the specific field conditions and maize variety, may be the most effective way to manage Striga in a maize
151
+ \end{tcolorbox}
152
+
153
+ \vspace{1em}
154
+
155
+
156
+ \end{document}
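
Text placed in a LaTeX document like the one above needs its special characters escaped. When replacements are applied one after another, as `escape_latex` in `generate_latex.py` does, the braces introduced by an earlier substitution such as `\textbackslash{}` are themselves escaped by a later one. A minimal sketch of a single-pass alternative follows; `escape_latex_once` is a hypothetical name, and this is one possible approach rather than the script's own code.

```python
import re

# Mapping from each special character to its LaTeX escape.
_LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_LATEX_PATTERN = re.compile(r"[\\&%$#_{}~^]")


def escape_latex_once(text):
    """Escape LaTeX specials in one pass so replacements are never re-escaped."""
    return _LATEX_PATTERN.sub(lambda m: _LATEX_ESCAPES[m.group(0)], text)
```

Each input character is looked at exactly once, so the output of one substitution can never be picked up by another.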
examples-work-out/generate_latex.py ADDED
@@ -0,0 +1,130 @@
1
+ """
2
+ Generate a LaTeX file from example results JSON.
3
+ Usage: python generate_latex.py example_results_20260212_105524.json
4
+ """
5
+ import json
6
+ import sys
7
+ import os
8
+ import re
9
+
10
+
11
+ def escape_latex(text):
12
+ """Escape special LaTeX characters."""
13
+ replacements = [
14
+ ('\\', r'\textbackslash{}'),
15
+ ('&', r'\&'),
16
+ ('%', r'\%'),
17
+ ('$', r'\$'),
18
+ ('#', r'\#'),
19
+ ('_', r'\_'),
20
+ ('{', r'\{'),
21
+ ('}', r'\}'),
22
+ ('~', r'\textasciitilde{}'),
23
+ ('^', r'\textasciicircum{}'),
24
+ ]
25
+ for old, new in replacements:
26
+ text = text.replace(old, new)
27
+ return text
28
+
29
+
+ def md_bold_to_latex(text):
+     """Convert **bold** markdown to LaTeX \\textbf{}."""
+     return re.sub(r'\*\*(.+?)\*\*', r'\\textbf{\1}', text)
+
+
+ def format_answer(text):
+     """Convert answer text to LaTeX, handling markdown-style formatting."""
+     text = escape_latex(text)
+     text = md_bold_to_latex(text)
+     # Convert bullet points
+     text = re.sub(r'^\* ', r'\\textbullet\\ ', text, flags=re.MULTILINE)
+     # Paragraph breaks
+     text = text.replace('\n\n', '\n\n\\medskip\n')
+     return text
+
+
+ def generate_latex(results, output_path):
+     examples_tex = ""
+
+     for i, ex in enumerate(results, 1):
+         # Extract common name from availability markdown; the name may be
+         # terminated by a literal "\n", a real newline, or the end of the string
+         common_name_match = re.search(r'\*\*Common Name:\*\* (.+?)(?:\\n|\n|$)', ex.get('availability', ''))
+         common_name = common_name_match.group(1) if common_name_match else 'N/A'
+
+         # Extract available regions
+         regions_match = re.search(r'available in region\(s\): \*\*(.+?)\*\*', ex.get('availability', ''))
+         available_regions = regions_match.group(1) if regions_match else 'N/A'
+
+         question = escape_latex(ex['question'])
+         answer = format_answer(ex['answer'])
+         species = escape_latex(ex['species'])
+         region = escape_latex(ex['region'])
+         model = escape_latex(ex['model'])
+         mode = escape_latex(ex['mode'])
+         common_name = escape_latex(common_name)
+         available_regions = escape_latex(available_regions)
+
+         examples_tex += rf"""
+ \subsection*{{Example {i}}}
+
+ \begin{{tcolorbox}}[colback=gray!5, colframe=gray!60, title=Configuration]
+ \begin{{tabular}}{{@{{}}ll@{{}}}}
+ \textbf{{Species:}} & {species} \\
+ \textbf{{Common Name:}} & {common_name} \\
+ \textbf{{Selected Region:}} & {region} \\
+ \textbf{{Available Regions:}} & {available_regions} \\
+ \textbf{{Model:}} & {model} \\
+ \textbf{{Mode:}} & {mode} \\
+ \end{{tabular}}
+ \end{{tcolorbox}}
+
+ \begin{{tcolorbox}}[colback=blue!3, colframe=blue!40, title=User Query]
+ {question}
+ \end{{tcolorbox}}
+
+ \begin{{tcolorbox}}[colback=green!3, colframe=green!40, title=AgLLM Response]
+ {answer}
+ \end{{tcolorbox}}
+
+ \vspace{{1em}}
+ """
+
+     latex = rf"""\documentclass[11pt]{{article}}
+ \usepackage[margin=1in]{{geometry}}
+ \usepackage{{tcolorbox}}
+ \usepackage{{booktabs}}
+ \usepackage{{hyperref}}
+
+ \title{{AgLLM -- Example Queries and Responses}}
+ \author{{}}
+ \date{{}}
+
+ \begin{{document}}
+ \maketitle
+
+ The following examples demonstrate AgLLM responses for the species \textit{{Striga asiatica}} (Witch weed) across different regional configurations. Each example shows the user query and the system's response, which retrieves region-specific information from the vector database and generates an answer using the selected LLM.
+
+ {examples_tex}
+
+ \end{{document}}
+ """
+
+     # Write as UTF-8 so non-ASCII characters in answers survive on any platform
+     with open(output_path, 'w', encoding='utf-8') as f:
+         f.write(latex)
+     print(f"LaTeX file generated: {output_path}")
+
+
+ if __name__ == "__main__":
+     if len(sys.argv) < 2:
+         print("Usage: python generate_latex.py <results.json>")
+         sys.exit(1)
+
+     input_file = sys.argv[1]
+     # Relative paths are resolved against this script's directory
+     if not os.path.isabs(input_file):
+         input_file = os.path.join(os.path.dirname(__file__), input_file)
+
+     with open(input_file, encoding='utf-8') as f:
+         results = json.load(f)
+
+     # Swap only the final extension; str.replace would rewrite every '.json'
+     # in the path and leave a non-.json input path unchanged
+     output_path = os.path.splitext(input_file)[0] + '.tex'
+     generate_latex(results, output_path)
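
Whether the `Common Name` lookup in `generate_latex` succeeds depends on how line breaks are stored in the `availability` field. The sketch below is a minimal illustration, not part of the commit: it assumes the results file stores line breaks as ordinary JSON `\n` escapes, and the sample record and its wording are hypothetical, since the real results file is not shown here. Under that assumption, `json.loads` yields a real newline, so a pattern that requires a literal backslash followed by `n` finds nothing, while a pattern that also accepts a real newline or the end of the string extracts the name.

```python
import json
import re

# Hypothetical record: the field name matches generate_latex.py, the text is made up.
raw = '[{"availability": "**Common Name:** Witch weed\\nInformation is available in region(s): **Africa, India**"}]'
availability = json.loads(raw)[0]["availability"]

# After json.loads, the JSON escape \n has become a real newline character.
assert "\n" in availability

# Pattern that requires a literal backslash + "n" after the name.
literal = re.search(r'\*\*Common Name:\*\* (.+?)\\n', availability)

# Pattern that accepts a literal "\n", a real newline, or end of string.
tolerant = re.search(r'\*\*Common Name:\*\* (.+?)(?:\\n|\n|$)', availability)

# Region pattern as used in generate_latex.py.
regions = re.search(r'available in region\(s\): \*\*(.+?)\*\*', availability)

print(literal)            # None
print(tolerant.group(1))  # Witch weed
print(regions.group(1))   # Africa, India
```

If the application instead writes a double-escaped `\\n` into the JSON, the literal pattern matches as well, so the tolerant form covers both cases.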
species-organized/PestID Species - Organized.xlsx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:67ea15614f571729ba9f24512ac8498576c2b0ebbade382797bf275e159a4dd2
- size 74872
+ oid sha256:d9b0e9af388f34b0452a7df83a969df19da056c5e6ff6f03eef2ef8fe6556607
+ size 75571
species-organized/~$PestID Species - Organized.xlsx DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:a56509277035f865d54f697560634f21da3e36da0c983cc6f72ebb79ec3b73eb
- size 165
vector-databases-deployed/db5-agllm-data-isu-field-insects-all-species/{ccf8b07b-b9f2-468a-812a-2d7f01ec2bb8 → 6934d416-b9d8-4bd9-b666-9621dc3ed67f}/data_level0.bin RENAMED
File without changes
vector-databases-deployed/db5-agllm-data-isu-field-insects-all-species/{ccf8b07b-b9f2-468a-812a-2d7f01ec2bb8 → 6934d416-b9d8-4bd9-b666-9621dc3ed67f}/header.bin RENAMED
File without changes
vector-databases-deployed/db5-agllm-data-isu-field-insects-all-species/{ccf8b07b-b9f2-468a-812a-2d7f01ec2bb8 → 6934d416-b9d8-4bd9-b666-9621dc3ed67f}/length.bin RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e7e2dcff542de95352682dc186432e98f0188084896773f1973276b0577d5305
+ oid sha256:9ddf37297936a4b972cfb2c49f1215f38c32ec782e45fbfce6590540e91dbd02
  size 40000
vector-databases-deployed/db5-agllm-data-isu-field-insects-all-species/{ccf8b07b-b9f2-468a-812a-2d7f01ec2bb8 → 6934d416-b9d8-4bd9-b666-9621dc3ed67f}/link_lists.bin RENAMED
File without changes
vector-databases-deployed/db5-agllm-data-isu-field-insects-all-species/chroma.sqlite3 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0eb79a00bf67e95a8a48172339e165eb674224bbba53b02d4b5163a9a315650a
- size 10674176
+ oid sha256:0012d7100fc6b0082f7f8744c90432a2b9d55fa9250274c7e57bd3c3160d5ddc
+ size 10678272