Qianhui19 committed on
Commit
611ea0c
·
verified ·
1 Parent(s): 46ddd5c

Update README.md

Files changed (1)
  1. README.md +9 -4
README.md CHANGED
@@ -3,14 +3,18 @@ license: cc-by-nc-4.0
3
  tags:
4
  - agent
5
  - chemistry
 
6
  ---
7
  ## Model Overview
8
- This model is a **Contaminants of Emerging Concern Annotation Intelligent Agent** built on the Dify platform, integrated with the Norman knowledge base, Pubchemlite_exposomics database, and Invitrodb_v4.3 database. It enables high-throughput, large-scale annotation of emerging contaminants, including **usage classification**, **toxicity endpoints**, and corresponding **AC50 value queries** by inputting the IUPAC name of the target contaminant.
 
 
 
9
 
10
  ## Model Purpose
11
  To construct a specialized knowledge database for emerging contaminants usage classification, which combines multi-source chemical/toxicological databases and AI agents. The core goals are:
12
  1. Realize fast and large-scale annotation of emerging contaminants' usage categories.
13
- 2. Provide efficient query services for toxicity endpoints and their corresponding AC50 values.
14
  3. Support high-throughput data analysis scenarios for emerging contaminants in environmental chemistry and toxicology research.
15
 
16
  ## Key Definitions
@@ -29,7 +33,8 @@ To construct a specialized knowledge database for emerging contaminants usage cl
29
  The system integrates three core databases with differentiated deployment strategies:
30
  1. **Norman Chemical Classification Database**
31
  - Serves as a relational knowledge base, uploaded and parsed on the FastGPT platform, then embedded into the Dify platform.
32
- - Optimized classification: Integrated or removed redundant categories, finally categorized chemicals into **9 classes** (see Figure 1).
 
33
  2. **Pubchemlite_exposomics & Invitrodb_v4.3 Databases**
34
  - Deployed in local SQL databases to support efficient local query and invocation.
35
  - Query workflow: GPT-4o generates SQL statements → Extract valid SQL queries → Backend executes database queries and returns results.
@@ -69,7 +74,7 @@ The system integrates three core databases with differentiated deployment strate
69
  1. Input the **IUPAC name** of the emerging contaminant into the Dify chat interface.
70
  2. The AI agent invokes:
71
  - FastGPT knowledge base for **usage classification** via FDA plugin.
72
- - Local SQL databases for **toxicity endpoints and AC50 values** via GPT-4o-generated SQL queries.
73
  3. Receive the structured output (JSON format) containing usage category, toxicity endpoints, and corresponding AC50 values.
74
 
75
  ## Limitations
 
3
  tags:
4
  - agent
5
  - chemistry
6
+ - environment
7
  ---
8
  ## Model Overview
9
+ This model is a **Contaminants of Emerging Concern Annotation Intelligent Agent** built on the Dify platform, integrated with the Norman knowledge base, Pubchemlite_exposomics database, and Invitrodb_v4.3 database. It enables high-throughput, large-scale annotation of emerging contaminants, including **usage classification** and **toxicity endpoints** by inputting the IUPAC name of the target contaminant.
10
+ <div align="center">
11
+ <img src="figure/pipeline.jpg" alt="Norman_category" width="800">
12
+ </div>
13
 
14
  ## Model Purpose
15
  To construct a specialized knowledge database for emerging contaminants usage classification, which combines multi-source chemical/toxicological databases and AI agents. The core goals are:
16
  1. Realize fast and large-scale annotation of emerging contaminants' usage categories.
17
+ 2. Provide efficient query services for toxicity endpoints.
18
  3. Support high-throughput data analysis scenarios for emerging contaminants in environmental chemistry and toxicology research.
19
 
20
  ## Key Definitions
 
33
  The system integrates three core databases with differentiated deployment strategies:
34
  1. **Norman Chemical Classification Database**
35
  - Serves as a relational knowledge base, uploaded and parsed on the FastGPT platform, then embedded into the Dify platform.
36
+ - Optimized classification: Integrated or removed redundant categories, finally categorized chemicals into **9 classes**.
37
+ <img src="figure/Norman_category.png" alt="Norman_category" width="400">
38
  2. **Pubchemlite_exposomics & Invitrodb_v4.3 Databases**
39
  - Deployed in local SQL databases to support efficient local query and invocation.
40
  - Query workflow: GPT-4o generates SQL statements → Extract valid SQL queries → Backend executes database queries and returns results.
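
The query workflow in the last bullet (model output, extraction of a valid statement, backend execution) could be sketched roughly as follows. This is a minimal illustration, not the project's actual backend: it assumes the model returns SQL inside a Markdown code fence and uses an in-memory SQLite database in place of the local SQL databases. The table `toxicity`, its columns, the helper names `extract_sql` and `run_query`, and the sample row are all hypothetical placeholders.

```python
import re
import sqlite3
from typing import Optional

# Build the fence marker programmatically so this example can itself sit inside a fence.
FENCE = "`" * 3
SQL_BLOCK = re.compile(FENCE + r"(?:sql)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def extract_sql(model_output: str) -> Optional[str]:
    """Return the first read-only statement found in the model output, or None."""
    match = SQL_BLOCK.search(model_output)
    candidate = (match.group(1) if match else model_output).strip()
    # Keep only the first statement; this naive split assumes no ';' inside string literals.
    statement = candidate.split(";")[0].strip()
    # Assumption: only SELECT statements count as valid for an annotation lookup.
    if not statement.lower().startswith("select"):
        return None
    return statement


def run_query(conn: sqlite3.Connection, statement: str) -> list:
    """Execute the extracted statement and return rows as dictionaries."""
    cursor = conn.execute(statement)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Hypothetical stand-in for the local database; schema and row are placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toxicity (iupac_name TEXT, endpoint TEXT)")
conn.execute("INSERT INTO toxicity VALUES ('example-compound', 'example_endpoint')")

# Simulated model reply containing a fenced SQL statement.
model_output = (
    "Here is the query:\n"
    + FENCE + "sql\n"
    + "SELECT endpoint FROM toxicity WHERE iupac_name = 'example-compound';\n"
    + FENCE
)

statement = extract_sql(model_output)
rows = run_query(conn, statement) if statement else []
```

Rejecting anything that is not a single `SELECT` is one way to keep model-generated SQL from modifying the databases; a production setup would likely also open the connection in read-only mode.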
 
74
  1. Input the **IUPAC name** of the emerging contaminant into the Dify chat interface.
75
  2. The AI agent invokes:
76
  - FastGPT knowledge base for **usage classification** via FDA plugin.
77
+ - Local SQL databases for **toxicity endpoints** via GPT-4o-generated SQL queries.
78
  3. Receive the structured output (JSON format) containing usage category, toxicity endpoints, and corresponding AC50 values.
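
For downstream analysis, the JSON output from step 3 can be loaded into a typed record. The sketch below is an assumption about shape only: the README does not specify the JSON keys, so `iupac_name`, `usage_category`, `toxicity_endpoints`, and `ac50_values` are hypothetical names, and the sample payload contains placeholder values rather than real results.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Annotation:
    """One annotated contaminant; all field names are hypothetical."""
    iupac_name: str
    usage_category: Optional[str] = None
    toxicity_endpoints: List[str] = field(default_factory=list)
    ac50_values: Dict[str, float] = field(default_factory=dict)


def parse_annotation(raw: str) -> Annotation:
    """Parse one JSON response, tolerating missing optional keys."""
    data = json.loads(raw)
    return Annotation(
        iupac_name=data["iupac_name"],  # treated as required in this sketch
        usage_category=data.get("usage_category"),
        toxicity_endpoints=list(data.get("toxicity_endpoints", [])),
        ac50_values={k: float(v) for k, v in data.get("ac50_values", {}).items()},
    )


# Placeholder payload, not a real agent response.
raw = json.dumps({
    "iupac_name": "example-compound",
    "usage_category": "example_category",
    "toxicity_endpoints": ["example_endpoint"],
    "ac50_values": {"example_endpoint": 1.0},
})
annotation = parse_annotation(raw)
```

Defaulting the optional fields to empty containers lets batch jobs handle contaminants with no matching database entry without special-casing them.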
79
 
80
  ## Limitations