Update README.md
README.md
CHANGED
@@ -3,14 +3,18 @@ license: cc-by-nc-4.0
 tags:
 - agent
 - chemistry
 ---
 ## Model Overview
-This model is an **Contaminants of Emerging Concern Annotation Intelligent Agent** built on the Dify platform, integrated with the Norman knowledge base, Pubchemlite_exposomics database, and Invitrodb_v4.3 database. It enables high-throughput, large-scale annotation of emerging contaminants, including **usage classification

 ## Model Purpose
 To construct a specialized knowledge database for emerging contaminants usage classification, which combines multi-source chemical/toxicological databases and AI agents. The core goals are:
 1. Realize fast and large-scale annotation of emerging contaminants' usage categories.
-2. Provide efficient query services for toxicity endpoints
 3. Support high-throughput data analysis scenarios for emerging contaminants in environmental chemistry and toxicology research.

 ## Key Definitions
@@ -29,7 +33,8 @@ To construct a specialized knowledge database for emerging contaminants usage cl
 The system integrates three core databases with differentiated deployment strategies:
 1. **Norman Chemical Classification Database**
 - Serves as a relational knowledge base, uploaded and parsed on the FastGPT platform, then embedded into the Dify platform.
-- Optimized classification: Integrated or removed redundant categories, finally categorized chemicals into **9 classes
 2. **Pubchemlite_exposomics & Invitrodb_v4.3 Databases**
 - Deployed in local SQL databases to support efficient local query and invocation.
 - Query workflow: GPT-4o generates SQL statements → Extract valid SQL queries → Backend executes database queries and returns results.
@@ -69,7 +74,7 @@ The system integrates three core databases with differentiated deployment strate
 1. Input the **IUPAC name** of the emerging contaminant into the Dify chat interface.
 2. The AI agent invokes:
 - FastGPT knowledge base for **usage classification** via FDA plugin.
-- Local SQL databases for **toxicity endpoints
 3. Receive the structured output (JSON format) containing usage category, toxicity endpoints, and corresponding AC50 values.

 ## Limitations

 tags:
 - agent
 - chemistry
+- environment
 ---
 ## Model Overview
+This model is an **Contaminants of Emerging Concern Annotation Intelligent Agent** built on the Dify platform, integrated with the Norman knowledge base, Pubchemlite_exposomics database, and Invitrodb_v4.3 database. It enables high-throughput, large-scale annotation of emerging contaminants, including **usage classification** and **toxicity endpoints** by inputting the IUPAC name of the target contaminant.
+<div align="center">
+<img src="figure/pipeline.jpg" alt="Norman_category" width="800">
+</div>

 ## Model Purpose
 To construct a specialized knowledge database for emerging contaminants usage classification, which combines multi-source chemical/toxicological databases and AI agents. The core goals are:
 1. Realize fast and large-scale annotation of emerging contaminants' usage categories.
+2. Provide efficient query services for toxicity endpoints.
 3. Support high-throughput data analysis scenarios for emerging contaminants in environmental chemistry and toxicology research.

 ## Key Definitions

 The system integrates three core databases with differentiated deployment strategies:
 1. **Norman Chemical Classification Database**
 - Serves as a relational knowledge base, uploaded and parsed on the FastGPT platform, then embedded into the Dify platform.
+- Optimized classification: Integrated or removed redundant categories, finally categorized chemicals into **9 classes**.
+<img src="figure/Norman_category.png" alt="Norman_category" width="400">
 2. **Pubchemlite_exposomics & Invitrodb_v4.3 Databases**
 - Deployed in local SQL databases to support efficient local query and invocation.
 - Query workflow: GPT-4o generates SQL statements → Extract valid SQL queries → Backend executes database queries and returns results.
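
Reader's note on the query workflow in this hunk: the "extract valid SQL, then execute" step could look roughly like the sketch below. This is not the project's code. It assumes the model wraps its SQL in a fenced `sql` block, uses an in-memory SQLite database as a stand-in for the local SQL databases, and every name in it (`extract_sql`, `run_query`, the `assay_results` table and its columns, the sample reply) is hypothetical.

```python
import re
import sqlite3
from typing import List, Optional, Tuple

FENCE = "`" * 3  # a Markdown code fence, built this way to avoid nesting fences
SQL_BLOCK = re.compile(
    re.escape(FENCE) + r"sql\s*(.*?)" + re.escape(FENCE),
    re.DOTALL | re.IGNORECASE,
)


def extract_sql(reply: str) -> Optional[str]:
    """Return a single SELECT statement found in the reply, or None."""
    match = SQL_BLOCK.search(reply)
    candidate = (match.group(1) if match else reply).strip().rstrip(";").strip()
    # Conservative check: must start with SELECT and contain no further ';'
    # (this also rejects harmless queries whose string literals contain ';').
    if not candidate.lower().startswith("select") or ";" in candidate:
        return None
    return candidate


def run_query(conn: sqlite3.Connection, sql: str) -> List[Tuple]:
    """Execute an already validated query and return all rows."""
    return conn.execute(sql).fetchall()


# Stand-in database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assay_results (iupac_name TEXT, endpoint TEXT, ac50 REAL)")
conn.execute(
    "INSERT INTO assay_results VALUES ('example-compound', 'example-endpoint', 1.5)"
)

# Hard-coded text standing in for the model's reply (placeholder values).
reply = (
    "Here is the query:\n"
    f"{FENCE}sql\n"
    "SELECT endpoint, ac50 FROM assay_results "
    "WHERE iupac_name = 'example-compound';\n"
    f"{FENCE}"
)

sql = extract_sql(reply)
rows = run_query(conn, sql) if sql else []
print(rows)
```

Rejecting anything that is not a single `SELECT` is one simple way to keep generated SQL read-only; a real deployment would more likely also rely on a read-only database account.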

 1. Input the **IUPAC name** of the emerging contaminant into the Dify chat interface.
 2. The AI agent invokes:
 - FastGPT knowledge base for **usage classification** via FDA plugin.
+- Local SQL databases for **toxicity endpoints** via GPT-4o-generated SQL queries.
 3. Receive the structured output (JSON format) containing usage category, toxicity endpoints, and corresponding AC50 values.
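
Reader's note on step 3: the README does not show the JSON schema, so the sketch below only illustrates how a client might load such a reply into typed records. The field names (`usage_category`, `toxicity_endpoints`, `endpoint`, `ac50`) and the sample payload are assumptions made for illustration, not the agent's actual output.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ToxicityEndpoint:
    endpoint: str  # name of the toxicity endpoint
    ac50: float    # AC50 value as reported for that endpoint


@dataclass
class Annotation:
    usage_category: str
    toxicity_endpoints: List[ToxicityEndpoint]


def parse_annotation(payload: str) -> Annotation:
    """Parse a JSON reply (hypothetical schema) into typed records."""
    data = json.loads(payload)
    endpoints = [
        ToxicityEndpoint(endpoint=item["endpoint"], ac50=float(item["ac50"]))
        for item in data.get("toxicity_endpoints", [])
    ]
    return Annotation(
        usage_category=data["usage_category"],
        toxicity_endpoints=endpoints,
    )


# Placeholder payload, not real agent output.
sample = json.dumps(
    {
        "usage_category": "example-category",
        "toxicity_endpoints": [{"endpoint": "example-endpoint", "ac50": 1.5}],
    }
)

annotation = parse_annotation(sample)
print(annotation.usage_category, len(annotation.toxicity_endpoints))
```

Parsing into dataclasses makes a missing `usage_category` fail early with a `KeyError` instead of surfacing later in downstream analysis.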

 ## Limitations