Update README.md
README.md
CHANGED
@@ -118,6 +118,74 @@ The system is modular, consisting of several Python components:
 - **Visualization: Customize graph appearance in src/visualization/plotting.py.**
 - **Data Storage: Modify src/data_management/storage.py to use different formats or databases.**
 
+## 🚧 Limitations
+
+- **Language**
+Optimized for English. Performance may degrade significantly on other languages.
+
+- **Domain Specificity**
+Achieves best results in AI/ML domains. Adaptation (e.g., domain-specific rules or keywords) is required for other fields.
+
+- **PDF Quality**
+Heavily reliant on clean text extraction. Scanned PDFs, complex layouts, or poor OCR significantly reduce accuracy.
+
+- **Scalability**
+Processing very large corpora (e.g., >10,000 papers) may require significant computational resources or distributed infrastructure.
+
+- **Relationship Nuance**
+Relationships are extracted based on co-occurrence and semantic similarity. Logical or causal connections may not be captured.
+
+- **Temporal Accuracy**
+Depends on accurate publication date extraction from metadata or filenames. Errors may affect timeline analysis.
+
+- **Visualization Clutter**
+Interactive graph visualizations become cluttered and less interpretable when node count exceeds ~1000.
+
+---
+
+## 🌱 Future Work
+
+- **Multi-language Support**
+Integration of multilingual NLP models to support non-English documents.
+
+- **Citation Integration**
+Incorporating citation links and citation graph data into network analysis.
+
+- **ML-based Extraction**
+Training supervised or semi-supervised models to improve concept and relation extraction quality.
+
+- **Advanced Visualizations**
+Implementation of timeline views, dashboards, and alternative graph layouts (e.g., hierarchical, clustered).
+
+- **Improved Temporal Modeling**
+Use of advanced time-series techniques to detect emerging trends and historical shifts.
+
+- **Web Interface**
+A user-friendly UI for uploading documents, viewing visualizations, and downloading results.
+
+- **Knowledge Graph Export**
+Export capabilities for standard knowledge graph formats like RDF, OWL, or JSON-LD.
+
+- **Concept Disambiguation**
+Methods to differentiate between identically named but contextually distinct concepts.
+
+---
+
+## 📋 Citation
+
+If you use **ChronoSense** in your research or projects, please cite the following:
+
+```bibtex
+@software{chronosense2025,
+author = {Abdullah Kocaman (Zayn)},
+title = {ChronoSense: Scientific Concept Analysis and Visualization System},
+year = {2025},
+version = {1.0},
+url = {https://huggingface.co/NextGenC/ChronoSense},
+note = {A system for extracting, analyzing, and visualizing concepts and trends from scientific documents using NLP and Network Analysis}
+}
+
+
 ## 📁 Project Structure
 
 ```bash