Soham Waghmare committed
Commit · bd3d821
1 Parent(s): fcd8ded
Create README.md
README.md ADDED
@@ -0,0 +1,152 @@
# KnowledgeNet

KnowledgeNet is an AI-driven framework that automates research by collecting, processing, and presenting information from online sources. It combines web crawling, natural language processing, and data visualization to deliver research insights.

## Features

- **Automated Web Crawling**: Utilize [Crawl4AI](https://github.com/sohamw03/Crawl4AI) with Playwright for efficient data extraction.
- **Natural Language Processing**: Employ large language models (LLMs) for data analysis and summarization.
- **Interactive Dashboard**: Visualize research findings through an intuitive React/Next.js frontend.
- **Scalability**: Implement Celery with message brokers for distributed task management.
- **Cloud Integration**: Deploy using AWS services for robust and scalable infrastructure.

## Architecture Overview

```mermaid
graph TD
User["User"] -->|Research Query| API_Gateway["AWS API Gateway"]
API_Gateway --> Lambda["AWS Lambda"]
Lambda --> SQS["AWS SQS (Task Queue)"]
SQS --> Celery["Celery Workers (Distributed Tasks)"]
Celery -->|Processes Tasks| DynamoDB["AWS DynamoDB"]
Celery -->|Uploads Files| S3["AWS S3 (File Storage)"]
Celery -->|Triggers Notifications| SNS["AWS SNS (Notifications)"]
Celery -->|Logs Metadata| CloudWatch["AWS CloudWatch"]

DynamoDB -->|Provides Data| Lambda
S3 -->|Serves Content| CloudFront["AWS CloudFront (CDN)"]
CloudFront -->|Delivers Content| User
CloudWatch --> DevOps["DevOps Monitoring"]

subgraph "AWS Infrastructure"
API_Gateway
Lambda
SQS
Celery
DynamoDB
S3
SNS
CloudWatch
CloudFront
end
```

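To make the request flow in the diagram easier to follow, here is a minimal local sketch in Python. It is not KnowledgeNet code: the queue, table, and notification list are in-memory stand-ins for SQS, DynamoDB, and SNS, and every function name (`submit_query`, `run_worker`, `fetch_result`) is hypothetical.

```python
import queue

# In-memory stand-ins for the services in the diagram. These are assumptions
# for illustration only; none of these names come from the KnowledgeNet code.
task_queue = queue.Queue()   # plays the role of SQS
results_table = {}           # plays the role of DynamoDB
notifications = []           # plays the role of SNS


def submit_query(query_id, text):
    """API Gateway + Lambda step: accept a research query and enqueue it."""
    task_queue.put({"id": query_id, "query": text})


def run_worker():
    """Celery worker step: drain the queue, store results, emit notifications."""
    while not task_queue.empty():
        task = task_queue.get()
        # Placeholder for the real crawl-and-summarize work.
        results_table[task["id"]] = {"query": task["query"], "status": "done"}
        notifications.append(f"task {task['id']} finished")
        task_queue.task_done()


def fetch_result(query_id):
    """Lambda read path: return a finished result, or None if not ready."""
    return results_table.get(query_id)


submit_query("q1", "state of solid-state batteries")
run_worker()
print(fetch_result("q1"))
```

The sketch runs everything in one process; in the architecture above, the enqueue and worker steps would run on separate services and the worker would also write files and logs.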
## Installation

To set up the KnowledgeNet environment, follow these steps:

1. **Clone the Repository:**

```bash
git clone https://github.com/sohamw03/knowledge_net.git
cd knowledge_net
```

2. **Backend Setup:**

Navigate to the backend directory:

```bash
cd backend
```

Create a virtual environment:

```bash
python -m venv env
source env/bin/activate  # On Windows, use 'env\Scripts\activate'
```

Install the required packages:

```bash
pip install -r requirements.txt
```

3. **Frontend Setup:**

Navigate to the frontend directory:

```bash
cd ../frontend
```

Install the dependencies:

```bash
npm install
```

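If `pip install` in the backend setup appears to install into the system Python, it can help to confirm that the virtual environment is actually active. The following is a small optional check, not part of the repository; the function name `in_virtualenv` is hypothetical, and the check assumes an environment created with `python -m venv` as in step 2.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

Run it with the same `python` you use for `pip`; if it reports `False`, activate the environment again before installing the requirements.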
## Usage

1. **Start the Backend:**

Ensure you're in the backend directory and the virtual environment is activated.

Run the Flask application:

```bash
flask run
```

2. **Start the Frontend:**

In a new terminal, navigate to the frontend directory.

Start the development server:

```bash
npm run dev
```

3. **Access the Application:**

Open your browser and navigate to `http://localhost:3000` to interact with KnowledgeNet.

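If the page does not load, a quick way to see which of the two servers is missing is to probe their ports. The sketch below is an optional helper, not part of KnowledgeNet; `is_port_open` is a hypothetical name. Port 3000 comes from the step above, while port 5000 is Flask's default and is an assumption, since the backend may be configured to listen elsewhere.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 3000 is the frontend port named above; 5000 is Flask's default port
    # and is an assumption here.
    for name, port in (("frontend", 3000), ("backend", 5000)):
        state = "up" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the application behind it is healthy.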
## Contributing

We welcome contributions to enhance KnowledgeNet. To contribute:

1. Fork the repository.
2. Create a new branch:

```bash
git checkout -b feature/YourFeatureName
```

3. Commit your changes:

```bash
git commit -m 'Add some feature'
```

4. Push to the branch:

```bash
git push origin feature/YourFeatureName
```

5. Open a pull request detailing your changes.

## License

This project is licensed under the Apache-2.0 License. See the `LICENSE` file for more details.

## Acknowledgements

* Crawl4AI for web crawling capabilities.
* Playwright for browser automation.
* Celery for distributed task management.
* AWS for cloud infrastructure services.