Soham Waghmare committed on
Commit bd3d821 · 1 Parent(s): fcd8ded

Create README.md

Files changed (1)
  1. README.md +152 -0
README.md ADDED
@@ -0,0 +1,152 @@
+ # KnowledgeNet
+
+ KnowledgeNet is an AI-driven framework that automates research by collecting, processing, and presenting information from online sources. It combines web crawling, natural language processing, and data visualization to deliver research insights.
+
+ ## Features
+
+ - **Automated Web Crawling**: Uses [Crawl4AI](https://github.com/sohamw03/Crawl4AI) with Playwright for data extraction.
+ - **Natural Language Processing**: Uses large language models (LLMs) for data analysis and summarization.
+ - **Interactive Dashboard**: Visualizes research findings in a React/Next.js frontend.
+ - **Scalability**: Uses Celery with a message broker for distributed task management.
+ - **Cloud Integration**: Deploys on AWS services for scalable infrastructure.
+
+ ## Architecture Overview
+
+ ```mermaid
+ graph TD
+     User["User"] -->|Research Query| API_Gateway["AWS API Gateway"]
+     API_Gateway --> Lambda["AWS Lambda"]
+     Lambda --> SQS["AWS SQS (Task Queue)"]
+     SQS --> Celery["Celery Workers (Distributed Tasks)"]
+     Celery -->|Processes Tasks| DynamoDB["AWS DynamoDB"]
+     Celery -->|Uploads Files| S3["AWS S3 (File Storage)"]
+     Celery -->|Triggers Notifications| SNS["AWS SNS (Notifications)"]
+     Celery -->|Logs Metadata| CloudWatch["AWS CloudWatch"]
+
+     DynamoDB -->|Provides Data| Lambda
+     S3 -->|Serves Content| CloudFront["AWS CloudFront (CDN)"]
+     CloudFront -->|Delivers Content| User
+     CloudWatch --> DevOps["DevOps Monitoring"]
+
+     subgraph "AWS Infrastructure"
+         API_Gateway
+         Lambda
+         SQS
+         Celery
+         DynamoDB
+         S3
+         SNS
+         CloudWatch
+         CloudFront
+     end
+ ```
+
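The request flow in the diagram can be approximated in a single process with the Python standard library. This is a minimal sketch, not KnowledgeNet's actual code: `queue.Queue` stands in for SQS, a thread for a Celery worker, and plain dictionaries for DynamoDB and S3. The names `submit_query`, `worker`, `run_demo`, and the record fields are all hypothetical.

```python
import queue
import threading
import uuid

task_queue = queue.Queue()  # stand-in for the SQS task queue
results = {}                # stand-in for the DynamoDB table
files = {}                  # stand-in for the S3 bucket


def submit_query(query):
    """Play the role of API Gateway + Lambda: enqueue a task, return its id."""
    task_id = str(uuid.uuid4())
    results[task_id] = {"query": query, "status": "queued"}
    task_queue.put(task_id)
    return task_id


def worker():
    """Play the role of a Celery worker: handle tasks until a None sentinel."""
    while True:
        task_id = task_queue.get()
        if task_id is None:
            break
        record = results[task_id]
        # Placeholder for the real crawling and LLM summarization step.
        report_key = f"{task_id}.txt"
        files[report_key] = f"Report for: {record['query']}"
        record["status"] = "done"
        record["report_key"] = report_key


def run_demo(query):
    """Run one query through the simulated pipeline and return its outputs."""
    thread = threading.Thread(target=worker)
    thread.start()
    task_id = submit_query(query)
    task_queue.put(None)  # tell the worker to stop after the queued task
    thread.join()
    record = results[task_id]
    return record, files[record["report_key"]]


if __name__ == "__main__":
    record, report = run_demo("history of web crawlers")
    print(record["status"], "->", report)
```

The point of the queue in the middle is that the caller gets a task id back immediately while the slow work happens elsewhere; in the diagram, SQS and Celery fill that role.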
+ ## Installation
+
+ To set up the KnowledgeNet environment, follow these steps:
+
+ 1. **Clone the Repository:**
+
+    ```bash
+    git clone https://github.com/sohamw03/knowledge_net.git
+    cd knowledge_net
+    ```
+
+ 2. **Backend Setup:**
+
+    Navigate to the backend directory:
+
+    ```bash
+    cd backend
+    ```
+
+    Create and activate a virtual environment:
+
+    ```bash
+    python -m venv env
+    source env/bin/activate  # On Windows, use 'env\Scripts\activate'
+    ```
+
+    Install the required packages:
+
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. **Frontend Setup:**
+
+    Navigate to the frontend directory:
+
+    ```bash
+    cd ../frontend
+    ```
+
+    Install the dependencies:
+
+    ```bash
+    npm install
+    ```
+
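Before running the steps above, it can help to confirm that the tools they call are available. The script below is a small sketch under assumptions: the tool list (`git`, `npm`) is inferred from the commands in this section, and the minimum Python version of 3.8 is a guess, since the README does not state one. `check_prerequisites` is a hypothetical helper, not part of the repository.

```python
import shutil
import sys

REQUIRED_TOOLS = ("git", "npm")  # inferred from the install commands


def check_prerequisites(tools=REQUIRED_TOOLS, min_python=(3, 8)):
    """Return a dict mapping each prerequisite name to True or False.

    min_python defaults to 3.8, which is an assumption, not a documented
    requirement of KnowledgeNet.
    """
    label = f"python>={min_python[0]}.{min_python[1]}"
    status = {label: sys.version_info >= min_python}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        status[tool] = shutil.which(tool) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{'ok' if ok else 'MISSING':8}{name}")
```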
+ ## Usage
+
+ 1. **Start the Backend:**
+
+    Ensure you're in the backend directory and the virtual environment is activated.
+
+    Run the Flask application:
+
+    ```bash
+    flask run
+    ```
+
+ 2. **Start the Frontend:**
+
+    In a new terminal, navigate to the frontend directory.
+
+    Start the development server:
+
+    ```bash
+    npm run dev
+    ```
+
+ 3. **Access the Application:**
+
+    Open your browser and navigate to `http://localhost:3000` to interact with KnowledgeNet.
+
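Once both servers are started, a quick reachability check can confirm they are listening. This is a rough sketch using only the standard library. The frontend URL comes from the step above; the backend URL assumes Flask's default port 5000, which this README does not confirm. `is_up` and `wait_until_up` are hypothetical helper names.

```python
import http.client
import time
import urllib.error
import urllib.request

# Bypass any configured proxy; the targets are local development servers.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_up(url, timeout=2.0):
    """Return True if the server at `url` answers with any HTTP response."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_up(url, attempts=10, delay=1.0):
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_up(url):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    targets = (
        ("frontend", "http://localhost:3000"),
        ("backend", "http://localhost:5000"),  # assumed Flask default port
    )
    for name, url in targets:
        print(name, "up" if is_up(url) else "not reachable")
```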
+ ## Contributing
+
+ We welcome contributions to enhance KnowledgeNet. To contribute:
+
+ 1. Fork the repository.
+ 2. Create a new branch:
+
+    ```bash
+    git checkout -b feature/YourFeatureName
+    ```
+
+ 3. Commit your changes:
+
+    ```bash
+    git commit -m 'Add some feature'
+    ```
+
+ 4. Push to the branch:
+
+    ```bash
+    git push origin feature/YourFeatureName
+    ```
+
+ 5. Open a pull request detailing your changes.
+
+ ## License
+
+ This project is licensed under the Apache-2.0 License. See the `LICENSE` file for more details.
+
+ ## Acknowledgements
+
+ * Crawl4AI for web crawling capabilities.
+ * Playwright for browser automation.
+ * Celery for distributed task management.
+ * AWS for cloud infrastructure services.
+
+