Shinegupta committed · Commit c9c2688 · verified · 1 Parent(s): 514f898

Update README.md

Files changed (1)
  1. README.md +292 -13
README.md CHANGED
@@ -1,13 +1,292 @@
- ---
- title: Fetii AI Assistant
- emoji: πŸƒ
- colorFrom: gray
- colorTo: indigo
- sdk: gradio
- sdk_version: 5.46.1
- app_file: app.py
- pinned: false
- license: mit
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Fetii AI Assistant
+
+ A Streamlit-based analytics dashboard and conversational AI assistant for analyzing Austin rideshare trip data.
+
+ ## Overview
+
+ Fetii AI Assistant combines data processing, interactive visualizations, and natural-language query handling to provide insights into Austin rideshare operations. It processes trip data to identify peak hours, popular locations, and group size distributions, and offers a chat interface for exploring the data.
+
+ ## Architecture
+
+ ```mermaid
+ graph TB
+     A[User Interface] --> B[Streamlit Frontend]
+     B --> C[Main Application]
+     C --> D[Data Processor]
+     C --> E[Chatbot Engine]
+     C --> F[Visualizations Module]
+
+     D --> G[CSV Data Source]
+     D --> H[Sample Data Generator]
+
+     E --> I[Query Parser]
+     E --> J[Response Generator]
+     E --> K[Location Matcher]
+
+     F --> L[Plotly Charts]
+     F --> M[D3.js Network Viz]
+     F --> N[Interactive Heatmaps]
+
+     style A fill:#e1f5fe
+     style B fill:#f3e5f5
+     style C fill:#fff3e0
+     style D fill:#e8f5e8
+     style E fill:#fce4ec
+     style F fill:#f1f8e9
+ ```
+
+ ## System Components
+
+ ### Core Modules
+
+ ```mermaid
+ classDiagram
+     class DataProcessor {
+         +load_and_process_data()
+         +get_quick_insights()
+         +get_location_stats()
+         +get_time_patterns()
+         +query_data()
+         -_clean_data()
+         -_extract_temporal_features()
+         -_extract_location_features()
+     }
+
+     class FetiiChatbot {
+         +process_query()
+         +get_conversation_history()
+         +clear_history()
+         -_parse_query()
+         -_generate_response()
+         -_fuzzy_search_location()
+     }
+
+     class Visualizations {
+         +create_visualizations()
+         +create_hourly_chart()
+         +create_group_size_chart()
+         +create_time_heatmap()
+         +create_distance_analysis()
+     }
+
+     DataProcessor --> FetiiChatbot : uses
+     DataProcessor --> Visualizations : feeds data
+     FetiiChatbot --> Visualizations : requests charts
+ ```
+
+ ## Data Flow
+
+ ```mermaid
+ sequenceDiagram
+     participant U as User
+     participant S as Streamlit UI
+     participant C as Chatbot
+     participant D as Data Processor
+     participant V as Visualizations
+
+     U->>S: Asks question about rideshare data
+     S->>C: Forward user query
+     C->>C: Parse query intent and parameters
+     C->>D: Request relevant data analysis
+     D->>D: Process data and calculate insights
+     D-->>C: Return analysis results
+     C->>C: Generate natural language response
+     C-->>S: Return formatted response
+     S->>V: Request updated visualizations
+     V->>D: Get processed data
+     D-->>V: Return visualization data
+     V-->>S: Return interactive charts
+     S-->>U: Display response and updated charts
+ ```
+
+ ## Features
+
+ ### 1. Data Processing Engine
+ - **CSV Data Loading**: Robust parsing of rideshare trip data
+ - **Data Cleaning**: Handles missing values, invalid entries, and data standardization
+ - **Feature Engineering**: Extracts temporal patterns, location categories, and group classifications
+ - **Real-time Analytics**: Calculates insights on demand for a responsive user experience
+
+ ### 2. Conversational AI Interface
+ - **Natural Language Processing**: Understands complex queries about locations, times, and patterns
+ - **Context-Aware Responses**: Maintains conversation history and provides relevant follow-up suggestions
+ - **Fuzzy Matching**: Intelligent location search with partial name matching
+ - **Query Intent Recognition**: Identifies whether users want statistics, comparisons, or general information
+
+ ### 3. Interactive Visualizations
+ - **Peak Hour Analysis**: Dynamic bar charts showing trip distribution across hours
+ - **Group Size Patterns**: Pie charts and breakdowns of passenger group sizes
+ - **Location Popularity**: Horizontal bar charts of top pickup and dropoff spots
+ - **Time Heatmaps**: Day-hour heatmaps revealing temporal patterns
+ - **Network Diagrams**: D3.js-powered flow visualizations showing location connections
+
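The peak-hour chart and the day-hour heatmap both reduce to counting trips by time bucket. The sketch below shows one way to compute those counts with the standard library only; the sample timestamps and the function names `hourly_counts` and `day_hour_matrix` are illustrative assumptions, not the project's actual code, which presumably works on pandas data and renders with Plotly.

```python
from collections import Counter
from datetime import datetime

# Made-up trip start times for illustration; real values would come from
# the "Trip Date and Time" column of the CSV.
trip_times = [
    datetime(2025, 9, 5, 22, 15),
    datetime(2025, 9, 5, 22, 40),
    datetime(2025, 9, 5, 23, 5),
    datetime(2025, 9, 6, 22, 30),
    datetime(2025, 9, 7, 14, 0),
]


def hourly_counts(times):
    """Trips per hour of day (0-23): the data behind a peak-hour bar chart."""
    counts = Counter(t.hour for t in times)
    return [counts.get(hour, 0) for hour in range(24)]


def day_hour_matrix(times):
    """7 x 24 grid of trip counts; rows are weekdays with Monday = 0."""
    grid = [[0] * 24 for _ in range(7)]
    for t in times:
        grid[t.weekday()][t.hour] += 1
    return grid


per_hour = hourly_counts(trip_times)
# Hour with the most trips; ties resolve to the earliest hour.
peak_hour = max(range(24), key=per_hour.__getitem__)
heatmap = day_hour_matrix(trip_times)
```

The 24-element list maps directly onto bar heights, and the 7 x 24 grid can be passed as the `z` values of a heatmap trace.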
+ ### 4. Modern UI/UX Design
+ - **Clean Interface**: Professional design with Inter font family and optimized spacing
+ - **Responsive Layout**: Adapts to different screen sizes and devices
+ - **Real-time Updates**: Live data refresh and interactive chart updates
+ - **Accessibility**: High contrast ratios and semantic markup for screen readers
+
+ ## Query Types Supported
+
+ The chatbot recognizes and responds to several query patterns:
+
+ ```mermaid
+ mindmap
+   root((Query Types))
+     Location Stats
+       Specific venue analysis
+       Pickup vs dropoff comparison
+       Popular destination ranking
+     Time Patterns
+       Peak hours identification
+       Day-of-week trends
+       Seasonal variations
+     Group Analysis
+       Size distribution
+       Large group behavior
+       Average party metrics
+     General Insights
+       Trip summaries
+       Overall statistics
+       Data overview
+ ```
+
+ ## Technical Implementation
+
+ ### Query Processing Pipeline
+
+ ```mermaid
+ flowchart LR
+     A[User Input] --> B[Text Preprocessing]
+     B --> C[Pattern Matching]
+     C --> D[Parameter Extraction]
+     D --> E[Intent Classification]
+     E --> F[Data Query]
+     F --> G[Response Generation]
+     G --> H[Format Output]
+     H --> I[Display Result]
+
+     style A fill:#bbdefb
+     style E fill:#c8e6c9
+     style G fill:#ffcdd2
+     style I fill:#f8bbd9
+ ```
+
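The first stages of this pipeline (preprocessing, pattern matching, parameter extraction, intent classification) can be sketched with regular expressions. The patterns, intent labels, and function names below are hypothetical and chosen to mirror the four query types listed above; the project's real patterns are said to live in `config.py` and may differ.

```python
import re

# Hypothetical intent patterns, checked in order; the first match wins.
INTENT_PATTERNS = {
    "time_patterns": re.compile(r"\b(peak|busiest|hours?|when|weekdays?|weekends?)\b"),
    "group_analysis": re.compile(r"\b(groups?|passengers?|party|riders?)\b"),
    "location_stats": re.compile(r"\b(pickups?|dropoffs?|locations?|destinations?|where)\b"),
}


def preprocess(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def extract_parameters(text):
    """Pull simple numeric parameters such as 'top 5' or 'groups of 6'."""
    params = {}
    top = re.search(r"\btop (\d+)\b", text)
    if top:
        params["top_n"] = int(top.group(1))
    size = re.search(r"\bgroups? of (\d+)\b", text)
    if size:
        params["group_size"] = int(size.group(1))
    return params


def classify_intent(text):
    """Return the first intent whose pattern matches, else a general fallback."""
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            return intent
    return "general_insights"


def parse_query(raw):
    """Run the front half of the pipeline on one user message."""
    text = preprocess(raw)
    return {"intent": classify_intent(text), "params": extract_parameters(text)}
```

For example, `parse_query("Show the top 5 pickup locations!")` yields the `location_stats` intent with `top_n` set to 5. Because the first matching pattern wins, the order of `INTENT_PATTERNS` decides how mixed queries are classified.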
+ ### Data Processing Workflow
+
+ ```mermaid
+ graph TD
+     A[Raw CSV Data] --> B[Data Validation]
+     B --> C[Missing Value Handling]
+     C --> D[Feature Extraction]
+     D --> E[Temporal Processing]
+     D --> F[Location Processing]
+     D --> G[Group Classification]
+     E --> H[Time Categories]
+     F --> I[Address Parsing]
+     G --> J[Size Buckets]
+     H --> K[Insights Cache]
+     I --> K
+     J --> K
+     K --> L[API Endpoints]
+ ```
+
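The validation, temporal processing, and group classification steps in this workflow could look roughly like the following per-row sketch. The timestamp format (`%m/%d/%y %H:%M`), the time-of-day boundaries, and the group size buckets are all assumptions made for illustration; the README does not specify them.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_category(hour):
    """Coarse time-of-day label; the boundaries are illustrative."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 22:
        return "evening"
    return "night"


def size_bucket(passengers):
    """Group size label; the thresholds are illustrative."""
    if passengers <= 4:
        return "small"
    if passengers <= 8:
        return "medium"
    return "large"


def process_row(row, fmt="%m/%d/%y %H:%M"):
    """Return an enriched copy of one raw record, or None if it fails validation."""
    try:
        when = datetime.strptime(row["Trip Date and Time"].strip(), fmt)
        passengers = int(row["Total Passengers"])
    except (KeyError, ValueError, TypeError, AttributeError):
        return None  # missing or malformed field: drop the row
    if passengers < 1:
        return None
    return {
        **row,
        "hour": when.hour,
        "day_of_week": DAY_NAMES[when.weekday()],
        "time_category": time_category(when.hour),
        "size_bucket": size_bucket(passengers),
    }
```

Rows that fail validation are dropped rather than repaired here; imputing missing values would be a reasonable alternative for the "Missing Value Handling" step.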
+ ## File Structure
+
+ ```
+ fetii-ai/
+ ├── main.py               # Main Streamlit application
+ ├── data_processor.py     # Core data processing logic
+ ├── chatbot_engine.py     # Natural language processing
+ ├── visualizations.py     # Chart generation and styling
+ ├── config.py             # Configuration and constants
+ ├── utils.py              # Utility functions
+ ├── requirements.txt      # Python dependencies
+ └── README.md             # This documentation
+ ```
+
+ ## Key Technologies
+
+ - **Streamlit**: Web application framework for rapid prototyping
+ - **Plotly**: Interactive visualization library with modern styling
+ - **D3.js**: Advanced network and flow diagram generation
+ - **Pandas**: Data manipulation and analysis
+ - **NumPy**: Numerical computing for statistical operations
+ - **Regular Expressions**: Pattern matching for query parsing
+
+ ## Installation & Setup
+
+ ```bash
+ # Clone the repository
+ git clone <repository-url>
+ cd fetii-ai
+
+ # Install dependencies
+ pip install -r requirements.txt
+
+ # Run the application
+ streamlit run main.py
+ ```
+
+ ## Configuration Options
+
+ The system is configured through `config.py`:
+
+ - **Color Schemes**: Modern blue-based palette with accessibility considerations
+ - **Chart Settings**: Consistent styling across all visualizations
+ - **Query Patterns**: Customizable regex patterns for intent recognition
+ - **Data Thresholds**: Adjustable limits for analysis and filtering
+ - **UI Components**: Font families, spacing, and responsive breakpoints
+
+ ## Data Schema
+
+ Expected CSV format:
+ ```
+ Trip ID, Booking User ID, Pick Up Latitude, Pick Up Longitude,
+ Drop Off Latitude, Drop Off Longitude, Pick Up Address,
+ Drop Off Address, Trip Date and Time, Total Passengers
+ ```
+
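A loader can check for these ten columns before any processing starts. The sketch below uses the standard `csv` module so that it runs without extra dependencies; `read_trips` is a hypothetical helper, and the sample row (IDs, coordinates, addresses, timestamp format) is made up for illustration.

```python
import csv
import io

EXPECTED_COLUMNS = [
    "Trip ID", "Booking User ID", "Pick Up Latitude", "Pick Up Longitude",
    "Drop Off Latitude", "Drop Off Longitude", "Pick Up Address",
    "Drop Off Address", "Trip Date and Time", "Total Passengers",
]


def read_trips(handle):
    """Read trip rows as dicts, failing early if a schema column is missing."""
    # skipinitialspace tolerates headers written as "Trip ID, Booking User ID, ...".
    reader = csv.DictReader(handle, skipinitialspace=True)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# Made-up single-row sample; quoted fields keep commas inside addresses intact.
sample = io.StringIO(
    ", ".join(EXPECTED_COLUMNS) + "\n"
    '1001, 42, 30.2672, -97.7431, 30.2849, -97.7341, '
    '"600 Congress Ave, Austin", "2100 Guadalupe St, Austin", 9/5/25 22:15, 9\n'
)
trips = read_trips(sample)
```

All values come back as strings, so numeric and datetime conversion is left to a later cleaning step.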
+ ## Advanced Features
+
+ ### Fuzzy Location Matching
+ The system implements intelligent location search that handles:
+ - Exact name matches
+ - Partial string matching
+ - Word-based similarity
+ - Common abbreviation recognition
+
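The four matching strategies listed above can be tried in order of strictness. This is a minimal sketch, not the project's `_fuzzy_search_location`: the abbreviation table, the 0.5 word-overlap threshold, and the final `difflib` typo fallback are assumptions added for illustration.

```python
import difflib

# Hypothetical abbreviation table; a real one would be tuned to Austin venues.
ABBREVIATIONS = {
    "st": "street",
    "ave": "avenue",
    "blvd": "boulevard",
    "ut": "university of texas",
}


def normalize(name):
    """Lowercase, drop periods and commas, expand known abbreviations."""
    words = name.lower().replace(".", "").replace(",", " ").split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def fuzzy_search_location(query, known_locations, cutoff=0.6):
    """Return the best-matching known location name, or None."""
    q = normalize(query)
    if not q:
        return None
    normalized = {normalize(loc): loc for loc in known_locations}

    # 1. Exact match (after abbreviation expansion).
    if q in normalized:
        return normalized[q]

    # 2. Partial match: the query is a substring of a location name.
    for norm, original in normalized.items():
        if q in norm:
            return original

    # 3. Word-based similarity: Jaccard overlap of the word sets.
    q_words = set(q.split())
    best, best_score = None, 0.0
    for norm, original in normalized.items():
        words = set(norm.split())
        score = len(q_words & words) / len(q_words | words)
        if score > best_score:
            best, best_score = original, score
    if best_score >= 0.5:
        return best

    # 4. Character-level similarity as a last resort, to catch typos.
    close = difflib.get_close_matches(q, list(normalized), n=1, cutoff=cutoff)
    return normalized[close[0]] if close else None
```

With the locations `["West Campus", "6th Street", "The Domain"]`, the query `"6th st"` resolves through abbreviation expansion, `"domain"` through partial matching, and `"west campas"` through the typo fallback.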
+ ### Context-Aware Responses
+ Chatbot responses adapt based on:
+ - Previous conversation history
+ - Query complexity level
+ - Available data completeness
+ - User expertise inference
+
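Adapting to previous conversation history requires storing each turn along with whatever parameters were extracted from it. The class below is a hypothetical sketch: `get_conversation_history` and `clear_history` echo the method names in the class diagram, while `ConversationHistory`, `add_turn`, `recall`, and the `max_turns` limit are assumptions.

```python
from collections import deque


class ConversationHistory:
    """Bounded log of chat turns; an illustrative helper, not the project's class."""

    def __init__(self, max_turns=20):
        # deque(maxlen=...) drops the oldest turn automatically once full.
        self._turns = deque(maxlen=max_turns)

    def add_turn(self, query, response, params=None):
        """Record one exchange together with the parameters parsed from it."""
        self._turns.append(
            {"query": query, "response": response, "params": dict(params or {})}
        )

    def get_conversation_history(self):
        """Return the stored turns, oldest first, as a new list."""
        return list(self._turns)

    def clear_history(self):
        self._turns.clear()

    def recall(self, key):
        """Most recent value stored under `key`, or None if never mentioned."""
        for turn in reversed(self._turns):
            if key in turn["params"]:
                return turn["params"][key]
        return None
```

A follow-up such as "what about dropoffs there?" could then be resolved by calling `recall("location")` to find the venue named in an earlier turn.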
+ ### Performance Optimizations
+ - Data caching for repeated queries
+ - Efficient pandas operations
+ - Lazy loading of visualizations
+ - Memory-conscious data processing
+
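Caching repeated queries means computing each aggregate once and reusing the result. The sketch below uses `functools.lru_cache` from the standard library; in a Streamlit app the `st.cache_data` decorator typically fills the same role. The `TRIPS` tuple and `location_stats` function are made-up stand-ins for the processed trip table and its queries.

```python
from functools import lru_cache

# Made-up stand-in for the processed trip table:
# (pickup location, hour of day, passengers).
TRIPS = (
    ("West Campus", 22, 9),
    ("West Campus", 23, 6),
    ("The Domain", 14, 4),
)


@lru_cache(maxsize=128)
def location_stats(location):
    """Return (trip count, average group size) for one pickup location.

    The result is computed once per distinct location name and then served
    from the cache. A tuple is returned so callers cannot mutate the cached value.
    """
    sizes = [passengers for name, _, passengers in TRIPS if name == location]
    if not sizes:
        return (0, 0.0)
    return (len(sizes), sum(sizes) / len(sizes))
```

If the underlying data is reloaded, `location_stats.cache_clear()` discards stale results.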
+ ## Future Enhancements
+
+ - Machine learning predictions for trip demand
+ - Real-time data streaming integration
+ - Advanced geographic clustering
+ - Multi-city dataset support
+ - Export capabilities for reports
+ - API endpoints for external integration
+
+ ## Contributing
+
+ When contributing to this project:
+ 1. Follow the established code structure and naming conventions
+ 2. Update visualizations to maintain consistent styling
+ 3. Test query patterns thoroughly with various input formats
+ 4. Ensure responsive design principles are maintained
+ 5. Document any new configuration options
+
+ ## License
+
+ This project is designed for analytics and insights generation. Ensure compliance with data privacy regulations when processing real rideshare data.