Arpit-Bansal committed on
Commit
6ad0ad0
·
1 Parent(s): abefa1d

update readme for HF

Files changed (1)
  1. README.md +8 -505
README.md CHANGED
@@ -1,508 +1,11 @@
1
- # ML Service - Metro Train Scheduling System
2
-
3
- [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
4
- [![FastAPI](https://img.shields.io/badge/FastAPI-0.104.1-green.svg)](https://fastapi.tiangolo.com/)
5
-
6
- A comprehensive machine learning and optimization service for metro train scheduling, featuring synthetic data generation, multi-objective optimization, and a RESTful API for integration.
7
-
8
- ---
9
-
10
- ## 🎯 Project Overview
11
-
12
- This repository maintains **two main services**:
13
-
14
- ### 1. **DataService** - Data Generation & Scheduling API
15
- FastAPI-based service that generates synthetic metro data and optimizes daily train schedules.
16
-
17
- ### 2. **Optimization Algorithms** (greedyOptim)
18
- Multiple optimization algorithms for trainset scheduling including genetic algorithms, particle swarm, simulated annealing, and OR-Tools integration.
19
-
20
- ### 3. **Self-Training ML Engine** (SelfTrainService) - *Coming Soon*
21
- Adaptive machine learning engine that learns from historical schedules and improves over time.
22
-
23
- ---
24
-
25
- ## 🚀 Quick Start
26
-
27
- ### Installation
28
-
29
- ```bash
30
- # Navigate to project
31
- cd /home/arpbansal/code/sih2025/mlservice
32
-
33
- # Install dependencies
34
- pip install -r requirements.txt
35
- ```
36
-
37
- ### Run Demo
38
-
39
- ```bash
40
- # Comprehensive demo with full output
41
- python demo_schedule.py
42
-
43
- # Quick examples
44
- python quickstart.py
45
- ```
46
-
47
- ### Start API Server
48
-
49
- ```bash
50
- # Start FastAPI service
51
- python run_api.py
52
-
53
- # Access at:
54
- # - http://localhost:8000/docs (Interactive API docs)
55
- # - http://localhost:8000/api/v1/schedule/example (Example schedule)
56
- ```
57
-
58
- ---
59
-
60
- ## 📚 Key Features
61
-
62
- ✅ **25-40 trainsets** with realistic health statuses (fully healthy, partial, unavailable)
63
- ✅ **Single bidirectional metro line** with 25 stations (Aluva-Pettah)
64
- ✅ **Full-day scheduling**: 5:00 AM to 11:00 PM operation
65
- ✅ **Real-world constraints**:
66
- - Maintenance windows and job cards
67
- - Fitness certificates (rolling stock, signalling, telecom)
68
- - Branding/advertising priorities
69
- - Mileage balancing across fleet
70
- ✅ **Multi-objective optimization** with configurable weights
71
- ✅ **RESTful API** with OpenAPI/Swagger documentation
72
- ✅ **Multiple optimization algorithms** (GA, PSO, SA, CMA-ES, NSGA-II, OR-Tools)
73
-
74
- ---
75
-
76
- ## 📁 Project Structure
77
-
78
- ```
79
- mlservice/
80
- ├── DataService/ # 🆕 FastAPI data generation & scheduling
81
- │ ├── api.py # REST API endpoints
82
- │ ├── metro_models.py # Pydantic data models
83
- │ ├── metro_data_generator.py # Synthetic data generation
84
- │ ├── schedule_optimizer.py # Schedule optimization engine
85
- │ └── README.md # Detailed DataService docs
86
-
87
- ├── greedyOptim/ # Optimization algorithms
88
- │ ├── scheduler.py # Main scheduling interface
89
- │ ├── genetic_algorithm.py # Genetic algorithm
90
- │ ├── advanced_optimizers.py # CMA-ES, PSO, SA
91
- │ ├── hybrid_optimizers.py # Multi-objective, ensemble
92
- │ ├── evaluator.py # Fitness evaluation
93
- │ └── ...
94
-
95
- ├── SelfTrainService/ # ML training service (future)
96
-
97
- ├── demo_schedule.py # 🆕 Comprehensive demo
98
- ├── quickstart.py # 🆕 Quick examples
99
- ├── run_api.py # 🆕 API startup script
100
- ├── requirements.txt # Dependencies
101
- ├── Dockerfile # 🆕 Docker container
102
- └── docker-compose.yml # 🆕 Docker compose
103
- ```
104
-
105
- ---
106
-
107
- ## 📊 Schedule Output Example
108
-
109
- The system generates comprehensive daily schedules:
110
-
111
- ```json
112
- {
113
- "schedule_id": "KMRL-2025-10-25-DAWN",
114
- "generated_at": "2025-10-24T23:45:00+05:30",
115
- "valid_from": "2025-10-25T05:00:00+05:30",
116
- "valid_until": "2025-10-25T23:00:00+05:30",
117
- "depot": "Muttom_Depot",
118
-
119
- "trainsets": [
120
- {
121
- "trainset_id": "TS-001",
122
- "status": "REVENUE_SERVICE",
123
- "priority_rank": 1,
124
- "assigned_duty": "DUTY-A1",
125
- "service_blocks": [
126
- {
127
- "block_id": "BLK-001",
128
- "departure_time": "05:30",
129
- "origin": "Aluva",
130
- "destination": "Pettah",
131
- "trip_count": 3,
132
- "estimated_km": 96
133
- }
134
- ],
135
- "daily_km_allocation": 224,
136
- "cumulative_km": 145620,
137
- "fitness_certificates": {...},
138
- "job_cards": {...},
139
- "branding": {...},
140
- "readiness_score": 0.98
141
- }
142
- ],
143
-
144
- "fleet_summary": {
145
- "total_trainsets": 30,
146
- "revenue_service": 22,
147
- "standby": 4,
148
- "maintenance": 2,
149
- "cleaning": 2,
150
- "availability_percent": 93.3
151
- },
152
-
153
- "optimization_metrics": {...},
154
- "conflicts_and_alerts": [...],
155
- "decision_rationale": {...}
156
- }
157
- ```
158
-
159
- ---
160
-
161
- ## 🔌 API Endpoints
162
-
163
- ### Generate Schedule
164
-
165
- ```bash
166
- # Quick generation with defaults
167
- curl -X POST "http://localhost:8000/api/v1/generate/quick?date=2025-10-25&num_trains=30"
168
-
169
- # Custom parameters
170
- curl -X POST "http://localhost:8000/api/v1/generate" \
171
- -H "Content-Type: application/json" \
172
- -d '{
173
- "date": "2025-10-25",
174
- "num_trains": 30,
175
- "num_stations": 25,
176
- "min_service_trains": 22,
177
- "min_standby_trains": 3
178
- }'
179
- ```
180
-
181
- ### Other Endpoints
182
-
183
- ```bash
184
- # Get example schedule
185
- GET /api/v1/schedule/example
186
-
187
- # Get route information
188
- GET /api/v1/route/{num_stations}
189
-
190
- # Get train health data
191
- GET /api/v1/trains/health/{num_trains}
192
-
193
- # Get depot layout
194
- GET /api/v1/depot/layout
195
-
196
- # Health check
197
- GET /health
198
- ```
199
-
200
- **Full API Documentation**: http://localhost:8000/docs
201
-
202
- ---
203
-
204
- ## 🧠 Optimization Algorithms
205
-
206
- ### Available Methods
207
-
208
- | Algorithm | Code | Best For |
209
- |-----------|------|----------|
210
- | Genetic Algorithm | `ga` | General purpose, balanced |
211
- | Particle Swarm | `pso` | Fast convergence |
212
- | Simulated Annealing | `sa` | Avoiding local optima |
213
- | CMA-ES | `cmaes` | Continuous optimization |
214
- | NSGA-II | `nsga2` | Multi-objective |
215
- | Ensemble | `ensemble` | Best overall results |
216
- | OR-Tools CP-SAT | `cp-sat` | Constraint satisfaction |
217
-
218
- ### Usage Example
219
-
220
- ```python
221
- from greedyOptim.scheduler import TrainsetSchedulingOptimizer
222
-
223
- optimizer = TrainsetSchedulingOptimizer(data, config)
224
- result = optimizer.optimize(method='ga')
225
- ```
226
-
227
- ---
228
-
229
- ## 🐳 Docker Deployment
230
-
231
- ```bash
232
- # Build and run
233
- docker-compose up -d
234
-
235
- # View logs
236
- docker-compose logs -f
237
-
238
- # Stop
239
- docker-compose down
240
- ```
241
-
242
- Or use Docker directly:
243
-
244
- ```bash
245
- docker build -t metro-scheduler .
246
- docker run -p 8000:8000 metro-scheduler
247
- ```
248
-
249
- ---
250
-
251
- ## 💡 Use Cases
252
-
253
- 1. **Daily Operations**: Generate optimized schedules for metro operations
254
- 2. **Maintenance Planning**: Balance service and maintenance requirements
255
- 3. **Fleet Management**: Optimize train utilization and mileage balancing
256
- 4. **Advertising**: Maximize branded train exposure
257
- 5. **What-if Analysis**: Test different scenarios and constraints
258
- 6. **Data Generation**: Create synthetic data for ML model training
259
-
260
- ---
261
-
262
- ## 🎯 General Backend Flow
263
-
264
- **Single Endpoint Strategy** (Future Enhancement):
265
-
266
- ```
267
- User Request
268
-
269
- Main Endpoint
270
-
271
- ├→ Try ML Engine (SelfTrainService)
272
- │ └→ If available & confident → Return ML prediction
273
-
274
- └→ Fallback to Optimization Algo (greedyOptim)
275
- └→ Return optimized schedule
276
- ```
277
-
278
- Users can also explicitly choose:
279
- - ML-based prediction
280
- - Optimization algorithms
281
- - Hybrid approach
282
-
283
- ---
284
-
285
- ## 📖 Documentation
286
-
287
- - **DataService API**: See [DataService/README.md](DataService/README.md)
288
- - **Optimization**: See [docs/integrate.md](docs/integrate.md)
289
- - **Quick Examples**: Run `python quickstart.py`
290
- - **Full Demo**: Run `python demo_schedule.py`
291
-
292
- ---
293
-
294
- ## 🔧 Configuration
295
-
296
- ### Key Parameters
297
-
298
- ```python
299
- {
300
- "num_trains": 25-40, # Fleet size
301
- "num_stations": 25, # Route stations
302
- "min_service_trains": 20, # Min active trains
303
- "min_standby_trains": 2, # Min standby
304
- "max_daily_km_per_train": 300, # Max km/train/day
305
- "balance_mileage": true, # Enable balancing
306
- "prioritize_branding": true # Prioritize ads
307
- }
308
- ```
309
-
310
- ### Optimization Weights
311
-
312
- ```python
313
- {
314
- "service_readiness": 0.35, # 35%
315
- "mileage_balancing": 0.25, # 25%
316
- "branding_priority": 0.20, # 20%
317
- "operational_cost": 0.20 # 20%
318
- }
319
- ```
320
-
321
  ---
322
-
323
- ## 🧪 Testing
324
-
325
- ```bash
326
- # Run comprehensive demo
327
- python demo_schedule.py
328
-
329
- # Run quick examples
330
- python quickstart.py
331
-
332
- # Run unit tests
333
- python test_optimization.py
334
- ```
335
-
336
  ---
337
 
338
- ## 📦 Dependencies
339
-
340
- ```
341
- fastapi>=0.104.1
342
- uvicorn[standard]>=0.24.0
343
- pydantic>=2.5.0
344
- ortools>=9.14.6206
345
- python-multipart>=0.0.6
346
- ```
347
-
348
- Install with:
349
- ```bash
350
- pip install -r requirements.txt
351
- ```
352
-
353
- ---
354
-
355
- ## 🛠️ Development
356
-
357
- ### Setup
358
-
359
- ```bash
360
- # Clone repository
361
- git clone [repository-url]
362
- cd mlservice
363
-
364
- # Install dependencies
365
- pip install -r requirements.txt
366
-
367
- # Run in development mode
368
- uvicorn DataService.api:app --reload
369
- ```
370
-
371
- ### Adding New Features
372
-
373
- 1. Data models: Edit `DataService/metro_models.py`
374
- 2. Optimization: Add to `greedyOptim/`
375
- 3. API endpoints: Edit `DataService/api.py`
376
-
377
- ---
378
-
379
- ## 🐛 Troubleshooting
380
-
381
- **Port already in use**:
382
- ```bash
383
- # Use different port
384
- uvicorn DataService.api:app --port 8001
385
- ```
386
-
387
- **Import errors**:
388
- ```bash
389
- # Add to PYTHONPATH
390
- export PYTHONPATH="${PYTHONPATH}:$(pwd)"
391
- ```
392
-
393
- **Package conflicts**:
394
- ```bash
395
- # Use virtual environment
396
- python -m venv venv
397
- source venv/bin/activate
398
- pip install -r requirements.txt
399
- ```
400
-
401
- ---
402
-
403
- ## 📈 Performance
404
-
405
- - **Optimization time**: ~300-500ms for 30 trains
406
- - **API response time**: <1s for full schedule generation
407
- - **Memory usage**: ~50-100MB
408
- - **Scalability**: Tested up to 40 trains
409
-
410
- ---
411
-
412
- ## 🏆 Built For
413
-
414
- **Smart India Hackathon 2025** 🇮🇳
415
-
416
- This project demonstrates:
417
- - Real-world metro scheduling optimization
418
- - Modern API design with FastAPI
419
- - Multiple AI/ML algorithms
420
- - Production-ready architecture
421
- - Comprehensive documentation
422
-
423
- ---
424
-
425
- ## 👥 Team
426
-
427
- - [Add team member names]
428
-
429
- ---
430
-
431
- ## 📞 Contact & Support
432
-
433
- - **GitHub**: SIHProjectio/ML-service
434
- - **Issues**: [GitHub Issues]
435
- - **Docs**: http://localhost:8000/docs (when running)
436
-
437
- ---
438
-
439
- ## 📄 License
440
-
441
- [Add license information]
442
-
443
- ---
444
-
445
- **Last Updated**: October 24, 2025
446
-
447
- **Version**: 1.0.0
448
-
449
- 4. Constraint Satisfaction
450
- Maintenance Window Compliance: How well schedules accommodate required maintenance slots
451
- Turnaround Time Adherence: Success rate in meeting minimum turnaround requirements
452
- Battery/Energy Constraints: If applicable, energy consumption profiles
453
- 5. Multi-Objective Optimization Trade-offs
454
- Pareto Front Analysis: Trade-offs between minimizing fleet size vs. maximizing service quality
455
- Cost vs. Service Level: Operating cost reduction while maintaining service standards
456
- Passenger Satisfaction vs. Operational Efficiency: Balance achieved
457
- 6. Scalability Analysis
458
- Performance with Route Length: How algorithms perform with different numbers of stations (13-25 stations tested)
459
- Fleet Size Scaling: Results for 5, 10, 15, 20, 25 train fleets
460
- Time Complexity: Algorithm runtime growth with problem size
461
- 7. Comparative Analysis
462
- Baseline Comparison: Your optimized schedules vs. current Kochi Metro schedules
463
- Algorithm Comparison:
464
- Greedy optimizer results
465
- Genetic algorithm results
466
- OR-Tools CP-SAT results
467
- Hybrid approach results
468
- Best Performing Method: Identify which optimizer works best for different scenarios
469
- 8. Real-World Applicability
470
- Kochi Metro Specifications Met:
471
- Average operating speed: 35 km/h maintained
472
- Maximum speed: 80 km/h respected
473
- Route distance: 25.612 km covered
474
- 22 stations serviced
475
- Operational Hours: 5:00 AM to 11:00 PM coverage achieved
476
- Peak Hour Performance: 5-7 minute headways during rush hours
477
- 9. Data Generation Validation
478
- Synthetic Data Realism: Statistical comparison with actual metro operations
479
- Distribution Analysis: Passenger demand patterns, breakdown frequencies, delay distributions
480
- Sensor Data Accuracy: GPS coordinates, speed profiles match real-world patterns
481
- 10. API Performance
482
- Response Times: Average API latency for schedule generation requests
483
- Throughput: Requests handled per second
484
- Success Rate: Percentage of valid schedules generated
485
- Quantitative Metrics You Can Report:
486
- Schedule generation time: X seconds for Y trains
487
- Fleet size reduction: Z% fewer trains needed vs. baseline
488
- Total operating cost reduction: ₹X per day
489
- Passenger wait time improvement: Y% reduction
490
- Algorithm success rate: X% of runs produce valid schedules
491
- Average headway variance: ±X minutes
492
- Coverage percentage: Y% of demand satisfied
493
- Energy efficiency: X kWh per km improvement
494
- Visualization Opportunities:
495
- Gantt charts of optimized train schedules
496
- Fleet utilization timelines
497
- Headway consistency graphs (peak vs. off-peak)
498
- Algorithm performance comparison tables
499
- Pareto fronts for multi-objective optimization
500
- Cost-benefit analysis charts
501
- Convergence plots for genetic algorithm
502
- Scalability curves (runtime vs. problem size)
503
- You should present these results with:
504
-
505
- Tables showing comparative metrics
506
- Graphs visualizing schedule quality and optimization performance
507
- Statistical analysis proving improvements are significant
508
- Real test cases using Kochi Metro parameters
1
  ---
2
+ title: Train Schedule Optimization
3
+ emoji: 🐨
4
+ colorFrom: blue
5
+ colorTo: pink
6
+ sdk: docker
7
+ pinned: false
8
+ license: cc-by-4.0
9
  ---
10
 
11
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
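
The added lines form a YAML front-matter block that configures the Space (`sdk: docker`, `license: cc-by-4.0`, and so on). As a rough sketch of how such a block could be read back for a quick sanity check, the snippet below parses the flat `key: value` pairs with the standard library only. The helper name `parse_front_matter` is hypothetical and is not part of this repository or of any Hugging Face library; it only handles the simple one-level layout shown in this diff, and a real YAML parser would be the safer choice for anything more complex.

```python
# Hypothetical helper for illustration; not part of the repository or of
# any Hugging Face tooling. Handles only flat "key: value" front matter.

# The README content as it stands after this commit.
README_TEXT = """---
title: Train Schedule Optimization
emoji: 🐨
colorFrom: blue
colorTo: pink
sdk: docker
pinned: false
license: cc-by-4.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
"""


def parse_front_matter(text: str) -> dict[str, str]:
    """Return the key/value pairs between the first two '---' lines.

    Returns an empty dict if the text does not start with '---' or if the
    closing '---' is missing. All values are kept as plain strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not "key: value"
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter found


config = parse_front_matter(README_TEXT)
print(config["sdk"])
```

Note that values stay as strings in this sketch, so `pinned` comes back as the text `false` rather than a boolean.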