YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
Game-Theoretic Scheduling of Istio Ambient Mesh Waypoint Proxies in Multi-Tenant Kubernetes Clusters
Overview
This project implements a comprehensive framework for game-theoretic scheduling of waypoint proxies in Istio Ambient Mesh deployments within multi-tenant Kubernetes clusters. It includes 5 scheduling strategies (3 from the research blueprint + 2 novel contributions), a complete mathematical framework with formal proofs, and a rigorous evaluation pipeline with statistical testing.
Strategies
Blueprint Strategies
S1: Baseline K8s Default (
strategies/s01_baseline_k8s_default.py)- Emulates default kube-scheduler with LeastRequestedPriority scoring
- Routes tenants to nearest proxy (lowest latency)
- Serves as baseline for comparison
S2: Resource-Aware Heuristic (
strategies/s02_resource_aware_heuristic.py)- Two-phase greedy: efficiency-based placement + LPT load-balancing routing
- 2-approximation guarantee for makespan objective
- O(NM log N) computational complexity
S3: Stackelberg Equilibrium (
strategies/s03_stackelberg_equilibrium.py)- Full bilevel optimization: leader (platform) places proxies, followers (tenants) route traffic
- Exact enumeration for small instances, PGD for larger ones
- Existence guaranteed by Kakutani's fixed-point theorem
Novel Contributions
S4: Potential Game (
strategies/s04_potential_game.py) β NOVEL- Reformulates routing as a Rosenthal congestion game with exact potential
- Guaranteed convergence via best-response dynamics (finite-time, any initial state)
- Price of Anarchy: PoA β€ 5/3 for quadratic congestion (tight bound)
- Decentralized: tenants can independently compute best responses
- Pure-strategy Nash equilibrium always exists
S5: VCG Mechanism Design (
strategies/s05_vcg_mechanism_design.py) β NOVEL- Applies Vickrey-Clarke-Groves mechanism for truthful resource allocation
- Incentive compatible: truthful reporting is a dominant strategy
- Individually rational: all tenants have non-negative utility
- Allocatively efficient: maximizes social welfare
- Computes Clarke pivot payments to ensure truthfulness
Theoretical Results
| Property | S1 | S2 | S3 | S4 | S5 |
|---|---|---|---|---|---|
| Equilibrium existence | N/A | N/A | β | β | β |
| Convergence guarantee | N/A | N/A | Conditional | β Always | N/A |
| PoA bound | None | 2-approx | None | β€ 5/3 | 1 (optimal) |
| Incentive compatible | β | β | β | β | β |
| Decentralized | β | β | β | β | β |
| Complexity | O(NM) | O(NM log N) | NP-hard | O(NΒ²MΒ·T) | NP-hard |
Full proofs in docs/mathematical_proofs.md β includes 15 theorems covering equilibrium existence, uniqueness, convergence rates, PoA bounds, and mechanism design properties.
Project Structure
project/
βββ core/
β βββ system_model.py # Cluster, tenant, node, latency models
β βββ payoff_engine.py # Cost functions, Nash verification
β βββ equilibrium_solver.py # CVXPY solver, best-response, PGD
βββ strategies/
β βββ s01_baseline_k8s_default.py
β βββ s02_resource_aware_heuristic.py
β βββ s03_stackelberg_equilibrium.py
β βββ s04_potential_game.py # Novel
β βββ s05_vcg_mechanism_design.py # Novel
βββ data/
β βββ synthetic_generator.py # Scenario generation (small/medium/large/xlarge)
βββ evaluation/
β βββ metrics.py # 30+ metrics (latency, fairness, SLA, isolation)
β βββ visualizations.py # 8 plot types (CDF, radar, Pareto, heatmap, etc.)
β βββ statistical_tests.py # Wilcoxon, bootstrap CI, Friedman, Nemenyi
βββ docs/
β βββ mathematical_proofs.md # 15 formal theorems with complete proofs
βββ results/ # Generated plots and JSON results
βββ run_all.py # Master evaluation runner
βββ README.md
Usage
Quick Test
python run_all.py --quick
Full Evaluation
python run_all.py --scenarios small medium large --seeds 30
Custom Run
python run_all.py --scenarios medium --seeds 10 --output ./my_results
Dependencies
numpy scipy cvxpy matplotlib seaborn pandas networkx
Key Design Decisions
- Synthetic data generation calibrated to Alibaba cluster-trace and Azure VM trace distributions (log-normal demands, zone-aware latency, discrete capacity tiers)
- CVXPY for convex subproblems with SCS solver; greedy fallback for robustness
- Pure strategies for potential game β enables Rosenthal potential and strongest convergence guarantees
- Numerical verification of theoretical properties (potential game property, Nash equilibrium, incentive compatibility)
Mathematical Framework
The system models N tenants and M nodes with:
- Congestion:
cong_j(x) = ΞΊ(β_j/C_j)^p(polynomial, typically p=2) - Tenant cost:
c_i = Ξ±Β·latency + Ξ²Β·contention + Ξ³Β·isolation - Platform cost: weighted sum of avg latency, resource imbalance, unfairness, isolation
See docs/mathematical_proofs.md for complete formalization.