🔬 Mathematical Optimization of Hybrid Software Architectures

Balancing Coupling and Cohesion — INPT · CEDoc 2TI · SEEDS Research Team

Supervisor: Prof. Driss ALLAKI · Duration: 2 months · Team: 2–3 interns

📌 Project Overview

Modern software systems increasingly blend monolithic and microservice paradigms. Monoliths offer simplicity and maintainability, while microservices bring scalability and independence. The challenge lies in how to split a system: a poor decomposition creates tight inter-module coupling and weak intra-module cohesion, which makes the system brittle and hard to evolve.

This project builds a mathematically grounded decision-support tool that recommends optimal hybrid architectures. We model a software system as a weighted dependency graph and use combinatorial optimization to find component groupings that maximize cohesion within clusters and minimize coupling across them.

🏆 Results Summary

On the synthetic benchmark (60 nodes, 5 clusters), all three methods recover the ground-truth decomposition exactly (NMI = ARI = 1.0), so they differ only in runtime:

Metric	Spectral	Genetic Algorithm	Louvain	Ground Truth
Modularity Q	0.7571	0.7571	0.7571	0.7571
Avg. Cohesion	0.5036	0.5036	0.5036	0.5036
Avg. Coupling	0.0114	0.0114	0.0114	0.0114
NMI	1.0000	1.0000	1.0000	1.0000
ARI	1.0000	1.0000	1.0000	1.0000
Runtime (s)	0.078	2.847	0.004	—
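
NMI and ARI compare a predicted partition with the ground truth regardless of how the cluster ids are numbered, so a score of 1.0 means the two partitions are identical up to relabeling. A minimal sketch, assuming scikit-learn's metric functions and toy label vectors invented for illustration (they are not the project's data):

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

# Toy labels, invented for illustration (not read from node_clusters.csv).
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
# Same grouping as `truth`, but with the cluster ids permuted.
relabeled = [2, 2, 2, 0, 0, 0, 1, 1, 1]
# One node assigned to the wrong cluster.
one_error = [0, 0, 1, 1, 1, 1, 2, 2, 2]

nmi_same = normalized_mutual_info_score(truth, relabeled)
ari_same = adjusted_rand_score(truth, relabeled)
nmi_err = normalized_mutual_info_score(truth, one_error)
ari_err = adjusted_rand_score(truth, one_error)

print(f"relabeled: NMI={nmi_same:.4f} ARI={ari_same:.4f}")
print(f"one error: NMI={nmi_err:.4f} ARI={ari_err:.4f}")
```

The permuted labeling scores 1.0 on both metrics, while a single misplaced node already pushes both below 1.0.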

Visual Comparison

See reports/03_final_comparison.png for the figure comparing all methods.

🎯 Objectives

#	Objective	Status
1	Formalize software systems as weighted graphs	✅
2	Define quantitative metrics for cohesion & coupling	✅
3	Formulate decomposition as multi-objective optimization	✅
4	Implement and compare three solution strategies	✅
5	Build visualization and evaluation tools	✅

🗂️ Repository Structure

project/
│
├── README.md                              ← You are here
├── intro_for_new_members.pdf              ← Start here if you are new!
│
├── data/
│   ├── synthetic_dependency_graph.csv     ← Edge list (320 edges, 60 nodes)
│   └── node_clusters.csv                 ← Ground-truth cluster assignments
│
├── notebooks/
│   ├── 01_spectral_clustering.ipynb       ← Solution 1: Spectral Graph Partitioning
│   ├── 02_metaheuristic_ga.ipynb          ← Solution 2: Genetic Algorithm
│   └── 03_community_detection.ipynb       ← Solution 3: Louvain + Final Comparison
│
└── reports/
    ├── 01_graph_visualization.png         ← Graph + adjacency matrix
    ├── 01_eigenvalue_spectrum.png         ← Eigengap analysis
    ├── 01_spectral_results.png            ← Spectral embedding visualization
    ├── 01_spectral_evaluation.png         ← Cohesion/coupling/confusion matrix
    ├── 01_fiedler_analysis.png            ← Fiedler vector analysis
    ├── 02_ga_convergence.png              ← GA fitness convergence
    ├── 02_ga_results.png                  ← GA clustering visualization
    ├── 02_ga_sensitivity.png              ← GA hyperparameter sensitivity
    ├── 02_ga_landscape.png                ← Fitness landscape visualization
    ├── 03_louvain_results.png             ← Louvain community detection
    ├── 03_resolution_analysis.png         ← Resolution parameter sweep
    ├── 03_louvain_hierarchy.png           ← Hierarchical decomposition
    ├── 03_final_comparison.png            ← All methods compared
    └── *.csv                              ← Numerical results per method

🧭 How to Work Through This Project: Step by Step

Step 0 — Read the Intro PDF First (New Members)

File: intro_for_new_members.pdf

Covers: software architecture basics, Kubernetes and containers, the coupling/cohesion problem, existing approaches, and why a mathematical formulation is needed.

Step 1 — Understand the Dataset

File: data/synthetic_dependency_graph.csv

Column	Description
`source`	Source component (e.g., `auth_00`, `billing_03`)
`target`	Target component
`weight`	Dependency strength (0.0–1.0)
`true_cluster`	Ground-truth cluster label (0–4) or "cross" for inter-cluster edges

The graph has 60 nodes across 5 services (auth, billing, catalog, orders, notify) with 320 edges.
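
One possible way to load the edge list into a graph is sketched below. It assumes pandas and NetworkX, and it treats dependencies as undirected, which is an assumption made here for illustration. The helper name `load_dependency_graph` and the three sample rows are hypothetical; only the column names come from the table above.

```python
import io

import networkx as nx
import pandas as pd

# Three rows in the documented column layout; the values are invented.
SAMPLE_CSV = """source,target,weight,true_cluster
auth_00,auth_01,0.9,0
auth_01,billing_03,0.1,cross
billing_03,billing_04,0.8,1
"""


def load_dependency_graph(csv_source):
    """Read an edge list and build an undirected weighted graph (assumption)."""
    edges = pd.read_csv(csv_source)
    return nx.from_pandas_edgelist(
        edges,
        source="source",
        target="target",
        edge_attr=["weight", "true_cluster"],
    )


graph = load_dependency_graph(io.StringIO(SAMPLE_CSV))

# Edges labeled "cross" connect two different ground-truth clusters.
cross_edges = [
    (u, v) for u, v, data in graph.edges(data=True) if data["true_cluster"] == "cross"
]
print(graph.number_of_nodes(), graph.number_of_edges(), cross_edges)
```

To use the real dataset, pass the path `data/synthetic_dependency_graph.csv` instead of the in-memory sample.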

Step 2 — Run the Three Notebooks

📓 Notebook 1 — Spectral Graph Partitioning

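Partitions the graph using the eigenvectors of the graph Laplacian $L = D - A$, where $A$ is the weighted adjacency matrix and $D$ is the diagonal matrix of weighted degrees. The eigengap heuristic correctly identifies $K = 5$ clusters.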
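
The Laplacian and the eigengap heuristic can be sketched with NumPy alone. This is an illustration under stated assumptions, not the notebook's code: the function names and the toy graph (three 4-node cliques joined by weak edges) are invented, and the final clustering step on the eigenvectors is omitted.

```python
import numpy as np


def unnormalized_laplacian(adjacency):
    """Return L = D - A for a symmetric weighted adjacency matrix."""
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def eigengap_k(adjacency, k_max=10):
    """Pick K at the largest gap among the smallest Laplacian eigenvalues."""
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(unnormalized_laplacian(adjacency))
    gaps = np.diff(eigenvalues[: k_max + 1])
    return int(np.argmax(gaps)) + 1, eigenvalues


# Toy graph (invented): three 4-node cliques joined by weak 0.05 edges.
n_blocks, block_size = 3, 4
A = np.zeros((n_blocks * block_size, n_blocks * block_size))
for b in range(n_blocks):
    lo, hi = b * block_size, (b + 1) * block_size
    A[lo:hi, lo:hi] = 1.0
np.fill_diagonal(A, 0.0)  # no self-loops
for b in range(n_blocks):
    i = b * block_size
    j = ((b + 1) % n_blocks) * block_size + 1
    A[i, j] = A[j, i] = 0.05  # weak link to the next clique

k_estimate, spectrum = eigengap_k(A)
print("estimated K:", k_estimate)
```

A graph with K well-separated groups has K eigenvalues close to zero followed by a jump, which is what the largest-gap rule looks for.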

📓 Notebook 2 — Genetic Algorithm

Evolutionary optimization with tournament selection, uniform crossover, and random mutation. Includes sensitivity analysis and fitness landscape visualization.
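
The three operators named above can be sketched in plain Python. Everything here is an assumption made for illustration: the function names, the use of modularity as the fitness function, the elitism step, and the hyperparameter values are not taken from the notebook.

```python
import random


def modularity(edges, labels):
    """Weighted modularity Q; edges are (i, j, weight), labels[i] is a cluster id."""
    total = sum(w for _, _, w in edges)
    intra, cluster_strength = {}, {}
    for i, j, w in edges:
        if labels[i] == labels[j]:
            intra[labels[i]] = intra.get(labels[i], 0.0) + w
        for node in (i, j):
            c = labels[node]
            cluster_strength[c] = cluster_strength.get(c, 0.0) + w
    return sum(
        intra.get(c, 0.0) / total - (s / (2.0 * total)) ** 2
        for c, s in cluster_strength.items()
    )


def tournament(population, scores, rng, size=3):
    """Return the fittest of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda idx: scores[idx])]


def uniform_crossover(parent_a, parent_b, rng):
    """Take each gene from either parent with probability 0.5."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(individual, n_clusters, rate, rng):
    """Reassign each gene to a random cluster with probability `rate`."""
    return [rng.randrange(n_clusters) if rng.random() < rate else g for g in individual]


def run_ga(edges, n_nodes, n_clusters, generations=60, pop_size=30, rate=0.05, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.randrange(n_clusters) for _ in range(n_nodes)] for _ in range(pop_size)
    ]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [modularity(edges, ind) for ind in population]
        top = max(range(pop_size), key=lambda idx: scores[idx])
        if scores[top] > best_score:
            best, best_score = population[top][:], scores[top]
        children = [best[:]]  # elitism (assumption): keep the best so far
        while len(children) < pop_size:
            parent_a = tournament(population, scores, rng)
            parent_b = tournament(population, scores, rng)
            child = uniform_crossover(parent_a, parent_b, rng)
            children.append(mutate(child, n_clusters, rate, rng))
        population = children
    return best, best_score


# Toy graph (invented): two triangles joined by one bridge edge.
toy_edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
             (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 1.0)]
best_labels, best_q = run_ga(toy_edges, n_nodes=6, n_clusters=2)
print(best_labels, round(best_q, 4))
```

Each individual is a list that assigns a cluster id to every node, so crossover and mutation operate directly on cluster assignments.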

📓 Notebook 3 — Louvain Community Detection

Greedy modularity maximization with resolution parameter analysis. Includes the final three-method comparison.
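
A resolution sweep can be sketched as follows. This example uses NetworkX's built-in `louvain_communities` rather than the `python-louvain` package listed under Dependencies, so it is a stand-in and not necessarily what the notebook calls. The toy graph and the resolution values are invented for illustration.

```python
import networkx as nx

# Toy graph (invented): two 4-node cliques joined by one weak edge.
G = nx.Graph()
for block in (range(0, 4), range(4, 8)):
    nodes = list(block)
    for a in nodes:
        for b in nodes:
            if a < b:
                G.add_edge(a, b, weight=1.0)
G.add_edge(3, 4, weight=0.1)

results = {}
for resolution in (0.5, 1.0, 2.0):
    communities = nx.community.louvain_communities(
        G, weight="weight", resolution=resolution, seed=42
    )
    # Standard modularity (resolution 1) of the partition found at this setting.
    q = nx.community.modularity(G, communities, weight="weight")
    results[resolution] = (communities, q)
    print(f"resolution={resolution}: {len(communities)} communities, Q={q:.4f}")
```

In NetworkX, a resolution below 1 favors larger communities and a resolution above 1 favors smaller ones; other Louvain implementations may define the parameter differently, so check the documentation of the one in use.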

📐 Mathematical Definitions

Modularity Q

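$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$

Here $A_{ij}$ is the weight of the edge between nodes $i$ and $j$, $k_i = \sum_j A_{ij}$ is the weighted degree of node $i$, $m = \frac{1}{2} \sum_{ij} A_{ij}$ is the total edge weight, $c_i$ is the cluster of node $i$, and $\delta(c_i, c_j)$ equals 1 when $c_i = c_j$ and 0 otherwise.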
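
The formula translates almost term by term into NumPy. The sketch below assumes a symmetric adjacency matrix without self-loops; the function name `modularity_q` and the toy graph (two triangles joined by one edge) are invented for illustration.

```python
import numpy as np


def modularity_q(adjacency, labels):
    """Q = (1/2m) * sum_ij [A_ij - k_i k_j / 2m] * delta(c_i, c_j)."""
    adjacency = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    degrees = adjacency.sum(axis=1)                     # k_i
    two_m = adjacency.sum()                             # 2m: each edge counted twice
    same_cluster = labels[:, None] == labels[None, :]   # delta(c_i, c_j)
    expected = np.outer(degrees, degrees) / two_m       # k_i k_j / 2m
    return float(((adjacency - expected) * same_cluster).sum() / two_m)


# Toy graph (invented): triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

q = modularity_q(A, [0, 0, 0, 1, 1, 1])
print(round(q, 4))  # 0.3571, i.e. 5/14
```

For this toy graph, $m = 7$ and each triangle contributes $3/7 - (7/14)^2$, which gives $Q = 5/14$.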

Cohesion (intra-cluster density)

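$\text{Cohesion}(C_k) = \frac{\sum_{i,j \in C_k,\, i \neq j} A_{ij}}{|C_k|(|C_k|-1)}$

The denominator counts the ordered pairs of distinct nodes in $C_k$, so with edge weights in $[0, 1]$ the cohesion also lies in $[0, 1]$.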

Coupling (inter-cluster density)

$\text{Coupling}(C_k, C_l) = \frac{\sum_{i \in C_k, j \in C_l} A_{ij}}{|C_k| \cdot |C_l|}$
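
Both density metrics reduce to summing a block of the adjacency matrix. The sketch below assumes a symmetric matrix with a zero diagonal and non-empty clusters; the function names, the zero-cohesion convention for singleton clusters, and the toy weights are assumptions made for illustration.

```python
import numpy as np


def cohesion(adjacency, labels, k):
    """Sum of weights inside cluster k divided by |C_k| * (|C_k| - 1)."""
    members = np.flatnonzero(np.asarray(labels) == k)
    size = len(members)
    if size < 2:
        return 0.0  # assumption: a singleton cluster has zero cohesion
    block = np.asarray(adjacency, dtype=float)[np.ix_(members, members)]
    return float(block.sum() / (size * (size - 1)))


def coupling(adjacency, labels, k, l):
    """Sum of weights between clusters k and l divided by |C_k| * |C_l|."""
    members_k = np.flatnonzero(np.asarray(labels) == k)
    members_l = np.flatnonzero(np.asarray(labels) == l)
    block = np.asarray(adjacency, dtype=float)[np.ix_(members_k, members_l)]
    return float(block.sum() / (len(members_k) * len(members_l)))


# Toy graph (invented): a 0.8-weight triangle, a 0.6-weight triangle, one 0.1 bridge.
A = np.zeros((6, 6))
for i, j, w in [(0, 1, 0.8), (0, 2, 0.8), (1, 2, 0.8),
                (3, 4, 0.6), (3, 5, 0.6), (4, 5, 0.6), (2, 3, 0.1)]:
    A[i, j] = A[j, i] = w
labels = [0, 0, 0, 1, 1, 1]

cohesion_0 = cohesion(A, labels, 0)
cohesion_1 = cohesion(A, labels, 1)
coupling_01 = coupling(A, labels, 0, 1)
print(cohesion_0, cohesion_1, coupling_01)
```

A good decomposition shows the pattern reported in the results table: cohesion values well above the coupling values.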

📦 Dependencies

pip install networkx numpy scipy pandas scikit-learn matplotlib seaborn pyvis plotly python-louvain

🏁 Deliverables

README.md — Project documentation
intro_for_new_members.pdf — Onboarding document (7 pages)
data/synthetic_dependency_graph.csv — Dataset (320 edges, 60 nodes, 5 clusters)
data/node_clusters.csv — Ground-truth labels
notebooks/01_spectral_clustering.ipynb — Spectral method (21 cells)
notebooks/02_metaheuristic_ga.ipynb — Genetic algorithm (23 cells)
notebooks/03_community_detection.ipynb — Louvain + comparison (22 cells)
reports/*.png — 13 publication-quality figures
reports/*.csv — Numerical results for all methods

👥 Team & Roles

Role	Responsibility
Intern A	Graph modeling, dataset generation, math formalization
Intern B	Optimization implementation (notebooks 1 & 2)
Intern C	Evaluation, visualization, notebook 3 & report

📬 Contact

Supervisor: MSC — INPT, Rabat
Research Team: R2

"A good decomposition is not just clean code — it is a mathematical optimum."
