🔬 Mathematical Optimization of Hybrid Software Architectures

Balancing Coupling and Cohesion - INPT · CEDoc 2TI · SEEDS Research Team

Supervisor: Prof. Driss ALLAKI · Duration: 2 months · Team: 2–3 interns


📌 Project Overview

Modern software systems increasingly blend monolithic and microservice paradigms. While monoliths offer simplicity and maintainability, microservices bring scalability and independence. The challenge lies in how to split a system: a poor decomposition creates tight inter-module coupling and weak intra-module cohesion, making the system brittle and hard to evolve.

This project builds a mathematically grounded decision-support tool that recommends optimal hybrid architectures. We model a software system as a weighted dependency graph and use combinatorial optimization to find component groupings that maximize cohesion within clusters and minimize coupling across them.


🏆 Results Summary

All three methods successfully recover the ground-truth decomposition:

| Metric | Spectral | Genetic Algorithm | Louvain | Ground Truth |
|---|---|---|---|---|
| Modularity Q | 0.7571 | 0.7571 | 0.7571 | 0.7571 |
| Avg. Cohesion | 0.5036 | 0.5036 | 0.5036 | 0.5036 |
| Avg. Coupling | 0.0114 | 0.0114 | 0.0114 | 0.0114 |
| NMI | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| ARI | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Runtime (s) | 0.078 | 2.847 | 0.004 | n/a |
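NMI and ARI measure agreement between a predicted labeling and the ground truth, independent of how clusters are named. The sketch below is a minimal standard-library illustration of both scores, not the notebooks' code; the function names and the toy labelings are hypothetical, and NMI is normalized by the arithmetic mean of the two entropies (an assumption about the variant used). scikit-learn provides equivalent functions as `adjusted_rand_score` and `normalized_mutual_info_score`.

```python
from collections import Counter
from math import comb, log

def adjusted_rand_index(a, b):
    """ARI between two labelings of the same items (needs at least 2 items)."""
    n = len(a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case: treat as perfect agreement
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def normalized_mutual_info(a, b):
    """NMI as 2 * I(a; b) / (H(a) + H(b))."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mutual = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    h_a = -sum(c / n * log(c / n) for c in ca.values())
    h_b = -sum(c / n * log(c / n) for c in cb.values())
    if h_a + h_b == 0:  # both labelings put everything in one cluster
        return 1.0
    return 2 * mutual / (h_a + h_b)

truth = [0, 0, 0, 1, 1, 1]
relabeled = [1, 1, 1, 0, 0, 0]  # same partition, different label names
shifted = [0, 0, 1, 1, 1, 1]    # one node moved to the other cluster

print(adjusted_rand_index(truth, relabeled), normalized_mutual_info(truth, relabeled))
print(adjusted_rand_index(truth, shifted), normalized_mutual_info(truth, shifted))
```

A score of 1 on both metrics, as in the table, means the predicted partition matches the ground truth up to a renaming of the cluster labels.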

Visual Comparison

Figure: Final Comparison (reports/03_final_comparison.png)


🎯 Objectives

# Objective Status
1 Formalize software systems as weighted graphs βœ…
2 Define quantitative metrics for cohesion & coupling βœ…
3 Formulate decomposition as multi-objective optimization βœ…
4 Implement and compare three solution strategies βœ…
5 Build visualization and evaluation tools βœ…

🗂️ Repository Structure

project/
│
├── README.md                              ← You are here
├── intro_for_new_members.pdf              ← Start here if you are new!
│
├── data/
│   ├── synthetic_dependency_graph.csv     ← Edge list (320 edges, 60 nodes)
│   └── node_clusters.csv                  ← Ground-truth cluster assignments
│
├── notebooks/
│   ├── 01_spectral_clustering.ipynb       ← Solution 1: Spectral Graph Partitioning
│   ├── 02_metaheuristic_ga.ipynb          ← Solution 2: Genetic Algorithm
│   └── 03_community_detection.ipynb       ← Solution 3: Louvain + Final Comparison
│
└── reports/
    ├── 01_graph_visualization.png         ← Graph + adjacency matrix
    ├── 01_eigenvalue_spectrum.png         ← Eigengap analysis
    ├── 01_spectral_results.png            ← Spectral embedding visualization
    ├── 01_spectral_evaluation.png         ← Cohesion/coupling/confusion matrix
    ├── 01_fiedler_analysis.png            ← Fiedler vector analysis
    ├── 02_ga_convergence.png              ← GA fitness convergence
    ├── 02_ga_results.png                  ← GA clustering visualization
    ├── 02_ga_sensitivity.png              ← GA hyperparameter sensitivity
    ├── 02_ga_landscape.png                ← Fitness landscape visualization
    ├── 03_louvain_results.png             ← Louvain community detection
    ├── 03_resolution_analysis.png         ← Resolution parameter sweep
    ├── 03_louvain_hierarchy.png           ← Hierarchical decomposition
    ├── 03_final_comparison.png            ← All methods compared
    └── *.csv                              ← Numerical results per method

🧭 How to Work This Project β€” Step by Step

Step 0 β€” Read the Intro PDF First (New Members)

File: intro_for_new_members.pdf

Covers: software architecture basics, Kubernetes and containers, the coupling/cohesion problem, existing approaches, and why a mathematical formulation is needed.

Step 1: Understand the Dataset

File: data/synthetic_dependency_graph.csv

| Column | Description |
|---|---|
| source | Source component (e.g., auth_00, billing_03) |
| target | Target component |
| weight | Dependency strength (0.0–1.0) |
| true_cluster | Ground-truth cluster label (0–4) or "cross" for inter-cluster edges |

The graph has 60 nodes across 5 services (auth, billing, catalog, orders, notify) with 320 edges.
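As a rough illustration of how an edge list in this column layout can be turned into a graph structure, the sketch below parses a tiny inline sample with the standard library. The sample rows and the `load_edges` helper are hypothetical, and treating dependencies as undirected is an assumption; the notebooks may load the data differently.

```python
import csv
import io
from collections import defaultdict

# Tiny inline sample in the column layout described above (values are made up).
SAMPLE_CSV = """source,target,weight,true_cluster
auth_00,auth_01,0.9,0
auth_01,auth_02,0.8,0
billing_00,billing_01,0.7,1
auth_02,billing_00,0.1,cross
"""

def load_edges(handle):
    """Read an edge list into a symmetric weighted adjacency dict."""
    adjacency = defaultdict(dict)
    for row in csv.DictReader(handle):
        u, v, w = row["source"], row["target"], float(row["weight"])
        adjacency[u][v] = w
        adjacency[v][u] = w  # assumption: dependencies are treated as undirected
    return dict(adjacency)

adjacency = load_edges(io.StringIO(SAMPLE_CSV))
nodes = sorted(adjacency)
n_edges = sum(len(neighbors) for neighbors in adjacency.values()) // 2
print(len(nodes), "nodes,", n_edges, "edges")
```

To read the real dataset, the same helper can be given an open file handle, for example `open("data/synthetic_dependency_graph.csv", newline="")`.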

Step 2: Run the Three Notebooks

📓 Notebook 1: Spectral Graph Partitioning

Uses the graph Laplacian $L = D - A$ (with $D$ the degree matrix and $A$ the weighted adjacency matrix) and its eigenvectors. The eigengap heuristic correctly identifies $K = 5$ clusters.
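The eigengap idea can be sketched on a toy graph: build the unnormalized Laplacian, sort its eigenvalues, and pick the number of clusters at the largest jump. The code below is a minimal NumPy illustration under stated assumptions, not the notebook's implementation; the function names, the `k_max` parameter, and the 8-node toy graph are hypothetical. For two clusters it splits on the sign of the Fiedler vector; a K-way version would typically cluster the rows of the first K eigenvectors instead (for example with k-means).

```python
import numpy as np

def laplacian_spectrum(A):
    """Eigenvalues and eigenvectors of the unnormalized Laplacian L = D - A."""
    D = np.diag(A.sum(axis=1))
    L = D - A
    # eigh is for symmetric matrices and returns eigenvalues in ascending order
    return np.linalg.eigh(L)

def eigengap_k(eigenvalues, k_max):
    """Choose K at the largest gap among the first k_max + 1 eigenvalues."""
    gaps = np.diff(eigenvalues[: k_max + 1])
    return int(np.argmax(gaps)) + 1

# Toy graph: two dense blocks of 4 nodes joined by a single weak edge.
A = np.zeros((8, 8))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
np.fill_diagonal(A, 0.0)
A[3, 4] = A[4, 3] = 0.05

eigenvalues, eigenvectors = laplacian_spectrum(A)
k = eigengap_k(eigenvalues, k_max=6)

# Second-smallest eigenvector (Fiedler vector): its sign pattern gives a 2-way cut.
fiedler = eigenvectors[:, 1]
labels = (fiedler > 0).astype(int)
print("K =", k, "labels =", labels.tolist())
```

On this toy graph the two smallest eigenvalues sit near zero and the rest sit near 4, so the largest gap follows the second eigenvalue and the heuristic selects two clusters.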

Figures: Eigenvalue Spectrum (reports/01_eigenvalue_spectrum.png), Spectral Results (reports/01_spectral_results.png)

📓 Notebook 2: Genetic Algorithm

Evolutionary optimization with tournament selection, uniform crossover, and random mutation. Includes sensitivity analysis and fitness landscape visualization.
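The three operators named above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the notebook's code: using modularity as the fitness function, keeping one elite individual per generation, and all parameter values (`pop_size`, `generations`, `rate`, tournament size 3) are assumptions, and the helper names are hypothetical.

```python
import random

def modularity(adj, labels):
    """Newman modularity for a symmetric weighted adjacency matrix."""
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

def tournament(population, fitness, rng, size=3):
    """Return the fittest of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda idx: fitness[idx])]

def uniform_crossover(a, b, rng):
    """Copy each gene from either parent with equal probability."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(individual, k, rate, rng):
    """Reassign each gene to a random cluster with probability `rate`."""
    return [rng.randrange(k) if rng.random() < rate else g for g in individual]

def run_ga(adj, k, pop_size=30, generations=60, rate=0.05, seed=0):
    rng = random.Random(seed)
    n = len(adj)
    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    history = []  # best fitness per generation
    for _ in range(generations):
        fitness = [modularity(adj, ind) for ind in population]
        best_idx = max(range(pop_size), key=lambda idx: fitness[idx])
        history.append(fitness[best_idx])
        next_gen = [population[best_idx][:]]  # elitism (assumption): keep the best
        while len(next_gen) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            next_gen.append(mutate(uniform_crossover(p1, p2, rng), k, rate, rng))
        population = next_gen
    best = max(population, key=lambda ind: modularity(adj, ind))
    return best, modularity(adj, best), history

# Toy graph: two 4-node cliques joined by one weak edge.
n = 8
adj = [[0.0] * n for _ in range(n)]
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                adj[i][j] = 1.0
adj[3][4] = adj[4][3] = 0.05

best, best_q, history = run_ga(adj, k=2)
print(best, round(best_q, 4))
```

Each individual is a list that assigns one cluster index per node. Because the elite is carried over unchanged, the best fitness recorded in `history` never decreases, which is the kind of curve a convergence plot would show.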

Figures: GA Convergence (reports/02_ga_convergence.png), GA Sensitivity (reports/02_ga_sensitivity.png)

📓 Notebook 3: Louvain Community Detection

Greedy modularity maximization with resolution parameter analysis. Includes the final three-method comparison.
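A resolution sweep can be sketched as follows. Note that this example uses NetworkX's built-in `louvain_communities` rather than the `python-louvain` package listed in the dependencies, so it may not match the notebook's calls; the toy graph, the resolution values, and the `sweep` dictionary are hypothetical.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

# Toy graph: two 4-node cliques joined by one weak edge.
G = nx.Graph()
for block in ([0, 1, 2, 3], [4, 5, 6, 7]):
    for i, u in enumerate(block):
        for v in block[i + 1:]:
            G.add_edge(u, v, weight=1.0)
G.add_edge(3, 4, weight=0.05)

sweep = {}
for resolution in (0.5, 1.0, 2.0):
    communities = louvain_communities(
        G, weight="weight", resolution=resolution, seed=0
    )
    # Score every partition with standard modularity so values are comparable.
    q = modularity(G, communities, weight="weight")
    sweep[resolution] = (sorted(sorted(c) for c in communities), q)
    print(resolution, len(communities), round(q, 4))
```

In NetworkX, a resolution below 1 favors fewer, larger communities and a value above 1 favors more, smaller ones, so sweeping it shows how stable a decomposition is across scales.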

Figure: Louvain Hierarchy (reports/03_louvain_hierarchy.png)


📐 Mathematical Definitions

Modularity Q

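$$Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$$

Here $A_{ij}$ is the weight of the edge between nodes $i$ and $j$, $k_i = \sum_j A_{ij}$ is the weighted degree of node $i$, $m$ is the total edge weight, $c_i$ is the cluster of node $i$, and $\delta$ is the Kronecker delta.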
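As a worked check of the modularity formula, consider a hypothetical toy graph (not part of the project dataset): two triangles $\{0,1,2\}$ and $\{3,4,5\}$ with unit weights, joined by the single edge $(2,3)$, so that $2m = \sum_{ij} A_{ij} = 14$. Grouping the double sum by cluster gives an equivalent per-cluster form:

$$
Q = \sum_{c} \left[ \frac{\Sigma_{\text{in}}(c)}{2m} - \left( \frac{\Sigma_{\text{tot}}(c)}{2m} \right)^{2} \right],
\qquad
\Sigma_{\text{in}}(c) = \sum_{i,j \in c} A_{ij},
\quad
\Sigma_{\text{tot}}(c) = \sum_{i \in c} k_i
$$

Each triangle has $\Sigma_{\text{in}} = 6$ (three edges, each counted in both directions) and $\Sigma_{\text{tot}} = 2 + 2 + 3 = 7$, so taking the two triangles as clusters gives:

$$
Q = 2 \left[ \frac{6}{14} - \left( \frac{7}{14} \right)^{2} \right]
  = 2 \left[ \frac{3}{7} - \frac{1}{4} \right]
  = \frac{5}{14} \approx 0.357
$$

Putting all six nodes in one cluster instead gives $Q = \frac{14}{14} - 1 = 0$, so the two-triangle split scores higher, as expected.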

Cohesion (intra-cluster density)

$$\text{Cohesion}(C_k) = \frac{\sum_{i,j \in C_k} A_{ij}}{|C_k|(|C_k|-1)}$$

Coupling (inter-cluster density)

$$\text{Coupling}(C_k, C_l) = \frac{\sum_{i \in C_k, j \in C_l} A_{ij}}{|C_k| \cdot |C_l|}$$
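The two density formulas translate directly into code. The sketch below is a minimal standard-library illustration with hypothetical helper names and a made-up toy graph. Returning 0 for singleton clusters and averaging coupling over unordered cluster pairs are assumptions; the notebooks may aggregate differently.

```python
from itertools import combinations

def cohesion(adj, members):
    """Sum of A_ij over ordered pairs i != j in the cluster, over |C|(|C|-1)."""
    size = len(members)
    if size < 2:
        return 0.0  # assumption: a singleton cluster has cohesion 0
    # Skipping i == j matches the formula when the diagonal of A is zero.
    total = sum(adj[i][j] for i in members for j in members if i != j)
    return total / (size * (size - 1))

def coupling(adj, members_k, members_l):
    """Sum of A_ij between two clusters, over |C_k| * |C_l|."""
    total = sum(adj[i][j] for i in members_k for j in members_l)
    return total / (len(members_k) * len(members_l))

# Toy graph: two triangles with internal weight 0.5 and one cross edge of 0.2.
n = 6
adj = [[0.0] * n for _ in range(n)]

def add_edge(u, v, w):
    adj[u][v] = adj[v][u] = w

clusters = [[0, 1, 2], [3, 4, 5]]
for cluster in clusters:
    for u, v in combinations(cluster, 2):
        add_edge(u, v, 0.5)
add_edge(2, 3, 0.2)

avg_cohesion = sum(cohesion(adj, c) for c in clusters) / len(clusters)
pair_values = [coupling(adj, a, b) for a, b in combinations(clusters, 2)]
avg_coupling = sum(pair_values) / len(pair_values)
print(avg_cohesion, avg_coupling)
```

In this toy case every internal pair is connected with weight 0.5, so cohesion is 0.5, while the single cross edge spread over nine possible pairs gives a coupling of about 0.022.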


📦 Dependencies

pip install networkx numpy scipy pandas scikit-learn matplotlib seaborn pyvis plotly python-louvain

🏁 Deliverables

  • README.md β€” Project documentation
  • intro_for_new_members.pdf β€” Onboarding document (7 pages)
  • data/synthetic_dependency_graph.csv β€” Dataset (320 edges, 60 nodes, 5 clusters)
  • data/node_clusters.csv β€” Ground-truth labels
  • notebooks/01_spectral_clustering.ipynb β€” Spectral method (21 cells)
  • notebooks/02_metaheuristic_ga.ipynb β€” Genetic algorithm (23 cells)
  • notebooks/03_community_detection.ipynb β€” Louvain + comparison (22 cells)
  • reports/*.png β€” 13 publication-quality figures
  • reports/*.csv β€” Numerical results for all methods

👥 Team & Roles

| Role | Responsibility |
|---|---|
| Intern A | Graph modeling, dataset generation, math formalization |
| Intern B | Optimization implementation (notebooks 1 & 2) |
| Intern C | Evaluation, visualization, notebook 3 & report |

📬 Contact

Supervisor: MSC, INPT, Rabat
Research Team: R2

"A good decomposition is not just clean code; it is a mathematical optimum."
