File size: 1,144 Bytes
4c91838
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

"""
dbscan.py

This module defines a DBSCAN clustering model and a parameter grid for hyperparameter tuning.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering algorithm.
It groups points closely packed together and marks as outliers those points in low-density regions.

Parameters:
    - eps (float): The maximum distance between two samples for them to be considered as in the same neighborhood.
    - min_samples (int): The number of samples (or total weight) in a neighborhood for a point to be considered a core point.
"""

from sklearn.cluster import DBSCAN

# Define the DBSCAN estimator
estimator = DBSCAN(eps=0.5, min_samples=5)

# Define the hyperparameter grid for tuning
param_grid = {
    'model__eps': [0.2, 0.5, 1.0, 1.5, 2.0],  # Explore a wide range of neighborhood radii
    'model__min_samples': [3, 5, 10, 20]  # Adjust density thresholds for core points
}

# Default scoring metric
# Note: Silhouette score works best for convex clusters and may not always be ideal for DBSCAN.
# For more complex shapes, consider custom evaluation metrics.
default_scoring = 'silhouette'