| """ | |
| umap.py | |
| This module defines a Uniform Manifold Approximation and Projection (UMAP) model | |
| for dimensionality reduction. UMAP is a nonlinear dimensionality reduction technique | |
| that is efficient for visualizing and analyzing high-dimensional data. | |
| Key Features: | |
| - Often preserves more of the global structure than t-SNE while keeping local neighborhoods intact. | |
| - Scales to larger datasets more efficiently than t-SNE. | |
| - Suitable for exploratory data analysis and clustering. | |
| Parameters: | |
| - n_components (int): Number of dimensions for projection (default: 2 for visualization). | |
| - n_neighbors (int): Determines the size of the local neighborhood to consider for manifold approximation. | |
| - Typical values range between 5 and 50. | |
| - min_dist (float): Minimum distance allowed between points in the low-dimensional space. | |
| - Smaller values pack points into tighter clusters; larger values spread them out. | |
| - metric (str): Distance metric for computing similarity (default: 'euclidean'). | |
| Defaults: | |
| - n_components=2: Projects the data into a 2D space for visualization purposes. | |
| - n_neighbors=15: Balances local and global structure preservation. | |
| - min_dist=0.1: Allows moderately tight clusters without collapsing points onto each other. | |
| Requirements: | |
| - umap-learn library must be installed. | |
| """ | |
| # Import UMAP from the umap-learn library | |
| import umap.umap_ as umap | |
| # Define the UMAP estimator | |
| estimator = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1) # Default configuration for 2D projection | |
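To make the `n_neighbors` and `metric` parameters described in the docstring more concrete, here is a minimal standard-library sketch of what "the local neighborhood" of a point means: the `k` closest points under a chosen distance function. This is only an illustration and not how umap-learn computes neighbors internally. The helper name `k_nearest_neighbors` and the sample points are hypothetical and exist only for this example.

```python
import math


def k_nearest_neighbors(points, index, k, metric=math.dist):
    """Return the indices of the k points closest to points[index], nearest first.

    Hypothetical helper for illustration only; `metric` is any function that
    takes two points and returns a distance (Euclidean by default).
    """
    distances = [
        (metric(points[index], other), j)
        for j, other in enumerate(points)
        if j != index  # a point is not its own neighbor
    ]
    distances.sort()
    return [j for _, j in distances[:k]]


# Made-up data: two well-separated groups of 2D points.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.2),  # group A
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.3),  # group B
]

# A small neighborhood stays inside group A (local structure only).
small = k_nearest_neighbors(points, index=0, k=2)

# A larger neighborhood reaches into group B (more global structure).
large = k_nearest_neighbors(points, index=0, k=4)

print(small)
print(large)
```

With a small `k`, the neighborhood of the first point contains only members of its own group. With a larger `k`, it also pulls in points from the distant group, which is the local-versus-global trade-off that `n_neighbors` controls.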