Model Card for lanternfly-kde-model

This is an off-the-shelf kernel density estimator (KDE) from SciPy. Here it is used to track the relative density of lanternfly sightings in Pittsburgh.

Model Details

Model Description

This model is a KDE: an unsupervised model that estimates a continuous density from a set of discrete points.

It is an off-the-shelf model from the SciPy library, fitted and stored to allow rapid access.
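As a minimal sketch of what fitting such a model looks like, the snippet below builds a `scipy.stats.gaussian_kde` on synthetic points. The coordinates are randomly generated around an approximate CMU location and are only an assumption for illustration; they are not the real sighting data.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in for sighting coordinates (NOT the real dataset):
# 200 points scattered around an approximate CMU location.
rng = np.random.default_rng(0)
lon = rng.normal(loc=-79.943, scale=0.005, size=200)
lat = rng.normal(loc=40.443, scale=0.005, size=200)

# gaussian_kde expects an array of shape (n_dimensions, n_points).
coords = np.vstack([lon, lat])
kde = gaussian_kde(coords)  # the default bandwidth rule is "scott"

# Evaluate the estimated density at one (longitude, latitude) location.
centre_density = kde(np.array([[-79.943], [40.443]]))[0]
print(kde.d, kde.n, centre_density)
```

The fitted object records the number of dimensions (`kde.d`) and the number of points (`kde.n`), and can be called on any array of coordinates with the same `(n_dimensions, n_points)` layout.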

Developed by: Devin DeCosmo
Model type: Kernel density estimator (unsupervised)
Language(s) (NLP): English
License: MIT
Finetuned from model: SciPy Gaussian KDE

Uses

This model is used to estimate the density of values relative to one another, on a scale from 0 to 1. In this case, it uses longitude and latitude as the X and Y coordinates for the analysis.
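The raw output of `gaussian_kde` is a probability density, which is not bounded by 1. The card does not say how the 0 to 1 scale is produced, so the sketch below assumes one simple option: dividing every density by the maximum. The data is again synthetic.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in coordinates (NOT the real dataset).
rng = np.random.default_rng(0)
coords = np.vstack([
    rng.normal(-79.943, 0.005, 200),  # longitude as X
    rng.normal(40.443, 0.005, 200),   # latitude as Y
])
kde = gaussian_kde(coords)

# Raw densities at each sighting; these can be far larger than 1
# because the coordinates span only a few hundredths of a degree.
raw = kde(coords)

# Assumed scaling: divide by the maximum so the densest point maps to 1.
relative = raw / raw.max()
print(raw.max(), relative.min(), relative.max())
```

Min-max scaling would be another reasonable choice; it would additionally map the sparsest point to exactly 0.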

Direct Use

The direct use is estimating the density of the lanternfly sightings in our geolocation dataset. Because the Gaussian KDE is a general unsupervised learning model, it could also be used for other datasets with latitude/longitude coordinates.

Out-of-Scope Use

KDEs cannot perform regression or classification on out-of-set data. They can only estimate concentration within the region covered by the data provided.
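To illustrate this limit, the sketch below fits a KDE on synthetic points near Pittsburgh and then queries a location near Philadelphia. Both sets of coordinates are approximate and chosen only for illustration. Far from the data, the estimate collapses to essentially zero, which says nothing about whether lanternflies are actually present there.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in coordinates near Pittsburgh (NOT the real dataset).
rng = np.random.default_rng(0)
coords = np.vstack([
    rng.normal(-79.943, 0.005, 200),
    rng.normal(40.443, 0.005, 200),
])
kde = gaussian_kde(coords)

# One query inside the sampled region, one far outside it.
inside = kde(np.array([[-79.943], [40.443]]))[0]
outside = kde(np.array([[-75.165], [39.953]]))[0]  # roughly Philadelphia

print(inside, outside)
```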

Bias, Risks, and Limitations

This KDE can only use the data in our current dataset. At this time, that is data collected at CMU during Fall 2025, which puts geographic and temporal constraints on the current model fit.

This model only shows where lanternfly sightings are most concentrated. It does not and cannot explain the reasons behind these density measurements. Additional tools are needed to apply the KDE outputs to research tasks.

Recommendations

We recommend using this model with data gathered for a specific area and time period. This lets the KDE accurately model the data and regions provided.

Training and Testing Details

Training and Testing Data

This model was fit on our geolocation dataset, rlogh/lanternfly_swatter_training.

Training and Testing Procedure

KDE models are not trained like standard ML models. Instead, they read the entire dataset, or a subset of it, and calculate relative densities based on the proximity of points.
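A small sketch can make this concrete. With `scipy.stats.gaussian_kde`, fitting stores the points and computes a kernel covariance. The density at a query location is then the average of Gaussian kernels centred on each stored point. The data and the query location below are synthetic assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde, multivariate_normal

# Synthetic stand-in coordinates (NOT the real dataset).
rng = np.random.default_rng(2)
coords = np.vstack([
    rng.normal(-79.943, 0.005, 100),
    rng.normal(40.443, 0.005, 100),
])
kde = gaussian_kde(coords)

# "Fitting" keeps the data as-is; there are no learned weights.
stored_unchanged = np.array_equal(kde.dataset, coords)

# Reproduce the density by hand: average one Gaussian kernel per point.
# By symmetry, a kernel centred on each point and evaluated at the query
# equals a Gaussian centred on the query evaluated at each point.
query = np.array([-79.940, 40.445])
kernel_values = multivariate_normal.pdf(coords.T, mean=query, cov=kde.covariance)
manual_density = kernel_values.mean()

library_density = kde(query.reshape(2, 1))[0]
print(stored_unchanged, manual_density, library_density)
```

Because every query touches every stored point, evaluation cost grows with the size of the dataset, which is one reason to fit on a subset when the data is large.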

Training and Testing Hyperparameters

The amount of smoothing applied by the KDE depends on the bandwidth estimation method used.

In this case, the default method, "scott", was used. This gave a middle ground between small, distinct clusters and larger overall trends. Additional experimentation with the bandwidth method may be necessary for future datasets.
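The sketch below compares bandwidth settings through the `bw_method` argument of `gaussian_kde`, using two synthetic clusters as a stand-in for real sightings. The scalar value `0.1` is an arbitrary choice for illustration. One detail worth knowing: SciPy's Scott factor is `n ** (-1 / (d + 4))` and its Silverman factor is `(n * (d + 2) / 4) ** (-1 / (d + 4))`, so for two-dimensional data such as longitude and latitude the two rules give the same factor.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in: two clusters of 150 points each (NOT the real dataset).
rng = np.random.default_rng(1)
cluster_a = rng.normal([-79.943, 40.443], 0.003, size=(150, 2))
cluster_b = rng.normal([-79.925, 40.455], 0.003, size=(150, 2))
coords = np.vstack([cluster_a, cluster_b]).T  # shape (2, 300)

kde_scott = gaussian_kde(coords, bw_method="scott")
kde_silverman = gaussian_kde(coords, bw_method="silverman")
kde_narrow = gaussian_kde(coords, bw_method=0.1)  # arbitrary fixed factor

# The factor multiplies the data covariance; smaller means less smoothing.
for name, model in [("scott", kde_scott),
                    ("silverman", kde_silverman),
                    ("scalar 0.1", kde_narrow)]:
    print(name, model.factor)
```

A smaller factor sharpens individual clusters at the cost of a noisier surface, while a larger one emphasizes the overall trend.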

Evaluation

There are no metrics like accuracy for unsupervised models. To check that the model fits the dataset correctly, the plot was inspected by hand. This included testing different bandwidth settings, such as "scott", "silverman", and fixed numeric values, to determine the best fit. From this, "scott" was found to produce the most readable hotspots.
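One way to produce a surface for this kind of visual check is to evaluate the KDE on a regular grid. The sketch below does that on synthetic data and reports the grid cell with the highest density. The grid resolution of 100 by 100 is an arbitrary assumption, and the resulting array could be passed to any plotting library.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in coordinates (NOT the real dataset).
rng = np.random.default_rng(0)
lon = rng.normal(-79.943, 0.005, 200)
lat = rng.normal(40.443, 0.005, 200)
kde = gaussian_kde(np.vstack([lon, lat]))

# Regular 100 x 100 grid covering the extent of the data.
xx, yy = np.mgrid[lon.min():lon.max():100j, lat.min():lat.max():100j]
grid = np.vstack([xx.ravel(), yy.ravel()])

# Density surface with the same shape as the grid, ready for plotting.
zz = kde(grid).reshape(xx.shape)

# Location of the strongest hotspot on the grid.
i, j = np.unravel_index(zz.argmax(), zz.shape)
hotspot = (xx[i, j], yy[i, j])
print(hotspot)
```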

Results

The result is a useful, lightweight model from SciPy that can rapidly model the relative densities of collected lanternfly data.

Its limits come from the bandwidth parameter and from the limits of the KDE function itself. In the future, if the bandwidth could be adjusted automatically based on the input region, the model could be made more generalizable.

Summary

This model is a pre-built KDE from the SciPy library. In this case, it is used to map lanternfly sighting data points for research and user purposes.

