Anomaly (Outlier) Detection Models
This directory hosts scripts defining anomaly detection estimators (e.g., Isolation Forest, One-Class SVM) for use with train_anomaly_detection.py. Each file specifies a scikit-learn–compatible outlier detector and, if applicable, a parameter grid.
Key Points:
- Estimator: Must allow .fit(X) and .predict(X) or similar. Typically returns +1 / −1 for inliers / outliers (we unify to 0 / 1).
- Parameter Grid: You can define hyperparameters (like n_estimators, contamination) for potential searching.
- Default Approach: We do not rely on labeled anomalies (unsupervised). The script will produce a predictions CSV with 0 = normal, 1 = outlier.
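As a rough illustration of the points above, a model file might look something like the sketch below. The module-level names estimator and param_grid and the helper to_outlier_flags are hypothetical, chosen only for this example; the actual scripts in this directory may use different names. The grid values are likewise illustrative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical module-level names; the real model files may differ.
estimator = IsolationForest(random_state=42)
param_grid = {
    "n_estimators": [100, 200],
    "contamination": [0.01, 0.05],
}


def to_outlier_flags(raw_predictions):
    """Map scikit-learn's +1 (inlier) / -1 (outlier) to 0 (normal) / 1 (outlier)."""
    raw_predictions = np.asarray(raw_predictions)
    return np.where(raw_predictions == -1, 1, 0)


# Unsupervised fit: no labels are passed, only the feature matrix.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
estimator.fit(X_demo)
demo_flags = to_outlier_flags(estimator.predict(X_demo))
```

The conversion is kept in a small separate function so that any detector following the +1 / −1 convention can share it.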
Note: The main script train_anomaly_detection.py handles data loading, label encoding, dropping/selecting columns, the .fit(X), .predict(X) steps, saving the outlier predictions, and (optionally) a 2D plot with outliers in red.
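The data preparation mentioned in this note could be sketched roughly as follows. This is not the code from train_anomaly_detection.py; the function name prepare_features and the choice of LabelEncoder for non-numeric columns are assumptions made for illustration, and the small in-memory table stands in for a CSV file.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def prepare_features(df, drop_columns=None):
    """Drop the listed columns and integer-encode any non-numeric ones."""
    X = df.drop(columns=drop_columns or [])
    for col in X.select_dtypes(exclude="number").columns:
        # Assumption: each non-numeric column is encoded independently.
        X[col] = LabelEncoder().fit_transform(X[col].astype(str))
    return X


# Tiny stand-in for a table that would normally come from pd.read_csv(...).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "diagnosis": ["M", "B", "B"],
    "radius": [14.2, 11.8, 25.0],
    "texture": ["smooth", "rough", "smooth"],
})
X = prepare_features(df, drop_columns=["id", "diagnosis"])
```

Dropping identifier and label columns before fitting keeps the detector from treating them as features, which is the purpose of the --drop_columns option in the usage example.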
Available Anomaly Detection Models
Usage
For example, to detect outliers with an Isolation Forest:
python scripts/train_anomaly_detection.py \
--model_module isolation_forest \
--data_path data/breast_cancer/data.csv \
--drop_columns "id,diagnosis" \
--visualize
This command:
- Loads isolation_forest.py and sets up IsolationForest(...).
- Fits the model to the data, saves it, then calls predict(...).
- Saves a predictions.csv with an OutlierPrediction column.
- If --visualize is set, draws a 2D PCA scatter plot, coloring outliers red.
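The steps above can be approximated with plain scikit-learn calls, as in the sketch below. It is an assumption-laden illustration, not the script itself: synthetic data replaces the CSV, the detector is imported from sklearn.ensemble rather than from isolation_forest.py, the model is saved with joblib under a made-up file name, and the output CSV is assumed to hold only the OutlierPrediction column. The plotting call is omitted; the code stops at the 2D coordinates and per-point colors a scatter plot would use.

```python
import importlib
import tempfile
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Dynamic import, standing in for loading a model file by name.
module = importlib.import_module("sklearn.ensemble")
model = module.IsolationForest(contamination=0.05, random_state=0)

# Synthetic features standing in for the prepared dataset.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 5)), columns=[f"f{i}" for i in range(5)])

out_dir = Path(tempfile.mkdtemp())

# Fit, save the fitted model, then predict.
model.fit(X)
joblib.dump(model, out_dir / "model.joblib")  # hypothetical file name
raw = model.predict(X)                         # +1 = inlier, -1 = outlier
flags = np.where(raw == -1, 1, 0)              # 0 = normal, 1 = outlier

# Save the predictions.
out_path = out_dir / "predictions.csv"
pd.DataFrame({"OutlierPrediction": flags}).to_csv(out_path, index=False)

# Project to 2D with PCA; outliers would be drawn in red.
coords = PCA(n_components=2).fit_transform(X)
colors = np.where(flags == 1, "red", "blue")
```

With matplotlib installed, the arrays coords and colors could be passed to a scatter plot to reproduce the kind of figure the --visualize flag describes.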