# UAP Analysis App: Improvement and Debugging Notes

## Scope Reviewed

- Backend API: `backend/main.py`
- Frontend app/store/pages: `frontend/src/**`
- Project positioning/features: `README.md`
## High-Priority Debugging and Reliability Issues

### 1) Global in-memory backend state is shared across all users/sessions

- File: `backend/main.py`
- Current behavior: `state` is a single process-level dictionary storing dataset, filtered data, analysis results, and query context.
- Risk:
  - Cross-user data leakage.
  - Race conditions (one user can overwrite another user’s data mid-session).
  - Non-deterministic behavior in multi-worker deployment, because each worker process holds its own copy of `state`.
- Improvement:
  - Introduce per-session/project IDs and isolate state in Redis or a DB cache keyed by session.
  - Add dataset ownership + TTL cleanup.
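
A minimal sketch of the session-keyed isolation and TTL cleanup described above. `SessionStore` and its method names are hypothetical, and the in-process dictionary is only a stand-in: it shows the keying and expiry logic but would not fix the multi-worker case, which still needs Redis or a DB cache.

```python
import time
from typing import Any, Dict, Optional


class SessionStore:
    """In-process stand-in for a Redis/DB cache keyed by session ID (hypothetical)."""

    def __init__(self, ttl_seconds: float = 3600.0) -> None:
        self.ttl_seconds = ttl_seconds
        # session_id -> (last_access_time, per-session state dict)
        self._sessions: Dict[str, tuple] = {}

    def get(self, session_id: str, now: Optional[float] = None) -> Dict[str, Any]:
        """Return the state for one session, creating it on first use."""
        now = time.monotonic() if now is None else now
        self.purge_expired(now)
        _, state = self._sessions.get(session_id, (now, {}))
        self._sessions[session_id] = (now, state)  # refresh last-access time
        return state

    def purge_expired(self, now: Optional[float] = None) -> int:
        """Drop sessions idle longer than the TTL; return how many were removed."""
        now = time.monotonic() if now is None else now
        expired = [sid for sid, (seen, _) in self._sessions.items()
                   if now - seen > self.ttl_seconds]
        for sid in expired:
            del self._sessions[sid]
        return len(expired)
```

Each endpoint would then read and write `store.get(session_id)` instead of the shared `state` dictionary; how the session ID reaches the backend (cookie, header, path parameter) is left open here.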
### 2) Dashboard “Analysis Runs” counter is effectively broken

- File: `backend/main.py`
- Current behavior: `/api/dashboard/summary` reports `analyzed_columns` from `state["col_names"]`, but `col_names` is never populated in `run_analysis`.
- Impact: the dashboard can show incorrect or always-zero run stats.
- Improvement:
  - Set `state["col_names"] = req.columns` (validated columns only) during analysis.
  - Consider storing the analysis timestamp and a run count.
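
The fix above could look roughly like the helper below. It is a sketch under assumptions: `record_analysis_run`, `last_analysis_at`, and `analysis_runs` are hypothetical names, and `state` is whatever dictionary the backend uses for the current session.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List


def record_analysis_run(state: Dict[str, Any], requested: Iterable[str],
                        available: Iterable[str]) -> List[str]:
    """Store validated columns, a UTC timestamp, and a run counter in `state`."""
    available_set = set(available)
    # Keep request order, drop unknown columns and duplicates.
    validated = [c for c in dict.fromkeys(requested) if c in available_set]
    state["col_names"] = validated
    state["last_analysis_at"] = datetime.now(timezone.utc).isoformat()  # hypothetical key
    state["analysis_runs"] = state.get("analysis_runs", 0) + 1          # hypothetical key
    return validated
```

The dashboard summary can then report `len(state["col_names"])` and `state["analysis_runs"]` as separate numbers rather than inferring one from the other.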
### 3) Numeric filtering ignores partial ranges

- Files: `backend/main.py`, `frontend/src/components/data/FilterPanel.tsx`
- Current behavior: the numeric filter is applied only when both `min_val` and `max_val` are provided.
- Impact: minimum-only or maximum-only constraints entered by the user are silently ignored.
- Improvement:
  - Support all combinations: min-only, max-only, and bounded range.
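
A sketch of the intended semantics, applying each bound independently. It operates on plain dictionaries to stay self-contained; `filter_numeric` is a hypothetical name, and the real backend may work on a different data structure, where the same idea becomes two independent conditions.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_numeric(rows: Iterable[Dict[str, Any]], column: str,
                   min_val: Optional[float] = None,
                   max_val: Optional[float] = None) -> List[Dict[str, Any]]:
    """Keep rows whose `column` value satisfies whichever bounds were given."""
    if min_val is None and max_val is None:
        return list(rows)  # no constraint supplied
    kept = []
    for row in rows:
        value = row.get(column)
        # Missing, non-numeric, or NaN values cannot satisfy a bound.
        if not isinstance(value, (int, float)) or value != value:
            continue
        if min_val is not None and value < min_val:
            continue
        if max_val is not None and value > max_val:
            continue
        kept.append(row)
    return kept
```

Excluding missing values whenever a bound is set is a design choice made for this sketch; an explicit "include nulls" option would be an alternative.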
### 4) Filtering error handling is silent on frontend

- File: `frontend/src/components/data/DataExplorer.tsx`
- Current behavior: `handleFilter` catches errors and suppresses them.
- Impact: the user cannot tell whether filtering failed or returned no matches.
- Improvement:
  - Surface the backend error in the UI (same pattern used by upload/load errors).
### 5) Backend query endpoint has expensive prompt construction and token-limit risk

- File: `backend/main.py`
- Current behavior: concatenates up to 500 rows of full text into one prompt string.
- Risks:
  - Large latency/cost spikes.
  - Prompt truncation or model failure on long text columns.
- Improvement:
  - Add token-aware chunking/sampling and optional map-reduce summarization.
  - Enforce max characters/tokens per request with clear user feedback.
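
One way to enforce a budget is sketched below. It uses character counts as a rough proxy for tokens, which is an assumption; `build_prompt_context` and both default limits are hypothetical and would need tuning against the actual model limits.

```python
from typing import Iterable, List, Tuple


def build_prompt_context(rows: Iterable[str], max_row_chars: int = 300,
                         max_total_chars: int = 12000) -> Tuple[str, int, int]:
    """Return (context, rows_included, rows_seen) within a character budget."""
    parts: List[str] = []
    used = 0
    seen = 0
    full = False
    for text in rows:
        seen += 1
        if full:
            continue  # keep counting so the caller can report what was dropped
        snippet = text[:max_row_chars]  # cap each row individually
        cost = len(snippet) + 1         # +1 for the newline separator
        if used + cost > max_total_chars:
            full = True
            continue
        parts.append(snippet)
        used += cost
    return "\n".join(parts), len(parts), seen
```

Returning both counts lets the endpoint tell the user something like "used 3 of 10 rows" instead of failing or truncating silently.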
### 6) CORS config is overly permissive and potentially invalid for credentials

- File: `backend/main.py`
- Current behavior: `allow_origins=["*"]` with `allow_credentials=True`.
- Risk: browsers reject a literal `Access-Control-Allow-Origin: *` on credentialed requests, so behavior depends on how the middleware handles this combination; either way the security posture is weak for production.
- Improvement:
  - Use an explicit origin allowlist per environment.
  - Keep credentials disabled unless required.
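
A sketch of an environment-driven allowlist. The variable names `ALLOWED_ORIGINS` and `CORS_ALLOW_CREDENTIALS`, the localhost default, and the method/header lists are all assumptions; the returned keys match the keyword arguments of FastAPI's `CORSMiddleware`.

```python
import os
from typing import Any, Dict, Mapping, Optional


def cors_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Build CORS middleware kwargs from environment variables (names assumed)."""
    env = os.environ if env is None else env
    raw = env.get("ALLOWED_ORIGINS", "http://localhost:5173")  # assumed dev default
    origins = [o.strip() for o in raw.split(",") if o.strip()]
    if "*" in origins:
        raise ValueError("Wildcard origin is not allowed; list origins explicitly.")
    allow_credentials = env.get("CORS_ALLOW_CREDENTIALS", "false").lower() == "true"
    return {
        "allow_origins": origins,
        "allow_credentials": allow_credentials,  # off unless explicitly enabled
        "allow_methods": ["GET", "POST"],
        "allow_headers": ["Content-Type"],
    }


# Assumed wiring in backend/main.py (requires FastAPI):
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(CORSMiddleware, **cors_settings())
```

Rejecting `*` at startup makes a misconfigured deployment fail loudly rather than quietly falling back to the permissive behavior.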
## Data-Analysis Quality Gaps (Core Product)

### 7) Analysis pipeline is currently mock/simulated, not aligned with README claims

- File: `backend/main.py`
- Current behavior: analysis uses value counts + random 2D points + correlation proxy for “XGBoost-like” output.
- Impact: users may interpret synthetic outputs as real model outputs.
- Improvement:
  - Label this mode explicitly as `demo/mock` in API/UI.
  - Add a production pipeline path using real embeddings + UMAP/HDBSCAN + trained model artifacts.
### 8) Cluster assignment is too coarse for high-cardinality text fields

- File: `backend/main.py`
- Current behavior: top 32 frequent values become “clusters”; all others mapped to `Other`.
- Impact: weak signal extraction for long-tail UAP narratives.
- Improvement:
  - Use embedding-based similarity clustering with min-cluster-size tuning.
  - Keep top terms as labels only, not as cluster definitions.
### 9) “XGBoost” results are correlation-based placeholders

- File: `backend/main.py`
- Current behavior: feature importance derived from absolute correlation among category codes, with random “accuracy”.
- Impact: misleading ML interpretation.
- Improvement:
  - Either rename section (`Association Importance`) or run real train/validation with metrics and confidence intervals.
### 10) Cramér’s V stability safeguards are minimal

- File: `backend/main.py`
- Current behavior: exceptions are swallowed and replaced with `0.0` values.
- Impact: the matrix can hide data-quality problems, since a failed computation is indistinguishable from a true zero association.
- Improvement:
  - Return diagnostics (insufficient contingency size, sparse table warning, low sample counts).
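
A pure-Python sketch of returning diagnostics alongside the statistic. It computes the uncorrected Cramér's V, $V = \sqrt{\chi^2 / (n \cdot (\min(r, c) - 1))}$; the function name, the warning strings, the `min_samples` default, and the expected-count threshold of 5 (a common rule of thumb) are assumptions, not the backend's current behavior.

```python
import math
from collections import Counter
from typing import Any, Dict, List, Sequence


def cramers_v_with_diagnostics(x: Sequence[Any], y: Sequence[Any],
                               min_samples: int = 30) -> Dict[str, Any]:
    """Uncorrected Cramér's V plus warnings instead of a silent 0.0."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    warnings: List[str] = []
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    if len(row_totals) < 2 or len(col_totals) < 2:
        # A table with one row or one column has no defined association.
        warnings.append("insufficient_contingency_size")
        return {"value": None, "n": n, "warnings": warnings}
    if n < min_samples:
        warnings.append("low_sample_count")
    observed = Counter(pairs)
    chi2 = 0.0
    sparse_cells = 0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            if expected < 5:
                sparse_cells += 1
            chi2 += (observed.get((r, c), 0) - expected) ** 2 / expected
    if sparse_cells:
        warnings.append("sparse_table")
    k = min(len(row_totals), len(col_totals)) - 1
    return {"value": math.sqrt(chi2 / (n * k)), "n": n, "warnings": warnings}
```

Returning `None` rather than `0.0` for the undefined case lets the UI render an empty cell with a tooltip instead of implying "no association".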
## UX and Feature Improvements for Analysis Workflows

### 11) Add reproducibility controls

- Current gap: random projections are generated without surfaced seed controls; pipeline details are hidden.
- Improvement:
  - UI inputs for random seed and analysis config profile.
  - Persist configuration alongside results.
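
A sketch of seeding the random projection from a user-supplied value and keeping the configuration with the result. `random_projection`, `run_with_config`, and the config keys are hypothetical; the uniform 2D points stand in for whatever the mock pipeline currently generates.

```python
import json
import random
from typing import Any, Dict, List, Tuple


def random_projection(n_points: int, seed: int) -> List[Tuple[float, float]]:
    """Generate 2D points from a local, seeded generator."""
    rng = random.Random(seed)  # local generator; global random state is untouched
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_points)]


def run_with_config(n_points: int, config: Dict[str, Any]) -> Dict[str, Any]:
    """Return the points together with the exact configuration that produced them."""
    points = random_projection(n_points, config["seed"])
    return {"config": config, "points": points}


result = run_with_config(5, {"seed": 42, "profile": "default"})
serialized = json.dumps(result)  # config travels with the result in exports
```

Re-running with the same `seed` reproduces the same points, which is what makes a saved or exported result checkable later.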
### 12) Add time/location-first analysis modules

- Context: UAP datasets are usually spatiotemporal.
- Improvement ideas:
  - Temporal anomaly detection (daily/weekly trend breaks).
  - Geo heatmaps + hotspot evolution over time.
  - Co-occurrence matrices for shape/light/motion features.
### 13) Add model/result provenance panel

- Improvement:
  - Track dataset hash, row count after filters, analysis timestamp, selected columns, pipeline version.
  - Show this metadata in Analysis and export payloads.
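
The provenance fields listed above could be assembled as in this sketch. The function name, the dictionary keys, and the `PIPELINE_VERSION` value are hypothetical; hashing the raw uploaded bytes with SHA-256 is one reasonable choice of dataset hash.

```python
import hashlib
from datetime import datetime, timezone
from typing import Any, Dict, List

PIPELINE_VERSION = "mock-0.1"  # hypothetical version string


def build_provenance(raw_bytes: bytes, filtered_row_count: int,
                     columns: List[str]) -> Dict[str, Any]:
    """Collect the metadata needed to trace an analysis result to its inputs."""
    return {
        "dataset_sha256": hashlib.sha256(raw_bytes).hexdigest(),
        "row_count_after_filters": filtered_row_count,
        "analyzed_at": datetime.now(timezone.utc).isoformat(),
        "selected_columns": list(columns),
        "pipeline_version": PIPELINE_VERSION,
    }
```

Attaching this dictionary to every analysis response and export means two results can be compared by hash and version before anyone compares their charts.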
### 14) Improve filter capabilities for real EDA

- Current gap: categorical filter relies on top-values list and cannot easily search rare categories.
- Improvement:
  - Add searchable categorical picker and “include nulls/exclude nulls”.
  - Add reusable saved filter presets.
### 15) Add export/reporting features

- Improvement:
  - Export filtered dataset, correlation matrix, and feature-importance JSON/CSV.
  - One-click markdown/PDF report with charts and configuration metadata.
## Engineering Quality and Maintainability

### 16) Add automated tests for core API behaviors

- Suggested minimal suite:
  - `/api/data/load`, `/api/data/filter`, `/api/analyze/run`, `/api/dashboard/summary`, `/api/query/gemini` failure paths.
  - Numeric filter edge cases (min-only/max-only).
  - State isolation once sessionized.
### 17) Add request/analysis observability

- Improvement:
  - Structured logging + request IDs.
  - Timing metrics per stage (load/filter/analyze/query).
  - Distinguish user errors (4xx) from pipeline errors (5xx).
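
A sketch combining the three points above in one context manager built on the standard `logging` module. The logger name, the record fields, and the convention that `ValueError` signals a user error are assumptions made for illustration.

```python
import json
import logging
import time
import uuid
from contextlib import contextmanager
from typing import Iterator, Optional

logger = logging.getLogger("uap.api")  # hypothetical logger name


@contextmanager
def timed_stage(stage: str, request_id: Optional[str] = None) -> Iterator[dict]:
    """Log one JSON line per stage with request ID, outcome, and duration."""
    record = {"request_id": request_id or uuid.uuid4().hex, "stage": stage}
    start = time.perf_counter()
    try:
        yield record
        record["outcome"] = "ok"
    except ValueError:
        record["outcome"] = "user_error"      # assumed to map to a 4xx response
        raise
    except Exception:
        record["outcome"] = "pipeline_error"  # assumed to map to a 5xx response
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 3)
        logger.info(json.dumps(record))
```

Wrapping each of load/filter/analyze/query in `with timed_stage("filter", request_id):` yields one log line per stage that can be grouped by `request_id`.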
### 18) Clarify mode separation: demo vs production

- Improvement:
  - Feature flags/environment variable to select mock vs full analysis backend.
  - UI badges and warning copy to prevent scientific misinterpretation.
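
A sketch of an environment-variable switch that also tags every response with its mode, so the UI has something to render a badge from. The variable name `ANALYSIS_MODE`, the mode values, and the warning text are assumptions.

```python
import os
from typing import Any, Dict, Mapping, Optional

VALID_MODES = ("mock", "full")  # hypothetical mode names


def analysis_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Read the analysis mode from the environment, defaulting to mock."""
    env = os.environ if env is None else env
    mode = env.get("ANALYSIS_MODE", "mock").strip().lower()
    if mode not in VALID_MODES:
        raise ValueError(f"ANALYSIS_MODE must be one of {VALID_MODES}, got {mode!r}")
    return mode


def tag_response(payload: Dict[str, Any], mode: str) -> Dict[str, Any]:
    """Return a copy of the payload annotated with the mode and a warning if mock."""
    tagged = dict(payload)
    tagged["mode"] = mode
    if mode == "mock":
        tagged["warning"] = "Simulated output; not a trained-model result."
    return tagged
```

Defaulting to `mock` keeps the safer labeling in place whenever the variable is unset.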
## Suggested Implementation Order

1. Fix state isolation and dashboard run-count correctness.
2. Fix filtering behavior + frontend error surfacing.
3. Mark current analysis mode as mock and rename misleading outputs.
4. Add reproducibility/provenance metadata and exports.
5. Introduce real embedding + clustering + model pipeline behind a feature flag.