metadata
title: EIS Topic Intelligence
sdk: gradio
sdk_version: 5.25.2
app_file: app.py
pinned: true
license: mit
short_description: EIS topic modelling with LLM council validation
EIS Topic Intelligence
SPJIMR Research Analytics: topic modelling pipeline for the Enterprise Information Systems journal corpus.
What It Does
- Loads a Scopus-exported CSV (needs Title and Abstract columns at minimum).
- Builds paper-level embeddings from Title + Abstract using the SPECTER2 transformer model; falls back to TF-IDF + SVD if transformers are unavailable.
- Runs UMAP + HDBSCAN parameter optimization targeting 15–25 crisp clusters with 5–100 papers per cluster.
- Falls back to KMeans only if density clustering cannot meet the required range.
- Labels each cluster through a 3-member LLM council:
  - Three Mistral council personas when MISTRAL_API_KEY is configured (live LLM mode).
  - Deterministic keyword/PAJAIS/local semantic fallback when no key is set; the app still runs end to end.
- Maps clusters to the 25 PAJAIS IS-research categories.
- Exports TCCM/computational-technique validation for the top-cited 100 papers.
- Provides a Compliance tab showing PASS / CONFIG_REQUIRED / INPUT_REQUIRED / MANUAL_REQUIRED for each requirement.
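The cluster targets listed above (15–25 clusters, 5–100 papers each) can be expressed as a simple acceptance check on a list of cluster labels. The following is a minimal sketch, not the pipeline's actual code: the function name `meets_cluster_targets` is hypothetical, and it assumes noise points carry the label -1 (the convention HDBSCAN uses) and are excluded from the counts.

```python
from collections import Counter

NOISE_LABEL = -1  # assumption: HDBSCAN-style label for unclustered papers


def meets_cluster_targets(labels, min_clusters=15, max_clusters=25,
                          min_size=5, max_size=100):
    """Return True if a labelling satisfies the stated cluster targets.

    Hypothetical helper for illustration; defaults mirror the README's
    15-25 clusters and 5-100 papers per cluster.
    """
    # Count papers per cluster, ignoring noise points.
    sizes = Counter(label for label in labels if label != NOISE_LABEL)
    if not (min_clusters <= len(sizes) <= max_clusters):
        return False
    # Every remaining cluster must fall inside the size bounds.
    return all(min_size <= count <= max_size for count in sizes.values())
```

A parameter search could call a check like this on each UMAP/HDBSCAN trial and switch to KMeans only when no trial passes.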
Main Deliverables
- outputs/comparison.csv: All clusters with labels, PAJAIS category, confidence, agreement
- outputs/taxonomy_map.json: PAJAIS taxonomy mapping + gap analysis
- outputs/topic_model_report.md: Full markdown report
- outputs/narrative.txt: Narrative summary
- outputs/cluster_optimization_log.csv: All UMAP/HDBSCAN parameter trials + scores
- outputs/llm_council_validation.csv: Per-cluster council vote evidence
- outputs/tccm_validation.csv: Top-100 cited papers with theory/method extraction
- outputs/compliance_checklist.csv: Professor requirement compliance
- outputs/run_metadata.json: Embedding model + selected parameters
- outputs/combined_labels.json: Full cluster data with keywords and titles
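After a run, it can be useful to confirm that every deliverable was actually written. The sketch below is an illustration only: `missing_outputs` is a hypothetical helper, and the file names are the ones listed in this section.

```python
from pathlib import Path

# Deliverable file names as listed in this README.
EXPECTED_OUTPUTS = [
    "comparison.csv",
    "taxonomy_map.json",
    "topic_model_report.md",
    "narrative.txt",
    "cluster_optimization_log.csv",
    "llm_council_validation.csv",
    "tccm_validation.csv",
    "compliance_checklist.csv",
    "run_metadata.json",
    "combined_labels.json",
]


def missing_outputs(output_dir="outputs"):
    """Return the expected deliverables that are absent from output_dir."""
    base = Path(output_dir)
    return [name for name in EXPECTED_OUTPUTS if not (base / name).is_file()]
```

An empty list means all ten files are present.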
Run Locally
pip install -r requirements.txt
python app.py
Open the Gradio URL, upload your Scopus CSV, and click ▶ Run Complete Pipeline.
For command-line generation (no UI):
python run_pipeline.py path/to/scopus.csv
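Before running either entry point, the input file can be checked for the two required columns, Title and Abstract. This is a minimal pre-flight sketch using only the standard library; `missing_columns` is a hypothetical helper, and opening the file with `utf-8-sig` is an assumption made so that a leading byte-order mark, if the export contains one, does not corrupt the first header name.

```python
import csv

REQUIRED_COLUMNS = {"Title", "Abstract"}  # minimum columns named in this README


def missing_columns(csv_path):
    """Return the required column names absent from the CSV header row."""
    # utf-8-sig strips a byte-order mark if present (assumption about the export).
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)
```

An empty result means the file has the minimum columns; anything else names what is missing.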
LLM Council Setup
Set MISTRAL_API_KEY as a Space secret (or in a local .env file) to activate live 3-LLM council labelling. The app runs fully without it using deterministic fallback.
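The choice between live and fallback labelling comes down to whether MISTRAL_API_KEY is set. The sketch below illustrates that decision; `council_mode` and the mode names "live" and "fallback" are hypothetical, not the app's actual identifiers. It reads only the process environment, so a local .env file would have to be loaded into the environment beforehand.

```python
import os


def council_mode(env=None):
    """Return "live" when MISTRAL_API_KEY is set and non-empty, else "fallback".

    Hypothetical helper: mode names are illustrative, not the app's own.
    """
    env = os.environ if env is None else env
    # Treat a missing, empty, or whitespace-only value as "no key".
    key = (env.get("MISTRAL_API_KEY") or "").strip()
    return "live" if key else "fallback"
```

Passing a plain dict as `env` makes the function easy to exercise without touching real secrets.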