| --- |
| title: persona-ui |
| colorFrom: blue |
| colorTo: indigo |
| sdk: docker |
| app_port: 8501 |
| pinned: false |
| --- |
| # Persona UI |
|
|
| [persona-ui on Hugging Face Spaces](https://huggingface.co/spaces/implicit-personalization/persona-ui) |
|
|
| Streamlit interface for persona vector extraction, analysis, and chat. |
|
|
| ## Overview |
|
|
| A web app built on top of [persona-vectors](../persona-vectors) that provides these tabs: |
|
|
| - **Chat**: interactive conversations with a model using persona-based system prompts (templated or biography) |
| - **Analysis**: load local or Hub persona vectors and explore cosine similarity, PCA, UMAP, attribute-colored projections, and dendrograms |
| - **Probing**: sweep and inspect linear probes trained over saved persona vectors |
| - **Extract**: run persona-vector extraction from Hugging Face persona datasets or a local JSONL dataset directly from the browser |
|
|
| ## Repository Layout |
|
|
| ``` |
| persona-ui/ |
| ├── app.py  # Main entry point (Streamlit) |
| ├── state.py  # Session state management (chat history, KV cache) |
| ├── tabs/ |
| │   ├── chat.py / chat_ui.py / chat_shared.py  # Chat tab |
| │   ├── compare_chat.py  # Side-by-side chat comparison mode |
| │   ├── analysis_core.py  # Analysis tab entry point |
| │   ├── analysis/  # Analysis tab internals |
| │   │   ├── _shared.py / _state.py  # Shared loading + session state |
| │   │   ├── cosine.py  # Cosine similarity view |
| │   │   ├── dendrogram.py  # Persona dendrograms |
| │   │   └── layered.py  # PCA/UMAP/Isomap projections |
| │   ├── extract.py  # Extraction tab |
| │   ├── probe.py / probe_ui.py  # Probe diagnostics + upload/tracing controls |
| │   └── probe_sweep.py  # Probe sweep tab |
| └── utils/ |
|     ├── analysis_sources.py  # Local + Hub persona-vector store wiring |
|     ├── chat.py  # Chat generation logic |
|     ├── chat_export.py  # Export chat logs to JSON |
|     ├── contrast.py  # Contrastive token log-prob coloring |
|     ├── datasets.py  # Dataset loader wrapper |
|     ├── helpers.py  # UI labels and slug helpers |
|     ├── probe_trace.py  # Chat-token activation tracing |
|     ├── probe_overlay.py  # Per-token probe-score overlay |
|     ├── probes.py / probe_files.py  # Probe loading, scoring, artifact paths |
|     ├── preload.py  # Background startup warmup |
|     └── runtime.py  # Model caching and NDIF queries |
| ``` |
|
|
| Dataset loading and environment helpers are provided by the sibling [persona-data](https://github.com/implicit-personalization/persona-data) package. |
| Core extraction, analysis, and steering logic comes from [persona-vectors](https://github.com/implicit-personalization/persona-vectors). |
|
|
| ## Installation |
|
|
| ```bash |
| uv sync |
| cp .env.example .env |
| ``` |
|
|
| ## Local Development |
|
|
| The checked-in dependency config uses published packages. For local package |
| work, uncomment the `tool.uv.sources` block in `pyproject.toml` and keep sibling checkouts next to this repo. |
|
|
| Example: |
|
|
| ```bash |
| git clone <persona-data-url> ../persona-data |
| git clone <persona-vectors-url> ../persona-vectors |
| ``` |
|
|
| Expected layout: |
|
|
| ```text |
| parent/ |
| ├── persona-ui |
| ├── persona-data |
| └── persona-vectors |
| ``` |
|
|
| ## Quickstart |
|
|
| ```bash |
| streamlit run app.py |
| ``` |
|
|
| ## Hugging Face Spaces Deployment |
|
|
| This app can be deployed to Hugging Face Spaces using Docker. |
|
|
| ### Build Locally |
|
|
| ```bash |
| docker build -t persona-ui . |
| # Pass your local .env if you want the container to use the same configuration |
| docker run --env-file .env --rm -p 8501:8501 persona-ui |
| ``` |
|
|
| ## Configuration |
|
|
| Copy `.env.example` to `.env` and fill in: |
|
|
| ```bash |
| NDIF_API_KEY=... # Optional shared NDIF key; users can also enter one per session |
| HF_HOME=... # Optional: Hugging Face cache directory |
| HF_TOKEN=... # Optional: higher Hugging Face Hub rate limits; public datasets do not require it |
| ARTIFACTS_DIR=... # Optional: where persona vectors are read from (default: ./artifacts) |
| PERSONA_VECTORS_HUB_REPO=... # Optional: default Analysis/Probing Hub dataset repo |
| PERSONA_UI_STORE_CACHE_ENTRIES=4 # Optional: open local/Hub vector stores kept warm |
| PERSONA_UI_VECTOR_CACHE_ENTRIES=4 # Optional: loaded analysis datasets kept warm |
| PERSONA_UI_PREPARED_CACHE_ENTRIES=8 # Optional: prepared projections / k-means groups kept warm |
| PERSONA_UI_FIGURE_STATE_ENTRIES=2 # Optional: recent rendered Analysis figures kept in-session |
| PERSONA_UI_PREPARED_STATE_ENTRIES=4 # Optional: recent projection-ready markers kept in-session |
| ``` |
|
|
| The app picks up `.env` automatically via `load_dotenv()` on startup, and hosted |
| environments such as Hugging Face Spaces can provide the same values as |
| environment variables. If `NDIF_API_KEY` is unset, Chat and Extract users are prompted for a per-session key when they need remote execution. |
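
As a rough illustration of how these variables could be consumed, the sketch below reads a few of them from a mapping and falls back when they are unset. The `read_settings` helper and its return shape are hypothetical and are not taken from the app's code. Only the variable names and the `./artifacts` default come from the block above; using the example cache-entry values as fallbacks is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: collect a few settings from environment variables."""
    env = os.environ if env is None else env

    def int_setting(name: str, fallback: int) -> int:
        # Treat unset or blank values as "use the fallback".
        raw = env.get(name, "").strip()
        return int(raw) if raw else fallback

    return {
        # None here means no shared key; users would enter one per session.
        "ndif_api_key": env.get("NDIF_API_KEY") or None,
        "artifacts_dir": Path(env.get("ARTIFACTS_DIR") or "./artifacts"),
        "hub_repo": env.get("PERSONA_VECTORS_HUB_REPO") or None,
        # Fallback numbers mirror the example values above (an assumption).
        "store_cache_entries": int_setting("PERSONA_UI_STORE_CACHE_ENTRIES", 4),
        "vector_cache_entries": int_setting("PERSONA_UI_VECTOR_CACHE_ENTRIES", 4),
        "prepared_cache_entries": int_setting("PERSONA_UI_PREPARED_CACHE_ENTRIES", 8),
    }
```

Calling `read_settings()` with no argument reads `os.environ`, so it would see whatever `load_dotenv()` or the hosting environment has already populated. Passing a plain dict keeps the helper easy to test.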
|
|
| ## Persona Vectors |
|
|
| The Analysis and Probing tabs read persona vectors from either a Hugging Face |
| dataset (pushed by `persona-vectors/main.py push` or the |
| `extraction_*.sh` scripts) or from local artifacts. The Extract tab writes |
| local artifacts to: |
|
|
| ``` |
| artifacts/ |
| ├── activations/<model_dir>/<mask_strategy>/<prompt_variant>/  # also: persona-vectors/... |
| │   ├── manifest.json |
| │   └── <persona_id>.safetensors |
| └── chats/<model_dir>/<persona_id>/ |
|     └── <export>.json |
| ``` |
|
|
| `<model_dir>` is the model name with `/` replaced by `__` (e.g. `google__gemma-2-9b-it`). |
| The manifest stores persona names, tensor shape metadata, and sample ids. |
| Chat exports still store `dataset_source` in the JSON payload. |
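
The layout above can be sketched as a few path helpers. This is a minimal illustration of the naming rule and directory structure described here, not the app's actual code: the function names are hypothetical, and the `mask_strategy` and `prompt_variant` values in the usage note are placeholders.

```python
from pathlib import Path


def model_dir_name(model_name: str) -> str:
    """Replace "/" with "__", as described for <model_dir>."""
    return model_name.replace("/", "__")


def activations_dir(artifacts_dir, model_name: str,
                    mask_strategy: str, prompt_variant: str) -> Path:
    """Directory expected to hold manifest.json and per-persona tensors."""
    return (Path(artifacts_dir) / "activations" / model_dir_name(model_name)
            / mask_strategy / prompt_variant)


def vector_path(artifacts_dir, model_name: str, mask_strategy: str,
                prompt_variant: str, persona_id: str) -> Path:
    """Path of one persona's .safetensors file under the activations tree."""
    base = activations_dir(artifacts_dir, model_name, mask_strategy, prompt_variant)
    return base / f"{persona_id}.safetensors"


def chat_export_dir(artifacts_dir, model_name: str, persona_id: str) -> Path:
    """Directory expected to hold exported chat JSON files for one persona."""
    return Path(artifacts_dir) / "chats" / model_dir_name(model_name) / persona_id
```

For example, `model_dir_name("google/gemma-2-9b-it")` gives `google__gemma-2-9b-it`, and `vector_path("artifacts", "google/gemma-2-9b-it", "some_mask", "some_variant", "p1")` points at `p1.safetensors` inside the matching `activations/` subdirectory.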
|
|