--- title: persona-ui colorFrom: blue colorTo: indigo sdk: docker app_port: 8501 pinned: false --- # Persona UI [![Deploy to Hugging Face Spaces](https://huggingface.co/spaces/implicit-personalization/persona-ui/badge.svg)](https://huggingface.co/spaces/implicit-personalization/persona-ui) Streamlit interface for persona vector extraction, analysis, and chat. ## Overview A web app built on top of [persona-vectors](../persona-vectors) that provides these tabs: - **Chat** — interactive conversations with a model using persona-based system prompts (templated or biography) - **Analysis** — load local or Hub persona vectors and explore cosine similarity, PCA, UMAP, attribute-colored projections, and dendrograms - **Probing** — sweep and inspect linear probes trained over saved persona vectors - **Extract** — run persona-vector extraction from HuggingFace persona datasets or a local JSONL dataset directly from the browser ## Repository Layout ``` persona-ui/ ├── app.py # Main entry point (Streamlit) ├── state.py # Session state management (chat history, KV cache) ├── tabs/ │ ├── chat.py / chat_ui.py / chat_shared.py # Chat tab │ ├── compare_chat.py # Side-by-side chat comparison mode │ ├── analysis_core.py # Analysis tab entry point │ ├── analysis/ # Analysis tab internals │ │ ├── _shared.py / _state.py # Shared loading + session state │ │ ├── cosine.py # Cosine similarity view │ │ ├── dendrogram.py # Persona dendrograms │ │ └── layered.py # PCA/UMAP/Isomap projections │ ├── extract.py # Extraction tab │ ├── probe.py / probe_ui.py # Probe diagnostics + upload/tracing controls │ └── probe_sweep.py # Probe sweep tab └── utils/ ├── analysis_sources.py # Local + Hub persona-vector store wiring ├── chat.py # Chat generation logic ├── chat_export.py # Export chat logs to JSON ├── contrast.py # Contrastive token log-prob coloring ├── datasets.py # Dataset loader wrapper ├── helpers.py # UI labels and slug helpers ├── probe_trace.py # Chat-token activation tracing ├── probe_overlay.py # Per-token probe-score overlay ├── probes.py / probe_files.py # Probe loading, scoring, artifact paths ├── preload.py # Background startup warmup └── runtime.py # Model caching and NDIF queries ``` Dataset loading and environment helpers are provided by the sibling [persona-data](https://github.com/implicit-personalization/persona-data) package. Core extraction, analysis, and steering logic comes from [persona-vectors](https://github.com/implicit-personalization/persona-vectors). ## Installation ```bash uv sync cp .env.example .env ``` ## Local Development The checked-in dependency config uses published packages. For local package work, uncomment the `tool.uv.sources` block in `pyproject.toml` and keep sibling checkouts next to this repo. Example: ```bash git clone ../persona-data git clone ../persona-vectors ``` Expected layout: ```text parent/ ├── persona-ui ├── persona-data └── persona-vectors ``` ## Quickstart ```bash streamlit run app.py ``` ## Hugging Face Spaces Deployment This app can be deployed to Hugging Face Spaces using Docker. ### Build Locally ```bash docker build -t persona-ui . # Pass your local .env if you want the container to use the same configuration docker run --env-file .env --rm -p 8501:8501 persona-ui ``` ## Configuration Copy `.env.example` to `.env` and fill in: ```bash NDIF_API_KEY=... # Optional shared NDIF key; users can also enter one per session HF_HOME=... # Optional: HuggingFace cache directory HF_TOKEN=... # Optional: higher Hugging Face Hub rate limits; public datasets do not require it ARTIFACTS_DIR=... # Optional: where persona vectors are read from (default: ./artifacts) PERSONA_VECTORS_HUB_REPO=... # Optional: default Analysis/Probing Hub dataset repo PERSONA_UI_STORE_CACHE_ENTRIES=4 # Optional: open local/Hub vector stores kept warm PERSONA_UI_VECTOR_CACHE_ENTRIES=4 # Optional: loaded analysis datasets kept warm PERSONA_UI_PREPARED_CACHE_ENTRIES=8 # Optional: prepared projections / k-means groups kept warm PERSONA_UI_FIGURE_STATE_ENTRIES=2 # Optional: recent rendered Analysis figures kept in-session PERSONA_UI_PREPARED_STATE_ENTRIES=4 # Optional: recent projection-ready markers kept in-session ``` The app picks up `.env` automatically via `load_dotenv()` on startup, and hosted environments such as Hugging Face Spaces can provide the same values as environment variables. If `NDIF_API_KEY` is unset, Chat and Extract users are prompted for a per-session key when they need remote execution. ## Persona Vectors The Analysis and Probing tabs read persona vectors from either a Hugging Face dataset (pushed by `persona-vectors/main.py push` or the `extraction_*.sh` scripts) or from local artifacts. The Extract tab writes local artifacts to: ``` artifacts/ ├── activations//// # also: persona-vectors/... │ ├── manifest.json │ └── .safetensors └── chats/// └── .json ``` `` is the model name with `/` replaced by `__` (e.g. `google__gemma-2-9b-it`). The manifest stores persona names, tensor shape metadata, and sample ids. Chat exports still store `dataset_source` in the JSON payload.