Jac-Zac
Updated to latest persona-vector
e8b71ab
metadata
title: persona-ui
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 8501
pinned: false

Persona UI

Streamlit interface for persona vector extraction, analysis, and chat.

Overview

A web app built on top of persona-vectors that provides these tabs:

  • Chat: interactive conversations with a model using persona-based system prompts (templated or biography)
  • Analysis: load local or Hub persona vectors and explore cosine similarity, PCA, UMAP, attribute-colored projections, and dendrograms
  • Probing: sweep and inspect linear probes trained over saved persona vectors
  • Extract: run persona-vector extraction from Hugging Face persona datasets or a local JSONL dataset directly from the browser
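
The cosine-similarity view in the Analysis tab compares persona vectors pairwise. The snippet below is a rough, dependency-free sketch of that idea, not the code in tabs/analysis/cosine.py. The function names and the toy two-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors):
    """Pairwise cosine similarities for a dict of persona name -> vector."""
    names = list(vectors)
    return {
        (p, q): cosine_similarity(vectors[p], vectors[q])
        for p in names
        for q in names
    }


# Toy, hypothetical persona vectors (real ones are much higher-dimensional).
toy = {
    "optimist": [1.0, 0.0],
    "pessimist": [-1.0, 0.0],
    "neutral": [0.0, 2.0],
}
matrix = similarity_matrix(toy)
```

Because cosine similarity ignores vector length, personas whose vectors point the same way score close to 1 regardless of magnitude, and opposed personas score close to -1.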

Repository Layout

persona-ui/
├── app.py                   # Main entry point (Streamlit)
├── state.py                 # Session state management (chat history, KV cache)
├── tabs/
│   ├── chat.py / chat_ui.py / chat_shared.py  # Chat tab
│   ├── compare_chat.py      # Side-by-side chat comparison mode
│   ├── analysis_core.py     # Analysis tab entry point
│   ├── analysis/            # Analysis tab internals
│   │   ├── _shared.py / _state.py            # Shared loading + session state
│   │   ├── cosine.py        # Cosine similarity view
│   │   ├── dendrogram.py    # Persona dendrograms
│   │   └── layered.py       # PCA/UMAP/Isomap projections
│   ├── extract.py           # Extraction tab
│   ├── probe.py / probe_ui.py  # Probe diagnostics + upload/tracing controls
│   └── probe_sweep.py       # Probe sweep tab
└── utils/
    ├── analysis_sources.py  # Local + Hub persona-vector store wiring
    ├── chat.py              # Chat generation logic
    ├── chat_export.py       # Export chat logs to JSON
    ├── contrast.py          # Contrastive token log-prob coloring
    ├── datasets.py          # Dataset loader wrapper
    ├── helpers.py           # UI labels and slug helpers
    ├── probe_trace.py       # Chat-token activation tracing
    ├── probe_overlay.py     # Per-token probe-score overlay
    ├── probes.py / probe_files.py  # Probe loading, scoring, artifact paths
    ├── preload.py           # Background startup warmup
    └── runtime.py           # Model caching and NDIF queries

Dataset loading and environment helpers are provided by the sibling persona-data package. Core extraction, analysis, and steering logic comes from persona-vectors.

Installation

uv sync
cp .env.example .env

Local Development

The checked-in dependency config uses published packages. For local package work, uncomment the tool.uv.sources block in pyproject.toml and keep sibling checkouts next to this repo.

Example:

git clone <persona-data-url> ../persona-data
git clone <persona-vectors-url> ../persona-vectors

Expected layout:

parent/
├── persona-ui
├── persona-data
└── persona-vectors

Quickstart

streamlit run app.py

Hugging Face Spaces Deployment

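This app can be deployed to Hugging Face Spaces using Docker. The Space metadata sets sdk: docker and app_port: 8501, the same port the local container publishes.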

Build Locally

docker build -t persona-ui .
# Pass your local .env if you want the container to use the same configuration
docker run --env-file .env --rm -p 8501:8501 persona-ui

Configuration

Copy .env.example to .env and fill in:

NDIF_API_KEY=...       # Optional shared NDIF key; users can also enter one per session
HF_HOME=...            # Optional: Hugging Face cache directory
HF_TOKEN=...           # Optional: higher Hugging Face Hub rate limits; public datasets do not require it
ARTIFACTS_DIR=...      # Optional: where persona vectors are read from (default: ./artifacts)
PERSONA_VECTORS_HUB_REPO=...  # Optional: default Analysis/Probing Hub dataset repo
PERSONA_UI_STORE_CACHE_ENTRIES=4      # Optional: open local/Hub vector stores kept warm
PERSONA_UI_VECTOR_CACHE_ENTRIES=4     # Optional: loaded analysis datasets kept warm
PERSONA_UI_PREPARED_CACHE_ENTRIES=8   # Optional: prepared projections / k-means groups kept warm
PERSONA_UI_FIGURE_STATE_ENTRIES=2     # Optional: recent rendered Analysis figures kept in-session
PERSONA_UI_PREPARED_STATE_ENTRIES=4   # Optional: recent projection-ready markers kept in-session

The app picks up .env automatically via load_dotenv() on startup, and hosted environments such as Hugging Face Spaces can provide the same values as environment variables. If NDIF_API_KEY is unset, Chat and Extract users are prompted for a per-session key when they need remote execution.
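
As a minimal sketch of how such settings could be read once the environment is populated (for example after load_dotenv() has run), the helper below parses the integer cache sizes and falls back when a variable is unset. The helper names are hypothetical and not part of the app. The numeric fallbacks are simply the example values from the block above and are assumed, not confirmed, to be the app's real defaults. Only the ARTIFACTS_DIR default of ./artifacts is stated in this README.

```python
import os

# Example values from the configuration block above, treated as ASSUMED defaults.
ASSUMED_INT_DEFAULTS = {
    "PERSONA_UI_STORE_CACHE_ENTRIES": 4,
    "PERSONA_UI_VECTOR_CACHE_ENTRIES": 4,
    "PERSONA_UI_PREPARED_CACHE_ENTRIES": 8,
    "PERSONA_UI_FIGURE_STATE_ENTRIES": 2,
    "PERSONA_UI_PREPARED_STATE_ENTRIES": 4,
}


def read_int_setting(name, default, env):
    """Return env[name] parsed as an int, or default when unset or blank."""
    raw = (env.get(name) or "").strip()
    if not raw:
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


def read_settings(env=None):
    """Collect settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    settings = {
        name: read_int_setting(name, default, env)
        for name, default in ASSUMED_INT_DEFAULTS.items()
    }
    # Documented default: ./artifacts
    settings["ARTIFACTS_DIR"] = env.get("ARTIFACTS_DIR") or "./artifacts"
    # None signals that the user must supply a per-session key.
    settings["NDIF_API_KEY"] = env.get("NDIF_API_KEY") or None
    return settings


# Hypothetical override of a single value; everything else falls back.
example = read_settings({"PERSONA_UI_PREPARED_CACHE_ENTRIES": "16"})
```

Passing a plain dict instead of os.environ keeps the parsing logic easy to check without touching the real environment.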

Persona Vectors

The Analysis and Probing tabs read persona vectors from either a Hugging Face dataset (pushed by persona-vectors/main.py push or the extraction_*.sh scripts) or from local artifacts. The Extract tab writes local artifacts to:

artifacts/
├── activations/<model_dir>/<mask_strategy>/<prompt_variant>/   # also: persona-vectors/...
│   ├── manifest.json
│   └── <persona_id>.safetensors
└── chats/<model_dir>/<persona_id>/
    └── <export>.json

<model_dir> is the model name with / replaced by __ (e.g. google__gemma-2-9b-it). The manifest stores persona names, tensor shape metadata, and sample ids. Chat exports still store dataset_source in the JSON payload.
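
The naming rule and layout above can be sketched in a few lines of Python. The function names are hypothetical and the app's own path helpers may differ. The mask_strategy and prompt_variant values in the usage lines are placeholders, not real option names.

```python
from pathlib import Path


def model_dir_name(model_name):
    """Turn a Hub model name into a directory name by replacing / with __."""
    return model_name.replace("/", "__")


def activation_path(artifacts_dir, model_name, mask_strategy, prompt_variant, persona_id):
    """Path of one persona's tensor file under the activations tree."""
    return (
        Path(artifacts_dir)
        / "activations"
        / model_dir_name(model_name)
        / mask_strategy
        / prompt_variant
        / f"{persona_id}.safetensors"
    )


def chat_export_dir(artifacts_dir, model_name, persona_id):
    """Directory that holds exported chat JSON files for one persona."""
    return Path(artifacts_dir) / "chats" / model_dir_name(model_name) / persona_id


# Placeholder arguments for illustration only.
tensor_file = activation_path(
    "artifacts", "google/gemma-2-9b-it", "some_mask", "some_variant", "persona_001"
)
export_dir = chat_export_dir("artifacts", "google/gemma-2-9b-it", "persona_001")
```

Replacing the slash keeps each model in a single directory level instead of nesting an organization folder above it.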