Spaces:

implicit-personalization
/

persona-ui

Running

App Files Files Community

persona-ui / README.md

Jac-Zac

Updated to latest persona-vector

e8b71ab 1 day ago

preview code

raw

history blame contribute delete

5.8 kB

metadata

title: persona-ui
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 8501
pinned: false

Persona UI

Streamlit interface for persona vector extraction, analysis, and chat.

Overview

A web app built on top of persona-vectors that provides these tabs:

Chat — interactive conversations with a model using persona-based system prompts (templated or biography)
Analysis — load local or Hub persona vectors and explore cosine similarity, PCA, UMAP, attribute-colored projections, and dendrograms
Probing — sweep and inspect linear probes trained over saved persona vectors
Extract — run persona-vector extraction from HuggingFace persona datasets or a local JSONL dataset directly from the browser

Repository Layout

persona-ui/
├── app.py                   # Main entry point (Streamlit)
├── state.py                 # Session state management (chat history, KV cache)
├── tabs/
│   ├── chat.py / chat_ui.py / chat_shared.py  # Chat tab
│   ├── compare_chat.py      # Side-by-side chat comparison mode
│   ├── analysis_core.py     # Analysis tab entry point
│   ├── analysis/            # Analysis tab internals
│   │   ├── _shared.py / _state.py            # Shared loading + session state
│   │   ├── cosine.py        # Cosine similarity view
│   │   ├── dendrogram.py    # Persona dendrograms
│   │   └── layered.py       # PCA/UMAP/Isomap projections
│   ├── extract.py           # Extraction tab
│   ├── probe.py / probe_ui.py  # Probe diagnostics + upload/tracing controls
│   └── probe_sweep.py       # Probe sweep tab
└── utils/
    ├── analysis_sources.py  # Local + Hub persona-vector store wiring
    ├── chat.py              # Chat generation logic
    ├── chat_export.py       # Export chat logs to JSON
    ├── contrast.py          # Contrastive token log-prob coloring
    ├── datasets.py          # Dataset loader wrapper
    ├── helpers.py           # UI labels and slug helpers
    ├── probe_trace.py       # Chat-token activation tracing
    ├── probe_overlay.py     # Per-token probe-score overlay
    ├── probes.py / probe_files.py  # Probe loading, scoring, artifact paths
    ├── preload.py           # Background startup warmup
    └── runtime.py           # Model caching and NDIF queries

Dataset loading and environment helpers are provided by the sibling persona-data package. Core extraction, analysis, and steering logic comes from persona-vectors.

Installation

uv sync
cp .env.example .env

Local Development

The checked-in dependency config uses published packages. For local package work, uncomment the tool.uv.sources block in pyproject.toml and keep sibling checkouts next to this repo.

Example:

git clone <persona-data-url> ../persona-data
git clone <persona-vectors-url> ../persona-vectors

Expected layout:

parent/
├── persona-ui
├── persona-data
└── persona-vectors

Quickstart

streamlit run app.py

Hugging Face Spaces Deployment

This app can be deployed to Hugging Face Spaces using Docker.

Build Locally

docker build -t persona-ui .
# Pass your local .env if you want the container to use the same configuration
docker run --env-file .env --rm -p 8501:8501 persona-ui

Configuration

Copy .env.example to .env and fill in:

NDIF_API_KEY=...       # Optional shared NDIF key; users can also enter one per session
HF_HOME=...            # Optional: HuggingFace cache directory
HF_TOKEN=...           # Optional: higher Hugging Face Hub rate limits; public datasets do not require it
ARTIFACTS_DIR=...      # Optional: where persona vectors are read from (default: ./artifacts)
PERSONA_VECTORS_HUB_REPO=...  # Optional: default Analysis/Probing Hub dataset repo
PERSONA_UI_STORE_CACHE_ENTRIES=4      # Optional: open local/Hub vector stores kept warm
PERSONA_UI_VECTOR_CACHE_ENTRIES=4     # Optional: loaded analysis datasets kept warm
PERSONA_UI_PREPARED_CACHE_ENTRIES=8   # Optional: prepared projections / k-means groups kept warm
PERSONA_UI_FIGURE_STATE_ENTRIES=2     # Optional: recent rendered Analysis figures kept in-session
PERSONA_UI_PREPARED_STATE_ENTRIES=4   # Optional: recent projection-ready markers kept in-session

The app picks up .env automatically via load_dotenv() on startup, and hosted environments such as Hugging Face Spaces can provide the same values as environment variables. If NDIF_API_KEY is unset, Chat and Extract users are prompted for a per-session key when they need remote execution.

Persona Vectors

The Analysis and Probing tabs read persona vectors from either a Hugging Face dataset (pushed by persona-vectors/main.py push or the extraction_*.sh scripts) or from local artifacts. The Extract tab writes local artifacts to:

artifacts/
├── activations/<model_dir>/<mask_strategy>/<prompt_variant>/   # also: persona-vectors/...
│   ├── manifest.json
│   └── <persona_id>.safetensors
└── chats/<model_dir>/<persona_id>/
    └── <export>.json

<model_dir> is the model name with / replaced by __ (e.g. google__gemma-2-9b-it). The manifest stores persona names, tensor shape metadata, and sample ids. Chat exports still store dataset_source in the JSON payload.