
Orsync Scenarist v7.0 — API Documentation

Base URL: http://localhost:7860
Interactive Docs: http://localhost:7860/docs (Swagger UI)
OpenAPI Schema: http://localhost:7860/openapi.json

Read/demo endpoints are available for local development. Destructive admin mutation endpoints are disabled by default unless ENABLE_ADMIN_MUTATIONS=true; when ADMIN_TOKEN is configured, those mutation endpoints also require the X-Admin-Token header or an admin bearer token. This is an MVP guard, not a full production auth/RBAC system.
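
The guard described above can be restated as a small decision function. This is a minimal sketch, not the backend's actual code: the function name is hypothetical, and it assumes the "admin bearer token" is the ADMIN_TOKEN value sent as `Authorization: Bearer <token>`.

```python
from typing import Mapping


def admin_mutation_allowed(env: Mapping[str, str], headers: Mapping[str, str]) -> bool:
    """Hypothetical restatement of the MVP guard for admin mutation endpoints."""
    # Mutations stay disabled unless the flag is explicitly set to "true".
    if env.get("ENABLE_ADMIN_MUTATIONS", "").lower() != "true":
        return False
    token = env.get("ADMIN_TOKEN")
    if not token:
        # No token configured: the enable flag alone opens the endpoints.
        return True
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("x-admin-token") == token:
        return True
    # Assumption: the bearer form carries the same ADMIN_TOKEN value.
    return lowered.get("authorization") == f"Bearer {token}"
```

A client that needs a mutation endpoint would therefore send either `X-Admin-Token` or the bearer header once ADMIN_TOKEN is configured.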

1. Project Workflow Overview
2. Getting Started — First-Run Sequence
3. Core Workflows
4. API Endpoints Reference
5. Data Models
6. Error Handling

1. Project Workflow Overview

Orsync Scenarist is a pharma strategic intelligence platform. The system works in three phases:

┌──────────────────────────────────────────────────────────────────────┐
│                        PHASE 1: DATA SETUP                          │
│                                                                      │
│  POST /api/pipeline/seed                                             │
│  └─→ Loads ~480 HCP (doctor) profiles into Neo4j knowledge graph     │
│                                                                      │
├──────────────────────────────────────────────────────────────────────┤
│                    PHASE 2: CAMPAIGN ANALYSIS                        │
│                                                                      │
│  POST /api/strategy/full-evaluate  { campaign_text: "..." }         │
│  └─→ Vectorizes campaign text into 12 behavioral dimensions          │
│  └─→ Runs GMM clustering on the HCP population                      │
│  └─→ Computes Mahalanobis distance to each cluster centroid          │
│  └─→ Returns heatmap, cluster cards, rejection check                 │
│  └─→ Auto-optimizes campaign if rejected                             │
│                                                                      │
├──────────────────────────────────────────────────────────────────────┤
│                   PHASE 3: SIMULATION & TRAINING                     │
│                                                                      │
│  GET  /api/persona/from-cluster/{id}   → pick a doctor persona      │
│  POST /api/simulation/start            → begin roleplay session      │
│  POST /api/simulation/turn  (repeat)   → conversation loop           │
│  POST /api/mohp/evaluate               → compliance check per turn   │
│  GET  /api/analytics/session/{id}      → post-session review         │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘

12 Canonical Strategy Feature Keys

Campaign vectors and HCP behavioral vectors use the same ordered 12D feature space. The internal order is canonical and must not be alphabetized:

#	FEATURE_KEY
0	`therapeutic_focus`
1	`messaging_tone`
2	`target_seniority`
3	`channel_preference`
4	`kol_alignment`
5	`trial_phase_relevance`
6	`formulary_impact`
7	`patient_population_size`
8	`competitive_positioning`
9	`regulatory_stage`
10	`budget_tier`
11	`urgency_score`
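
Because the order is canonical, client code that converts between named features and positional vectors should index by this list rather than by sorted keys. A minimal sketch follows; `to_vector` and `to_named` are hypothetical helper names, not part of the API.

```python
# Canonical 12D order, copied from the table above. Do not sort this tuple.
FEATURE_KEYS = (
    "therapeutic_focus",
    "messaging_tone",
    "target_seniority",
    "channel_preference",
    "kol_alignment",
    "trial_phase_relevance",
    "formulary_impact",
    "patient_population_size",
    "competitive_positioning",
    "regulatory_stage",
    "budget_tier",
    "urgency_score",
)


def to_vector(features):
    """Turn a {feature_key: value} mapping into a list in canonical order."""
    missing = [key for key in FEATURE_KEYS if key not in features]
    if missing:
        raise KeyError(f"missing feature keys: {missing}")
    return [float(features[key]) for key in FEATURE_KEYS]


def to_named(vector):
    """Attach canonical feature keys to a 12-element vector."""
    if len(vector) != len(FEATURE_KEYS):
        raise ValueError(f"expected {len(FEATURE_KEYS)} values, got {len(vector)}")
    return dict(zip(FEATURE_KEYS, vector))
```

With these helpers, a dict whose keys arrive in alphabetical order still maps back to the same positional vector.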

Strategy Clusters

Strategy cluster_cards are dynamic. The Strategy pipeline fits or reuses cached GMM/PCA artifacts over HCP behavioral features and returns as many clusters as the fitted model selected for the current HCP data, filter, and cache key. The latest live smoke test returned 10 cluster_cards. The older four-archetype language may still apply to the compatibility persona endpoint, but Strategy, War Room, Analytics, and HCP Explorer should not assume exactly four clusters.

Strategy cluster display names are deterministic and unique by cluster_id for the supported dynamic cluster range, using oncology-relevant segment labels such as Evidence-Driven KOLs, Digital Oncology Adopters, and Precision-Medicine Champions. The four legacy archetype names are reserved for the older persona compatibility endpoint and should not be treated as Strategy, War Room, Analytics, or HCP Explorer labels.

Infrastructure, Persistence, and Runtime Guards

Expensive endpoints are rate limited by default. This includes Strategy vectorization/full evaluation/optimization, simulation turns, and MOHP evaluation. A rate-limited request returns HTTP 429 with: { "error": "rate_limited", "retry_after_seconds": 12 }.
Redis-backed session updates use atomic mutation logic with Redis WATCH/transaction retries, with a per-session local lock fallback when Redis transactions are unavailable. This avoids losing rapid simulation turns or analytics appends in normal MVP operation.
REDIS_URL, NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD, CHROMA_HOST, CHROMA_PORT, and CHROMA_URL can point to local Docker services or managed external services. Startup diagnostics report local vs external mode without logging passwords.
The backend Dockerfile still starts local Redis, Neo4j, and Chroma for development/demo use. Production deployments should mount persistent volumes or use managed services: redis_data, neo4j_data, and chroma_data are the expected volume concepts for durable local container deployments.
Destructive admin mutation endpoints remain protected by the MVP guard: ENABLE_ADMIN_MUTATIONS=true is required, and ADMIN_TOKEN is enforced when configured. This is not full auth/RBAC.
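
On the client side, the HTTP 429 body described above can drive a simple retry loop. The sketch below is one possible approach, not prescribed by the API: `send` is a hypothetical caller-supplied function returning `(status_code, body_text)`, and the 1-second default wait is an assumption for bodies that do not match the documented shape.

```python
import json
import time


def call_with_rate_limit_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() and wait retry_after_seconds whenever it returns HTTP 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        # Return on success, on any non-429 error, or when attempts run out.
        if status != 429 or attempt == max_attempts:
            return status, body
        try:
            wait = float(json.loads(body)["retry_after_seconds"])
        except (ValueError, KeyError, TypeError):
            wait = 1.0  # assumed default when the body is not as documented
        sleep(max(wait, 0.0))
```

Passing `sleep` as a parameter keeps the helper testable without real delays.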

2. Getting Started — First-Run Sequence

After starting the server, call these endpoints in order:

Step 1 — Verify server is running:
  GET /healthz
  → { "status": "ok" }

Step 2 — Seed the knowledge graph with doctor data:
  POST /api/pipeline/seed
  → { "status": "seeded", "records_loaded": 480, "records_ingested": 480 }

Step 3 — Verify doctors loaded:
  GET /api/graph/cluster/0/doctors?limit=3
  → { "cluster_id": 0, "doctors": [...], "count": 3 }

Step 4 — Run your first campaign analysis:
  POST /api/strategy/full-evaluate
  { "campaign_text": "Our Phase 3 trial demonstrated 40% improvement in PFS..." }

Step 5 — Check system stats:
  GET /api/stats/embedding
  GET /api/stats/projection
  GET /api/stats/cache
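
The sequence above can be scripted with the standard library. This is a minimal sketch assuming the default local base URL; `build_request` and `run_first_run_sequence` are hypothetical helper names, and the campaign text is the elided example from Step 4.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7860"

# (method, path, JSON body) in the order listed above.
FIRST_RUN_STEPS = [
    ("GET", "/healthz", None),
    ("POST", "/api/pipeline/seed", None),
    ("GET", "/api/graph/cluster/0/doctors?limit=3", None),
    ("POST", "/api/strategy/full-evaluate",
     {"campaign_text": "Our Phase 3 trial demonstrated 40% improvement in PFS..."}),
    ("GET", "/api/stats/embedding", None),
    ("GET", "/api/stats/projection", None),
    ("GET", "/api/stats/cache", None),
]


def build_request(method, path, body=None, base_url=BASE_URL):
    """Build a urllib Request, encoding the body as JSON when one is given."""
    headers = {"Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(base_url + path, data=data, headers=headers, method=method)


def run_first_run_sequence(open_url=urllib.request.urlopen):
    """Send each step in order and collect the decoded JSON responses."""
    results = []
    for method, path, body in FIRST_RUN_STEPS:
        with open_url(build_request(method, path, body)) as response:
            results.append((path, json.loads(response.read().decode("utf-8"))))
    return results
```

`run_first_run_sequence()` needs a running server; `urlopen` raises `urllib.error.HTTPError` on non-2xx responses, so a failed step stops the sequence.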

3. Core Workflows

3.1 Campaign Analysis Workflow

This is the primary use case. A user submits campaign text and gets a full strategic analysis.

Current LLM, Optimization, and Cluster Behavior

Ollama Cloud is wired through backend/app/core/config.py and backend/app/core/llm_client.py.
Without OLLAMA_API_KEY, the strategy response uses heuristic fallback, shown by VectorizationModel=fallback-no-api-key and OptimizedReason=heuristic_fallback_no_api_key when optimization is triggered.
With OLLAMA_API_KEY loaded at backend startup, the latest validation activated the LLM path with VectorizationModel=gemma4:31b-cloud; forced rejected scenarios returned OptimizedReason=llm_rewrite rather than heuristic_fallback_no_api_key.
Full LLM plus vector database optimization requires setting OLLAMA_API_KEY and restarting the backend so the new environment is loaded.
OLLAMA_HOST=0.0.0.0 is treated as a server bind address, not a client target. With an Ollama Cloud key present, the backend resolves that to the cloud endpoint; without a key, it resolves to local loopback for local Ollama development.
If the provider rejects the key/model or is unreachable, Strategy returns an explicit fallback (VectorizationModel=fallback-llm-error, OptimizedReason=heuristic_fallback_llm_error) instead of crashing the page.
The 50-scenario oncology validation called Chroma campaign_memory and clinical_evidence retrieval in the LLM optimization path; three uploaded-context scenarios passed context_id and exercised uploaded context chunks.
In the LLM path, backend/app/services/rag_optimizer.py uses the submitted campaign text, Chroma campaign_memory, Chroma clinical_evidence, and uploaded context chunks when context_id is passed. That assembled context is sent to the LLM.
In fallback mode, optimization does not use the vector database or uploaded document context for the rewrite.
Doctor/HCP features are used to build, select, and rank clusters before optimization. The optimizer receives the chosen cluster and target vector; it does not optimize directly over raw doctor rows.
Strategy clusters are fitted and cached from doctor/HCP data. They do not refit based only on campaign text. Campaign text can change ranking, probabilities, and the best-fit cluster, but not the fitted cluster model itself.
Clusters can change when doctor data changes, a region filter changes the HCP population, the adapter/schema/min_k/data source changes, or the cache/model version changes.
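
A client can tell which path produced a Strategy response from the model and reason values listed above. The following is a minimal sketch; the function name and the returned keys are hypothetical, while the string values are the ones documented in this section.

```python
def describe_strategy_mode(vectorization_model, optimized_reason=None):
    """Classify a Strategy response as LLM path or one of the two fallbacks."""
    if vectorization_model == "fallback-no-api-key":
        mode = "heuristic fallback: OLLAMA_API_KEY was not loaded at startup"
    elif vectorization_model == "fallback-llm-error":
        mode = "heuristic fallback: provider rejected the key/model or was unreachable"
    else:
        mode = f"LLM path: {vectorization_model}"
    return {
        "mode": mode,
        "is_fallback": vectorization_model.startswith("fallback-"),
        # Only an LLM rewrite draws on Chroma retrieval and uploaded context.
        "rewrite_used_retrieval": optimized_reason == "llm_rewrite",
    }
```

For example, `describe_strategy_mode("gemma4:31b-cloud", "llm_rewrite")` reports the LLM path with retrieval, while either fallback value reports that the rewrite ignored the vector database.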

Frontend                                    Backend
───────                                    ───────
  │                                           │
  │  POST /api/strategy/full-evaluate         │
  │  { "campaign_text": "...",                │
  │    "region": "egypt" }                    │
  │ ─────────────────────────────────────────>│
  │                                           │ 1. Load HCP population
  │                                           │ 2. Reuse/fit cached dynamic GMM artifacts
  │                                           │ 3. Vectorize campaign (→12D)
  │                                           │ 4. Project to PCA space
  │                                           │ 5. Compute Mahalanobis distances
  │                                           │ 6. Check rejection threshold
  │<──────────────────────────────────────────│
  │  {                                        │
  │    campaign_vector_12d: [...],            │
  │    cluster_cards: [...],                  │
  │    heatmap: {...},                        │
  │    rejected: false,                       │
  │    optimized: null                        │
  │  }                                        │
  │                                           │
  │  (If rejected=true, show optimized text)  │
  │                                           │
  │  User clicks on Cluster 0 card            │
  │                                           │
  │  GET /api/strategy/cluster/0/doctors      │
  │      ?limit=50&region=egypt               │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { doctors: [...], total_in_cluster: 120 }│
  │                                           │
  │  User clicks on a specific doctor         │
  │                                           │
  │  GET /api/persona/HCP-00-042              │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { codeName, traits, h_index, ... }       │
  │                                           │
  │  User saves successful campaign           │
  │                                           │
  │  POST /api/strategy/memory/store          │
  │  { campaign_text, success_score: 0.85 }   │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { stored: true, campaign_id: "..." }     │

3.2 Simulation Workflow

A rep practices a pitch against an AI doctor persona in a turn-based conversation.

Frontend                                    Backend
───────                                    ───────
  │                                           │
  │  Step 1: Pick a persona                   │
  │  GET /api/persona/from-cluster/0          │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { codeName: "HCP-00-042", traits: [...] }│
  │                                           │
  │  Step 2: Start session                    │
  │  POST /api/simulation/start               │
  │  { "persona_id": "HCP-00-042" }          │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { session_id: "abc-123", offer: {...} }  │
  │                                           │
  │  Step 3: WebRTC handshake                 │
  │  POST /api/simulation/handshake           │
  │  { session_id: "abc-123", answer: {...} } │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { status: "connected" }                  │
  │                                           │
  │  Step 4: Conversation loop (repeat)       │
  │  ┌────────────────────────────────────┐   │
  │  │ POST /api/simulation/turn          │   │
  │  │ { session_id: "abc-123",           │   │
  │  │   input_text: "Doctor, our..." }   │   │
  │  │────────────────────────────────────>│   │
  │  │<───────────────────────────────────│   │
  │  │ { reply_text: "Interesting...",    │   │
  │  │   emotion: "skeptical",            │   │
  │  │   adherence_score: 0.72,           │   │
  │  │   objections: [...] }              │   │
  │  └────────────────────────────────────┘   │
  │                                           │
  │  Step 5: (Optional) Check compliance      │
  │  POST /api/mohp/evaluate                  │
  │  { session_id, input_text, cluster_id:0 } │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { objections: [{severity, guideline}] }  │
  │                                           │
  │  Step 6: Review session                   │
  │  GET /api/analytics/session/abc-123       │
  │ ─────────────────────────────────────────>│
  │<──────────────────────────────────────────│
  │  { durationMs, adherenceScore,            │
  │    emotionTimeline, conversation }        │

3.3 HCP Exploration Workflow

Browse dynamic Strategy HCP segments, doctor-level behavioral fields, and legacy graph research context where needed. The launch HCP Explorer first calls GET /api/strategy/clusters and then GET /api/strategy/cluster/{cluster_id}/doctors?limit=50; graph endpoints remain available for institution/topic research context.

1. GET /api/graph/institutions/summary?limit=20
   → Top 20 institutions by doctor count

2. GET /api/graph/institution/Brigham%20and%20Women's%20Hospital/doctors?limit=50
   → All doctors at that institution

3. GET /api/graph/doctor/HCP-00-042
   → Full doctor profile (h-index, publications, topics, institution)

4. GET /api/graph/overlap?code_name_a=HCP-00-001&code_name_b=HCP-00-042
   → Shared research topics between two doctors

5. GET /api/graph/topic/Biomarker%20Discovery/doctors?limit=50
   → All doctors researching a given topic
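
Institution and topic names appear as path segments, so they need percent-encoding. The sketch below shows one way to build these paths with `urllib.parse`; `graph_path` is a hypothetical helper. Note that `quote` with `safe=""` also encodes the apostrophe as `%27`, which is equivalent to the literal apostrophe shown above.

```python
from urllib.parse import quote, urlencode


def graph_path(*segments, **query):
    """Build an /api/graph/... path with encoded segments and query string."""
    # safe="" makes quote() encode "/" too, so a name cannot add path levels.
    path = "/api/graph/" + "/".join(quote(str(segment), safe="") for segment in segments)
    if query:
        path += "?" + urlencode(query)
    return path
```

Usage: `graph_path("topic", "Biomarker Discovery", "doctors", limit=50)` yields `/api/graph/topic/Biomarker%20Discovery/doctors?limit=50`.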

4. API Endpoints Reference

4.1 Health

`GET /healthz`

Server health check.

Response:

{ "status": "ok" }

4.2 Pipeline — Data Ingestion

`POST /api/pipeline/seed`

Loads the gold-layer doctor dataset (~480 profiles) from data/gold/doctors_unified.json into Neo4j. Call this once after first setup.

Response:

{
  "status": "seeded",
  "records_loaded": 480,
  "records_ingested": 480,
  "source_file": "C:\\code\\...\\data\\gold\\doctors_unified.json"
}

`POST /api/pipeline/ingest`

Queue an arbitrary event for outbox processing.

Request Body: Any JSON object.

Response:

{ "status": "queued", "event_id": "uuid-string" }

`POST /api/pipeline/dispatch`

Manually trigger outbox dispatch (processes one pending event).

Response:

{ "processed": true }
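
Since each call processes one pending event, draining the outbox manually takes a loop. This is a rough sketch under an assumption the documentation does not confirm: that the endpoint returns `"processed": false` when nothing is pending. `dispatch_once` is a hypothetical caller-supplied function that performs the POST and returns the decoded JSON.

```python
def drain_outbox(dispatch_once, max_events=100):
    """Call dispatch_once() until it reports no processed event or the cap is hit."""
    processed = 0
    while processed < max_events:
        # Assumption: a falsy "processed" value means the outbox is empty.
        if not dispatch_once().get("processed"):
            break
        processed += 1
    return processed
```

The `max_events` cap keeps the loop bounded if new events keep arriving while it runs.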

4.3 Strategy — Campaign Engine

`POST /api/strategy/full-evaluate` ⭐ Primary Endpoint

The all-in-one campaign analysis endpoint. This is the main entry point for the campaign workflow.

Request Body:

{
  "campaign_text": "Our Phase 3 trial demonstrated...",
  "rejection_distance_threshold": 3.0,
  "region": "egypt"
}

Field	Type	Required	Description
`campaign_text`	string	Yes	Campaign message text (min 1 char)
`rejection_distance_threshold`	float	No	Override threshold (default: 3.0, must be > 0)
`region`	string | null	No	Filter HCPs: `"egypt"`, `"saudi"`, `"gulf"`, or `null` for all
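
The constraints in this table can be checked before sending a request. A minimal sketch, with a hypothetical function name; it omits optional fields that were not supplied so the server defaults apply.

```python
ALLOWED_REGIONS = {"egypt", "saudi", "gulf"}


def build_full_evaluate_payload(campaign_text, rejection_distance_threshold=None, region=None):
    """Validate inputs against the field table and return the JSON body as a dict."""
    if not isinstance(campaign_text, str) or len(campaign_text) < 1:
        raise ValueError("campaign_text must be a string of at least 1 character")
    payload = {"campaign_text": campaign_text}
    if rejection_distance_threshold is not None:
        if rejection_distance_threshold <= 0:
            raise ValueError("rejection_distance_threshold must be > 0")
        payload["rejection_distance_threshold"] = float(rejection_distance_threshold)
    if region is not None:
        if region not in ALLOWED_REGIONS:
            raise ValueError(f"region must be one of {sorted(ALLOWED_REGIONS)} or None")
        payload["region"] = region
    return payload
```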

Response:

{
  "campaign_vector_12d": [0.82, 0.65, 0.71, 0.30, 0.55, 0.40, 0.78, 0.25, 0.60, 0.45, 0.50, 0.70],
  "campaign_vector_pca": [1.23, -0.67, 0.45],
  "gmm": {
    "k": 10,
    "data_source": "bronze_dynamic_master",
    "feature_names": ["therapeutic_focus", "..."],
    "scaler_center": [0.5, "..."],
    "scaler_scale": [0.2, "..."],
    "n_pca_components": 3,
    "pca_explained_variance_ratio": [0.45, 0.25, 0.15],
    "centroids": [[...], [...], [...], [...]],
    "covariances": [[[...]], [[...]], [[...]], [[...]]]
  },
  "projection": {
    "path": "scaled_pca",
    "fallback_used": false,
    "input_dimensions": 12,
    "output_dimensions": 3
  },
  "heatmap": {
    "ranking": [
      {
        "cluster_id": 1,
        "label": "Digital Oncology Adopters",
        "distance": 1.23,
        "probability": 0.45,
        "top_doctors": []
      }
    ]
  },
  "cluster_cards": [
    {
      "id": "c0",
      "cluster_id": 0,
      "name": "Evidence-Driven KOLs",
      "score": 35.0,
      "distance": 2.1,
      "probability": 0.35
    }
  ],
  "rejection_distance_threshold": 3.0,
  "rejected": false,
  "optimized": null,
  "vectorization_model": "gemma4:31b-cloud"
}

When rejected is true, optimized will contain:

{
  "original_text": "...",
  "optimized_text": "...",
  "target_cluster": 0,
  "improvements": ["Added evidence-based language", "..."]
}
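
A frontend consuming this response mainly needs two decisions: which text to display and which cluster card to highlight. The sketch below is illustrative only; the helper names are hypothetical, and treating the smallest `distance` as the best fit is an assumption based on the Mahalanobis ranking described earlier.

```python
def campaign_text_to_show(original_text, response):
    """Prefer the optimized rewrite when the campaign was rejected."""
    optimized = response.get("optimized")
    if response.get("rejected") and optimized:
        return optimized.get("optimized_text", original_text)
    return original_text


def best_cluster_card(response):
    """Return the card with the smallest distance, or None when there are no cards."""
    cards = response.get("cluster_cards") or []
    # Assumption: a smaller distance means a better fit.
    return min(cards, key=lambda card: card["distance"]) if cards else None
```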

`POST /api/strategy/vectorize`

Convert campaign text into a 12-dimensional feature vector.

Request Body:

{
  "text": "Campaign message text",
  "campaign_id": "optional-id"
}

Response:

{
  "normalized_features": {
    "therapeutic_focus": 0.82,
    "messaging_tone": 0.65,
    "target_seniority": 0.71,
    "channel_preference": 0.45,
    "kol_alignment": 0.50,
    "trial_phase_relevance": 0.70
  },
  "embedding": [...],
  "embedding_model": "onnx-minilm",
  "model": "gemma4:31b-cloud"
}

`POST /api/strategy/heatmap`

Build a heatmap from a pre-computed campaign vector and GMM centroids/covariances.

Request Body:

{
  "campaign_vector": [12 floats],
  "centroids": [[...], [...]],
  "covariances": [[[...]], [[...]]],
  "cluster_top_doctors": { "0": ["HCP-00-001"], "1": ["HCP-01-042"] }
}

Field	Type	Required	Validation
`campaign_vector`	float[]	Yes	Min 1 element
`centroids`	float[][]	Yes	Must match `campaign_vector` dims
`covariances`	float[][][]	Yes	Must match `centroids` count and dims
`cluster_top_doctors`	dict	No	Cluster ID → list of doctor code_names
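
The shape rules in this table can be verified locally before posting. This is a minimal sketch with a hypothetical function name; it assumes each covariance is a full square matrix with the same dimension as the campaign vector, which the nested example above suggests but does not state.

```python
def validate_heatmap_payload(payload):
    """Return a list of shape errors; an empty list means the payload looks valid."""
    vector = payload["campaign_vector"]
    centroids = payload["centroids"]
    covariances = payload["covariances"]
    errors = []
    dims = len(vector)
    if dims < 1:
        errors.append("campaign_vector needs at least 1 element")
    for i, centroid in enumerate(centroids):
        if len(centroid) != dims:
            errors.append(f"centroid {i} has {len(centroid)} dims, expected {dims}")
    if len(covariances) != len(centroids):
        errors.append(f"{len(covariances)} covariances for {len(centroids)} centroids")
    for i, cov in enumerate(covariances):
        # Assumption: covariances are dims x dims matrices.
        if len(cov) != dims or any(len(row) != dims for row in cov):
            errors.append(f"covariance {i} is not {dims}x{dims}")
    return errors
```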

Response:

{
  "ranking": [
    { "cluster_id": 0, "label": "...", "distance": 1.5, "probability": 0.35, "top_doctors": [...] }
  ]
}
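
For readers who want to see how a distance-based ranking like this can be computed, here is a dependency-free sketch of the Mahalanobis distance, d = sqrt((x - mu)^T S^-1 (x - mu)). It is not the backend's implementation: the function names are hypothetical, how `probability` is derived is not covered, and the rejection rule (nearest-cluster distance above the threshold) is an assumption.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def mahalanobis(vector, centroid, covariance):
    """Distance from vector to centroid under the given covariance matrix."""
    diff = [v - c for v, c in zip(vector, centroid)]
    weighted = solve_linear(covariance, diff)  # equals S^-1 @ diff
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


def rank_clusters(vector, centroids, covariances):
    """Return (cluster_id, distance) pairs sorted from nearest to farthest."""
    pairs = [(i, mahalanobis(vector, c, s)) for i, (c, s) in enumerate(zip(centroids, covariances))]
    return sorted(pairs, key=lambda pair: pair[1])


def is_rejected(ranking, threshold=3.0):
    """Assumed rule: reject when even the nearest cluster is beyond the threshold."""
    return ranking[0][1] > threshold
```

Solving the linear system instead of inverting the covariance matrix avoids an explicit inverse and works for any dimension, including the 3-component PCA space shown in the full-evaluate response.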

`POST /api/strategy/evaluate`

Advanced evaluation with custom centroids (BYO GMM results).

Request Body:

{
  "campaign_text": "...",
  "centroids": [[...], [...]],
  "covariances": [[[...]], [[...]]],
  "cluster_top_doctors": null,
  "rejection_distance_threshold": 3.0
}

Response:

{
  "campaign_vector": [12 floats],
  "heatmap": { "ranking": [...] },
  "rejection_distance_threshold": 3.0,
  "rejected": false,
  "optimized": null
}

`POST /api/strategy/optimize`

Rewrite a campaign to better target a specific cluster using LLM.

Request Body:

{
  "campaign_text": "Original campaign text",
  "target_cluster": 0,
  "target_centroid_vector": [12 floats]
}

Response:

{
  "original_text": "...",
  "optimized_text": "...",
  "target_cluster": 0,
  "improvements": ["Added evidence-based language"]
}

`POST /api/strategy/memory/store`

Store a campaign outcome for future RAG-based optimization.

Request Body:

{
  "campaign_text": "...",
  "campaign_id": "optional-id",
  "outcome": "success",
  "success_score": 0.85,
  "is_successful": true,
  "cluster_id": 0,
  "extra_metadata": {}
}

Field	Type	Required	Default
`campaign_text`	string	Yes	—
`campaign_id`	string	No	Auto-generated UUID
`outcome`	string	No	`""`
`success_score`	float	No	`0.0` (range 0.0–1.0)
`is_successful`	bool	No	Derived from score/outcome
`cluster_id`	int	No	`null`
`extra_metadata`	dict	No	`{}`

Response:

{
  "stored": true,
  "campaign_id": "campaign_abc123",
  "embedding_model": "onnx-minilm",
  "is_successful": true
}

`GET /api/strategy/clusters`

Return dynamic Strategy cluster metadata for launch-path UIs such as War Room and HCP selection. These labels are the same oncology-relevant dynamic labels used by Strategy cluster_cards; they are not the old fixed four-archetype persona labels.

Query Parameters:

Param	Type	Default	Values
`region`	string	null	`egypt`, `saudi`, `gulf`

Response:

{
  "k": 10,
  "source": "bronze_dynamic_master",
  "region": null,
  "total_in_db": 480,
  "clusters": [
    {
      "cluster_id": 1,
      "name": "Digital Oncology Adopters",
      "description": "Digital-ready oncology adoption segment",
      "total_in_cluster": 64
    }
  ]
}

`GET /api/strategy/cluster/{cluster_id}/doctors`

Get doctors from the master CSV database assigned to a GMM cluster.

Path Parameters: cluster_id (int)

Query Parameters:

Param	Type	Default	Range
`limit`	int	50	1–200
`region`	string	null	`egypt`, `saudi`, `gulf`

Response:

{
  "cluster_id": 0,
  "total_in_cluster": 120,
  "total_in_db": 480,
  "k": 10,
  "region": "egypt",
  "doctors": [
    {
      "cluster_id": 0,
      "name": "HCP-00-042",
      "region": "egypt",
      "headline": "Professor of Oncology",
      "location": "Cairo, Egypt",
      "company": "Cairo University Hospital",
      "job": "Senior Oncology Physician",
      "school": "Cairo University",
      "school_degree": "MD, PhD",
      "primary_specialty": "Oncology",
      "seniority_level": "Senior",
      "highest_academic_degree": "PhD",
      "total_years_experience": 22,
      "expected_age": 50,
      "age_group": "45-54",
      "current_role_tenure": 8,
      "kol_status": true,
      "digital_presence": true,
      "academic_affiliation": true,
      "workplace_category": "Academic Medical Center",
      "institutional_tier": true,
      "adoption_profile": "Early_Adopter",
      "channel_preference": "High_Digital",
      "patient_volume_proxy": "High",
      "skepticism_level": "Medium",
      "cognitive_processing_style": "Empirical"
    }
  ]
}

4.4 Persona — Doctor Archetypes

`GET /api/persona/from-cluster/{cluster_id}`

Generate a persona profile from a legacy cluster behavioral template. The launch-path War Room now uses /api/strategy/clusters and dynamic Strategy cluster names for HCP selection; this older persona endpoint remains available for compatibility and should not be used to infer Strategy cluster_cards count.

Path Parameters: cluster_id (int, 0–3)

Response:

{
  "codeName": "HCP-00-042",
  "clusterId": 0,
  "clusterLabel": "The Academic Skeptic",
  "traits": [
    { "axis": "Scientific Rigor", "value": 0.923 },
    { "axis": "Innovation Appetite", "value": 0.312 },
    { "axis": "Guideline Adherence", "value": 0.891 },
    { "axis": "Price Sensitivity", "value": 0.187 },
    { "axis": "Risk Tolerance", "value": 0.162 },
    { "axis": "Peer Influence", "value": 0.715 },
    { "axis": "Evidence Threshold", "value": 0.935 },
    { "axis": "Formulary Weight", "value": 0.389 },
    { "axis": "Patient Centricity", "value": 0.512 },
    { "axis": "Digital Readiness", "value": 0.215 },
    { "axis": "KOL Alignment", "value": 0.838 },
    { "axis": "Trial Participation", "value": 0.887 }
  ]
}

`GET /api/persona/{code_name}`

Get a specific doctor's persona profile with academic metrics.

Path Parameters: code_name (string, e.g., "HCP-00-042")

Response:

{
  "codeName": "HCP-00-042",
  "clusterId": 0,
  "clusterLabel": "The Academic Skeptic",
  "traits": [...],
  "h_index": 41,
  "works_count": 129,
  "cited_by_count": 6958,
  "years_active": 16
}

4.5 Simulation — Roleplay Engine

`POST /api/simulation/start`

Start a new simulation session with a doctor persona. War Room should pass the selected dynamic cluster name in campaign_snapshot.cluster_name so Analytics can display the same Strategy segment label later.

Request Body:

{
  "persona_id": "HCP-00-042",
  "campaign_id": "camp-001",
  "campaign_snapshot": {
    "cluster_id": 1,
    "cluster_name": "Digital Oncology Adopters",
    "source": "war_room"
  }
}

Field	Type	Required
`persona_id`	string	Yes (min 1 char)
`campaign_id`	string	No
`campaign_snapshot`	object	No

Response:

{
  "session_id": "uuid-string",
  "offer": { "type": "offer", "sdp": "..." },
  "persona": { ... }
}

`POST /api/simulation/end`

Mark a simulation session as ended so Analytics no longer treats it as active.

Request Body:

{ "session_id": "abc-123" }

Response:

{ "session_id": "abc-123", "ended": true }

`POST /api/simulation/handshake`

Complete WebRTC signaling.

Request Body:

{
  "session_id": "uuid-string",
  "answer": { "type": "answer", "sdp": "..." }
}

Response:

{ "status": "connected" }

`POST /api/simulation/ice-candidate`

Exchange ICE candidates (can be called multiple times).

Request Body:

{
  "session_id": "uuid-string",
  "candidate": { "candidate": "...", "sdpMid": "0", "sdpMLineIndex": 0 }
}

Response:

{ "accepted": true }

`POST /api/simulation/turn`

Send the rep's message and receive the AI persona's response. This is the core conversation loop — call repeatedly.

Request Body:

{
  "session_id": "uuid-string",
  "input_text": "Doctor, our Phase 3 trial showed a 40% improvement in PFS...",
  "input_audio_base64": ""
}

Field	Type	Required
`session_id`	string	Yes
`input_text`	string	No (empty = audio only)
`input_audio_base64`	string	No (base64-encoded audio)

Response:

{
  "reply_text": "Interesting, but I'd like to see the subgroup analysis...",
  "reply_audio_base64": "...",
  "emotion": "skeptical",
  "adherence_score": 0.72,
  "talking_points_delivered": 3,
  "talking_points_total": 5,
  "objections": [...],
  "session_summary": { ... }
}
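
A text-only practice session can be scripted around the start, turn, and end endpoints. The sketch below makes several assumptions: `post(path, payload)` is a hypothetical caller-supplied transport returning decoded JSON, the helper names are illustrative, and it skips the WebRTC handshake and ICE steps, which may still be required in a real deployment.

```python
import base64


def build_turn_payload(session_id, input_text="", audio=None):
    """Build a /api/simulation/turn body from text, raw audio bytes, or both."""
    if not input_text and not audio:
        raise ValueError("provide input_text, audio, or both")
    return {
        "session_id": session_id,
        "input_text": input_text,
        # The API expects base64 text; an empty string means no audio.
        "input_audio_base64": base64.b64encode(audio).decode("ascii") if audio else "",
    }


def run_session(post, persona_id, rep_lines):
    """Start a session, send each rep line as a turn, then end the session."""
    session_id = post("/api/simulation/start", {"persona_id": persona_id})["session_id"]
    replies = []
    try:
        for line in rep_lines:
            turn = post("/api/simulation/turn", build_turn_payload(session_id, line))
            replies.append((turn["reply_text"], turn["adherence_score"]))
    finally:
        # End the session even if a turn fails, so Analytics stops treating it as active.
        post("/api/simulation/end", {"session_id": session_id})
    return session_id, replies
```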

`GET /api/simulation/cache/{cache_key}`

Check the semantic cache for a previously computed response.

Response:

{ "hit": true, "value": "cached response text" }

4.6 MOHP — Objection Detection

`POST /api/mohp/evaluate`

Analyze a rep's statement for possible oncology medical/compliance objections. MOHP is a deterministic rule baseline plus optional Ollama/Chroma clinical evidence enhancement when OLLAMA_API_KEY and the provider are available. It is not a regulatory approval engine and not patient medical advice.

Request Body:

{
  "session_id": "uuid-string",
  "input_text": "This drug is 100% effective with no side effects",
  "cluster_id": 0,
  "persona_id": ""
}

Field	Type	Required	Default
`session_id`	string	Yes	—
`input_text`	string	Yes	—
`cluster_id`	int	No	`0`
`persona_id`	string	No	`""`

Response:

{
  "session_id": "uuid-string",
  "objections": [
    {
      "id": "mohp-a1b2c3d4",
      "timestamp": 1713355200000,
      "objection": "Clarify evidence strength before making survival claims.",
      "guideline": "retrieved clinical evidence",
      "severity": "medium",
      "safer_phrasing": "Use qualified evidence language.",
      "evidence_gap": "Head-to-head survival evidence was not retrieved."
    }
  ],
  "count": 1,
  "mohp_mode": "llm_rag_enhanced",
  "llm_enhancement_used": true,
  "clinical_evidence_retrieval_called": true,
  "clinical_evidence_count": 1,
  "retrieval_empty": false,
  "llm_error": false
}

Severity levels: "low", "medium", "high"

If OLLAMA_API_KEY is missing or the provider fails, the endpoint returns mohp_mode="rule_fallback" and llm_enhancement_used=false. If clinical retrieval is empty, retrieval_empty=true and the response must not invent guideline names.
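
The mode flags and severity levels above lend themselves to a small client-side summary. A minimal sketch, with hypothetical function and key names; the "evidence backed" definition (LLM enhancement used and retrieval not empty) is an interpretation of the flags, not a documented field.

```python
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}


def summarize_mohp(response, min_severity="medium"):
    """Summarize an /api/mohp/evaluate response for display."""
    threshold = SEVERITY_ORDER[min_severity]
    flagged = [
        objection for objection in response.get("objections", [])
        if SEVERITY_ORDER.get(objection.get("severity"), 0) >= threshold
    ]
    return {
        "rule_fallback": response.get("mohp_mode") == "rule_fallback",
        # Interpretation: guidance is evidence backed only when the LLM ran
        # and clinical retrieval actually returned something.
        "evidence_backed": bool(response.get("llm_enhancement_used"))
        and not response.get("retrieval_empty", True),
        "flagged": flagged,
    }
```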

Rule baseline databases by legacy cluster profile:

Cluster 0 (Academic Skeptic): NCCN evidence, ICH E9 statistics, FDA endpoint guidance, EBM hierarchy
Cluster 1 (Commercial Adopter): Novel mechanism validation, digital health, rapid adoption
Cluster 2 (Guideline Loyalist): On-label requirements, guideline standards, risk-benefit framework
Cluster 3 (Price-Sensitive): Health economics, formulary positioning, patient access

4.7 Graph — Knowledge Graph

`POST /api/graph/ingest`

Ingest doctor records into the Neo4j knowledge graph.

Request Body:

{
  "records": [
    {
      "code_name": "HCP-00-042",
      "cluster_id": 0,
      "h_index": 41,
      "institution": "Brigham and Women's Hospital",
      "topics": ["Biomarker Discovery"]
    }
  ]
}

Response:

{ "status": "ok", "ingested": 1 }

`GET /api/graph/doctor/{code_name}`

Get a single doctor's full profile from the graph.

Response:

{
  "code_name": "HCP-00-042",
  "source": "seed_synthetic",
  "cluster_id": 0,
  "h_index": 41,
  "works_count": 129,
  "cited_by_count": 6958,
  "i10_index": 78,
  "years_active": 16,
  "institution": "Brigham and Women's Hospital",
  "institution_type": "academic_medical_center",
  "institution_country": "US",
  "topics": ["Biomarker Discovery", "Minimal Residual Disease"]
}

`GET /api/graph/cluster/{cluster_id}/doctors?limit=50`

Get doctors by GMM cluster (ranked by h-index).

Query: limit (int, 1–500, default 50)

Response:

{ "cluster_id": 0, "doctors": [...], "count": 50 }

`GET /api/graph/institution/{institution_name}/doctors?limit=50`

Get doctors affiliated with an institution.

Response:

{ "institution": "Mount Sinai Hospital", "doctors": [...], "count": 12 }

`GET /api/graph/topic/{topic_name}/doctors?limit=50`

Get doctors researching a specific topic.

Response:

{ "topic": "Biomarker Discovery", "doctors": [...], "count": 8 }

`GET /api/graph/institutions/summary?limit=20`

Get institutions ranked by affiliated doctor count.

Query: limit (int, 1–100, default 20)

Response:

{
  "institutions": [
    { "name": "Brigham and Women's Hospital", "doctor_count": 15, "avg_h_index": 38.2 }
  ]
}

`GET /api/graph/overlap?code_name_a=HCP-00-001&code_name_b=HCP-00-042`

Find shared research topics between two doctors.

Query Parameters (required): code_name_a, code_name_b

Response:

{
  "doctor_a": "HCP-00-001",
  "doctor_b": "HCP-00-042",
  "shared_topics": ["Biomarker Discovery"],
  "count": 1
}

4.8 Analytics — Session History

`GET /api/analytics/sessions?limit=50`

List real simulation sessions ordered by recency. Empty stores return an empty array; the API no longer injects fake/demo analytics sessions into the live UI.

Query: limit (int, 1–200, default 50)

Response:

{ "sessions": [{ "session_id": "...", "persona_id": "...", ... }] }

`GET /api/analytics/session/{session_id}`

Full session analytics for post-simulation review.

Response:

{
  "sessionId": "abc-123",
  "personaId": "HCP-00-042",
  "campaignId": "camp-001",
  "clusterId": 0,
  "durationMs": 245000,
  "adherenceScore": 0.72,
  "emotionTimeline": [
    { "timestamp": 1234, "emotion": "neutral" },
    { "timestamp": 5678, "emotion": "skeptical" }
  ],
  "totalPoints": 5,
  "deliveredPoints": 3,
  "objections": [
    { "id": "mohp-...", "severity": "medium", "guideline": "..." }
  ],
  "conversation": [
    { "role": "rep", "text": "Doctor, our Phase 3..." },
    { "role": "persona", "text": "Interesting, but I'd like..." }
  ]
}
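
For a post-session review screen, a few derived numbers are often enough. The sketch below computes them from the response shape shown above; the function name and output keys are hypothetical.

```python
from collections import Counter


def summarize_session(session):
    """Derive review metrics from a /api/analytics/session/{session_id} response."""
    total = session.get("totalPoints") or 0
    delivered = session.get("deliveredPoints") or 0
    timeline = session.get("emotionTimeline", [])
    conversation = session.get("conversation", [])
    return {
        "duration_seconds": session.get("durationMs", 0) / 1000,
        # Guard against division by zero when no talking points were defined.
        "delivery_ratio": delivered / total if total else 0.0,
        "emotion_counts": dict(Counter(entry["emotion"] for entry in timeline)),
        "rep_turns": sum(1 for message in conversation if message.get("role") == "rep"),
    }
```

Applied to the example response, this gives a 245-second session with 3 of 5 talking points delivered.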

4.9 Stats — System Metrics

`GET /api/stats/embedding`

Active embedding model info.

Response:

{ "model_name": "onnx-minilm", "dimension": 384, "backend": "onnx" }

`GET /api/stats/projection`

Projection bridge (384D → 12D) status.

Response:

{ "ready": true, "input_dim": 384, "output_dim": 12 }

`GET /api/stats/cache`

Redis cache key counts.

Response:

{
  "semantic_cache_keys": 42,
  "session_keys": 5,
  "simulation_session_keys": 3
}

`GET /api/stats/dlq`

Dead letter queue depth.

Response:

{ "dlq_depth": 0 }

`GET /api/stats/outbox`

Pending outbox events count.

Response:

{ "pending_outbox_events": 0 }

4.10 Admin — Embedding Management

`GET /admin/embeddings/status`

Current embedding model and available alternatives.

Response:

{
  "model_name": "onnx-minilm",
  "dimension": 384,
  "known_models": {
    "onnx-minilm": { "description": "all-MiniLM-L6-v2 via ONNX (384-dim, local)", "dimension": 384 },
    "ollama:nomic-embed-text": { "description": "Nomic Embed Text via Ollama (768-dim)", "dimension": 768 },
    "pritamdeka/S-PubMedBert-MS-MARCO": { "description": "PubMedBERT for medical search (768-dim)", "dimension": 768 },
    "NeuML/pubmedbert-base-embeddings": { "description": "PubMedBERT base biomedical (768-dim)", "dimension": 768 }
  }
}

`POST /admin/embeddings/swap`

Hot-swap the embedding model at runtime.

Request Body:

{
  "model_name": "pritamdeka/S-PubMedBert-MS-MARCO",
  "reindex": true
}

Response:

{
  "previous_model": "onnx-minilm",
  "new_model": "pritamdeka/S-PubMedBert-MS-MARCO",
  "reindex": { "collections_reindexed": 1, "documents_reindexed": 42 }
}
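A sketch of preparing a swap request from the status response. `build_swap_request` is a hypothetical helper. Requesting a reindex whenever the model changes is a client-side choice based on the general fact that vectors from different embedding models are not comparable; whether the server requires it is not stated here. The `X-Admin-Token` header follows the note at the top of this document and is only needed when `ADMIN_TOKEN` is configured.

```python
from typing import Optional


def build_swap_request(
    status: dict, target: str, admin_token: Optional[str] = None
) -> tuple[dict, dict]:
    """Return (json_body, headers) for POST /admin/embeddings/swap."""
    if target not in status["known_models"]:
        raise ValueError(f"unknown embedding model: {target}")
    # Stored vectors from the old model cannot be compared with new ones,
    # so ask for a reindex whenever the model actually changes.
    body = {"model_name": target, "reindex": target != status["model_name"]}
    headers = {"Content-Type": "application/json"}
    if admin_token:
        headers["X-Admin-Token"] = admin_token
    return body, headers
```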

`POST /admin/embeddings/reindex`

Re-embed all ChromaDB collections with the current model.

Response:

{
  "collections": { "campaign_memory": { "documents_reindexed": 42 } },
  "model": { "name": "onnx-minilm", "dimension": 384 }
}

4.11 Admin — Dead Letter Queue

`GET /admin/dlq?limit=100`

List failed events in the dead letter queue.

Response:

{ "items": [{ "event_id": "...", "event_type": "...", "payload": {...}, "retries": 3 }] }

`POST /admin/dlq/replay`

Replay the oldest DLQ item back into the outbox.

Response:

{ "replayed": true, "item": { "event_id": "...", ... } }
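Because each replay call moves only the oldest item, draining the queue takes a loop. The sketch below keeps the HTTP layer out of the picture: `get_depth` and `replay_one` are hypothetical callables that would wrap `GET /api/stats/dlq` and `POST /admin/dlq/replay`. The response for an empty queue is not documented, so treating a falsy `replayed` as "stop" is an assumption.

```python
from typing import Callable


def drain_dlq(
    get_depth: Callable[[], int],
    replay_one: Callable[[], dict],
    max_items: int = 100,
) -> int:
    """Replay up to max_items DLQ entries, oldest first; return the count."""
    replayed = 0
    # Read the depth once so events that fail again cannot cause an endless loop.
    for _ in range(min(get_depth(), max_items)):
        result = replay_one()
        if not result.get("replayed"):
            break
        replayed += 1
    return replayed
```

Bounding the loop by the initial depth is deliberate: an event that fails again and returns to the DLQ is not replayed a second time in the same run.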

5. Data Models

Doctor Profile (Gold Dataset)

{
  "code_name": "HCP-00-042",
  "source": "seed_synthetic",
  "cluster_id": 0,
  "h_index": 41,
  "works_count": 129,
  "cited_by_count": 6958,
  "i10_index": 78,
  "years_active": 16,
  "institution": "Brigham and Women's Hospital",
  "institution_type": "academic_medical_center",
  "institution_country": "US",
  "topics": ["Biomarker Discovery", "Minimal Residual Disease"]
}
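On the client side, the profile can be mirrored as a typed record. This is a sketch using only the standard library; the field types are inferred from the example values above rather than taken from the server's schema, and `DoctorProfile.from_dict` is a hypothetical helper that ignores any fields it does not know.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DoctorProfile:
    code_name: str
    source: str
    cluster_id: int
    h_index: int
    works_count: int
    cited_by_count: int
    i10_index: int
    years_active: int
    institution: str
    institution_type: str
    institution_country: str
    topics: tuple[str, ...] = ()

    @classmethod
    def from_dict(cls, data: dict) -> "DoctorProfile":
        """Build a profile from a JSON object, dropping unknown keys."""
        known = {f.name for f in fields(cls)}
        kwargs = {k: v for k, v in data.items() if k in known}
        kwargs["topics"] = tuple(kwargs.get("topics", ()))
        return cls(**kwargs)
```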

Campaign Feature Vector (12D)

{
  "therapeutic_focus": 0.82,
  "messaging_tone": 0.65,
  "target_seniority": 0.71,
  "channel_preference": 0.45,
  "kol_alignment": 0.50,
  "trial_phase_relevance": 0.70,
  "formulary_impact": 0.25,
  "patient_population_size": 0.60,
  "competitive_positioning": 0.55,
  "regulatory_stage": 0.78,
  "budget_tier": 0.30,
  "urgency_score": 0.40
}
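Clients that need a plain 12-element list can flatten the object in a fixed key order. In this sketch the order in `FEATURE_KEYS` simply follows the example above; treating it as the canonical order is an assumption, and `to_vector` is a hypothetical helper.

```python
FEATURE_KEYS = (
    "therapeutic_focus",
    "messaging_tone",
    "target_seniority",
    "channel_preference",
    "kol_alignment",
    "trial_phase_relevance",
    "formulary_impact",
    "patient_population_size",
    "competitive_positioning",
    "regulatory_stage",
    "budget_tier",
    "urgency_score",
)


def to_vector(features: dict) -> list[float]:
    """Flatten a feature object into a 12-element list, rejecting key drift."""
    missing = [k for k in FEATURE_KEYS if k not in features]
    extra = [k for k in features if k not in FEATURE_KEYS]
    if missing or extra:
        raise ValueError(f"missing keys: {missing}; unexpected keys: {extra}")
    return [float(features[k]) for k in FEATURE_KEYS]
```

Rejecting missing or unexpected keys early gives a clearer failure than the dimension-mismatch `400` a malformed vector would presumably trigger on the server.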

Heatmap Ranking Entry

{
  "cluster_id": 0,
  "label": "Evidence-Driven KOLs",
  "distance": 1.5,
  "probability": 0.35,
  "top_doctors": ["HCP-00-042", "HCP-00-015"]
}

distance: Mahalanobis distance — lower = better fit
probability: Softmax over distances — higher = better fit
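For a smaller distance to produce a larger probability, the softmax has to be taken over a decreasing function of the distance. The sketch below assumes plain negation with no temperature or other scaling; the server may differ, so it will not necessarily reproduce the figures in the example. `fit_probabilities` is a hypothetical name.

```python
import math


def fit_probabilities(distances: list[float]) -> list[float]:
    """Softmax over negated distances: closer clusters get higher probability."""
    scores = [-d for d in distances]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

The returned values sum to 1, and the cluster with the smallest Mahalanobis distance always receives the largest share.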

Cluster Card

{
  "id": "c0",
  "cluster_id": 0,
  "name": "Evidence-Driven KOLs",
  "score": 35.0,
  "distance": 2.1,
  "probability": 0.35
}

score: probability × 100 (percentage fit)
Strategy cluster card names are dynamic and unique by cluster_id; do not hardcode four fixed Strategy labels in clients.
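The relationship between a heatmap ranking entry and a cluster card can be sketched as follows. The `"c"` prefix on `id` and the `label` to `name` mapping are inferred from the two examples and are assumptions; only `score = probability × 100` is stated above. `to_cluster_card` is a hypothetical helper.

```python
def to_cluster_card(entry: dict) -> dict:
    """Derive a cluster card from a heatmap ranking entry."""
    return {
        "id": f"c{entry['cluster_id']}",  # assumed id convention
        "cluster_id": entry["cluster_id"],
        "name": entry["label"],  # dynamic label, never hardcoded
        "score": round(entry["probability"] * 100, 1),
        "distance": entry["distance"],
        "probability": entry["probability"],
    }
```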

Outbox Event

{
  "event_id": "uuid-string",
  "event_type": "gold.ingest",
  "payload": { ... },
  "retries": 0
}

6. Error Handling

All errors use standard HTTP status codes with JSON bodies.

| Code | Meaning | When |
|------|---------|------|
| `400` | Bad Request | Invalid request body, dimension mismatch, missing required fields |
| `404` | Not Found | Doctor/session/resource not found |
| `409` | Conflict | Duplicate resource (e.g., username already exists) |
| `422` | Unprocessable Entity | Pydantic validation failure |
| `429` | Too Many Requests | Rate limit exceeded for expensive LLM/vectorization endpoints |
| `500` | Internal Server Error | GMM clustering failure, Neo4j connection error, etc. |
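Of these codes, `429` is the one a client would normally retry. The sketch below shows exponential backoff around an arbitrary request function. `call_with_backoff` and its parameters are hypothetical, the delays are illustrative, and no `Retry-After` header is assumed because this document does not mention one.

```python
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, dict]:
    """Call send() until it returns a status other than 429 or attempts run out."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Wait 1x, 2x, 4x ... the base delay before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

Other error codes are returned to the caller unchanged, since retrying a `400`, `404`, `409`, or `422` without changing the request would not help.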

422 Validation Error Shape

{
  "detail": [
    {
      "type": "string_too_short",
      "loc": ["body", "campaign_text"],
      "msg": "String should have at least 1 character",
      "input": "",
      "ctx": { "min_length": 1 }
    }
  ]
}
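The `detail` array can be flattened into short messages for display. This is a small sketch that relies only on the `loc` and `msg` fields shown above; `format_validation_errors` is a hypothetical helper.

```python
def format_validation_errors(body: dict) -> list[str]:
    """Turn a 422 response body into 'location: message' strings."""
    messages = []
    for err in body.get("detail", []):
        # loc is a path such as ["body", "campaign_text"]; items may be ints.
        loc = ".".join(str(part) for part in err.get("loc", []))
        messages.append(f"{loc}: {err.get('msg', '')}")
    return messages
```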

Complete Endpoint Index

| # | Method | Path | Tag |
|---|--------|------|-----|
| 1 | `GET` | `/healthz` | health |
| 2 | `POST` | `/api/pipeline/seed` | pipeline |
| 3 | `POST` | `/api/pipeline/ingest` | pipeline |
| 4 | `POST` | `/api/pipeline/dispatch` | pipeline |
| 5 | `POST` | `/api/strategy/full-evaluate` | strategy |
| 6 | `POST` | `/api/strategy/vectorize` | strategy |
| 7 | `POST` | `/api/strategy/heatmap` | strategy |
| 8 | `POST` | `/api/strategy/evaluate` | strategy |
| 9 | `POST` | `/api/strategy/optimize` | strategy |
| 10 | `POST` | `/api/strategy/memory/store` | strategy |
| 11 | `GET` | `/api/strategy/clusters` | strategy |
| 12 | `GET` | `/api/strategy/cluster/{cluster_id}/doctors` | strategy |
| 13 | `GET` | `/api/persona/from-cluster/{cluster_id}` | persona |
| 14 | `GET` | `/api/persona/{code_name}` | persona |
| 15 | `POST` | `/api/simulation/start` | simulation |
| 16 | `POST` | `/api/simulation/end` | simulation |
| 17 | `POST` | `/api/simulation/handshake` | simulation |
| 18 | `POST` | `/api/simulation/ice-candidate` | simulation |
| 19 | `POST` | `/api/simulation/turn` | simulation |
| 20 | `GET` | `/api/simulation/cache/{cache_key}` | simulation |
| 21 | `POST` | `/api/mohp/evaluate` | mohp |
| 22 | `POST` | `/api/graph/ingest` | graph |
| 23 | `GET` | `/api/graph/doctor/{code_name}` | graph |
| 24 | `GET` | `/api/graph/cluster/{cluster_id}/doctors` | graph |
| 25 | `GET` | `/api/graph/institution/{institution_name}/doctors` | graph |
| 26 | `GET` | `/api/graph/topic/{topic_name}/doctors` | graph |
| 27 | `GET` | `/api/graph/institutions/summary` | graph |
| 28 | `GET` | `/api/graph/overlap` | graph |
| 29 | `GET` | `/api/analytics/sessions` | analytics |
| 30 | `GET` | `/api/analytics/session/{session_id}` | analytics |
| 31 | `GET` | `/api/stats/embedding` | stats |
| 32 | `GET` | `/api/stats/projection` | stats |
| 33 | `GET` | `/api/stats/cache` | stats |
| 34 | `GET` | `/api/stats/dlq` | stats |
| 35 | `GET` | `/api/stats/outbox` | stats |
| 36 | `GET` | `/admin/embeddings/status` | admin |
| 37 | `POST` | `/admin/embeddings/swap` | admin |
| 38 | `POST` | `/admin/embeddings/reindex` | admin |
| 39 | `GET` | `/admin/dlq` | admin |
| 40 | `POST` | `/admin/dlq/replay` | admin |
