ExposureGuard-DCPG-Encoder

Graph attention encoder over the Dynamic Cross-modal PHI Graph (DCPG). Produces a fixed-dim patient embedding and risk score from a multi-modal PHI exposure graph.

Part of the ExposureGuard ecosystem.

Architecture

Two-layer GAT with attention pooling. No external ML framework required: pure Python with no dependencies.

Input graph (nodes + edges)
      │
  Layer 1: GAT  [18 → 32]   (node features × edge weights)
      │
  Layer 2: GAT  [32 → 16]
      │
  Attention pool (weighted by risk_entropy)
      │
  patient_embedding [16]  +  risk_score [0,1]
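
The final pooling step in the diagram could look roughly like the following. This is a minimal sketch, not the encoder's actual code: it assumes the pooling weights are a softmax over each node's risk_entropy and that the pooled vector is L2-normalized afterwards. The function name attention_pool is hypothetical.

```python
import math
from typing import Dict, List


def attention_pool(node_embeddings: Dict[str, List[float]],
                   risk_entropy: Dict[str, float]) -> List[float]:
    """Pool node embeddings into one vector, weighting nodes by risk_entropy.

    Assumption: weights are softmax(risk_entropy); the real encoder may
    compute its pooling weights differently.
    """
    if not node_embeddings:
        raise ValueError("graph has no nodes")
    node_ids = list(node_embeddings)

    # Numerically stable softmax over the per-node risk_entropy values.
    scores = [risk_entropy[n] for n in node_ids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the node embeddings.
    dim = len(node_embeddings[node_ids[0]])
    pooled = [0.0] * dim
    for node_id, w in zip(node_ids, weights):
        for i, value in enumerate(node_embeddings[node_id]):
            pooled[i] += w * value

    # L2-normalize so the result has unit length (left as-is if all zeros).
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled
```

Under this assumption, nodes with higher risk_entropy pull the patient embedding toward their own embedding.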

Node features (dim 18)

Group             Dim  Content
modality one-hot  8    text, asr, image_proxy, waveform_proxy, audio_proxy, image_link, audio_link, unknown
phi_type one-hot  8    NAME_DATE_MRN_FACILITY, NAME_DATE_MRN, FACE_IMAGE, WAVEFORM_HEADER, VOICE, FACE_LINK, VOICE_LINK, unknown
scalars           3    risk_entropy, context_confidence, pseudonym_version_norm
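
As an illustration of how a node could be turned into a feature vector, here is a hedged sketch that concatenates the three groups in the order listed. The helper names and the max_version normalizer are hypothetical; the source does not say how pseudonym_version_norm is derived. Note that the groups as listed total 8 + 8 + 3 = 19 entries rather than the 18 in the heading. The sketch simply follows the table, and the actual encoder may lay out its features differently.

```python
from typing import Dict, List

MODALITIES = ["text", "asr", "image_proxy", "waveform_proxy",
              "audio_proxy", "image_link", "audio_link", "unknown"]
PHI_TYPES = ["NAME_DATE_MRN_FACILITY", "NAME_DATE_MRN", "FACE_IMAGE",
             "WAVEFORM_HEADER", "VOICE", "FACE_LINK", "VOICE_LINK", "unknown"]


def one_hot(value: str, vocab: List[str]) -> List[float]:
    """One-hot encode value; anything outside the vocab maps to 'unknown'."""
    index = vocab.index(value) if value in vocab else vocab.index("unknown")
    return [1.0 if i == index else 0.0 for i in range(len(vocab))]


def node_features(node: Dict, max_version: int = 10) -> List[float]:
    """Concatenate modality one-hot, phi_type one-hot, and scalar features.

    max_version is a hypothetical normalizer for pseudonym_version.
    """
    version_norm = min(node.get("pseudonym_version", 0) / max_version, 1.0)
    return (one_hot(node.get("modality", "unknown"), MODALITIES)
            + one_hot(node.get("phi_type", "unknown"), PHI_TYPES)
            + [float(node.get("risk_entropy", 0.0)),
               float(node.get("context_confidence", 0.0)),
               version_norm])
```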

Edge weights

Inherited directly from DCPGEdge:

w = 0.30·f_temporal + 0.30·f_semantic + 0.25·f_modality + 0.15·f_trust
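
The formula above translates directly into code. The sketch below is only an illustration: the function name edge_weight is hypothetical, and it assumes each factor is a float (most likely in [0, 1], although the source does not state the range).

```python
def edge_weight(f_temporal: float, f_semantic: float,
                f_modality: float, f_trust: float) -> float:
    """Combine the four edge factors using the coefficients from the formula."""
    return (0.30 * f_temporal + 0.30 * f_semantic
            + 0.25 * f_modality + 0.15 * f_trust)
```

Because the coefficients sum to 1.0, factors that all lie in [0, 1] yield a weight in [0, 1] as well.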

Usage

from dcpg_encoder import DCPGEncoder, encode_patient

# graph_summary comes from DCPGAdapter.graph_summary() or CRDTGraph.summary()
result = encode_patient(graph_summary)

result["patient_embedding"]   # List[float], dim=16, L2-normalized
result["node_embeddings"]     # Dict[node_id, List[float]]
result["risk_score"]          # float in [0, 1]

From CRDT federated graph

result = encode_patient(crdt_summary, source="crdt")

Batch

from inference import predict_batch
results = predict_batch([summary_a, summary_b])

Integration with ExposureGuard ecosystem

DCPGAdapter.graph_summary()
        │
DCPGEncoder.encode()         ← this model
        │
    ┌───┴──────────────────┐
    │                      │
patient_embedding     risk_score
    │                      │
PolicyNet           SynthRewrite-T5
(vkatg/exposureguard-policynet)

Input format

{
  "nodes": [
    {
      "node_id": "patient_1::text::NAME_DATE_MRN_FACILITY",
      "modality": "text",
      "phi_type": "NAME_DATE_MRN_FACILITY",
      "risk_entropy": 0.72,
      "context_confidence": 0.9,
      "pseudonym_version": 1
    }
  ],
  "edges": [
    {
      "source": "patient_1::text::NAME_DATE_MRN_FACILITY",
      "target": "patient_1::asr::NAME_DATE_MRN",
      "type": "co_occurrence",
      "weight": 0.71
    }
  ]
}
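
Before passing a summary to the encoder, it can help to check that it has the shape shown above. The following is a hedged sketch of such a pre-flight check. The function name check_graph_summary is hypothetical, and treating every key in the example as required is an assumption; the encoder itself may be more lenient.

```python
from typing import Dict, List

NODE_KEYS = {"node_id", "modality", "phi_type", "risk_entropy",
             "context_confidence", "pseudonym_version"}
EDGE_KEYS = {"source", "target", "type", "weight"}


def check_graph_summary(summary: Dict) -> List[str]:
    """Return a list of readable problems; an empty list means none found."""
    problems = []
    node_ids = set()
    for i, node in enumerate(summary.get("nodes", [])):
        missing = NODE_KEYS - node.keys()
        if missing:
            problems.append(f"node {i} is missing {sorted(missing)}")
        node_ids.add(node.get("node_id"))
    for i, edge in enumerate(summary.get("edges", [])):
        missing = EDGE_KEYS - edge.keys()
        if missing:
            problems.append(f"edge {i} is missing {sorted(missing)}")
        # Report edge endpoints that do not match any listed node.
        for end in ("source", "target"):
            if end in edge and edge[end] not in node_ids:
                problems.append(
                    f"edge {i} {end} {edge[end]!r} is not a listed node")
    return problems
```

Run against the abbreviated example above, this reports one problem: the edge target patient_1::asr::NAME_DATE_MRN does not appear in the nodes list, since the example shows only a single node.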

Output format

{
  "patient_embedding": [0.0, 0.189, 0.0, ...],
  "node_embeddings": {
    "patient_1::text::NAME_DATE_MRN_FACILITY": [0.0, 0.188, ...]
  },
  "risk_score": 0.429,
  "embed_dim": 16
}
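
The properties stated in this document (embedding length equal to embed_dim, an L2-normalized patient embedding, and a risk score in [0, 1]) can be checked on a result with a few lines of code. This is a sketch under those stated properties only; check_result and the tolerance value are hypothetical choices, not part of the package.

```python
import math
from typing import Dict, List


def check_result(result: Dict, tol: float = 1e-6) -> List[str]:
    """Return a list of violated output properties; empty means all hold."""
    problems = []
    embedding = result["patient_embedding"]
    if len(embedding) != result["embed_dim"]:
        problems.append("patient_embedding length differs from embed_dim")
    # The usage notes describe the patient embedding as L2-normalized.
    norm = math.sqrt(sum(v * v for v in embedding))
    if abs(norm - 1.0) > tol:
        problems.append(f"patient_embedding norm is {norm:.6f}, expected 1")
    if not 0.0 <= result["risk_score"] <= 1.0:
        problems.append("risk_score is outside [0, 1]")
    return problems
```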

Citation

@misc{exposureguard2025,
  title  = {ExposureGuard: Cross-Modal PHI Re-identification Risk Scoring via Dynamic Graph Attention},
  author = {[Your Name]},
  year   = {2025},
  url    = {https://huggingface.co/vkatg/exposureguard-dcpg-encoder},
  note   = {US Provisional Patent filed 2025-07-05}
}