[
{
"id": "progressive",
"label": "Progressive Corporation",
"summary": "Personal-auto insurer that built a multi-decade telematics program (Snapshot) collecting tens of billions of vehicle-miles of driving-behavior data, attached to claim outcomes the company observes directly. The data feeds the live pricing model, giving Progressive a behavioral input set the rest of the industry lacks.",
"bottleneck": "Adverse selection in pricing β€” the moment the insurer commits a price to a risk it has only partially observed (the underwriting decision at the quote screen).",
"compounding_summary": "Snapshot satisfies all four compounding conditions at material strength. Proprietary data origin (Snapshot beacon data is first-party, not bought or scraped). Self-labeling workflow (every policy term produces a claim outcome that re-trains the pricing model). Decreasing marginal cost (17-year-amortized pipeline). Defensible asymmetry (multi-policy-generation head start on integrating behavior data into the live rate-filing pipeline β€” capped 3/4 because State Farm and Allstate are catching up on raw mileage).",
"article_url": "https://www.mile-hi.ai/journal/progressives-risk-selection-flywheel"
},
{
"id": "deere",
"label": "John Deere",
"summary": "Industrial-equipment manufacturer running See & Spray, a per-nozzle computer-vision system at the sprayer boom that discharges herbicide only on weed pixels. Paired with the John Deere Operations Center, every spray pass produces image-and-outcome data joined to the combine yield reading three months later β€” a vision corpus generated by Deere's equipment on Deere customers' fields.",
"bottleneck": "Herbicide cost per acre at the spraying step, joined to yield realization at harvest β€” the single place in row-crop agriculture where the next dollar of operating cost stops producing throughput.",
"compounding_summary": "See & Spray satisfies the four conditions through physical-world data. Proprietary data origin (vision corpus generated by Deere equipment on Deere customers' fields). Self-labeling workflow (combine yield map labels sprayer decisions three months earlier). Decreasing marginal cost β€” capped 3/4 by the diesel + equipment-replacement-cycle floor. Defensible asymmetry (embedded equipment fleet is the moat; competitors must rebuild a comparable install base on the agricultural replacement cycle, measured in years).",
"article_url": "https://www.mile-hi.ai/journal/deeres-physical-world-data-loop"
},
{
"id": "mastercard",
"label": "Mastercard",
"summary": "Card-payments network running Decision Intelligence Pro (May 2024), a generative-AI fraud-scoring transformer trained on every authorization request that has hit the Mastercard rail across decades, paired with every chargeback that subsequently labeled it. Every new transaction generates a row that, three to ninety days later, gets a clean positive or negative label as the chargeback either arrives or does not.",
"bottleneck": "Quality of the authorize-or-decline decision at the moment of swipe β€” the joint shape of the false-positive and false-negative curves, where the network earns interchange on transactions it approves correctly and loses long-run share of issuer wallet on the ones it gets wrong.",
"compounding_summary": "Mastercard satisfies the conditions three-strong, one-moderate. Proprietary data origin (the cross-cardholder, cross-merchant authorization sequence is something only the network observes). Self-labeling workflow (chargebacks are the labels). Decreasing marginal cost β€” 3/4 (gen-AI architecture is an ongoing spend layer on the older stack). Defensible asymmetry β€” candid 2/4: Visa operates a structurally parallel rail with comparable scale, and American Express runs a closed-loop network with arguably better label fidelity. The asymmetry is real against any third party that does not run a global card rail, but thinner against the one peer that matters.",
"article_url": "https://www.mile-hi.ai/journal/mastercards-network-of-labeled-outcomes"
},
{
"id": "mayo",
"label": "Mayo Clinic",
"summary": "Non-profit clinical operation running an AI-ECG program that trains a deep-learning model on decades of standard 10-second 12-lead ECGs, each paired with the diagnoses, procedures, and clinical outcomes that arrived for that same patient in the years after the tracing was recorded. The model surfaces atrial fibrillation and cardiac amyloidosis from sinus-rhythm tracings a cardiologist cannot yet read clinically.",
"bottleneck": "Diagnostic latency for hidden cardiac conditions β€” the time window between when a disease first becomes detectable in routinely collected data and when a human reader has enough clinical signal to act on it.",
"compounding_summary": "AI-ECG satisfies the four conditions at clinical scale. Proprietary data origin (the linkage β€” attaching a specific patient's 1998 ECG to their 2007 stroke and 2014 heart-failure admission, across millions of patients β€” does not exist outside Mayo's record system). Self-labeling workflow (every eventual diagnosis labels the prior ECG tracings for that patient). Decreasing marginal cost β€” capped 3/4 by clinical-validation cadence (every new indication requires its own validation study + regulatory clearance). Defensible asymmetry (longitudinal label structure + embedded clinical reading practice + institutional-trust position β€” none replicable by buying a vendor's model). The model produces a probability score; the cardiologist makes the decision.",
"article_url": "https://www.mile-hi.ai/journal/mayos-outcome-labeled-corpus"
}
]