AI & ML interests

Document AI, Healthcare AI, Clinical LLMs

Organization Card

Root Cause Analytics

Synthetic medical training data for Australian healthcare AI.

We build PHI-free synthetic document libraries that look and behave like real clinical PDFs, so teams can train OCR, layout-aware extraction, and clinical NLP models without waiting 18 months for ethics approval.

What we ship

  • Synthetic Medical Documents - 5,000 PDFs across 45 document types, modelled on NSW Health practice. Pre-labelled with structured ground truth and pixel-precise bounding boxes for every field. Free 50-document sample available; full library commercially licensable.
  • Custom commissions - bespoke document mixes, additional jurisdictions, or hospital-specific branding for teams with niche training needs.
  • Generator licences - run the synthesis pipeline yourself with source code and seeds for unlimited internal generation.
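To make "structured ground truth and pixel-precise bounding boxes" concrete, here is a minimal Python sketch of how one labelled field might be represented and rescaled for a layout-aware model. The `FieldLabel` schema, the field names, and the page size are illustrative assumptions, not the actual label format of the library. The 0 to 1000 integer grid is the box convention LayoutLM-family models expect.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels, top-left origin


@dataclass(frozen=True)
class FieldLabel:
    """One labelled field on a page (hypothetical schema for illustration)."""
    name: str   # field identifier, e.g. "patient_name"
    text: str   # ground-truth value for the field
    page: int   # 1-based page number
    box: Box    # pixel-space bounding box


def normalize_box(box: Box, page_width: int, page_height: int, scale: int = 1000) -> Box:
    """Rescale a pixel box to the 0..scale integer grid (floor division)."""
    x0, y0, x1, y1 = box
    if not (0 <= x0 <= x1 <= page_width and 0 <= y0 <= y1 <= page_height):
        raise ValueError(f"box {box} lies outside a {page_width}x{page_height} page")
    return (
        scale * x0 // page_width,
        scale * y0 // page_height,
        scale * x1 // page_width,
        scale * y1 // page_height,
    )


# Made-up example: an A4 page rendered at roughly 150 dpi (1240 x 1754 px).
label = FieldLabel("patient_name", "Jane Citizen", 1, (124, 175, 620, 210))
print(normalize_box(label.box, 1240, 1754))  # (100, 99, 500, 119)
```

Integer floor division keeps the output deterministic across platforms; a real pipeline might prefer rounding, or clamping boxes that slightly overrun the page instead of raising.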

Why synthetic

The bottleneck for medical document AI in Australia is training data. Real hospital PDFs are locked behind the Privacy Act. Generic synthetic medical text has no layout, no scans, and no labels, which makes it useless for vision-language models like LayoutLMv3, Donut, or DocFormer. Public datasets like MIMIC are US-centric and increasingly restricted.

We sit in that gap: visually realistic, jurisdiction-specific, fully labelled, zero PHI risk.

Who runs this

Founded by Jack Webb, a Sydney-based AI engineer and faculty advisor at the Australian Institute of Health Executives, with a background in clinical document AI and large-scale synthetic data generation.

Get in touch

  • Commercial licensing, custom commissions, or generator licences: jack.webb@rootcauseanalytics.com.au
  • Response: within 24 hours during AU business hours (Sydney AEST)
  • Academic discount: 50% off with proof of affiliation
