---
license: llama3.1
datasets:
- biodatlab/ec-raft-dataset
language:
- en
pipeline_tag: text-generation
---

# EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria

## Model Description

**EC-RAFT** is a **LLaMA-3.1-8B-Instruct** model fine-tuned with Retrieval-Augmented Fine-Tuning (RAFT). It is designed to automatically generate **structured, high-quality clinical trial eligibility criteria (EC)** directly from trial titles and descriptions.

EC-RAFT integrates **domain-specific retrieval** with **synthesized intermediate reasoning** steps, enabling it to produce **clinically relevant** and **contextually appropriate** EC sets.

## Fine-tuning Details

- **Original Model:** LLaMA-3.1-8B-Instruct
- **Datasets used for fine-tuning:**
  - ClinicalTrials.gov (267,347 trials, 2000–2024): [biodatlab/ec-raft-dataset](https://huggingface.co/datasets/biodatlab/ec-raft-dataset)
  - Retrieval corpus constructed using the **SciNCL** model
  - Intermediate reasoning steps **R** generated using **Gemini-1.5-flash-002**
- **Fine-tuning method:**
  - **Retrieval-Augmented Fine-Tuning (RAFT)**
  - **Low-Rank Adaptation (LoRA)**

## Model Performance

Evaluated on a held-out ClinicalTrials.gov test split:

| Metric | Score |
|-----------------------------------|---------|
| **BERTScore** (semantic similarity) | **86.23** |
| **Precision** (LLM-guided evaluation) | **78.84%** |
| **Recall** (LLM-guided evaluation) | **75.89%** |
| **Mean LLM-as-a-Judge Score** (0–3) | **1.7150** |
| **Mean Pair-BERTScore** | **67.76** |

- **Outperforms zero-shot LLaMA-3.1 and Gemini-1.5-flash baselines**
- **Outperforms fine-tuned LLaMA and Meditron baselines**
- **Clinically validated:** LLM-as-a-Judge scores are highly correlated with human physician evaluation

## Intended Use

- Assist **researchers**, **trial designers**, and **sponsors** in drafting clinical trial eligibility criteria.
- **Automate** EC generation to reduce manual effort and improve consistency.
- Support **clinical trial design** transparency and quality.
- Enable integration with **trial registry platforms**, **clinical trial matching systems**, and **EC recommendation tools**.

## Limitations

- Requires **human validation** of generated EC before clinical use.
- Trained on **public ClinicalTrials.gov data**, so it may not generalize well to:
  - Rare or novel diseases
  - Specialized or non-standard trial designs
  - Non-public trial data
- Optimized for **English-language clinical trials**.
- As with any LLM-based system, risks include hallucination, subtle errors, and domain shifts.
- Evaluation metrics (BERTScore, LLM-as-a-Judge) are proxies, not full substitutes for domain expert review.

## Acknowledgments

This model was developed with support from:

- **RAVIS Technology** for feedback and collaboration.
- **Faculty of Medicine Ramathibodi Hospital**
- **NSTDA Supercomputer Center (ThaiSC), Project \#pv814001**

We also acknowledge the contributions of the broader open-source community whose tools and prior works on **RAFT**, **SciNCL**, **LoRA**, **LLaMA-3**, and **biomedical NLP** made this project possible.
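
## Example: Assembling a Retrieval-Augmented Input

The card describes EC-RAFT as combining a trial's title and description with retrieved related trials and intermediate reasoning, but it does not specify the prompt template. The following is a minimal sketch of how such an input could be assembled. Everything in it is an assumption made for illustration: the `RetrievedTrial` structure, the `build_raft_prompt` helper, the section labels, the `max_docs` cutoff, and the placeholder trial IDs. It is not the template used to train the model; check the dataset linked in the card for the actual format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetrievedTrial:
    """A related trial returned by a retriever (hypothetical structure)."""
    trial_id: str
    title: str
    eligibility_criteria: str


def build_raft_prompt(
    title: str,
    description: str,
    retrieved: List[RetrievedTrial],
    max_docs: int = 3,
) -> str:
    """Assemble an illustrative RAFT-style prompt.

    The section labels and their ordering are assumptions, not the
    template EC-RAFT was trained on.
    """
    # Keep only the top-ranked related trials to bound the prompt length.
    context_blocks = []
    for i, trial in enumerate(retrieved[:max_docs], start=1):
        context_blocks.append(
            f"[Related trial {i}: {trial.trial_id}]\n"
            f"Title: {trial.title}\n"
            f"Eligibility criteria:\n{trial.eligibility_criteria.strip()}"
        )
    context = (
        "\n\n".join(context_blocks)
        if context_blocks
        else "(no related trials retrieved)"
    )

    # Retrieved context first, then the target trial, then the instruction
    # asking for reasoning followed by the criteria.
    return (
        "Related trials:\n"
        f"{context}\n\n"
        "Target trial:\n"
        f"Title: {title.strip()}\n"
        f"Description: {description.strip()}\n\n"
        "First reason step by step about the target population, "
        "then write the inclusion and exclusion criteria."
    )


if __name__ == "__main__":
    # Placeholder data; the IDs and texts are made up for the example.
    docs = [
        RetrievedTrial("EXAMPLE-001", "Example trial A", "Inclusion: adults"),
        RetrievedTrial("EXAMPLE-002", "Example trial B", "Exclusion: pregnancy"),
    ]
    print(build_raft_prompt("Example target trial", "A short description.", docs))
```

The resulting string would then be passed to the model as the user message. Any generated criteria still require human validation, as noted under Limitations.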