Upload folder using huggingface_hub

Files changed (6)

README.md +120 -0
config.json +15 -0
figures/fig1.png +0 -0
figures/fig2.png +0 -0
figures/fig3.png +0 -0
pytorch_model.bin +3 -0

README.md ADDED

	@@ -0,0 +1,120 @@

+---
+license: apache-2.0
+library_name: transformers
+---
+# BioMedLM
+<!-- markdownlint-disable first-line-h1 -->
+<!-- markdownlint-disable html -->
+<!-- markdownlint-disable no-duplicate-header -->
+<div align="center">
+  <img src="figures/fig1.png" width="60%" alt="BioMedLM" />
+</div>
+<hr>
+<div align="center" style="line-height: 1;">
+  <a href="LICENSE" style="margin: 2px;">
+    <img alt="License" src="figures/fig2.png" style="display: inline-block; vertical-align: middle;"/>
+  </a>
+</div>
+## 1. Introduction
+BioMedLM is a language model specialized for biomedical natural language processing. It has been trained on medical literature, clinical notes, and healthcare documentation. The latest version targets three tasks in particular: understanding complex medical terminology, extracting clinical entities, and generating medical summaries.
+<p align="center">
+  <img width="80%" src="figures/fig3.png">
+</p>
+Compared to general-purpose language models, BioMedLM performs better on domain-specific tasks. For instance, on the MedQA benchmark, the model's accuracy has increased from 65% to 82.3% in the current version. We attribute this gain to specialized pre-training on PubMed abstracts and clinical trial data.
+Beyond clinical text understanding, this version offers enhanced capabilities in drug-drug interaction detection, adverse event extraction, and ICD-10 coding assistance.
+## 2. Evaluation Results
+### Comprehensive Benchmark Results
+<div align="center">
+| Category | Benchmark | PubMedBERT | BioBERT | ClinicalBERT | BioMedLM |
+|---|---|---|---|---|---|
+| **Clinical NLP Tasks** | Clinical NER | 0.823 | 0.835 | 0.841 | 0.870 |
+| | Drug Interaction | 0.712 | 0.728 | 0.739 | 0.804 |
+| | Medical QA | 0.654 | 0.668 | 0.682 | 0.750 |
+| **Document Analysis** | Disease Classification | 0.789 | 0.801 | 0.812 | 0.895 |
+| | Clinical Inference | 0.701 | 0.715 | 0.724 | 0.799 |
+| | Symptom Extraction | 0.756 | 0.769 | 0.778 | 0.830 |
+| | Medical Coding | 0.688 | 0.702 | 0.715 | 0.740 |
+| **Report Generation** | Radiology Report | 0.621 | 0.639 | 0.651 | 0.750 |
+| | Patient Summary | 0.598 | 0.615 | 0.628 | 0.719 |
+| | Adverse Event | 0.734 | 0.749 | 0.761 | 0.853 |
+| | Literature Mining | 0.667 | 0.682 | 0.694 | 0.780 |
+| **Research Tasks** | Gene Relation | 0.578 | 0.594 | 0.608 | 0.788 |
+| | Clinical Trial | 0.645 | 0.661 | 0.673 | 0.782 |
+| | Pathology Analysis | 0.712 | 0.728 | 0.741 | 0.865 |
+| | Safety Compliance | 0.801 | 0.815 | 0.827 | 0.844 |
+</div>
+### Overall Performance Summary
+BioMedLM has the highest score of the four models on all 15 benchmarks in the table above. Its margin over the next-best model, ClinicalBERT, is largest on Gene Relation (0.788 vs. 0.608) and smallest on Safety Compliance (0.844 vs. 0.827).
+## 3. Clinical API & Demo Platform
+We offer a clinical demo interface and API for healthcare researchers to interact with BioMedLM. Please visit our secure portal for HIPAA-compliant access.
+## 4. How to Run Locally
+Please refer to our code repository for information about deploying BioMedLM in clinical environments.
+The usage recommendations for BioMedLM differ from previous versions as follows:
+1. Medical domain system prompts are recommended.
+2. Clinical context window has been expanded to 8192 tokens.
+The model architecture of BioMedLM-Base is optimized for clinical document processing, with specialized attention mechanisms for medical entity relationships.
+### System Prompt
+We recommend using the following system prompt for clinical applications:
+```
+You are BioMedLM, a specialized biomedical AI assistant trained on medical literature and clinical documentation.
+Current context: {clinical_setting}
+```
+For example,
+```
+You are BioMedLM, a specialized biomedical AI assistant trained on medical literature and clinical documentation.
+Current context: Outpatient clinical notes review.
+```
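The placeholder substitution above can be sketched in a few lines of Python. This is a minimal illustration, not part of the BioMedLM code: the names `SYSTEM_PROMPT_TEMPLATE` and `build_system_prompt` are hypothetical, and the only thing taken from the README is the prompt text and its `{clinical_setting}` placeholder.

```python
# Hypothetical helper (not part of the BioMedLM repository): fills the
# {clinical_setting} placeholder of the recommended system prompt.
SYSTEM_PROMPT_TEMPLATE = (
    "You are BioMedLM, a specialized biomedical AI assistant trained on "
    "medical literature and clinical documentation.\n"
    "Current context: {clinical_setting}"
)


def build_system_prompt(clinical_setting: str) -> str:
    """Return the system prompt with the clinical setting filled in."""
    setting = clinical_setting.strip()
    if not setting:
        # An empty context line would make the prompt misleading.
        raise ValueError("clinical_setting must not be empty")
    return SYSTEM_PROMPT_TEMPLATE.format(clinical_setting=setting)


prompt = build_system_prompt("Outpatient clinical notes review.")
print(prompt)
```

Rejecting an empty setting is a design choice of this sketch; the README does not say how a missing context should be handled.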
+### Temperature
+We recommend setting the temperature parameter $T_{model}$ to 0.3 for clinical applications. A low temperature reduces sampling randomness and makes outputs more consistent across runs.
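To illustrate what a low temperature does, the sketch below applies the standard temperature-scaled softmax, $p_i = \exp(z_i / T) / \sum_j \exp(z_j / T)$, to a made-up logit vector. It assumes the model samples from a softmax over logits, which is the usual setup but is not confirmed by this README. The function name and the example logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Temperature-scaled softmax over a list of raw scores."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.5]

probs_default = softmax_with_temperature(logits, 1.0)
probs_low = softmax_with_temperature(logits, 0.3)

# At T = 0.3 more probability mass sits on the top-scoring token,
# so sampling picks it more often than at T = 1.0.
print(probs_default)
print(probs_low)
```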
+### Prompts for Clinical Document Processing
+For clinical document analysis, please follow the template:
+```
+document_template = \
+"""[document type]: {doc_type}
+[clinical content begin]
+{clinical_text}
+[clinical content end]
+{analysis_request}"""
+```
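As a rough sketch, the template above can be filled with `str.format`. The helper name `build_document_prompt` and the example strings are invented for illustration; only the template text comes from the README.

```python
# Template text copied from the README.
document_template = (
    "[document type]: {doc_type}\n"
    "[clinical content begin]\n"
    "{clinical_text}\n"
    "[clinical content end]\n"
    "{analysis_request}"
)


def build_document_prompt(doc_type, clinical_text, analysis_request):
    """Hypothetical helper: fill the three placeholders of the template."""
    return document_template.format(
        doc_type=doc_type,
        clinical_text=clinical_text.strip(),
        analysis_request=analysis_request,
    )


# Synthetic example values, not real patient data.
doc_prompt = build_document_prompt(
    doc_type="Discharge summary",
    clinical_text="Patient reports intermittent chest pain for two days.",
    analysis_request="List the symptoms mentioned in the document.",
)
print(doc_prompt)
```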
+For literature search enhanced generation, we recommend the following template:
+```
+search_clinical_template = \
+'''# The following are relevant medical literature findings:
+{pubmed_results}
+In the search results provided, each finding is formatted as [source X begin]...[source X end]. Please cite sources appropriately using [citation:X] format. Ensure medical accuracy and evidence-based responses.
+When responding:
+- Prioritize peer-reviewed sources
+- Note any conflicting findings
+- Include confidence levels where appropriate
+- Flag any potential safety concerns
+# Clinical query:
+{query}'''
+```
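One way to assemble this prompt is sketched below. The README only states that each finding is wrapped as `[source X begin]...[source X end]`. Numbering sources from 1, putting one source per line, and the helper names `format_sources` and `build_search_prompt` are assumptions of this sketch. The findings are placeholder strings, not real search results.

```python
# Template text copied from the README.
search_clinical_template = '''# The following are relevant medical literature findings:
{pubmed_results}
In the search results provided, each finding is formatted as [source X begin]...[source X end]. Please cite sources appropriately using [citation:X] format. Ensure medical accuracy and evidence-based responses.
When responding:
- Prioritize peer-reviewed sources
- Note any conflicting findings
- Include confidence levels where appropriate
- Flag any potential safety concerns
# Clinical query:
{query}'''


def format_sources(findings):
    """Wrap each finding in [source X begin]...[source X end] markers.

    Assumption: X starts at 1 and each source sits on its own line.
    """
    return "\n".join(
        f"[source {i} begin]{text.strip()}[source {i} end]"
        for i, text in enumerate(findings, start=1)
    )


def build_search_prompt(findings, query):
    """Hypothetical helper: fill the literature-search template."""
    return search_clinical_template.format(
        pubmed_results=format_sources(findings),
        query=query,
    )


# Placeholder findings for illustration only.
findings = ["Placeholder abstract A.", "Placeholder abstract B."]
search_prompt = build_search_prompt(findings, "Placeholder clinical question?")
print(search_prompt)
```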
+## 5. License
+This code repository is licensed under the [Apache 2.0 License](LICENSE). The use of BioMedLM models requires compliance with healthcare regulations including HIPAA where applicable.
+## 6. Contact
+If you have any questions, please raise an issue on our GitHub repository or contact us at research@biomedlm.health.

config.json ADDED

	@@ -0,0 +1,15 @@

+{
+    "model_type": "roberta",
+    "architectures": [
+        "RobertaForMaskedLM"
+    ],
+    "hidden_size": 1024,
+    "num_attention_heads": 16,
+    "num_hidden_layers": 24,
+    "vocab_size": 50265,
+    "domain": "biomedical",
+    "pretrained_on": [
+        "pubmed",
+        "clinical_notes"
+    ]
+}
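As a quick sanity check on the configuration above, the sketch below parses the same JSON with the standard library and derives the per-head dimension. It assumes standard multi-head attention, where `hidden_size` is split evenly across `num_attention_heads`. The variable names are illustrative.

```python
import json

# Same content as the config.json added in this commit.
CONFIG_JSON = """
{
    "model_type": "roberta",
    "architectures": ["RobertaForMaskedLM"],
    "hidden_size": 1024,
    "num_attention_heads": 16,
    "num_hidden_layers": 24,
    "vocab_size": 50265,
    "domain": "biomedical",
    "pretrained_on": ["pubmed", "clinical_notes"]
}
"""

config = json.loads(CONFIG_JSON)

# In standard multi-head attention each head gets an equal slice
# of the hidden dimension, so the division must be exact.
head_dim, remainder = divmod(config["hidden_size"], config["num_attention_heads"])
if remainder != 0:
    raise ValueError("hidden_size must be divisible by num_attention_heads")

print(f"{config['num_hidden_layers']} layers, "
      f"{config['num_attention_heads']} heads of size {head_dim}")
```

With these values each attention head has dimension 1024 / 16 = 64.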

figures/fig1.png ADDED

figures/fig2.png ADDED

figures/fig3.png ADDED

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:57dff1e759659abd7546e649e98e8c676e1e30465ac071021e8adec4ed79e0dd
+size 37