Sentence Similarity
sentence-transformers
Safetensors
English
bert
biomedical
medical
healthcare
information-retrieval
semantic-search
text-embeddings-inference
BioForge 3b: Owl
Part of the BioForge Progressive Training Pipeline. NameDropper: OWL Ontology Expansion (biomedical ontology knowledge).
Model Overview
This is Stage 3b in the BioForge progressive training curriculum.
Training Details
- Training Data: OWL ontologies (protein-free)
- Epochs: 5
- Batch Size: 1024
- Architecture: bioformer-8L (BERT-based, 8 layers)
- Embedding Dimension: 384
- Max Sequence Length: 1024 tokens
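The embedding dimension listed above allows a quick estimate of how much raw storage a collection of embeddings needs. The sketch below is a back-of-the-envelope calculation, not part of BioForge: the helper name `index_size_bytes` is hypothetical, and the 4 bytes per value assumes float32 storage, which this card does not state.

```python
EMBEDDING_DIM = 384  # from the training details above


def index_size_bytes(n_vectors: int, dim: int = EMBEDDING_DIM, bytes_per_value: int = 4) -> int:
    """Estimate raw storage for n_vectors dense embeddings.

    bytes_per_value=4 assumes float32 storage (an assumption, not stated in the card).
    Index overhead (IDs, graph structures, metadata) is not included.
    """
    if n_vectors < 0:
        raise ValueError("n_vectors must be non-negative")
    return n_vectors * dim * bytes_per_value


# One million terms at float32: 1_000_000 * 384 * 4 bytes
print(index_size_bytes(1_000_000) / 1e6, "MB")  # prints: 1536.0 MB
```

Halving `bytes_per_value` to 2 models float16 storage, if the downstream vector store supports it.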
Usage
from sentence_transformers import SentenceTransformer

# Load this model
model = SentenceTransformer("pankajrajdeo/bioforge-namedropper-owl")

# Encode medical text
sentences = [
    "Type 2 diabetes mellitus",
    "Myocardial infarction",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 384)
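Since the card tags this model for semantic search, the embeddings from the usage snippet can be used to rank candidate terms against a query. The following is a minimal sketch, not the author's retrieval code: `cosine` and `rank_terms` are hypothetical helper names, and cosine similarity is assumed as the scoring function. It is written in plain Python so it accepts any object with an `encode` method that returns one vector per input string.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def rank_terms(model, query: str, terms: List[str]) -> List[Tuple[str, float]]:
    """Rank candidate terms by similarity to the query, best first.

    `model` is anything with an encode(list_of_str) method that returns one
    vector per input, such as a loaded SentenceTransformer.
    """
    vectors = model.encode([query] + terms)
    query_vec, term_vecs = vectors[0], vectors[1:]
    scored = [(term, cosine(query_vec, vec)) for term, vec in zip(terms, term_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the model loaded as in the usage snippet, `rank_terms(model, "heart attack", sentences)` returns `(term, score)` pairs sorted by descending similarity. For large candidate sets, a vectorized similarity computation would be a faster choice than this per-pair loop.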
BioForge Training Pipeline
The complete BioForge pipeline consists of:
- Stage 1a: PubMed Foundation → pankajrajdeo/bioforge-stage1a-pubmed
- Stage 1b: Clinical Trials → pankajrajdeo/bioforge-stage1b-clinical-trials
- Stage 1c: UMLS Ontology → pankajrajdeo/bioforge-stage1c-umls
- Stage 3b: OWL Ontology (NameDropper) → pankajrajdeo/bioforge-namedropper-owl
- Stage 4: Mixed Foundation ⭐ RECOMMENDED → pankajrajdeo/bioforge-stage4-mixed
Recommended Model
For most use cases, we recommend the Stage 4 Mixed model (pankajrajdeo/bioforge-stage4-mixed), which combines all training data for the best overall performance.
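When switching between checkpoints, it can help to keep the stage-to-repository mapping in one place and default to the recommended model. This is an illustrative sketch: the repository IDs are copied from the pipeline list, while the names `BIOFORGE_STAGES`, `DEFAULT_STAGE`, and `resolve_model_id` are hypothetical and not part of any BioForge package.

```python
# Repository IDs copied from the pipeline list above.
BIOFORGE_STAGES = {
    "1a": "pankajrajdeo/bioforge-stage1a-pubmed",
    "1b": "pankajrajdeo/bioforge-stage1b-clinical-trials",
    "1c": "pankajrajdeo/bioforge-stage1c-umls",
    "3b": "pankajrajdeo/bioforge-namedropper-owl",
    "4": "pankajrajdeo/bioforge-stage4-mixed",
}

DEFAULT_STAGE = "4"  # the card recommends Stage 4 for most use cases


def resolve_model_id(stage: str = DEFAULT_STAGE) -> str:
    """Return the Hugging Face repository ID for a BioForge stage label."""
    key = stage.strip().lower()
    if key not in BIOFORGE_STAGES:
        known = ", ".join(sorted(BIOFORGE_STAGES))
        raise ValueError(f"Unknown stage {stage!r}; expected one of: {known}")
    return BIOFORGE_STAGES[key]
```

The returned ID can be passed to the same constructor shown in the usage snippet, for example `SentenceTransformer(resolve_model_id())` for the recommended model or `SentenceTransformer(resolve_model_id("3b"))` for this one.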
Citation
@software{bioforge2025,
  author = {Pankaj Rajdeo},
  title = {BioForge: Progressive Biomedical Sentence Embeddings},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/pankajrajdeo/bioforge-namedropper-owl},
  note = {Stage 3b}
}
License
MIT License
Contact
- Author: Pankaj Rajdeo
- Institution: Cincinnati Children's Hospital Medical Center
- Hugging Face: @pankajrajdeo