Gene Extraction Model

This model is fine-tuned for gene extraction (token-level recognition of gene mentions in text) using a BERT-CRF architecture.

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load model and tokenizer
model_name = "RaduGabriel/gene-entity-recognition"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Create NER pipeline
ner_pipeline = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple"
)

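# Example usage
text = "The BRCA1 gene is associated with breast cancer."
results = ner_pipeline(text)

# Each result is a dict describing one aggregated entity span
for entity in results:
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))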

Labels

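The labels follow a BIOES-style tagging scheme:

  • O: token outside any gene mention
  • B-GENE: first token of a multi-token gene mention
  • I-GENE: token inside a multi-token gene mention
  • E-GENE: last token of a multi-token gene mention
  • S-GENE: single-token gene mention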
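The generic pipeline aggregation is built around B-/I- prefixes, so it may not merge E-GENE and S-GENE tags into spans the way a scheme-aware decoder would. The following is a minimal sketch of turning a per-token tag sequence into gene spans by hand. The helper name `bioes_to_spans` and the example tokens and tags are illustrative assumptions, not part of this model's code.

```python
from typing import List, Tuple


def bioes_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert a BIOES tag sequence into (start, end_exclusive, type) spans.

    Hypothetical helper for illustration; not shipped with the model.
    """
    spans = []
    start = None  # index where the currently open entity began
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":
            # Single-token entity
            spans.append((i, i + 1, label))
            start = None
        elif prefix == "B":
            # Open a new multi-token entity
            start = i
        elif prefix == "E" and start is not None:
            # Close the open entity
            spans.append((start, i + 1, label))
            start = None
        elif prefix == "O":
            # Outside: discard any entity left open
            start = None
        # "I" keeps the current entity open; malformed sequences are dropped
    return spans


# Made-up example input: one tag per token
tokens = ["The", "BRCA1", "gene", "and", "tumor", "protein", "p53"]
tags = ["O", "S-GENE", "O", "O", "B-GENE", "I-GENE", "E-GENE"]

spans = bioes_to_spans(tags)
entities = [" ".join(tokens[s:e]) for s, e, _ in spans]
print(entities)  # ['BRCA1', 'tumor protein p53']
```

The sketch works on word-level tags. With subword tokenization, the tags would first need to be mapped back to words, for example by keeping the tag of each word's first subword.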

Model Details

  • Architecture: BERT-CRF
  • Base Model: answerdotai/ModernBERT-large
  • Number of Labels: 5
  • CRF Layer: Enabled
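To illustrate what a CRF layer adds on top of per-token scores, here is a minimal Viterbi decoding sketch over the five labels. It is not this model's implementation. The emission scores are made up, and the transition table is a hand-written assumption that only forbids invalid BIOES moves. A trained CRF learns its transition scores instead.

```python
from typing import List

LABELS = ["O", "B-GENE", "I-GENE", "E-GENE", "S-GENE"]


def viterbi(emissions: List[List[float]], transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring label index path.

    emissions[t][j]: score of label j at position t.
    transitions[i][j]: score of moving from label i to label j.
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for arriving at label j
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label
    best = max(range(n_labels), key=lambda j: score[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Assumed constraint table: which label may follow which (BIOES rules)
NEG = -1e4
allowed = {
    "O": {"O", "B-GENE", "S-GENE"},
    "B-GENE": {"I-GENE", "E-GENE"},
    "I-GENE": {"I-GENE", "E-GENE"},
    "E-GENE": {"O", "B-GENE", "S-GENE"},
    "S-GENE": {"O", "B-GENE", "S-GENE"},
}
transitions = [[0.0 if b in allowed[a] else NEG for b in LABELS] for a in LABELS]

# Made-up emission scores for three tokens; per-token argmax would give
# O, B-GENE, O, which is not a valid BIOES sequence
emissions = [
    [2.0, 0.1, 0.0, 0.0, 0.5],
    [0.2, 2.0, 0.0, 0.0, 1.8],
    [2.0, 0.0, 0.1, 0.3, 0.0],
]

decoded = [LABELS[i] for i in viterbi(emissions, transitions)]
print(decoded)  # ['O', 'S-GENE', 'O']
```

Scoring whole sequences rather than tokens independently is what lets the decoder replace the dangling B-GENE with S-GENE here.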