# Gene Extraction Model

This model is fine-tuned for gene mention extraction (named entity recognition) using a BERT-CRF architecture.

## Usage

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load model and tokenizer
model_name = "RaduGabriel/gene-entity-recognition"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Create NER pipeline; "simple" aggregation merges word pieces into entity spans
ner_pipeline = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple",
)

# Example usage
text = "The BRCA1 gene is associated with breast cancer."
results = ner_pipeline(text)

# Each result is a dict with the entity group, confidence score, and character offsets
for entity in results:
    print(entity["word"], entity["entity_group"], entity["score"])
```
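
The pipeline returns a list of dictionaries with `entity_group`, `score`, `word`, `start`, and `end` keys. The sketch below shows one way to turn that output into a compact list of gene mentions. The helper name `summarize_entities`, the `min_score` threshold, and the mock result are hypothetical and are not part of this model's API or its real output. The prefix stripping assumes that some groups may still carry a BIOES prefix such as `S-` after aggregation.

```python
def summarize_entities(text, results, min_score=0.5):
    """Keep confident predictions and recover the exact surface text from offsets."""
    entities = []
    for item in results:
        if item["score"] < min_score:
            continue
        label = item["entity_group"]
        # Assumption: drop a leading BIOES prefix (e.g. "S-GENE" -> "GENE") if present
        if len(label) > 2 and label[1] == "-":
            label = label[2:]
        entities.append(
            {
                # Slice the original text so casing and spacing are preserved
                "text": text[item["start"]:item["end"]],
                "label": label,
                "score": round(float(item["score"]), 3),
            }
        )
    return entities


# Mock pipeline output for illustration only (the score is made up)
example_text = "The BRCA1 gene is associated with breast cancer."
mock_results = [
    {"entity_group": "S-GENE", "score": 0.98, "word": "brca1", "start": 4, "end": 9},
]
print(summarize_entities(example_text, mock_results))
```

Slicing `text` with the character offsets, rather than using `word`, avoids tokenizer normalization changing how the mention is displayed.
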
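
## Labels

The model predicts one of five labels per token, following the BIOES tagging scheme:

- O: token outside any gene mention
- B-GENE: first token of a multi-token gene mention
- I-GENE: token inside a multi-token gene mention
- E-GENE: last token of a multi-token gene mention
- S-GENE: single-token gene mention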
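
To make the label scheme concrete, here is a minimal sketch of decoding a sequence of these labels into token spans. The function name `bioes_to_spans` and the example tokens and labels are hypothetical, and ill-formed sequences (for example an `E-GENE` with no preceding `B-GENE`) are simply dropped, which is one possible policy among several.

```python
def bioes_to_spans(labels):
    """Convert BIOES labels into (start, end, type) token spans; end is exclusive."""
    spans = []
    start = None  # index of the currently open B- label, if any
    for i, label in enumerate(labels):
        prefix, _, entity_type = label.partition("-")
        if prefix == "S":
            # Single-token mention
            spans.append((i, i + 1, entity_type))
            start = None
        elif prefix == "B":
            start = i
        elif prefix == "E" and start is not None:
            # Close the mention opened by the last B- label
            spans.append((start, i + 1, entity_type))
            start = None
        elif prefix == "I" and start is not None:
            continue
        else:
            # "O" or an ill-formed sequence: discard any open mention
            start = None
    return spans


# Hand-written example labels, not model output
tokens = ["The", "BRCA1", "gene", "and", "tumor", "protein", "p53"]
labels = ["O", "S-GENE", "O", "O", "B-GENE", "I-GENE", "E-GENE"]
for begin, end, entity_type in bioes_to_spans(labels):
    print(entity_type, " ".join(tokens[begin:end]))
```
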

## Model Details

- Architecture: BERT-CRF
- Base Model: answerdotai/ModernBERT-large
- Number of Labels: 5
- CRF Layer: Enabled
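
For readers unfamiliar with CRF layers: a CRF scores whole label sequences by adding learned transition scores between adjacent labels to the per-token emission scores, and the best sequence is typically found with Viterbi decoding. The following is a generic, pure-Python sketch of that idea, not this model's actual implementation. The function name `viterbi_decode` and the toy scores are assumptions made for illustration.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label index path.

    emissions[t][k]: score of label k at position t.
    transitions[i][j]: score of moving from label i to label j.
    """
    num_labels = len(transitions)
    scores = list(emissions[0])  # best score of any path ending in each label
    backpointers = []
    for emission in emissions[1:]:
        next_scores, pointers = [], []
        for j in range(num_labels):
            # Best previous label for reaching label j at this position
            best_prev = max(range(num_labels), key=lambda i: scores[i] + transitions[i][j])
            next_scores.append(scores[best_prev] + transitions[best_prev][j] + emission[j])
            pointers.append(best_prev)
        scores = next_scores
        backpointers.append(pointers)
    # Trace back from the best final label
    path = [max(range(num_labels), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example with two labels: a strong penalty on switching labels
toy_emissions = [[1.0, 0.0], [0.0, 0.6]]
toy_transitions = [[0.0, -5.0], [-5.0, 0.0]]
print(viterbi_decode(toy_emissions, toy_transitions))
```

In the toy example, picking the best label per token independently would switch labels at the second position, while the transition penalty makes the decoder keep the first label throughout. In a BIOES setting, the same mechanism lets a CRF discourage invalid sequences such as `O` followed by `I-GENE`.
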