# Gene Extraction Model

This model is fine-tuned for gene extraction using a BERT-CRF architecture.
## Model Description

This model uses a custom BERT-CRF architecture for token classification, designed for gene entity recognition. It combines a BERT encoder with a Conditional Random Field (CRF) layer, which improves sequence labeling by scoring whole label sequences rather than each token independently.
| ## Usage | |
| ```python | |
| from transformers import AutoTokenizer, AutoModelForTokenClassification | |
| from transformers import pipeline | |
| model_name = "RaduGabriel/gene-entity-recognition" | |
| hf_token = None | |
| tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token) | |
| model = AutoModelForTokenClassification.from_pretrained(model_name, token=hf_token) | |
| text = "TIF1gamma, a novel member of the transcriptional intermediary factor 1 family, plays a crucial role in gene regulation." | |
| # Create NER pipeline | |
| ner_pipeline = pipeline( | |
| "ner", | |
| model=model, | |
| tokenizer=tokenizer, | |
| aggregation_strategy="simple" | |
| ) | |
| results = ner_pipeline(text) | |
| print(results) | |
| ``` | |
## Labels

The model uses BIO tagging:

- O
- B-GENE
- I-GENE
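The B-GENE/I-GENE labels mark the beginning and continuation of a gene mention; contiguous runs are grouped into entity spans. A minimal decoding sketch (`bio_to_spans` is a hypothetical helper for illustration, not part of the model's API):

```python
# Convert a BIO-tagged token sequence into gene mention strings.
def bio_to_spans(tokens, tags):
    """Group tokens tagged B-GENE / I-GENE into contiguous gene mentions."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B-GENE":                 # a new gene mention starts here
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-GENE" and current:   # continue the open mention
            current.append(token)
        else:                               # "O" (or a stray I-GENE) closes it
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = ["the", "TIF1", "gamma", "gene"]
tags = ["O", "B-GENE", "I-GENE", "O"]
print(bio_to_spans(tokens, tags))  # ['TIF1 gamma']
```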
## Model Details

- Architecture: BERT-CRF
- Base Model: dmis-lab/biobert-v1.1
- Number of Labels: 3
- CRF Layer: Enabled
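At inference time, a CRF layer picks the highest-scoring label *sequence* via Viterbi decoding over per-token emission scores (from BERT) and learned label-transition scores. A pure-Python sketch over the three labels above; all scores here are made-up demo numbers, not the model's actual weights:

```python
# Minimal Viterbi decode illustrating what the CRF layer does at inference.
LABELS = ["O", "B-GENE", "I-GENE"]

def viterbi(emissions, transitions):
    """emissions: [T][L] per-token label scores; transitions: [L][L] scores."""
    T, L = len(emissions), len(LABELS)
    score = list(emissions[0])   # best path score ending in each label
    back = []                    # backpointers, one row per later step
    for t in range(1, T):
        new_score, ptrs = [], []
        for j in range(L):
            # best previous label i for current label j
            cand = [score[i] + transitions[i][j] for i in range(L)]
            i_best = max(range(L), key=lambda i: cand[i])
            new_score.append(cand[i_best] + emissions[t][j])
            ptrs.append(i_best)
        score, back = new_score, back + [ptrs]
    # trace the best path backwards
    j = max(range(L), key=lambda i: score[i])
    path = [j]
    for ptrs in reversed(back):
        j = ptrs[j]
        path.append(j)
    return [LABELS[j] for j in reversed(path)]

# A transition matrix that penalizes O -> I-GENE (I must follow B or I).
trans = [[0.0, 0.0, -10.0],
         [0.0, 0.0,   1.0],
         [0.0, 0.0,   1.0]]
emis = [[0.1, 2.0, 0.0],   # token 1: looks like B-GENE
        [0.0, 0.2, 1.5],   # token 2: looks like I-GENE
        [2.0, 0.0, 0.1]]   # token 3: looks like O
print(viterbi(emis, trans))  # ['B-GENE', 'I-GENE', 'O']
```

The transition scores are what a per-token softmax classifier lacks: they let the model rule out invalid tag sequences such as an I-GENE that does not follow a B-GENE.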
## Training Details

- Training Data: GNormPlus dataset
- Optimizer: AdamW
- Learning Rate: 2e-05
- Batch Size: 32
- Epochs: 3
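The batch size and epoch count determine the total number of optimization steps. A quick sanity check; the example count below is a placeholder, not the actual size of the GNormPlus training split:

```python
import math

num_examples = 10_000   # placeholder; substitute the real training-set size
batch_size = 32
epochs = 3

steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 313 939
```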