---
language: en
license: apache-2.0
library_name: transformers
tags:
- scibert
- concept-annotation
- nlp
- sequence-classification
metrics:
- accuracy
pipeline_tag: text-classification
---
|
|
|
|
|
# SciBERT Concept Annotation |
|
|
|
|
|
This model is a fine-tuned version of SciBERT for **Concept Annotation**. It takes a document text and a specific concept/term as a sentence pair and classifies the relationship between them (sequence-pair classification).
|
|
|
|
|
## Model Description |
|
|
- **Model type:** SciBERT (BERT-based) |
|
|
- **Language(s):** English |
|
|
- **License:** Apache 2.0 |
|
|
- **Fine-tuned from model:** `allenai/scibert_scivocab_uncased` |
|
|
|
|
|
## Usage |
|
|
|
|
|
You can use this model directly with a short custom inference script. Note that while the model weights are hosted in this repository, the tokenizer should be loaded from `allenai/scibert_scivocab_uncased`.
|
|
|
|
|
### Example Code |
|
|
|
|
|
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the fine-tuned model and the SciBERT tokenizer
model_id = "linh101201/scibert-concept-annotation"
tokenizer_id = "allenai/scibert_scivocab_uncased"

device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2).to(device)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)

# Example inputs: the document text and the concept to annotate
text = "Large Language Model in Law Documents Hub"
concept = "natural language processing"

# Encode the (text, concept) pair as a single sequence-pair input
inputs = tokenizer(text, concept, return_tensors="pt", truncation=True).to(device)

with torch.no_grad():
    logits = model(**inputs).logits

# Apply softmax to convert logits into class probabilities
probs = torch.nn.functional.softmax(logits, dim=-1)

print(f"Logits: {logits}")
print(f"Probabilities: {probs}")
```