---
language: en
license: apache-2.0
library_name: transformers
tags:
- scibert
- concept-annotation
- nlp
- sequence-classification
metrics:
- accuracy
pipeline_tag: text-classification
---

# SciBERT Concept Annotation

This model is a fine-tuned version of SciBERT for **Concept Annotation**. It takes a document text and a concept (term) as a paired input and classifies the relationship between them using sequence classification.

## Model Description

- **Model type:** SciBERT (BERT-based)
- **Language(s):** English
- **License:** Apache 2.0
- **Fine-tuned from model:** `allenai/scibert_scivocab_uncased`

## Usage

You can use this model directly with a custom inference script. The model weights are hosted here, but the model is designed to work with the `allenai/scibert_scivocab_uncased` tokenizer, so load the tokenizer from that repository.

### Example Code

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Use the GPU when one is available; otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and tokenizer
model_id = "linh101201/scibert-concept-annotation"
tokenizer_id = "allenai/scibert_scivocab_uncased"

model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2).to(device)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)

# Example inputs: Document text and the Concept to annotate
text = "Large Language Model in Law Documents Hub"
concept = "natural language processing"

# Encode the document and the concept together as one paired input
inputs = tokenizer(text, concept, return_tensors="pt").to(device)

with torch.no_grad():
    logits = model(**inputs).logits

# Apply softmax to get probabilities
probs = torch.nn.functional.softmax(logits, dim=-1)

print(f"Logits: {logits}")
print(f"Probabilities: {probs}")
```
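
The softmax step in the example turns each pair of logits into two class probabilities. As a minimal sketch of how those probabilities might be used to compare several candidate concepts for one document, the snippet below ranks concepts by the probability of one class. It uses only the standard library, so it runs without downloading the model. The helper names `softmax` and `rank_concepts`, the sample logits, and the choice of `positive_index=1` are all assumptions for illustration: this model card does not state which of the two labels means that the concept applies.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def rank_concepts(concepts, logit_rows, positive_index=1):
    """Pair each concept with one class probability, highest first.

    positive_index is an assumption: check the model's label mapping
    before relying on it.
    """
    scored = [
        (concept, softmax(row)[positive_index])
        for concept, row in zip(concepts, logit_rows)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical logits, standing in for `logits.tolist()` from a forward pass
# over one (text, concept) pair per row. These are not real model outputs.
concepts = ["natural language processing", "organic chemistry"]
logit_rows = [[-1.0, 2.0], [1.5, -0.5]]

for concept, prob in rank_concepts(concepts, logit_rows):
    print(f"{concept}: {prob:.3f}")
```

With real outputs, `logit_rows` would come from calling `.tolist()` on the `logits` tensor, one row per (text, concept) pair.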