---
language: en
license: mit
library_name: transformers
tags:
- bert
- scientific-text
- embeddings
- fine-tuned
pipeline_tag: feature-extraction
---

# scibert-citation-model

This model is a fine-tuned version of SciBERT specifically optimized for generating embeddings from scientific papers.

## Model Details

- **Base Model**: SciBERT (BERT pretrained on scientific text)
- **Fine-tuning Task**: Scientific paper understanding and embedding generation
- **Language**: English (scientific/academic text)
- **Vocabulary**: SciBERT's scientific vocabulary (scivocab)

## Usage

```python
from transformers import AutoTokenizer, AutoModel
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("your-username/scibert-citation-model")
model = AutoModel.from_pretrained("your-username/scibert-citation-model")
model.eval()  # disable dropout for deterministic embeddings

# Generate embeddings
text = "Your scientific text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state[:, 0, :]  # [CLS] token embedding

print(f"Embeddings shape: {embeddings.shape}")
```
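A common use of these embeddings is comparing papers by cosine similarity. The sketch below shows the comparison step in isolation, using random tensors of SciBERT's hidden size (768) as stand-ins for the `[CLS]` embeddings produced by the snippet above; in practice you would pass the real `embeddings` tensors instead.

```python
import torch
import torch.nn.functional as F

def embedding_similarity(emb_a: torch.Tensor, emb_b: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between two batches of [CLS] embeddings."""
    return F.cosine_similarity(emb_a, emb_b, dim=-1)

# Stand-in embeddings (shape [1, 768], matching BERT-base hidden size).
torch.manual_seed(0)
paper_a = torch.randn(1, 768)
paper_b = torch.randn(1, 768)

sim = embedding_similarity(paper_a, paper_b)
print(f"Cosine similarity: {sim.item():.4f}")
```

Values close to 1.0 indicate semantically similar papers; values near 0 indicate unrelated content.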

## Performance

This checkpoint is a fine-tuned SciBERT; quantitative benchmark results have not been reported here.

## Training Details

- **Training Framework**: PyTorch/Transformers
- **Fine-tuning Objective**: Scientific text understanding
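The card does not specify the exact training objective, so the following is only an illustrative sketch of one plausible setup: a contrastive-style fine-tuning step that pulls embeddings of related texts together using `CosineEmbeddingLoss`. A small linear layer stands in for the SciBERT encoder so the sketch stays self-contained; the batch tensors and pairing scheme are assumptions, not the model's actual training recipe.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in encoder; in a real run this would be the SciBERT model,
# and the inputs would be pooled token representations.
encoder = nn.Linear(32, 16)
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)
loss_fn = nn.CosineEmbeddingLoss()

# Dummy anchor/positive feature batches (8 pairs of 32-dim features).
anchor = torch.randn(8, 32)
positive = torch.randn(8, 32)
target = torch.ones(8)  # 1 = each pair should be embedded similarly

# One optimization step: encode both sides, minimize 1 - cos(a, b).
optimizer.zero_grad()
loss = loss_fn(encoder(anchor), encoder(positive), target)
loss.backward()
optimizer.step()
print(f"step loss: {loss.item():.4f}")
```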

## Citation

If you use this model in your research, please cite appropriately.