This is a fine-tuned version of SimCSE: Simple Contrastive Learning of Sentence Embeddings, trained unsupervised on 570K stroke-related sentences from stroke books, Quora medical questions, Quora stroke questions, and human annotations.

Extract sentence representation

import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("demdecuong/stroke_simcse")
model = AutoModel.from_pretrained("demdecuong/stroke_simcse")

text = "What are disease related to red stroke's causes?"
inputs = tokenizer(text, return_tensors='pt')
# Inference only, so skip gradient tracking
with torch.no_grad():
    # Index 1 of the model output is the pooled output, shape (1, hidden_size)
    outputs = model(**inputs)[1]

Build embeddings for a database

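import torch

database = [
    'What is the daily checklist for stroke returning home',
    'What are some tips for stroke adapt new life',
    'What should I consider when using nursing-home care'
]

# One row per database entry; the hidden size is read from the model config
embedding = torch.zeros((len(database), model.config.hidden_size))

# Inference only, so skip gradient tracking
with torch.no_grad():
    for i, sentence in enumerate(database):
        inputs = tokenizer(sentence, return_tensors="pt")
        outputs = model(**inputs)[1]  # pooled output, shape (1, hidden_size)
        embedding[i] = outputs.squeeze(0)

print(embedding.shape)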
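
Once the database embeddings are built, a query can be matched against them by cosine similarity. The sketch below is one plausible way to do that and is not taken from the authors' code: the helper name `top_k_matches` is hypothetical, and random tensors stand in for real model outputs so the block runs without downloading the model.

```python
import torch
import torch.nn.functional as F


def top_k_matches(query_emb, db_emb, k=1):
    """Return (scores, indices) of the k database rows most similar to the query.

    query_emb: tensor of shape (hidden,) or (1, hidden)
    db_emb:    tensor of shape (num_entries, hidden)
    """
    query = F.normalize(query_emb.reshape(1, -1), dim=-1)
    db = F.normalize(db_emb, dim=-1)
    # The dot product of unit-length vectors equals their cosine similarity.
    scores = (query @ db.T).squeeze(0)
    return torch.topk(scores, k=min(k, db.shape[0]))


# Stand-in data (assumption): three fake database embeddings and a query
# that is a slightly perturbed copy of row 1.
torch.manual_seed(0)
db_emb = torch.randn(3, 768)
query_emb = db_emb[1] + 0.01 * torch.randn(768)

scores, indices = top_k_matches(query_emb, db_emb, k=1)
best_index = indices[0].item()
print(best_index)
```

With the real model, `query_emb` would be `model(**inputs)[1]` for the query text and `db_emb` would be the `embedding` matrix, so `database[best_index]` is the closest stored question.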

Result

Evaluated on our PoC test set, which contains human-generated pairs of matching stroke-related questions.

Model                  Top-1 Accuracy
SimCSE (supervised)    75.83
SimCSE (ours)          76.66
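
For reference, top-1 accuracy on question pairs can be computed by checking whether each query's nearest candidate is its labeled match. This is a minimal sketch under assumptions, not the evaluation script behind the numbers above: the function name `top1_accuracy`, the use of cosine similarity, the percentage scale, and the toy tensors are all illustrative.

```python
import torch
import torch.nn.functional as F


def top1_accuracy(query_emb, candidate_emb, gold_indices):
    """Percentage of queries whose most similar candidate is the gold match.

    query_emb:     tensor of shape (num_queries, hidden)
    candidate_emb: tensor of shape (num_candidates, hidden)
    gold_indices:  long tensor of shape (num_queries,)
    """
    q = F.normalize(query_emb, dim=-1)
    c = F.normalize(candidate_emb, dim=-1)
    # Row i holds the cosine similarity of query i to every candidate.
    predicted = (q @ c.T).argmax(dim=-1)
    return 100.0 * (predicted == gold_indices).float().mean().item()


# Toy data (assumption): four candidates on separate axes; the third query
# is deliberately closer to candidate 3 than to its gold candidate 2.
candidate_emb = torch.eye(4)
query_emb = torch.tensor([
    [0.9, 0.1, 0.0, 0.0],
    [0.1, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.8],
    [0.0, 0.0, 0.1, 0.9],
])
gold_indices = torch.tensor([0, 1, 2, 3])

accuracy = top1_accuracy(query_emb, candidate_emb, gold_indices)
print(accuracy)  # 75.0: three of the four queries retrieve their gold match
```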