How to use tuhailong/cross_encoder_roberta-wwm-ext_v2 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="tuhailong/cross_encoder_roberta-wwm-ext_v2")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("tuhailong/cross_encoder_roberta-wwm-ext_v2")
model = AutoModelForSequenceClassification.from_pretrained("tuhailong/cross_encoder_roberta-wwm-ext_v2")
Data
The training data consists of sentence-similarity pairs drawn from e-commerce dialogues, about 500,000 (50w) sentence pairs in total.
Model
The model was created with sentence-transformers. Its structure is a cross-encoder, and the pretrained base model is hfl/chinese-roberta-wwm-ext. The structure is the same as tuhailong/cross_encoder_roberta-wwm-ext_v1; the only difference is that the number of training epochs was reduced from 5 to 1, which gave better performance on my dataset.
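For readers who want a feel for what a one-epoch cross-encoder fine-tune looks like, here is a minimal sketch using the classic `CrossEncoder.fit` interface of sentence-transformers. It is not the author's training script (that lives in the repository linked under Code). The helper names `build_training_rows` and `train_cross_encoder`, the 0/1 label convention, `batch_size=32`, `num_labels=1`, and `max_length=64` at training time are all assumptions made for illustration; only the base model name and the single epoch come from this card.

```python
def build_training_rows(pairs):
    """Clean (sentence_a, sentence_b, label) triples.

    Assumption: label is 1 for similar pairs and 0 for dissimilar ones.
    Triples with an empty sentence are skipped.
    """
    rows = []
    for sent_a, sent_b, label in pairs:
        if not sent_a or not sent_b:
            continue  # skip incomplete pairs
        rows.append((sent_a.strip(), sent_b.strip(), float(label)))
    return rows


def train_cross_encoder(pairs, output_path, epochs=1, batch_size=32):
    """Fine-tune hfl/chinese-roberta-wwm-ext as a cross-encoder (hypothetical helper)."""
    # Imported lazily so build_training_rows works without these packages.
    from torch.utils.data import DataLoader
    from sentence_transformers import InputExample
    from sentence_transformers.cross_encoder import CrossEncoder

    rows = build_training_rows(pairs)
    examples = [InputExample(texts=[a, b], label=y) for a, b, y in rows]
    loader = DataLoader(examples, shuffle=True, batch_size=batch_size)

    # num_labels=1 yields a single similarity score per sentence pair.
    model = CrossEncoder("hfl/chinese-roberta-wwm-ext", num_labels=1, max_length=64)
    model.fit(train_dataloader=loader, epochs=epochs)
    model.save(output_path)  # save explicitly; no evaluator is used here
    return model
```

Calling `train_cross_encoder(pairs, "output/cross_encoder_v2", epochs=1)` would mirror the one-epoch setting described above, where the output directory is a placeholder.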
Usage
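>>> from sentence_transformers.cross_encoder import CrossEncoder
>>> # Hub ID of this model; a local directory containing the saved model also works
>>> model_save_path = "tuhailong/cross_encoder_roberta-wwm-ext_v2"
>>> model = CrossEncoder(model_save_path, device="cuda", max_length=64)
>>> # "The weather is nice today" vs. "I'm in a good mood today"
>>> sentences = ["今天天气不错", "今天心情不错"]
>>> # predict() takes a list of sentence pairs and returns one score per pair
>>> score = model.predict([sentences])
>>> print(score[0])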
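Because `predict` accepts a list of pairs, the same call can score one query against many candidate sentences at once. The sketch below shows one way to rank candidates by score. `rank_candidates` and `load_predict_fn` are hypothetical helpers, not part of sentence-transformers, and the scoring function is passed in so the ranking logic can be tried without downloading the model.

```python
def rank_candidates(query, candidates, predict_fn):
    """Return (candidate, score) tuples sorted from highest to lowest score.

    predict_fn is any callable that maps a list of [sentence_a, sentence_b]
    pairs to one score per pair, e.g. CrossEncoder.predict.
    """
    if not candidates:
        return []
    pairs = [[query, cand] for cand in candidates]
    scores = predict_fn(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [(cand, float(score)) for cand, score in ranked]


def load_predict_fn(model_path, device="cpu", max_length=64):
    """Load the cross-encoder and return its predict method (hypothetical helper)."""
    # Imported lazily so rank_candidates can be used with any scoring function.
    from sentence_transformers.cross_encoder import CrossEncoder

    model = CrossEncoder(model_path, device=device, max_length=max_length)
    return model.predict
```

With the real model, `rank_candidates("今天天气不错", candidates, load_predict_fn(model_save_path))` returns the candidates ordered by predicted similarity to the query.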
Code
The training code comes from https://github.com/TTurn/cross-encoder