How to use NeoCyber/m-e5-small-uit-vsfc-uni with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="NeoCyber/m-e5-small-uit-vsfc-uni", trust_remote_code=True)

# Load the model directly
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("NeoCyber/m-e5-small-uit-vsfc-uni", trust_remote_code=True, dtype="auto")
m-e5-small-uit-vsfc-uni
Overview
Vietnamese multi-task text classification model for student feedback. The model jointly predicts sentiment and topic labels from a single sentence.
Model Details
- Base model: intfloat/multilingual-e5-small
- Architecture: uni_vsfc
- Checkpoint source: uit-vsfc-uni-e5-small-best.pt
- Sequence length used in the training/inference pipeline: 256
- Tasks: sentiment, topic
Label Schema
- sentiment: 0 = negative, 1 = neutral, 2 = positive
- topic: 0 = lecturer, 1 = training_program, 2 = facility, 3 = others
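The mappings above can be written as plain Python dicts; a minimal sketch (the id-to-label mapping stored in the model config is assumed to match this schema):

```python
# Label schema for the two tasks, as listed in the model card.
SENTIMENT_LABELS = {0: "negative", 1: "neutral", 2: "positive"}
TOPIC_LABELS = {0: "lecturer", 1: "training_program", 2: "facility", 3: "others"}

def decode(sentiment_id: int, topic_id: int) -> dict:
    """Map raw class indices to human-readable labels."""
    return {
        "sentiment": SENTIMENT_LABELS[sentiment_id],
        "topic": TOPIC_LABELS[topic_id],
    }

print(decode(2, 0))  # a positive comment about the lecturer
```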
Task Heads
- sentiment: 3 classes
- topic: 4 classes
Dataset
- Dataset: Vietnamese Students' Feedback Corpus (UIT-VSFC)

UIT-VSFC contains more than 16,000 human-annotated student feedback sentences with sentiment and topic labels.
Data Format
sentence is the input text column. sentiment is a 3-class label and topic is a 4-class label.
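A single row in this format looks like the following sketch (an illustrative record, not taken from the corpus; the label values are made up):

```python
# Illustrative UIT-VSFC-style record: one sentence with both task labels.
record = {
    "sentence": "slide giáo trình đầy đủ .",  # roughly: "the course slides are complete"
    "sentiment": 2,  # positive (valid range 0-2)
    "topic": 0,      # lecturer (valid range 0-3)
}

# Both labels must fall inside their task's class range.
assert record["sentiment"] in range(3) and record["topic"] in range(4)
```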
Splits
- Train: 11,426 samples
- Validation: 1,583 samples
- Test: 3,166 samples
Checkpoint Metrics
- loss: 0.2894
- accuracy: 0.9005
Usage
Load the model with trust_remote_code=True because this repository contains custom modeling code.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo_id = "NeoCyber/m-e5-small-uit-vsfc-uni"

# trust_remote_code=True is required: the repository ships custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(
    repo_id,
    trust_remote_code=True,
)

# Tokenize a Vietnamese feedback sentence (roughly: "the course slides are complete").
texts = ["slide giáo trình đầy đủ ."]
inputs = tokenizer(texts, return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)

# decode_predictions maps the per-task logits to human-readable labels.
predictions = model.decode_predictions(outputs.logits_by_task)
print(predictions)
Notes
- The repository includes the custom configuration_*.py and modeling_*.py files required by the transformers Auto classes.
- outputs.logits_by_task contains one tensor per task, and outputs.logits is the concatenated tensor.
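Given the head sizes above (3 sentiment classes, 4 topic classes), the concatenated logits can be split back into per-task chunks by slicing. A pure-Python sketch with made-up logit values, assuming the sentiment head comes first (matching the task order listed in the card):

```python
# Head sizes from the model card: sentiment has 3 classes, topic has 4.
HEAD_SIZES = {"sentiment": 3, "topic": 4}

def split_logits(concatenated):
    """Split a flat 7-value logit vector into per-task chunks."""
    chunks, offset = {}, 0
    for task, size in HEAD_SIZES.items():
        chunks[task] = concatenated[offset:offset + size]
        offset += size
    return chunks

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

logits = [-0.2, 0.1, 2.3, 1.9, -1.0, 0.4, 0.0]  # made-up values
per_task = split_logits(logits)
print({task: argmax(v) for task, v in per_task.items()})
# → {'sentiment': 2, 'topic': 0}
```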