	---
	license: apache-2.0
	tags:
	- text-classification
	- topic-analysis
	- vietnamese
	- vsfc
	- phobert
	language:
	- vi
	datasets:
	- uit-vsfc
	model-index:
	- name: VSFC Topic Classifier (PhoBERT)
	  results:
	  - task:
	      type: text-classification
	      name: Topic Classification
	    dataset:
	      name: UIT-VSFC
	      type: uit-vsfc
	    metrics:
	    - type: accuracy
	      value: 89.1346
	    - type: f1
	      value: 89.0436
	---

	# VSFC Topic Classifier using PhoBERT

	This model is fine-tuned from [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base) on UIT-VSFC (the Vietnamese Students' Feedback Corpus) to classify the topic of Vietnamese student feedback sentences.

	## 🧠 Model Details

	- Model type: Transformer encoder (PhoBERT, a RoBERTa-style BERT variant)
	- Base model: [`vinai/phobert-base`](https://huggingface.co/vinai/phobert-base)
	- Fine-tuned task: Sentence-level topic classification
	- Target labels: Lecturer, Training program, Facility, Others
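	- Tokenizer: PhoBERT BPE tokenizer, inherited from `vinai/phobert-base` (the base model is pretrained on word-segmented Vietnamese text)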
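
The classifier emits one logit per target label, so a prediction can be turned into a label name and a confidence score with a softmax. The sketch below uses only the standard library and takes the logits as a plain list. The `decode_topic` helper and the `ID2LABEL` order are illustrative assumptions, not part of this checkpoint; the actual order should be checked against `model.config.id2label`.

```python
import math

# Hypothetical id-to-label order (an assumption for illustration);
# verify it against model.config.id2label before relying on it.
ID2LABEL = {0: "Lecturer", 1: "Training program", 2: "Facility", 3: "Others"}


def decode_topic(logits, id2label=ID2LABEL):
    """Return (label, probability) for one row of classifier logits."""
    if len(logits) != len(id2label):
        raise ValueError("expected one logit per label")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class (first index wins on ties).
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]
```

With a model output from `transformers`, the list could be obtained with `outputs.logits[0].tolist()` and passed to `decode_topic`.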

	## 📚 Training Data

	- Dataset: [UIT-VSFC](https://drive.google.com/drive/folders/1xclbjHHK58zk2X6iqbvMPS2rcy9y9E0X)
	- Language: Vietnamese
	- License: Academic use
	- Description: Student feedback is a valuable resource for interdisciplinary research that combines sentiment analysis and education.

	## 🚀 How to Use

	```python
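	import torch
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub
	tokenizer = AutoTokenizer.from_pretrained("tmt3103/VSFC-topic-classify-phoBERT")
	model = AutoModelForSequenceClassification.from_pretrained("tmt3103/VSFC-topic-classify-phoBERT")

	# Example feedback, roughly: "The lecturer is friendly and lovely"
	inputs = tokenizer("Giảng viên thân thiện dễ thương", return_tensors="pt")
	with torch.no_grad():  # inference only, so gradients are not needed
	    outputs = model(**inputs)
	# Index of the highest-scoring topic class
	predicted_class = outputs.logits.argmax(dim=-1).item()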