	---
	language: sr
	metrics:
	- accuracy
	- precision
	- recall
	- f1
	base_model: microsoft/deberta-v3-large
	---
	# srbNLI: Serbian Natural Language Inference Model

	## Model Overview
	srbNLI is a Natural Language Inference (NLI) model for Serbian, fine-tuned from DeBERTa-v3-large on srbSciFact, an automatically translated Serbian version of the SciFact dataset. It is trained to recognize the relationship between a claim and a piece of evidence in Serbian text. Its primary application is scientific claim verification, with possible extension to broader claim verification tasks.

	## Key Details
	- Model Type: Transformer-based (DeBERTa-v3-large)
	- Language: Serbian
	- Task: Natural Language Inference (NLI), Textual Entailment, Claim Verification
	- Dataset: srbSciFact (automatically translated SciFact dataset)
	- Fine-tuning: Fine-tuned on Serbian NLI data (support, contradiction, and neutral categories).
	- Metrics: Accuracy, Precision, Recall, F1-score

	## Motivation
	This model addresses the lack of NLI datasets and models for Serbian, a low-resource language. It provides a tool for textual entailment and claim verification, especially for scientific claims, with broader potential for misinformation detection and automated fact-checking.

	## Training
	- Base Model: DeBERTa-v3-large (`microsoft/deberta-v3-large`)
	- Training Data: srbSciFact (automatically translated SciFact dataset)
	- Fine-tuning: Conducted on a single NVIDIA A100 GPU (40 GB) in a DGX system
	- Hyperparameters: Learning rate, batch size, weight decay, and number of epochs were tuned; early stopping was used
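
The early stopping mentioned in the list above can be illustrated with a minimal sketch. It is not the training code used for this model: the class name, the `patience` and `min_delta` parameters, and the example scores are all hypothetical and only show the general idea of halting once the validation metric stops improving.

```python
class EarlyStopping:
    """Signal a stop when a validation metric has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        # patience and min_delta are illustrative defaults, not the card's settings.
        self.patience = patience
        self.min_delta = min_delta
        self.best = None
        self.bad_epochs = 0

    def should_stop(self, metric):
        """Record one epoch's validation metric (higher is better)."""
        if self.best is None or metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_scores, patience=2):
    """Return the number of epochs executed before early stopping triggers."""
    stopper = EarlyStopping(patience=patience)
    for epoch, score in enumerate(val_scores, start=1):
        if stopper.should_stop(score):
            return epoch
    return len(val_scores)


if __name__ == "__main__":
    # Hypothetical validation F1 per epoch; training halts after two epochs without improvement.
    print(epochs_run([0.60, 0.65, 0.64, 0.63, 0.70], patience=2))
```

In this sketch the best score is tracked across epochs, so a later recovery (the final 0.70) is never reached once the patience budget is spent; a larger `patience` trades training time for a chance to see such late improvements.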

	## Evaluation
	The model was evaluated on srbSciFact using standard NLI metrics (accuracy, precision, recall, F1-score). It was also compared against GPT-4o to assess generalization.
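
As a rough illustration of how these four metrics relate, the sketch below computes them from gold and predicted labels in plain Python. The label names and the use of macro averaging are assumptions made for the example; this card does not state which averaging scheme was used for the reported numbers.

```python
LABELS = ("SUPPORT", "CONTRADICTION", "NEUTRAL")  # assumed label names


def nli_metrics(gold, pred, labels=LABELS):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")

    pairs = list(zip(gold, pred))
    accuracy = sum(g == p for g, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from srbSciFact.
    gold = ["SUPPORT", "SUPPORT", "CONTRADICTION", "CONTRADICTION", "NEUTRAL", "NEUTRAL"]
    pred = ["SUPPORT", "CONTRADICTION", "CONTRADICTION", "CONTRADICTION", "NEUTRAL", "SUPPORT"]
    print(nli_metrics(gold, pred))
```

Because accuracy counts every example once while macro-averaged scores weight each class equally, the two can diverge noticeably on imbalanced data.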

	## Use Cases
	- Claim Verification: Scientific claims and general domain claims in Serbian
	- Misinformation Detection: Identifying contradictions or support between claims and evidence
	- Cross-lingual Applications: Potential for cross-lingual claim verification with multilingual models

	## Future Work
	- Improving accuracy with human-corrected translations and Serbian-specific datasets
	- Expanding to general-domain claim verification
	- Enhancing multilingual NLI capabilities

	## Results Comparison

	The table below compares the fine-tuned models (DeBERTa-v3-large, RoBERTa-large, BERTić, and others) and GPT-4o on the srbSciFact dataset, using four metrics: Accuracy (Acc), Precision (P), Recall (R), and F1-score (F1). The models were evaluated on their ability to classify relationships between claims and evidence in Serbian text.

	| Model             | Accuracy | Precision (P) | Recall (R) | F1-score (F1) |
	|-------------------|----------|---------------|------------|---------------|
	| DeBERTa-v3-large  | 0.70     | 0.86          | 0.82       | 0.84          |
	| RoBERTa-large     | 0.57     | 0.63          | 0.76       | 0.69          |
	| BERTić (Serbian)  | 0.56     | 0.56          | 0.37       | 0.44          |
	| GPT-4o (English)  | 0.66     | 0.70          | 0.77       | 0.78          |
	| mDeBERTa-base     | 0.63     | 0.92          | 0.75       | 0.83          |
	| XLM-RoBERTa-large | 0.64     | 0.89          | 0.77       | 0.83          |
	| mBERT-cased       | 0.48     | 0.76          | 0.50       | 0.60          |
	| mBERT-uncased     | 0.57     | 0.45          | 0.61       | 0.52          |

	### Observations
	- DeBERTa-v3-large performed the best overall, with an accuracy of 0.70 and an F1-score of 0.84.
	- RoBERTa-large and BERTić showed lower performance (F1-scores of 0.69 and 0.44). BERTić in particular had low recall (0.37), suggesting challenges in handling complex linguistic inference in Serbian.
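	- With an English prompt, GPT-4o reaches an accuracy of 0.66 and an F1-score of 0.78, which places it below DeBERTa-v3-large (0.70 and 0.84) in the table above. DeBERTa-v3-large also slightly outperforms GPT-4o when the prompt is in Serbian (Serbian-prompt figures are not listed in the table).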
	- mDeBERTa-base and XLM-RoBERTa-large exhibited strong cross-lingual performance, each with an F1-score of 0.83.

	This demonstrates the potential of adapting advanced transformer models to Serbian while highlighting areas for future improvement, such as refining translations and expanding domain-specific data.