luluw/Roberta-devangari-script-classification

Text Classification

Generated from Trainer

text-embeddings-inference

luluw committed on Oct 29, 2024

Commit b402f07 · verified · 1 parent: e75506c

Update README.md

Files changed (1)

README.md +19 -9

@@ -1,7 +1,10 @@
 ---
 library_name: transformers
 language:
-- np
 base_model: RoBERTa
 tags:
 - generated_from_trainer
@@ -30,15 +33,23 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
 ## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
 ## Training procedure
@@ -61,14 +72,13 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
-| 0.2337        | 0.9997 | 1638 | 0.0603          | 0.9874   | 0.9874 | 0.9875    | 0.9874 |
 | 0.0513        | 2.0    | 3277 | 0.0387          | 0.9919   | 0.9919 | 0.9919    | 0.9919 |
-| 0.0252        | 2.9991 | 4914 | 0.0329          | 0.9935   | 0.9935 | 0.9935    | 0.9935 |
 ### Framework versions
 - Transformers 4.44.2
 - Pytorch 2.4.1+cu121
 - Datasets 3.0.2
-- Tokenizers 0.19.1

 ---
 library_name: transformers
 language:
+- nep
+- hi
+- sa
+- mr
 base_model: RoBERTa
 tags:
 - generated_from_trainer
 ## Model description
+This model is a fine-tuned version of RoBERTa, optimized for multiclass text classification on datasets written in
+Devanagari script across multiple languages, including Nepali, Marathi, Sanskrit, Bhojpuri, and Hindi. By leveraging the
+robust RoBERTa architecture, this model has been fine-tuned to recognize intricate patterns and contextual
+cues within Devanagari text, achieving high accuracy and F1 scores for multiclass classification tasks.
 ## Intended uses & limitations
+#### Intended Uses:
+- Multiclass text classification for Nepali, Marathi, Sanskrit, Bhojpuri, and Hindi, written in Devanagari script.
+- Suitable for sentiment analysis, topic categorization, and public opinion monitoring.
+#### Limitations:
+- Limited to Devanagari script; accuracy may drop on other scripts.
+- Fine-tuned for multiclass classification; may not generalize well to other tasks or binary classifications.
+- Language-specific nuances not present in the dataset may impact performance on certain dialects.
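Reviewer note: the first limitation above (Devanagari-only input) could be enforced with a lightweight pre-check before text reaches the classifier. The following is a minimal sketch and not part of the committed README. The helper names `devanagari_ratio` and `is_mostly_devanagari` and the 0.8 threshold are hypothetical, and the check only looks at the basic Devanagari Unicode block (U+0900 to U+097F).

```python
import unicodedata

# Basic Devanagari Unicode block. Extended blocks are ignored in this sketch.
DEVANAGARI_START = 0x0900
DEVANAGARI_END = 0x097F


def devanagari_ratio(text: str) -> float:
    """Return the fraction of letters and combining marks that are Devanagari.

    Digits, punctuation, and whitespace are skipped so they do not dilute the ratio.
    """
    relevant = [ch for ch in text if unicodedata.category(ch)[0] in ("L", "M")]
    if not relevant:
        return 0.0
    in_block = sum(DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END for ch in relevant)
    return in_block / len(relevant)


def is_mostly_devanagari(text: str, threshold: float = 0.8) -> bool:
    """Hypothetical gate: accept text only if it is predominantly Devanagari."""
    return devanagari_ratio(text) >= threshold


if __name__ == "__main__":
    for sample in ("नमस्ते", "hello", "नमस्ते hello"):
        print(repr(sample), is_mostly_devanagari(sample))
```

Inputs that fail the check could be rejected or flagged as out of scope instead of being classified. The right threshold depends on how much code-mixed text is expected.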
 ## Training procedure
 | Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
+| 0.2337        | 1.0 | 1638 | 0.0603          | 0.9874   | 0.9874 | 0.9875    | 0.9874 |
 | 0.0513        | 2.0    | 3277 | 0.0387          | 0.9919   | 0.9919 | 0.9919    | 0.9919 |
+| 0.0252        | 3.0 | 4914 | 0.0329          | 0.9935   | 0.9935 | 0.9935    | 0.9935 |
 ### Framework versions
 - Transformers 4.44.2
 - Pytorch 2.4.1+cu121
 - Datasets 3.0.2
+- Tokenizers 0.19.1
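
Reviewer note: this commit rounds the logged epoch values 0.9997 and 2.9991 to 1.0 and 3.0. The original fractions are consistent with dividing the step count by a non-integer number of steps per epoch. The sketch below reproduces them under one assumption that the README does not state: steps per epoch is inferred as 3277 / 2 = 1638.5 from the row logged at epoch 2.0.

```python
# Assumption: steps per epoch is inferred from the table row that logs
# epoch 2.0 at step 3277. The README does not state this value.
STEPS_AT_EPOCH_2 = 3277
steps_per_epoch = STEPS_AT_EPOCH_2 / 2  # 1638.5


def fractional_epoch(step: int) -> float:
    """Return the epoch value for a global step, rounded to four decimals."""
    return round(step / steps_per_epoch, 4)


if __name__ == "__main__":
    # Steps taken from the training results table.
    for step in (1638, 3277, 4914):
        print(step, fractional_epoch(step))
```

Under that assumption, steps 1638 and 4914 fall slightly short of whole epochs, which matches the pre-commit values in the table. The rounded 1.0 and 3.0 are a presentational change only.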