Model Card for Fine-Tuned SI2M_DarijaBERT and CamelBERT
This model card describes the fine-tuning of SI2M_DarijaBERT on a portion of a large Moroccan Darija dataset scraped from YouTube transcriptions and other websites, available at https://huggingface.co/datasets/HANTIFARAH/combined_darija_dataset_cleaned. The model was fine-tuned to generate embeddings for Moroccan Darija text and to improve its performance on specific NLP tasks; its embeddings were tested on text classification tasks.
Model Details
Model Description
The SI2M_DarijaBERT model has been fine-tuned on Moroccan Darija texts. The model is based on the BERT architecture and specializes in generating embeddings for text classification tasks in Moroccan Darija.
- Developed by: BAGUENNA Mohammed-Amine
- Model type: Transformer-based (BERT architecture)
- Language(s) (NLP): Moroccan Darija (Arabic dialect)
- Finetuned from model: SI2M_DarijaBERT
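As an illustration of how sentence embeddings can feed a text classification task, here is a minimal sketch of a nearest-centroid classifier over embedding vectors. It is not the evaluation method used for this model, which the card does not specify. The toy 2-dimensional vectors stand in for real embeddings, and the `positive`/`negative` labels and helper names (`l2_normalize`, `fit_centroids`, `predict`) are hypothetical.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def fit_centroids(embeddings: np.ndarray, labels: np.ndarray):
    """Compute one normalized mean vector (centroid) per class."""
    embeddings = l2_normalize(embeddings)
    classes = np.unique(labels)
    centroids = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    return classes, l2_normalize(centroids)


def predict(embeddings: np.ndarray, classes: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Assign each embedding to the class whose centroid is most similar."""
    similarities = l2_normalize(embeddings) @ centroids.T
    return classes[similarities.argmax(axis=1)]


# Toy stand-in data: in practice these rows would be embeddings from the model.
train_x = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]])
train_y = np.array(["positive", "positive", "negative", "negative"])

classes, centroids = fit_centroids(train_x, train_y)
print(predict(np.array([[0.8, 0.2], [0.2, 0.9]]), classes, centroids))
```

A trained classifier (for example logistic regression) on the same vectors would be a natural next step; the centroid approach is shown only because it needs no extra dependencies.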
Recommendations
Users should take care to ensure their data falls within the domain of Moroccan Darija text. Further fine-tuning with more specialized data is recommended for domain-specific applications (e.g., medical language).
How to Get Started with the Model
You can load the model and its tokenizer with the following code:
from transformers import AutoModel, AutoTokenizer
# Load the fine-tuned encoder (no classification head) and its tokenizer
model = AutoModel.from_pretrained("bagamine/SI2M_DarijaBERTV1")
tokenizer = AutoTokenizer.from_pretrained("bagamine/SI2M_DarijaBERTV1")
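Since the model is intended for embedding generation, the following sketch shows one common way to turn its token-level outputs into a single vector per sentence. Mean pooling over non-padding tokens is an assumption here; the card does not say which pooling strategy the author used. The helper names `mean_pool` and `embed` are hypothetical.

```python
import numpy as np

MODEL_ID = "bagamine/SI2M_DarijaBERTV1"


def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padding positions.

    hidden_states: (batch, seq_len, hidden_size)
    attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against division by zero
    return summed / counts


def embed(sentences: list[str]) -> np.ndarray:
    """Return one mean-pooled embedding per sentence.

    Downloads the model on first call, so it needs network access.
    """
    # Imported lazily so mean_pool stays usable without torch/transformers.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)
    model.eval()

    encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        output = model(**encoded)
    return mean_pool(
        output.last_hidden_state.numpy(),
        encoded["attention_mask"].numpy(),
    )
```

Calling `embed` with a list of n Darija sentences should return an array of shape `(n, hidden_size)`. For repeated use, load the tokenizer and model once outside the function instead of on every call.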
Model tree for bagamine/SI2M_DarijaBERTV1
Base model
SI2M-Lab/DarijaBERT