- Shorter but not Worse: Frugal Reasoning via Easy Samples as Length Regularizers in Math RLVR
  Paper • 2511.01937 • Published • 16
- MBZUAI-Paris/Frugal-Thinking-30B-A3B
  31B • Updated • 2
- MBZUAI-Paris/Frugal-Thinking-4B
  Text Generation • 4B • Updated • 5 • • 7
- MBZUAI-Paris/Frugal-Thinking-RL-Data
  Viewer • Updated • 177k • 18 • 5
Abdelaziz Bounhar
BounharAbdelaziz
AI & ML interests
Deep Learning, Reinforcement Learning, AI Agents, Generative Modeling, NLP, Information Theory, Security of Machine Learning, etc.
Moroccan Darija LLMs
Language models that speak Moroccan Darija (ary)
Moroccan Speech Models & Datasets
Moroccan Darija speech-to-text (STT)
- BounharAbdelaziz/Morocco-Darija-STT-large
  Automatic Speech Recognition • 2B • Updated • 17
- atlasia/DODa-audio-dataset
  Viewer • Updated • 12.7k • 375 • 19
- atlasia/Moroccan-Darija-Wiki-Audio-Dataset
  Viewer • Updated • 492 • 49 • 14
- BounharAbdelaziz/Dvoice-v2-cleaned
  Viewer • Updated • 120 • 13
Translation Models & Datasets
English to Moroccan Darija (ary) translation models
- BounharAbdelaziz/Terjman-v2-English-Morocco-Darija-Dataset-350K
  Viewer • Updated • 355k • 12 • 1
- BounharAbdelaziz/Terjman-Supreme-v2.0
  Translation • 3B • Updated • 2
- BounharAbdelaziz/Terjman-Ultra-v2.0
  Translation • 1B • Updated • 48 • 2
- BounharAbdelaziz/Terjman-Large-v2.0
  Translation • 0.2B • Updated • 27 • 3
Arabic (MSA) Summarization Models & Datasets
A collection of models trained to summarize Arabic text, along with the dataset used to train them.
- BounharAbdelaziz/MaYofid-Qwen2.5-3B-Instruct
  Text Generation • 3B • Updated
- BounharAbdelaziz/MaYofid-Falcon3-3B-Instruct
  Text Generation • 3B • Updated
- BounharAbdelaziz/MaYofid-Qwen2.5-3B-Instruct-AWQ
  3B • Updated • 1
- BounharAbdelaziz/Arabic-Synthetic-Summarization-Dataset-Filtered
  Viewer • Updated • 4.41k • 23 • 1
RLHF/RLVR
Some RLHF/RLVR experiments using GRPO and DPO.
Moroccan Darija Embeddings Models & Datasets
Sentence and word embedding models for Moroccan Darija (ary)
- BounharAbdelaziz/Morocco-Darija-Sentence-Embedding-v0.2
  Sentence Similarity • 0.6B • Updated • 2
- BounharAbdelaziz/ModernBERT-Morocco-Sentence-Embeddings-v0.2-bs-32-lr-2e-05-ep-2-wp-0.05-gacc-1-gnm-1.0-v0.3
  Sentence Similarity • 0.2B • Updated • 1
- BounharAbdelaziz/Morocco-Darija-Sentence-Embedding-v0.1
  Feature Extraction • 0.6B • Updated • 192 • 2
- BounharAbdelaziz/XLM-RoBERTa-Morocco-bs-32-lr-2e-05-ep-2-wp-0.05-gacc-1-gnm-1.0-v0.3
  0.6B • Updated • 8
Moroccan Darija Datasets
A collection of all available datasets for pretraining LLMs
Arabic (MSA) Language Models & Datasets
Frugal-AI