Sheikh Bengali AI Model

Model Description

Sheikh is a Bengali (Bangla) language model trained for instruction following and conversational tasks. It is built on Microsoft's DialoGPT-medium and fine-tuned on Bengali instruction-following data to understand and generate Bengali text.

Model Details

  • Model Type: Language Model, Text Generation
  • Architecture: GPT-2 based (DialoGPT-medium)
  • Base Model: microsoft/DialoGPT-medium
  • Parameters: 355M
  • Language: Bengali (Bangla)
  • Training Data: Alpaca Bangla instruction dataset
  • Model Size: 1.4GB
  • License: Apache 2.0
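
The listed parameter count and model size are consistent with each other if the weights are stored as 32-bit floats. The card does not state the storage precision, so that is an assumption here, and `checkpoint_size_gb` is a hypothetical helper used only for this back-of-the-envelope check:

```python
def checkpoint_size_gb(num_params: int, bytes_per_param: int = 4) -> float:
    """Rough size of the raw weights in decimal gigabytes.

    bytes_per_param=4 assumes float32 storage (an assumption, not stated
    in the model card); use 2 for float16 or bfloat16.
    """
    return num_params * bytes_per_param / 1e9


# 355M parameters at 4 bytes each is roughly the 1.4GB listed above.
print(round(checkpoint_size_gb(355_000_000), 2))
# The same weights in half precision would take about half of that.
print(round(checkpoint_size_gb(355_000_000, bytes_per_param=2), 2))
```

This ignores tokenizer files and any non-weight metadata, so the size on disk will differ slightly.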

Intended Use

This model is designed for:

  • Bengali language text generation
  • Instruction following and question answering
  • Educational content creation
  • Cultural and historical knowledge responses
  • General conversational AI in Bengali

Usage

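from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("megharudushi/Sheikh")
model = AutoModelForCausalLM.from_pretrained("megharudushi/Sheikh")

# Tokenize the prompt; calling the tokenizer also returns the attention mask
input_text = "বাংলাদেশের রাজধানী কী?"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate a Bengali response
outputs = model.generate(
    **inputs,
    max_length=150,
    do_sample=True,  # temperature only takes effect when sampling is enabled
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)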
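
The Usage snippet decodes the full output sequence, which for a causal language model begins with the prompt tokens. If only the model's answer is wanted, one option is to slice off the prompt before decoding. The following is a minimal sketch, not part of the original card: `generate_reply` is a hypothetical helper, and the `max_new_tokens` default and sampling settings are illustrative assumptions.

```python
def generate_reply(prompt, tokenizer, model, max_new_tokens=100):
    """Generate a reply and return only the newly generated text.

    `tokenizer` and `model` are expected to be the objects loaded with
    AutoTokenizer / AutoModelForCausalLM in the Usage section.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_length = len(inputs["input_ids"][0])  # number of prompt tokens

    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,          # required for temperature to have an effect
        temperature=0.8,
        pad_token_id=tokenizer.eos_token_id,
    )

    # The output is prompt + continuation; keep only the continuation.
    new_tokens = outputs[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

With the model loaded, a call would look like `print(generate_reply("বাংলাদেশের রাজধানী কী?", tokenizer, model))`.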

Example Prompts

  • Educational: "গণিতের মৌলিক নীতি বলুন"
  • Cultural: "বাংলা সাহিত্যের বিখ্যাত কবি কারা?"
  • General: "স্বাস্থ্যকর থাকার উপায় বলুন"
  • Historical: "বাংলাদেশের স্বাধীনতার ইতিহাস বর্ণনা করুন"
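
To try all four prompts in one pass, they can be kept in a dictionary keyed by category and fed to any function that maps a prompt string to a response string. This is a sketch under assumptions: `EXAMPLE_PROMPTS` and `run_examples` are hypothetical names, and `generate_fn` stands in for whatever wrapper you build around `model.generate`.

```python
# The example prompts from this card, keyed by category.
EXAMPLE_PROMPTS = {
    "Educational": "গণিতের মৌলিক নীতি বলুন",
    "Cultural": "বাংলা সাহিত্যের বিখ্যাত কবি কারা?",
    "General": "স্বাস্থ্যকর থাকার উপায় বলুন",
    "Historical": "বাংলাদেশের স্বাধীনতার ইতিহাস বর্ণনা করুন",
}


def run_examples(generate_fn, prompts=EXAMPLE_PROMPTS):
    """Call generate_fn on each prompt and return {category: response}.

    generate_fn is any callable taking a prompt string and returning a
    string, for example a thin wrapper around tokenizer + model.generate.
    """
    return {category: generate_fn(prompt) for category, prompt in prompts.items()}


# Usage with a stand-in function; swap in a real model call to see responses.
results = run_examples(lambda prompt: f"[response to: {prompt}]")
for category, response in results.items():
    print(f"{category}: {response}")
```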

Model Performance

  • Understands and generates Bengali text
  • Fine-tuned on a Bengali instruction-following dataset
  • Targeted at educational and conversational contexts
  • Aims to cover Bengali cultural knowledge

Limitations

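  • Trained primarily on Bengali instruction data, so output quality in other languages is not assured
  • May be less reliable in highly specialized domains
  • Output quality depends on the clarity of the input prompt
  • Model size (355M parameters) was constrained by available computational resources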

Training Details

  • Base Model: microsoft/DialoGPT-medium
  • Fine-tuning Data: Alpaca Bangla dataset
  • Training Approach: Instruction following
  • Language Focus: Bengali (Bangla) language

Citation

If you use this model, please cite:

@misc{SheikhBengaliAI,
  title={Sheikh Bengali AI Model},
  author={megharudushi},
  year={2025},
  url={https://huggingface.co/megharudushi/Sheikh},
  note={Bengali language instruction-following model based on DialoGPT-medium}
}

License

This model is released under the Apache 2.0 License.

Contributing

This model is part of the Bengali AI initiative to make Bengali language AI more accessible to the community.


Created: December 21, 2025
Repository: https://huggingface.co/megharudushi/Sheikh
Base Model: microsoft/DialoGPT-medium
Language: Bengali (Bangla)
