Bayan-15B

A specialized Arabic Large Language Model for legal reasoning, interpretive methodologies, and classical Arabic text analysis.

Model Description

Bayan-15B is a domain-adapted language model built on Qwen2.5-14B, fine-tuned on a comprehensive corpus of classical Arabic legal and interpretive texts. The model excels at understanding complex argumentative structures, legal reasoning patterns, and hermeneutical methodologies in Arabic.

Key Capabilities

  • Legal Text Analysis: Understanding and generating classical Arabic legal discourse
  • Interpretive Reasoning: Analyzing methodological frameworks and interpretive principles
  • Classical Arabic: Deep comprehension of traditional scholarly Arabic writing styles
  • Argumentation: Following complex chains of reasoning and evidence-based arguments

Training Data

  • Corpus Size: Approximately 190 million tokens
  • Sources: Over 900 classical Arabic texts covering legal theory, interpretive methodology, and jurisprudential reasoning
  • Language: Classical and Modern Standard Arabic

Technical Specifications

| Parameter       | Value                        |
|-----------------|------------------------------|
| Base Model      | Qwen/Qwen2.5-14B             |
| Parameters      | 14.7B                        |
| Training Method | Continued Pre-Training (CPT) |
| Context Length  | 2048 tokens                  |
| Precision       | bfloat16                     |
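As a rough sizing aid, the weight footprint in bfloat16 can be estimated from the parameter count in the table above. This is illustrative arithmetic only; actual memory use also includes activations, the KV cache, and framework overhead.

```python
# Estimate the bfloat16 weight footprint of a 14.7B-parameter model.
# Weights only; activations and the KV cache add to this at runtime.
params = 14.7e9          # parameter count from the table above
bytes_per_param = 2      # bfloat16 stores each parameter in 2 bytes
weights_gib = params * bytes_per_param / 1024**3
print(f"~{weights_gib:.1f} GiB for weights alone")  # ~27.4 GiB
```

In practice this means a single 40 GB-class GPU, or several smaller GPUs sharded via `device_map="auto"`, is a reasonable target for bfloat16 inference.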

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "MohJaf/Bayan-15B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("MohJaf/Bayan-15B")

prompt = "Your Arabic text here"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
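Because the context window is 2048 tokens and classical legal texts are often long, the prompt must leave room for the tokens to be generated. A minimal sketch of that budgeting arithmetic follows; the function name is illustrative and not part of any library API.

```python
# Sketch: budget prompt length against the 2048-token context window.
# The window must hold both the prompt and the generated tokens.
MAX_CONTEXT = 2048

def prompt_budget(max_new_tokens: int) -> int:
    """Tokens available for the prompt after reserving generation room."""
    if max_new_tokens >= MAX_CONTEXT:
        raise ValueError("max_new_tokens must be smaller than the context window")
    return MAX_CONTEXT - max_new_tokens

print(prompt_budget(256))  # 1792
```

The result can be passed to the tokenizer as `max_length` together with `truncation=True`, so that long inputs are clipped rather than overflowing the model's window.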

Use Cases

  • Academic research in Arabic legal traditions
  • Analysis of classical interpretive methodologies
  • Arabic NLP applications requiring domain expertise
  • Educational tools for Arabic legal studies
  • Compliance and advisory systems for Islamic finance

Limitations

  • Specialized in classical Arabic legal discourse; may underperform on general-purpose tasks
  • Not a substitute for qualified legal or religious experts
  • Should be used as a research and analysis tool
  • May require domain expertise to evaluate outputs

License

This model is released under CC BY-NC-ND 4.0.

Academic and research use is permitted. Commercial use requires separate licensing. Modifications and redistribution are not permitted without prior authorization.

For commercial licensing inquiries, please contact the developer.

Developer

Bayan AI, LLC

Building AI solutions for Arabic language understanding and specialized domains.

Citation

```bibtex
@misc{usuli-ai-2025,
  author    = {Bayan AI},
  title     = {Bayan-15B: Arabic Legal Reasoning Language Model},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/MohJaf/Bayan-15B}
}
```

Contact

  • Hugging Face: @MohJaf
  • Organization: Bayan AI, LLC
