# Azhar_Model_v0.2_Final

## Project Description
This model is a fine-tuned version of Qwen2.5-7B optimized for Islamic jurisprudence (Fiqh) using the Shamela Library corpus.

## Training Specifications
- **Base Model:** Qwen2.5-7B
- **Training Data:** 20,000 high-quality juristic records from the Shamela Library.
- **Total Processed Tokens:** 24,576,000 (~24.5 million)
- **Optimization Steps:** 3,000
- **Effective Batch Size:** 16 (via gradient accumulation)
- **Hardware:** Kaggle dual NVIDIA T4 GPUs
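The reported token count follows directly from the hyperparameters above if we assume a fixed sequence length of 512 tokens per example (an assumption; the sequence length is not stated in this card, but it is the value that makes the arithmetic work out):

```python
# Sanity-check the reported token count against the training hyperparameters.
# ASSUMPTION: seq_len = 512 is inferred, not stated in the model card.
effective_batch_size = 16   # via gradient accumulation
seq_len = 512               # assumed tokens per sequence
steps = 3_000               # optimization steps

tokens_processed = effective_batch_size * seq_len * steps
print(tokens_processed)  # 24576000, matching the reported ~24.5M tokens
```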

### Performance Gains
| Metric | Initial Value | Final Value | Improvement |
| :--- | :--- | :--- | :--- |
| **Cross-Entropy Loss** | 15.3510 | 2.2705 | **85.21%** |
| **Perplexity (PPL)** | 4,643,546.51 | **9.68** | **99.99%** |

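The table's figures are internally consistent: perplexity is the exponential of the cross-entropy loss, and the reported improvement is the relative loss reduction. A quick check:

```python
import math

initial_loss, final_loss = 15.3510, 2.2705

# Perplexity is exp(cross-entropy loss)
initial_ppl = math.exp(initial_loss)  # ~4.64 million
final_ppl = math.exp(final_loss)      # ~9.68

# Relative reduction in cross-entropy loss
improvement = (initial_loss - final_loss) / initial_loss

print(f"{final_ppl:.2f}")    # 9.68
print(f"{improvement:.2%}")  # 85.21%
```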
## Qualitative Evaluation (Quad-Comparison Analysis)
To ensure reliability, the model was tested across four paradigms:
1. **Base Model:** Showed 15% accuracy with significant juristic hallucinations.
2. **RAG Only:** Accurate but lacked scholarly linguistic style.
3. **FT Only (Azhar v0.2):** Demonstrated "juristic intuition" and high fluency in classical Arabic.
4. **Hybrid (FT+RAG):** The optimal configuration, achieving the highest scores in both factual accuracy and stylistic authenticity.

The **Azhar Hybrid** approach showed:
- **Accuracy:** A significant reduction in juristic hallucinations compared to the base model.
- **Linguistic Style:** Successful adoption of classical Shamela phrasing and scholarly terminology.
- **Reliability:** High consistency in providing evidence-based (Fiqh) rulings.
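The hybrid paradigm above can be sketched as retrieval followed by grounded generation. This is a minimal, self-contained illustration only: the toy keyword-overlap retriever, the sample `corpus`, and the prompt template are all stand-ins, since the actual retriever and inference stack are not specified in this card.

```python
# Minimal sketch of the Hybrid (FT + RAG) paradigm described above.
# ASSUMPTIONS: the corpus, scorer, and prompt template are illustrative only.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive keyword overlap with the query (toy retriever)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_terms & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_hybrid_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved evidence so the fine-tuned model grounds its ruling."""
    evidence = "\n".join(f"- {p}" for p in retrieve(query, corpus))
    return f"Evidence:\n{evidence}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Ruling on fasting while traveling according to the four schools.",
    "Conditions of valid ablution (wudu).",
    "Zakat thresholds for gold and silver.",
]
prompt = build_hybrid_prompt("What is the ruling on fasting while traveling?", corpus)
print(prompt)
```

In the evaluated setup, the resulting prompt would be passed to the fine-tuned Azhar model, combining the retriever's factual grounding with the model's classical phrasing.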

## Verification Files
The full comparison results (Base vs. RAG vs. FT vs. Hybrid) are available in the attached `Azhar_Model_Quad_Comparison_v0.1.csv` file in this repository.

## Visualized Convergence
Training exhibited a stable downward trend in both the loss and perplexity curves, indicating successful domain adaptation without catastrophic forgetting.

## Ethical Considerations & Use Case
This model is intended for academic and research purposes, to assist scholars in navigating classical texts. It should be used as a supplementary tool alongside traditional scholarly verification.

---
**Developed by:** Shamil Al-Mohammedi, MSc.

This model serves as the primary artifact for the research paper: *"Fine-Tuning Large Language Models on Classical Arabic Juristic Corpora: A Case Study on the Shamela Library."*