	---
	title: README
	emoji: 🌖
	colorFrom: indigo
	colorTo: indigo
	sdk: static
	pinned: true
	thumbnail: >-
	  https://cdn-uploads.huggingface.co/production/uploads/68979c2673a7d0c259085219/i6ENamHFtDN6lK96DDjAP.jpeg
	---

	# Browser-based Locally Hosted Arabic LLM Optimization

	This repository contains all the models that were compressed and optimized as part of our final-year research project.

	For more information about our project, please refer to our [project webpage](https://arabicog.com).

	## Abstract
	This work evaluates several compression and optimization techniques for large language models, namely quantization, pruning, and knowledge distillation, with a focus on Arabic natural language performance and browser deployment. The methods investigated are 8-bit and 4-bit quantization using GPTQ, LLM.int8(), and QLoRA; pruning with SparseGPT and Wanda; and a knowledge distillation pipeline with a Qwen2.5-32B-Instruct teacher model and a Qwen2.5-7B-Instruct student model. The proposed SelecTKD method filters low-confidence teacher tokens during token-level distillation to improve bilingual Arabic-English balance. The study also includes staged fine-tuning that adapts the model to Arabic, GCC, and Bahraini contexts using a broad-to-specific curriculum while preserving bilingual performance. The report concludes with a comparative analysis of the tested compression methods, weighing compression against accuracy and discussing their practical effectiveness for resource-limited deployment.
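
The idea of filtering low-confidence teacher tokens during token-level distillation can be illustrated with a small sketch. This is not the project's SelecTKD implementation: the confidence measure (the teacher's maximum token probability), the `threshold` value, the KL-divergence loss, and the function names are all assumptions chosen for illustration.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over one position's vocabulary logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]


def selective_kd_loss(teacher_logits, student_logits, threshold=0.5, temperature=1.0):
    """Mean KL(teacher || student) over confident teacher positions only.

    Assumption: a position is "confident" when the teacher's highest
    token probability is at least `threshold`; other positions are skipped.
    Returns (loss, number_of_positions_kept).
    """
    total_kl = 0.0
    kept = 0
    for t_row, s_row in zip(teacher_logits, student_logits):
        t_logp = log_softmax(t_row, temperature)
        if math.exp(max(t_logp)) < threshold:
            continue  # low-confidence teacher token: excluded from the loss
        s_logp = log_softmax(s_row, temperature)
        total_kl += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(t_logp, s_logp))
        kept += 1
    loss = total_kl / kept if kept else 0.0
    return loss, kept


# Toy example: two positions, vocabulary of three tokens.
teacher = [[4.0, 0.0, 0.0],   # confident: one token dominates
           [0.1, 0.0, 0.1]]   # uncertain: near-uniform distribution
student = [[2.0, 0.5, 0.0],
           [0.0, 1.0, 0.0]]
loss, kept = selective_kd_loss(teacher, student, threshold=0.5)
print(f"kept {kept} of {len(teacher)} positions, loss = {loss:.4f}")
```

In this toy case only the first position contributes to the loss, since the teacher's distribution at the second position is close to uniform. A real pipeline would compute the same quantity on batched tensors and would likely combine it with a standard cross-entropy term.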