---
title: README
emoji: 👀
colorFrom: purple
colorTo: indigo
sdk: static
pinned: false
---

# EdgeCompress

**EdgeCompress** is a graduation project developed by senior Computer Science students at **Capital University, Egypt**. Our work focuses on **compressing Large Language Models (LLMs)** so they can run efficiently on **edge devices** with limited computational resources.

## Project Overview

Large Language Models typically require significant memory, storage, and computational power, which makes them difficult to deploy on edge hardware such as embedded systems, IoT devices, and low-power GPUs. Our project explores **model compression techniques** that reduce the size and resource requirements of LLMs while maintaining acceptable performance.

## Research Focus

We investigate multiple compression approaches, including:

* **Quantization**
* **Model pruning**
* **Knowledge distillation**
* **Low-precision inference**
* **Memory-efficient deployment strategies**

## Edge Deployment

After compression, the models are evaluated in **edge computing environments** to measure:

* Memory usage
* Inference latency
* Performance degradation after compression
* Suitability for real-time edge AI applications

## What You Will Find Here

This organization hosts:

* Compressed LLM checkpoints
* Experiments with different compression techniques
* Benchmark results on edge hardware
* Research artifacts from our graduation project

## Goal

Our goal is to **enable efficient deployment of LLMs on edge devices**, making advanced AI models more accessible in real-world, resource-constrained environments.
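As a flavor of the quantization approach listed above, the sketch below shows a minimal symmetric 8-bit post-training weight quantization in NumPy. It is illustrative only, not the project's actual pipeline, and all function names are our own:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    scale = np.abs(w).max() / 127.0          # map the largest weight to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 representation."""
    return q.astype(np.float32) * scale

# Quantize a random float32 weight matrix and inspect the round-trip error.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("storage: %d -> %d bytes" % (w.nbytes, q.nbytes))   # 4x smaller
print("max abs error:", np.abs(w - w_hat).max())          # bounded by scale / 2
```

Storing int8 weights plus one float scale cuts memory roughly 4x versus float32, which is the basic trade-off the project evaluates on edge hardware.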