---
title: README
emoji: ๐
colorFrom: purple
colorTo: indigo
sdk: static
pinned: false
---
# EdgeCompress

**EdgeCompress** is a graduation project developed by senior Computer Science students from **Capital University, Egypt**.

Our work focuses on **compressing Large Language Models (LLMs)** to make them efficient enough to run on **edge devices** with limited computational resources.

## Project Overview

Large Language Models typically require significant memory, storage, and computational power. This makes them difficult to deploy on edge hardware such as embedded systems, IoT devices, and low-power GPUs.

Our project explores different **model compression techniques** to reduce the size and resource requirements of LLMs while maintaining acceptable performance.
## Research Focus

We investigate multiple compression approaches, including:

* **Quantization**
* **Model pruning**
* **Knowledge distillation**
* **Low-precision inference**
* **Memory-efficient deployment strategies**
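As a concrete illustration of the first technique in the list, the sketch below applies symmetric 8-bit post-training quantization to a toy weight matrix. The tensor shape, scale choice, and helper names are illustrative, not taken from our actual models.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: store int8 values plus one float scale."""
    # Map the largest absolute weight to 127; guard against an all-zero tensor.
    scale = max(float(np.max(np.abs(weights))) / 127.0, 1e-12)
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 values and scale."""
    return q.astype(np.float32) * scale

# A toy 4x4 weight matrix stands in for one LLM layer.
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", float(np.abs(w - w_hat).max()))
```

Storing int8 values plus one scale cuts weight memory roughly 4x versus float32, at the cost of a per-element rounding error of at most half a quantization step.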
## Edge Deployment

After compression, the models are evaluated in **edge computing environments** to measure:

* Memory usage
* Inference latency
* Accuracy degradation relative to the uncompressed model
* Suitability for real-time edge AI applications
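The first two metrics can be collected with nothing beyond the Python standard library. A minimal sketch, in which `run_inference` is a hypothetical stand-in for a compressed model's forward pass:

```python
import time
import tracemalloc

def benchmark(run_inference, n_warmup=2, n_runs=10):
    """Return (average latency in seconds, peak traced allocation in bytes)."""
    for _ in range(n_warmup):
        run_inference()  # warm-up runs are excluded from timing
    tracemalloc.start()  # note: tracemalloc only tracks Python-level allocations
    times = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        run_inference()
        times.append(time.perf_counter() - t0)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return sum(times) / len(times), peak

# Toy workload standing in for a model forward pass.
avg_s, peak_bytes = benchmark(lambda: sum(i * i for i in range(100_000)))
print(f"avg latency: {avg_s * 1e3:.2f} ms, peak alloc: {peak_bytes / 1024:.1f} KiB")
```

For GPU-backed models the timing loop would additionally need a device synchronization before each `perf_counter` read, and device memory would be measured with the framework's own tools rather than `tracemalloc`.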
## What You Will Find Here

This organization hosts:

* Compressed LLM checkpoints
* Experiments with different compression techniques
* Benchmark results on edge hardware
* Research artifacts from our graduation project
## Goal

Our goal is to **enable efficient deployment of LLMs on edge devices**, making advanced AI models more accessible in real-world and resource-constrained environments.