AI & ML interests

Hardware-aware AI Model Optimization

nota-ai's collections (4)

Efficient MoE-based LLM
Mixture-of-Experts Large Language Models with Advanced Quantization
Efficient Large Language Model
Shortened LLMs from Depth Pruning: https://github.com/Nota-NetsPresso/shortened-llm