Pruna OSS is turning 1! To mark this milestone, we're launching the First Prune initiative.
What is First Prune? Contribute to open issues in our GitHub repo and earn Pruna Inference API credits.
How you can participate:
• Pick an open issue labelled "first-prune" and assign it to yourself
• Submit your PR and mark it ready for review by April 30
• Find out more in the PR template when you open a PR
More OSS than ever with the latest pruna 0.3.2 release. It extends existing algorithm families, such as compilers, kernels, and pruners, and adds new ones, including decoders, distillers, enhancers, and recoverers. It is more than a collection of algorithms: you can easily combine them to get the biggest efficiency win.
Announcing RealPerformance, a dataset of functional issues in language models that mirrors failure patterns identified through rigorous testing of real LLM agents.
How can you learn about efficient AI? We're happy to announce the Awesome AI Efficiency repo, a curated list of 100+ materials for understanding the challenges and solutions in making AI faster, smaller, cheaper, and greener.
• It is designed for a **large audience**, including beginners, decision-makers, engineers, and researchers.
• It contains **diverse materials**, including newspaper articles, blogs, tools, tech reports, research papers, books, and lectures.
This is an ongoing project. Do not hesitate to share your feedback/suggestions and star the repo!
🔥 Announcing FLUX-Juiced: The Fastest Image Generation Endpoint (2.6x faster)!
Optimisations are widely applied and can reduce inference time, but their impact on quality often remains unclear. So we decided to challenge the status quo and create our own optimised version of FLUX.1[dev], called FLUX-Juiced.