Pruna AI

Team

company

https://www.pruna.ai/

PrunaAI

AI & ML interests

Efficient machine learning for any model and hardware: pruning, quantization, compilation, and more.

Recent Activity

sharpenb published a model 7 days ago

PrunaAI/WeiboAI-VibeThinker-3B-HQQ-8bit-smashed

sharpenb updated a model 7 days ago

PrunaAI/WeiboAI-VibeThinker-3B-HQQ-8bit-smashed

sharpenb published a model 7 days ago

PrunaAI/WeiboAI-VibeThinker-3B-HQQ-4bit-smashed

Articles

🔥 Announcing FLUX-Juiced: The Fastest Image Generation Endpoint (2.6 times faster)!

Apr 23, 2025

• 13

An Introduction to AI Model Optimization Techniques

Apr 18, 2025

• 30

Optimise AI Models and Make Them Faster, Smaller, Cheaper, Greener

Apr 4, 2025

• 18

sharpenb

published a model 7 days ago

PrunaAI/WeiboAI-VibeThinker-3B-HQQ-8bit-smashed

Updated 7 days ago • 17

sharpenb

updated a model 7 days ago

PrunaAI/WeiboAI-VibeThinker-3B-HQQ-8bit-smashed

Updated 7 days ago • 17

sharpenb

published a model 7 days ago

PrunaAI/WeiboAI-VibeThinker-3B-HQQ-4bit-smashed

Updated 7 days ago • 19 • 1

sharpenb

updated a model 7 days ago

PrunaAI/WeiboAI-VibeThinker-3B-HQQ-4bit-smashed

Updated 7 days ago • 19 • 1

sharpenb

published a model 29 days ago

PrunaAI/openbmb-MiniCPM5-1B-HQQ-8bit-smashed

Updated 29 days ago • 26 • 1

sharpenb

updated a model 29 days ago

PrunaAI/openbmb-MiniCPM5-1B-HQQ-8bit-smashed

Updated 29 days ago • 26 • 1

sharpenb

published a model 29 days ago

PrunaAI/openbmb-MiniCPM5-1B-HQQ-4bit-smashed

Updated 29 days ago • 25

sharpenb

updated a model 29 days ago

PrunaAI/openbmb-MiniCPM5-1B-HQQ-4bit-smashed

Updated 29 days ago • 25

sharpenb

published a model about 1 month ago

PrunaAI/NexaAI-octo-net-HQQ-8bit-smashed

Updated May 19 • 3

sdiazlor

posted an update 2 months ago

Post

164

First Prune, our celebration of the one-year anniversary of Pruna OSS, is halfway through.

We’re sharing a recap blog post about our OSS journey: how we started, what we’ve built so far, and what’s next.

Read it here: https://dev.to/pruna-ai/first-prune-celebrate-one-year-of-pruna-oss-50gp

sdiazlor

posted an update 3 months ago

Post

108

Pruna OSS is turning 1! To mark this milestone, we're launching the First Prune initiative.

What is First Prune?
If you contribute to open issues on our GitHub repo, you earn Pruna Inference API credits.

How you can participate:
• Pick an open issue labelled "first-prune" and assign it to yourself
• Submit your PR and mark it ready for review by April 30
• Find out more in the PR template when you open a PR

Each merged PR earns 30 credits.

Let’s build something great together! Find your issue: https://github.com/PrunaAI/pruna/issues

sdiazlor

posted an update 4 months ago

Post

2611

The latest pruna 0.3.2 release is more open source than ever. It extends existing algorithm families, such as compilers, kernels, and pruners, and adds new ones, including decoders, distillers, enhancers, and recoverers. It is more than a collection of algorithms: you can easily combine them to get the biggest efficiency win.

Read the full blog here: https://huggingface.co/blog/PrunaAI/pruna-0-3-2-open-source-optimization-algorithms
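To make the idea of combining algorithms concrete, here is a minimal sketch in plain Python. It is not pruna's API: the helper names `magnitude_prune`, `uniform_quantize`, and `combine` are hypothetical, and the "model" is just a list of floats. It only illustrates how a pruning step and a quantization step can be chained into one optimization pipeline.

```python
from typing import Callable, List

Weights = List[float]
Step = Callable[[Weights], Weights]


def magnitude_prune(sparsity: float) -> Step:
    """Hypothetical helper: zero the smallest-magnitude fraction of weights."""
    def step(weights: Weights) -> Weights:
        n_drop = int(len(weights) * sparsity)
        # Indices sorted by |w|; the first n_drop are the ones to zero out.
        order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
        dropped = set(order[:n_drop])
        return [0.0 if i in dropped else w for i, w in enumerate(weights)]
    return step


def uniform_quantize(bits: int) -> Step:
    """Hypothetical helper: round weights to a symmetric signed integer grid."""
    def step(weights: Weights) -> Weights:
        levels = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
        max_abs = max((abs(w) for w in weights), default=0.0)
        if max_abs == 0.0:
            return list(weights)
        scale = max_abs / levels
        # Quantize, then dequantize, so the output is comparable to the input.
        return [round(w / scale) * scale for w in weights]
    return step


def combine(*steps: Step) -> Step:
    """Chain optimization steps left to right into a single step."""
    def pipeline(weights: Weights) -> Weights:
        for s in steps:
            weights = s(weights)
        return weights
    return pipeline


# Toy weights standing in for a model (an assumption for illustration only).
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
optimize = combine(magnitude_prune(sparsity=0.5), uniform_quantize(bits=4))
optimized = optimize(weights)
```

With these toy weights, pruning zeroes the three smallest-magnitude entries and quantization snaps the remaining ones to multiples of 0.9 / 7. The order of the steps matters: pruning first means the quantization scale is computed only from the weights that survive.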

davidberenstein1957

posted an update 6 months ago

Post

2797

🚨 Phare LLM benchmark V2: Reasoning models don't guarantee better security

Read the full blog here: https://huggingface.co/blog/davidberenstein1957/phare-llm-benchmark-v2

davidberenstein1957

posted an update 11 months ago

Post

630

Announcing RealPerformance, a dataset of functional issues of language models that mirrors failure patterns identified through rigorous testing in real LLM agents

https://huggingface.co/blog/davidberenstein1957/realperformance-llm-business-compliance

davidberenstein1957

posted an update 12 months ago

Post

419

🚨 LLMs recognise bias but also reproduce harmful stereotypes: an analysis of bias in leading LLMs

I've written a new entry in our series on the Giskard, BPIFrance and Google DeepMind Phare benchmark (phare.giskard.ai).

This time it covers bias: https://huggingface.co/blog/davidberenstein1957/llms-recognise-bias-but-also-produce-stereotypes

Previous entry on hallucinations: https://huggingface.co/blog/davidberenstein1957/phare-analysis-of-hallucination-in-leading-llms

1 reply

davidberenstein1957

posted an update about 1 year ago

Post

1778

I created a collection of FLUX.1 models, but 4x faster: PrunaAI/flux1-but-4x-faster-66c0b7340836dd7a55e9c0ea

davidberenstein1957

posted an update about 1 year ago

Post

1105

Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs

https://huggingface.co/blog/davidberenstein1957/phare-analysis-of-hallucination-in-leading-llms

sharpenb

posted an update about 1 year ago

Post

3214

How can you learn about efficient AI? I'm happy to announce the Awesome AI Efficiency repo, a curated list of 100+ materials for understanding the challenges and solutions in making AI faster, smaller, cheaper, and greener.

🚀 It is designed for a **large audience**, including beginners, decision-makers, engineers, and researchers.
📚 It contains **diverse materials**, including newspaper articles, blogs, tools, tech reports, research papers, books, and lectures.

This is an ongoing project. Do not hesitate to share your feedback/suggestions and star the repo! 🌟

https://github.com/PrunaAI/awesome-ai-efficiency

2 replies

davidberenstein1957

posted an update about 1 year ago

Post

2287

🔥 Announcing FLUX-Juiced: The Fastest Image Generation Endpoint (2.6x faster)!

Optimisations are widely applied and can reduce inference time, but their impact on quality often remains unclear. So we decided to challenge the status quo and create our own optimised version of FLUX.1[dev], called FLUX-Juiced.

Blog: https://huggingface.co/blog/PrunaAI/flux-fastest-image-generation-endpoint

davidberenstein1957

posted an update about 1 year ago

Post

1751

🧑‍🏫 I wrote a brief blog post: An Introduction to AI Model Optimization Techniques!

URL: https://huggingface.co/blog/PrunaAI/introduction-to-ai-model-optimization-techniques

Articles

Pruna 0.3.2: More OSS Algos, More Ways to Optimize

LLM Architectures Explained: What Powers Today’s Top Models

Slashing torch.compile Warmup & LoRA Swapping Times with Pruna

SmolLM-Smashed: Tiny Giants, Optimized for Speed

AI Model Optimization More Flexible Than Ever

Effective Prompting for Generative Vision Models

Measuring What Matters: Objective Metrics for Image Generation Assessment

Faster ComfyUI Nodes for Flux and Stable Diffusion with Pruna

🔥 Announcing FLUX-Juiced: The Fastest Image Generation Endpoint (2.6 times faster)!

An Introduction to AI Model Optimization Techniques

Optimise AI Models and Make Them Faster, Smaller, Cheaper, Greener

Team members 18