huggingPartyParis

community

https://partiful.com/e/oWOMGoPxB5D37qw5F8yN

AI & ML interests

None defined yet.

Recent Activity

KhalilGuetari authored a paper 24 days ago

PEEK: Picking Essential frames via Efficient Knowledge distillation

Prabhjotschugh authored a paper about 2 months ago

When Less Is More: Simplicity Beats Complexity for Physics-Constrained InSAR Phase Unwrapping

arcanoXIII authored a paper about 2 months ago

Modulate-and-Map: Crossmodal Feature Mapping with Cross-View Modulation for 3D Anomaly Detection

nielsr

submitted a paper to Daily Papers 6 days ago

Duration Aware Scheduling for ASR Serving Under Workload Drift

Paper • 2603.11273 • Published Mar 11 • 3

thibautloiseau

authored 2 papers 17 days ago

Alligat0R: Pre-Training Through Co-Visibility Segmentation for Relative Camera Pose Regression

Paper • 2503.07561 • Published Mar 10, 2025 • 2

RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges

Paper • 2502.19955 • Published Feb 27, 2025 • 2

Lunor

authored a paper 21 days ago

MV-Fashion: Towards Enabling Virtual Try-On and Size Estimation with Multi-View Paired Data

Paper • 2603.08147 • Published Mar 9

nielsr

submitted a paper to Daily Papers 22 days ago

Ultralytics YOLO26: Unified Real-Time End-to-End Vision Models

Paper • 2606.03748 • Published 24 days ago • 15

lbourdois

posted an update 24 days ago

New blog post!
An introduction to a little-known but highly effective model reduction method: 𝗧𝗿𝗶𝗺𝗺𝗶𝗻𝗴✂️
We show how to reduce model size (we went up to 87.24% reduction) while preserving its performance.

We applied this technique to 16 different model families across several modalities to illustrate that it works on any architecture (as long as the embedding layer is the last one of the model) and on any modality involving text.
From these 16 families, we generated over 𝟱,𝟱𝟬𝟬 𝗺𝗼𝗻𝗼𝗹𝗶𝗻𝗴𝘂𝗮𝗹 𝗺𝗼𝗱𝗲𝗹𝘀 𝗶𝗻 𝟭𝟮𝟰 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲𝘀 🌍

Key takeaways from our experiments:
1️⃣ Trimming does not require a GPU. Our models were obtained on a CPU.
2️⃣ This method scales up to at least 4B parameters (we did not test beyond that).
3️⃣ A trimmed model is smaller than the original while preserving its performance. If you observe a slight performance drop, just fine-tune to recover or even surpass the original performance.
4️⃣ For an equivalent compute budget, it is better to trim and then fine-tune than to fine-tune the original model. Since the model is smaller, you can run more epochs or show it more data and ultimately get a better model than the original.
5️⃣ Trimming is a competitive alternative to distillation and quantization. E.g. we obtained our alternative to DistilBERT in 9 minutes on CPU vs. 90 hours of GPU for the latter.
6️⃣ Trimming could generate reasoning traces in the language of the trimmed model. This could be an alternative to generating traces in English and then translating them into the desired language.
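
To make the idea concrete, here is a minimal sketch in plain Python. It assumes that trimming means keeping only the embedding rows of tokens that actually occur in a target-language corpus, which the post itself does not spell out (the blog post is the authoritative description). The function `trim_embeddings`, its parameters, and the toy vocabulary are hypothetical names used for illustration only, not the authors' code.

```python
def trim_embeddings(embeddings, vocab, corpus_tokens, always_keep=()):
    """Keep only the embedding rows of tokens seen in the corpus.

    Assumption (not from the post): trimming = dropping vocabulary
    entries, and their embedding rows, that the target language
    never uses.

    embeddings: list of rows, one per token id
    vocab: dict mapping token -> original id
    corpus_tokens: iterable of tokens from the target-language corpus
    always_keep: tokens retained even if unseen (e.g. special tokens)
    """
    wanted = set(corpus_tokens) | set(always_keep)
    # Original ids to keep, in their original order.
    keep_ids = sorted(vocab[tok] for tok in wanted if tok in vocab)
    id_to_token = {idx: tok for tok, idx in vocab.items()}
    # Re-index the surviving tokens contiguously from 0.
    new_vocab = {id_to_token[old]: new for new, old in enumerate(keep_ids)}
    new_embeddings = [embeddings[old] for old in keep_ids]
    return new_embeddings, new_vocab


# Toy example: a 6-token bilingual vocabulary trimmed to French.
vocab = {"[UNK]": 0, "le": 1, "chat": 2, "the": 3, "cat": 4, "dort": 5}
embeddings = [[float(i), float(i) + 0.5] for i in range(len(vocab))]
corpus = ["le", "chat", "dort", "le"]

new_embeddings, new_vocab = trim_embeddings(
    embeddings, vocab, corpus, always_keep=("[UNK]",)
)
# Fraction of embedding rows removed by trimming.
reduction = 1 - len(new_embeddings) / len(embeddings)
```

Under this assumption no gradient step is involved, only row selection and re-indexing, which is consistent with the point that trimming runs on a CPU.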

Many other topics (such as how much data is needed, the impact of the dataset used, the order in which the steps should be done, etc.) are covered in the blogpost!

Blogpost: https://huggingface.co/blog/lbourdois/introduction-to-trimming
Models: alphaedge-ai/Trimming_models_search

nielsr

submitted a paper to Daily Papers 29 days ago

Gemini Embedding 2: A Native Multimodal Embedding Model from Gemini

Paper • 2605.27295 • Published about 1 month ago • 23

nielsr

submitted a paper to Daily Papers about 1 month ago

Stable Audio 3

Paper • 2605.17991 • Published May 18 • 20

Aurelien-Morgan

posted an update about 2 months ago

@retrain-pipelines v0.2.0 is out!
I'm at Station F at my booth with GOSIM Paris 2026 today & tomorrow.
Come meet me for a live in-person demo and a chat!