Doonari OKuno

Excited to open-source the VisDrone Aerial Object Detection Model Zoo on Hugging Face.

The collection includes multiple YOLO variants trained and evaluated on the VisDrone benchmark for aerial object detection, with accompanying documentation and performance metrics.

If you're working on drones, aerial surveillance, robotics, or small-object detection, I hope these models save you some time.

Model Zoo: https://huggingface.co/collections/dronefreak/visdrone-detection-model-zoo

Feedback, issues, and contributions are welcome.

12 replies

liked a model 12 days ago

MiniMaxAI/MiniMax-M3

Image-Text-to-Text • 427B • Updated 1 day ago • 201k • 1.27k

reacted to Jaward's post with 🔥 13 days ago

Post

9135

Our preprint is out!
We attempt to model human teaching behaviors in agents, yielding a unified framework that enables adaptive, personalized learning experiences.
LectūraAgents addresses prevailing limitations in current AI learning systems with three essential capabilities:
(1) A hierarchical multi-agent architecture modeled on academic standards. We observe that agents collaborating across hierarchies yield better personalized learning outcomes.
(2) An adaptive embodied teaching mechanism, in which the instructor agent executes visible, pedagogically motivated teaching actions (e.g., handwriting, highlighting, circling) on content in a teaching environment while speaking.
(3) A novel teaching action-speech alignment algorithm (TASA) that makes this possible by dynamically aligning speech with visual teaching actions. Specifically, TASA splits speech segments into word-level tokens, runs a salience heuristic analysis on the learning content (text, images, etc.), and then identifies the relevant regions where pedagogical teaching actions should be applied to guide attention and aid understanding.
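
As a rough illustration of the alignment idea in (3), and not the paper's actual TASA implementation, the sketch below assumes word-level timestamps are already available and uses plain keyword matching as a stand-in for the salience heuristics. All names here (`Word`, `Region`, `align_actions`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the speech segment
    end: float


@dataclass
class Region:
    name: str        # hypothetical identifier of a content region
    keywords: set    # stand-in for a salience score: words that point at this region
    action: str      # e.g. "highlight" or "circle"


def align_actions(words, regions):
    """Schedule each region's action at the first spoken word matching its keywords."""
    schedule = []
    used = set()
    for w in words:
        token = w.text.lower().strip(".,;:!?")
        for r in regions:
            if r.name not in used and token in r.keywords:
                schedule.append((w.start, r.action, r.name))
                used.add(r.name)
    return schedule


words = [
    Word("The", 0.0, 0.2),
    Word("gradient", 0.2, 0.7),
    Word("points", 0.7, 1.0),
    Word("uphill.", 1.0, 1.5),
]
regions = [
    Region("eq1", {"gradient"}, "highlight"),
    Region("fig1", {"uphill"}, "circle"),
]
schedule = align_actions(words, regions)
```

Each entry in `schedule` is a `(time, action, region)` tuple, so a renderer could trigger the visual action at the moment the matching word is spoken.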

We conducted several experiments to assess these capabilities: a pedagogical evaluation of the various components under frontier models, a comparative analysis with existing frameworks, and an efficacy study with real students.

Results show consistent gains in standard instructional metrics (curated by expert educators) spanning lecture content quality, embodied teaching quality, assessment, and personalization over baseline systems, positioning LectūraAgents as a pedagogically grounded framework for personalized learning at scale.

Paper: LectūraAgents: A Multi-Agent Framework for Adaptive Personalized AI-Assisted Learning and Embodied Teaching (2606.16428)
Data: Jaward/lectura-agents-data

1 reply

liked a model 28 days ago

openai/gpt-oss-safeguard-20b

Text Generation • 22B • Updated Jan 14 • 148k • 236

reacted to RiverRider's post with 🔥 28 days ago

Post

2819

This is not the end of words. It is the end of pretending their meanings are determined.

Meaning Forks. SRT detects it.

Paste any text to identify contested terms

RiverRider/srt-introspect

Try any prompt (attached link) to see exactly what an LLM is thinking at every meaningful step of its answer

RiverRider/srt-introspect

Repository

https://github.com/space-bacon/SRT

Paper

https://github.com/space-bacon/SRT/blob/main/paper_nla.md

Explainer

https://github.com/space-bacon/SRT/blob/main/docs/EXPLAINERS.md

liked a dataset 29 days ago

sahil2801/CodeAlpaca-20k

Viewer • Updated Oct 3, 2023 • 20k • 19.8k • 236

reacted to lbourdois's post with 🤗 29 days ago

Post

1009

New blog post!
An introduction to a little-known but highly effective model reduction method: 𝗧𝗿𝗶𝗺𝗺𝗶𝗻𝗴✂️
We show how to reduce model size (we went up to 87.24% reduction) while preserving its performance.

We applied this technique to 16 different model families across several modalities to illustrate that it works on any architecture (as long as the embedding layer is the last one of the model) and on any modality involving text.
From these 16 families, we generated over 𝟱,𝟱𝟬𝟬 𝗺𝗼𝗻𝗼𝗹𝗶𝗻𝗴𝘂𝗮𝗹 𝗺𝗼𝗱𝗲𝗹𝘀 𝗶𝗻 𝟭𝟮𝟰 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝘁 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲𝘀 🌍

Key takeaways from our experiments:
1️⃣ Trimming does not require a GPU. Our models were obtained on a CPU.
2️⃣ This method scales up to at least 4B parameters (we did not test beyond that).
3️⃣ A trimmed model is smaller than the original while preserving its performance. If you observe a slight performance drop, just fine-tune it to recover or even surpass the original performance.
4️⃣ For an equivalent compute budget, it is better to trim and then fine-tune than to fine-tune the original model. Since the model is smaller, you can run more epochs or show it more data, and ultimately get a better model than the original.
5️⃣ Trimming is a competitive alternative to distillation and quantization. E.g. we obtained our alternative to DistilBERT in 9 minutes on CPU vs. 90 hours of GPU for the latter.
6️⃣ Trimming could generate reasoning traces in the language of the trimmed model. This could be an alternative to generating traces in English and then translating them into the desired language.
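
The post does not spell out the mechanism, so the following is only a minimal sketch under the assumption that trimming means dropping embedding rows for tokens a target language never uses and remapping the remaining token IDs. The function name `trim_vocab`, the toy vocabulary, and the special tokens are all hypothetical; see the blog post for the real procedure.

```python
def trim_vocab(vocab, embeddings, corpus_tokens, always_keep=("<pad>", "<unk>")):
    """Keep only embedding rows for tokens seen in the corpus (plus special tokens)."""
    seen = set(corpus_tokens) | set(always_keep)
    # Preserve the original ordering so special tokens stay at the front.
    kept_ids = [i for i, tok in enumerate(vocab) if tok in seen]
    new_vocab = [vocab[i] for i in kept_ids]
    new_embeddings = [embeddings[i] for i in kept_ids]
    # Map old token IDs to their new, compacted positions.
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    return new_vocab, new_embeddings, old_to_new


# Toy multilingual vocabulary; list index doubles as the token ID.
vocab = ["<pad>", "<unk>", "the", "cat", "le", "chat", "der", "Hund"]
embeddings = [[float(i), float(i) + 0.5] for i in range(len(vocab))]

# Tokens observed in a (tiny) French corpus.
french_tokens = ["le", "chat", "le"]

new_vocab, new_embeddings, old_to_new = trim_vocab(vocab, embeddings, french_tokens)

# Fraction of embedding rows removed (embedding table only, not the whole model).
reduction = 1 - len(new_embeddings) / len(embeddings)
```

This only touches lookup tables and needs no gradient computation, which is consistent with the post's note that the models were obtained on a CPU. In this toy case `reduction` covers the embedding table alone; the whole-model figure depends on how much of the model the embeddings account for.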

And many other things (such as how much data are needed, the impact of the database used, the order in which it should be done, etc.) are available in the blogpost!

Blogpost: https://huggingface.co/blog/lbourdois/introduction-to-trimming
Models: alphaedge-ai/Trimming_models_search

4 replies

upvoted a changelog 29 days ago

Hugging Face Changelog

Filter Models page by Base Models only

May 28

• 173

liked a dataset about 2 months ago

poloclub/diffusiondb

Updated Jan 22, 2024 • 8.4k • 628

liked a model about 2 months ago

openbmb/MiniCPM-V-4.6

Image-Text-to-Text • 1B • Updated about 22 hours ago • 880k • 1.13k

replied to PhysiQuanty's post about 2 months ago

This is an absolutely out-of-the-box model.

reacted to PhysiQuanty's post with 🚀 about 2 months ago

Post

5151

❗ Dating apps do not allow us to control the profiles suggested to us based on our mutual search criteria ❗
🧬 If you want to see whether your soulmate already exists, I have published a dataset of 59k anonymized public profiles

SpiceeChat/OkCupid-59k-Anonymized-Profiles

Are you looking for a female ML engineer who is looking for a male ML engineer, and you can't find her on the apps?
You need to look for her, but more importantly, she needs to look for you.
Personally, I'm looking for a physicist and I'm running into the same problem: I can't find one.
My answer: the paradox of choice in dating apps, solved by patent ⚡ WO2026082672 ⚡
https://patentscope.wipo.int/search/en/detail.jsf?docId=WO2026082672

I had to file a patent to find you, and we will find each other soon!

9 replies

liked a model about 2 months ago

baidu/ERNIE-Image

Text-to-Image • Updated Apr 17 • 77.8k • 661

reacted to cihatyldz's post with 👍 about 2 months ago

Post

3601

Şifahane, a dual-inference medical classification demo, is now live on Spaces. It features side-by-side Turkish BERT and Qwen2.5 architectures for real-time evaluation of the "Classifier vs. LLM" trade-offs, all within a single space. The system utilizes a fine-tuned Turkish BERT for high-speed, cost-effective inference and the Qwen2.5-7B model for flexible multi-task reasoning, with support for department classification, condition analysis, urgency assessment, and rationale generation across 12 medical departments.

🧠 BERT model: https://lnkd.in/dCUUASqq
📊 Dataset: https://lnkd.in/dGK9y24w
🤗 Demo: https://lnkd.in/dtWjCCPF

reacted to Crownelius's post with 🔥 2 months ago

Post

5959

My Huggingface journey has been a trip!
I wanted to take the time to thank each and every one of you for using my dataset and getting it to go as far as it did. Believe it or not, some neanderthal was and maybe still is trending on huggingface.

Not only did my dataset reach number one, my fine-tuned qwen3.5 model did as well. Top 10. Honestly, ain't much left to do here.

Y'all have given me the desire, no... the craving for more. I am absolutely obsessed with AI now. I want to tweak it... I want to take it apart, just to see what makes everything tick. I want to put it together like Frankenstein and his monster.

The only thing that's stopping this guy is compute. I don't mind spending every penny I have on this. I desperately want to drive AI forward, even just a little bit.

I never knew the clanker hater from a year ago would be saying this.

Thank you all from the bottom of my heart.

Looking forward to showing you what I'm cooking up next. @CompactAI is your only hint!

3 replies

reacted to prithivMLmods's post with 🔥 4 months ago

Post

3160

Introducing QIE-Bbox-Studio! 🔥🤗

The QIE-Bbox-Studio demo is now live — more precise and packed with more options. Users can manipulate images with object removal, design addition, and even move objects from one place to another, all in just 4-step fast inference.

🤗 Demo: prithivMLmods/QIE-Bbox-Studio
🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/QIE-Bbox-Studio

🚀 Models [LoRA] :

● QIE-2511-Object-Mover-Bbox: prithivMLmods/QIE-2511-Object-Mover-Bbox
● QIE-2511-Object-Remover-Bbox-v3: prithivMLmods/QIE-2511-Object-Remover-Bbox-v3
● QIE-2511-Outfit-Design-Layout: prithivMLmods/QIE-2511-Outfit-Design-Layout
● QIE-2509-Object-Remover-Bbox-v3: prithivMLmods/QIE-2509-Object-Remover-Bbox-v3
● QIE-2509-Object-Mover-Bbox: prithivMLmods/QIE-2509-Object-Mover-Bbox

🚀 Collection:

● Qwen Image Edit [Layout Bbox]: https://huggingface.co/collections/prithivMLmods/qwen-image-edit-layout-bbox

To learn more, visit the app page or the respective model pages.

reacted to hannayukhymenko's post with 🔥 4 months ago

Post

2161

Do you translate your benchmarks from English correctly? 🤔
Turns out, for many languages it is much harder than you can imagine!

Introducing Recovered in Translation 🌍 together with @aalexandrov
https://ritranslation.insait.ai

Translating benchmarks is a painful process, requiring a lot of manual inspection and adjustments. You start from setting up the whole pipeline and adapting to every format type, including task specifics. There already exist some massive benchmarks, but they still have some simple (and sometimes silly) bugs, which can hurt the evaluations :( We present a novel automated translation framework to help with that!

Eastern and Southern European languages have richer linguistic structures than English, and for benchmarks that rely heavily on grammatical coherence, machine translation risks harming evaluations. We found cases where the grammatical structure of a question could leak the answer or mislead the model. Some benchmarks are also simply outdated and need to be retranslated with newer and better models.

We present a framework with novel test-time scaling methods that let you control time and cost investments while reducing the need for human-in-the-loop verification. While working on the Ukrainian-focused MamayLM models, we had to translate 10+ benchmarks in a short span of time. Finding human evaluators is costly and time-consuming, and the same goes for using professional translators. With our pipeline we were able to do it in 3 days🏎️

We hope our findings will help enable stronger multilingual evaluations and developments. We release all produced benchmarks on Hugging Face together with the source code and arXiv paper 🤗

Paper: Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets (2602.22207)
Code: https://github.com/insait-institute/ritranslation
Benchmarks: https://huggingface.co/collections/INSAIT-Institute/multilingual-benchmarks

1 reply

liked 2 models 5 months ago

dslim/bert-base-NER

Token Classification • 0.1B • Updated Oct 8, 2024 • 1.88M • 721

facebook/nllb-200-distilled-600M

Translation • Updated Feb 14, 2024 • 1.69M • 927
