Today's content moderation systems give you a label: safe or unsafe. They don't tell you what triggered the decision, who is involved, or where in the image it happens. That opacity hurts auditing, breaks adaptation across platforms, and frustrates the human review that responsible deployment demands.
We built SenBen to fix this: the first large-scale scene graph benchmark designed specifically for sensitive content moderation:
- 13,999 annotated frames from 157 movies
- Visual Genome-style scene graphs with bounding boxes, attributes, and predicates (see the example record after this list)
- Affective state attributes (pain, fear, aggression, distress), so the model captures not just what is in the frame but what it means
- 16 safety tags across 5 categories, the broadest taxonomy of any dataset of this kind
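To make that concrete, here is a minimal sketch of what one annotated frame could look like. The field names, values, and layout are illustrative assumptions, not SenBen's actual schema:

```python
# Illustrative sketch of one SenBen-style annotation record.
# All field names and values are assumptions, not the dataset's real schema.
frame = {
    "movie_id": "m_0042",
    "frame_id": "f_001337",
    "objects": [
        {
            "id": 0,
            "label": "person",
            "bbox": [120, 64, 310, 420],         # [x1, y1, x2, y2] in pixels
            "attributes": ["kneeling", "fear"],  # includes affective states
        },
        {
            "id": 1,
            "label": "person",
            "bbox": [330, 50, 560, 440],
            "attributes": ["standing", "aggression"],
        },
    ],
    "relations": [
        # Visual Genome-style (subject, predicate, object) triples
        {"subject": 1, "predicate": "threatening", "object": 0},
    ],
    "safety_tags": ["violence", "intimidation"],  # drawn from the 16 tags / 5 categories
}
```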
A small model that beats much bigger ones:
We distilled a frontier VLM into a compact 241M-parameter student built on Florence-2.
On grounded scene graph metrics, the 241M student beats every evaluated VLM except Gemini, as well as every commercial safety API. It also wins on object detection and captioning across the entire model zoo. It runs at 733 ms per frame in 1.2 GB of VRAM, 7.6 times faster than the next-best local VLM, with zero per-frame cost. The whole benchmark, from dataset creation through all baseline evaluations, is reproducible for under $350.
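For scale, this is roughly what running a Florence-2-based model looks like with Hugging Face transformers. The SenBen student checkpoint isn't named here, so the public microsoft/Florence-2-base weights and its built-in `<OD>` (object detection) task stand in for the distilled model and its scene-graph prompt:

```python
# Minimal Florence-2 inference sketch, timing a single frame.
# "microsoft/Florence-2-base" stands in for the distilled SenBen student,
# and <OD> stands in for its actual scene-graph task prompt.
import time
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Florence-2-base"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("frame.jpg")
inputs = processor(text="<OD>", images=image, return_tensors="pt")

start = time.perf_counter()
with torch.no_grad():
    generated = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=512,
    )
elapsed = time.perf_counter() - start

raw = processor.batch_decode(generated, skip_special_tokens=False)[0]
parsed = processor.post_process_generation(raw, task="<OD>", image_size=image.size)
print(f"{elapsed * 1000:.0f} ms per frame:", parsed)  # boxes + labels
```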
deepseek-ai/DeepSeek-OCR is out! 🔥 My take ⤵️
> pretty insane that it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient vision-tokens-to-performance ratio
> covers ~100 languages
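On the "CLIP and SAM features concatenated" point, the general idea looks like the sketch below. This is the conceptual pattern, not DeepSeek-OCR's actual implementation, and the dimensions are made up:

```python
# Conceptual sketch of a dual vision encoder, not DeepSeek-OCR's code.
# The point: concatenate tokens from a semantic encoder (CLIP-style) and a
# segmentation encoder (SAM-style) channel-wise before projecting into the
# language model's embedding space.
import torch
import torch.nn as nn

class DualVisionProjector(nn.Module):
    def __init__(self, clip_dim: int = 1024, sam_dim: int = 256, lm_dim: int = 2048):
        super().__init__()
        self.proj = nn.Linear(clip_dim + sam_dim, lm_dim)

    def forward(self, clip_tokens: torch.Tensor, sam_tokens: torch.Tensor) -> torch.Tensor:
        # Both inputs: (batch, num_tokens, dim). CLIP carries semantics,
        # SAM carries fine spatial structure; concatenation keeps both.
        fused = torch.cat([clip_tokens, sam_tokens], dim=-1)
        return self.proj(fused)

# Usage with dummy features
out = DualVisionProjector()(torch.randn(1, 256, 1024), torch.randn(1, 256, 256))
print(out.shape)  # torch.Size([1, 256, 2048])
```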
IBM just released a small Swiss Army knife for document models: granite-docling-258M on Hugging Face 🔥
> not just a document converter: it can also do document question answering and understands multiple languages 🤯
> best part: released under an Apache 2.0 license 👏 use it in your commercial projects!
> it supports transformers, vLLM, and MLX from the get-go! 🤗 (see the usage sketch below)
> built on SigLIP2 & granite-165M
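Here is a rough transformers usage sketch, assuming the model follows the standard image-text-to-text interface. The repo id and the prompt wording are assumptions based on the announcement, so check the model card before relying on them:

```python
# Hedged sketch of running granite-docling via transformers' generic
# image-text-to-text interface; repo id and prompt text are assumptions.
from PIL import Image
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "ibm-granite/granite-docling-258M"  # assumed repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id)

messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": "Convert this page to docling."},
    ],
}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[Image.open("page.png")], return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=1024)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```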
- You can train a model in a language it has never seen by starting from the pretrained (PT) model; there's no need for large datasets.
- With the PT model, you can easily replicate the voice of any character you want. Just 1k samples are enough (see the staging sketch after this list).
- You can add emotion support with a small dataset.
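The post doesn't name the toolkit, but for most TTS fine-tuning stacks those ~1k voice samples boil down to (audio, transcript) pairs. Here is a hedged, LJSpeech-style staging sketch; the paths and layout are illustrative, not tied to any specific trainer:

```python
# Hedged sketch: stage ~1k voice-cloning samples as LJSpeech-style metadata
# ("file_id|transcript" per line), a format many TTS fine-tuning stacks accept.
# Directory layout and filenames are illustrative assumptions.
import csv
from pathlib import Path

wav_dir = Path("my_voice/wavs")  # one .wav per clip, matching .txt transcript
with open("my_voice/metadata.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="|")
    for wav in sorted(wav_dir.glob("*.wav")):
        transcript = wav.with_suffix(".txt").read_text(encoding="utf-8").strip()
        writer.writerow([wav.stem, transcript])
```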