Let me introduce you to our CVPR 2026 paper!
Today's content moderation systems give you a label: safe or unsafe. They don't tell you what triggered the decision, who is involved, or where in the image it happens. That opacity hurts auditing, breaks adaptation across platforms, and frustrates the human review that responsible deployment demands.
We built SenBen to fix this. It is the first large-scale scene graph benchmark designed specifically for sensitive content moderation:
- 13,999 annotated frames from 157 movies
- Visual Genome-style scene graphs with bounding boxes, attributes, and predicates (sketched right after this list)
- Affective state attributes (pain, fear, aggression, distress) so the model captures not just what is in the frame, but what it means
- 16 safety tags across 5 categories, the broadest taxonomy of any dataset of this kind
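Roughly, a single annotated frame looks like this. This is a simplified sketch: the field names and values here are illustrative, not the exact schema, so check the dataset card for the real format.

```python
# Simplified sketch of one SenBen frame annotation. Field names are
# illustrative; see the dataset card (fcakyon/senben) for the exact schema.
frame = {
    "frame_id": "movie_012_frame_00457",
    "objects": [
        # Visual Genome-style objects: label + bounding box + attributes,
        # where attributes include affective states such as "fear" or "pain".
        {"id": 0, "label": "man", "bbox": [118, 60, 305, 412], "attributes": ["adult", "aggression"]},
        {"id": 1, "label": "woman", "bbox": [340, 85, 510, 420], "attributes": ["fear", "distress"]},
        {"id": 2, "label": "knife", "bbox": [280, 198, 332, 251], "attributes": ["metallic"]},
    ],
    "relations": [
        # (subject, predicate, object) triplets ground who does what to whom.
        {"subject": 0, "predicate": "holding", "object": 2},
        {"subject": 0, "predicate": "threatening", "object": 1},
    ],
    # Frame-level safety tags drawn from the 16-tag, 5-category taxonomy.
    "safety_tags": ["weapon", "violence"],
}
```

The point is that every moderation decision is grounded: a tag like "violence" traces back to specific boxes, attributes, and predicates instead of an opaque safe/unsafe label.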
A small model that beats much bigger ones:
We distilled a frontier VLM into a compact 241M-parameter student built on Florence-2.
On grounded scene graph metrics, the student beats every commercial safety API and every evaluated VLM except Gemini. It also wins on object detection and captioning across the entire model zoo. It runs at 733 ms per frame in 1.2 GB of VRAM, 7.6x faster than the next-best local VLM, with zero per-frame API cost. The whole benchmark, from dataset creation through all baseline evaluations, is reproducible for under $350.
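If you want to poke at a Florence-2-style model on your own frames in the meantime, here is a minimal sketch using the public microsoft/Florence-2-base checkpoint as a stand-in (the fine-tuned student and any scene-graph-specific task prompt may differ):

```python
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Public base checkpoint as a stand-in for the 241M SenBen student.
model_id = "microsoft/Florence-2-base"
device = "cuda" if torch.cuda.is_available() else "cpu"

model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("frame.jpg").convert("RGB")  # one sampled movie frame

# "<OD>" is Florence-2's built-in object detection prompt; grounded boxes
# are the first ingredient of the scene graphs described above.
task = "<OD>"
inputs = processor(text=task, images=image, return_tensors="pt").to(device)

with torch.no_grad():
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )

raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
parsed = processor.post_process_generation(
    raw, task=task, image_size=(image.width, image.height)
)
print(parsed)  # {'<OD>': {'bboxes': [[x1, y1, x2, y2], ...], 'labels': [...]}}
```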
Project: https://senben.kim/
Paper: SenBen: Sensitive Scene Graphs for Explainable Content Moderation (2604.08819)
Dataset: fcakyon/senben
Code (soon): https://github.com/fcakyon/senben