
🔄 In a Training Loop

Yasunori Ozaki PRO

alfredplpl


https://alfredplpl.github.io/en/index.html

AI & ML interests

Computer Vision, LLM

Recent Activity

liked a model 2 days ago

llm-jp/llm-jp-4-8b-thinking-gguf

liked a model 8 days ago

zai-org/GLM-5.2

liked a model 12 days ago

google/diffusiongemma-26B-A4B-it



upvoted a paper 27 days ago

MONET: A Massive, Open, Non-redundant and Enriched Text-to-image dataset

Paper • 2605.21272 • Published May 20 • 4

upvoted a collection 27 days ago

MONET - Massive Open Non-redundant, Enriched, Text-to-image

A curated, deduplicated and recaptioned open image–text dataset of 104.9M samples released under the Apache 2.0 licence. https://huggingface.co/blog/jasperai/ • 4 items • Updated 29 days ago • 11

upvoted 3 collections about 1 month ago

Bonsai Image

6 items • Updated 22 days ago • 87

Jagle

Jagle: Building a Large-Scale Japanese Multimodal Post-Training Dataset for Vision–Language Models • 5 items • Updated Apr 12 • 2

MobileCLIP2

MobileCLIP2: Mobile-friendly image-text models with SOTA zero-shot capabilities trained on DFNDR-2B • 30 items • Updated Apr 23 • 64

upvoted 3 papers about 1 month ago

L2P: Unlocking Latent Potential for Pixel Generation

Paper • 2605.12013 • Published May 12 • 36

Asymmetric Flow Models

Paper • 2605.12964 • Published May 13 • 22

Qwen-Image-VAE-2.0 Technical Report

Paper • 2605.13565 • Published May 13 • 62

upvoted 4 papers about 2 months ago

Qwen-Image-2.0 Technical Report

Paper • 2605.10730 • Published May 11 • 114

STARFlow2: Bridging Language Models and Normalizing Flows for Unified Multimodal Generation

Paper • 2605.08029 • Published May 8 • 12

Continuous-Time Distribution Matching for Few-Step Diffusion Distillation

Paper • 2605.06376 • Published May 7 • 27

Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published May 7 • 84

upvoted 3 collections about 2 months ago

SenseNova-U1

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 10 items • Updated 14 days ago • 74

GenLIP

Model weights of paper "Let ViT Speak: Generative Language-Image Pre-training" • 6 items • Updated May 5 • 8

imabari-dialect-models

Imabari dialect models • 6 items • Updated Apr 23 • 2

upvoted a paper about 2 months ago

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published Apr 27 • 119

upvoted a collection about 2 months ago

MiMo-V2.5

4 items • Updated Apr 27 • 90

upvoted a paper 2 months ago

AVControl: Efficient Framework for Training Audio-Visual Controls

Paper • 2603.24793 • Published Mar 25 • 30

upvoted 2 collections 2 months ago

MiDashengLM-7B-1021

4 items • Updated Oct 27, 2025 • 2

DeepSeek-V4

4 items • Updated Apr 24 • 695