oguzhanercan's Collections: Image Generation
Causal Diffusion Transformers for Generative Modeling
Paper
• 2412.12095
• Published • 23
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training
Paper
• 2412.09619
• Published • 30
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Paper
• 2412.07589
• Published • 48
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution
Paper
• 2412.15213
• Published • 28
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
Paper
• 2412.16112
• Published • 23
Parallelized Autoregressive Visual Generation
Paper
• 2412.15119
• Published • 53
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Paper
• 2501.07730
• Published • 18
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Paper
• 2501.18427
• Published • 25
Improved Training Technique for Latent Consistency Models
Paper
• 2502.01441
• Published • 8
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
Paper
• 2502.10458
• Published • 38
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper
• 2502.09509
• Published • 9
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Paper
• 2502.18302
• Published • 5
How far can we go with ImageNet for Text-to-Image generation?
Paper
• 2502.21318
• Published • 26
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
Paper
• 2503.02537
• Published • 12
Inductive Moment Matching
Paper
• 2503.07565
• Published • 6
Autoregressive Image Generation with Randomized Parallel Decoding
Paper
• 2503.10568
• Published • 9
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
Paper
• 2503.10618
• Published • 19
Neighboring Autoregressive Modeling for Efficient Visual Generation
Paper
• 2503.10696
• Published • 8
Paper
• 2503.16425
• Published • 16
Ultra-Resolution Adaptation with Ease
Paper
• 2503.16322
• Published • 13
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
Paper
• 2503.16660
• Published • 72
Equivariant Image Modeling
Paper
• 2503.18948
• Published • 15
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Paper
• 2503.18352
• Published • 6
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Paper
• 2503.23461
• Published • 94
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Paper
• 2504.06232
• Published • 13
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
Paper
• 2504.07960
• Published • 50
PixelFlow: Pixel-Space Generative Models with Flow
Paper
• 2504.07963
• Published • 18
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
Paper
• 2504.08736
• Published • 46
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Paper
• 2504.11455
• Published • 14
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers
Paper
• 2504.10483
• Published • 22
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
Paper
• 2505.00703
• Published • 44
End-to-End Vision Tokenizer Tuning
Paper
• 2505.10562
• Published • 22
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Paper
• 2506.06276
• Published • 26
Improving Progressive Generation with Decomposable Flow Matching
Paper
• 2506.19839
• Published • 8
Qwen-Image Technical Report
Paper
• 2508.02324
• Published • 274
PixNerd: Pixel Neural Field Diffusion
Paper
• 2507.23268
• Published • 52
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
Paper
• 2510.11712
• Published • 31
Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions
Paper
• 2511.06876
• Published • 28
FARMER: Flow AutoRegressive Transformer over Pixels
Paper
• 2510.23588
• Published • 59
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
Paper
• 2511.14993
• Published • 233
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
Paper
• 2511.19365
• Published • 66
Terminal Velocity Matching
Paper
• 2511.19797
• Published • 12
OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation
Paper
• 2511.20211
• Published • 12
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss
Paper
• 2602.02493
• Published • 46
Image Generation with a Sphere Encoder
Paper
• 2602.15030
• Published • 16
RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment
Paper
• 2603.00483
• Published • 3
DREAM: Where Visual Understanding Meets Text-to-Image Generation
Paper
• 2603.02667
• Published • 6
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens
Paper
• 2603.02138
• Published • 151