Read But Not Implemented - an ethananhtran Collection

updated 30 days ago

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published Dec 18, 2025 • 95
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 238
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Paper • 2512.16676 • Published Dec 18, 2025 • 219
Sharp Monocular View Synthesis in Less Than a Second

Paper • 2512.10685 • Published Dec 11, 2025 • 28
Latent Implicit Visual Reasoning

Paper • 2512.21218 • Published Dec 24, 2025 • 69
SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published Dec 23, 2025 • 93
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published Dec 4, 2025 • 171
Spatia: Video Generation with Updatable Spatial Memory

Paper • 2512.15716 • Published Dec 17, 2025 • 33
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Paper • 2512.19693 • Published Dec 22, 2025 • 66
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation

Paper • 2511.14993 • Published Nov 19, 2025 • 231
PersonaLive! Expressive Portrait Image Animation for Live Streaming

Paper • 2512.11253 • Published Dec 12, 2025 • 37
Diffusion Transformers with Representation Autoencoders

Paper • 2510.11690 • Published Oct 13, 2025 • 166
1.58-bit FLUX

Paper • 2412.18653 • Published Dec 24, 2024 • 86
InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion

Paper • 2512.17504 • Published Dec 19, 2025 • 97
ProEdit: Inversion-based Editing From Prompts Done Right

Paper • 2512.22118 • Published Dec 26, 2025 • 18
Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield

Paper • 2511.22677 • Published Nov 27, 2025 • 33
FlashPortrait: 6x Faster Infinite Portrait Animation with Adaptive Latent Prediction

Paper • 2512.16900 • Published Dec 18, 2025 • 11
StoryMem: Multi-shot Long Video Storytelling with Memory

Paper • 2512.19539 • Published Dec 22, 2025 • 18
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation

Paper • 2512.23576 • Published Dec 29, 2025 • 65
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Paper • 2512.24618 • Published Dec 31, 2025 • 151
Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

Paper • 2512.23709 • Published Dec 29, 2025 • 50
mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 311
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling

Paper • 2512.23959 • Published Dec 30, 2025 • 112
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Paper • 2601.00664 • Published Jan 2 • 56
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Paper • 2512.20578 • Published Dec 23, 2025 • 85
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

Paper • 2601.03252 • Published Jan 6 • 102
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published Jan 5 • 109
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 228
Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers

Paper • 2601.04890 • Published Jan 8 • 42
LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published Jan 6 • 154
MMFormalizer: Multimodal Autoformalization in the Wild

Paper • 2601.03017 • Published Jan 6 • 105
Controlled Self-Evolution for Algorithmic Code Optimization

Paper • 2601.07348 • Published Jan 12 • 115
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published Jan 13 • 148
VIBE: Visual Instruction Based Editor

Paper • 2601.02242 • Published Jan 5 • 63
Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Paper • 2601.08808 • Published Jan 13 • 39
Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey

Paper • 2601.11655 • Published Jan 15 • 60
LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published Jan 23 • 176