Collections
Collections including paper arxiv:2411.05003
-
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Paper • 2410.10306 • Published • 56 -
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 71 -
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
Paper • 2411.04709 • Published • 26 -
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Paper • 2410.07171 • Published • 43
-
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
Paper • 2411.04989 • Published • 15 -
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 71 -
CogVideoX-5B
🎥 1.02k • Text-to-Video
-
jingheya/lotus-depth-g-v1-0
Depth Estimation • Updated • 14.4k • 26
-
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Paper • 2407.21705 • Published • 27 -
TrackGo: A Flexible and Efficient Method for Controllable Video Generation
Paper • 2408.11475 • Published • 18 -
TVG: A Training-free Transition Video Generation Method with Diffusion Models
Paper • 2408.13413 • Published • 14 -
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation
Paper • 2409.18964 • Published • 27
-
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 71 -
Qwen2.5 Coder Artifacts
🐢 1.7k • Create and view code for applications using text prompts
-
Stable Virtual Camera
⚡ 495 • Generate 3D video from input images
-
Wan2.2 14B Fast
🎥 2.34k • Generate a video from an image with a text prompt
-
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models
Paper • 2410.02740 • Published • 54 -
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
Paper • 2410.01215 • Published • 39 -
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models
Paper • 2409.17146 • Published • 121 -
EuroLLM: Multilingual Language Models for Europe
Paper • 2409.16235 • Published • 29
-
LiheYoung/depth_anything_vitb14
Depth Estimation • Updated • 23.2k • 3 -
Depth Anything V2
🌖 591 • Generate depth maps from images
-
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning
Paper • 2411.05003 • Published • 71 -
Image-Free Timestep Distillation via Continuous-Time Consistency with Trajectory-Sampled Pairs
Paper • 2511.20410 • Published • 2