🌈 Qwen-Image-Edit-MeiTu

This model, Qwen-Image-Edit-MeiTu, is an improved variant of Qwen/Qwen-Image-Edit, built with DiT-based architecture fine-tuning to enhance visual consistency, aesthetic quality, and structural alignment in complex edits.

Developed by Valiant Cat AI Lab, this version aims to further close the gap between high-fidelity semantic editing and coherent artistic rendering, achieving a more natural and professional output across a wide range of prompts and subjects.


✨ Key Improvements

  • Enhanced Consistency:
    Utilizes DiT (Diffusion Transformer) fine-tuning to ensure structural stability between input and edited regions, maintaining global spatial coherence.

  • Aesthetic Optimization:
    Trained with aesthetic discriminators and curated aesthetic score datasets, producing more pleasing colors, contrast, and light balance.

  • Better Detail Preservation:
    Improved low-level reconstruction for fine details such as textures, faces, and typography.

  • Broader Scene Adaptability:
    Performs well on portraits, environments, product photos, and illustrations, supporting both semantic and appearance-based editing.


🖼️ Showcase

Below are examples of consistency and aesthetic improvement in complex editing scenarios:

Input & Output

💬 Recommended Prompts

Try these prompts to explore the model’s strengths:

  • β€œmake the lighting soft and cinematic with better balance”
  • β€œenhance the photo’s composition and maintain realism”
  • β€œrefine skin tone and texture consistency”
  • β€œimprove the global color tone and aesthetic harmony”
  • β€œincrease photo realism and clarity without changing content”

🧩 Integration with ComfyUI

This model works seamlessly with a modified ComfyUI Qwen-Image-Edit workflow.
Simply load this checkpoint in the workflow’s UNet loader node to run image edits.
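In a workflow JSON, swapping in this checkpoint amounts to pointing the stock ComfyUI `UNETLoader` node at the MeiTu weights. A minimal sketch of the relevant node entry (the node `id` and the safetensors filename are assumptions; use whatever name the downloaded file has in your `models/unet` folder):

```json
{
  "id": 12,
  "type": "UNETLoader",
  "widgets_values": [
    "Qwen-Image-Edit-MeiTu.safetensors",
    "default"
  ]
}
```

The rest of the Qwen-Image-Edit workflow (text encoder, VAE, sampler) stays unchanged.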


📥 Download Model

Weights available in Safetensors format:

👉 Download Qwen-Image-Edit-MeiTu
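One way to fetch the weights, assuming the Hugging Face CLI is installed and using the repo id from this card (the target directory is an example; point it at your own ComfyUI models folder):

```shell
# Download the repository's safetensors weights to a local folder
huggingface-cli download valiantcat/Qwen-Image-Edit-MeiTu \
  --local-dir ./ComfyUI/models/unet/Qwen-Image-Edit-MeiTu
```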


🧠 Training

This model was trained and optimized by the
AI Laboratory of Chongqing Valiant Cat Technology Co., LTD.
Visit https://vvicat.com/ for business collaborations or research partnerships.


📄 Related Paper

This model is part of the Qwen-Edit+ research line and is associated with the following preprint:

Fan Tang, Siyuan Li
Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation.
Research Square, Version 1, 08 April 2026.
DOI: 10.21203/rs.3.rs-9352857/v1


📚 Citation

If you use this model, please cite:

@article{tang2026qweneditplus,
  author  = {Fan Tang and Siyuan Li},
  title   = {Qwen-Edit+: Scaling Image Editing with VLM-Guided Consistency and Aesthetic Preference Distillation},
  journal = {Research Square},
  year    = {2026},
  doi     = {10.21203/rs.3.rs-9352857/v1},
  url     = {https://doi.org/10.21203/rs.3.rs-9352857/v1}
}

📜 License

Licensed under Apache 2.0.


💼 Join Us

We are hiring research engineers and creative ML practitioners at
Chongqing Valiant Cat Technology Co., LTD. Reach out via
📧 tommy@vvicat.com
