Realistic Vision V6.0 Inpainting - CoreML

CoreML conversion of Realistic Vision V6.0 Inpainting optimized for Apple Silicon devices (iPhone, iPad, Mac).

Model Details

| Property | Value |
|---|---|
| Base Model | Stable Diffusion 1.5 Inpainting |
| Fine-tune | Realistic Vision V6.0 |
| Resolution | 512x512 |
| UNet Channels | 9 (latent + mask + masked image) |
| Prediction Type | Epsilon |
| Attention | SPLIT_EINSUM (optimized for ANE) |
| Safety Checker | Included |

Files

| File | Size | Description |
|---|---|---|
| realistic-vision-inpaint-safe.zip | 2.4 GB | Full model with NSFW safety checker |
| realistic-vision-inpaint-coreml.zip | 2.0 GB | Model without safety checker (legacy) |

Bundle Contents

Resources/
├── TextEncoder.mlmodelc      # CLIP text encoder
├── Unet.mlmodelc             # 9-channel inpainting UNet
├── VAEDecoder.mlmodelc       # Latent to image decoder
├── VAEEncoder.mlmodelc       # Image to latent encoder
├── SafetyChecker.mlmodelc    # NSFW content filter
├── vocab.json                # Tokenizer vocabulary
└── merges.txt                # BPE merges

Usage

This model is designed for use with iOS/macOS apps using CoreML. It requires a custom inpainting pipeline that:

  1. Encodes the input image to latent space using VAEEncoder
  2. Prepares a 9-channel input: [noised_latent(4) + mask(1) + masked_image_latent(4)]
  3. Runs denoising with the UNet
  4. Decodes the result with VAEDecoder
  5. Checks output with SafetyChecker (optional but recommended)
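
The shape bookkeeping behind step 2 can be sketched as follows. This is a minimal illustration in Python, not the pipeline an app would ship: the helper name `unet_input_shape` is hypothetical, and the factor-of-8 VAE downscale is the usual Stable Diffusion 1.5 value, which this card does not state explicitly.

```python
# Hypothetical helper: shape arithmetic only, no Core ML calls.
VAE_SCALE = 8        # assumed SD 1.5 VAE downscale per spatial side
LATENT_CHANNELS = 4  # noised latent and masked-image latent each have 4 channels
MASK_CHANNELS = 1    # single-channel mask at latent resolution


def unet_input_shape(width=512, height=512, batch=1):
    """Return (batch, channels, latent_h, latent_w) for the 9-channel UNet input."""
    if width % VAE_SCALE or height % VAE_SCALE:
        raise ValueError("image sides must be multiples of 8")
    # Channel order follows the card: noised latent, mask, masked-image latent.
    channels = LATENT_CHANNELS + MASK_CHANNELS + LATENT_CHANNELS
    return (batch, channels, height // VAE_SCALE, width // VAE_SCALE)


print(unet_input_shape())  # (1, 9, 64, 64)
```

If classifier-free guidance is used, the batch dimension is typically doubled (one conditional and one unconditional pass), which `batch=2` would model here.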

Input Format

  • Image: 512x512 RGB
  • Mask: 512x512 grayscale (white = regenerate, black = keep)
  • Prompt: Text description of desired content in masked area
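
As a rough sketch of the mask convention above, the following Python turns a grayscale mask into a binary one and shrinks it to latent resolution. The threshold of 128, the nearest-neighbour downsampling, and the function names are all assumptions made for illustration; a real pipeline may resample differently.

```python
def binarize(mask, threshold=128):
    """Map 0-255 grayscale values to 1 (regenerate) or 0 (keep)."""
    return [[1 if px >= threshold else 0 for px in row] for row in mask]


def downsample_nearest(mask, factor=8):
    """Nearest-neighbour downsample: keep every `factor`-th pixel on each axis."""
    return [row[::factor] for row in mask[::factor]]


# Example 512x512 mask: left half black (keep), right half white (regenerate).
mask = [[0] * 256 + [255] * 256 for _ in range(512)]
latent_mask = downsample_nearest(binarize(mask))

print(len(latent_mask), len(latent_mask[0]))   # 64 64
print(latent_mask[0][31], latent_mask[0][32])  # 0 1
```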

Performance

| Device | Generation Time (20 steps) |
|---|---|
| iPhone 15 Pro | ~15-20 seconds |
| M1 Mac | ~10-15 seconds |
| M2/M3 Mac | ~8-12 seconds |

Safety Checker

The realistic-vision-inpaint-safe.zip bundle includes a CLIP-based safety checker that filters NSFW content. When it is integrated into the pipeline:

  • Generated images are analyzed before being returned
  • NSFW content is blocked with an error
  • Safe content passes through normally
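
The gating behaviour described above might look like the sketch below. Everything here is hypothetical: the `UnsafeContentError` type, the `gate_output` function, and the score-plus-threshold interface are stand-ins for whatever the app's wrapper around SafetyChecker.mlmodelc actually reports.

```python
class UnsafeContentError(Exception):
    """Hypothetical error raised when a generated image is flagged."""


def gate_output(image, nsfw_score, threshold=0.5):
    """Return the image if it passes the check; raise otherwise.

    `nsfw_score` and `threshold` are assumed placeholders, not the
    checker's documented output format.
    """
    if nsfw_score >= threshold:
        raise UnsafeContentError("generated image was flagged as NSFW")
    return image


# Safe content passes through unchanged; flagged content raises.
result = gate_output("decoded-image", nsfw_score=0.1)
```

Raising instead of returning a blank image keeps the failure explicit, so the calling app can decide how to report it to the user.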

Recommended for App Store distribution.

License

This model is released under the CreativeML Open RAIL-M License.

You CAN:

  • Use commercially
  • Redistribute
  • Modify and create derivatives

You MUST:

  • Include license and attribution

You MUST NOT:

  • Use for illegal purposes
  • Generate content exploiting minors
  • Use for harassment or deception

Attribution

Conversion Details

Converted using Apple's ml-stable-diffusion toolkit:

python -m python_coreml_stable_diffusion.torch2coreml \
  --model-version stablediffusionapi/realistic-vision-v6.0-b1-inpaint \
  --convert-unet \
  --convert-text-encoder \
  --convert-vae-decoder \
  --convert-vae-encoder \
  --convert-safety-checker \
  --attention-implementation SPLIT_EINSUM \
  --bundle-resources-for-swift-cli \
  -o output

Related
