Add files using upload-large-folder tool
README.md CHANGED

@@ -1,4 +1,4 @@
-<!-- README Version: v1.0 -->
+<!-- README Version: v1.1 -->
 
 ---
 license: apache-2.0
@@ -12,6 +12,8 @@ tags:
 - fp16
 - diffusion
 - stable-diffusion
+- ip-adapter
+- style-transfer
 base_model: black-forest-labs/FLUX.1-dev
 ---
 
@@ -48,8 +50,12 @@ flux-dev-fp16/
 │   └── t5xxl_fp16.safetensors (9.2 GB)    # T5-XXL text encoder
 ├── clip/
 │   └── t5xxl_fp16.safetensors (9.2 GB)    # T5-XXL encoder (alternate location)
-
-
+├── clip_vision/
+│   └── clip_vision_h.safetensors (1.2 GB) # CLIP vision encoder
+├── vae/flux/
+│   └── flux-vae-bf16.safetensors (160 MB) # VAE decoder in BF16 precision
+└── ipadapter-flux/
+    └── ip-adapter.bin (5.0 GB)            # IP-Adapter for image prompting
 
 Total Repository Size: 72 GB
 ```
@@ -58,6 +64,8 @@ Total Repository Size: 72 GB
 - **Main Model**: `flux1-dev-fp16.safetensors` (23 GB) - Core diffusion transformer
 - **Text Encoders**: CLIP-L, CLIP-G, T5-XXL for advanced text understanding
 - **Vision Encoder**: CLIP vision model for image understanding capabilities
+- **VAE**: `flux-vae-bf16.safetensors` (160 MB) - Variational autoencoder for latent/image conversion
+- **IP-Adapter**: `ip-adapter.bin` (5.0 GB) - Image prompt adapter for style transfer and image conditioning
 
 ## Hardware Requirements
 
@@ -180,6 +188,45 @@ image = pipe(
 image.save("optimized_output.png")
 ```
 
+### IP-Adapter Image Prompting
+
+```python
+import torch
+from PIL import Image
+from diffusers import FluxPipeline
+from ip_adapter import IPAdapter
+
+# Load FLUX pipeline
+pipe = FluxPipeline.from_single_file(
+    "E:/huggingface/flux-dev-fp16/checkpoints/flux/flux1-dev-fp16.safetensors",
+    torch_dtype=torch.float16
+)
+pipe.to("cuda")
+
+# Load IP-Adapter for image conditioning
+ip_adapter = IPAdapter(
+    pipe,
+    image_encoder_path="E:/huggingface/flux-dev-fp16/clip_vision",
+    ip_ckpt="E:/huggingface/flux-dev-fp16/ipadapter-flux/ip-adapter.bin",
+    device="cuda"
+)
+
+# Load reference image (as a PIL image) for style/composition transfer
+reference_image = Image.open("reference_style.jpg")
+
+# Generate image with text prompt + image reference
+image = ip_adapter.generate(
+    pil_image=reference_image,
+    prompt="A landscape in the style of the reference image",
+    num_inference_steps=50,
+    guidance_scale=7.5,
+    scale=0.6,  # IP-Adapter influence strength (0.0-1.0)
+    height=1024, width=1024
+)[0]
+
+image.save("style_transfer_output.png")
+```
+
 ## Model Specifications
 
 | Specification | Details |
@@ -200,6 +247,7 @@ image.save("optimized_output.png")
 - Multi-aspect ratio generation
 - Img2img workflows
 - Inpainting and outpainting
+- IP-Adapter image prompting and style transfer
 - ControlNet compatibility
 - LoRA fine-tuning support
 
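Of the two weight files this commit adds, only the IP-Adapter gets a usage example in the README; the standalone VAE can be exercised the same way. Below is a minimal sketch, assuming the local `E:/huggingface/flux-dev-fp16/` layout from the directory tree and that your diffusers version supports `from_single_file` loading for FLUX-style VAE checkpoints: `flux-vae-bf16.safetensors` is loaded separately and handed to the pipeline.

```python
import torch
from diffusers import AutoencoderKL, FluxPipeline

# Load the standalone VAE; the path mirrors the repository layout above.
vae = AutoencoderKL.from_single_file(
    "E:/huggingface/flux-dev-fp16/vae/flux/flux-vae-bf16.safetensors",
    torch_dtype=torch.bfloat16,
)

# Pass the externally loaded VAE to the pipeline in place of the baked-in one.
pipe = FluxPipeline.from_single_file(
    "E:/huggingface/flux-dev-fp16/checkpoints/flux/flux1-dev-fp16.safetensors",
    vae=vae,
    torch_dtype=torch.float16,
)
pipe.to("cuda")
```

Keeping the VAE in BF16 while the transformer runs in FP16 is a common split; cast with `vae.to(torch.float16)` if your setup needs a uniform dtype.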