wangkanai committed
Commit 1b86108 · verified · 1 Parent(s): 56e8d8d

Add files using upload-large-folder tool

Files changed (1): README.md (+51 −3)
README.md CHANGED
@@ -1,4 +1,4 @@
- <!-- README Version: v1.0 -->
+ <!-- README Version: v1.1 -->

  ---
  license: apache-2.0
@@ -12,6 +12,8 @@ tags:
  - fp16
  - diffusion
  - stable-diffusion
+ - ip-adapter
+ - style-transfer
  base_model: black-forest-labs/FLUX.1-dev
  ---

@@ -48,8 +50,12 @@ flux-dev-fp16/
  │   └── t5xxl_fp16.safetensors (9.2 GB)     # T5-XXL text encoder
  ├── clip/
  │   └── t5xxl_fp16.safetensors (9.2 GB)     # T5-XXL encoder (alternate location)
- └── clip_vision/
-     └── clip_vision_h.safetensors (1.2 GB)  # CLIP vision encoder
+ ├── clip_vision/
+ │   └── clip_vision_h.safetensors (1.2 GB)  # CLIP vision encoder
+ ├── vae/flux/
+ │   └── flux-vae-bf16.safetensors (160 MB)  # VAE decoder in BF16 precision
+ └── ipadapter-flux/
+     └── ip-adapter.bin (5.0 GB)             # IP-Adapter for image prompting

  Total Repository Size: 72 GB
  ```
@@ -58,6 +64,8 @@ Total Repository Size: 72 GB
  - **Main Model**: `flux1-dev-fp16.safetensors` (23 GB) - Core diffusion transformer
  - **Text Encoders**: CLIP-L, CLIP-G, T5-XXL for advanced text understanding
  - **Vision Encoder**: CLIP vision model for image understanding capabilities
+ - **VAE**: `flux-vae-bf16.safetensors` (160 MB) - Variational autoencoder for latent/image conversion
+ - **IP-Adapter**: `ip-adapter.bin` (5.0 GB) - Image prompt adapter for style transfer and image conditioning

  ## Hardware Requirements

@@ -180,6 +188,45 @@ image = pipe(
  image.save("optimized_output.png")
  ```

+ ### IP-Adapter Image Prompting
+
+ ```python
+ import torch
+ from diffusers import FluxPipeline
+ from ip_adapter import IPAdapter
+
+ # Load FLUX pipeline
+ pipe = FluxPipeline.from_single_file(
+     "E:/huggingface/flux-dev-fp16/checkpoints/flux/flux1-dev-fp16.safetensors",
+     torch_dtype=torch.float16
+ )
+ pipe.to("cuda")
+
+ # Load IP-Adapter for image conditioning
+ ip_adapter = IPAdapter(
+     pipe,
+     image_encoder_path="E:/huggingface/flux-dev-fp16/clip_vision",
+     ip_ckpt="E:/huggingface/flux-dev-fp16/ipadapter-flux/ip-adapter.bin",
+     device="cuda"
+ )
+
+ # Load reference image for style/composition transfer
+ reference_image = "reference_style.jpg"
+
+ # Generate image with text prompt + image reference
+ image = ip_adapter.generate(
+     pil_image=reference_image,
+     prompt="A landscape in the style of the reference image",
+     num_inference_steps=50,
+     guidance_scale=7.5,
+     scale=0.6,  # IP-Adapter influence strength (0.0-1.0)
+     height=1024,
+     width=1024
+ )[0]
+
+ image.save("style_transfer_output.png")
+ ```
+
  ## Model Specifications

  | Specification | Details |
@@ -200,6 +247,7 @@ image.save("optimized_output.png")
  - Multi-aspect ratio generation
  - Img2img workflows
  - Inpainting and outpainting
+ - IP-Adapter image prompting and style transfer
  - ControlNet compatibility
  - LoRA fine-tuning support
253