JoelWester committed
Commit c97eb7b · verified · 1 Parent(s): 0c4baa8

Update README.md

Files changed (1): README.md +1 -274
README.md CHANGED
@@ -7,277 +7,4 @@ sdk: gradio
sdk_version: "4.0"
app_file: app.py
pinned: false
---

# Face Anonymization Made Simple (WACV 2025)

[arXiv](http://arxiv.org/abs/2411.00762)

![teaser](teaser.jpg)

Our face anonymization technique preserves the original facial expressions, head positioning, eye direction, and background elements, effectively masking identity while retaining other crucial details. The anonymized face blends seamlessly into its original photograph, making it ideal for diverse real-world applications.

## Setup

1. Clone the repository.

```bash
git clone https://github.com/hanweikung/face_anon_simple.git
```

2. Create a Python environment from the `environment.yml` file, then activate it (see the note below this list).

```bash
conda env create -f environment.yml
```
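
Once the environment has been created, activate it before running the code below. The environment name comes from the `name:` field in `environment.yml`; `<env-name>` here is only a placeholder.

```bash
# activate the conda environment defined in environment.yml
# (replace <env-name> with the name: value from that file)
conda activate <env-name>
```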

## Usage

1. Import the required libraries.

```python
import torch
from transformers import CLIPImageProcessor, CLIPVisionModel

from diffusers import AutoencoderKL, DDPMScheduler
from diffusers.utils import load_image
from src.diffusers.models.referencenet.referencenet_unet_2d_condition import (
    ReferenceNetModel,
)
from src.diffusers.models.referencenet.unet_2d_condition import UNet2DConditionModel
from src.diffusers.pipelines.referencenet.pipeline_referencenet import (
    StableDiffusionReferenceNetPipeline,
)
```

2. Create & load models.

```python
face_model_id = "hkung/face-anon-simple"
clip_model_id = "openai/clip-vit-large-patch14"
sd_model_id = "stabilityai/stable-diffusion-2-1"

unet = UNet2DConditionModel.from_pretrained(
    face_model_id, subfolder="unet", use_safetensors=True
)
referencenet = ReferenceNetModel.from_pretrained(
    face_model_id, subfolder="referencenet", use_safetensors=True
)
conditioning_referencenet = ReferenceNetModel.from_pretrained(
    face_model_id, subfolder="conditioning_referencenet", use_safetensors=True
)
vae = AutoencoderKL.from_pretrained(sd_model_id, subfolder="vae", use_safetensors=True)
scheduler = DDPMScheduler.from_pretrained(
    sd_model_id, subfolder="scheduler", use_safetensors=True
)
feature_extractor = CLIPImageProcessor.from_pretrained(
    clip_model_id, use_safetensors=True
)
image_encoder = CLIPVisionModel.from_pretrained(clip_model_id, use_safetensors=True)

pipe = StableDiffusionReferenceNetPipeline(
    unet=unet,
    referencenet=referencenet,
    conditioning_referencenet=conditioning_referencenet,
    vae=vae,
    feature_extractor=feature_extractor,
    image_encoder=image_encoder,
    scheduler=scheduler,
)
pipe = pipe.to("cuda")

generator = torch.manual_seed(1)
```

### Anonymize images with a single aligned face

Create an anonymized version of an image if the image contains a single face and that face has already been aligned similarly to those in the [FFHQ](https://github.com/NVlabs/ffhq-dataset) or [CelebA-HQ](https://github.com/tkarras/progressive_growing_of_gans) datasets.

```python
# get an input image for anonymization
original_image = load_image("my_dataset/test/14795.png")

# generate an image that anonymizes faces
anon_image = pipe(
    source_image=original_image,
    conditioning_image=original_image,
    num_inference_steps=200,
    guidance_scale=4.0,
    generator=generator,
    anonymization_degree=1.25,
    width=512,
    height=512,
).images[0]
anon_image.save("anon.png")
```
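
Judging from the examples in this README, `anonymization_degree` controls how strongly the generated face departs from the source identity: the face-swap example further below uses `0.0` to keep the source identity unchanged, while `1.25` is used for anonymization. A minimal sketch for comparing a few values, reusing `pipe`, `original_image`, and `generator` from above (the specific values and the reduced step count are illustrative, not recommendations from the authors):

```python
# Sweep a few anonymization_degree values to compare their effect.
# 0.0 corresponds to keeping the source identity; larger values anonymize more.
for degree in (0.0, 0.75, 1.25):
    image = pipe(
        source_image=original_image,
        conditioning_image=original_image,
        num_inference_steps=25,  # fewer steps for a quick comparison
        guidance_scale=4.0,
        generator=generator,
        anonymization_degree=degree,
        width=512,
        height=512,
    ).images[0]
    image.save(f"anon_degree_{degree}.png")
```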

### Anonymize images with one or multiple unaligned faces

Create an anonymized version of an image if it contains one or more unaligned faces.

```python
import face_alignment
from utils.anonymize_faces_in_image import anonymize_faces_in_image

# get an input image for anonymization
original_image = load_image("my_dataset/test/friends.jpg")

# SFD (likely best results, but slower)
fa = face_alignment.FaceAlignment(
    face_alignment.LandmarksType.TWO_D, face_detector="sfd"
)

# generate an image that anonymizes faces
anon_image = anonymize_faces_in_image(
    image=original_image,
    face_alignment=fa,
    pipe=pipe,
    generator=generator,
    face_image_size=512,
    num_inference_steps=25,
    guidance_scale=4.0,
    anonymization_degree=1.25,
)
anon_image.save("anon.png")
```
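
The `face_alignment` import comes from the face-alignment package. If it is not already included in the environment created from `environment.yml` (worth checking before installing anything), it can be added with pip:

```bash
pip install face-alignment
```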

### Swap faces between two images

Create an image that swaps faces between two images.

```python
# get source and conditioning (driving) images for face swap
source_image = load_image("my_dataset/test/00482.png")
conditioning_image = load_image("my_dataset/test/14795.png")

# generate an image that swaps faces
swap_image = pipe(
    source_image=source_image,
    conditioning_image=conditioning_image,
    num_inference_steps=200,
    guidance_scale=4.0,
    generator=generator,
    anonymization_degree=0.0,
    width=512,
    height=512,
).images[0]
swap_image.save("swap.png")
```

We also provide the [demo.ipynb](https://github.com/hanweikung/face_anon_simple/blob/main/demo.ipynb) notebook, which guides you through the steps mentioned above.

### Note on image resolution

Our model was trained on 512x512 images. To ensure correct results, always set `width=512` and `height=512` in the `pipe` function, and `face_image_size=512` in the `anonymize_faces_in_image` function. This ensures that input images are resized correctly for the diffusion pipeline. If you're using a model trained on a different size, such as 768x768, adjust these parameters accordingly.
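
As an illustration, for a checkpoint that was (hypothetically) trained at 768x768, only the size-related arguments change; the same applies to `face_image_size=768` in `anonymize_faces_in_image`:

```python
# Hypothetical 768x768-trained checkpoint: only the size arguments change.
anon_image = pipe(
    source_image=original_image,
    conditioning_image=original_image,
    num_inference_steps=200,
    guidance_scale=4.0,
    generator=generator,
    anonymization_degree=1.25,
    width=768,   # match the checkpoint's training resolution
    height=768,  # match the checkpoint's training resolution
).images[0]
```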

## Training

Our model learns face swapping for anonymization. You can train it using your own face-swapped images.

### Training data structure

Sample training data is available in the `my_dataset/train` directory. Real images are stored in the `real` subdirectory, while face-swapped images are stored in the `fake` subdirectory.

```bash
my_dataset/
├── train
│   ├── celeb
│   │   ├── fake
│   │   │   └── 18147_06771-01758_01758.png
│   │   └── real
│   │       ├── 01758_01758.png
│   │       ├── 01758_09704.png
│   │       └── 18147_06771.png
│   └── train.jsonl
└── train_dataset_loading_script.py
```

### Data loading and configuration

Training data is loaded using a JSON lines file (`my_dataset/train/train.jsonl`) and a dataset loading script (`my_dataset/train_dataset_loading_script.py`), both provided as examples.

The JSON lines file includes two sample entries specifying the source image, conditioning (driving) image, and ground truth image, with file paths based on the sample training data. Adjust these paths to match your own data:

```json
{"source_image": "celeb/real/18147_06771.png", "conditioning_image": "celeb/real/01758_01758.png", "ground_truth": "celeb/fake/18147_06771-01758_01758.png"}
{"source_image": "celeb/real/01758_09704.png", "conditioning_image": "celeb/fake/18147_06771-01758_01758.png", "ground_truth": "celeb/real/01758_01758.png"}
```

To simulate face-swapping behavior, the source and conditioning images should have different identities. The source and ground truth should share the same identity, while the conditioning and ground truth should share the same pose and expression. When no actual ground truth is available (e.g., the first entry), the face-swapped image serves as the ground truth. When a ground truth image is available (e.g., the second entry), the swapped version of the ground truth is used as the conditioning image.
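
A minimal sketch of how such a `train.jsonl` could be generated programmatically. The field names match the sample entries above; the helper itself and the single example triple are hypothetical placeholders for your own data:

```python
import json

# Each triple is (source_image, conditioning_image, ground_truth), with paths
# relative to the images directory. This single entry mirrors the first sample.
triples = [
    (
        "celeb/real/18147_06771.png",
        "celeb/real/01758_01758.png",
        "celeb/fake/18147_06771-01758_01758.png",
    ),
]

with open("my_dataset/train/train.jsonl", "w") as f:
    for src, cond, gt in triples:
        record = {"source_image": src, "conditioning_image": cond, "ground_truth": gt}
        f.write(json.dumps(record) + "\n")
```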
206
-
207
- Our dataset loading script follows [Hugging Face's documentation](https://huggingface.co/docs/datasets/en/dataset_script). Please update the `metadata_path` and `images_dir` file paths in the script to match your dataset:
208
-
209
- ```python
210
- _URLS = {
211
- "metadata_path": "/path/to/face_anon_simple/my_dataset/train/train.jsonl",
212
- "images_dir": "/path/to/face_anon_simple/my_dataset/train/",
213
- }
214
- ```
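
As a quick sanity check (assuming a `datasets` version that still supports script-based loading, which newer releases have dropped), the loading script can be exercised directly before launching training:

```python
from datasets import load_dataset

# Load the training split through the loading script to confirm the paths resolve.
ds = load_dataset("./my_dataset/train_dataset_loading_script.py", split="train")
print(ds)
```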
215
-
216
- ### Training script setup
217
-
218
- A bash script, `train_referencenet.sh`, with the training command is provided. Update the file paths and adjust parameters as needed:
219
-
220
- ```bash
221
- export MODEL_DIR="/path/to/stable-diffusion-2-1/"
222
- export CLIP_MODEL_DIR="/path/to/clip-vit-large-patch14/"
223
- export OUTPUT_DIR="./runs/celeb/"
224
- export NCCL_P2P_DISABLE=1
225
- export DATASET_LOADING_SCRIPT_PATH="./my_dataset/train_dataset_loading_script.py"
226
- export TORCH_DISTRIBUTED_DEBUG="INFO"
227
- export WANDB__SERVICE_WAIT="300"
228
-
229
- accelerate launch --main_process_port=29500 --mixed_precision="fp16" --multi_gpu -m examples.referencenet.train_referencenet \
230
- --pretrained_model_name_or_path=$MODEL_DIR \
231
- --pretrained_clip_model_name_or_path=$CLIP_MODEL_DIR \
232
- --output_dir=$OUTPUT_DIR \
233
- --dataset_loading_script_path=$DATASET_LOADING_SCRIPT_PATH \
234
- --resolution=512 \
235
- --learning_rate=1e-5 \
236
- --validation_source_image "./my_dataset/test/00482.png" \
237
- --validation_conditioning_image "./my_dataset/test/14795.png" \
238
- --train_batch_size=1 \
239
- --tracker_project_name="celeb" \
240
- --checkpointing_steps=10000 \
241
- --num_validation_images=1 \
242
- --validation_steps=1000 \
243
- --mixed_precision="fp16" \
244
- --gradient_checkpointing \
245
- --use_8bit_adam \
246
- --enable_xformers_memory_efficient_attention \
247
- --gradient_accumulation_steps=8 \
248
- --resume_from_checkpoint="latest" \
249
- --set_grads_to_none \
250
- --max_train_steps=60000 \
251
- --conditioning_dropout_prob=0.1 \
252
- --seed=0 \
253
- --report_to="wandb" \
254
- --random_flip \
255
- --dataloader_num_workers=8
256
- ```
257
-
258
- To train your model, run:
259
-
260
- ```bash
261
- bash train_referencenet.sh
262
- ```
263
-
264
- ## Test images
265
-
266
- In our paper, we selected 1,000 images each from [CelebA-HQ](https://github.com/tkarras/progressive_growing_of_gans) and [FFHQ](https://github.com/NVlabs/ffhq-dataset) for quantitative analysis. The list of test images can be found in our [Hugging Face Hub repository](https://huggingface.co/datasets/hkung/face-anon-simple-dataset).
267
-
268
- ## Citation
269
-
270
- ```bibtex
271
- @InProceedings{Kung_2025_WACV,
272
- author = {Kung, Han-Wei and Varanka, Tuomas and Saha, Sanjay and Sim, Terence and Sebe, Nicu},
273
- title = {Face Anonymization Made Simple},
274
- booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
275
- month = {February},
276
- year = {2025},
277
- pages = {1040-1050}
278
- }
279
- ```
280
-
281
- ## Acknowledgements
282
-
283
- This work is built upon the [Diffusers](https://github.com/huggingface/diffusers) project. The [face extractor](https://github.com/hanweikung/face_anon_simple/blob/main/utils/extractor.py) is adapted from [DeepFaceLab](https://github.com/iperov/DeepFaceLab/blob/master/mainscripts/Extractor.py).
 