Mohammad Azam Khan
Postdoc · Curriculum vitae

Not just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction


Conference paper


Daejin Kim, Mohammad Azam Khan, Jaegul Choo
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Cite

APA
Kim, D., Khan, M. A., & Choo, J. (2021). Not just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).


Chicago/Turabian
Kim, Daejin, Mohammad Azam Khan, and Jaegul Choo. “Not Just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction.” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.


MLA
Kim, Daejin, et al. “Not Just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.


BibTeX

@inproceedings{kim2021cooperative,
  title     = {Not just Compete, but Collaborate: Local Image-to-Image Translation via Cooperative Mask Prediction},
  author    = {Kim, Daejin and Khan, Mohammad Azam and Choo, Jaegul},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021}
}

Abstract

Facial attribute editing aims to manipulate an image according to a desired attribute while preserving all other details. Recently, generative adversarial networks combined with encoder-decoder architectures have been utilized for this task owing to their ability to create realistic images. However, existing methods for unpaired datasets still cannot properly preserve the attribute-irrelevant regions, due to the absence of ground-truth images. This work proposes a novel, intuitive loss function called the CAM-consistency loss, which improves how consistently the input image is preserved during image translation. While the existing cycle-consistency loss only ensures that an image can be translated back, our approach makes the model preserve the attribute-irrelevant regions even in a single translation to another domain, by using the Grad-CAM output computed from the discriminator. Our CAM-consistency loss directly optimizes this Grad-CAM output during training, so that the generator properly captures which local regions it should change while keeping the other regions unchanged. In this manner, our approach allows the generator and the discriminator to collaborate with each other to improve the image translation quality. In our experiments, we validate the effectiveness and versatility of the proposed CAM-consistency loss by applying it to several representative models for facial image editing, such as StarGAN, AttGAN, and STGAN.
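
To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of a CAM-consistency-style loss: Grad-CAM is computed from the discriminator's real/fake score with respect to its last convolutional feature map, and the resulting attention map is used to penalize changes in the low-attention (attribute-irrelevant) regions. The names SmallDiscriminator, grad_cam, and cam_consistency_loss are illustrative assumptions, not the authors' code; the exact architecture, masking, and loss weighting follow the paper.

# A minimal, hypothetical sketch of a CAM-consistency-style loss (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallDiscriminator(nn.Module):
    """Toy discriminator that exposes its last conv feature map for Grad-CAM."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        self.head = nn.Linear(64, 1)  # real/fake logit

    def forward(self, x):
        fmap = self.features(x)                    # (B, C, H', W')
        score = self.head(fmap.mean(dim=(2, 3)))   # global-average-pooled logit
        return score, fmap

def grad_cam(score, fmap, out_size):
    """Grad-CAM: weight feature channels by the gradient of the score w.r.t. them."""
    grads = torch.autograd.grad(score.sum(), fmap, create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)           # channel importance
    cam = F.relu((weights * fmap).sum(dim=1, keepdim=True))  # (B, 1, H', W')
    cam = F.interpolate(cam, size=out_size, mode='bilinear', align_corners=False)
    cam_min = cam.amin(dim=(2, 3), keepdim=True)
    cam_max = cam.amax(dim=(2, 3), keepdim=True)
    return (cam - cam_min) / (cam_max - cam_min + 1e-8)      # normalize to [0, 1]

def cam_consistency_loss(disc, real, fake):
    """Penalize pixel changes outside the discriminator's attended region."""
    score, fmap = disc(fake)
    cam = grad_cam(score, fmap, out_size=real.shape[2:])
    # (1 - cam) highlights attribute-irrelevant regions that should stay intact.
    return ((1.0 - cam) * (fake - real).abs()).mean()

if __name__ == "__main__":
    disc = SmallDiscriminator()
    real = torch.rand(2, 3, 64, 64)
    fake = real + 0.1 * torch.randn_like(real)  # stand-in for a generator output
    print(cam_consistency_loss(disc, real, fake).item())

In actual training, this term would be added to the generator's objective alongside the usual adversarial and cycle-consistency losses, which is how the generator and discriminator end up collaborating rather than only competing.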

