Towards Latent Space Optimization of GANs Using Meta-Learning
The need for very large datasets to train Generative Adversarial Networks (GANs) has limited their use in cases where the available data are scarce or poorly labelled (e.g., in real-life applications). Recently, meta-learning has proved effective at solving few-shot classification problems, but its use in noise-to-image generation has been only partially explored. In this paper, we take a first step towards applying a meta-learning algorithm (Reptile) to the discriminator of a GAN and to a mapping network in order to optimize the random noise z, guiding the generator network into producing images belonging to specific classes. By doing so, we prove that the latent space distribution is crucial for generating sharp samples when few training data are available, and we also manage to generate samples of previously unseen classes just by optimizing the latent space, without changing any parameter in the generator network. Finally, we present several experiments with two widely used datasets: MNIST and Omniglot.
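As a rough illustration of the Reptile update mentioned in this abstract, the sketch below meta-learns a single scalar parameter on toy regression tasks. It is not the paper's GAN setup: the one-parameter model, the task slopes, and the names `sgd_steps` and `reptile` are assumptions made purely for illustration. The only element taken from Reptile itself is the outer update, which moves the meta-parameters a fraction of the way towards the task-adapted parameters.

```python
import random

def sgd_steps(theta, task_slope, xs, lr=0.02, steps=5):
    """Inner loop: fit the toy model y = theta * x to the task y = task_slope * x."""
    for _ in range(steps):
        for x in xs:
            # Gradient of the squared error (theta*x - task_slope*x)^2 w.r.t. theta.
            grad = 2.0 * (theta * x - task_slope * x) * x
            theta -= lr * grad
    return theta

def reptile(task_slopes, meta_lr=0.5, meta_iters=200, seed=0):
    """Outer loop: theta <- theta + meta_lr * (adapted_theta - theta)."""
    rng = random.Random(seed)
    xs = [-1.0, -0.5, 0.5, 1.0]  # hypothetical training inputs shared by all tasks
    theta = 0.0
    for _ in range(meta_iters):
        slope = rng.choice(task_slopes)        # sample one task
        adapted = sgd_steps(theta, slope, xs)  # adapt to it with a few SGD steps
        theta += meta_lr * (adapted - theta)   # Reptile meta-update
    return theta

meta_theta = reptile([1.0, 2.0, 3.0])
```

With these toy tasks the meta-learned parameter settles somewhere between the task slopes, so a few inner steps suffice to adapt it to any one of them.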
Bag of indexes: a multi-index scheme for efficient approximate nearest neighbor search
In recent years, the problem of Content-Based Image Retrieval (CBIR) has been addressed in many different ways, achieving excellent results on small-scale datasets. As the volume of data to evaluate grows, new issues need to be considered and new techniques are necessary in order to create an efficient yet accurate system. In particular, computational time and memory occupancy need to be kept as low as possible, whilst the retrieval accuracy has to be preserved as much as possible. For this reason, a brute-force approach is no longer feasible, and an Approximate Nearest Neighbor (ANN) search method is preferable. This paper describes the state-of-the-art ANN methods, with a particular focus on indexing systems, and proposes a new ANN technique called Bag of Indexes (BoI). This new technique is compared with the state of the art on several public benchmarks, obtaining 86.09% accuracy on Holidays+Flickr1M, 99.20% on SIFT1M and 92.4% on GIST1M. Notably, these state-of-the-art accuracy results are obtained by the proposed approach with a very low retrieval time, making it excellent in the trade-off between accuracy and efficiency.
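To make the contrast between brute-force and approximate search concrete, here is a minimal sketch of a coarse bucket index. It is not the Bag of Indexes method, whose details the abstract does not give; the fixed centroids, the toy 2-D vectors, and the function names are all hypothetical. The brute-force search scans every vector, while the approximate search scans only the bucket or buckets closest to the query.

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_nn(query, database):
    """Exact search: compare the query against every database vector."""
    return min(range(len(database)), key=lambda i: dist(query, database[i]))

def build_index(database, centroids):
    """Assign each database vector to the bucket of its nearest centroid."""
    buckets = {c: [] for c in range(len(centroids))}
    for i, vec in enumerate(database):
        c = min(range(len(centroids)), key=lambda j: dist(vec, centroids[j]))
        buckets[c].append(i)
    return buckets

def approx_nn(query, database, centroids, buckets, n_probe=1):
    """Approximate search: scan only the n_probe buckets nearest to the query."""
    order = sorted(range(len(centroids)), key=lambda j: dist(query, centroids[j]))
    candidates = [i for c in order[:n_probe] for i in buckets[c]]
    if not candidates:
        return None  # the probed buckets were empty
    return min(candidates, key=lambda i: dist(query, database[i]))

# Hypothetical toy data: two well-separated clusters and one centroid per cluster.
database = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centroids = [(0.5, 0.5), (10.5, 10.5)]
buckets = build_index(database, centroids)
exact = brute_force_nn((9, 9), database)
approx = approx_nn((9, 9), database, centroids, buckets)
```

Raising `n_probe` scans more buckets, trading retrieval time for accuracy, which is the trade-off the abstract refers to.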
Would Your Clothes Look Good on Me? Towards Transferring Clothing Styles with Adaptive Instance Normalization
Several applications of deep learning, such as image classification and retrieval, recommendation systems, and especially image synthesis, are of great interest to the fashion industry. Recently, image generation of clothes has gained a lot of popularity, as it is a very challenging task that is far from being solved. Additionally, it would open up many possibilities for designers and stylists, enhancing their creativity. For this reason, in this paper we propose to tackle the problem of style transfer between two different people wearing different clothes. We draw inspiration from the recent StarGANv2 architecture, which reached impressive results in transferring a target domain to a source image, and we adapt it to work with fashion images and to transfer clothing styles. In more detail, we modified the architecture to work without the need for a clear separation between multiple domains, added a perceptual loss between the target and the source clothes, and edited the style encoder to better represent the style information of target clothes. We performed both qualitative and quantitative experiments with the recent DeepFashion2 dataset and proved the efficacy and novelty of our method.
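The Adaptive Instance Normalization named in the title can be sketched on a single feature channel as follows. This uses the statistic-matching form AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y); it is an assumption that this form is the clearest one to show here, since StarGANv2-style models instead predict the scale and shift from a learned style code. The flat lists standing in for feature maps and the function names are hypothetical.

```python
import math

def mean_std(values, eps=1e-5):
    """Mean and standard deviation of one channel (eps avoids division by zero)."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, math.sqrt(var + eps)

def adain(content, style):
    """Normalize the content channel, then rescale it to the style statistics."""
    c_mu, c_sigma = mean_std(content)
    s_mu, s_sigma = mean_std(style)
    return [s_sigma * (v - c_mu) / c_sigma + s_mu for v in content]

# Hypothetical single-channel activations for a content and a style image.
stylized = adain([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
out_mu, out_sigma = mean_std(stylized, eps=0.0)
```

The output keeps the spatial ordering of the content activations but takes on the mean and spread of the style activations.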
Adversarial Identity Injection for Semantic Face Image Synthesis
Nowadays, deep learning models have reached incredible performance in the task of image generation. Many works in the literature address the task of face generation and editing, to the point that both humans and automatic systems struggle to distinguish real images from generated ones. Whereas most systems have reached excellent visual generation quality, they still face difficulties in preserving the identity of the starting input subject. Among all the explored techniques, Semantic Image Synthesis (SIS) methods, whose goal is to generate an image conditioned on a semantic segmentation mask, are the most promising, even though preserving the perceived identity of the input subject is not their main concern. Therefore, in this paper, we investigate the problem of identity preservation in face image generation and present an SIS architecture that exploits a cross-attention mechanism to merge identity, style, and semantic features to generate faces whose identities are as similar as possible to the input ones. Experimental results reveal that the proposed method is not only suitable for preserving the identity but is also effective as a face recognition adversarial attack, i.e., hiding a second identity in the generated faces.
MetalGAN: Multi-domain label-less image synthesis using cGANs and meta-learning
Image synthesis is currently one of the most addressed image processing topics in the computer vision and deep learning fields of study. Researchers have tackled this problem by focusing their efforts on its several challenging aspects, e.g. image quality and size, domain and pose changing, architecture of the networks, and so on. Above all, producing images belonging to different domains by using a single architecture is a very relevant goal for image generation. In fact, a single multi-domain network would allow greater flexibility and robustness in the image synthesis task than other approaches. This paper proposes a novel architecture and a training algorithm, which are able to produce multi-domain outputs using a single network. A small portion of a dataset is intentionally used, and there are no hard-coded labels (or classes). This is achieved by combining a conditional Generative Adversarial Network (cGAN) for image generation and a Meta-Learning algorithm for domain switch, and we call our approach MetalGAN. The approach has proved to be appropriate for solving the multi-domain label-less problem and it is validated on facial attribute transfer, using the CelebA dataset.
Face Synthesis with a Focus on Facial Attributes Translation using Attention Mechanisms
Synthesis of face images by translating facial attributes is an important problem in computer vision and biometrics and has a wide range of applications in forensics, entertainment, etc. Recent advances in deep generative networks have made progress in synthesizing face images with certain target facial attributes. However, visualizing and interpreting generative adversarial networks (GANs) is a relatively unexplored area and generative models are still being employed as black-box tools. This paper takes the first step to visually interpret conditional GANs for facial attribute translation by using a gradient-based attention mechanism. Next, a key innovation is the inclusion of new learning objectives for knowledge distillation using attention in generative adversarial training, which result in improved synthesized faces, reduced visual confusion, and better GAN training. Firstly, visual attentions are calculated to provide interpretations for GANs. Secondly, gradient-based visual attentions are used as knowledge to be distilled in a teacher-student paradigm for face synthesis, with a focus on facial attribute translation tasks, in order to improve the performance of the model. Finally, it is shown how “pseudo”-attention knowledge distillation can be employed during the training of face synthesis networks when teacher and student networks are trained to generate different facial attributes. The approach is validated on facial attribute translation and human expression synthesis, with both qualitative and quantitative results being presented.
Semantic Image Synthesis via Class-Adaptive Cross-Attention
In semantic image synthesis, the state of the art is dominated by methods that use customized variants of the SPatially-Adaptive DE-normalization (SPADE) layers, which allow for good visual generation quality and editing versatility. By design, such layers learn pixel-wise modulation parameters to de-normalize the generator activations based on the semantic class each pixel belongs to. Thus, they tend to overlook global image statistics, ultimately leading to unconvincing local style editing and causing global inconsistencies such as color or illumination distribution shifts. Also, SPADE layers require the semantic segmentation mask for mapping styles in the generator, preventing shape manipulations without manual intervention. In response, we designed a novel architecture where cross-attention layers are used in place of SPADE for learning shape-style correlations and so conditioning the image generation process. Our model inherits the versatility of SPADE while obtaining state-of-the-art generation quality, improving the FID score by 5.6%, 1.4% and 3.4% on the CelebAMask-HQ, ADE20K and DeepFashion datasets respectively, as well as improved global and local style transfer. Code and models are available at https://github.com/TFonta/CA2SIS
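The cross-attention operation this abstract builds on can be sketched as plain scaled dot-product attention, softmax(Q K^T / sqrt(d)) V. This is a generic illustration, not the architecture from the paper or its repository: treating queries as shape features and keys/values as style features is an assumption, and the learned projections a real layer would include are omitted.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Each query attends over all keys and returns a weighted mix of the values."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[t] for w, v in zip(weights, values))
                 for t in range(len(values[0]))]
        outputs.append(mixed)
    return outputs

# Hypothetical features: one query that matches both keys equally well.
balanced = cross_attention([[1.0, 0.0]],
                           [[1.0, 0.0], [1.0, 0.0]],
                           [[2.0, 0.0], [4.0, 0.0]])
# One query that matches the first key far better than the second.
peaked = cross_attention([[10.0, 0.0]],
                         [[10.0, 0.0], [-10.0, 0.0]],
                         [[1.0], [0.0]])
```

Because every query can draw on every value, the output at one location can depend on information from the whole input, in contrast to the pixel-wise modulation described for SPADE.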
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
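The two counting schemes compared in this abstract can be sketched as below. This is a minimal illustration and not the study's procedure: the author names and reference lists are hypothetical, and it is an assumption that two authors count as co-cited once per citing paper, including when they are co-authors of the same cited work.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs cited together, using the first max_authors of each cited work."""
    counts = Counter()
    for cited_works in citing_papers:
        authors = set()
        for author_list in cited_works:
            authors.update(author_list[:max_authors])
        # Each unordered pair of cited authors is counted once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts

# Hypothetical data: each citing paper is a list of cited works,
# and each cited work is an ordered list of its authors.
papers = [
    [["Adams", "Baker"], ["Chen"]],
    [["Chen", "Diaz"], ["Adams"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

On this toy data, first-author counting sees only the Adams-Chen pair, while first-five counting also links Baker and Diaz into the network.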
