Discriminative feature learning for multimodal classification

Author: Calefati Alessandro
Publication date: 01/01/2019

The purpose of this thesis is to tackle two related topics: multimodal classification and objective functions that improve the discriminative power of features. First, I worked on image and text classification tasks and performed many experiments to show the effectiveness of different approaches available in the literature. Then, I introduced a novel methodology that classifies multimodal documents using single-modal classifiers by merging textual and visual information into images, and a novel loss function that improves separability between samples of a dataset. Results show that exploiting multimodal data improves performance on classification tasks compared with traditional single-modality methods. Moreover, the introduced GIT loss function enhances the discriminative power of features, lowering intra-class distance and raising inter-class distance between samples of a multiclass dataset.

InsubriaSPACE - Thesis PhD Repository

Archivio istituzionale della ricerca - Università dell'Insubria

InsubriaSPACE

Multimodal Classification Fusion in Real-World Scenarios

Author: Alessandro Calefati
Shah Nawaz
Ignazio Gallo
Publication date: 01/01/2017

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Semantic Text Encoding for Text Classification Using Convolutional Neural Networks

Author: Alessandro Calefati
Shah Nawaz
Ignazio Gallo
Publication date: 01/01/2017

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Using convolutional neural networks for content extraction from online flyers

Author: Alessandro Zamberletti
Alessandro Calefati
Ignazio Gallo
Lucia Noce
Publication date: 01/01/2016

The rise of online shopping has hurt physical retailers, which struggle to persuade customers to buy products in physical stores rather than online. Marketing flyers are a great means of increasing the visibility of physical retailers, but the unstructured offers appearing in those documents cannot easily be compared with similar online deals, making it hard for a customer to understand whether it is more convenient to order a product online or to buy it from the physical shop. In this work we tackle this problem by introducing a content extraction algorithm that automatically extracts structured data from flyers. Unlike competing approaches that mainly focus on textual content or simply analyze font type, color and text positioning, we propose a new approach that uses Convolutional Neural Networks to classify words extracted from flyers typically used in marketing materials to attract the attention of readers towards specific deals. We obtained good results and a high degree of language and genre independence.

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Embedded Textual Content for Document Image Classification with Convolutional Neural Networks

Author: Alessandro Zamberletti
Alessandro Calefati
Ignazio Gallo
Lucia Noce
Publication date: 01/01/2016

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Hand written characters recognition via deep metric learning

Author: Alessandro Calefati
Shah Nawaz
Nisar Ahmed
Ignazio Gallo
Publication date: 01/01/2018

Deep metric learning plays an important role in measuring similarity through distance metrics among arbitrary groups of data. The MNIST dataset is typically used to measure similarity; however, it has few seemingly similar classes, making it less effective for evaluating deep metric learning methods. In this paper, we created a new handwritten dataset named Urdu-Characters with a set of classes suitable for deep metric learning. With this work, we compare the performance of two state-of-the-art deep metric learning methods, i.e. Siamese and Triplet networks. We show that a Triplet network is more powerful than a Siamese network. In addition, we show that the performance of a Triplet or Siamese network can be improved by using more powerful underlying Convolutional Neural Network architectures.
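
As a rough illustration of the two training signals compared in this abstract, the sketch below contrasts a pairwise (contrastive) loss, as commonly used with Siamese networks, with a triplet loss. It is a minimal sketch, not the authors' implementation: the function names, the margin value and the toy embeddings are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same_class, margin=1.0):
    """Pairwise loss: pull same-class pairs together, push others past the margin."""
    d = euclidean(a, b)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss asking the negative to be at least `margin` farther than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy embeddings (hypothetical values chosen for easy arithmetic).
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]

well_separated = triplet_loss(anchor, positive, negative)  # negative already far away
violated = triplet_loss(anchor, negative, positive)        # roles swapped: large loss
```

The contrastive loss judges each pair in isolation, while the triplet loss only constrains the relative ordering of distances to one anchor.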

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Aiding intra-text representations with visual context for multimodal named entity recognition

Author: Ignazio Gallo
Omer Arshad
Shah Nawaz
Alessandro Calefati
Publication date: 01/01/2019

With the massive explosion of social media platforms such as Twitter and Instagram, people share billions of multimedia posts containing images and text every day. Typically, the text in these posts is short, informal and noisy, leading to ambiguities that can be resolved using images. In this paper we explore the text-centric Named Entity Recognition task on these multimedia posts. We propose an end-to-end model that learns a joint representation of a text and an image. Our model extends the multi-dimensional self-attention technique so that the image helps to enhance the relationships between words. Experiments show that our model is capable of capturing both textual and visual contexts with greater accuracy, achieving state-of-the-art results on a Twitter multimodal Named Entity Recognition dataset.

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

A query and product suggestion method for price comparison search engines

Author: Alessandro Zamberletti
Alessandro Calefati
Lucia Noce
Ignazio Gallo
Publication date: 01/01/2017

In this paper we propose a query suggestion method for price comparison search engines. Query suggestion techniques generate alternative queries to help web users seek information; in this specific domain, suggestions must be generated taking into account that the suggested products must still be available for sale. We propose a novel approach based on a slight variant of classical query-URL graphs: the query-product click-through bipartite graph. Information extracted from both search engine logs and domain-specific features is exploited to build the graph, and one advantage of this model is that the graph can be used to suggest not only related queries but also related products. The concepts used in the proposed method are not restricted to our context but are used in many other major e-commerce and search engine websites. We tested the model on several challenging datasets and compared it with a recent query suggestion approach specifically designed for price comparison engines. Our solution outperforms the competing approach, achieving higher results in terms of relevance of the provided suggestions and coverage rates on top-8 suggestions.
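
To make the idea of a query-product click-through bipartite graph concrete, here is a minimal sketch assuming the log is a list of (query, product) click pairs. The scoring rule (shared click mass through still-available products), the function names and the sample log are hypothetical choices for this example; the abstract does not specify the paper's actual ranking function.

```python
from collections import defaultdict

def build_graph(click_log):
    """Build both sides of a query-product bipartite graph from (query, product) clicks."""
    q2p = defaultdict(lambda: defaultdict(int))
    p2q = defaultdict(lambda: defaultdict(int))
    for query, product in click_log:
        q2p[query][product] += 1
        p2q[product][query] += 1
    return q2p, p2q

def suggest_queries(query, q2p, p2q, available, k=8):
    """Rank other queries by click mass shared through products still for sale."""
    scores = defaultdict(int)
    for product, clicks in q2p.get(query, {}).items():
        if product not in available:
            continue  # skip products that are no longer available
        for other, other_clicks in p2q[product].items():
            if other != query:
                scores[other] += min(clicks, other_clicks)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda q: (-scores[q], q))[:k]

# Hypothetical click log: product p3 links "iphone 8" to "iphone case".
log = [("iphone 8", "p1"), ("iphone 8", "p1"), ("apple phone", "p1"),
       ("apple phone", "p2"), ("iphone case", "p3"), ("iphone 8", "p3")]
q2p, p2q = build_graph(log)

only_p1_p2 = suggest_queries("iphone 8", q2p, p2q, available={"p1", "p2"})
with_p3 = suggest_queries("iphone 8", q2p, p2q, available={"p1", "p3"})
```

Because the graph is bipartite, the same two-hop walk started from a product instead of a query would yield related products.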

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Deep Latent Space Learning for Cross-Modal Mapping of Audio and Visual Signals

Author: Alessandro Calefati
Shah Nawaz
Arif Mahmood
Ignazio Gallo
Muhammad Kamran Janjua
Publication date: 01/01/2019

We propose a novel deep training algorithm for joint representation of audio and visual information, which consists of a single stream network (SSNet) coupled with a novel loss function to learn a shared deep latent space representation of multimodal information. The proposed framework characterizes the shared latent space by leveraging the class centers, which helps to eliminate the need for pairwise or triplet supervision. We quantitatively and qualitatively evaluate the proposed approach on VoxCeleb, a benchmark audio-visual dataset, on a multitude of tasks including cross-modal verification, cross-modal matching and cross-modal retrieval. State-of-the-art performance is achieved on cross-modal verification and matching, while comparable results are observed on the remaining applications. Our experiments demonstrate the effectiveness of the technique for cross-modal biometric applications.
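
The role of class centers can be illustrated with a plain center-style loss: every embedding is pulled towards the center of its class, so no pairs or triplets need to be sampled. This is only a sketch of the general idea, not the loss proposed in the paper. It assumes that centers are the per-class means of the current batch (in practice they are usually learned or updated incrementally), and the labels and embeddings are toy values.

```python
def class_centers(embeddings, labels):
    """Compute one center per class as the mean of that class's embeddings."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}

def center_loss(embeddings, labels, centers):
    """Half the mean squared distance between each embedding and its class center."""
    total = 0.0
    for vec, label in zip(embeddings, labels):
        total += sum((x - c) ** 2 for x, c in zip(vec, centers[label]))
    return 0.5 * total / len(embeddings)

# Toy batch: embeddings from any modality share the identity label of their class.
embeddings = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]]
labels = ["a", "a", "b", "b"]

centers = class_centers(embeddings, labels)
loss = center_loss(embeddings, labels, centers)
```

Supervision here needs only a class label per sample, which is why audio and visual embeddings of the same identity can be trained towards one shared center.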

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria

Do cross modal systems leverage semantic relationships?

Author: Ignazio Gallo
Shah Nawaz
Alessandro Calefati
Arif Mahmood
M. Kamran Janjua
Faisal Shafait
Publication date: 01/01/2019

Current cross modal retrieval systems are evaluated using the R@K measure, which does not leverage semantic relationships but strictly follows the manually marked image-text query pairs. Therefore, current systems do not generalize well to unseen data in the wild. To handle this, we propose a new measure, SemanticMap, to evaluate the performance of cross modal systems. Our proposed measure evaluates the semantic similarity between the image and text representations in the latent embedding space. We also propose a novel cross modal retrieval system using a single stream network for bidirectional retrieval. The proposed system is based on a deep neural network trained using an extended center loss, minimizing the distance of image and text descriptions in the latent space from the class centers. In our system, the text descriptions are also encoded as images, which enables us to use a single stream network for both text and images. To the best of our knowledge, our work is the first of its kind in terms of employing a single stream network for cross modal retrieval systems. The proposed system is evaluated on two publicly available datasets, MSCOCO and Flickr30K, and has shown results comparable to the current state-of-the-art methods.
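
For readers unfamiliar with R@K, the sketch below shows how the measure is typically computed and why it ignores semantic relatedness: a query counts as a hit only if its single annotated match appears in the top K results. The data structures and item names are hypothetical, and the SemanticMap measure itself is not reproduced here because the abstract does not define it.

```python
def recall_at_k(ranked_lists, ground_truth, k):
    """Fraction of queries whose annotated match appears among the top-k results."""
    hits = sum(
        1 for query, ranking in ranked_lists.items()
        if ground_truth[query] in ranking[:k]
    )
    return hits / len(ranked_lists)

# Hypothetical rankings: for q1 the system returns img3 first. Even if img3 is
# semantically relevant to q1, R@1 counts it as a miss because only img1 is annotated.
rankings = {
    "q1": ["img3", "img1", "img2"],
    "q2": ["img2", "img1", "img3"],
}
annotated = {"q1": "img1", "q2": "img3"}

r_at_1 = recall_at_k(rankings, annotated, 1)
r_at_2 = recall_at_k(rankings, annotated, 2)
r_at_3 = recall_at_k(rankings, annotated, 3)
```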

Crossref

Archivio istituzionale della ricerca - Università dell'Insubria