
    Incremental models based on features persistence for object recognition

    Object recognition has regained a high level of attention in recent years, with the application of deep convolutional neural networks to classification tasks. However, the problem of recognising objects for which a limited number of images is available is still open. In this paper, we propose a view-based object recognition method which can deal with objects represented by a handful of images. Salient points are extracted from the images, and a persistence value is defined for each point and updated as new images are added. An object model is built and refined on the basis of salient point persistence, where points with high persistence have priority over those with low persistence. The model can then be used to match a single image of an object. We demonstrate the efficacy of the proposed methodology on a dataset made of a collection of objects of cultural interest. We show that the recognition performance of the proposed method is superior to that of a competing methodology based on Bag-of-Words
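    The abstract does not define the persistence value, so the following is only a minimal sketch under a stated assumption: persistence is taken to be the number of images in which a salient point has been re-detected. The function names, the point ids, and the `max_points` cut-off are hypothetical and are not the authors' method.

```python
from collections import Counter

def update_persistence(persistence, matched_ids, new_ids):
    """Update per-point persistence after one new image is added.

    persistence: Counter mapping point id -> number of images it was seen in
                 (assumed definition, not taken from the paper).
    matched_ids: ids of existing model points re-detected in the new image.
    new_ids:     ids of salient points observed for the first time.
    """
    for pid in matched_ids:
        persistence[pid] += 1
    for pid in new_ids:
        persistence.setdefault(pid, 1)
    return persistence

def model_points(persistence, max_points):
    """Keep the most persistent points; ties are broken by point id."""
    ranked = sorted(persistence.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pid for pid, _ in ranked[:max_points]]
```

    With this definition, points seen in many views rise to the top of the model, which mirrors the priority of high-persistence points described above.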

    Foveated Vision for Deepface Recognition

    In the last decade, deep learning techniques have strongly influenced many aspects of computational vision. Many difficult vision tasks can now be performed by deploying a properly tailored and trained deep network. The enthusiasm for deep learning is unfortunately paired with a lack of clear understanding of how these networks work and why they perform so well. The same applies to biometric systems. Deep learning has been successfully applied to several biometric recognition tasks, including face recognition. VGG-face is possibly the first deep convolutional network designed to perform face recognition, obtaining unsurpassed performance at the time it was first proposed. In recent years, several more complex deep convolutional networks, trained on very large, mainly private, datasets, have been proposed, raising the performance bar further even on quite challenging public databases, such as the Janus IJB-A and IJB-B. Despite the progress in the development of such networks and the advances in learning algorithms, insight into these networks is still very limited. For this reason, in this paper we analyse a biologically-inspired network based on the HMAX model, not with the aim of pushing the recognition performance further, but to better understand the representation space produced by including the retino-cortical mapping performed by the log-polar image resampling.

    Foveated Vision for Biologically Inspired Continuous Face Authentication

    In everyday life, whenever people observe, interact with or speak to each other, visual attention is mostly directed toward the other person's face, particularly to the eyes and the nearby periocular regions. The same happens when users interact with their mobile phones in common activities such as web access, payments and video calls. For this reason, the functionality of mobile devices is strongly affected by the design of the user interface. In this chapter, we propose a biologically inspired approach for continuous user authentication based on the analysis of the ocular regions. The proposed system is based on a modified version of the HMAX visual processing module. HMAX is a hierarchical model which has been conceived to mimic the basic neural architecture of the ventral stream of the visual cortex. The original HMAX model consists of four layers: S1, C1, S2 and C2. S1 and C1 represent the responses to a bank of orientation-selective Gabor filters. S2 and C2 represent the responses of simple and complex cells to other textural features. The discrimination power of HMAX in recognizing classes of objects is invariant to rotation and scale. The C1 layer, which is mainly responsible for the scale and rotation invariance, is implemented using a max-pooling operation, which may lose some spatial information. To overcome this problem while preserving the maximal visual acuity and hence the localization accuracy, we propose to augment the model by applying a retinal log-polar mapping. The log-polar mapping is an approximation of the retino-cortical mapping that is performed by the early stages of the primate visual system. Due to the high density of the cones in the fovea, the log-polar approximation of the space-variant distribution model of the photoreceptors can only be applied outside the foveal region. Therefore, the log-polar mapping is added to the HMAX model as a complementary stage to process the peripheral region of the grabbed images.
In order to demonstrate the feasibility of the proposed approach in mobile scenarios, experimental results obtained from publicly available databases and image streams grabbed from mobile devices will be presented.
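
The log-polar resampling mentioned in the abstract can be illustrated with a minimal, standard-library sketch. It is not the chapter's implementation: the function name, the nearest-neighbour sampling, and the parameters `n_rings`, `n_wedges` and `r_min` are assumptions made for illustration. Here `r_min` stands in for the radius of the foveal region, so only the periphery is resampled.

```python
import math

def log_polar_sample(image, n_rings, n_wedges, r_min=1.0):
    """Resample a 2-D list `image` onto a log-polar grid centred on the image.

    Output rows index rings (radii spaced geometrically from r_min to the
    largest radius that fits in the image); output columns index angular
    wedges. Requires n_rings >= 2 and 0 < r_min <= that largest radius.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cx, cy)
    out = []
    for i in range(n_rings):
        # Geometric radii: ring spacing widens with eccentricity.
        r = r_min * (r_max / r_min) ** (i / (n_rings - 1))
        row = []
        for j in range(n_wedges):
            theta = 2.0 * math.pi * j / n_wedges
            # Nearest-neighbour lookup keeps the sketch dependency-free.
            y = int(round(cy + r * math.sin(theta)))
            x = int(round(cx + r * math.cos(theta)))
            row.append(image[y][x])
        out.append(row)
    return out
```

Because the radii grow geometrically, pixels near the centre are sampled densely and peripheral pixels sparsely, which is the space-variant behaviour the log-polar mapping is meant to approximate.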

    Empirical Grounded Simulation Models for Make-To-Order (MTO) Supply Chains: An Application in the Furniture Industry

    The Make-to-Order (MTO) supply chain seeks to balance cost reduction with satisfactory customer service, especially concerning order lead times. Simulation plays a vital role in this balance, identifying risks, analyzing scenarios, and evaluating key performance indicators. However, existing simulation models often overlook suppliers and inventory management, focusing more on production, sales, and distribution. To address this, a simulation model tailored for the furniture industry integrates supplier selection with inventory management strategies, considering geographical complexities. Through a case study, various scenarios are assessed, revealing a trade-off between lead times and costs. Close MTO suppliers decrease lead times but increase costs due to transportation expenses, while distant sourcing minimizes costs but extends lead times, challenging customer expectations. This simulation model offers insights for MTO companies navigating supplier selection and inventory management, enhancing decision-making and customer satisfaction. Future research aims to expand the model into a supply chain Digital Twin, incorporating resilience and risk management to tackle broader MTO supply chain challenges

    Assessing bias and computational efficiency in vision transformers using early exits

    Face recognition with deep learning is generally approached as a problem of capacity. The field has seen progressively deeper, more complex models or larger, more highly variant data sets. The data sets can be problematic, as they are often scraped indiscriminately from the internet. This results in an uncertain, and often heavily unbalanced, distribution of race, gender, age and other aspects of the subjects, which is then manifested in the decisions of the models trained on them. The carbon footprint of machine learning is also a concern. A real push is developing to reduce the energy consumption of machine learning as we strive for a more eco-friendly society. In addition, due to many instances of misuse by law enforcement and other agencies, unbiased models for face recognition are now fundamental to the practical application of the field. First, we present an approach using the state-of-the-art Vision Transformer and Early Exits for reducing compute budget without significantly affecting performance. We develop a system for face recognition and identification with a closed-set gallery and show that with a small reduction in performance, a reasonable reduction in compute cost can be obtained using our method. Second, we investigate how these Early Exits interact with the bias model through a robust evaluation of matching scores on a racially balanced data set. We show that matching scores vary heavily between cohorts, and these variations are magnified at the earlier exits.

    Exploiting Face Recognizability with Early Exit Vision Transformers

    Face recognition with Deep Learning is generally approached as a problem of capacity. The field has seen progressively deeper, more complex models or larger, more highly variant datasets. However, the carbon footprint of machine learning (ML) is a concern. A real push is developing to reduce the energy consumption of ML as we strive for a more eco-friendly society. Lower energy consumption or compute budget is always desirable, if accuracy is not reduced below a usable level. We present an approach using the state of the art Vision Transformer and Early Exits for reducing compute budget without significantly affecting performance. We develop a system for face recognition and identification with a closed-set gallery and show that with a small reduction in performance, a reasonable reduction in FLOPs can be obtained using our method
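
    The abstract does not state the exit criterion, so the sketch below assumes a common one: stop at the first exit whose top softmax probability reaches a threshold. The function names and the `threshold` value are hypothetical, and plain lists of logits stand in for the outputs of the transformer's exit heads.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_predict(exit_logits, threshold=0.9):
    """Return (predicted_index, exit_used) for one sample.

    `exit_logits` lists per-exit logit vectors from the shallowest exit to
    the final head. In a real model each vector would be computed lazily,
    so stopping early skips the remaining blocks and their FLOPs.
    """
    for depth, logits in enumerate(exit_logits):
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        is_last = depth == len(exit_logits) - 1
        # Exit when confident enough, or fall back to the final head.
        if probs[best] >= threshold or is_last:
            return best, depth
```

    Raising `threshold` sends more samples to deeper exits, trading compute for accuracy; lowering it does the opposite.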

    Circular Second-Hand Apparel Business: An Empirical Study from the Retailers' Perspective

    The second-hand apparel market has experienced significant growth in recent years, driven by increasing consumer awareness of sustainability, affordability, and value retention. This expansion aligns with the broader shift in the fashion industry towards circular economy principles, as consumers seek alternatives such as renting, repairing, exchanging, and purchasing pre-owned garments to reduce the environmental impact of the sector. Despite the rising popularity of second-hand apparel, most existing research has primarily focused on consumer behavior and business models for online platforms, while the perspective of physical retailers remains underexplored. Given the crucial role retailers are expected to play in this evolving market, this study aims to address this research gap by conducting an exploratory analysis of physical retailers in the second-hand apparel sector, investigating their procurement strategies and business expectations. To this end, a questionnaire survey was designed and then administered to a sample of second-hand retailers in Torino (Italy). The collected data were first evaluated using Cronbach’s Alpha coefficient and then analyzed with the Kruskal-Wallis test to assess procurement strategies, revenue growth, and trends in product features. The findings indicate that retailers rely on diverse sourcing channels, with vintage clothing emerging as the most profitable category, demonstrating strong revenue potential. Furthermore, retailers located in the city centers, with a large number of employees, show a more significant increase in revenues. However, despite the market’s growth, the adoption of online sales by the respondents remains limited. The study highlights key business enablers and emphasizes the potential for further development of the sector at issue. 
From a theoretical point of view, this research may enlarge the body of knowledge on the promising second-hand apparel market by incorporating the viewpoint of physical retailers. From a practical perspective, the findings of the study provide retailers with effective levers that might improve their business
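
For readers unfamiliar with the reliability check mentioned in the abstract, Cronbach's alpha for a questionnaire with k items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total scores). A minimal sketch follows; the function name and the input layout are assumptions for illustration and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances throughout. Needs at least two respondents,
    at least two items, and a non-zero variance of the total scores.
    """
    k = len(responses[0])
    items = list(zip(*responses))          # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)
```

Values close to 1 indicate that the items vary together across respondents; items that move independently of each other push the coefficient toward 0.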

    On the correlation between human fixations, handcrafted and CNN features

    Traditional local image descriptors such as SIFT and SURF are based on processing similar to that which takes place in the early visual cortex. Nowadays, convolutional neural networks still draw inspiration from the human vision system, integrating computational elements typical of higher visual cortical areas. Deep CNN architectures are intrinsically hard to interpret, so much effort has been made to dissect them in order to understand which type of features they learn. However, despite the resemblance to the human vision system, not enough attention has been devoted to understanding whether the image features learned by deep CNNs and used for classification correlate with the features that humans select when viewing images, the so-called human fixations, or whether they correlate with earlier handcrafted features such as SIFT and SURF. Exploring these correlations is highly meaningful, since what we require from CNNs, and from features in general, is to recognize and correctly classify objects or subjects relevant to humans. In this paper, we establish the correlation between three families of image interest points: human fixations, handcrafted features and CNN features. We extract features from the feature maps of selected layers of several deep CNN architectures, from the shallowest to the deepest. All features and fixations are then compared with two types of measures, global and local, which unveil the degree of similarity of the areas of interest of the three families. The experiments carried out on the ETD human fixations database show that human fixations are positively correlated with handcrafted features, and even more with the deep layers of CNNs, and that handcrafted features correlate highly with one another, as do some CNNs.