
    Beyond Relevance Feedback for Searching and Exploring Large Multimedia Collections

    No full text
    Relevance feedback was introduced over twenty years ago as a powerful tool for interactive retrieval and is still the dominant mode of interaction in multimedia retrieval systems. Over the years methods have improved, and recently relevance feedback has become feasible on even the largest collections available in the multimedia community. Yet relevance feedback typically targets the optimization of linear lists of search results and thus focuses on only one of the many tasks on the search-explore axis. Truly interactive retrieval systems have to consider the whole axis, and interactive categorization is an overarching framework for many of those tasks. The multimedia analytics system MediaTable exploits this to support users in gaining insight into large image collections. Categorization as a representation of the collection and user tasks does not capture the relations between items in the collection the way graphs do. Hypergraphs combine categories and relations in one model and, as they are founded in set theory, are in fact closely related to categorization. They therefore provide an elegant framework to move forward. In this talk we highlight the progress that has been made in the field of interactive retrieval and in the direction of multimedia analytics. We will further consider the promises that new results in deep learning, especially in the context of graph convolutional networks and hypergraphs, might bring to go beyond relevance feedback.
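    The link between hypergraphs and categorization mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hypergraph is stored as named sets of items, so that a category and a relation are both just hyperedges; the class name `Hypergraph`, its methods, and the image identifiers are hypothetical and are not taken from the talk or from MediaTable.

```python
from collections import defaultdict


class Hypergraph:
    """Items are nodes; each hyperedge is a named set of items.

    A category ("beach") and a relation ("same_event") are stored the
    same way, which is the set-theoretic connection the abstract points at.
    """

    def __init__(self):
        self.edges = defaultdict(set)       # hyperedge name -> set of items
        self.membership = defaultdict(set)  # item -> set of hyperedge names

    def add(self, edge, items):
        """Add every item in `items` to the hyperedge called `edge`."""
        for item in items:
            self.edges[edge].add(item)
            self.membership[item].add(edge)

    def neighbours(self, item):
        """Return all items that share at least one hyperedge with `item`."""
        related = set()
        for edge in self.membership.get(item, ()):
            related |= self.edges[edge]
        related.discard(item)
        return related


hg = Hypergraph()
hg.add("beach", ["img1", "img2", "img3"])  # a category (hypothetical data)
hg.add("same_event", ["img2", "img4"])     # a relation among items
print(sorted(hg.neighbours("img2")))       # ['img1', 'img3', 'img4']
```

    Because both kinds of structure are plain sets, ordinary set operations (intersection, union) answer questions such as "which items are in this category and also part of that relation".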

    TindART: A Personal Visual Arts Recommender

    No full text
    We present TindART - a comprehensive visual arts recommender system. TindART leverages real-time user input to build a user-centric preference model based on content and demographic features. Our system is coupled with visual analytics controls that allow users to gain a deeper understanding of their art taste and further refine their personal recommendation model. The content-based features in TindART are extracted using a multi-task learning deep neural network which accounts for a link between multiple descriptive attributes and the content they represent. Our demographic engine is powered by social media integrations such as the Google, Facebook and Twitter profiles that users can log in with. Both the content and the demographics power a recommender system whose decision-making process is visualized through our web t-SNE implementation. TindART is live and available at: https://tindart.net/

    Towards Automated Diagnosis with Attentive Multi-modal Learning Using Electronic Health Records and Chest X-Rays

    No full text
    Jointly learning from Electronic Health Records (EHR) and medical images is a promising area of research in deep learning for medical imaging. Using the context available in EHR together with medical images can lead to more efficient data usage. Recent work has shown that jointly learning from EHR and medical images can indeed improve performance on several tasks. Current methods are, however, still not independent of clinician input. To obtain an automated method, only prior patient information should be used together with a medical image, without reliance on further clinician input. In this paper we propose an automated multi-modal method which creates a joint feature representation based on prior patient information from EHR and the associated X-ray scan. This feature representation, which joins the two different modalities through attention, leverages the contextual relationship between the modalities. This method is used to perform two tasks: diagnosis classification and free-text diagnosis generation. We show the benefit of the multi-modal approach over single-modality approaches on both tasks.

    A survey of computational methods for iconic image analysis

    Full text link
    Digitization and digitalization efforts have led to an explosive growth of the number of images that are published, shared, and made available in collections. In turn, this has resulted in increased awareness of, and interest in, computational methods for automatic image analysis. Despite the tremendous progress made in the development of computational methods, there remains a gap between how a person interprets an image and what can be automatically extracted. By considering iconic images as those images for which this gap is most salient, as their meaning goes well beyond what is represented in the visual data, this article gives an overview of the potential and limitations of computational methods for iconic image analysis. I structure this overview by discussing methods that can be used to analyse the production, distribution, and reception of iconic images. Although the majority of computational methods focus on analysing production aspects, there are promising methods for image distribution aspects, whereas methods for studying image reception have received little attention. By considering the limitations of available methods, I argue that computational methods can be of use for studying iconic images, but that comprehensive analysis will require methods that incorporate the plurality of meanings an image can have, and the temporal nature thereof.

    Detecting CNN-Generated Facial Images in Real-World Scenarios

    Full text link
    Artificial, CNN-generated images are now of such high quality that humans have trouble distinguishing them from real images. Several algorithmic detection methods have been proposed, but these appear to generalize poorly to data from unknown sources, making them infeasible for real-world scenarios. In this work, we present a framework for evaluating detection methods under real-world conditions, consisting of cross-model, cross-data, and post-processing evaluation, and we evaluate state-of-the-art detection methods using the proposed framework. Furthermore, we examine the usefulness of commonly used image pre-processing methods. Lastly, we evaluate human performance on detecting CNN-generated images, along with factors that influence this performance, by conducting an online survey. Our results suggest that CNN-based detection methods are not yet robust enough to be used in real-world scenarios.

    Change and diversity

    Full text link
    This editorial announces several changes to FSI's Digital Investigation journal and presents our commitment to diversity.

    Search and Explore Strategies for Interactive Analysis of Real-Life Image Collections with Unknown and Unique Categories

    No full text
    Many real-life image collections contain image categories that are unique to that specific image collection and have not been seen before by any human expert analyst or by a machine. This prevents supervised machine learning from being effective and makes evaluation of such an image collection inefficient. Real-life collections ask for a multimedia analytics solution where the expert searches and explores the image collection, supported by machine learning algorithms. We propose a method that covers both exploration and search strategies for such complex image collections. Several strategies are evaluated through an artificial user model. Two user studies were performed, with experts and students respectively, to validate the proposed method. As evaluation of such a method can only be done properly in a real-life application, the proposed method is applied to the MH17 airplane crash photo database, on which we have expert knowledge. To show that the proposed method also helps with other image collections, an image collection created with the Open Image Database is used. We show that by combining image features extracted with a convolutional neural network pretrained on ImageNet 1k, intelligent use of clustering, a well-chosen strategy and expert knowledge, an image collection such as the MH17 airplane crash photo database can be interactively structured into relevant, dynamically generated categories, allowing the user to analyse an image collection efficiently.

    A new model for forensic data extraction from encrypted mobile devices

    Full text link
    In modern criminal investigations, mobile devices are seized at every type of crime scene, and the data on those devices often becomes critical evidence in the case. Various mobile forensic techniques have been established and evaluated through research in order to extract possible evidence data from devices over the decades. However, as mobile devices become essential tools for daily life, security and privacy concerns grow, and modern smartphone vendors have implemented multiple types of security protection measures - such as encryption - to guard against unauthorized access to the data on their products. This trend makes forensic acquisition harder than before, and data extraction from those devices for criminal investigation is becoming a more challenging task. Today, mobile forensic research focuses on identifying more invasive techniques, such as bypassing security features, and breaking into target smartphones by exploiting their vulnerabilities. In this paper, we explain the increased encryption and security protection measures in modern mobile devices and their impact on traditional forensic data extraction techniques for law enforcement purposes. We demonstrate that in order to overcome encryption challenges, new mobile forensic methods rely on bypassing the security features and exploiting system vulnerabilities. A new model for forensic acquisition is proposed. The model is supported by a legal framework focused on the usability of digital evidence obtained through vulnerability exploitation.

    Fusing Structural and Functional MRIs using Graph Convolutional Networks for Autism Classification

    Full text link
    Geometric deep learning methods such as graph convolutional networks have recently proven to deliver generalized solutions in disease prediction using medical imaging. In this paper, we focus particularly on their use in autism classification. Most of the recent methods use graphs to leverage phenotypic information about subjects (patients or healthy controls) as additional contextual information. To do so, metadata such as age, gender and acquisition site are utilized to define intricate relations (edges) between the subjects. We avoid the use of such non-imaging metadata and propose a fully imaging-based approach where information from structural and functional Magnetic Resonance Imaging (MRI) data is fused to construct the edges and nodes of the graph. To characterize each subject, we employ brain summaries. These are 3D images obtained from the 4D spatiotemporal resting-state fMRI data through summarization of the temporal activity of each voxel using neuroscientifically informed temporal measures such as amplitude of low-frequency fluctuations and entropy. Further, to extract features from these 3D brain summaries, we propose a 3D CNN model. We perform analysis on the open dataset for autism research (full ABIDE I-II) and show that by using simple brain summary measures and incorporating sMRI information, there is a noticeable increase in the generalizability and performance values of the framework as compared to state-of-the-art graph-based models.
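    The idea of a brain summary, collapsing the time axis of 4D fMRI data into one value per voxel, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how entropy is estimated, so the fixed-width histogram estimator, the `bins` parameter, the nested-list layout `[x][y][z][t]`, and the function names `temporal_entropy` and `brain_summary` are all hypothetical choices, and the 3D CNN and graph construction are not shown.

```python
import math


def temporal_entropy(series, bins=4):
    """Shannon entropy (in bits) of one voxel's time series.

    Assumption: values are binned into `bins` equal-width bins between the
    series minimum and maximum; the paper's estimator may differ.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        return 0.0  # a constant signal has zero entropy
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in series:
        # Clamp so the maximum value falls into the last bin.
        idx = min(int((value - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def brain_summary(volume_4d, measure=temporal_entropy):
    """Collapse a nested list indexed [x][y][z][t] to [x][y][z]."""
    return [[[measure(voxel) for voxel in row] for row in plane]
            for plane in volume_4d]


# Toy 1 x 1 x 2 volume with four time points per voxel (made-up values).
toy = [[[[5, 5, 5, 5], [0, 1, 2, 3]]]]
summary = brain_summary(toy)
print(summary)  # [[[0.0, 2.0]]]
```

    Passing a different `measure` function (for example one computing a frequency-band amplitude) would yield a different 3D summary from the same 4D input.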

    Calibration of score based likelihood ratio estimation in automated forensic facial image comparison

    Full text link
    Forensic facial image comparison lacks methodological standardization and empirical validation. We aim to address this problem by assessing the potential of machine learning to support the human expert in the courtroom. To yield valid evidence in court, decision-making systems for facial image comparison should not only be accurate, they should also provide a calibrated confidence measure. This confidence is best conveyed using a score-based likelihood ratio. In this study we compare the performance of different calibrations for such scores. The score, either a distance or a similarity, is converted to a likelihood ratio using three types of calibration, following techniques similar to those applied in forensic fields such as speaker comparison and DNA matching, but which have not yet been tested in facial image comparison. The calibration types tested are: naive, quality score based on typicality, and feature-based. As transparency is essential in forensics, we focus on state-of-the-art open software and study its power compared to a state-of-the-art commercial system. With the European Network of Forensic Science Institutes (ENFSI) Proficiency tests as benchmark, calibration results on three public databases, namely Labeled Faces in the Wild, SC Face and ForenFace, show that both quality score and feature-based calibration outperform naive calibration. Overall, the commercial system outperforms open software when evaluating these likelihood ratios. In general, we conclude that calibration implemented before likelihood ratio estimation is recommended. Furthermore, in terms of performance the commercial system is preferred over open software. As open software is more transparent, more research on open software is urged.
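    The conversion of a comparison score into a likelihood ratio can be sketched generically as the ratio of two score densities: one fitted on same-source comparisons and one on different-source comparisons. The sketch below is not the study's calibration procedure. It assumes Gaussian score distributions, and the function name `fit_score_model` and the example similarity scores are hypothetical.

```python
from statistics import NormalDist


def fit_score_model(same_source_scores, different_source_scores):
    """Return a function mapping a similarity score to a likelihood ratio.

    Assumption: both score populations are modelled as Gaussians. Real
    systems often use other density models or logistic regression.
    """
    same = NormalDist.from_samples(same_source_scores)
    diff = NormalDist.from_samples(different_source_scores)

    def likelihood_ratio(score):
        # LR = p(score | same source) / p(score | different source).
        # For scores far outside both training ranges the denominator can
        # underflow to zero; a production system would need to guard this.
        return same.pdf(score) / diff.pdf(score)

    return likelihood_ratio


# Made-up similarity scores for illustration only.
same_scores = [0.82, 0.90, 0.75, 0.88, 0.79]
diff_scores = [0.20, 0.35, 0.15, 0.40, 0.30]
lr = fit_score_model(same_scores, diff_scores)

print(lr(0.85) > 1)  # a high score supports the same-source hypothesis
print(lr(0.25) < 1)  # a low score supports the different-source hypothesis
```

    A likelihood ratio above 1 supports the same-source hypothesis and a value below 1 supports the different-source hypothesis; how well such ratios are calibrated is what the study evaluates.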