Player/Avatar Body Relations in Multimodal Augmented Reality Games
Augmented reality research is finally moving towards multimodal experiences: more and more applications include not only visuals, but also audio and even haptics. The purpose of multimodality in these applications can be to increase realism or to increase the amount or quality of communicated information. One particularly interesting and increasingly important application area is AR gaming, where the player can experience the virtual game integrated into the real environment and interact with it in a multimodal fashion. Currently, many games are set up such that the interaction is local (direct); however, there are many cases in which remote (indirect) interaction will be useful or even necessary. In the latter case, the actions can be expressed through a virtual avatar, while the player's real body is also still perceivably present. The player then controls the motions and actions of the avatar, and receives multimodal feedback associated with the events occurring in the game. Can it be that the player starts to perceive the avatar as (a part of) him- or herself? Or does something even more intense take place? What are the benefits of this experience? The core of this research is to understand how the multimodal perceptual configuration plays a role in the relation between a player and their in-game avatar.
Digital Musicology and MIR: Papers, Projects and Challenges
In this paper we report on the ISMIR 2013 Demo and Late Breaking Session entitled Digital Musicology and MIR. Five papers were discussed as examples of interesting MIR contributions to musicology. Two important projects, Transforming Musicology and CompMusic, were briefly presented. Finally, this paper reports the first results of a questionnaire about challenges from Digital Musicology for MIR research. The most important outcomes are that the lack of suitable musical data is still an important obstacle and that there is a great demand for tools and methods that enable integrated access to, and analysis of, symbolic and audio data.
Proceedings of the 16th International Society for Music Information Retrieval Conference: October 26-30, 2015; Malaga, Spain
Content-based multimedia retrieval: indexing and diversification
The demand for efficient systems that facilitate searching in multimedia databases and collections is vastly increasing. Application domains include criminology, musicology, trademark registration, medicine and image or video retrieval on the web. This thesis discusses content-based retrieval techniques that can be applied to these databases. The most important operation to support is query-by-example: retrieve items that are to some extent similar to a given query. In chapter 2 we study vantage indexing, a generic indexing approach not limited to a specific domain, because it only requires a (preferably metric) way of calculating a distance between the objects. With vantage indexing, objects are no longer compared directly; instead, they are compared by how similar their distances are to a set of reference objects: the vantage objects. We present a new way of selecting vantage objects, and demonstrate experimentally the scalability of the approach and the improvement in retrieval performance over existing methods. In many applications, it is possible to represent the objects by graphs. In chapter 3 we propose an indexing strategy for these scenarios. The assumption is that the graph's topology can be used to describe the underlying object. This topology is stored in the Laplacian matrix, and the sorted set of eigenvalues (spectrum) of this matrix is used as an indexing signature. To account for partial similarity, the difference in spectra of many subgraphs is considered as well. Both theory and experimental work support the claim that similarity in Laplacian spectrum predicts similarity between the original objects. The proposed representation outperforms existing methods in various application domains. A possible limitation of the approach presented in chapter 3 is that two graphs with the same topology share the same representation, while the underlying objects may be different.
In chapter 4, we therefore enrich the graph representation by storing additional object properties. We develop a complex-valued analogue of the Laplacian matrix, a Hermitian matrix, and use its eigenvector associated with the second smallest eigenvalue as indexing signature. This eigenvector is known to be informative about the graph, and can be reused to partition the graph into meaningful subgraphs, resulting in fewer subgraphs to compare for partial similarity. We provide a successful instance of this framework within the context of 3D object retrieval. In chapter 5, we claim that good retrieval results are not only relevant to the query, but should also reflect the diversity of the relevant objects that are present in the collection. Given an initial result set for a user query (image search), we propose to cluster the retrieved images based on their visual characteristics and to show the most important objects from each cluster. We first dynamically determine appropriate weights of visual features for a specific query. These weights are used in a dynamic ranking function that is deployed in a clustering technique to obtain a diverse ranking based on cluster representatives. We provide three lightweight and efficient clustering algorithms, and show that the algorithmic output coincides consistently with the results of a user study.
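The vantage indexing idea described for chapter 2 can be illustrated with a minimal sketch. It assumes a Euclidean distance on 2D points and hand-picked vantage objects; the function names, the data and the range-query formulation are hypothetical, and the thesis's own method for selecting vantage objects is not reproduced here.

```python
# Hypothetical sketch of vantage indexing; names and data are illustrative only.
from math import dist  # Euclidean metric, standing in for any domain-specific distance


def build_index(objects, vantage_objects, distance=dist):
    """Map every object to its vector of distances to the vantage objects."""
    return [[distance(obj, v) for v in vantage_objects] for obj in objects]


def query(index, objects, vantage_objects, q, radius, distance=dist):
    """Return the objects whose true distance to q is at most radius.

    For a metric, |d(q, v) - d(o, v)| <= d(q, o) holds for every vantage
    object v, so the largest such difference is a lower bound on d(q, o).
    Candidates whose lower bound exceeds the radius are discarded without
    ever computing d(q, o).
    """
    q_vec = [distance(q, v) for v in vantage_objects]
    results = []
    for obj, vec in zip(objects, index):
        lower_bound = max(abs(a - b) for a, b in zip(q_vec, vec))
        if lower_bound > radius:
            continue  # pruned using only precomputed distances
        if distance(q, obj) <= radius:
            results.append(obj)
    return results


points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 0.5)]
vantage = [(0.0, 0.0), (10.0, 0.0)]  # assumption: chosen by hand for this example
index = build_index(points, vantage)
near = query(index, points, vantage, q=(0.5, 0.5), radius=1.0)
```

The pruning step is only guaranteed to be safe when the distance satisfies the triangle inequality, which is one reading of why the abstract calls a metric distance preferable.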
On the Segmentation and Classification of Water in Videos
The automatic recognition of water has a wide range of applications, yet little attention has been paid to solving this specific problem. Current literature generally treats the problem as a part of more general recognition tasks, such as material recognition and dynamic texture recognition, without distinctively analyzing and characterizing the visual properties of water. The algorithm presented here introduces a hybrid descriptor based on the joint spatial and temporal local behaviour of water surfaces in videos. The temporal behaviour is quantified based on temporal brightness signals of local patches, while the spatial behaviour is characterized by Local Binary Pattern histograms. Based on the hybrid descriptor, the probability that a small region is water is calculated using a Decision Forest. Furthermore, binary Markov Random Fields are used to segment the image frames. Experimental results on a new and publicly available water database and a subset of the DynTex database show the effectiveness of the method for discriminating water from other dynamic and static surfaces and objects.
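As an illustration of the spatial half of the descriptor, the following is a minimal sketch of a basic Local Binary Pattern histogram. It assumes the plain 8-neighbour, radius-1 variant on a grayscale patch given as a list of rows; the LBP variant actually used in the paper, the temporal brightness descriptor, the Decision Forest and the Markov Random Field step are not shown, and all names are hypothetical.

```python
# Hypothetical sketch of a basic 8-neighbour LBP histogram; not the paper's exact descriptor.

# Neighbour offsets in clockwise order, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """LBP code of pixel (r, c): one bit per neighbour that is >= the centre."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over the interior pixels."""
    hist = [0] * 256
    rows, cols = len(img), len(img[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


patch = [
    [5, 5, 5, 5],
    [5, 9, 1, 5],
    [5, 1, 9, 5],
    [5, 5, 5, 5],
]
hist = lbp_histogram(patch)
```

In a pipeline like the one described, a histogram of this kind would be computed per local patch and combined with the temporal features before classification.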
Audio description and corpus analysis of popular music
In the field of sound and music computing, only a handful of studies are concerned with the pursuit of new musical knowledge. There is a substantial body of corpus analysis research focused on new musical insight, but almost all of it deals with symbolic data: scores, chords or manual annotations. In contrast, and despite the wide availability of audio data and tools for audio content analysis, very little work has been done on the corpus analysis of audio data. This thesis presents a number of contributions to the scientific study of music, based on audio corpus analysis. We focus on three themes: audio description, corpus analysis methodology, and the application of these description and analysis techniques to the study of music similarity and ‘hooks’. On the theme of audio description, we first present, in part i, an overview of the audio description methods that have been proposed in the music information retrieval literature, focusing on timbre, harmony and melody. We critically review current practices in terms of their relevance to audio corpus analysis. Throughout parts ii and iii, we then propose new feature sets and audio description strategies. Contributions include the introduction of audio bigram features, pitch descriptors that can be used for retrieval as well as corpus analysis, and second-order audio features, which quantify distinctiveness and recurrence of feature values given a reference corpus. On the theme of audio corpus analysis methodology, we first situate corpus analysis in the disciplinary context of music information retrieval, empirical musicology and music cognition. In part i, we then present a review of audio corpus analysis, and a case study comparing two influential corpus-based investigations into the evolution of popular music [122,175]. Based on this analysis, we formulate a set of nine recommendations for audio corpus analysis research.
In parts ii and iii, we present, alongside the new audio description techniques, new analysis methods for the study of song sections and within-song variation in a large corpus. Contributions on this theme include the first use of a probabilistic graphical model for the analysis of audio features. Finally, we apply new audio description and corpus analysis techniques to address two research problems of the COGITCH project of which our research was a part: improving audio-based models of music similarity, and the analysis of hooks in popular music. In parts i and ii, we introduce soft audio fingerprinting, an umbrella MIR task that includes any efficient audio-based content identification. We then focus on the problem of scalable cover song detection, and evaluate several solutions based on audio bigram features. In part iii, we review the prevailing perspectives on musical catchiness, recognisability and hooks. We describe Hooked, a game we designed to collect data on the recognisability of a set of song fragments. We then present a corpus analysis of hooks, and new findings on what makes music catchy. Across the three themes above, we present several contributions to the available methods and technologies for audio description and audio corpus analysis. Along the way, we present new insights into choruses, catchiness, recognisability and hooks. By applying the proposed technologies and following the proposed methods, we show that rigorous audio corpus analysis is possible and that the technologies to engage in it are available.
