2019 International Conference on Very Large Database PhD Workshop, VLDB-PhD 2019
The chairs (Ilaria Bartolini, University of Bologna, and Feifei Li, Alibaba) are glad to present this volume of the proceedings of the 2019 edition of the VLDB Ph.D. Workshop. The workshop was co-located with the 45th International Conference on Very Large Data Bases (VLDB 2019) and held on August 26, 2019, in Los Angeles, California. We assembled the technical program to give Ph.D. students an opportunity to present their research ideas in a premier international research venue. The resulting program, consisting of 15 papers accepted for presentation out of the 23 submissions we received, offers a clear sample of emerging topics in database research and includes contributions from young researchers across the world. The workshop provides a forum that facilitates interactions among Ph.D. students and stimulates feedback from more experienced researchers. This year's program also included two keynote talks by speakers from prestigious industrial organizations. The first one, entitled “Database Systems 2.0”, was presented by Johannes Gehrke (Microsoft), whereas the second one, entitled “Structured Data Meets News”, was presented by Cong Yu (Google). The two keynotes inspired the Ph.D. students and gave them direction by discussing novel issues and challenges. We believe the event gave excellent opportunities to share and exchange research ideas between the Ph.D. students and the more experienced researchers, and promoted synergistic collaborations
Multimedia Systems
This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles
Real-Time Stream Processing in Social Networks with RAM^3S
The avalanche of (both user- and device-generated) multimedia data published in online social networks poses serious challenges to researchers seeking to analyze such data for many different tasks, like recommendation, event recognition, and so on. For some such tasks, the classical “batch” approach of big data analysis is not suitable, due to constraints of real-time or near-real-time processing. This led to the rise of stream processing big data platforms, like Storm and Flink, that are able to process data with a very low latency. However, this complicates the task of data analysis since any implementation has to deal with the technicalities of such platforms, like distributed processing, synchronization, node faults, etc. In this paper, we show how the RAM^3S framework could be profitably used to easily implement a variety of applications (such as clothing recommendations, job suggestions, and alert generation for dangerous events), being independent of the particular stream processing big data platforms used. Indeed, by using RAM^3S, researchers can concentrate on the development of their data analysis application, completely ignoring the details of the underlying platform
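The separation this abstract describes, between application logic and the platform that runs it, can be illustrated with a minimal sketch. Every name below (`Analyzer`, `crowd_alert`, `run_locally`, the `people` and `place` fields) is hypothetical and invented for illustration; this is not the RAM^3S API, and the local runner merely stands in for an adapter to a platform such as Storm or Flink.

```python
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical names throughout; this is NOT the RAM^3S API.
# An analyzer maps one stream item to an optional result.
Analyzer = Callable[[dict], Optional[str]]


def crowd_alert(item: dict) -> Optional[str]:
    """Application logic only: flag items whose (made-up) 'people' count is high."""
    if item["people"] > 100:
        return f"alert: crowd of {item['people']} at {item['place']}"
    return None


def run_locally(source: Iterable[dict], analyzer: Analyzer) -> Iterator[str]:
    """Stand-in for a platform adapter: feeds items to the analyzer one at a time."""
    for item in source:
        result = analyzer(item)
        if result is not None:
            yield result


stream = [
    {"place": "square", "people": 40},
    {"place": "station", "people": 250},
]
alerts = list(run_locally(stream, crowd_alert))
print(alerts)
```

In this sketch the analyzer never refers to the runner, so replacing `run_locally` with an adapter for a distributed engine would leave `crowd_alert` unchanged, which is the kind of independence the abstract claims.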
WARP: Accurate Retrieval of Shapes Using Phase of Fourier Descriptors and Time Warping Distance
Effective and efficient retrieval of similar shapes from large image databases is still a challenging problem in spite of the high relevance that shape information can have in describing image contents. In this paper, we propose a novel Fourier-based approach, called WARP, for matching and retrieving similar shapes. The unique characteristics of WARP are the exploitation of the phase of Fourier coefficients and the use of the Dynamic Time Warping (DTW) distance to compare shape descriptors. While phase information provides a more accurate description of object boundaries than the amplitude of Fourier coefficients alone, the DTW distance permits us to accurately match images even in the presence of (limited) phase shifts. In terms of classical precision/recall measures, we experimentally demonstrate that WARP can gain up to 35 percent in precision at a 20 percent recall level with respect to Fourier-based techniques that use neither phase nor DTW distance
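As a rough illustration of the DTW distance mentioned in this abstract, the following is a minimal sketch of the textbook dynamic-programming formulation. It is not the authors' implementation: the abstract does not say how WARP builds or normalizes its descriptors, so the function name `dtw_distance` and the choice of absolute difference as the local cost are assumptions. The cost works on real or complex values, so complex samples keep both magnitude and phase.

```python
def dtw_distance(a, b):
    """Textbook DTW between two sequences of real or complex numbers.

    The local cost is |x - y| (an assumption for this sketch); the result is
    the minimum total cost over all monotone alignments of a and b.
    """
    inf = float("inf")
    # prev[j] holds the best cost of aligning the processed prefix of a with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]  # a non-empty prefix of a cannot align with an empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # Extend the cheapest of: insertion, deletion, or match.
            curr.append(cost + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]


# A sequence and a locally stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because one sample may be matched to several samples of the other sequence, small local misalignments between two descriptors are absorbed instead of being penalized point by point, which is the property the abstract relies on for tolerating limited shifts.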
A general framework for real-time analysis of massive multimedia streams.
Big Data platforms provide opportunities for the management and analysis of large quantities of information, but the services they provide are often too raw, since they focus on issues of fault-tolerance, increased parallelism, and so on. An additional software layer is, therefore, needed to effectively use such architectures for advanced applications in several important real-world domains, such as scientific and health care sensors, user-generated data, supply chain systems and financial companies, to name a few. In this paper, we present RAM^3S, a framework for the real-time analysis of massive multimedia streams, where data come from multiple, geographically distributed data sources (such as sensors and cameras), with the final goal of discovering new and hidden information from the output of the data sources as it is produced, thus with very limited latency. We apply RAM^3S to the use case of automatic detection of suspect people from several concurrent video streams, and instantiate it on top of three different open source engines for the analysis of streaming Big Data (i.e., Apache Spark, Apache Storm, and Apache Flink). The effectiveness and scalability of the RAM^3S instantiations are experimentally evaluated on real data, also comparing the performance of the three considered Big Data platforms. This comparison is performed both on a cluster of physical machines in our datalab and on the Google Cloud Platform
Windsurf: the best way to SURF: (and SIFT/BRISK/ORB/FREAK, too).
Despite their popularity, approaches based on salient point descriptors have yet to be proven effective for content-based image retrieval. In this paper, we show how the Windsurf library can be effectively exploited to carry out a fair comparison of the existing alternative approaches based on salient points, which can be contrasted on aspects of both effectiveness and efficiency. Our extensive experimental evaluation, performed on four different image benchmarks, shows that techniques based on salient point descriptors are no more effective than other existing techniques and are less amenable to indexing, and thus their efficiency remains questionable
Comparing performances of big data stream processing platforms with RAM3S
Nowadays, Big Data platforms allow the analysis of massive data streams in an efficient way. However, the services they provide are often too raw, so the implementation of advanced real-world applications requires a non-negligible effort for interfacing with such services. This also complicates the task of choosing which of the many available alternatives is the most appropriate for the application at hand. In this paper, we present a comparative study of the three major open-source Big Data platforms for stream processing, performed using our novel RAM^3S framework. Although the results we present are specific to our use case (recognition of suspect people from massive video streams), the generality of the RAM^3S framework allows both considering such results as valid for similar applications and implementing different use cases on top of Big Data platforms with very limited effort
A stream processing abstraction framework
Real-time analysis of large multimedia streams is nowadays made efficient by the existence of several Big Data streaming platforms, like Apache Flink and Samza. However, such platforms are difficult to use because the facilities they offer are often too raw to be effectively exploited by analysts. We describe the evolution of RAM^3S, a software infrastructure for the integration of Big Data stream processing platforms, to SPAF, an abstraction framework able to provide programmers with a simple but powerful API to ease the development of stream processing applications. By using SPAF, the programmer can easily implement real-time complex analyses of massive streams on top of a distributed computing infrastructure, able to manage the volume and velocity of Big Data streams, thus effectively transforming data into value
Multimedia Queries in Digital Libraries
The intrinsic complexity and diversity of data in multimedia digital libraries (MDLs) require devising techniques and solutions that are inherently different from those usually adopted in traditional information retrieval and database (DB) systems. Moreover, the size and the dynamicity of MDLs force researchers to strive for efficiency, so as to guarantee real-time results to the users. Finally, semantics should also be brought into context, in order to facilitate users’ experience in querying, browsing, and consuming multimedia information. This chapter presents an approach toward efficient, effective, and semantically rich data retrieval in MDLs. In contrast to the commonly used holistic approach, where the multimedia datum is considered as an atomic entity, our reductionist strategy considers the multimedia information as a complex combination of component subparts and eases the fulfillment of the three above properties of efficiency, effectiveness, and semantic richness. Indeed, by decomposing multimedia information into simpler and smaller component objects, we are able to index such components without giving up the ability to query the original information as a whole
