Towards a Sustainable Internet of Sounds
The Internet of Sounds (IoS) is an emerging research area at the intersection of engineering fields and the humanities, including computing, communication technology, audio signal processing, acoustic monitoring, music and arts. Although this research field is expected to have beneficial impacts on society through entertainment, creativity, well-being, monitoring and security, it is paramount to be aware of the adverse impact of current technology on the environment in terms of greenhouse gas emissions, pollution and soil consumption. In this study we provide a survey of the environmental issues produced by current information and communication technology (ICT) and relate these to the use cases that the IoS envisions. On the basis of this survey, we identify some key aspects to reduce the footprint of IoS services and products and then we provide suggestions to make advancements in IoS environment-aware
Emotion rendering in auditory simulations of imagined walking styles
This paper investigated how different emotional states of a walker can be rendered and recognized by means of footstep sound synthesis algorithms. In a first experiment, participants were asked to render, according to imagined walking scenarios, five emotions (aggressive, happy, neutral, sad, and tender) by manipulating the parameters of synthetic footstep sounds simulating various combinations of surface materials and shoe types. The results made it possible to identify, for the involved emotions and sound conditions, the mean values and ranges of variation of two parameters: sound level and temporal distance between consecutive steps. Results were in accordance with those reported in previous studies on real walking, suggesting that the expression of emotions in walking is independent of whether the motor activity is real or imagined. In a second experiment, participants were asked to identify the emotions portrayed by walking sounds synthesized by setting the synthesis engine parameters to the mean values found in the first experiment. Results showed that the involved algorithms were successful in conveying the emotional information at a level comparable with previous studies. Both experiments involved musicians and non-musicians. In both experiments, a similar general trend was found between the two groups
Latency of spatial audio plugins: a comparative study
The use of spatial audio plugins (SAPs) with Ambisonics processing and binaural rendering has become widespread in the last decade, thanks to their increased accessibility and usability. SAPs are particularly relevant in scenarios involving real-time music playing with headphones, such as networked music performance and individual recreational music-making using backing tracks. However, a crucial issue that has been largely overlooked thus far is the measurement of the processing latency introduced by currently available SAPs. Identifying which SAPs are the fastest is essential to enable designers, musicians, and researchers to create time-sensitive applications involving 3D audio. To bridge this gap, we compared nine systems formed by different SAPs that enable 3D audio management. We measured the latency of each system throughout the third-order Ambisonics plugin pipeline: encoding, room simulation, sound scene rotation, and binaural decoding. In particular, the measurements were performed using different buffer sizes. Results showed that minimizing latency requires combining SAPs from different systems. Based on our measurements, we propose two spatial audio systems that mix different SAPs. Considering a sampling rate of 48 kHz, a Dell Alienware x15 R2 laptop running the Windows 10 operating system, and an RME Fireface UFX sound card, the two systems achieved an overall latency of 0.33 ms and 0.94 ms respectively
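As a rough aid for reading latency figures of this kind, the time spanned by one audio buffer is the buffer size in frames divided by the sampling rate. The minimal Python sketch below illustrates only that arithmetic. It is not the measurement procedure used in the study, the function name `buffer_latency_ms` is hypothetical, and the buffer sizes in the loop are illustrative assumptions, since the abstract does not list the sizes that were tested.

```python
def buffer_latency_ms(buffer_size_frames: int, sample_rate_hz: int = 48_000) -> float:
    """Duration of one audio buffer in milliseconds.

    Assumption: this is only the time spanned by a single buffer; any delay
    added by plugin algorithms, drivers, or converters is not modelled.
    """
    if buffer_size_frames <= 0 or sample_rate_hz <= 0:
        raise ValueError("buffer size and sample rate must be positive")
    return 1000.0 * buffer_size_frames / sample_rate_hz


if __name__ == "__main__":
    # Illustrative buffer sizes (assumed, not taken from the paper).
    for frames in (16, 32, 64, 128, 256):
        print(f"{frames:>4} frames @ 48 kHz: {buffer_latency_ms(frames):.3f} ms")
```

Because a buffer's duration grows linearly with its size, a latency comparison across plugins is only meaningful when the buffer size and sampling rate are reported alongside each figure.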
Emotion Rendering in Plantar Vibro-Tactile Simulations of Imagined Walking Styles
This paper investigates the production and identification of emotional states of a walker using plantar vibro-tactile simulations. In a first experiment, participants were asked to render, according to imagined walking scenarios, five emotions (aggressive, happy, neutral, sad, and tender) by manipulating the parameters of synthetic footstep vibrations simulating various combinations of surface materials and shoes. The results made it possible to identify, for the involved emotions and vibration conditions, the mean values and ranges of variation of two parameters: vibration amplitude and temporal distance between consecutive steps. Results were in accordance with those reported in previous studies on real walking, suggesting that the plantar vibro-tactile expression of emotions in walking is independent of whether the motor activity is real or imagined. In a second experiment, participants were asked to identify the emotions portrayed by walking vibrations synthesized by setting the synthesis engine parameters to the mean values found in the first experiment. Results showed that the involved algorithms were successful in conveying the emotional information at a level comparable with previous studies. Results of both experiments revealed strong similarities with those of an analogous study on footstep sounds, suggesting that emotionally expressive walking styles are consistently produced and recognized at both the auditory and plantar vibro-tactile levels
Sustainable Internet of Musical Things: Strategies to Account for Environmental and Social Sustainability in Network-Based Interactive Music Systems
The use of internet-based and networking technology in computer music systems has greatly increased in the past few years. Such efforts fall within the remit of the emerging field of the Internet of Musical Things (IoMusT), the extension of the Internet of Things paradigm to the musical domain. Given the increasing importance of connected devices in the musical domain, it is essential to reflect on the relationship between such systems and sustainability at the environmental and social levels. In this paper, we address this aspect from two perspectives: 1) how to design IoMusT systems in a sustainable way, and 2) how IoMusT systems can support sustainability. To this end, we relied on three lenses, combining literature from green IoT (lens 1), Sustainable HCI (lens 2), and the Sustainable Development Goals from the United Nations (lens 3). By combining these three lenses, we developed five strategies for a sustainable IoMusT, which are extensively presented and discussed providing critica..
5G-Enabled Internet of Musical Things Architectures for Remote Immersive Musical Practices
Networked Music Performances (NMPs) involve geographically displaced musicians performing together in real time. To date, little research has been conducted on how to integrate NMP systems with immersive audio rendering techniques able to enrich the musicians’ perception of sharing the same acoustic environment. In addition, the use of wireless technologies for NMPs has been largely overlooked. In this paper, we propose two architectures for Immersive Networked Music Performances (INMPs), which differ in the physical positions of the computing blocks constituting the 3D audio toolchain. These architectures leverage a backend specifically conceived to support remote musical practices via Software Defined Networking methods, and take advantage of the orchestration, slicing, and Multi-access Edge Computing (MEC) capabilities of 5G. Moreover, we illustrate how to integrate into the architectures machine learning algorithms for network traffic prediction and audio packet loss concealment. Traffic predictions at multiple time scales are used to optimize the placement of Virtual Network Functions hosting audio mixing and processing functionalities within the available MEC sites, depending on the users’ geographical locations and current network load conditions. An analysis of the technical requirements for INMPs using the two architectures is provided, along with their performance assessment conducted via simulators
Interactive footsteps sounds modulate the sense of effort without affecting the kinematics and metabolic parameters during treadmill-walking
Previous research has shown that walkers provided with interactive simulations of footstep sounds on a surface material different from the one they are walking upon experience pseudo-haptic illusions and adjust their walking kinematics according to the perceived surface's compliance. Since walking on real grounds with different degrees of compliance leads to different metabolic costs, an open question is whether pseudo-haptic illusions created by interactive footstep sounds are able to affect metabolic parameters. This study investigated whether metabolic cost and movement kinematics are affected by such interactive auditory feedback in a constrained condition such as walking on a treadmill. Participants walked on a treadmill under three listening conditions: actual footstep sounds, and interactive simulations of footstep sounds on gravel and on snow. The metabolic and kinematic data, as well as the perceived exertion, sense of effort, easiness, and feeling of sinking, were recorded. Results showed that interactive footstep sounds provided during treadmill walking did not affect kinematic and metabolic parameters of walking, while they were effective in modulating participants’ perception. These results suggest that in a constrained and non-self-selected pattern of locomotion the sound of action, even though correctly perceived, is not strong enough to induce a change in the metabolic and kinematic parameters of locomotion.
Architecting the Musical Metaverse: Lessons from 5G and Emerging Technologies
The Musical Metaverse envisions immersive, interactive environments where geographically distributed users co-create and experience music in real time. These scenarios impose demanding constraints on communication and computation infrastructures, requiring ultra-low latency, deterministic audio delivery, and synchronized multimodal feedback. This paper presents a critical investigation into the capabilities and limitations of 5G and emerging technologies in enabling such scenarios. Building upon empirical evaluations and architectural studies, it identifies key bottlenecks in public and non-standalone 5G deployments and explores the role of private standalone infrastructures enhanced with Mobile Edge Computing. Furthermore, it assesses the potential of complementary paradigms (such as Reconfigurable Intelligent Surfaces, mmWave communications, AI-driven orchestration, and Digital Twins) for supporting scalable and expressive musical applications rooted in shared creative expression, embodied interaction, and remote co-presence. The analysis shows that current technologies, while promising, remain insufficient to fully meet the stringent requirements of real-time musical interaction. It identifies key technological gaps and outlines future directions toward intelligent, adaptive, and musically coherent infrastructures that can support the experiential and collaborative nature of the Musical Metaverse
Distilling DDSP: Exploring Real-Time Audio Generation on Embedded Systems
This paper investigates the feasibility of running neural audio generative models on embedded systems by comparing the performance of various models and evaluating their trade-offs in audio quality, inference speed, and memory usage. This work focuses on differentiable digital signal processing (DDSP) models, due to their hybrid architecture, which combines the efficiency and interoperability of traditional DSP with the flexibility of neural networks. In addition, the application of knowledge distillation (KD) is explored to improve the performance of smaller models. Two types of distillation strategies were implemented and evaluated: audio distillation and control distillation. These methods were applied to three foundation DDSP generative models that integrate Harmonic-plus-Noise, FM, and Wavetable synthesis. The results demonstrate the overall effectiveness of KD: the authors were able to train student models that are up to 100× smaller than their teacher counterparts while maintaining comparable performance and significantly improving inference speed and memory efficiency. However, cases where KD failed to improve, or even degraded, student performance have also been observed. The authors provide a critical reflection on the advantages and limitations of KD, exploring its application in diverse use cases and emphasizing the need for carefully tailored strategies to maximize its potential
