    Attacker attribution of audio deepfakes

    Deepfakes are synthetically generated media often devised with malicious intent. They have become increasingly convincing with large training datasets and advanced neural networks. These fakes are readily being misused for slander, misinformation and fraud. For this reason, intensive research into countermeasures is also expanding. However, recent work is almost exclusively limited to deepfake detection: predicting whether audio is real or fake. This is despite the fact that attribution (who created which fake?) is an essential building block of a larger defense strategy, as has long been practiced in the field of cybersecurity. This paper considers the problem of deepfake attacker attribution in the domain of audio. We present several methods for creating attacker signatures using low-level acoustic descriptors and machine learning embeddings. We show that speech signal features are inadequate for characterizing attacker signatures. However, we also demonstrate that embeddings from a recurrent neural network can successfully characterize attacks from both known and unknown attackers. Our attack signature embeddings result in distinct clusters, both for seen and unseen audio deepfakes. We show that these embeddings can be used in downstream tasks to great effect, scoring 97.10% accuracy in attacker-ID classification.

    Hormonal Contraceptive Review Supplementary Data

    These documents are supplementary tables for a review entitled "Influence of hormonal contraceptives on peripheral vascular function and structure in premenopausal females: a review."

    End-to-End Signal Factorization for Speech: Identity, Content, and Style

    Preliminary experiments in this dissertation show that it is possible to factorize specific types of information from the speech signal in an abstract embedding space using machine learning. This information includes characteristics of the recording environment, speaking style, and speech quality. Based on these findings, a new technique is proposed to factorize multiple types of information from the speech signal simultaneously using a combination of state-of-the-art machine learning methods for speech processing. Successful speech signal factorization will lead to advances across many speech technologies, including improved speaker identification, detection of speech audio deepfakes, and controllable expression in speech synthesis.

    A new Twitter verb lexicon for natural language processing.

    We describe in-progress work on the creation of a new lexical resource that contains a list of 486 verbs annotated with quantified temporal durations for the events that they describe. This resource is being compiled from more than 14 million tweets from the Twitter microblogging site. We are creating this lexicon of verbs and typical durations to address a gap in the information represented in existing research. The data contained in this lexicon is unlike that of existing resources, which have traditionally comprised literature excerpts, news stories, and full-length weblogs. Knowledge of how long an event lasts is crucial for natural language processing and is especially useful when the temporal duration of an event is implied. We are using data from Twitter because it is a rich resource: people publicly post real events and the real durations of those events throughout the day.

    Privacy considerations for wearable audio-visual AI in hearing aids

    Recent developments in audio-visual (AV) hearing aids have shown significant potential to transform how the deaf and hard of hearing community uses assistive technologies. Despite this, there are several key privacy issues to consider before the devices can be adopted at scale. These devices affect not only the wearer but also the general public. With increased public awareness of and concern about surveillance, these devices need to be developed with privacy-preserving methods at the forefront of design in order to prevent social acceptance barriers to uptake. In doing so, these devices can be widely adopted and made safe for users and society.

    Safe Audio AI Services in Smart Buildings

    Audio AI services present an opportunity to conceptualise smart buildings in a new light. Microphones can capture fine-grained audio information that can be used for determining how many people are inside a building, where they are, and what kinds of activities are taking place. This information can feed into smart resource management systems or it could be used for assistive technologies. Generally speaking, audio is regarded as a less intrusive type of information collection than video surveillance, but significant issues of privacy and security persist with audio capture. Such issues warrant a serious discussion about how safe it is to use audio capture in smart buildings for AI decision-making. This position paper initiates a discussion of research directions for the safety of audio services related to three key areas: data degradation strategies, dynamic customisation of tools, and privacy-aware technologies. In each area, we identify key challenges and highlight solution concepts with the potential to address the issue.