
    Understanding Choice Independence and Error Types in Human-AI Collaboration

    The ability to make appropriate delegation decisions is an important prerequisite of effective human-AI collaboration. Recent work, however, has shown that people struggle to evaluate AI systems in the presence of forecasting errors, falling well short of relying on AI systems appropriately. We use a pre-registered crowdsourcing study (N = 611) to extend this literature to two crucial but underexplored features of human-AI decision-making: choice independence and error type. Subjects in our study repeatedly complete two prediction tasks and choose which predictions they want to delegate to an AI system. For one task, subjects receive a decision heuristic that allows them to make informed and relatively accurate predictions. The second task is substantially harder to solve, and subjects must come up with their own decision rule. We systematically vary the AI system's performance such that it either provides the best possible prediction for both tasks or only for one of the two. Our results demonstrate that people systematically violate choice independence by taking the AI's performance in an unrelated second task into account. Humans who delegate predictions to a superior AI in their own expertise domain significantly reduce appropriate reliance when the model makes systematic errors in a complementary expertise domain. In contrast, humans who delegate predictions to a superior AI in a complementary expertise domain significantly increase appropriate reliance when the model systematically errs in the human expertise domain. Furthermore, we show that humans differentiate between error types and that this effect is conditional on the considered expertise domain. This is the first empirical exploration of choice independence and error types in the context of human-AI collaboration. Our results have broad and important implications for the future design, deployment, and appropriate application of AI systems.
    Web Information Systems

    For What It's Worth: Humans Overwrite Their Economic Self-interest to Avoid Bargaining With AI Systems

    As algorithms are increasingly augmenting and substituting human decision-making, understanding how the introduction of computational agents changes the fundamentals of human behavior becomes vital. This pertains not only to users, but also to those parties who face the consequences of an algorithmic decision. In a controlled experiment with 480 participants, we exploit an extended version of two-player ultimatum bargaining where responders choose to bargain with either another human, another human with an AI decision aid, or an autonomous AI-system acting on behalf of a passive human proposer. Our results show strong responder preferences against the algorithm, as most responders opt for a human opponent and demand higher compensation to reach a contract with autonomous agents. To map these preferences to economic expectations, we elicit incentivized subject beliefs about their opponent's behavior. The majority of responders maximize their expected value when this is in line with approaching the human proposer. In contrast, responders predicting income maximization for the autonomous AI-system overwhelmingly override economic self-interest to avoid the algorithm.
    Web Information Systems

    TaskGenie: Crowd-Powered Task Generation for Struggling Search

    Search tasks provide a medium for the evaluation of system performance and the underlying analytical aspects of IR systems. Researchers have recently developed new interfaces or mechanisms to support vague information needs and struggling search. However, little attention has been paid to the generation of a unified task set for evaluation and comparison of search engine improvements for struggling search. Generation of such tasks is inherently difficult, as each task is supposed to trigger struggling and exploring user behavior rather than simple search behavior. Moreover, the ever-changing landscape of information needs would render old task sets less ideal if not unusable for system evaluation. In this paper, we propose a task generation method and develop a crowd-powered platform called TaskGenie to generate struggling search tasks online. Our experiments and analysis show that the generated tasks are qualified to emulate struggling search behaviors such as ‘repeated similar queries’ and ‘quick-back clicks’, and that tasks of diverse topics, high quality and difficulty can be created using this framework. For the benefit of the community, we publicly released the platform, a task set containing 80 topically diverse struggling search tasks generated and examined in this work, and the corresponding anonymized user behavior logs.
    Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
    Web Information Systems

    Estimating Conversational Styles in Conversational Microtask Crowdsourcing

    Crowdsourcing marketplaces have provided a large number of opportunities for online workers to earn a living. To improve satisfaction and engagement of such workers, who are vital for the sustainability of the marketplaces, recent works have used conversational interfaces to support the execution of a variety of crowdsourcing tasks. The rationale behind using conversational interfaces stems from the potential engagement that conversation can stimulate. Prior works in psychology have also shown that ‘conversational styles’ can play an important role in communication. There are unexplored opportunities to estimate a worker’s conversational style with an end goal of improving worker satisfaction, engagement and quality. Addressing this knowledge gap, we investigate the role of conversational styles in conversational microtask crowdsourcing. To this end, we design a conversational interface which supports task execution, and we propose methods to estimate the conversational style of a worker. Our experimental setup was designed to empirically observe how conversational styles of workers relate to quality-related outcomes. Results show that even a naive supervised classifier can predict the conversational style with high accuracy (80%), and crowd workers with an Involvement conversational style provided a significantly higher output quality, exhibited a higher user engagement and perceived less cognitive task load in comparison to their counterparts. Our findings have important implications for task design with respect to improving worker performance and their engagement in microtask crowdsourcing.
    Web Information Systems; Human-Centred Artificial Intelligence

    Just the Right Mood for HIT!

    Conversational agents are playing an increasingly important role in providing users with natural communication environments, improving outcomes in a variety of domains in human-computer interaction. Crowdsourcing marketplaces are simultaneously flourishing, and it has never been easier to acquire large-scale human input from online workers. Recent works have revealed the potential of conversational interfaces in improving worker engagement and satisfaction. At the same time, worker moods have been shown to have significant effects on quality-related outcomes. Little is known about the role of worker moods in shaping work in conversational microtask crowdsourcing. In this paper, we conducted a crowdsourcing study involving 600 unique online workers to investigate the role that worker moods play in conversational microtask crowdsourcing. We also explore whether suitable conversational styles of the agent can affect the performance of workers in different moods. Our results show that workers in a pleasant mood tend to produce significantly higher quality results (over 20%), exhibit greater engagement (an increase by around 19%) and report a lower cognitive load (by over 12%), and a suitable conversational style can have a significant impact on workers in different moods. Our findings advance the current understanding of conversational microtask crowdsourcing and have important implications for designing future conversational crowdsourcing systems.
    Web Information Systems; Human-Centred Artificial Intelligence

    Improving Worker Engagement Through Conversational Microtask Crowdsourcing

    The rise in popularity of conversational agents has enabled humans to interact with machines more naturally. Recent work has shown that crowd workers in microtask marketplaces can complete a variety of human intelligence tasks (HITs) using conversational interfaces with similar output quality compared to traditional Web interfaces. In this paper, we investigate the effectiveness of using conversational interfaces to improve worker engagement in microtask crowdsourcing. We designed a text-based conversational agent that assists workers in task execution, and tested the performance of workers when interacting with agents having different conversational styles. We conducted a rigorous experimental study on Amazon Mechanical Turk with 800 unique workers to explore whether the output quality, worker engagement and the perceived cognitive load of workers can be affected by the conversational agent and its conversational styles. Our results show that conversational interfaces can be effective in engaging workers, and a suitable conversational style has the potential to improve worker engagement.
    Accepted author manuscript
    Web Information Systems; Human-Centred Artificial Intelligence

    CaptureBias: Supporting Media Scholars with Ambiguity-Aware Bias Representation for News Videos

    In this project we explore the presence of ambiguity in textual and visual media and its influence on accurately understanding and capturing bias in news. We study this topic in the context of supporting media scholars and social scientists in their media analysis. Our focus lies on racial and gender bias as well as framing and the comparison of their manifestation across modalities, cultures and languages. In this paper we lay out a human-in-the-loop approach to investigate the role of ambiguity in detection and interpretation of bias.
    Accepted Author Manuscript
    Web Information Systems

    Towards Memorable Information Retrieval

    Information overload is a problem many of us can relate to nowadays. The deluge of user generated content on the Internet and the easy access to a vast amount of data compound the problem of remembering and retaining information that is consumed. To make information consumed more memorable, strategies such as note-taking have been found to be effective by augmenting human memory under specific conditions. This is based on the rationale that humans tend to recall information better if they have produced the information themselves. Previous works in online education have shown that conversational systems can improve learning effects. Although memorization is an important part of learning, the effect of conversation on human memorability remains unexplored. We aim to address this knowledge gap through an experimental study, by investigating human memorability in a classical information retrieval setup. We explore the impact of note-taking affordances and conversational interfaces on the memorability of information consumed by users. Our results show that traditional web search and note-taking have positive effects on knowledge gain, while the search engine with a conversational interface has the potential to augment long-term memorability. This work highlights the benefits of using note-taking and conversational interfaces to aid human memorability. Our findings have important implications for building information retrieval systems that cater to optimizing memorability of information consumed.
    Web Information Systems; Human-Centred Artificial Intelligence

    How do user opinions influence their interaction with web search results?

    Understanding the influence of users' opinions on their search behavior together with their inherent biases in web search has garnered widespread interest in recent times. This is largely due to the implications of promoting critical thinking, explaining phenomena such as political polarization, or the manifestation of echo chambers. It is important to understand how personal opinions can bias users' interaction with search results. Moreover, there is a lack of understanding of the impact of user search intents, namely non-purposeful browsing versus searching with a pre-defined goal, on users' interactions with search results. We take a step towards bridging this knowledge gap through an empirical study in this paper. To do so, we select two controversial topics, abortion and gun control, and invite users to learn about them through ‘Purposeless’ and ‘Purposeful’ web searching. Our findings suggest that users with strong personal opinions exhibit biased interactions with the search results. However, the effect of users' opinions on their interactions with search results can differ depending on whether users search purposelessly or with a purpose. Our findings advance the current understanding of the effect of users' opinions in web search sessions, and show that users' search intents shape their interaction with search results. This work has broad design implications for dealing with bias in interactive information retrieval systems.
    Web Information Systems

    Characterising and Mitigating Aggregation-Bias in Crowdsourced Toxicity Annotations

    Training machine learning (ML) models for natural language processing usually requires large amounts of data, often acquired through crowdsourcing. The way this data is collected and aggregated can affect the outputs of the trained model, for example by ignoring labels that differ from the majority. In this paper we investigate how label aggregation can bias the ML results towards certain data samples and propose a methodology to highlight and mitigate this bias. Although our work is applicable to any kind of label aggregation for data subject to multiple interpretations, we focus on the effects of the bias introduced by majority voting on toxicity prediction over sentences. Our preliminary results point out that we can mitigate the majority-bias and get increased prediction accuracy for the minority opinions if we take into account the different labels from annotators when training adapted models, rather than rely on the aggregated labels.
    Accepted Author Manuscript
    Web Information Systems