Awareness in mixed initiative planning
For tasks that must be accomplished in unconstrained environments, as in Urban Search and Rescue (USAR), human-robot collaboration is considered an indispensable component. Collaboration is based on accurate, mutually consistent models of robot and human perception, so that information critical to accomplishing a task is exchanged efficiently and simply, minimizing the interaction overhead. In this paper, we highlight the features of a human-robot team, i.e. how robot perception may be combined with human perception based on a task-driven direction for USAR. We elaborate on the design of the components of a mixed-initiative system in which a task assigned to the robot is planned and executed jointly with the human operator as a result of their interaction. We illustrate our description by applying mixed-initiative planning to a number of examples related to the morphological adaptation of the rescue robot. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.
A General Method for the Point of Regard Estimation in 3D Space
A novel approach to 3D gaze estimation for wearable multi-camera devices is proposed and its effectiveness is demonstrated both theoretically and empirically. The proposed approach, firmly grounded in the geometry of multiple views, introduces a calibration procedure that is efficient, accurate and highly innovative, yet practical and easy, so that it can run online with little intervention from the user. The overall gaze estimation model is general, as no particular complex model of the human eye is assumed. This is made possible by an approach that can be sketched as follows: each eye is imaged by a camera; two conics are fitted to the imaged pupils; and a calibration sequence, in which the subject gazes at a known 3D point while moving his/her head, provides the information needed to 1) estimate the optical axis in the 3D world; 2) compute the geometry of the multi-camera system; 3) estimate the Point of Regard in the 3D world. The resulting model is being used effectively to study visual attention by means of gaze estimation experiments involving people performing natural tasks in wide-field, unstructured scenarios.
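The conic-fitting step named in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the conics are fitted, so everything here is an assumption made for illustration: the linear least-squares formulation with the normalization a + c = 1 (which is always valid for ellipses), and the hypothetical helper names `solve_linear`, `fit_conic`, `is_ellipse` and `sample_ellipse`. It is not the authors' calibration procedure.

```python
import math


def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_conic(points):
    """Least-squares fit of a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0.

    Assumption: the conic is normalized so that a + c = 1. Substituting
    c = 1 - a gives a linear problem in the five unknowns (a, b, d, e, f).
    """
    if len(points) < 5:
        raise ValueError("need at least five points")
    rows = [[x * x - y * y, x * y, x, y, 1.0] for x, y in points]
    rhs = [-y * y for _, y in points]
    # Normal equations (A^T A) theta = A^T rhs.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(5)] for i in range(5)]
    Atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(5)]
    a, b, d, e, f = solve_linear(AtA, Atb)
    return a, b, 1.0 - a, d, e, f


def is_ellipse(conic):
    """A conic is an ellipse when its discriminant b^2 - 4ac is negative."""
    a, b, c = conic[:3]
    return b * b - 4.0 * a * c < 0.0


def sample_ellipse(cx, cy, rx, ry, n):
    """Synthetic stand-in for pupil-contour points: n points on an axis-aligned ellipse."""
    return [(cx + rx * math.cos(2 * math.pi * k / n),
             cy + ry * math.sin(2 * math.pi * k / n)) for k in range(n)]
```

For example, `fit_conic(sample_ellipse(1.0, -1.0, 3.0, 2.0, 8))` recovers the coefficients of 4x^2 + 9y^2 - 8x + 18y - 23 = 0 divided by 13. With real pupil contours, a noise-robust or ellipse-specific fit would probably be preferable to this plain algebraic one.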
Human-motion saliency in complex scenes
We present a new method for human motion analysis and evaluation, developed on the basis of the role played by attention in the perception of human motion. Attention is particularly relevant both in a multi-motion scene and in social interactions, when it comes to selecting and discerning what to focus on and why. The first crucial role of attention concerns the saliency of human motion within a scene where other dynamics might occur. The second role, in close social interactions, is highlighted by the selectivity shown towards gesture modalities in both peripheral and central vision. Experiments for both modeling and testing were based on a dynamic 3D gaze tracker. © 2012 Springer-Verlag Berlin Heidelberg
Saliency prediction in the coherence theory of attention
In the coherence theory of attention, introduced by Rensink, O'Regan, and Clark (2000), a coherence field is defined by a hierarchy of structures supporting the activities taking place across the different stages of visual attention. At the interface between the low-level and mid-level attention processing stages are the proto-objects; these are generated in parallel and collect features of the scene at a specific location and time. These structures fade away if the region is no longer attended. We introduce a method to model these structures computationally. Our model is based experimentally on data collected in dynamic 3D environments via the Gaze Machine, a gaze measurement framework. This framework records pupil motion at the required speed and projects the point of regard into 3D space (Pirri, Pizzoli, & Rudi, 2011; Pizzoli, Rigato, Shabani, & Pirri, 2011). To generate proto-objects, the model is extended to vibrating circular membranes whose initial displacement is generated by the features that have been selected by classification. The energy of the vibrating membranes is used to predict saliency in visual search tasks. © 2013 Elsevier B.V
Linear solvability in the viewing graph
The Viewing Graph [1] represents several views linked by the corresponding fundamental matrices, estimated pairwise. Given a Viewing Graph, the tuples of consistent camera matrices form a family that we call the Solution Set. This paper provides a theoretical framework that formalizes different properties of the topology, linear solvability and number of solutions of multi-camera systems. We systematically characterize the topology of the Viewing Graph in terms of its solution set by means of the associated algebraic bilinear system. Based on this characterization, we provide conditions about the linearity and the number of solutions and define an inductively constructible set of topologies which admit a unique linear solution. Camera matrices can thus be retrieved efficiently and large viewing graphs can be handled in a recursive fashion. The results apply to problems such as the projective reconstruction from multiple views or the calibration of camera networks. © 2011 Springer-Verlag Berlin Heidelberg
Multimodal speaker recognition in a conversation scenario
As a step toward the design of a robot that can take part in a conversation, we propose a robotic system that, taking advantage of multiple perceptual capabilities, actively follows a conversation among several human subjects. The essential idea of our proposal is that the robot system can dynamically change the focus of its attention according to visual or audio stimuli, in order to track the current speaker throughout the conversation and infer her identity.
Coherence fields for 3D saliency prediction
In the coherence theory of attention [26], a coherence field is defined by a hierarchy of structures supporting the activities across the different stages of visual attention. At the interface between the low-level and mid-level attention processing stages are the proto-objects, generated in parallel and collecting features of the scene at a specific location and time. These structures fade away if the region is no longer attended. We introduce a method to model these structures computationally on the basis of experiments made in dynamic 3D environments, where the only control is due to the Gaze Machine, a gaze measurement framework that can record pupil motion at the required speed and project the point of regard into 3D space [25],[24]. We also show how, from these volatile structures, it is possible to predict saliency in 3D dynamic environments. © 2013 Springer-Verlag
Help me to help you: How to learn intentions, actions and plans
The collaboration between a human and a robot is here understood as a learning process mediated by the instructor's prompting behaviours and by the apprentice, which collects information from them to learn a plan. The instructor wears the Gaze Machine, a wearable device that gathers and conveys visual and audio input from the instructor while a task is being executed. The robot, on the other hand, is eager to learn the best sequence of actions, their timing and how they interlace. The cross relation among actions is specified both in terms of time intervals for their execution and in terms of location in space, to cope with the instructor's interaction with people and objects in the scene. We outline this process: how to transform the rich information delivered by the Gaze Machine into a plan. Specifically, we describe how to obtain a map of the instructor's positions and gaze position via visual SLAM and gaze fixations; further, how to obtain an action map from the running commentaries and the topological maps; and, finally, how to obtain a temporal net of the relevant actions that have been extracted. The learned structure is then managed by the flexible time paradigm of flexible planning in the Situation Calculus for execution monitoring and plan generation. Copyright © 2011, Association for the Advancement of Artificial Intelligence. All rights reserved.
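To make the idea of a temporal net over action intervals more concrete, here is a minimal sketch that uses a simple temporal network as a stand-in. This is an assumption for illustration only: the abstract does not state which temporal formalism is used, and the function name `tighten_stn`, the event numbering and the example bounds are all hypothetical. Each constraint bounds the time difference between two events (for instance the start and end of an action), and a Floyd-Warshall pass either tightens all bounds or detects that they cannot be satisfied together.

```python
import math


def tighten_stn(num_events, constraints):
    """Tighten a simple temporal network.

    constraints: list of (i, j, lo, hi), meaning lo <= t_j - t_i <= hi.
    Returns a matrix d where d[i][j] is the tightest upper bound on
    t_j - t_i, or None if the constraints are inconsistent.
    """
    d = [[0.0 if i == j else math.inf for j in range(num_events)]
         for i in range(num_events)]
    for i, j, lo, hi in constraints:
        d[i][j] = min(d[i][j], hi)    # t_j - t_i <= hi
        d[j][i] = min(d[j][i], -lo)   # t_i - t_j <= -lo
    # All-pairs shortest paths over the distance graph (Floyd-Warshall).
    for k in range(num_events):
        for i in range(num_events):
            for j in range(num_events):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # A negative cycle means no assignment of times satisfies every bound.
    if any(d[i][i] < 0 for i in range(num_events)):
        return None
    return d
```

As a usage example with hypothetical events 0 (start of a grasp), 1 (end of the grasp) and 2 (start of a lift), the constraints `[(0, 1, 2, 5), (1, 2, 0, 3), (0, 2, 1, 10)]` are consistent and tighten the bound on t_2 - t_0 from [1, 10] to [2, 8]. Replacing the last constraint with `(0, 2, 10, 12)` makes the network inconsistent, and the function returns `None`.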