Trash-ICRA19: A Bounding Box Labeled Dataset of Underwater Trash
This dataset is available for download as a .zip file named Trash_ICRA19.zip. The compressed folder contains both the dataset and configurations for testing the dataset with deep learning algorithms. There is a README in the top directory, and in most lower directories, describing the files and subdirectories that directory contains. This data was sourced from the J-EDI dataset of marine debris. The videos that comprise that dataset vary greatly in quality, depth, objects in scenes, and the cameras used. They contain images of many different types of marine debris, captured in real-world environments, providing a variety of objects in different states of decay, occlusion, and overgrowth. Additionally, the clarity of the water and quality of the light vary significantly from video to video. These videos were processed to extract 5,700 images, which comprise this dataset, all labeled with bounding boxes on instances of trash, biological objects such as plants and animals, and ROVs. The eventual goal is to develop efficient and accurate trash detection methods suitable for onboard robot deployment. It is our hope that the release of this dataset will facilitate further research on this challenging problem, bringing the marine robotics community closer to a solution for the urgent problem of autonomous trash detection and removal.
Fulton, Michael S; Hong, Jungseok; Sattar, Junaed. (2020). Trash-ICRA19: A Bounding Box Labeled Dataset of Underwater Trash. Retrieved from the University Digital Conservancy, https://doi.org/10.13020/x0qn-y082
TrashCan 1.0 An Instance-Segmentation Labeled Dataset of Trash Observations
The dataset is uploaded in three .zip files: dataset.zip contains the images and labels of the TrashCan dataset, while instance_checkpoints.zip and material_checkpoints.zip contain network configurations and checkpoints for Faster-RCNN and Mask-RCNN evaluations of the TrashCan-Instance and TrashCan-Material dataset versions. The TrashCan dataset comprises annotated images (7,212 images currently) which contain observations of trash, ROVs, and a wide variety of undersea flora and fauna. The annotations in this dataset take the form of instance segmentation annotations: bitmaps containing a mask marking which pixels in the image contain each object. The imagery in TrashCan is sourced from the J-EDI (JAMSTEC E-Library of Deep-sea Images) dataset, curated by the Japan Agency for Marine-Earth Science and Technology (JAMSTEC). This dataset contains videos from ROVs operated by JAMSTEC since 1982, largely in the Sea of Japan. The dataset has two versions, TrashCan-Material and TrashCan-Instance, corresponding to different object class configurations. The eventual goal is to develop efficient and accurate trash detection methods suitable for onboard robot deployment. While datasets have previously been created containing bounding box level annotations of trash in marine environments, TrashCan is, to the best of our knowledge, the first instance-segmentation annotated dataset of underwater trash. It is our hope that the release of this dataset will facilitate further research on this challenging problem, bringing the marine robotics community closer to a solution for the urgent problem of autonomous trash detection and removal.
Hong, Jungseok; Fulton, Michael S; Sattar, Junaed. (2020). TrashCan 1.0 An Instance-Segmentation Labeled Dataset of Trash Observations. Retrieved from the University Digital Conservancy, https://doi.org/10.13020/g1gx-y834
On Applications of GANs and Their Latent Representations
This report describes various applications of Generative Adversarial Networks
(GANs) for image generation, image-to-image translation, and vehicle control.
With this, we also investigate the role played by the computed latent space, and
show various ways of exploiting this space for controlled image generation and
exploration. We show one pure generative method which we call AstroGAN that is
able to generate realistic images of galaxies from a set of galaxy morphologies. Two
image-to-image translation methods are also displayed: StereoGAN, which is able
to generate a pair of stereo images given a single image; Underwater GAN, which
is able to restore distorted imagery exhibited in underwater environments. Lastly,
we show a generative model for generating actions in a simulated self-driving car
environment.
Fabbri, Cameron; Sattar, Junaed. (2018). On Applications of GANs and Their Latent Representations. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/216028
Video Diver Dataset (VDD-C) 100,000 annotated images of divers underwater
The VDD-C data comes in four zip files:
- original_data.zip: Contains the original images and .xml label files, along with some information required to process the data into the proper formats.
- script.zip: Contains the script used to generate the labels and images folders from the original_data.
- labels.zip: Contains a variety of label types, in voc, yolo, tfrecord, and tfsequence formats. These labels are also properly filtered to correct inaccurate coordinates for annotations and remove unwanted annotations.
- images.zip: Contains the images of the dataset, filtered to remove poor quality images.
This dataset contains over 100,000 annotated images of divers underwater, gathered from videos of divers in pools and the Caribbean off the coast of Barbados. It is intended for the development and testing of diver detection algorithms for use in autonomous underwater vehicles (AUVs). Because the images are sourced from videos, they are largely sequential, meaning that temporally aware algorithms (video object detectors) can be trained and tested on this data. Training on this data improved our current diver detection algorithms significantly because we increased our training set size by 17 times compared to our previous best dataset. It is freely available to anyone who wants to use it.
National Science Foundation #1845364 & #00074041; MNRI Seed Grant
de Langis, Karin; Fulton, Michael; Sattar, Junaed. (2021). Video Diver Dataset (VDD-C) 100,000 annotated images of divers underwater. Retrieved from the University Digital Conservancy, https://doi.org/10.13020/6qrp-wy09
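The VOC and YOLO label formats mentioned for labels.zip differ mainly in how a box is written down. The following is a minimal sketch of that conversion, not the script shipped in script.zip: the function name `voc_to_yolo` is hypothetical, and clamping coordinates to the image bounds is only an assumed example of "correcting inaccurate coordinates".

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a Pascal VOC box (pixel corners) to a YOLO box
    (center x, center y, width, height, all normalized to [0, 1])."""
    # Assumption: coordinates that fall outside the frame are clamped to it.
    xmin, xmax = max(0.0, min(xmin, img_w)), max(0.0, min(xmax, img_w))
    ymin, ymax = max(0.0, min(ymin, img_h)), max(0.0, min(ymax, img_h))

    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


if __name__ == "__main__":
    # A 200x200 px box in a 400x500 px image.
    print(voc_to_yolo(100, 50, 300, 250, img_w=400, img_h=500))
```

A YOLO label line would then consist of a class index followed by these four numbers.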
Using LED Gaze Cues to Enhance Underwater Human-Robot Interaction
In the underwater domain, conventional methods of communication between divers and Autonomous Underwater Vehicles (AUVs) are heavily impeded. Radio signal attenuation, water turbidity (cloudiness), and low light levels make it difficult for a diver and an AUV to relay information to each other. Current solutions such as underwater tablets, slates, and tags are not intuitive and introduce additional logistical challenges and points of failure. Intuitive human-robot interaction (HRI) is imperative to ensuring seamless collaboration between AUVs and divers. Eye gazes are a natural form of relaying information between humans and are an underutilized channel of communication in AUVs, while lights help eliminate concerns of darkness, turbidity, and signal attenuation, which often impair diver-robot collaboration. This research aims to implement eye gazes on LoCO (a low-cost AUV) using RGB LED rings in order to pursue intuitive forms of HRI underwater while overcoming common barriers to communication. To test the intuitiveness of the design, 9 participants with no prior knowledge of LoCO and HRI were tasked with recalling the meanings of each of 16 gaze indicators during pool trials, having been exposed to the indicators 3 to 4 days earlier. Compared to the baseline text display communication, which had a recall of 100%, the recall for most eye gaze animations was exceptionally high, with an 80% accuracy score for 11 of the 16 indicators. These results suggest that certain eye indicators convey information more intuitively than others, and that additional training can make gaze indicators a viable method of communication between humans and robots.
This research was supported by the University of Minnesota Undergraduate Research Opportunities Program (UROP).
Special thanks to the Interactive Robotics and Vision Laboratory (IRV), whose guidance and expertise in HRI made this UROP possible. For more information on the IRV Lab and the LoCO AUV, visit https://irvlab.cs.umn.edu/
Prabhu, Aditya; Fulton, Michael; Sattar, Junaed, Ph.D. (2022). Using LED Gaze Cues to Enhance Underwater Human-Robot Interaction. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/227303
A visual servoing system for an amphibious legged robot
We present a visual servoing system for an amphibious legged robot: a monocular vision-based servoing mechanism that enables the robot to track and follow a target both underwater and on the ground. We used three different tracking algorithms to track and localize the target in the image, with color being the tracked feature. Tracking is performed based on the object's color, its color distribution, and its color distribution with a probabilistic kernel. Output from the tracker is channeled to a proportional-integral-derivative controller, which generates steering commands for the robot controller. The robot controller in turn takes the steering commands and generates motor commands for the six legs of the robot. A large class of significant applications can be addressed by allowing such a robot to follow a diver or some other moving target. The system has been evaluated in open water under natural lighting conditions, and has successfully tracked and followed a wide variety of target objects.
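The tracker-to-controller step described above can be sketched as follows. This is a generic textbook PID loop under stated assumptions, not the authors' implementation: the `PID` class, the `steering_error` helper, the gains, and the 640-pixel image width are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook proportional-integral-derivative controller."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative
        # from the previous error (zero on the very first update).
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def steering_error(target_x: float, image_width: float) -> float:
    """Horizontal offset of the tracked target from the image center,
    normalized to [-1, 1] (assumed error signal for the yaw axis)."""
    half = image_width / 2
    return (target_x - half) / half


if __name__ == "__main__":
    yaw_pid = PID(kp=0.8, ki=0.1, kd=0.05)  # illustrative gains only
    for x in (500, 450, 400, 360, 330):     # target drifting toward center
        command = yaw_pid.update(steering_error(x, 640), dt=0.1)
        print(f"target_x={x} steering={command:.3f}")
```

A second controller of the same form could drive pitch from the vertical offset; how steering commands map to leg motions is left to the robot controller, as in the abstract.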
Towards a robust framework for visual human-robot interaction
This thesis presents a vision-based interface for human-robot interaction and control for autonomous robots in arbitrary environments. Vision has the advantage of being a low-power, unobtrusive sensing modality. The advent of robust algorithms and a significant increase in computational power are the two most significant reasons for such widespread integration. The research presented in this dissertation looks at visual sensing as an intuitive and uncomplicated method for a human operator to communicate at close range with a mobile robot. The array of communication paradigms we investigate includes, but is not limited to, visual tracking and servoing, programming of robot behaviors with visual cues, visual feature recognition, mapping and identification of individuals through gait characteristics using spatio-temporal visual patterns, and quantifying the performance of these human-robot interaction approaches. The proposed framework enables a human operator to control and program a robot without the need for any complicated input interface, and also enables the robot to learn about its environment and the operator using the visual interface. We investigate the applicability of machine learning methods, supervised learning in particular, to train the vision system using stored training data. A key aspect of our work is a system for human-robot dialog for safe and efficient task execution under uncertainty. We present extensive validation through a set of human-interface trials, and also demonstrate the applicability of this research in the field on the Aqua amphibious robot platform in the underwater domain. While our framework is not specific to robots operating in the underwater domain, vision underwater is affected by a number of issues, such as lighting variations and color degradation, among others.
Evaluating the approach in such difficult operating conditions provides a definitive validation of our approach.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which
counts only the first author of a cited work, and a non-traditional type, which takes into account the first 5
authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
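The two counting schemes compared above can be illustrated with a small sketch. It is a simplified illustration, not the study's procedure: the function name `cocitation_counts`, the `max_authors` parameter, and the single-letter author names are hypothetical, and for simplicity co-authors of the same cited work are also counted as co-cited with each other.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations.

    citing_papers: one reference list per citing paper; each reference
    is the ordered author list of a cited work.
    max_authors: how many leading authors of each cited work are counted
    (1 = traditional first-author counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Every pair of authors cited by the same paper is co-cited once.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    papers = [
        [["A", "B"], ["C"]],       # citing paper 1
        [["D", "A"], ["C", "E"]],  # citing paper 2
    ]
    print(cocitation_counts(papers, max_authors=1))
    print(cocitation_counts(papers, max_authors=5))
```

With first-author counting, author A is credited only for the work A leads; counting up to five authors also credits A for the work led by D, so the A and C pair is co-cited in both citing papers.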
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
