1,720,974 research outputs found
A Fast MPEG's CDVS Implementation for GPU Featured in Mobile Devices
The Moving Picture Experts Group’s Compact Descriptors for Visual Search (MPEG’s CDVS) intends to standardize technologies in order to enable an interoperable, efficient, and cross-platform solution for internet-scale visual search applications and services. Among the key technologies within
CDVS, we recall the format of visual descriptors, the descriptor extraction process, and the algorithms for indexing and matching. Unfortunately, these steps require precision and computation accuracy. Moreover, they are very time-consuming, as they need running times in the order of seconds when implemented on the central processing unit (CPU) of modern mobile devices. In this paper, to reduce computation times and
maintain precision and accuracy, we re-design, for many-cores embedded graphical processor units (GPUs), all main local descriptor extraction pipeline phases of the MPEG’s CDVS standard. To reach this goal, we introduce new techniques to adapt the standard algorithm to parallel processing. Furthermore, to reduce memory accesses and efficiently distribute the kernel workload, we use new approaches to store and retrieve
CDVS information on proper GPU data structures. We present a complete experimental analysis on a large and standard test set. Our experiments show that our GPU-based approach is remarkably faster than the CPU-based reference implementation of the standard, and it maintains a comparable precision in terms of true and false positive rates
Moving object detection in heterogeneous conditions in embedded systems
This paper presents a system for moving object exposure, focusing on pedestrian detection, in external, unfriendly, and heterogeneous environments. The system manipulates and accurately merges information coming from subsequent video frames, making small computational efforts in each single frame. Its main characterizing feature is to combine several well-known movement detection and tracking techniques, and to orchestrate them in a smart way to obtain good results in diversified scenarios. It uses dynamically adjusted thresholds to characterize different regions of interest, and it also adopts techniques to efficiently track movements, and detect and correct false positives. Accuracy and reliability mainly depend on the overall receipt, i.e., on how the software system is designed and implemented, on how the different algorithmic phases communicate information and collaborate with each other, and on how concurrency is organized. The application is specifically designed to work with inexpensive hardware devices, such as off-the-shelf video cameras and small embedded computational units, eventually forming an intelligent urban grid. As a matter of fact, the major contribution of the paper is the presentation of a tool for real-time applications in embedded devices with finite computational (time and memory) resources. We run experimental results on several video sequences (both home-made and publicly available), showing the robustness and accuracy of the overall detection strategy. Comparisons with state-of-the-art strategies show that our application has similar tracking accuracy but much higher frame-per-second rates
CDVS feature selection on embedded systems
Mobile image retrieval and pairwise matching applications pose a unique set of challenges. As communicating large amount of data could take tens of seconds over a slow wireless link, MPEG defined the CDVS standard to transfer over the network only the data essential to the matching, and not the entire image. However, the extraction of salient image features is a very time consuming process, and it may still require times in the order of seconds when running on CPU of modern mobile devices. To reduce feature extraction computation times, we re-design the MPEG-CDVS feature selection algorithm for highly parallel embedded GPUs. We consider two different approaches compliant to the standard. In the first one, feature selection is performed before the orientation assignment stage. In the second one, it is performed after. We present a complete experimental analysis on a large test set. Our experiments show that our GPU-based approaches are remarkably faster than the CPU-based reference implementation of the standard, while maintaining a comparable precision in terms of true and false positive rates. To sum up, our solutions have been proved to be effective for real-time applications running on modern embedded systems
Street Viewer: An Autonomous Vision Based Traffic Tracking System
The development of intelligent transportation systems requires the availability of both accurate traffic information in real time and a cost-effective solution. In this paper, we describe Street Viewer, a system capable of analyzing the traffic behavior in different scenarios from images taken with an off-the-shelf optical camera. Street Viewer operates in real time on embedded hardware architectures with limited computational resources. The system features a pipelined architecture that, on one side, allows one to exploit multi-threading intensively and, on the other side, allows one to improve the overall accuracy and robustness of the system, since each layer is aimed at refining for the following layers the information it receives as input. Another relevant feature of our approach is that it is self-adaptive. During an initial setup, the application runs in learning mode to build a model of the flow patterns in the observed area. Once the model is stable, the system switches to the on-line mode where the flow model is used to count vehicles traveling on each lane and to produce a traffic information summary. If changes in the flow model are detected, the system switches back autonomously to the learning mode. The accuracy and the robustness of the system are analyzed in the paper through experimental results obtained on several different scenarios and running the system for long periods of tim
Efficient Complex High-Precision Computations on GPUs without Precision Loss
General-purpose computing on graphics processing units is the utilization of a graphics processing unit (GPU) to perform computation in applications traditionally handled by the central processing unit. Many attempts have been made to implement well-known algorithms on embedded and mobile GPUs. Unfortunately, these applications are computationally complex and often require high precision arithmetic, whereas embedded and mobile GPUs are designed speci ̄cally for graphics, and thus are very restrictive in terms of input/output, precision, programming style and primitives available. This paper studies how to implement effcient and accurate high-precision algorithms on embedded GPUs adopting the OpenGL ES language. We discuss the problems arising during the design phase, and we detail our implementation choices, focusing on the SIFT and ALP key-point detectors. We transform standard, i.e., single (or double) precision floating-point computations, to reduced-precision GPU arithmetic without precision loss. We develop a desktop framework to simulate Gaussian Scale Space transforms on all possible target embedded GPU platforms, and with all possible range and precision arithmetic. We illustrate how to re-engineer standard Gaussian Scale Space computations to mobile multi-core parallel GPUs using the OpenGL ES language. We present experiments on a large set of standard images, proving how efficiency and accuracy can be maintained on different target platforms. To sum up, we present a complete framework to minimize future programming effort, i.e., to easily check, on different embedded platforms, the accuracy and performance of complex algorithms requiring high-precision computations
- …
