F-measure curves: a tool to visualize classifier performance under imbalance
Learning from imbalanced data is a challenging problem in many real-world machine learning applications, due in part to the performance bias of most classification systems. This bias may exist for three reasons: (1) classification systems are often optimized and compared using performance measurements that are unsuitable for imbalance problems; (2) most learning algorithms are designed and tested on a fixed imbalance level of data, which may differ from operational scenarios; (3) the preference for correct classification of each class differs from one application to another. This paper investigates specialized performance evaluation metrics and tools for imbalance problems, including scalar metrics that assume a given operating condition (skew level and relative preference of classes), and global evaluation curves or metrics that consider a range of operating conditions. We focus on the case in which the scalar metric F-measure is preferred over other scalar metrics, and propose a new global evaluation space for the F-measure that is analogous to the cost curves for expected cost. In this space, a classifier is represented as a curve that shows its performance over all of its decision thresholds and a range of possible imbalance levels for the desired preference of true positive rate to precision. Curves obtained in the F-measure space are compared to those of existing spaces (ROC, precision-recall and cost). Analogously to cost curves, the proposed F-measure space makes it easier to visualize and compare classifiers’ performance under different operating conditions than ROC and precision-recall spaces do. This space allows us to set the optimal decision threshold of a soft classifier and to select the best classifier among a group.
This space also makes it possible to empirically improve the performance obtained with ensemble learning methods specialized for class imbalance, by selecting and combining the base classifiers for ensembles using a modified version of the iterative Boolean combination algorithm that is optimized using the F-measure instead of AUC. Experiments on a real-world dataset for video face recognition show the advantages of evaluating and comparing different classifiers in the F-measure space versus ROC, precision-recall, and cost spaces. In addition, it is shown that the F-measure performance of the Bagging ensemble method can improve considerably by using the modified iterative Boolean combination algorithm.
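The idea of tracing F-measure against imbalance level can be illustrated with a minimal sketch. It is not the paper's exact construction: it only uses the standard identity that expresses the F-measure through the true positive rate, the false positive rate and the positive-class prior, and keeps the best operating point for each prior. The names `f_beta`, `f_measure_curve`, `roc_points` and `priors`, as well as the sample ROC points, are hypothetical.

```python
def f_beta(tpr, fpr, prior, beta=1.0):
    """F-measure written in terms of TPR, FPR and the positive-class prior.

    With TP = prior * tpr, FP = (1 - prior) * fpr and FN = prior * (1 - tpr),
    F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP).
    """
    b2 = beta * beta
    denom = b2 * prior + prior * tpr + (1.0 - prior) * fpr
    if denom == 0.0:
        # No positives exist and none are predicted: define F as 0.
        return 0.0
    return (1.0 + b2) * prior * tpr / denom


def f_measure_curve(roc_points, priors, beta=1.0):
    """For each prior, keep the operating point with the highest F-measure.

    roc_points: list of (fpr, tpr) pairs, one per decision threshold.
    Returns a list of (prior, best_f, index_of_best_point).
    """
    curve = []
    for prior in priors:
        scores = [f_beta(tpr, fpr, prior, beta) for fpr, tpr in roc_points]
        best = max(range(len(scores)), key=scores.__getitem__)
        curve.append((prior, scores[best], best))
    return curve


# Hypothetical operating points of one soft classifier (fpr, tpr).
roc_points = [(0.0, 0.0), (0.05, 0.6), (0.2, 0.85), (0.5, 0.95), (1.0, 1.0)]
priors = [0.01, 0.1, 0.5]

for prior, best_f, idx in f_measure_curve(roc_points, priors):
    print(f"prior={prior:.2f}  best F1={best_f:.3f}  at point {roc_points[idx]}")
```

Plotting the best F-measure against the prior for several classifiers gives one curve per classifier, which is the kind of comparison the abstract describes; the index of the best point indicates which threshold would be selected at each imbalance level.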
Dropout Injection at Test Time for Post Hoc Uncertainty Quantification in Neural Networks
Scene-specific crowd counting using synthetic training images
Crowd counting is a computer vision task on which considerable progress has recently been made thanks to convolutional neural networks. However, it remains a challenging task even in scene-specific settings, in real-world application scenarios where no representative images of the target scene are available, not even unlabelled, for training or fine-tuning a crowd counting model. Inspired by previous work in other computer vision tasks, we propose a simple but effective solution for the above application scenario, which consists of automatically building a scene-specific training set of synthetic images. Our solution requires neither manual annotation effort from end users nor the collection of representative images of the target scene. Extensive experiments on several benchmark data sets show that the proposed solution can improve the effectiveness of existing crowd counting methods.
On the effectiveness of synthetic data sets for training person re-identification models
Person re-identification is a prominent topic in computer vision due to its security-related applications, and to the fact that issues such as variations in illumination, background, pedestrian pose and clothing appearance make it a very challenging task in real-world scenarios. State-of-the-art supervised methods require a huge manual annotation effort for training data and exhibit limited generalisation capability to unknown target domains. Synthetic data sets have recently been proposed as one possible solution to mitigate these problems, aimed at improving generalisation capability by encompassing a larger number of variations in the above-mentioned visual factors, with no need for manual annotation. However, existing synthetic data sets differ in many aspects, including the number of images, identities and cameras, and in their degree of photorealism, and there is not yet a clear understanding of how all such factors affect person re-identification performance. This work makes a first step towards filling this gap through an in-depth empirical investigation, where we use existing synthetic data sets for model training and real benchmark ones for performance evaluation. Our results provide interesting insights towards developing effective synthetic data sets for person re-identification.
Human-in-the-loop cross-domain person re-identification
Person re-identification is a challenging cross-camera matching problem, which is inherently subject to domain shift. To mitigate it, many solutions have been proposed so far, based on four kinds of approaches: supervised and unsupervised domain adaptation, direct transfer, and domain generalisation; in particular, the first two approaches require target data during system design, respectively labelled and unlabelled. In this work, we consider a very different approach, known as human-in-the-loop (HITL), which consists of exploiting the user’s feedback on target data processed during system operation to improve re-identification accuracy. Although they seem particularly suited to this application, given the inherent interaction with a human operator, HITL methods have been proposed for person re-identification by only a few works so far, and with a different purpose than addressing domain shift. However, we argue that HITL deserves further consideration in person re-identification, also as a potential alternative solution against domain shift. To substantiate our view, we consider simple HITL implementations which do not require model re-training or fine-tuning: they are based on well-known relevance feedback algorithms for content-based image retrieval, and on novel versions of them that we devise specifically for person re-identification. We then conduct an extensive, cross-data set experimental evaluation of our HITL implementations on benchmark data sets, and compare them with a large set of existing methods against domain shift, belonging to the four categories mentioned above. Our results provide evidence that HITL can be as effective as, or even outperform, existing ad hoc solutions against domain shift for person re-identification, even under the simple implementations we consider. We believe that these results can foster further research on HITL in the person re-identification field, where, in our opinion, its potential has not been thoroughly explored so far.
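A relevance feedback loop that needs no model re-training can be sketched as follows. The abstract does not name the algorithms used, so this example assumes the classic Rocchio update from information retrieval as a stand-in; the feature vectors, the weights `alpha`, `beta`, `gamma`, and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors, dim):
    """Component-wise mean; a zero vector when no feedback was given."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query towards confirmed matches and away from rejected ones."""
    dim = len(query)
    pos = mean_vector(positives, dim)
    neg = mean_vector(negatives, dim)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


def rank_gallery(query, gallery):
    """Return gallery identifiers sorted by decreasing similarity to the query."""
    return sorted(gallery, key=lambda name: cosine(query, gallery[name]), reverse=True)


# Hypothetical 2-D features produced by a fixed, pre-trained re-identification model.
gallery = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
query = [1.0, 0.2]

first_ranking = rank_gallery(query, gallery)

# The operator confirms "b" as a true match and rejects "a".
updated_query = rocchio_update(query, positives=[gallery["b"]], negatives=[gallery["a"]])
second_ranking = rank_gallery(updated_query, gallery)

print(first_ranking, second_ranking)
```

Only the query representation changes between rounds, so the feature extractor stays frozen, which matches the abstract's constraint that no re-training or fine-tuning is required.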
Multiple classifier systems for adversarial classification task (Lecture Notes in Computer Science (2009) 5519, (132-141))
An Empirical Evaluation of Nuclei Segmentation from H&E Images in a Real Application Scenario
Cell nuclei segmentation is a challenging task, especially in real applications, where the target images differ significantly from one another. This task is also challenging for methods based on convolutional neural networks (CNNs), which have recently boosted the performance of cell nuclei segmentation systems. However, when training data are scarce or not representative of deployment scenarios, these methods may suffer from overfitting to varying extents, and may hardly generalise to images that differ from the ones used for training. In this work, we focus on real-world, challenging application scenarios where no annotated images from a given dataset are available, or where few images (even unlabelled) of the same domain are available to perform domain adaptation. To simulate this scenario, we performed extensive cross-dataset experiments on several CNN-based state-of-the-art cell nuclei segmentation methods. Our results show that some of the existing CNN-based approaches are capable of generalising to target images which resemble the ones used for training. In contrast, their effectiveness considerably degrades when target and source significantly differ in colours and scale.
