1,721,395 research outputs found

    Making Decisions by Unlabeled Bits

    No full text
    In a statistical decision problem, one observes independent samples and is asked to decide between two mutually exclusive states of nature. With knowledge of the data distributions under the two hypotheses, it is known that the asymptotically optimal error probabilities converge to zero at a rate given by the error exponent function, which quantifies the information for detection contained in the data. As a new element, suppose that the decision problem is to be solved without data labeling, i.e., with no knowledge of which observation comes from which distribution. What is the error exponent in this case? How much information for detection is contained in the samples and how much is lost with the labels? This problem - unlabeled detection - is of theoretical relevance per se, but is also attracting growing practical interest, because loss of labels often occurs naturally in big-data settings, user profiling in social networks; and may be deliberate, for example to save bandwidth in inference involving large sensor networks. For binary unlabeled observations and focusing on the low-detectability regime, we derive simple closed-form expressions for the error exponent and related quantities (Chernoff-Stein exponent, Chernoff information), providing new insights. Practical algorithms are then discussed, showing that many decision algorithms proposed in the literature reduce for binary data to simple forms, which highlight their properties and relative merits. A detector with close-to-optimum performance for a wide class of detection problems is proposed

    Shuffled bits in the low-detectability regime

    No full text
    We consider a decision problem in which data are unordered (unlabeled). Recent studies of this problem provide a complete asymptotic characterization of the decision performance for large data size, which is the solution of a convex optimization problem. While this is fully satisfactory from a numerical viewpoint, limited insight is offered because a closed-form explicit expression for the decision performance is, in general, not available. For binary observations and the challenging regime of low-detectability, we derive an extremely simple analytical solution, investigate its properties and discuss the obtained physical insights

    Algorithms and Fundamental Limits for Unlabeled Detection Using Types

    Full text link
    We deal with the classical problem of testing two simple statistical hypotheses but, as a new element, it is assumed that the data vector is observed after an unknown permutation of its entries. What is the fundamental limit for the detection performance in this case? How much information for detection is contained in the entry values and how much in their positions? In the first part of this paper, we answer these questions. In the second part, we focus on practical algorithms. A low-complexity detector solves the detection problem without attempting to estimate the permutation. A modified version of the auction algorithm is then considered, and two greedy algorithms with affordable worst case complexity are presented. The detection operational characteristics of these detectors are investigated by computer experiments. The problem we address is referred to as unlabeled detection and is motivated by large sensor network applications, but applications are also foreseen in different fields, including image processing, social sensing, genome research, and molecular communication

    Generalized score-tests for decision fusion with sensing model uncertainty

    No full text
    This chapter investigates distributed detection of a phenomenon of interest (POI) via decision fusion in wireless sensor networks (WSNs). The decisions are collected by a fusion center (FC), which is in charge of performing a more accurate global decision. So as to account for a realistic scenario, it is assumed that the POI presents a signature with limited spatial extent, and its exact location and emitted amplitude (or energy) are not known. More specifically, when the POI is present, the sensors observe a signal with an attenuation depending on the distance between the sensor and the (unknown) target position, embedded in Gaussian noise. The unavailability of a completely specified model defeats the applicability of the well-known (optimal) likelihood-ratio (LR) test (LRT). As a consequence, in the general case, the FC is usually in charge of solving a composite hypothesis test and the generalized LRT (GLRT) is commonly employed. Unfortunately, in these scenarios, its complexity is typically high. Accordingly, the present chapter discusses the development of generalized score tests as alternatives with reduced computational complexity. After a brief recall of the GLRT for the considered problems, fusion rules corresponding to generalized versions of well-known score tests are introduced, based on Davies’framework, since the resulting problems include nuisance parameters only under the POI-present hypothesis. The focus is on two relevant signal models, i.e., the cases of random and deterministic unknown signals, leading to one-sided and two-sided testing, respectively. Finally, a convincing (semi-theoretical) rationale for threshold-optimization is presented and analyzed

    Some Approaches to Quantization for Distributed Estimation with Data Association

    No full text
    Quantization for estimation is explored for the case that it must be performed jointly with data association, that is, the case in which measurements are of uncertain origin. Data association requires some sort of gating of distributed observations, and a censoring strategy is proposed. Several quantization philosophies are explored, specifically, uniform quantization, uniform quantization with measurement exchangeability incorporated (the “type” method), and uniform quantization of sorted measurements. The second scheme uses less bandwidth than the third, but it is shown, perhaps surprisingly, that the third preserves more information that may be useful for estimation, and a simple procedure for optimal fused estimation based on this third scheme is given. Interestingly, when compared in terms of rate-distortion curve, the schemes two and three perform similarly; their censored versions offer further improvement in performances due to the uncertain-origin property of the measurements
    corecore