Search CORE

1,721,003 research outputs found

Problems in group testing estimation and design

Author: Haber Gregory
Publication venue
Publication date: 01/01/2018
Field of study

Group testing, which includes any procedure in which units are tested in pools rather than individually, has been an active area of research in the statistical literature for over 70 years. Much of this research has been focused on the problem of estimation in which random variables representing some binary trait are pooled together to estimate the underlying Bernoulli parameter. The effective use of such procedures has been shown to lead to large reductions in terms of mean square error (MSE), resulting in more accurate estimates, or allowing for fewer tests to be carried out while maintaining a fixed level of MSE. Despite this, group testing estimation problems are equivalent to estimating a nonlinear function of the underlying parameter so that previous work has relied heavily on large sample methods to establish results. In practice, however, group testing problems will usually involve very small sample sizes so that such methods may be inappropriate. In this dissertations we explore several problems related to group testing estimation and design based on small sample methods. The first problem we consider is the construction of unbiased estimators. While the standard binomial model does not yield an unbiased estimator, we give a construction based on an inverse binomial model which samples until a fixed number of negative pools are observed. This is extended to include cases where misclassification errors are present, and we show that, while an unbiased estimator can be constructed, it is improper, yielding values outside the parameter space. This is extended to the entire class of binomial sampling plans when misclassification is present, showing that no proper unbiased estimator exists in this broad class. These ideas are extended again to the case of multinomial sampling, where we show that under any sampling plan it is impossible to find a proper unbiased estimator, even without misclassification. The next problem we consider is the estimation of two diseases simultaneously using group testing methods. No closed form MLE exists in this case, and numerical methods are difficult due to a high frequency of boundary estimates. We propose an EM algorithm based estimator and provide proofs of convergence, even on the boundary of the parameter space. Several closed form alternatives are also provided, primarily with the aim of bias reduction. The final problem we consider is that of choosing the group size for experiments when only a small number of tests can be carried out. Previous methods have relied heavily on good prior knowledge of the parameter value to be estimated, with the needed accuracy decreasing with the sample size. We propose simple random walk based adaptive procedures which minimize the need for such prior information. These designs are shown numerically to outperform the large sample based methods previously found in the literature. These methods are extended to the case when misclassification errors are present, with similar results

MD-SOAR Maryland Shared Open Access Repository

Monotonicity in the sample size of the length of classical confidence intervals

Author: Malinovsky Yaakov
Kagan Abram M.
Publication venue
Publication date: 05/09/2012
Field of study

It is proved that the average length of standard confidence intervals for parameters of Gamma and normal distributions monotonically decreases with the sample size. The proofs are based on fine properties of the classical Gamma function.The authors would like to thank the referee for a careful reading of the manuscript and very helpful suggestions. The work of the second author was partially supported by a 2012 UMBC Summer Faculty Fellowship grant.https://www.sciencedirect.com/science/article/pii/S016771521200330

MD-SOAR Maryland Shared Open Access Repository

Monitoring Threshold Functions over Distributed Data Streams with Node Dependent Constraints

Author: Malinovsky Yaakov
Kogan Jacob
Publication venue
Publication date: 18/09/2012
Field of study

Monitoring data streams in a distributed system has attracted considerable interest in recent years. The task of feature selection (e.g., by monitoring the information gain of various features) requires a very high communication overhead when addressed using straightforward centralized algorithms. While most of the existing algorithms deal with monitoring simple aggregated values such as frequency of occurrence of stream items, motivated by recent contributions based on geometric ideas we present an alternative approach. The proposed approach enables monitoring values of an arbitrary threshold function over distributed data streams through stream dependent constraints applied separately on each stream. We report numerical experiments on a real-world data that detect instances where communication between nodes is required, and compare the approach and the results to those recently reported in the literature.The authors thank anonymous reviewers whose valuable comments greatly enhanced exposition of the results. The work of the first author was supported in part by 2012 UMBC Summer Faculty Fellowship grant.https://www.mdpi.com/1999-4893/5/3/37

MD-SOAR Maryland Shared Open Access Repository

Best invariant and minimax estimation of quantiles in finite populations

Author: Malinovsky Yaakov
Rinott Yosef
Publication venue
Publication date: 18/02/2011
Field of study

The theoretical literature on quantile and distribution function estimation in infinite populations is very rich, and invariance plays an important role in these studies. This is not the case for the commonly occurring problem of estimation of quantiles in finite populations. The latter is more complicated and interesting because an optimal strategy consists not only of an estimator, but also of a sampling design, and the estimator may depend on the design and on the labels of sampled individuals, whereas in iid sampling, design issues and labels do not exist. We study the estimation of finite population quantiles, with emphasis on estimators that are invariant under the group of monotone transformations of the data, and suitable invariant loss functions. Invariance under the finite group of permutation of the sample is also considered. We discuss nonrandomized and randomized estimators, best invariant and minimax estimators, and sampling strategies relative to different classes. Invariant loss functions and estimators in finite population sampling have a nonparametric flavor, and various natural combinatorial questions and tools arise as a result.We thank Gil Kalai for several enlightening discussions of this work, and three reviewers for many useful comments. This research was supported in part by Grant no. 473/04 from the Israel Science Foundation. The first author also supported by the Intramural Research Program of the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development.https://www.sciencedirect.com/science/article/pii/S037837581100073

MD-SOAR Maryland Shared Open Access Repository

Conjectures on optimal nested generalized group testing algorithm

Author: Malinovsky Yaakov
Publication venue
Publication date: 22/07/2020
Field of study

Consider a finite population of N items, where item i has a probability pi to be defective. The goal is to identify all items by means of group testing. This is the generalized group testing problem (hereafter GGTP). In the case of p1 = … = pN = p, Yao and Hwang (1990) proved that the pairwise testing algorithm is the optimal nested algorithm, with respect to the expected number of tests, for all N if and only if p∈ [1−1/2,(3−5)/2] (R-range hereafter) (an optimal at the boundary values). In this note, we present a result that helps to define the generalized pairwise testing algorithm (hereafter GPTA) for the GGTP. We present two conjectures: (a) when all pi,i = 1,…,N belong to the R-range, GPTA is the optimal procedure among nested procedures applied to pi of nondecreasing order; and (b) if all pi,i = 1,…,N belong to the R-range, GPTA is the optimal nested procedure, that is, minimizes the expected total number of tests with respect to all possible testing orders in the class of nested procedures. Although these conjectures are logically reasonable, we were only able to empirically verify the first one up to a particular level of N. We also provide a short survey of GGTP.https://onlinelibrary.wiley.com/doi/abs/10.1002/asmb.255

MD-SOAR Maryland Shared Open Access Repository

Sterrett Procedure for the Generalized Group Testing Problem

Author: Malinovsky Yaakov
Publication venue
Publication date: 04/11/2017
Field of study

Group testing is a useful method that has broad applications in medicine, engineering, and even in airport security control. Consider a finite population of N items, where item i has a probability pi to be defective. The goal is to identify all items by means of group testing. This is the generalized group testing problem. The optimum procedure, with respect to the expected total number of tests, is unknown even in case when all pi are equal. (Hwang 1975) proved that an ordered partition (with respect to pi) is the optimal for the Dorfman procedure (procedure D), and obtained an optimum solution (i.e., found an optimal partition) by dynamic programming. In this paper, we investigate the Sterrett procedure (procedure S). We provide close form expression for the expected total number of tests, which allows us to find the optimum arrangement of the items in the particular group. We also show that an ordered partition is not optimal for the procedure S or even for a slightly modified Dorfman procedure (procedure D′). This discovery implies that finding an optimal procedure S appears to be a hard computational problem. However, by using an optimal ordered partition for all procedures, we show that procedure D′ is uniformly better than procedure D, and based on numerical comparisons, procedure S is uniformly and significantly better than procedures D and D′.https://link.springer.com/article/10.1007/s11009-017-9601-4#Abs

MD-SOAR Maryland Shared Open Access Repository

More on Round-Robin Tournament Models with a Unique Maximum Score

Author: Malinovsky Yaakov
Publication venue
Publication date: 04/11/2024
Field of study

In this note we extend a recent result showing the uniqueness of the maximum score in a classical round-robin tournament to the round-robin tournament models with equally strong players.Research supported in part by BSF grant 2020063http://arxiv.org/abs/2411.0214

MD-SOAR Maryland Shared Open Access Repository

A Note on the Closed-Form Solution for the Longest Head Run Problem of Abraham de Moivre

Author: Malinovsky Yaakov
Publication venue
Publication date: 11/10/2021
Field of study

The problem of the longest head run was introduced and solved by Abraham de Moivre in the second edition of his book Doctrine of Chances (de Moivre, 1738). The closed-form solution as a finite sum involving binomial coefficients was provided in Uspensky (1937). Since then, the problem and its variations and extensions have found broad interest and diverse applications. Surprisingly, a very simple closed form can be obtained, which we present in this note.https://link.springer.com/article/10.1007/s00283-021-10125-

MD-SOAR Maryland Shared Open Access Repository

On optimal policy in the group testing with incomplete identification

Author: Malinovsky Yaakov
Publication venue
Publication date: 24/04/2018
Field of study

Consider a very large (infinite) population of items, where each item independent from the others is defective with probability p, or good with probability q=1?p. The goal is to identify N good items as quickly as possible. The following group testing policy (policy A) is considered: test items together in the groups, if the test outcome of group i of size ni is negative, then accept all items in this group as good, otherwise discard the group. Then, move to the next group and continue until exact N good items are found. The goal is to find an optimal testing configuration, i.e., group sizes, under policy A, such that the expected waiting time to obtain N good items is minimal. Recently, Gusev (2012) found an optimal group testing configuration under the assumptions of constant group size and N=?. In this note, an optimal solution under policy A for finite N is provided.https://www.sciencedirect.com/science/article/pii/S016771521830169

MD-SOAR Maryland Shared Open Access Repository

A note on the distribution of the extreme degrees of a random graph via the Stein-Chen method

Author: Malinovsky Yaakov
Publication venue
Publication date: 02/07/2024
Field of study

We offer an alternative proof, using the Stein-Chen method, of Bollobás' theorem concerning the distribution of the extreme degrees of a random graph. The same method also applies in a more general setting where the probability of every pair of vertices being connected by edges depends on the number of vertices.This research was supported by grant no 2020063 from the United States-Israel Binational Science Foundation (BSF).https://link.springer.com/article/10.1007/s11009-024-10091-

MD-SOAR Maryland Shared Open Access Repository