1,721,360 research outputs found
Method and system for supporting multi-method dispatching in object-oriented programming
In-Place Suffix Sorting
Given string T = T[1,... ,n], the suffix sorting problem is to lexicographically sort the suffixes T[i,... ,n] for all i. This problem is central to the construction of suffix arrays and trees with many applications in string processing, computational biology and compression. A bottleneck in these applications is the amount of workspace needed to perform suffix sorting beyond the space needed to store the input as well as the output. In particular, emphasis is even on the constant c in the O(n) = cn space algorithms known for this problem, Currently the best previous result [5] takes O (nv + n log n) time and O (n/ √v) extra space, for any v ∈6 [1, √n] for strings from a general alphabet. We improve this and present the first known in-place suffix sorting algorithm. Our algorithm takes O (n log n) time using O(1) workspace and is optimal in the worst case for the general alphabet. © Springer-Verlag Berlin Heidelberg 2007
Optimal Cache-Aware Suffix Selection
Given string S[1⋯N] and integer k, the suffix selection problem is to determine the kth lexicographically smallest amongst the suffixes S[i ⋯ N], 1 ≤ i ≤ N. We study the suffix selection problem in the cache-aware model that captures two-level memory inherent in computing systems, for a cache of limited size M and block size B. The complexity of interest is the number of block transfers. We present an optimal suffix selection algorithm in the cache-aware model, requiring θ (N/B) block transfers, for any string S over an unbounded alphabet (where characters can only be compared), under the common tall-cache assumption (i.e. M = Ω (B1+∈), where ∈ < 1). Our algorithm beats the bottleneck bound for permuting an input array to the desired output array, which holds for nearly any nontrivial problem in hierarchical memory models. © G. Franceschini, R. Grossi, and S. Muthukrishnan
Optimal Suffix Selection
Given a string S[1 ⋯ n], the suffix selection problem is to find the kth lexicographically smallest amongst the n suffixes S[i ⋯ n], for i = 1, . . . , n. In particular, the fundamental question is if selection can be performed more efficiently than sorting all the suffixes. If one considered n numbers, they can be sorted using Θ(n log n) comparisons and the classical result from 70's is that selection can be done using O(n) comparisons. Thus selection is provably more efficient than sorting, for n numbers. Suffix sorting can be done using Θ(n log n) comparisons, but does suffix selection need suffix sorting? We settle this fundamental problem by presenting an optimal, deterministic algorithm for suffix selection using O(n) comparisons. Copyright 2007 ACM
Radix Sorting With No Extra Space
It is well known that n integers in the range [1, nc] can be sorted in O(n) time in the RAM model using radix sorting. More generally, integers in any range [1, U] can be sorted in O(n√log log n) time [5]. However, these algorithms use O(n) words of extra memory. Is this necessary? We present a simple, stable, integer sorting algorithm for words of size O(log n), which works in O(n) time and uses only O(1) words of extra memory on a RAM model. This is the integer sorting case most useful in practice. We extend this result with same bounds to the case when the keys are read-only, which is of theoretical interest. Another interesting question is the case of arbitrary c. Here we present a black-box transformation from any RAM sorting algorithm to a sorting algorithm which uses only O(1) extra space and has the same running time. This settles the complexity of in-place sorting in terms of the complexity of sorting. © Springer-Verlag Berlin Heidelberg 2007
Time and Space Efficient Method-Lookup for Object-Oriented Programs (Extended Abstract)
) S. Muthukrishnan Martin Muller y DIMACS & Univ. of Warwick University of New Mexico 1 Introduction Object-oriented languages (OOLs) are becoming increasingly popular in software development (See [4, 11, 18, 20] on OOLs). The modular units in such languages are abstract data types called classes, comprising data and functions (or selectors in the OOL parlance); each selector has possibly multiple implementations (or methods in OOL parlance) each in a different class. These languages support reusability of code/functions by allowing a class to inherit methods from its superclass in a hierarchical arrangement of the various classes. Therefore, when a selector s is invoked in a class c, the relevant method for s inherited by c has to be determined. That is the fundamental problem of method-lookup in object-oriented programs. Since nearly every statement of such programs calls for a method-lookup, efficient support of OOLs crucially relies on the method-lookup mechanism. The challen..
Optimal Parallel Dictionary Matching and Compression (Extended Abstract)
) Martin Farach S. Muthukrishnan y Rutgers University DIMACS April 26, 1995 Abstract Emerging applications in multi-media and the Human Genome Project require storage and searching of large databases of strings -- a task for which parallelism seems the only hope. In this paper, we consider the parallelism in some of the fundamental problems in compressing strings and in matching large dictionaries of patterns against texts. We present the first work-optimal algorithms for these well-studied problems including the classical dictionary matching problem, optimal compression with a static dictionary and the universal data compression with dynamic dictionary of Lempel and Ziv. All our algorithms are randomized and they are of the Las Vegas type. Furthermore, they are fast, working in time logarithmic in the input size. Additionally, our algorithms seem suitable for a distributed implementation. 1 Introduction Large data bases of strings from multi-media applications and the Human G..
Static optimality theorem for external memory string access
Data warehouses are increasingly storing and managing large scale string data, and dealing with large volume of transactions that update and search string data. Motivated by this context, we initiate the study of self-adjusting data structures for string dictionary operations, that is, data structures that are designed to be efficient on an entire sequence rather than individual string operations. Furthermore, we study this problem in the external memory model where string data is too massive to be stored in internal memory and has to reside in disks; each access to a disk page fetches B items, and the cost of the operations is the number of pages accessed (I/Os). We show that given a dictionary of n strings S-1,...,S-n of total length N, a sequence of m string searches S-i1,Si-2,...,S-im takes
O(Sigma(j=1)(m)((B)/(\Sij\)) + Sigma(i=1)(n)(n(i)log(Bni)/(m))) expected I/Os, where n(i) is the number of times S-i is queried. Inserting or deleting a string S takes O((\S\)(B) + log(B) n) expected amortized I/Os. The Sigma(j=1)(m) (\Sij\)(B) term is a lower bound for reading the input; the Sigma(i=1)(n) n(i) log(B) (ni)/(m) term is the entropy of the query sequence and is a standard information-theoretic lower bound. This is the Static Optimality Theorem for external-memory string access. The static optimality theorem was first formalized and proved by Tarjan and Sleator for numerical dictionary in their classic splay trees paper in 1985 [16]; they left open the B-tree case for numerical values (page 684), and a fortiori, the case of string data in an external-memory setting, that we settle here. We obtain our result not by using traditional "splay" operations on search trees as in [16], but by designing a novel and conceptually simple self-adjusting data structure based on the well-known skip lists
- …
