Virginia Tech - Wake Forest University School of Biomedical Engineering & Sciences

Computer Science Technical Reports @Virginia Tech
Not a member yet
    997 research outputs found

    Data Leak Detection As a Service: Challenges and Solutions

    Full text link
    We describe a network-based data-leak detection (DLD) technique, the main feature of which is that the detection does not require the data owner to reveal the content of the sensitive data. Instead, only a small amount of specialized digests are needed. Our technique – referred to as the fuzzy fingerprint – can be used to detect accidental data leaks due to human errors or application flaws. The privacy-preserving feature of our algorithms minimizes the exposure of sensitive data and enables the data owner to safely delegate the detection to others.We describe how cloud providers can offer their customers data-leak detection as an add-on service with strong privacy guarantees. We perform extensive experimental evaluation on the privacy, efficiency, accuracy and noise tolerance of our techniques. Our evaluation results under various data-leak scenarios and setups show that our method can support accurate detection with very small number of false alarms, even when the presentation of the data has been transformed. It also indicates that the detection accuracy does not degrade when partial digests are used. We further provide a quantifiable method to measure the privacy guarantee offered by our fuzzy fingerprint framework

    The Poset Cover Problem

    No full text
    A partial order or poset P = (X,<) on a (finite) base set X determines the set L(P) of linear extensions of P. The problem of computing, for a poset P, the cardinality of L(P) is #P-complete. A set {P1, P2, . . . , Pk} of posets on X covers the set of linear orders that is the union of the L(Pi). Given linear orders L1,L2, . . . ,Lm on X, the Poset Cover problem is to determine the smallest number of posets that cover {L1,L2, . . . ,Lm}. Here, we show that the decision version of this problem is NP- complete. On the positive side, we explore the use of cover relations for finding posets that cover a set of linear orders and present a polynomial-time algorithm to find a partial poset cover

    Adaptive Key Protection in Complex Cryptosystems with Attributes

    Full text link
    In the attribute-based encryption (ABE) model, attributes (as opposed to identities) are used to encrypt messages, and all the receivers with qualifying attributes can decrypt the ciphertext. However, compromised attribute keys may affect the communications of many users who share the same access control policies. We present the notion of forward-secure attribute-based encryption (fs-ABE) and give a concrete construction based on bilinear map and decisional bilinear Diffie-Hellman assumption. Forward security means that a compromised private key by an adversary at time t does not break the confidentiality of the communication that took place prior to t. We describe how to achieve both forward security and encryption with attributes, and formally prove our security against the adaptive chosen-ciphertext adversaries. Our scheme is non-trivial, and the key size only grows polynomially with logN (where N is the number of time periods). We further generalize our scheme to support the individualized key-updating schedule for each attribute, which provides a finer granularity for key management. Our insights on the required properties that an ABE scheme needs to possess in order to be forward-secure compatible are useful beyond the specific fs-ABE construction given. We raise an open question at the end of the paper on the escrow problem of the master key in ABE schemes

    A Practical Method to Estimate Information Content in the Context of 4D-Var Data Assimilation. I: Methodology

    Full text link
    Data assimilation obtains improved estimates of the state of a physical system by combining imperfect model results with sparse and noisy observations of reality. all observations used in data assimilation are equally valuable. The ability to characterize the usefulness of different data points is important for analyzing the effectiveness of the assimilation system, for data pruning, and for the design of future sensor systems. This paper focuses on the four dimensional variational (4D-Var) data assimilation framework. Metrics from information theory are used to quantify the contribution of observations to decreasing the uncertainty with which the system state is known. We establish an interesting relationship between different information-theoretic metrics and the variational cost function/gradient under Gaussian linear assumptions. Based on this insight we derive an ensemble-based computational procedure to estimate the information content of various observations in the context of 4D-Var. The approach is illustrated on a nonlinear test problem. In the companion paper (Singh et al., 2012a) the methodology is applied to a global chemical data assimilation experiment

    A Practical Blended Analysis for Dynamic Features in JavaScript

    Full text link
    The JavaScript Blended Analysis Framework is designed to perform a general-purpose, practical combined static/dynamic analysis of JavaScript programs, while handling dynamic features such as run-time generated code and variadic func- tions. The idea of blended analysis is to focus static anal- ysis on a dynamic calling structure collected at runtime in a lightweight manner, and to rene the static analysis us- ing additional dynamic information. We perform blended points-to analysis of JavaScript with our framework and compare results with those computed by a pure static points- to analysis. Using JavaScript codes from actual webpages as benchmarks, we show that optimized blended analysis for JavaScript obtains good coverage (86.6% on average per website) of the pure static analysis solution and nds ad- ditional points-to pairs (7.0% on average per website) con- tributed by dynamically generated/loaded code

    A Framework to Analyze the Performance of Load Balancing Schemes for Ensembles of Stochastic Simulations

    Full text link
    Ensembles of simulations are employed to estimate the statistics of possible future states of a system, and are widely used in important applications such as climate change and biological modeling. Ensembles of runs can naturally be executed in parallel. However, when the CPU times of individual simulations vary considerably, a simple strategy of assigning an equal number of tasks per processor can lead to serious work imbalances and low parallel efficiency. This paper presents a new probabilistic framework to analyze the performance of dynamic load balancing algorithms for ensembles of simulations where many tasks are mapped onto each processor, and where the individual compute times vary considerably among tasks. Four load balancing strategies are discussed: most-dividing, all-redistribution, random-polling, and neighbor-redistribution. Simulation results with a stochastic budding yeast cell cycle model is consistent with the theoretical analysis. It is especially significant that there is a provable global decrease in load imbalance for the local rebalancing algorithms due to scalability concerns for the global rebalancing algorithms. The overall simulation time is reduced by up to 25%, and the total processor idle time by 85%

    GreenVis: Energy-Saving Color Schemes for Sequential Data Visualization on OLED Displays

    Full text link
    The organic light emitting diode (OLED) display has recently become popular in the consumer electronics market. Compared with current LCD display technology, OLED is an emerging display technology that emits light by the pixels themselves and doesn’t need an external back light as the illumination source. In this paper, we offer an approach to reduce power consumption on OLED displays for sequential data visualization. First, we create a multi-objective optimization approach to find the most energy-saving color scheme for given visual perception difference levels. Second, we apply the model in two situations: pre-designed color schemes and auto generated color schemes. Third, our experiment results show that the energy-saving sequential color scheme can reduce power consumption by 17.2% for pre-designed color schemes. For auto-generated color schemes, it can save 21.9% of energy in comparison to the reference color scheme for sequential data

    A Practical Method to Estimate Information Content in the Context of 4D-Var Data Assimilation. II: Application to Global Ozone Assimilation

    Full text link
    Data assimilation obtains improved estimates of the state of a physical system by combining imperfect model results with sparse and noisy observations of reality. Not all observations used in data assimilation are equally valuable. The ability to characterize the usefulness of different data points is important for analyzing the effectiveness of the assimilation system, for data pruning, and for the design of future sensor systems. In the companion paper (Sandu et al., 2012) we derive an ensemble-based computational procedure to estimate the information content of various observations in the context of 4D-Var. Here we apply this methodology to quantify the signal and degrees of freedom for signal information metrics of satellite observations used in a global chemical data assimilation problem with the GEOS-Chem chemical transport model. The assimilation of a subset of data points characterized by the highest information content yields an analysis comparable in quality with the one obtained using the entire data set

    A Practical Blended Analysis for Dynamic Features in JavaScript

    Full text link
    JavaScript is widely used in Web applications; however, its dynamism renders static analysis ineffective. Our JavaScript Blended Analysis Framework is designed to handle JavaScript dynamic features. It performs a flexible combined static/dynamic analysis. The blended analysis focuses static analysis on a dynamic calling structure collected at runtime in a lightweight manner, and refines the static analysis using dynamic information. The framework is instantiated for points-to analysis with stmt-level MOD analysis and tainted input analysis. Using JavaScript codes from actual webpages as benchmarks, we show that blended points-to analysis for JavaScript obtains good coverage (86.6% on average per website) of the pure static analysis solution and finds additional points-to pairs (7.0% on average per website) contributed by dynamically generated/loaded code. Blended tainted input analysis reports all 6 true positives reported by static analysis, but without false alarms, and finds three additional true positives

    Architecture-Aware Optimization on a 1600-core Graphics Processor

    Full text link
    The graphics processing unit (GPU) continues to make significant strides as an accelerator in commodity cluster computing for high-performance computing (HPC). For example, three of the top five fastest supercomputers in the world, as ranked by the TOP500, employ GPUs as accelerators. Despite this increasing interest in GPUs, however, optimizing the performance of a GPU-accelerated compute node requires deep technical knowledge of the underlying architecture. Although significant literature exists on how to optimize GPU performance on the more mature NVIDIA CUDA architecture, the converse is true for OpenCL on the AMD GPU. Consequently, we present and evaluate architecture-aware optimizations for the AMD GPU. The most prominent optimizations include (i) explicit use of registers, (ii) use of vector types, (iii) removal of branches, and (iv) use of image memory for global data. We demonstrate the efficacy of our AMD GPU optimizations by applying each optimization in isolation as well as in concert to a large-scale, molecular modeling application called GEM. Via these AMD-specific GPU optimizations, the AMD Radeon HD 5870 GPU delivers 65% better performance than with the wellknown NVIDIA-specific optimizations

    838

    full texts

    997

    metadata records
    Updated in last 30 days.
    Computer Science Technical Reports @Virginia Tech is based in United States
    Access Repository Dashboard
    Do you manage Open Research Online? Become a CORE Member to access insider analytics, issue reports and manage access to outputs from your repository in the CORE Repository Dashboard! 👇