
    Short-term prediction models for server management in Internet-based contexts

    Modern Internet applications run on top of complex system infrastructures where several runtime management algorithms have to guarantee high performance, scalability and availability. This paper aims to support runtime algorithms that must take decisions on the basis of historical and predicted load conditions of the internal system resources. We propose a new class of moving filtering techniques and of adaptive prediction models that are specifically designed to deal with runtime and short-term forecasts of time series which originate from monitors of system resources of Internet-based servers. A large set of experiments confirms that the proposed models improve the prediction accuracy with respect to existing algorithms and that they show stable results for different workload scenarios.
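As a rough illustration of what a moving filter combined with a short-term forecast can look like, the sketch below smooths a series of resource samples with an exponential moving average and extends the last smoothed trend by a few steps. This is a generic textbook technique, not the class of models proposed in the paper; the function names, the `alpha` smoothing factor and the `horizon` parameter are assumptions made for the example.

```python
def ema_filter(samples, alpha=0.3):
    """Smooth a noisy series with an exponential moving average.

    alpha (hypothetical default) weights the newest sample; lower values
    filter more aggressively.
    """
    if not samples:
        return []
    smoothed = [samples[0]]
    for x in samples[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def predict_next(samples, alpha=0.3, horizon=1):
    """Forecast `horizon` steps ahead by extending the last smoothed trend."""
    smoothed = ema_filter(samples, alpha)
    if not smoothed:
        return None
    if len(smoothed) < 2:
        return smoothed[-1]
    # Trend is the difference between the two most recent smoothed values.
    trend = smoothed[-1] - smoothed[-2]
    return smoothed[-1] + horizon * trend


# Example: CPU utilization samples (made-up values for illustration only).
cpu = [0.40, 0.55, 0.35, 0.60, 0.58, 0.70]
print(predict_next(cpu, alpha=0.3, horizon=1))
```

A smaller `alpha` trades responsiveness for stability, which is the kind of static-versus-adaptive parameter choice a runtime predictor has to make.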

    Data clustering based on correlation analysis applied to highly variable domains

    Clustering of traffic data based on correlation analysis is an important element of several network management objectives, including traffic shaping and quality of service control. Existing correlation-based clustering algorithms produce poor results when applied to the highly variable time series that characterize most network traffic data. This paper proposes a new similarity measure for computing clusters of highly variable data on the basis of their correlation. Experimental evaluations on several synthetic and real datasets show the accuracy and robustness of the proposed solution, which improves on existing clustering methods based on statistical correlations.
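For context, a baseline correlation-based clustering of the kind the abstract contrasts itself with can be sketched as follows: compute the Pearson correlation between series and greedily group those that exceed a threshold. This is not the similarity measure proposed in the paper; the function names, the greedy grouping rule and the `threshold` value are assumptions chosen for the illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def cluster_by_correlation(series, threshold=0.8):
    """Greedy grouping: a series joins the first cluster whose representative
    (its first member) correlates with it at or above `threshold`."""
    clusters = []
    for name, values in series.items():
        for cluster in clusters:
            if pearson(values, series[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Made-up traffic series for illustration only.
traffic = {
    "link_a": [1, 2, 3, 4],
    "link_b": [2, 4, 6, 8],
    "link_c": [4, 3, 2, 1],
}
print(cluster_by_correlation(traffic))
```

With noisy, highly variable series the raw Pearson coefficient drops even for related sources, which is the weakness the abstract attributes to existing methods.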

    Selective resource characterization for evaluation of system dynamics

    Management decisions to achieve peak performance operations, scalability and availability in distributed systems require a continuous statistical characterization of data sets coming from server and network monitors. Due to the increasing sizes of data centers and their continuous dynamic changes, the traditional approaches that work on all datasets in a centralized way are impractical. We propose a strategy for data processing that is able to limit the analysis of the large sets of collected measures to a smaller subset of significant information for a twofold purpose: to classify the collected data sets into a few classes characterized by similar statistical behaviors, and to evaluate the dynamics of the overall system and its most relevant changes. The proposed strategy works at the level of server resources and of significant aggregations of servers of the overall distributed system. Several experimental results demonstrate the feasibility of the proposed strategy, which is validated in real contexts.

    Architectures for scalable and flexible Web personalization services

    The complexity of services provided through the Web is continuously increasing, and issues introduced by both heterogeneous client devices and Web content personalization are becoming a major challenge for the Web. Tailoring Web and multimedia resources to meet the user and client requirements opens two main novel issues in the research area of content delivery. The content adaptation operations may be computationally expensive, requiring high efficiency and scalability in the Web architectures. Moreover, personalization services introduce security and consistency issues for user profile information management. In this paper, we propose a novel distributed architecture, with four variants, for the efficient delivery of personalized service where the nodes are organized in two levels. We discuss how the architectural choices are affected by security and consistency constraints as well as by the access to privileged information of the content provider. Moreover, we discuss performance trade-offs of the various choices.

    A hierarchical architecture for on-line control of private cloud-based systems

    Several enterprise data centers are adopting the private cloud computing paradigm as a scalable, cost-effective, robust way to provide services to their end users. The management and control of the underlying hw/sw infrastructure pose several interesting problems. In this paper we highlight that the monitoring process needs to scale to thousands of heterogeneous resources at different levels (system, network, storage, application) and at different time scales; it has to cope with missing data and detect anomalies in the performance samples; it has to transform all data into meaningful information and pass it to the decision process (possibly through different, ad-hoc algorithms for different resources). In most cases of interest for this paper, the control management system must operate under real-time constraints. We propose a hierarchical architecture that is able to support the efficient orchestration of an on-line management mechanism for a private cloud-based infrastructure. This architecture integrates a framework that collects samples from monitors, validates and aggregates them. We motivate the choice of a hierarchical scheme and show some data manipulation, orchestration and control strategies at different time scales. We then focus on a specific context referring to mid-term management objectives. We have applied the proposed hierarchical architecture successfully to data centers made of a large number of nodes that require short to mid-term control, and from our experience we conclude that it is a viable approach for the control of private cloud-based systems.

    Exploratory security analytics for anomaly detection

    The huge number of alerts generated by network-based defense systems prevents detailed manual inspection of security events. Existing proposals for automatic alert analysis work well in relatively stable and homogeneous environments, but in modern networks, which are characterized by extremely complex and dynamic behaviors, understanding which approaches can be effective requires exploratory data analysis and descriptive modeling. We propose a novel framework for automatically investigating temporal trends and patterns of security alerts with the goal of understanding whether and which anomaly detection approaches can be adopted for identifying relevant security events. Several examples referring to a real large network show that, despite the high intrinsic dynamism of the system, the proposed framework is able to extract relevant descriptive statistics that make it possible to determine the effectiveness of popular anomaly detection approaches on different alert groups.

    On the Selection of Models for Runtime Prediction of System Resources

    Applications and services delivered through large Internet Data Centers are now feasible thanks to network and server improvements, but also to virtualization, dynamic allocation of resources and dynamic migrations. The large number of servers and resources involved in these systems requires autonomic management strategies because no number of human administrators would be capable of cloning and migrating virtual machines in time, as well as re-distributing or re-mapping the underlying hardware. At the basis of most autonomic management decisions, there is the need for a system to evaluate its own global behavior and change it when the evaluation indicates that it is not accomplishing what it was intended to do or that some relevant anomalies are occurring. Decision algorithms have to satisfy constraints at different time scales. In this chapter we are interested in short-term contexts where runtime prediction models work on the basis of time series coming from samples of monitored system resources, such as disk, CPU and network utilization. In such environments, we have to address two main issues. First, original time series are affected by limited predictability because measurements are characterized by noise due to system instability, variable offered load, heavy-tailed distributions, and hardware and software interactions. Moreover, there are no existing criteria that can help us choose a suitable prediction model and related parameters with the purpose of guaranteeing an adequate prediction quality. In this chapter, we evaluate the impact that different choices of prediction models have on different time series, and we suggest how to treat input data and whether it is convenient to choose the parameters of a prediction model in a static or dynamic way. Our conclusions are supported by a large set of analyses on realistic and synthetic data traces.

    Separating internal and external fluctuation in distributed web-based services

    The observable behavior of a complex system reflects the mechanisms governing the internal interactions between the system’s components and the effect of external perturbations. We investigate the behavior of a distributed system providing Web-based services and the impact of external request arrivals on the internal system resources; the results of our study are of primary importance for taking several runtime decisions on load and resource management. Here we show that by capturing the simultaneous activities of several performance indexes of the Web-based system nodes we can separate the internal dynamics from the external fluctuations. For every internal performance index, we are able to determine the origin of fluctuations, finding that while all the considered performance indexes of the application server have robust internal dynamics, the CPU utilization and the network throughput of the Web and database servers are mainly driven by external demand.

    A quantitative methodology to identify relevant users in social networks

    Social networks are gaining increasing popularity on the Internet, with tens of millions of registered users and an amount of exchanged content accounting for a large fraction of Internet traffic. Due to this popularity, social networks are becoming a critical medium for business and marketing, as testified by viral advertisement campaigns based on such networks. To exploit the potential of social networks, it is necessary to classify the users in order to identify the most relevant ones. For example, in the context of marketing on social networks, it is necessary to identify which users should be involved in an advertisement campaign. However, the complexity of social networks, where each user is described by a large number of attributes, transforms the problem of identifying relevant users into a needle-in-a-haystack problem. Starting from a set of user attributes that may be redundant or may not provide significant information for our analysis, we need to extract a limited number of meaningful characteristics that can be used to identify relevant users. We propose a quantitative methodology based on Principal Component Analysis (PCA) to analyze attributes and extract characteristics of social network users from the initial attribute set. The proposed methodology can be applied to identify relevant users in social networks for different types of analysis. As an application, we present two case studies that show how the proposed methodology can be used to identify relevant users for marketing on the popular YouTube network. Specifically, we identify which users may play a key role in content dissemination and how users may be affected by different dissemination strategies.

    Supporting data center management through clustering of system data streams

    Aggregating large data sets related to hardware and software resources into clusters is at the basis of several operations and strategies for management and control. The high variability and noise that characterize data collected from system resource monitoring prevent the application of existing solutions, which are affected by low accuracy and poor robustness. We present a new algorithm which extends the clustering method to data center management because it is able to find groups of related objects even when correlation is hidden by high variability. Our experimental evaluation, performed on both synthetic and real data, shows the accuracy and robustness of the proposed solution, and its ability to cluster servers with correlated functionality.