Performance models for the evaluation and management of modern computer systems and applications
The advent of the cloud computing paradigm has led Internet service providers to set up large Internet systems distributed all over the world in order to efficiently serve the needs of their customers. Large Internet systems are characterized by a huge number of hardware resources and software components whose goal is to make computing services readily available to users on demand, like any other utility service available in today's society.
Efficient management of large Internet systems requires several strategies that decide on request dispatching, load balancing, admission control, and request redirection without direct intervention of human administrators. Most autonomic management decisions rely on performance models that support system management by taking real-time decisions based on information about the state of internal system components and resources. Performance models supporting large infrastructures must be able to operate at different time scales and should support prompt reconfigurations motivated by continuous dynamic changes in system, client, and business policies. They should scale to increasing numbers of hardware and software components, adapt to the characteristics of heterogeneous data streams, and guarantee reliable results under changing system conditions and requirements.
This thesis presents a set of novel performance models proposed for online management of large amounts of variable and heterogeneous data streams coming from several system monitors and networks. In particular, this thesis extends the state of the art in performance modeling in several directions: (i) it presents a novel approach for improving scalability when managing large amounts of data; (ii) it introduces new adaptive models for the online detection of anomalies and relevant state changes in highly variable contexts, and for the identification of correlations and groups of related objects even when correlation is hidden by high variability; (iii) it uses predictive analytics for improving performance in the selection of cloud availability zones on the basis of user preferences.
Extensive evaluations of the proposed approaches in real systems demonstrate improvements in performance modeling with respect to state-of-the-art solutions, and show that the approaches satisfy the scalability, adaptivity, and reliability requirements that are mandatory for the management of large systems.
Data clustering based on correlation analysis applied to highly variable domains
Clustering of traffic data based on correlation analysis is an important element of several network management objectives, including traffic shaping and quality of service control. Existing correlation-based clustering algorithms yield poor results when applied to the highly variable time series that characterize most network traffic data. This paper proposes a new similarity measure for computing clusters of highly variable data on the basis of their correlation. Experimental evaluations on several synthetic and real datasets show the accuracy and robustness of the proposed solution, which improves on existing clustering methods based on statistical correlations.
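The abstract does not define the proposed similarity measure, so the following is only a minimal sketch of the baseline it improves on: grouping time series whose pairwise Pearson correlation reaches a threshold. The function names, the single-linkage grouping, and the 0.8 threshold are illustrative assumptions, not the authors' method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / (dev_x * dev_y)


def correlation_clusters(series, threshold=0.8):
    """Group named series whose correlation reaches `threshold`.

    Uses single linkage via union-find: two series end up in the same
    cluster if a chain of sufficiently correlated pairs connects them.
    """
    names = list(series)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(series[a], series[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())
```

On highly variable series, noise lowers the raw Pearson coefficient even between related flows, which is the weakness the abstract attributes to existing correlation-based methods.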
Selective resource characterization for evaluation of system dynamics
Management decisions to achieve peak performance operations, scalability and availability in distributed systems require a continuous statistical characterization of data sets coming from server and network monitors. Due to the increasing sizes of data centers and their continuous dynamic changes, the traditional approaches that work on all datasets in a centralized way are impractical. We propose a strategy for data processing that is able to limit the analysis of the large sets of collected measures to a smaller subset of significant information for a twofold purpose: to classify the collected data sets in few classes characterized by similar statistical behaviors, and to evaluate the dynamics of the overall system and its most relevant changes. The proposed strategy works at the level of server resources and of significant aggregation of servers of the overall distributed system. Several experimental results demonstrate the feasibility of the proposed strategy that is validated in real contexts.
A hierarchical architecture for on-line control of private cloud-based systems
Several enterprise data centers are adopting the private cloud computing paradigm as a scalable, cost-effective, robust way to provide services to their end users. The management and control of the underlying hardware and software infrastructure pose several interesting problems. In this paper we show that the monitoring process needs to scale to thousands of heterogeneous resources at different levels (system, network, storage, application) and at different time scales; it has to cope with missing data and detect anomalies in the performance samples; it has to transform all data into meaningful information and pass it to the decision process (possibly through different, ad-hoc algorithms for different resources). In most cases of interest for this paper, the control management system must operate under real-time constraints. We propose a hierarchical architecture that is able to support the efficient orchestration of an on-line management mechanism for a private cloud-based infrastructure. This architecture integrates a framework that collects samples from monitors, validates and aggregates them. We motivate the choice of a hierarchical scheme and show some data manipulation, orchestration and control strategies at different time scales. We then focus on a specific context referring to mid-term management objectives. We have successfully applied the proposed hierarchical architecture to data centers made of a large number of nodes that require short- to mid-term control, and our experience indicates that it is a viable approach for the control of private cloud-based systems.
Monitoring large cloud-based systems
Large-scale cloud-based services are built upon a multitude of hardware and software resources, disseminated in one or multiple data centers. Controlling and managing these resources requires the integration of several pieces of software that may yield a representative view of the data center status. Today's monitoring solutions, both closed and open source, fail in different ways, including a lack of scalability, poor representativeness of global state conditions, an inability to guarantee persistence in service delivery, and the impossibility of monitoring multi-tenant applications. In this paper, we present a novel monitoring architecture that addresses the aforementioned issues. It integrates a hierarchical scheme to monitor the resources in a cluster with a distributed hash table (DHT) to broadcast system state information among different monitors. This architecture strives to obtain high scalability, effectiveness and resilience, as well as the possibility of monitoring services spanning across different clusters or even different data centers of the cloud provider. We evaluate the scalability of the proposed architecture through a bottleneck analysis achieved by experimental results.
Adaptive, scalable and reliable monitoring of big data on clouds
Real-time monitoring of cloud resources is crucial for a variety of tasks such as performance analysis, workload management, capacity planning and fault detection. Applications producing big data make the monitoring task very difficult at high sampling frequencies because of high computational and communication overheads in collecting, storing, and managing information. We present an adaptive algorithm for monitoring big data applications that adapts the intervals of sampling and frequency of updates to data characteristics and administrator needs. Adaptivity allows us to limit computational and communication costs and to guarantee high reliability in capturing relevant load changes. Experimental evaluations performed on a large testbed show the ability of the proposed adaptive algorithm to reduce resource utilization and communication overhead of big data monitoring without penalizing the quality of data, and demonstrate our improvements to the state of the art.
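The abstract does not give the adaptation rule, so the following is only a sketch of the general idea of adapting the sampling interval to data characteristics: sample more often when the monitored value changes significantly, and less often when it is stable. The function name, the halve/double policy, and all thresholds and bounds are hypothetical.

```python
def next_interval(current, previous_value, new_value,
                  change_threshold=0.1, min_interval=1.0, max_interval=60.0):
    """Return the next sampling interval in seconds.

    Assumed policy (not taken from the paper): halve the interval when the
    relative change between consecutive samples exceeds `change_threshold`,
    double it otherwise, and clamp the result to [min_interval, max_interval].
    """
    # Guard against division by zero when the previous sample is 0.
    baseline = max(abs(previous_value), 1e-9)
    relative_change = abs(new_value - previous_value) / baseline
    if relative_change > change_threshold:
        proposed = current / 2   # load is moving: sample more often
    else:
        proposed = current * 2   # load is stable: save overhead
    return min(max(proposed, min_interval), max_interval)
```

In this sketch the administrator's needs would be expressed through the bounds and the threshold: a lower `change_threshold` trades overhead for sensitivity to load changes.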
Self-adaptive techniques for the load trend evaluation of internal system resources
Modern distributed systems that have to avoid performance degradation and system overload require several runtime management decisions for load balancing and load sharing, overload and admission control, job dispatching and request redirection. As the external workload and the internal resource behavior of the modern system are highly complex and variable, self-adaptive techniques require a stable vision of the system behavior. In this paper we propose a trend model that guarantees a robust interpretation for load-aware decision algorithms. Various experimental results in a Web cluster demonstrate that the proposed models and algorithms guarantee better stability of the load and a reduction of the response time experienced by the users.
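The trend model itself is not described in the abstract. As a rough illustration of why a smoothed view of load is more stable than raw samples, the sketch below applies an exponentially weighted moving average (EWMA) and reads the trend from the last two smoothed values. EWMA is a stand-in technique chosen for illustration; the names `ewma` and `trend`, and the `alpha` and `tolerance` parameters, are assumptions.

```python
def ewma(samples, alpha=0.3):
    """Exponentially weighted moving average of a sequence of load samples."""
    smoothed = []
    for value in samples:
        if not smoothed:
            smoothed.append(float(value))  # seed with the first sample
        else:
            smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def trend(samples, alpha=0.3, tolerance=0.01):
    """Classify the load trend as 'rising', 'falling' or 'steady'.

    The trend is taken from the difference between the last two smoothed
    values, so a single noisy sample moves it less than it would move a
    decision based on raw samples.
    """
    smoothed = ewma(samples, alpha)
    if len(smoothed) < 2:
        return "steady"
    delta = smoothed[-1] - smoothed[-2]
    if delta > tolerance:
        return "rising"
    if delta < -tolerance:
        return "falling"
    return "steady"
```

A load-aware dispatcher could, for example, stop sending new requests to a server whose trend is rising even if its latest raw sample is still below the overload threshold.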
Detecting behavioral variations in system resources of large data centers
The identification of significant changes in system resource behaviors is mandatory for an efficient management of data centers. As the dimension of modern data centers increases, the evaluation of state change detections through traditional algorithms becomes computationally intractable. We propose a novel approach that characterizes the statistical properties of the resource measures coming from system monitors, classifies them, and signals a change only when there is a modification of the resource classification. This method diminishes the computational complexity and reaches the same detection accuracy as traditional approaches, as demonstrated by several results obtained in real enterprise data centers.
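The idea of signaling a change only when a resource's classification changes can be sketched as follows. The abstract does not say which statistical properties or classes are used, so the two features here (mean level and coefficient of variation), the thresholds, and the function names are all hypothetical.

```python
from statistics import mean, pstdev


def classify(window, load_threshold=0.7, variability_threshold=0.3):
    """Assign a coarse class to a window of utilization samples in [0, 1].

    Assumed features: the mean level and the coefficient of variation.
    """
    average = mean(window)
    variation = pstdev(window) / average if average > 0 else 0.0
    level = "high" if average >= load_threshold else "low"
    variability = "variable" if variation >= variability_threshold else "stable"
    return (level, variability)


def detect_changes(samples, window_size=4):
    """Report (start_index, old_class, new_class) whenever the class changes.

    Samples are split into consecutive non-overlapping windows; a change is
    signaled only when a window's class differs from the previous one, so
    fluctuations that stay within a class produce no signal.
    """
    changes = []
    previous = None
    for start in range(0, len(samples) - window_size + 1, window_size):
        label = classify(samples[start:start + window_size])
        if previous is not None and label != previous:
            changes.append((start, previous, label))
        previous = label
    return changes
```

Because each window is reduced to a small label, comparing labels is cheaper than running a per-sample detector on every resource, which is the kind of saving the abstract describes.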
A software architecture for the analysis of large sets of data streams in cloud infrastructures
System management algorithms in private and public cloud infrastructures have to work with literally thousands of data streams generated from resource, application and event monitors. This cloud context opens two novel issues that we address in this paper: how to design a software architecture that is able to gather and analyze all information within real-time constraints; how it is possible to reduce the analysis of the huge collected data set to the investigation of a reduced set of relevant information. The application of the proposed architecture is based on the most advanced software components, and is oriented to the classification of the statistical behavior of servers and to the analysis of significant state changes. These results guide model-driven management systems to investigate only relevant servers and to apply suitable decision models considering the deter
- …
