1,721,006 research outputs found

    Leveraging hybrid deep learning and generative modeling to accurately estimate the remaining useful life of mechanical systems

    With the advent of Industry 4.0 (I4.0), machine learning (ML) in artificial intelligence (AI), industrial Internet of Things (IIoT), and cyber-physical systems (CPS), the development of data-driven applications like predictive maintenance (PdM) has accelerated. In asset-dependent industries, PdM has reduced operational costs, increased productivity, reduced downtime, and improved safety management. Predictive maintenance solutions also help identify failure sources, eliminating unnecessary maintenance. The concept of prognostics and health management (PHM) is a predictive maintenance approach that has gained recognition as an essential paradigm in smart manufacturing. Its purpose is to provide reliable methods for monitoring the health condition of industrial equipment. To achieve this, efficient and effective methods for monitoring the health of systems are necessary. These methods involve processing and analysing large amounts of equipment data to identify anomalies and provide diagnosis and prognosis. Prognostics is a crucial procedure in the field of PHM that involves predicting future conditions. It primarily focuses on projecting the remaining lifespan of a machine, which is the duration it can continue to work as intended. This estimation is commonly referred to as the remaining useful life (RUL) of the system. The field of prognostic research is still in its early stages, which accounts for the numerous issues that need to be addressed. Despite their potential to reduce costs and increase productivity, prognostics and health management face significant challenges. These include incomplete learning from models, accuracy issues with evolving systems, and the impracticality of human tagging due to the large data volumes generated, particularly in the Industrial Internet of Things (IIoT). A critical aspect of prognostics and health management is the estimation of the remaining useful life (RUL) of equipment. 
This estimation is complex and requires balancing computational demands with the capabilities of predictive systems. The estimation of remaining useful life (RUL) is a fundamental challenge that must be addressed to successfully adopt predictive maintenance, especially given the early stages of research in the field of prognostics. This thesis focuses on RUL estimation for monitoring equipment using deep learning (DL) methods. It presents the development of hybrid deep learning models that integrate generative modelling techniques to enhance predictive maintenance strategies, specifically in the fields of aero-engine prognostics and RUL estimation. It outlines three significant contributions that leverage the capabilities of these hybrid models, supported by a comparative analysis using the benchmark Commercial Modular Aero Propulsion System Simulation (C-MAPSS) turbofan engine datasets to validate the advancements in prediction accuracy. Our research is motivated by two goals: advancing computational efficiency through streamlined prediction models that not only reduce the complexity of the network but also augment its processing capabilities, and providing a robust hybrid deep learning framework, focused on the aerospace sector, whose models offer more precise predictions. The first contribution proposes two hybrid deep learning architectures that leverage the multi-modal and hybrid proficiencies intrinsic to deep neural networks. The objective is to encapsulate critical data and maintain comprehensive information at varying intervals, thereby enhancing the accuracy of RUL estimations. The second contribution introduces another data-driven framework that combines temporal convolution, recurrent skip components, and an attention mechanism to improve the accuracy of RUL estimation. 
The recurrent skip component finds long-term patterns in time series data, while temporal convolution extracts high-level features from longer sequences. The attention layer finds hidden representations and degradation development interactions between features at each window position in the input matrix, so that the model focuses on the most important information for RUL estimation. The third contribution addresses the formidable challenges associated with limited data availability, elevated feature dimensionality, and complex feature interrelationships. To circumvent these difficulties, we advocate the application of generative adversarial networks (GANs), which can generate synthetic sensor signals. These artificially created signals can effectively supplement the existing data, thus easing the constraints brought about by data scarcity. The data augmentation enlarges the feature space and helps to capture the original data distribution, providing a new perspective for creating predictive systems for estimating the RUL.
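The abstract refers to window positions in an input matrix and to RUL as the remaining operating duration. A minimal sketch of how such windows and labels might be built for one run-to-failure trajectory is given here. The function name `make_windows`, the linear label (cycles left until the last recorded cycle), and the toy sensor values are assumptions for illustration, not the thesis's preprocessing.

```python
def make_windows(sensor_rows, window):
    """Return (window_matrix, rul) pairs for one run-to-failure trajectory.

    sensor_rows: list of per-cycle sensor readings (one list per cycle).
    window:      number of consecutive cycles in each input matrix.
    Assumption: the unit fails right after the last recorded cycle, so the
    RUL of a window is the number of cycles that follow its last row.
    """
    total = len(sensor_rows)
    samples = []
    for end in range(window, total + 1):
        matrix = sensor_rows[end - window:end]  # window x n_sensors slice
        rul = total - end                       # cycles remaining after this window
        samples.append((matrix, rul))
    return samples


# Toy trajectory: 5 cycles, 2 hypothetical sensors.
rows = [[i, 2 * i] for i in range(5)]
samples = make_windows(rows, window=3)
```

Each pair could then be fed to a sequence model; the hybrid architectures, attention layer, and GAN-based augmentation described above are not reproduced here.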

    On-device learning, optimization, efficient deployment and execution of machine learning algorithms on resource-constrained IoT hardware

    Edge analytics refers to the application of data analytics and Machine Learning (ML) algorithms on IoT devices. The concept of edge analytics is gaining popularity due to its ability to perform AI-based analytics at the device level, enabling autonomous decision-making without depending on the cloud. However, the majority of Internet of Things (IoT) devices are embedded systems (hardware) with a low-cost microcontroller unit (MCU) or a small CPU as their brain, which are often incapable of handling complex ML algorithms. This thesis aims to improve the intelligence of such resource-constrained IoT devices by providing novel algorithms, frameworks, and strategies to: create self-learning ML-based IoT devices; efficiently deploy and execute a range of Neural Networks (NNs) and non-NN ML algorithms on IoT devices; and enable communication-efficient distributed ML using IoT devices. The memory (SRAM, Flash, and EEPROM) of MCU-based devices is often very limited, restricting onboard ML model training for large training sets with high feature dimensions. To cope with memory issues, current edge analytics approaches train high-quality ML models on cloud GPUs using large volumes of historical data, then deploy a deeply optimized version of the resultant models on edge devices for inference. Such approaches are inefficient in concept drift situations, where the data generated at the device level vary frequently and trained models do not know how to behave when previously unseen data arrives. The First Contribution of this thesis aims to solve this challenge. We provide the Train++ algorithm and the ML-MCU framework, which train ML models locally at the device level (on MCUs and small CPUs) using the full n-samples of high-dimensional data. 
Train++ and ML-MCU transform even the most resource-constrained MCU-based IoT edge devices into intelligent devices that can locally build their own knowledge base on-the-fly using live data, thus creating smart, self-learning, and autonomous problem-solving devices. As a part of the first contribution, to perform online machine learning (OL) in non-ideal real-world settings, we designed Imbal-OL, an OL plugin that understands the supplied data stream and balances the class size before sending it for learning using Train++, ML-MCU, or other methods. The hardware resources of IoT devices are orders of magnitude smaller than the resources required for the standalone execution of a large, high-quality NN. Currently, to alleviate various critical issues caused by the poor hardware specifications of IoT devices, NNs are optimized before deployment using methods such as pruning, quantization, sparsification, and model architecture tuning. Even after applying state-of-the-art optimization methods, there are numerous cases where the models after deep compression/optimization still exceed a device’s memory capacity by a margin of just a few bytes, and users cannot optimize further since the model is already compressed to its maximum. The Second Contribution of this thesis aims to solve this challenge. We propose an approach for the efficient execution of already deeply compressed, large NNs on tiny IoT devices. After optimizing NNs using state-of-the-art deep model compression methods, when the resultant models are executed by MCUs or small CPUs using the model execution sequence produced by our approach, higher levels of conserved SRAM can be achieved. As a part of the second contribution, we provide an SRAM-optimized ML classifier (non-NN) porting, stitching, and efficient deployment approach. The proposed method enables large classifiers to be comfortably executed on MCU-based IoT devices and perform ultra-fast classifications while consuming 0 bytes of SRAM. 
Training a problem-solving ML model using large datasets is computationally expensive and requires a scalable distributed training platform to complete training within a reasonable time frame. In this scenario, communicating model updates among workers has always been a bottleneck. The impact on the quality of resultant models is greater when distributed training runs on devices with low hardware specifications and in uncertain real-world IoT networks where congestion, latency, and bandwidth issues are common. The Third Contribution of this thesis aims to solve this challenge. We provide Globe2Train (G2T), a framework with two components named G2T-Cloud (G2T-C) and G2T-Device (G2T-D) that can efficiently connect multiple IoT devices and collectively train to produce the target ML models at very high speeds. The G2T framework components jointly eliminate staleness and improve training scalability and speed by tolerating real-world network uncertainties and by reducing the communication-to-computation ratio. As a part of the third contribution, we provide ElastiQuant, an elastic quantization strategy that aims to further reduce the impact caused by limitations in distributed IoT training scenarios.
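To illustrate why quantizing model updates reduces communication cost, a minimal sketch of generic uniform quantization is given here. This is not ElastiQuant, whose details the abstract does not give. The function names, the 8-bit default, and the sample update are assumptions.

```python
def quantize(update, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] (generic uniform quantization)."""
    lo, hi = min(update), max(update)
    levels = (1 << bits) - 1
    # Step size between adjacent codes; fall back to 1.0 for a constant update.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in update]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical model update sent from a device to the aggregator.
update = [-0.5, 0.0, 0.25, 0.5]
codes, lo, scale = quantize(update)
restored = dequantize(codes, lo, scale)
```

Each value is sent as a small integer plus two shared floats (`lo`, `scale`), at the cost of a rounding error of at most half a step per value.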

    Enabling smart societies using social internet of things: A semantic framework for real-time virtual object management

    A wide variety of unique smart services and applications have evolved based on the Internet of Things (IoT) for resolving numerous issues in daily life, social community, agriculture, the health sector, the entertainment sector, city administration, environment, weather, road traffic, and many more. As a result, sensors, devices, and humans are required to associate with each other to form a new service. To get the best out of IoT objects, a platform is required to integrate different types of objects so that they can associate with each other and form relationships as in human society. In recent years, human social interaction has gained a new dimension through profile-based online social networks. The popularity of social networks and the advent of the IoT have led to a new research paradigm called Social IoT (SIoT), where real-world objects can participate in an online social network like the human social network. This effort opens up immense possibilities for unique applications for a smart cognitive society. However, it is still a challenge to explore these applications due to the lack of an adequate SIoT framework in which SIoT nodes can be managed and monitored in real time under a cognitive framework. As these IoT nodes and services are going to co-exist with us (humans), we foresee the establishment of cognitive contributory skills where IoT nodes, services, and even human skills can collectively form a cognitive society to share resources, information, and skills. Hence, in this thesis, we propose a framework to manage and monitor the SIoT nodes intelligently and cognitively in real time. In our proposed framework, we enable the virtual representation of real-world objects, known as Virtual Objects (VOs), and ensure their relationships semantically in order to compose new services by combining VOs into what we call Composite VOs (CVOs). Additionally, we identify special skills (e.g., human expertise and skill) as Abstract Objects (AOs). 
However, building such intelligent societies automatically using SIoT is a big challenge, mainly due to the complexity of the systems and the large number of nodes available. In such scenarios, it is not trivial to find a suitable SIoT service node to obtain a service in real time. We also propose a virtual object management and selection process for the SIoT platform, and QoS-aware object selection using Integer Programming (IP) and Multiple Criteria Decision Making (MCDM) to find the right service at the right time.
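As an illustration of QoS-aware selection with MCDM, here is a minimal sketch using simple additive weighting, one common MCDM method. The abstract does not say which method the thesis uses. The candidate names, criteria, and weights are hypothetical.

```python
def rank_candidates(candidates, weights, cost_criteria=()):
    """Rank candidates by a weighted sum of min-max normalised criteria.

    candidates:    {name: {criterion: value}}
    weights:       {criterion: weight}
    cost_criteria: criteria where smaller values are better (e.g. latency).
    """
    scores = {name: 0.0 for name in candidates}
    for criterion, weight in weights.items():
        values = [attrs[criterion] for attrs in candidates.values()]
        low, high = min(values), max(values)
        span = high - low
        for name, attrs in candidates.items():
            # Normalise to [0, 1]; treat a constant criterion as fully satisfied.
            norm = 1.0 if span == 0 else (attrs[criterion] - low) / span
            if criterion in cost_criteria:
                norm = 1.0 - norm  # invert so that lower cost scores higher
            scores[name] += weight * norm
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical virtual objects offering the same service.
candidates = {
    "vo_a": {"latency_ms": 20, "reliability": 0.99},
    "vo_b": {"latency_ms": 5, "reliability": 0.90},
    "vo_c": {"latency_ms": 50, "reliability": 0.95},
}
ranking = rank_candidates(
    candidates,
    weights={"latency_ms": 0.5, "reliability": 0.5},
    cost_criteria={"latency_ms"},
)
```

With equal weights, `vo_a` ranks first because it combines the highest reliability with moderate latency. An Integer Programming formulation would add hard constraints, which this sketch omits.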

    Expressive RDF stream reasoning via data parallelism in answer set programming

    The Web nowadays is highly dynamic, with massive amounts of data being continuously generated from a huge number of devices and services across the Internet. Various application scenarios in several domains, such as environment monitoring, health care systems, and smart transportation, can hugely benefit from the ability to efficiently integrate and query data streams from these sources to provide better services. However, in such applications, it is not only capturing data streams that is important, but also the ability to extract insights from such streams and use them to target users' needs, preferences, and constraints. For this reason, different types of complex reasoning tasks need to be efficiently designed and executed on such streams to capture the sophisticated requirements of users. Stream Reasoning is an emerging research area which focuses on providing continuous complex reasoning capabilities over data streams. However, Stream Reasoning faces many challenges, not only due to the heterogeneity of the streams but also due to the exponential growth in the availability of streaming data on the Web, which severely limits the complexity of reasoning that can be used to extract actionable knowledge in a scalable and reliable way. The key challenge addressed in this thesis is to enable expressive reasoning over massive, distributed, heterogeneous data streams in a scalable way. I address this problem by integrating Semantic Web for semantic integration, Answer Set Programming (ASP) for expressive reasoning, and Data Stream Management Systems for stream processing. The trade-off between scalability and expressivity in Stream Reasoning is considered, and parallel reasoning techniques are proposed to enhance scalability while maintaining some of the key reasoning capabilities that are more expressive but also computationally more expensive. 
The thesis addresses two research questions related to how the expressivity and scalability of a reasoner can be improved when reasoning on Semantic Web data streams. For the first research question, which targets expressivity, I propose C-ASP, a language extended from the ASP language with RDF streaming operators, which allows users to express complex requirements in terms of preferences and constraints as a continuous reasoning request. The C-ASP reasoner is implemented to continuously evaluate such reasoning requests when new data arrives. The experimental evaluation shows that the C-ASP engine outperforms the state-of-the-art RDF stream processing engine C-SPARQL. For the second research question, which focuses on scalability, I optimize the reasoning process of the C-ASP reasoner with a parallel approach based on data-level parallelism, and I demonstrate how the correctness of the results can be maintained. To do so, a clear characterization and formal definitions for analyzing the dependencies among input data streams are provided. Algorithms are developed to create a partitioning plan that guides the parallel reasoning process in splitting data streams on-the-fly. Experiments show that applying this data-level parallelism improves the reasoning process significantly. The research discussed in this thesis has been deployed in two real-world scenarios: Smart Cities, where event-driven contextual knowledge extraction is introduced, and Smart Enterprise, where an Internet of Things-enabled meeting management system is developed. The former aims at continuously identifying and filtering critical events that might affect the decision making of users, while the latter investigates how to enhance users' experience in online meetings on-the-go by using mobile sensors embedded in a communication platform. By addressing the requirements of such scenarios, the prototypes demonstrate the validity and feasibility of the approach proposed in this thesis.
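To make the idea of a dependency-driven partitioning plan concrete, here is a minimal sketch that groups input predicates which are connected through some rule, so that each group could be reasoned over independently. This is a conservative simplification for illustration, not the thesis's algorithm. The rule representation (a head with a list of body predicates) and all predicate names are assumptions.

```python
def partition_plan(rules, input_predicates):
    """Group input predicates that are linked, directly or transitively, by rules.

    rules:            list of (head, [body predicates]) pairs.
    input_predicates: predicates that arrive on the input streams.
    """
    parent = {}

    def find(pred):
        parent.setdefault(pred, pred)
        while parent[pred] != pred:
            parent[pred] = parent[parent[pred]]  # path halving
            pred = parent[pred]
        return pred

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every predicate in a rule body depends on the same head, so link them.
    for head, body in rules:
        for pred in body:
            union(head, pred)

    groups = {}
    for pred in input_predicates:
        groups.setdefault(find(pred), []).append(pred)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical smart-city rules.
rules = [
    ("congestion", ["speed", "vehicle_count"]),
    ("alert", ["congestion", "weather"]),
    ("noisy", ["sound_level"]),
]
plan = partition_plan(rules, ["speed", "vehicle_count", "weather", "sound_level"])
```

In this example `sound_level` ends up in its own group and could be processed in parallel with the traffic-related streams. Splitting data within a single stream, as the abstract describes, needs the finer dependency analysis of the thesis.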

    Distributed heterogeneous web data sources integration DeXIN approach

    In modern business enterprises, an integrated application is frequently developed to provide uniform access to multiple existing information systems running inside or outside the enterprise. Data integration is a pervasive challenge in these applications, which need to query across multiple autonomous and heterogeneous data sources. Integrating such diverse information systems becomes a challenging task, particularly when different applications use different data formats and query languages which are not compatible with each other. With the growing popularity of Web 2.0 technologies and the availability of huge amounts of data on the web, the requirements for data integration have changed compared with traditional database integration approaches. The large scale of web data sources has not only led to high levels of distribution, heterogeneity, and different data formats and query languages; the data is also associated with data concerns like privacy, licensing, pricing, quality of data, etc. Hence, data integration tools not only have to provide an optimal way to mitigate the heterogeneity in data formats and query languages, but should also preserve the various data concerns when data is published and utilized. Moreover, data service selection and data selection should be based on these data concerns. The goal of this thesis is to provide better means to easily and dynamically integrate distributed heterogeneous web data sources (particularly XML and RDF data sources) in such a way that the user can easily build data integration applications while assuring all the data concerns associated with the data. The main topic of this work is distributed heterogeneous data integration for web data sources. 
In order to deal with the challenge of XML and RDF data integration, we propose "DeXIN (Distributed extended XQuery for heterogeneous data INtegration)", an extensible framework for distributed query processing over heterogeneous, distributed, and autonomous data sources. DeXIN considers one data format as the basis (the so-called "aggregation model") and extends the corresponding query language to execute queries over heterogeneous data sources in their respective query languages. We present an extension of XQuery which covers the full SPARQL language and supports the decentralized execution of both XQuery and SPARQL in a single query. To assure the data concerns associated with data published on the web, we introduce a "Data Concerns Aware Querying System". This system incorporates several data concerns into a query language, thus enabling data service integration systems to handle the data concerns associated with the data services. It extends the XQuery language to make it concerns aware by introducing special keywords for expressing data concerns within the query. 
In the last part of this thesis, we design a mashup tool on top of DeXIN. We propose a query-based aggregation of multiple heterogeneous data sources by combining the powerful querying features of XQuery and SPARQL with the easy interface of a mashup tool for data sources in XML and RDF. Our mashup editor allows for the automatic generation of mashups with an easy-to-use visual interface. For the dynamic integration of heterogeneous web data sources, we utilize the concept of data mashups, using the extension of XQuery proposed in DeXIN.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
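The two counting schemes compared above can be sketched as follows, as an illustration only. The data layout (each citing paper as a list of references, each reference as an ordered author list), the function name, and the author names are assumptions. Counting each author pair at most once per citing paper is one common convention, and the study's exact definition may differ.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often pairs of authors are cited together.

    citing_papers: list of reference lists; each reference is an ordered
                   list of author names.
    max_authors:   number of leading authors of each cited work to count
                   (1 = traditional first-author counting, 5 = first five).
    """
    counts = Counter()
    for references in citing_papers:
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


# Two hypothetical citing papers.
papers = [
    [["Smith", "Lee"], ["Jones", "Kim", "Park"]],
    [["Smith"], ["Kim", "Jones"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

Under first-author counting, Smith and Jones are co-cited only by the first paper, because Jones is a second author in the second paper. Counting the first five authors captures both co-citations and brings many more author pairs into the analysis.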

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.