1,720,970 research outputs found

    Enhancing generalization in Federated Learning with heterogeneous data: A comparative literature review

    Full text link
    Federated Learning (FL) is a collaborative training paradigm whereby a global Machine Learning (ML) model is trained using typically private and distributed data sources without disclosing the raw data. The approach paves the way for better privacy guarantees, improved overall system scalability, and sustainability. In this context, Federated Averaging (FedAvg) is a representative FL algorithm adopting a client–server protocol that operates in synchronous rounds, where selected learners contribute to the global model via local model updates, trained using their private data, while a server entity aggregates the local contributions, producing the new-generation global model as a weighted average of the local ones. However, when clients possess (highly) dissimilar data, the FedAvg technique becomes ineffective due to divergence in client models. Consequently, FedAvg-trained models struggle to generalize when presented with unseen data from the global distribution. In this research paper, we conduct a systematic review of state-of-the-art approaches proposed to counteract global model performance degradation in the presence of heterogeneous data. To this end, we compile an original taxonomy, highlighting the main algorithmic approaches and mechanisms behind each identified category. Advancing the current body of knowledge, we empirically evaluate the generalization performance on visual tasks of various methods under moderate and significant levels of data heterogeneity, as common practice within the surveyed literature. In addition, the paper benchmarks the performance of hybrid techniques, resulting as a combination of client- and server-side algorithmic tweaks, by shedding light on some associated performance tradeoffs. While recognizing other relevant issues in FL, such as device heterogeneity and energy consumption, which have a non-negligible impact on the learning process, these well-investigated topics are not the main focus of this article

    Decentralised Learning in Federated Deployment Environments

    Full text link
    Decentralised learning is attracting more and more interest because it embodies the principles of data minimisation and focused data collection, while favouring the transparency of purpose specification (i.e., the objective for which a model is built). Cloud-centric-only processing and deep learning are no longer strict necessities to train high-fidelity models; edge devices can actively participate in the decentralised learning process by exchanging meta-level information in place of raw data, thus paving the way for better privacy guarantees. In addition, these new possibilities can relieve the network backbone from unnecessary data transfer and allow it to meet strict low-latency requirements by leveraging on-device model inference. This survey provides a detailed and up-to-date overview of the most recent contributions available in the state-of-the-art decentralised learning literature. In particular, it originally provides the reader audience with a clear presentation of the peculiarities of federated settings, with a novel taxonomy of decentralised learning approaches, and with a detailed description of the most relevant and specific system-level contributions of the surveyed solutions for privacy, communication efficiency, non-IlDness, device heterogeneity, and poisoning defense

    FedUNRAN: On-device Federated Unlearning via Random Labels

    No full text
    In Federated Learning, a group of devices collaboratively learns a global machine learning model by periodically transferring locally computed model updates. This approach ensures the privacy of local data and encourages broader participation. However, as required by regulations such as the General Data Protection Regulation (GDPR), when a participating device decides to withdraw its contribution, FL systems must remove its data from the global model. In this paper, we introduce FedUNRAN, a novel and practical method to efficiently remove the contribution of an FL client from the global model without requiring retraining from scratch. Our method allows the requesting client to perform the unlearning procedure locally. FedUNRAN is lightweight, maintains the privacy-focused structure of FL, and enables effective client-side unlearning. We conduct an empirical evaluation of the method against the natural baseline, which involves simply detaching the unlearning client from training. Our results demonstrate that FedUNRAN is effective and efficient. Our code is available at: https://github.com/alessiomora/FedUNRA

    Federated Unlearning: A Survey on Methods, Design Guidelines, and Evaluation Metrics

    Full text link
    Federated learning (FL) enables collaborative training of a machine learning (ML) model across multiple parties, facilitating the preservation of users' and institutions' privacy by maintaining data stored locally. Instead of centralizing raw data, FL exchanges locally refined model parameters to build a global model incrementally. While FL is more compliant with emerging regulations such as the European General Data Protection Regulation (GDPR), ensuring the right to be forgotten in this context - allowing FL participants to remove their data contributions from the learned model - remains unclear. In addition, it is recognized that malicious clients may inject backdoors into the global model through updates, e.g., to generate mispredictions on specially crafted data examples. Consequently, there is the need for mechanisms that can guarantee individuals the possibility to remove their data and erase malicious contributions even after aggregation, without compromising the already acquired 'good' knowledge. This highlights the necessity for novel federated unlearning (FU) algorithms, which can efficiently remove specific clients' contributions without full model retraining. This article provides background concepts, empirical evidence, and practical guidelines to design/implement efficient FU schemes. This study includes a detailed analysis of the metrics for evaluating unlearning in FL and presents an in-depth literature review categorizing state-of-the-art FU contributions under a novel taxonomy. Finally, we outline the most relevant and still open technical challenges, by identifying the most promising research directions in the field

    Going Beyond Counting First Authors in Author Co-citation Analysis

    Full text link
    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed

    Communication-Efficient Heterogeneous Federated Dropout in Cross-device Settings

    No full text
    Federated Dropout has emerged as an elegant solution to conjugate communication-efficiency and computation-reduction on Federated Learning (FL) clients. We claim that Federated Dropout can also efficiently cope with device heterogeneity by exploiting a server that broadcasts custom and differently-sized sub-models, selected from a discrete set of possible sub-models, to match the computation capability constraints of FL clients. In addition, we further reduce the up-link communication cost by applying per-layer or traditional Sparse Ternary Compression (STC) to sub-model updates. We demonstrate the effectiveness of our solution by reporting results for a well-known CNN used for classification tasks considering the Federated EMNIST dataset

    Variations on the Author

    Full text link
    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer that rather than express opinions propels discourses; an interviewer that is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent form the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
    corecore