
    LambdaRank Gradients are Incoherent

    In Information Retrieval (IR), the Learning-to-Rank (LTR) task requires building a ranking model that optimises a specific IR metric. One of the most effective approaches to do so is the well-known LambdaRank algorithm. LambdaRank uses gradient descent optimisation, and at its core, it defines approximate gradients, the so-called lambdas, for a non-differentiable IR metric. Intuitively, each lambda describes how much a document's score should be pushed up/down to reduce the ranking error. In this work, we show that lambdas may be incoherent w.r.t. the metric being optimised: e.g., a document with high relevance in the ground truth may receive a smaller gradient push than a document with lower relevance. This behaviour goes far beyond the expected degree of approximation. We analyse such behaviour of LambdaRank gradients and we introduce some strategies to reduce their incoherencies. We demonstrate through extensive experiments, conducted using publicly available datasets, that the proposed approach reduces the frequency of the incoherencies in LambdaRank and derivatives, and leads to models that achieve statistically significant improvements in the NDCG metric, without compromising the training efficiency
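
    The lambdas described above can be illustrated with a minimal sketch of the commonly cited LambdaRank formulation with NDCG as the target metric. It is not the authors' code: the function names `dcg_gain` and `lambdarank_pushes`, the `sigma` parameter and the sign convention (a positive value means "push the score up") are assumptions made for illustration.

```python
import math


def dcg_gain(rel, rank):
    """Gain of a document with relevance `rel` at 0-based position `rank`."""
    return (2 ** rel - 1) / math.log2(rank + 2)


def lambdarank_pushes(scores, labels, sigma=1.0):
    """Per-document push for one query (positive = score should go up).

    Assumed formulation: for every pair (i, j) with labels[i] > labels[j],
    the pairwise force is sigma / (1 + exp(sigma * (s_i - s_j))) scaled by
    the absolute NDCG change obtained by swapping the two documents.
    """
    n = len(scores)
    # Current ranking induced by the model scores (highest score first).
    order = sorted(range(n), key=lambda d: -scores[d])
    rank = {doc: pos for pos, doc in enumerate(order)}
    # Ideal DCG, used to normalise the swap deltas.
    ideal = sum(dcg_gain(r, p)
                for p, r in enumerate(sorted(labels, reverse=True)))
    pushes = [0.0] * n
    if ideal == 0:
        return pushes
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue
            gain_diff = 2 ** labels[i] - 2 ** labels[j]
            discount_diff = (1 / math.log2(rank[i] + 2)
                             - 1 / math.log2(rank[j] + 2))
            delta_ndcg = abs(gain_diff * discount_diff) / ideal
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            force = sigma * rho * delta_ndcg
            pushes[i] += force   # the more relevant document is pushed up
            pushes[j] -= force   # the less relevant document is pushed down
    return pushes
```

    Comparing the entries of `lambdarank_pushes(scores, labels)` against `labels` for a given query is one way to check whether a more relevant document receives a smaller push than a less relevant one, which is the kind of incoherence the abstract refers to.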

    Filtering out Outliers in Learning to Rank

    Outlier data points are known to affect negatively the learning process of regression or classification models, yet their impact in the learning-to-rank scenario has not been thoroughly investigated so far. In this work we propose SOUR, a learning-to-rank method that detects and removes outliers before building an effective ranking model. We limit our analysis to gradient boosting decision trees, where SOUR searches for outlier instances that are incorrectly ranked in several iterations of the learning process. Extensive experiments show that removing a limited number of outlier data instances before re-training a new model provides statistically significant improvements, and that SOUR outperforms state-of-the-art de-noising and outlier detection methods

    On the Effect of Low-Ranked Documents: A New Sampling Function for Selective Gradient Boosting

    Learning to Rank is the task of learning a ranking function from a set of query-document pairs. A query typically has thousands of candidate documents, but not all of them are informative for the learning phase. Different strategies have been designed to select the most informative documents from the training set. However, most of them focus on reducing the size of the training set to speed up the learning phase, sacrificing effectiveness. A first attempt to avoid this trade-off was made by Selective Gradient Boosting, a learning algorithm that makes use of a customisable sampling strategy to train effective ranking models. In this work, we propose a new sampling strategy called High-Low-Sampl for selecting negative examples applicable to Selective Gradient Boosting, without compromising model effectiveness. The proposed sampling strategy allows Selective Gradient Boosting to compose a new training set by selecting three document classes from the original one: the positive examples, high-ranked negative examples and low-ranked negative examples. The resulting dataset aims at minimizing the mis-ranking risk, i.e., enhancing the discriminative power of the learned model and maintaining generalisation to unseen instances. We demonstrate through an extensive experimental analysis on publicly available datasets that the proposed selection algorithm is able to make the most of the negative examples within the training set and leads to models capable of obtaining statistically significant improvements in terms of NDCG, compared to the state of the art.
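
    The three-class selection described above might look roughly like the following sketch for a single query. The abstract does not give the actual sampling function, so everything here is an assumption: the name `high_low_sample`, the cut-offs `n_high` and `n_low`, and the choice to treat label 0 as "negative" and to rank negatives by the current model score.

```python
def high_low_sample(scores, labels, n_high=2, n_low=2):
    """Return the sorted indices of the documents kept for one query.

    Hypothetical rule: keep every positive (label > 0), the `n_high`
    best-scored negatives and the `n_low` worst-scored negatives.
    """
    positives = [d for d, rel in enumerate(labels) if rel > 0]
    # Negatives ordered by current model score, highest first.
    negatives = sorted((d for d, rel in enumerate(labels) if rel == 0),
                       key=lambda d: -scores[d])
    high = negatives[:n_high]
    rest = negatives[n_high:]
    # Guard against rest[-0:], which would return the whole list.
    low = rest[-n_low:] if n_low > 0 else []
    return sorted(positives + high + low)


# Example: one positive (index 1) and six negatives.
kept = high_low_sample([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3],
                       [0, 1, 0, 0, 0, 0, 0],
                       n_high=1, n_low=2)
print(kept)  # [0, 1, 5, 6]
```

    Taking the low-ranked negatives from what remains after the high-ranked ones are removed keeps the two groups disjoint even for short document lists.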

    Tavola rotonda su museo archeologico e nuovi scavi a Bari

    Report of the debate on updating archaeological research in Bari and on the fate of the local archaeological museum.

    SOUR: an Outliers Detection Algorithm in Learning to Rank (Abstract)

    Outlier data points are known to negatively affect the learning process of regression or classification models, yet their impact in the learning-to-rank scenario has not been thoroughly investigated so far. In this talk we present our effort to solve this research problem. The full version of this work will appear at ICTIR 2022 [1]. We designed SOUR, a learning-to-rank method that detects and removes outliers before building an effective ranking model. We limit our analysis to gradient boosting decision trees, but our algorithm can be easily adapted to handle different learning strategies, such as artificial neural networks. SOUR searches for outlier instances that are consistently incorrectly ranked in several consecutive iterations of the learning process. We performed an extensive evaluation on three publicly available datasets and empirically demonstrated that i) removing a limited number of outlier data instances before re-training a new model provides statistically significant improvements in terms of effectiveness, and ii) SOUR outperforms state-of-the-art de-noising and outlier detection methods such as [2]. Finally, we investigated how the removal of the outliers affects the ensemble structure and found that the ensemble leaves were purer when trained without the outliers.
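
    The idea of flagging instances that stay mis-ranked over consecutive iterations can be sketched as follows. This is an illustration under stated assumptions, not SOUR itself: the abstract does not define when an instance counts as incorrectly ranked, so that decision is taken as an input (`misranked_history`), and the function name `flag_outliers` and the threshold `min_run` are hypothetical.

```python
def flag_outliers(misranked_history, min_run=3):
    """Flag instances with a long streak of mis-ranked iterations.

    misranked_history[t][d] is True when instance d was judged mis-ranked
    at boosting iteration t (the judging rule is left to the caller).
    An instance is flagged when it has at least `min_run` consecutive
    True entries.
    """
    if not misranked_history:
        return set()
    n_docs = len(misranked_history[0])
    current = [0] * n_docs   # length of the streak ending at this iteration
    longest = [0] * n_docs   # longest streak seen so far
    for flags in misranked_history:
        for d, bad in enumerate(flags):
            current[d] = current[d] + 1 if bad else 0
            longest[d] = max(longest[d], current[d])
    return {d for d in range(n_docs) if longest[d] >= min_run}


history = [
    [True, False, True],
    [True, True, False],
    [True, True, True],
    [False, True, True],
]
print(sorted(flag_outliers(history, min_run=3)))  # [0, 1]
```

    Following the abstract, the flagged instances would then be removed from the training set before a new model is trained from scratch.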

    Does LambdaMART Do What You Expect?

    We analyse the idiosyncrasies of LambdaMART gradients and we introduce some strategies to remove or reduce gradient incoherencies. Specifically, we designed three selection strategies to compute the full gradient for only those documents that should be ranked in the top-k positions of the ranking. We empirically demonstrate on publicly available datasets that the proposed approach leads to models that can achieve statistically significant improvements in terms of NDCG while maintaining the same training efficiency as optimising truncated metrics.

    Endophytic survival of Pseudomonas syringae pv. actinidiae in Actinidia chinensis 'Hort16A' plants

    To tackle the epidemiological role of the latency phase of Pseudomonas syringae pv. actinidiae (Psa) in susceptible asymptomatic host plants, the pathogen survival and colonization were studied in seven-year-old plants of Actinidia chinensis ‘Hort16A’. The plants were inoculated with a virulent Psa gfp-expressing/rifampicin-resistant strain (Psagfp-Rifres) and the endophytic presence of Psa was determined by analysis of the whole plants to re-isolate Psagfp-Rifres and by PCR assays to confirm identity. Finally, the isolates were tested to verify their ability to induce disease symptoms and HR respectively in host and non-host plants. The Psagfp-Rifres presence inside the tissues of the experimentally contaminated plants was detected during the years following the inoculation. The data obtained showed that systemic colonization of host tissues by Psagfp-Rifres took place for a long period of time. The epidemiological significance of this finding raises questions about the effectiveness of the control measures to prevent bacterial canker solely based on antimicrobial treatment on plant surfaces

    A reliable method for Pseudomonas syringae pv. actinidiae detection in kiwifruit explants during in vitro propagation

    According to recent statistics, in Italy a significant spread of the bacterial canker of Actinidia caused by Pseudomonas syringae pv. actinidiae (Psa) is recorded with peaks of 70 and 100%, and lower percentages ranging around 50%. The control of Psa is a priority, especially considering the hazard represented by the endophytic bacterial presence in nursery material. To evaluate the risk that could be associated with micropropagated shoots when using asymptomatic mother plants, Psa survival was studied in micropropagated shoots of A. deliciosa ‘Hayward’ inoculated with different concentrations of a virulent Psa gfp-expressing/rifampicin-resistant strain (Psagfp-Rifres). Microbiological and molecular analyses were carried out at each in vitro transfer and on rooted plantlets in the greenhouse. Reisolation of Psagfp-Rifres on selective media, confirmed by PCR analysis and by the ability to induce symptoms in Actinidia and HR in tobacco plants, was achieved both in the micropropagated materials and in the plants analyzed in toto after their re-establishment. A reliable and efficient method for Psa detection in nursery material is proposed to verify the presence of Psa infection before symptom development.

    LambdaFair for Fair and Effective Ranking

    Traditional machine learning algorithms are known to amplify bias in data or introduce new biases during the learning process, often resulting in discriminatory outcomes that impact individuals from marginalized or underrepresented groups. In information retrieval, one application of machine learning is learning-to-rank frameworks, typically employed to reorder items based on their relevance to user interests. This focus on effectiveness can lead to rankings that unevenly distribute exposure among groups, affecting their visibility to the final user. Consequently, ensuring fair treatment of protected groups has become a pivotal challenge in information retrieval to prevent discrimination, alongside the need to maximize ranking effectiveness. This work introduces LambdaFair, a novel in-processing method designed to jointly optimize effectiveness and fairness ranking metrics. LambdaFair builds upon the LambdaMART algorithm, harnessing its ability to train highly effective models through additive ensembles of decision trees while integrating fairness awareness. We evaluate LambdaFair on three publicly available datasets, comparing its performance with state-of-the-art learning algorithms in terms of both fairness and effectiveness. Our experiments demonstrate that, on average, LambdaFair achieves 6.7% higher effectiveness and only 0.4% lower fairness compared to state-of-the-art fairness-oriented learning algorithms. This highlights LambdaFair’s ability to improve fairness without sacrificing the model’s effectiveness