1,520 research outputs found

    Onderzoek op het HBO transformeert de samenleving

    No full text
    The world is changing: all around us we see systems and institutions slowly collapsing. What will take their place? "Research at universities of applied sciences (hbo) plays a crucial role in answering that question," say Ineke van der Meule (director of the Centrum voor Lectoraten en Onderzoek) and Bert Mulder (lector in Information, Technology and Society)

    Probing BERT for Ranking Abilities

    No full text
    Contextual models like BERT are highly effective in numerous text-ranking tasks. However, it is still unclear as to whether contextual models understand well-established notions of relevance that are central to IR. In this paper, we use probing, a recent approach used to analyze language models, to investigate the ranking abilities of BERT-based rankers. Most of the probing literature has focussed on linguistic and knowledge-aware capabilities of models or axiomatic analysis of ranking models. In this paper, we fill an important gap in the information retrieval literature by conducting a layer-wise probing analysis using four probes based on lexical matching, semantic similarity as well as linguistic properties like coreference resolution and named entity recognition. Our experiments show an interesting trend that BERT-rankers better encode ranking abilities at intermediate layers. Based on our observations, we train a ranking model by augmenting the ranking data with the probe data to show initial yet consistent performance improvements (the code is available at https://github.com/yolomeus/probing-search/). Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public. Web Information System

    BERT Rankers are Brittle: A Study using Adversarial Document Perturbations

    No full text
    Contextual ranking models based on BERT are now well established for a wide range of passage and document ranking tasks. However, the robustness of BERT-based ranking models under adversarial inputs is under-explored. In this paper, we argue that BERT-rankers are not immune to adversarial attacks targeting retrieved documents given a query. Firstly, we propose algorithms for adversarial perturbation of both highly relevant and non-relevant documents using gradient-based optimization methods. The aim of our algorithms is to add/replace a small number of tokens to a highly relevant or non-relevant document to cause a large rank demotion or promotion. Our experiments show that a small number of tokens can already result in a large change in the rank of a document. Moreover, we find that BERT-rankers heavily rely on the document start/head for relevance prediction, making the initial part of the document more susceptible to adversarial attacks. More interestingly, we find a small set of recurring adversarial words that when added to documents result in successful rank demotion/promotion of any relevant/non-relevant document respectively. Finally, our adversarial tokens also show particular topic preferences within and across datasets, exposing potential biases from BERT pre-training or downstream datasets. Green Open Access added to TU Delft Institutional Repository 'You share, we take care!' - Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public. Web Information System

    Author Ben Ames Williams first met Searsmont farmer Bert McCorrison in 1918, a m

    No full text
    Author Ben Ames Williams first met Searsmont farmer Bert McCorrison in 1918, a meeting which the author said had a profound impact on his professional career. McCorrison died in 1931, leaving Williams his Hardscrabble Farm in Searsmont, which became the author's home until his death in 1953

    Forged-GAN-BERT: Authorship Attribution for LLM-Generated Forged Novels

    Full text link
    The advancement of generative Large Language Models (LLMs), capable of producing human-like texts, introduces challenges related to the authenticity of the text documents. This requires exploring potential forgery scenarios within the context of authorship attribution, especially in the literary domain. Particularly, two aspects of doubted authorship may arise in novels, as a novel may be imposed by a renowned author or include a copied writing style of a well-known novel. To address these concerns, we introduce Forged-GAN-BERT, a modified GAN-BERT-based model to improve the classification of forged novels in two data-augmentation aspects: via the Forged Novels Generator (i.e., ChatGPT) and the generator in GAN. Compared to other transformer-based models, the proposed Forged-GAN-BERT model demonstrates an improved performance with F1 scores of 0.97 and 0.71 for identifying forged novels in single-author and multi-author classification settings. Additionally, we explore different prompt categories for generating the forged novels to analyse the quality of the generated texts using different similarity distance measures, including ROUGE-1, Jaccard Similarity, Overlap Confident, and Cosine Similarity

    Dave Hunter and Bert McDonald

    No full text
    Photograph - Dave Hunter Addresses the Haggis at Robbie Burns night at Royal Canadian Legion, Athabasca Branch No. 103, Athabasca, Alberta. Bert McDonald is on the left. February 6, 196

    Bert Pary House - 02

    No full text
    Photograph - This building was built in 1912 and was owned by Bert Pary, a telegrapher and lineman. It was purchased in 1927 by Dean Galloway, a UGG grain buyer, and his widow Catherine lived in the house until 1973. Ukrainian Catholic priest Father Karychuk and his wife bought the house and passed ownership to their daughter in 1995. It was demolished in 1995 and the Native Friendship Centre was built on the site

    Bert Pary House

    No full text
    Photograph - This building was built in 1912 and was owned by Bert Pary, a telegrapher and lineman. It was purchased in 1927 by Dean Galloway, a UGG grain buyer, and his widow Catherine lived in the house until 1973. Ukrainian Catholic priest Father Karychuk and his wife bought the house and passed ownership to their daughter in 1995. It was demolished in 1995 and the Native Friendship Centre was built on the site

    Demolishing the Bert Pary House

    No full text
    Photograph - This building was built in 1912 and was owned by Bert Pary, a telegrapher and lineman. It was purchased in 1927 by Dean Galloway, a UGG grain buyer, and his widow Catherine lived in the house until 1973. Ukrainian Catholic priest Father Karychuk and his wife bought the house and passed ownership to their daughter in 1995. It was demolished in 1995 and the Native Friendship Centre was built on the site

    Transfer Learning for Automatic Author Profiling with BERT Transformers and GloVe Embeddings

    No full text
    Historically, author profiling has been used in forensic linguistics. However, it is not until the last decades that the analysis method has worked its way into computer science and machine learning. In comparison, determining author profiling characteristics in machine learning is nothing new. This paper investigates the possibility of improving upon previous results with modern frameworks using data sets that have seen limited usage. The purpose of this master's thesis was to use pre-trained transformers or embeddings together with transfer learning and, in addition, to examine whether general author profiling characteristics of anonymous users on internet forums or in conversations on social media could be determined. The data sets used to investigate the questions above were PAN15 and PANDORA, which contain various properties in text data based on authors paired with ground truth labels such as gender, age, and Big Five/OCEAN. In addition, transfer learning of BERT and GloVe was used as a starting point to decrease the learning time of a new task. PAN15, a Twitter data set, did not contain enough data when training a model and was augmented using PANDORA, a Reddit-based data set. Ultimately, BERT obtained the best performance using a stacked approach, achieving 86–91% accuracy for each label on unseen data