An in-depth investigation on the behavior of measures to quantify reproducibility
Science is facing a so-called reproducibility crisis, in which researchers struggle to repeat experiments and obtain the same or comparable results. This is a fundamental problem in any scientific discipline because reproducibility lies at the very basis of the scientific method. A central methodological question is how to measure reproducibility and how to interpret different measures. In Information Retrieval (IR), current practices for measuring reproducibility rely mainly on comparing averaged scores. If the reproduced score is close enough to the original one, the reproducibility experiment is deemed successful, even though identical scores can still rest on entirely different result lists. Therefore, this paper focuses on measures to quantify reproducibility in IR and on their behavior. We present a critical analysis of IR reproducibility measures by synthetically generating runs in a controlled experimental setting, which allows us to control the amount of reproducibility error. These synthetic runs are generated by a deterioration algorithm based on swaps and replacements of documents in ranked lists. We investigate the behavior of different reproducibility measures with these synthetic runs in three different scenarios. Moreover, we propose a normalized version of Root Mean Square Error (RMSE) to quantify reproducibility better. Experimental results show that a single score is not enough to decide whether an experiment has been successfully reproduced, because such a score depends on the type of effectiveness measure and on the performance of the original run. This study highlights how challenging it can be to reproduce experimental results and to quantify the amount of reproducibility.
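The point that equal averaged scores can hide different per-topic behavior can be illustrated with a minimal sketch. It computes the plain, unnormalized RMSE over per-topic scores; the normalization proposed in the paper is not given in the abstract and is not reproduced here. The function name `rmse` and the per-topic scores are hypothetical, chosen only for illustration.

```python
import math


def rmse(original, reproduced):
    """Plain RMSE between per-topic scores of two runs (illustrative helper)."""
    if len(original) != len(reproduced):
        raise ValueError("both runs must cover the same topics")
    squared = [(o - r) ** 2 for o, r in zip(original, reproduced)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-topic effectiveness scores, not data from the paper.
original = [0.25, 0.5, 0.75, 1.0]
reproduced = [1.0, 0.75, 0.5, 0.25]

# The averaged scores are identical, so comparing means reports no difference.
mean_gap = abs(sum(original) / len(original) - sum(reproduced) / len(reproduced))

# The topic-wise error is clearly non-zero.
error = rmse(original, reproduced)
```

Under these made-up scores the gap between the means is zero while the RMSE is not, which is the situation the abstract warns about.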
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
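The two counting schemes compared above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's procedure: each citing paper is represented as a list of cited works, each cited work as an ordered list of author names, and an author pair is counted at most once per citing paper. The function `cocitation_counts`, its `max_authors` parameter, and the sample names are all hypothetical. Treating co-authors of the same cited work as co-cited is also an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors=1):
    """Count author co-citations, using the first `max_authors` of each cited work."""
    counts = Counter()
    for cited_works in reference_lists:
        # Collect the authors credited in this citing paper.
        credited = set()
        for authors in cited_works:
            credited.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Hypothetical reference lists of two citing papers.
reference_lists = [
    [["Smith", "Lee"], ["Chen"]],
    [["Smith", "Lee"], ["Chen", "Park"]],
]

first_author = cocitation_counts(reference_lists, max_authors=1)
first_five = cocitation_counts(reference_lists, max_authors=5)
```

With first-author counting only the Chen and Smith pair appears, whereas counting up to five authors also brings Lee and Park into the co-citation data.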
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
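As a small illustration of how the Pearson correlation and the cosine can disagree on cocitation profiles, the sketch below computes both for two made-up profiles, then again after appending authors with whom neither is cocited. This shows one well-known difference between the two measures; it is not necessarily the argument made in the paper. The two alternative measures the paper proposes are not shown because the abstract does not define them. The function names and the profiles are hypothetical.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two cocitation profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0:
        raise ValueError("cosine is undefined for an all-zero profile")
    return dot / norm


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation counts of two authors with four other authors.
a = [3, 0, 2, 1]
b = [1, 2, 0, 3]

# Same profiles after adding four authors with whom neither is cocited.
a_padded = a + [0, 0, 0, 0]
b_padded = b + [0, 0, 0, 0]

cos_before, cos_after = cosine(a, b), cosine(a_padded, b_padded)
r_before, r_after = pearson(a, b), pearson(a_padded, b_padded)
```

In this example the cosine is unchanged by the added zeros, while the Pearson correlation moves from negative to positive.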
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.
Informetric Analyses and Non-textual Document Attributes for Information Retrieval in Digital Libraries
The search for scientific literature is a research challenge for information retrieval in the particular setting of digital libraries. Recent user studies reveal two typical weaknesses of the classical IR model: the ranking of retrieved documents and the difficulty of formulating queries. At the same time, traditional retrieval systems that rely primarily on textual document and query features have been stagnating for years, as observed in IR evaluation campaigns such as TREC and CLEF. Alternative approaches are therefore needed to address these two problem areas.
Two informetrically motivated search support methods are presented and assessed in a laboratory evaluation using the IR test collections GIRT and iSearch, with 150 and 65 topics, respectively. The methods are (1) a query expansion based on the co-occurrence of document attributes and (2) a ranking approach that exploits informetric observations about the productivity of information producers. Both methods were compared with a baseline implementation built on the Solr search engine. Both showed positive effects when additional document attributes such as author names, ISSN codes and controlled terms were used. Query expansion improved precision (bpref +12%) and recall (R +22%). The alternative ranking approaches matched the baseline when using author names and ISSN codes and exceeded it when using controlled terms (MAP +14%). Including factors such as publisher names or places of publication, however, had a negative influence on the ranking. Both methods produced a substantially different ordering of the result set, as measured by Kendall's τ. In addition to the improved relevance of the result list, the user can thus gain a new view of the document set.
Query expansion with author names, ISSN codes and thesaurus terms shows the previously untapped potential offered by the volume and quality of data in digital libraries. The ranking methods outperformed the baseline system once the data had been checked for a power-law distribution and filtered accordingly. This shows that the ranking methods are not universally applicable to all queries but require particular frequency distributions in the metadata, which makes clear their close connection to informetric regularities such as Bradford's, Lotka's and Zipf's laws. Both evaluated methods are implemented as interactive search support services that can be used in an interactive prototype and in the social science digital library Sowiport. They can also be deployed in other application contexts through a free software framework and a web API.
koamabayili/VECTRON-author-checklist: VECTRON author checklist
We have done our best to complete the author checklist relating to the use of animals in the hut study. Note that the objective of the hut study was to evaluate the IRS treatment applications for residual efficacy against Anopheles mosquitoes, including the local An. coluzzii mosquito population. Cows were used only to attract mosquitoes into the huts, and no tests were carried out directly on the cows. The author checklist is intended for studies in which experiments are carried out on animals, which is why we found it so difficult to complete for the hut study: many of the questions do not relate to how the cows were used.
Author-wise bibliometric analysis based on entropy.
