1,721,071 research outputs found

SSH researchers make an impact differently. Looking at public research from the perspective of users

Authors: Bonaccorsi Andrea, Chiarello Filippo, Fantoni Gualtiero
Publication date: 01/01/2021

With the rise of the impact assessment revolution, governments and public opinion have started to ask researchers to give evidence of their impact outside the traditional audiences, i.e. students and researchers. There is a mismatch between the request to demonstrate impact and the current methodologies for impact assessment. This mismatch is particularly worrisome for research in the Social Sciences and Humanities. This paper contributes by systematically examining a key element of impact, i.e. the social groups that are directly or indirectly affected by the results of research. We apply a text mining approach to the Research Excellence Framework (REF) collection of 6,637 impact case studies in order to identify the social groups mentioned by researchers. Unlike previous studies, we employ a lexicon of user groups that includes 76,857 entries, which saturates the semantic field, permits the identification of all users and opens the way to normalization. We then develop three new metrics measuring Frequency, Diversity and Specificity of user expressions. We find that Social Sciences and Humanities exhibit a distinctive structure with respect to frequency and specificity of users.
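The lexicon-matching step described in this abstract can be pictured with a minimal sketch. The abstract does not define the three metrics, so the definitions used here (frequency as mentions per 1,000 tokens, diversity as the number of distinct groups found) are illustrative assumptions, and the toy lexicon and the function name `user_metrics` are hypothetical. Specificity is left out because it would need information the abstract does not give.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; the lexicon in the abstract has 76,857 entries.
LEXICON = {"patients", "teachers", "policy makers", "museum visitors"}

def user_metrics(text, lexicon=LEXICON):
    """Count lexicon entries in a text and derive two illustrative metrics."""
    lowered = text.lower()
    tokens = re.findall(r"[a-z]+", lowered)
    counts = Counter()
    for entry in lexicon:
        # Whole-phrase match on word boundaries (multi-word entries allowed).
        pattern = r"\b" + re.escape(entry) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[entry] = hits
    mentions = sum(counts.values())
    return {
        "counts": dict(counts),
        # Assumed definition: user mentions per 1,000 tokens.
        "frequency": 1000 * mentions / len(tokens) if tokens else 0.0,
        # Assumed definition: number of distinct user groups mentioned.
        "diversity": len(counts),
    }

case_study = ("The project trained teachers and informed policy makers; "
              "teachers reported changes in classroom practice.")
result = user_metrics(case_study)
print(result)
```

A real pipeline would also need lemmatization and normalization of variant expressions, which this sketch does not attempt.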

Archivio della Ricerca - Università di Pisa

AI-Based Knowledge Extraction from the Bioprinting Literature for Identifying Technology Trends

Authors: Bonatti Amedeo Franco, Chiarello Filippo, Vozzi Giovanni, De Maria Carmelo
Publication date: 01/01/2023

Bioprinting is a rapidly evolving field, as represented by the exponential growth of articles and reviews published each year on the topic. As the number of publications increases, there is a need for an automatic tool that can help researchers do more comprehensive literature analysis, standardize the nomenclature, and so accelerate the development of novel manufacturing techniques and materials for the field. In this context, we propose an automatic keyword annotation model, based on Natural Language Processing (NLP) techniques, that can be used to find insights in the bioprinting scientific literature. The approach is based on two main data sources, the abstracts and related author keywords, which are used to train a composite model based on (i) an embeddings part (using the FastText algorithm), which generates word vectors for an input keyword, and (ii) a classifier part (using the Support Vector Machine algorithm), to label the keyword based on its word vector into a manufacturing technique, employed material, or application of the bioprinted product. The composite model was trained and optimized based on a two-stage optimization procedure to yield the best classification performance. The annotated author keywords were then reprojected on the abstract collection to both generate a lexicon of the bioprinting field and extract relevant information, like technology trends and the relationship between manufacturing-material-application. The proposed approach can serve as a basis for more complex NLP-related analysis toward the automated analysis of the bioprinting literature.

Archivio della Ricerca - Università di Pisa

A simple and fast method for Named Entity context extraction from patents

Authors: Puccetti Giovanni, Chiarello Filippo, Fantoni Gualtiero
Publication date: 01/01/2021

The process of extracting relevant technical information from patents or technical literature is as valuable as it is challenging. It deals with highly relevant information extraction from a corpus of documents with a particular structure and a mix of technical and legal jargon. Patents are the widest free source of technical information where homogeneous entities can be found. From a technical perspective, the approaches refer to Named Entity Recognition (NER) and make use of Machine Learning techniques for Natural Language Processing (NLP). However, due to the large amount of data, the complexity of the lexicon, the peculiarity of the structure and the scarcity of examples with which to feed the machine learning system, new approaches should be studied. NER methods are improving their performance in many contexts, but a gap still exists when dealing with technical documentation. The aim of this work is to automatically create training sets for NER systems by exploiting the nature and structure of patents, an open and massive source of technical documentation. In particular, we focus on collecting the context where users of the invention appear within patents. We then measure to what extent we achieve our goal and discuss how far our method generalizes to other entities and documents.
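The idea of collecting the context around user mentions to build NER training data can be illustrated with a small sketch. This is not the authors' method: the seed list `USER_TERMS`, the window size, the `B-USER` label and the function name `extract_contexts` are all hypothetical choices made for illustration, and only single-token terms are handled.

```python
import re

# Hypothetical seed list of user terms; not taken from the paper.
USER_TERMS = {"surgeon", "driver", "operator"}

def extract_contexts(text, terms=USER_TERMS, window=3):
    """Return (tokens, labels) pairs: a token window around each matched term.

    Only the matched term itself is labelled B-USER; any other seed term
    that happens to fall inside the same window stays labelled O.
    """
    tokens = re.findall(r"\w+", text)
    examples = []
    for i, tok in enumerate(tokens):
        if tok.lower() in terms:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            context = tokens[start:end]
            labels = ["B-USER" if j == i else "O" for j in range(start, end)]
            examples.append((context, labels))
    return examples

sentence = "The handle allows the surgeon to rotate the blade with one hand."
examples = extract_contexts(sentence)
for context, labels in examples:
    print(list(zip(context, labels)))
```

Each returned pair can be written out in a token-per-line format and used as a weakly labelled example for training a NER model.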

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Design and Implementation of a Text Mining-based Tool to Support Scoping Reviews

Authors: Chiarello Filippo, Martini Antonella, Gastaldi Luca
Publication date: 01/01/2023

Among literature reviews, the scoping review is a relatively new approach that is gaining popularity because it helps researchers define emerging and multidisciplinary fields. While artificial intelligence for text processing can help researchers in this sense, we still lack clear procedures and tools to improve the reviewing process. Following a design science approach, in this article we propose a novel tool based on natural language processing (NLP) to support scoping reviews and to visualise their results. The tool (NLP4Scoping) is implemented using open-source software and is made available for reuse on GitHub. Each phase of its application is described, focusing on the nascent literature stream on innovation management in digital ecosystems.

Archivio istituzionale della ricerca - Politecnico di Milano

Design and Implementation of a Text mining based Tool to Support Scoping Reviews

Authors: Martini Antonella, Gastaldi Luca, Chiarello Filippo
Publication date: 01/01/2023

Archivio della Ricerca - Università di Pisa

Analyzing Social Robotics Research with Natural Language Processing Techniques

Authors: Chiarello Filippo, Mazzei Daniele, Fantoni Gualtiero
Publication date: 01/01/2021

The fast growth of social robotics (SR) has not been unidirectional, but has instead moved towards a multidisciplinary scenario, creating a need for collaboration between different fields. This divergent expansion calls for a clear analysis of the field aimed at better orienting the research, thus paving the way for the future of social robotics. This paper aims to understand how the SR research field has evolved over the last two decades by analyzing academic publications in SR and human–robot interaction using natural language processing (NLP) techniques. The analysis spotted an overlap between the SR and human–robot interaction research fields, which was disambiguated using a data-driven approach that led to the identification of a new group of papers we clustered under the concept of “soft HRI.” This research topic has been analyzed by extracting trends and insights. Finally, a further topic modelling step has been applied to identify seven sub-topics, which have been discussed and analyzed to picture the current state of the art of SR. The paper reports a complete overview of the SR research field, identifying various topics and sub-topics and helping researchers understand the evolution of this field, thus supporting the strategic placing and evolution of their research activities.

Archivio della Ricerca - Università di Pisa

The effect of linguistic style on solvers’ success? An empirical analysis in a crowdsourcing community

Authors: Chiarello Filippo, Piazza Mariangela, Mazzola Erica
Publication date: 04/02/2025

Purpose: Prior crowdsourcing literature has highlighted that communication among peers within the crowdsourcing community matters since it affects the solvers’ success. Particularly, previous scholars have focused on how the volume and the content of communications among solvers improve their creativity and likelihood of winning crowdsourcing contests. This study aims to understand whether, alongside these two communication dimensions, the linguistic style of solvers’ communications (i.e. how solvers write things) permits them to promote their qualities in seekers’ eyes and emerge from the crowd. Design/methodology/approach: Empirically, we collected data from a sample of 1866 solvers within the community of the 99designs crowdsourcing platform to build an ad-hoc dataset and test our hypotheses by running an econometric analysis. Findings: Our results show that by posting comments of moderate length and prudent complexity, characterized by positive language and an other-oriented perspective, solvers can signal their capabilities and skills to increase their likelihood of succeeding in crowdsourcing contests. Originality/value: This research’s findings contribute to prior crowdsourcing literature, which has so far exclusively focused on the linguistic style used by seekers when drafting the requests for proposals of their competitions. Moreover, the paper offers practical guidance for both solvers and seekers, suggesting how to leverage peer communications in crowdsourcing contests

Archivio istituzionale della ricerca - Università di Palermo

On How Technological Evolution, Organizational Paradigms, Global Trends and Events are Shaping the Research on Project Management

Authors: Chiarello Filippo, Giordano Vito, Fantoni Gualtiero, Martini Antonella
Publication date: 01/01/2022

Archivio della Ricerca - Università di Pisa

Standardising job descriptions in the humanitarian supply chain: A text mining approach for recruitment process

Authors: Spada Irene, Chiarello Filippo, Fabbroni Valeria, Fantoni Gualtiero
Publication date: 01/01/2024

Purpose: Uncertainty and complexity have increased in recent decades, posing new challenges to humanitarian organisations. This study investigates whether using standard terminology in Human Resource Management processes can support the Humanitarian supply chain in attracting and maintaining highly skilled operators. Methodology: We exploit text mining to compare job vacancies on ReliefWeb, the reference platform for humanitarian job seekers, and ESCO, the European Classification of Skills, Competencies, and Occupations. We measure the level of alignment in these two resources, providing quantitative evidence about terminology standardisation in job descriptions for supporting HR operators in the Humanitarian field. Findings: The most in-demand skills, besides languages, relate to resource management and economics and finance for capital management. Our results show that job vacancies for managerial and financial profiles are relatively more in line with the European database than those for technical profiles. However, the peculiarities of the humanitarian sector and the lack of standardisation are still a barrier to achieving the desired level of coherence with humanitarian policies.

Archivio della Ricerca - Università di Pisa

Automatic users extraction from patents

Authors: Cimino Andrea, Dell'Orletta Felice, Chiarello Filippo, Fantoni Gualtiero
Publication date: 01/01/2018

Patents contain a large quantity of information which is usually neglected. This information is hidden beneath technical and juridical jargon, and therefore many potential readers cannot take advantage of it. State-of-the-art natural language processing tools, and in particular named entity recognition tools, could be used to detect valuable concepts in patent documents. The purpose of the present research is to design a method capable of automatically detecting and extracting one of the multiple entities hidden in patents: the users of the invention. The method is based on a new approach tailored for user extraction that integrates state-of-the-art computational linguistics tools with a large knowledge base. Furthermore, the paper compares different machine learning algorithms with the twofold aim of achieving the highest recall and evaluating the performance in terms of precision and computational effort. Finally, a case study on two patent sets has been conducted to evaluate the effectiveness and the output of the entire tool-chain.

Archivio della Ricerca - Università di Pisa