Evaluating Sentence Representations for Biomedical Text: Methods and Experimental Results
Text representations are one of the main inputs to various Natural Language Processing (NLP) methods. Given the fast developmental pace of new sentence embedding methods, we argue that there is a need for a unified methodology to assess these different techniques in the biomedical domain. This work introduces a comprehensive evaluation of novel methods across ten medical classification tasks. The tasks cover a variety of BioNLP problems such as semantic similarity, question answering, citation sentiment analysis and others with binary and multi-class datasets. Our goal is to assess the transferability of different sentence representation schemes to the medical and clinical domain. Our analysis shows that embeddings based on Language Models, which account for the context-dependent nature of words, usually outperform the others. Nonetheless, there is no single embedding model that perfectly represents biomedical and clinical texts with consistent performance across all tasks. This illustrates the need for a more suitable bio-encoder. Our MedSentEval source code, pre-trained embeddings and examples have been made available on GitHub
SimpleNLG-ZH: a Linguistic Realisation Engine for Mandarin
We introduce SimpleNLG-ZH, a realisation engine for Mandarin that follows the software design paradigm of SimpleNLG. We explain the core grammar (morphology and syntax) and the lexicon of SimpleNLG-ZH, which is very different from English and other languages for which SimpleNLG engines have been built. The system was evaluated by regenerating expressions from a body of test sentences and a corpus of human-authored expressions. Human evaluation was conducted to estimate the quality of regenerated sentences
Modelling Pro-drop with the Rational Speech Acts Model
We extend the classic Referring Expressions Generation task by considering zero pronouns in pro-drop languages such as Chinese, modelling their use by means of the Bayesian Rational Speech Acts model. By assuming that highly salient referents are most likely to be referred to by zero pronouns (i.e., pro-drop is more likely for salient referents than the less salient ones), the model offers an attractive explanation of a phenomenon not previously addressed probabilistically
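The reasoning in this abstract can be illustrated with a minimal sketch of a Rational Speech Acts speaker. This is not the authors' model: the referents, salience priors, utterance costs and the `ALPHA` rationality parameter below are all hypothetical values chosen for illustration. The sketch assumes a zero pronoun is cheap but ambiguous, while a full name is costly but unambiguous.

```python
import math

# Hypothetical toy setup: two referents with assumed salience priors.
SALIENCE = {"A": 0.8, "B": 0.2}

# Each utterance maps to (referents it can denote, assumed production cost).
UTTERANCES = {
    "zero": ({"A", "B"}, 0.0),   # zero pronoun: free, but ambiguous
    "name_A": ({"A"}, 1.0),      # full name: costly, but unambiguous
    "name_B": ({"B"}, 1.0),
}

ALPHA = 1.0  # assumed speaker rationality parameter


def normalize(scores):
    """Scale non-negative scores so that they sum to one."""
    total = sum(scores.values())
    return {key: value / total for key, value in scores.items()}


def literal_listener(utterance):
    """L0(r | u): salience prior restricted to referents the utterance can denote."""
    denotes, _cost = UTTERANCES[utterance]
    return normalize(
        {r: (p if r in denotes else 0.0) for r, p in SALIENCE.items()}
    )


def pragmatic_speaker(referent):
    """S1(u | r): softmax over informativeness (log L0) minus utterance cost."""
    scores = {}
    for utterance, (_denotes, cost) in UTTERANCES.items():
        p = literal_listener(utterance)[referent]
        scores[utterance] = (
            math.exp(ALPHA * (math.log(p) - cost)) if p > 0 else 0.0
        )
    return normalize(scores)


if __name__ == "__main__":
    for referent in SALIENCE:
        print(referent, pragmatic_speaker(referent))
```

Under these assumed numbers, the speaker assigns a higher probability to the zero pronoun when referring to the more salient referent, which is the qualitative pattern the abstract describes.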
Natural Language Processing in Textual Information Retrieval and Related Topics
This article reviews the main characteristics of natural language processing techniques, focusing on their application to information retrieval and related areas
Visual Tools for Natural Language Processing
We describe GATE, the General Architecture for Text Engineering, an integrated visual development environment to support the visual assembly, execution and analysis of modular natural language processing systems. The visual model is an executable data flow program graph, automatically synthesised from data dependency declarations of language processing modules. The graph is then directly executable: modules are run interactively in the graph, and results are accessible via generic text visualisation tools linked to the modules. These tools lighten the cognitive load of viewing and comparing module results by relating data produced by modules back to the underlying text, by reducing the amount of search in examining results, and by displaying results in context. Overall, the GATE integrated visual development environment leads to rapid understanding of system behaviour and hence to rapid system refinement, therefore demonstrating the utility of visual programming and visualisation techniques for the development of natural language processing systems
M-RAM: a Mobile Risk Assessment Method for Enterprise Mobile Security
Mobile solutions seem to outrun the control and governance within enterprise organizations. The acceptance of smartphones and tablets in business has gone at such a high pace that organizations are no longer able to oversee the risks of their mobile usage. Traditional risk assessment methods do not consider the usage of mobile devices (mobility), despite the fact that enterprise organizations struggle with managing mobile risks. We aim to fill this gap by introducing a Mobile Risk Assessment Method (M-RAM). The method is based on an evaluation of industry standard risk methods and 22 interviews with mobile security experts. Three components compose the method: (1) a risk assessment process that is customized for mobility, (2) involved entities that oppose risks, and (3) attention areas that can contain vulnerabilities as well as controls. Moreover, the study provides a practical work program to conduct the M-RAM and validates the approach by conducting a case study
Natural language processing for global and local business
The concept of natural language processing has become one of the preferred methods to better understand consumers, especially in recent years when digital technologies and research methods have developed exponentially. It has become apparent that when responding to international consumers through multiple platforms and speaking in the same language in which the consumers express themselves, companies are improving their standings within the public sphere. Natural Language Processing for Global and Local Business provides research exploring the theoretical and practical phenomenon of natural language processing through different languages and platforms in terms of today's conditions. Featuring coverage on a broad range of topics such as computational linguistics, information engineering, and translation technology, this book is ideally designed for IT specialists, academics, researchers, students, and business professionals seeking current research on improving and understanding the consumer experience
Diagnosis Classification in the Emergency Room Using Natural Language Processing
Diagnosis classification in the emergency room (ER) is a complex task. We developed several natural language processing classification models, looking both at the full classification task of 132 diagnostic categories and at several clinically applicable samples consisting of two diagnoses that are hard to distinguish
Transforming epilepsy research: A systematic review on natural language processing applications
Despite improved ancillary investigations in epilepsy care, patients' narratives remain indispensable for diagnosis and treatment monitoring. This wealth of information is typically stored in electronic health records and accumulated in medical journals in an unstructured manner, thereby restricting complete utilization in clinical decision-making. To this end, clinical researchers increasingly apply natural language processing (NLP), a branch of artificial intelligence, as it removes ambiguity, derives context, and imbues standardized meaning from free-narrative clinical texts. This systematic review presents an overview of the current NLP applications in epilepsy and discusses the opportunities and drawbacks of NLP alongside its future implications. We searched the PubMed and Embase databases with a “natural language processing” and “epilepsy” query (March 4, 2022) and included original research articles describing the application of NLP techniques for textual analysis in epilepsy. Twenty-six studies were included. Fifty-eight percent of these studies used NLP to classify clinical records into predefined categories, improving patient identification and treatment decisions. Other applications of NLP included structured clinical information retrieval from electronic health records, scientific papers, and online posts of patients. Challenges and opportunities of NLP applications for enhancing epilepsy care and research are discussed. The field could further benefit from NLP by replicating successes in other health care domains, such as NLP-aided quality evaluation for clinical decision-making, outcome prediction, and clinical record summarization
Representation Learning for Natural Language Processing
This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing
