
    DrugProt corpus: Biocreative VII Track 1 - Text mining drug and chemical-protein interactions

    Gold Standard annotations of the DrugProt corpus (training and development sets)

    Introduction

    The aim of the DrugProt track (similar to the previous CHEMPROT task of BioCreative VI) is to promote the development and evaluation of systems that are able to automatically detect relations between chemical compounds/drugs and genes/proteins. We have therefore generated a manually annotated corpus, the DrugProt corpus, where domain experts have exhaustively labeled: (a) all chemical and gene mentions, and (b) all binary relationships between them corresponding to a specific set of biologically relevant relation types (DrugProt relation classes).

    There is also an increasing interest in the integration of chemical and biomedical data, understood as the curation of relationships between biological and chemical entities from text and the storage of such information in the form of structured annotation databases. Such databases are of key relevance not only for biological but also for pharmacological and clinical research. A range of different types of chemical-protein/gene interactions are of key relevance for biology, including metabolic relations (e.g. substrates, products), inhibition, binding or induction associations.

    The DrugProt track aims to address these needs and to promote the development of systems able to extract chemical-protein interactions that might be of relevance for precision medicine as well as for drug discovery and basic biomedical research. The DrugProt track in BioCreative VII (BC VII) will explore recognition of chemical-protein entity relations from abstracts.
    Teams participating in this track are provided with:
        PubMed abstracts
        Manually annotated chemical compound mentions
        Manually annotated gene/protein mentions
        Manually annotated chemical compound-protein relations

    Zip structure:
        Training set folder with:
            drugprot_training_abstracts.tsv: PubMed records
            drugprot_training_entities.tsv: manually labeled mention annotations of chemical compounds and genes/proteins
            drugprot_training_relations.tsv: chemical-protein relation annotations
        Development set folder with:
            drugprot_development_abstracts.tsv
            drugprot_development_entities.tsv
            drugprot_development_relations.tsv

    Data format description

    The input text files for the DrugProt track will be plain-text, UTF-8-encoded PubMed records in a tab-separated format with the following three columns:
        Article identifier (PMID, PubMed identifier)
        Title of the article
        Abstract of the article

    DrugProt entity mention annotation files contain manually labeled mention annotations of chemical compounds and genes/proteins. Such files consist of tab-separated fields containing the following six columns:
        Article identifier (PMID)
        Term number (for this record)
        Type of entity mention (CHEMICAL, GENE-Y, GENE-N)
        Start character offset of the entity mention
        End character offset of the entity mention
        Text string of the entity mention

    Each line contains one entity, and each entity is uniquely identified by its PMID and the Term Number. In addition, each annotation contains an annotation type, the start offset (the index of the first character of the annotated span in the text), the end offset (the index of the first character after the annotated span) and the text spanned by the annotation.
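    The six-column entity format and the offset convention described above (end offset pointing at the first character after the span) can be illustrated with a minimal Python sketch. The names `Entity`, `parse_entities` and `span_matches` are illustrative assumptions, not part of any DrugProt tooling, and the one-sentence document and its two annotations are made-up toy data rather than corpus records. How the title and abstract are joined into the text that the offsets index is not stated in this description, so the sketch simply takes the full text as given.

```python
import csv
import io
from typing import NamedTuple


class Entity(NamedTuple):
    pmid: str
    term_id: str  # term number, e.g. "T1"; unique within one PMID
    etype: str    # CHEMICAL, GENE-Y or GENE-N
    start: int    # index of the first character of the span
    end: int      # index of the first character after the span
    text: str     # text spanned by the annotation


def parse_entities(tsv_text: str) -> list:
    """Parse tab-separated entity lines into Entity records."""
    # QUOTE_NONE keeps quote characters inside mentions as literal text.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    return [Entity(pmid, term, etype, int(start), int(end), text)
            for pmid, term, etype, start, end, text in reader]


def span_matches(document: str, entity: Entity) -> bool:
    """Check that the offsets recover the annotated string (end is exclusive)."""
    return document[entity.start:entity.end] == entity.text


# Toy data, invented for illustration only (not taken from the corpus).
toy_document = "Aspirin inhibits COX-1."
toy_entities_tsv = (
    "0\tT1\tCHEMICAL\t0\t7\tAspirin\n"
    "0\tT2\tGENE-Y\t17\t22\tCOX-1\n"
)

entities = parse_entities(toy_entities_tsv)
all_spans_ok = all(span_matches(toy_document, e) for e in entities)
```

    Because the end offset is exclusive, an ordinary Python slice `document[start:end]` returns the mention without any adjustment.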
    Example DrugProt training entity mention annotations:
        11808879    T1    GENE-Y      1860    1866    KIR6.2
        11808879    T2    GENE-N      1993    2016    glutamate dehydrogenase
        11808879    T3    GENE-Y      2242    2253    glucokinase
        23017395    T1    CHEMICAL    216     223     HMG-CoA
        23017395    T2    CHEMICAL    258     261     EPA

    Example DrugProt development entity mention annotations (no distinction between GENE-Y and GENE-N):
        11808879    T1    GENE        1860    1866    KIR6.2
        11808879    T2    GENE        1993    2016    glutamate dehydrogenase
        11808879    T3    GENE        2242    2253    glucokinase
        23017395    T1    CHEMICAL    216     223     HMG-CoA
        23017395    T2    CHEMICAL    258     261     EPA

    DrugProt relation annotations will be distributed as a file that contains the detailed chemical-protein relation annotations prepared for the DrugProt track. It consists of tab-separated columns containing:
        Article identifier (PMID)
        DrugProt relation
        Interactor argument 1 (of type CHEMICAL)
        Interactor argument 2 (of type GENE)

    Each line contains one relation, and each relation is identified by the PMID, the relation type and the two related entities. In the example below, to find the entities involved in the first relation, you must find the entities with Term Identifiers T1 and T52 within the PMID 12488248.
    Example DrugProt relation annotations:
        12488248    INHIBITOR                 Arg1:T1     Arg2:T52
        12488248    INHIBITOR                 Arg1:T2     Arg2:T52
        23220562    ACTIVATOR                 Arg1:T12    Arg2:T42
        23220562    ACTIVATOR                 Arg1:T12    Arg2:T43
        23220562    INDIRECT-DOWNREGULATOR    Arg1:T1     Arg2:T14

    Please cite:
        @inproceedings{krallinger2017overview, title={Overview of the BioCreative VI chemical-protein interaction Track}, author={Krallinger, Martin and Rabal, Obdulia and Akhondi, Saber A and P{\'e}rez, Mart{\i}n P{\'e}rez and Santamar{\'\i}a, Jes{\'u}s and Rodr{\'\i}guez, Gael P{\'e}rez and others}, booktitle={Proceedings of the sixth BioCreative challenge evaluation workshop}, volume={1}, pages={141--146}, year={2017}}

    Summary statistics:
                               Training set    Development set
        Documents                      3500                750
        Tokens                      1001168             199620
        Annotated Entities            89529              18858
        Annotated Relations           17288               3765

    Annotated Entities:
                                      Training Entities    Development Entities
        CHEMICAL                                  46274                    9853
        GENE-Y [Normalizable]                     28421                       -
        GENE-N [Non-Normalizable]                 14834                       -
        Gene Total (N+Y)                          43255                    9005
        Total                                     89529                   18858

    Annotated Relations:
                                   Training Relations    Development Relations
        INDIRECT-DOWNREGULATOR                   1330                      332
        INDIRECT-UPREGULATOR                     1379                      302
        DIRECT-REGULATOR                         2250                      458
        ACTIVATOR                                1429                      246
        INHIBITOR                                5392                     1152
        AGONIST                                   659                      131
        AGONIST-ACTIVATOR                          29                       10
        AGONIST-INHIBITOR                          13                        2
        ANTAGONIST                                972                      218
        PRODUCT-OF                                921                      158
        SUBSTRATE                                2003                      495
        SUBSTRATE_PRODUCT-OF                       25                        3
        PART-OF                                   886                      258
        Total                                   17288                     3765

    For further information, please visit https://biocreative.bioinformatics.udel.edu/tasks/biocreative-vii/track-1/ or email us at [email protected] and [email protected]

    Related resources:
        Web
        Evaluation library
        Relation annotation guidelines
        Gene and protein annotation guidelines
        Chemicals and drugs annotation guidelines
        FAQ

    The DrugProt corpus is promoted by the Plan de Impulso de las Tecnologías del Lenguaje de la Agenda Digital (Plan TL).
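    The relation format described in this record, where each argument points at a term identifier within the same PMID, might be read along the lines of the sketch below. The relation lines are copied from the example in this record, assuming tab separators. `Relation`, `parse_relation`, `resolve` and the `entity_index` dictionary are hypothetical names, and the entity entries are placeholders, since this listing does not show the entity annotations for these PMIDs.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    pmid: str
    rel_type: str  # DrugProt relation class, e.g. "INHIBITOR"
    arg1: str      # term id of the CHEMICAL argument, e.g. "T1"
    arg2: str      # term id of the GENE argument, e.g. "T52"


def parse_relation(line: str) -> Relation:
    """Split one tab-separated relation line into its four fields."""
    pmid, rel_type, arg1, arg2 = line.rstrip("\n").split("\t")
    # Drop the "Arg1:" / "Arg2:" prefixes to recover the bare term ids.
    return Relation(pmid, rel_type,
                    arg1.split(":", 1)[1], arg2.split(":", 1)[1])


relation_lines = [
    "12488248\tINHIBITOR\tArg1:T1\tArg2:T52",
    "12488248\tINHIBITOR\tArg1:T2\tArg2:T52",
    "23220562\tACTIVATOR\tArg1:T12\tArg2:T42",
]
relations = [parse_relation(line) for line in relation_lines]

# Hypothetical entity index keyed by (PMID, term id). The mention texts are
# placeholders; real values would come from the entity annotation file.
entity_index = {
    ("12488248", "T1"): ("CHEMICAL", "<chemical mention>"),
    ("12488248", "T52"): ("GENE", "<gene mention>"),
}


def resolve(relation: Relation, index: dict) -> tuple:
    """Look up both arguments of a relation; a missing entity yields None."""
    return (index.get((relation.pmid, relation.arg1)),
            index.get((relation.pmid, relation.arg2)))


first_chemical, first_gene = resolve(relations[0], entity_index)
```

    Keying the index by the pair (PMID, term id) mirrors the statement that a term number is only unique within its own record: T1 in one abstract is unrelated to T1 in another.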

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two types of co-citation counting: the traditional type, which counts only the first author of a cited work, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through the non-traditional counting contains more coherent author groups and is therefore considerably clearer. However, when the same number of top-ranked authors is selected and analyzed, this picture represents fewer specialties in the research field being studied than the one produced through traditional first-author co-citation counting. Reasons for these effects are discussed.
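    The difference between the two counting rules can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not the procedure used in the study: the function name `cocitation_counts`, the toy reference lists, the choice to count each author pair at most once per citing paper, and the choice to let coauthors of the same cited work count as co-cited are all assumptions made here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors):
    """Count author pairs cited together, crediting the first
    `max_authors` authors of every cited work.

    citing_papers: one reference list per citing paper; each reference
    is the cited work's author list in byline order.
    """
    counts = Counter()
    for references in citing_papers:
        credited = set()
        for authors in references:
            credited.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Toy data: two citing papers, author names are placeholders.
toy_papers = [
    [["A", "B"], ["C"]],
    [["A"], ["C", "D"]],
]

first_author_only = cocitation_counts(toy_papers, max_authors=1)
first_five = cocitation_counts(toy_papers, max_authors=5)
```

    With these toy lists, first-author counting sees only the pair (A, C), while the five-author rule also brings B and D into the picture, which is the kind of change in coverage the abstract compares.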

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
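    As a rough illustration of how the Pearson correlation and the cosine can disagree on cocitation profiles, the Python sketch below compares the two on made-up data. It shows one well-known property (appending authors with whom neither profile is cocited changes the Pearson correlation but leaves the cosine unchanged) and is not necessarily the argument developed in the paper. The function names and the profile values are assumptions chosen for the example.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two profile vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Made-up cocitation profiles of two authors over four other authors.
profile_x = [3, 1, 0, 2]
profile_y = [1, 3, 2, 0]

# The same profiles after adding four authors cocited with neither of them.
padded_x = profile_x + [0, 0, 0, 0]
padded_y = profile_y + [0, 0, 0, 0]

cosine_before, cosine_after = cosine(profile_x, profile_y), cosine(padded_x, padded_y)
pearson_before, pearson_after = pearson(profile_x, profile_y), pearson(padded_x, padded_y)
```

    In this toy case the cosine stays the same after padding, while the Pearson correlation moves from a negative to a positive value, so the outcome depends on which uninvolved authors happen to be included in the analysis.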

    Dispelling the Myths Behind First-author Citation Counts

    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how much the author rankings resulting from different citation counting methods actually differ from each other, and to demonstrate the capability of emerging data and tools on the Web to support more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and the high costs of using more realistic citation counting methods that are not well supported by the ISI databases. It is argued that the increasing availability of digital full-text research papers makes it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.

    Author Index

    Not provided

    koamabayili/VECTRON-author-checklist: VECTRON author checklist

    We have done our best to complete the author checklist relating to the use of animals in the hut study. Note that the objective of the hut study was to evaluate the IRS treatment applications for residual efficacy against Anopheles mosquitoes, including the local An. coluzzii mosquito population. Cows were only used to attract mosquitoes into the huts, and no tests were carried out directly on the cows. The author checklist is intended for use with studies where experiments are carried out on animals, which is why we have had such difficulty in completing it for the hut study, as many of the questions do not relate to how the cows were used.