Murray-Rust, Peter

Author: Murray-Rust Peter
Publication date: 17/03/2016

Universitas Maritim Raja Ali Haji Pusat Jurnal Ilmiah

Enhancement of the chemical semantic web through the use of InChI identifiers

Author: Rzepa Henry S.
Day Nick E.
Coles Simon J.
Murray-Rust Peter
Zhang Yong
Publication date: 01/01/2005

Molecules, as defined by connectivity specified via the International Chemical Identifier (InChI), are precisely indexed by major web search engines so that Internet tools can be transparently used for unique structure searches

Southampton (e-Prints Soton)

Automatic analysis and validation of open polymer data

Publication date: 2011

A system to automatically extract, analyse, validate and model polymer data has been produced. This system is called the Polymer Informatics Knowledge System (PIKS). Methods of storing polymer data electronically are examined. The majority of data formats are only capable of representing an idealised structure of a macromolecule rather than the actual distribution of structures present in the polymer. Polymer Markup Language (PML) is the only data format capable of storing this information. A novel extension to PML, allowing copolymers produced with a depletion of reactants to be described, is introduced. Without the extension only Markov chains can be produced. An informatics analysis of Unilever data on the cleaning efficacy of polymers is performed. A representative macromolecule was produced for each polymer sample. Descriptors were calculated over these and used for machine learning to predict the cleaning efficacy. From these models a monomer was identified which was very strongly correlated with good cleaning performance. The monomer in question cannot be revealed as it is a trade secret. Polymer data from the PoLyInfo database are extracted and converted into XML. A summary of the data available in the PoLyInfo database is presented. The PIKS tools were used to automatically validate this data for internal consistency, as well as against another data source. The monomers and polymers were analysed for consistency, and CML reactions were produced for the polymerisation reactions in the database, which were also checked for consistency. The error in the structures was found to be 5.8% for the monomers, 7.3% for the polymers and 2.9% for the reactions. Some of the causes of the discrepancies are presented. The property data from the PoLyInfo database was then used for machine learning. Support Vector Regression (SVR) models of the glass transition temperature were produced both with and without the inclusion of sample characterisation data. Both methods performed similarly: the model without characterisation data produced an RMS error of 19.1 K (r^2 = 0.96), while the model with it produced an RMS error of 20.1 K (r^2 = 0.96). This means that more sample characterisation data is required than the M_w and M_w/M_n

Apollo (Cambridge)

Extraction of chemical structures and reactions from the literature

Publication date: 2013

The ever-increasing quantity of chemical literature necessitates the creation of automated techniques for extracting relevant information. This work focuses on two aspects: the conversion of chemical names to computer-readable structure representations and the extraction of chemical reactions from text. Chemical names are a common way of communicating chemical structure information. OPSIN (Open Parser for Systematic IUPAC Nomenclature), an open source, freely available algorithm for converting chemical names to structures, was developed. OPSIN employs a regular grammar to direct tokenisation and parsing, leading to the generation of an XML parse tree. Nomenclature operations are applied successively to the tree, with many requiring the manipulation of an in-memory connection table representation of the structure under construction. Areas of nomenclature supported are described, with attention being drawn to difficulties that may be encountered in name to structure conversion. Results on sets of generated names and names extracted from patents are presented. On generated names, recall of between 96.2% and 99.0% was achieved with a lower bound of 97.9% on precision, with all results being either comparable or superior to the tested commercial solutions. On the patent names, OPSIN's recall was 2-10% higher than the tested solutions when the patent names were processed as found in the patents. The uses of OPSIN as a web service and as a tool for identifying chemical names in text are shown to demonstrate the direct utility of this algorithm. A software system for extracting chemical reactions from the text of chemical patents was developed. The system relies on the output of ChemicalTagger, a tool for tagging words and identifying phrases of importance in experimental chemistry text. Improvements to this tool required to facilitate this task are documented. The structures of chemical entities are, where possible, determined using OPSIN in conjunction with a dictionary of name to structure relationships. Extracted reactions are atom mapped to confirm that they are chemically consistent. 424,621 atom mapped reactions were extracted from 65,034 organic chemistry USPTO patents. On a sample of 100 of these extracted reactions, chemical entities were identified with 96.4% recall and 88.9% precision. Quantities could be associated with reagents in 98.8% of cases and with products in 64.9% of cases, whilst the correct role was assigned to chemical entities in 91.8% of cases. Qualitatively, the system captured the essence of the reaction in 95% of cases. This system is expected to be useful in the creation of searchable databases of reactions from chemical patents and in facilitating analysis of the properties of large populations of reactions

Apollo (Cambridge)

Author Instructions

Author: Instructions Author
Publication date: 04/11/2013

Crossref

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Zhao Dangzhi
Publication date: 01/01/2005

The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
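The two counting schemes compared in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's procedure: the function name `cocitation_counts`, the `max_authors` parameter, and the toy author labels "A" to "E" are all hypothetical, each pair is counted at most once per citing paper, and co-authors of the same cited work are treated as co-cited with each other, which the abstract does not specify.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations over a set of citing papers.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names (first author first).
    max_authors: how many leading authors of each cited work to consider
    (1 = first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect every author credited under the chosen scheme.
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered pair of cited authors is co-cited once by this paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, each citing two works.
papers = [
    [["A", "B"], ["C", "D"]],
    [["C", "E"], ["B", "A"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)

print(dict(first_author))
print(dict(first_five))
```

In this toy data, first-author counting never pairs A with B, because B is first author only where A is not credited, whereas counting the first five authors links them in both citing papers.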

E-LIS

Variations on the Author

Author: Sayad Cecilia
Publication date: 01/01/2016

“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship

Crossref

Kent Academic Repository

Appropriate Similarity Measures for Author Cocitation Analysis

Author: Waltman L.R.
Eck N.J.P. van

We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.

Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
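As a rough illustration of how the Pearson correlation and the cosine can disagree on cocitation profiles, the sketch below computes both on hypothetical toy vectors. The abstract does not state the authors' specific objection to the Pearson correlation; the property shown here (sensitivity to appended zero entries) is one commonly discussed difference and is used as an assumption. The function names and the profiles `u`, `v` are illustrative only.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles: counts of two authors with four other authors.
u = [3, 1, 0, 0]
v = [0, 0, 1, 3]

# Same profiles after adding four authors who are cocited with neither.
u_padded = u + [0, 0, 0, 0]
v_padded = v + [0, 0, 0, 0]

print(cosine(u, v), cosine(u_padded, v_padded))
print(pearson(u, v), pearson(u_padded, v_padded))
```

In this toy case the cosine is unaffected by the extra zero entries, while the Pearson correlation moves toward zero because the means of both profiles shift.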

Research Papers in Economics

Journal publishing and author self-archiving : peaceful co-existence and fruitful collaboration

Author: Wilks Yorick
Oppenheim Charles
Law Derek
Murray-Rust Peter
Berners-Lee Tim
Harnad Stevan
Shadbolt Nigel
De Roure Dave
Publication date: 22/08/2005

The UK Research Funding Councils (RCUK) have proposed that all RCUK fundees should self-archive on the web, free for all, their own final drafts of journal articles reporting their RCUK-funded research, in order to maximise their usage and impact. ALPSP (a learned publishers' association) now seeks to delay and block the RCUK proposal, arguing that it will ruin journals. All objective evidence from the past decade and a half of self-archiving, however, shows that self-archiving can and does co-exist peacefully with journals while greatly enhancing both author/article and journal impact, to the benefit of both. Journal publishers should not be trying to delay and block self-archiving policy; they should be collaborating with the research community on ways to share its vast benefits

University of Strathclyde Institutional Repository

Dispelling the Myths Behind First-author Citation Counts

Author: Zhao Dangzhi
Publication date: 01/01/2006

We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods

E-LIS