Topic detection in multichannel Italian newspapers
Nowadays, people, companies and public institutions use different channels to share private or public information with other people (friends, customers, relatives, etc.) or institutions. This has changed journalism: the major newspapers report news not just on their own web sites, but also on several social media platforms such as Twitter or YouTube. The use of multiple communication media stimulates the need for integration and analysis of the content published globally and not just at the level of a single medium. An analysis is needed to achieve a comprehensive overview of the information that reaches end users and of how they consume it. This analysis should identify the main topics in the news flow and reveal the mechanisms of publication of news on different media (e.g. the news timeline). Currently, most of the work in this area is still focused on a single medium, so an analysis across different media (channels) should improve the result of topic detection. This paper shows the application of a graph analytical approach, called Keygraph, to a set of very heterogeneous documents such as the news published on various media. A preliminary evaluation on the news published in a 5-day period was able to identify the main topics within the publications of a single newspaper, and also within the publications of 20 newspapers on several on-line channels
Semantic Access to Data from the Web
There is a great amount of information available on the web, so users typically use different keyword-based web search engines to find the information they need. However, many words are polysemous and therefore the output of the search engine will include links to web pages referring to different meanings of the keywords. Besides, results with different meanings are mixed up, which makes the task of finding the relevant information difficult for the user, especially if the meanings behind the input keywords are not among the most popular on the web. In this paper, we propose a semantics-based approach to group the results returned to the user in clusters defined by the different meanings of the input keywords. Unlike other proposals, our method considers the knowledge provided by a pool of ontologies available on the Web in order to dynamically define the different categories (or clusters). Thus, it is independent of the sources providing the results that must be grouped
Using semantic techniques to access web data
Nowadays, people frequently use different keyword-based web search engines to find the information they need on the web. However, many words are polysemous and, when these words are used to query a search engine, its output usually includes links to web pages referring to their different meanings. Besides, results with different meanings are mixed up, which makes the task of finding the relevant information difficult for the users, especially if the user-intended meanings behind the input keywords are not among the most popular on the web. In this paper, we propose a set of semantic techniques to group the results provided by a traditional search engine into categories defined by the different meanings of the input keywords. Unlike other proposals, our method considers the knowledge provided by ontologies available on the web in order to dynamically define the possible categories. Thus, it is independent of the sources providing the results that must be grouped. Our experimental results show the interest of the proposal
Keyword-based Search in Data Integration Systems
In this paper we describe Keymantic, a framework for translating keyword queries into SQL queries by assuming that the only available information is the source metadata, i.e., schema and some external auxiliary information. Such a framework finds application when only intensional knowledge about the data source is available, as in Data Integration Systems
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
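The two counting schemes contrasted in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's procedure: the function name `cocitation_counts`, the toy reference lists and the author names are all hypothetical, and the sketch assumes that each author is credited at most once per citing paper and that co-authors of the same cited work also count as co-cited with each other.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations (hypothetical helper, not from the study).

    citing_papers: list of reference lists; each reference is a list of
    author names in byline order.
    max_authors: how many leading authors of each cited work to credit
    (1 = first-author counting, 5 = first-five-authors counting).
    """
    counts = Counter()
    for references in citing_papers:
        # Authors credited by this citing paper, each at most once (assumption).
        cited = set()
        for authors in references:
            cited.update(authors[:max_authors])
        # Every unordered pair of credited authors is co-cited once.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

# Invented toy data: two citing papers with short reference lists.
papers = [
    [["Adams", "Baker"], ["Chen"]],
    [["Adams", "Diaz"], ["Baker"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(first_author)
print(first_five)
```

On this toy input, first-author counting never credits Baker in the first paper or Diaz in the second, so the first-five variant yields more author pairs and a higher count for the Adams and Baker pair.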
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance. Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
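As a rough illustration of how the two named measures can behave differently on cocitation profiles, the Python sketch below computes both on invented data. The abstract does not spell out its argument, so this should not be read as the authors' reasoning: it shows only one well-known property, namely that appending authors with whom neither profile is cocited (joint zeros) leaves the cosine unchanged but shifts the Pearson correlation. The profile vectors are hypothetical.

```python
import math

def cosine(x, y):
    """Cosine of the angle between two equal-length profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson correlation, i.e. the cosine of the mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])

# Invented cocitation profiles of two authors over four other authors.
u = [5, 1, 4, 2]
v = [1, 5, 2, 4]

# The same profiles after adding six authors cocited with neither u nor v.
u_padded = u + [0] * 6
v_padded = v + [0] * 6

print(cosine(u, v), cosine(u_padded, v_padded))    # identical values
print(pearson(u, v), pearson(u_padded, v_padded))  # sign flips
```

In this constructed case the Pearson correlation moves from strongly negative to positive purely because unrelated authors were added to the analysis, while the cosine stays the same.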
Keyword search over relational databases: a metadata approach
Keyword queries offer a convenient alternative to traditional SQL in querying relational databases with large, often unknown, schemas and instances. The challenge in answering such queries is to discover their intended semantics, construct the SQL queries that describe them and use them to retrieve the respective tuples. Existing approaches typically rely on indices built a-priori on the database content. This seriously limits their applicability if a-priori access to the database content is not possible. Examples include on-line databases accessed through a web interface, or the sources in information integration systems that operate behind wrappers with specific query capabilities. Furthermore, the existing literature has not studied to its full extent the inter-dependencies across the ways the different keywords are mapped into the database values and schema elements. In this work, we describe a novel technique for translating keyword queries into SQL based on the Munkres (a.k.a. Hungarian) algorithm. Our approach not only tackles the above two limitations, but also offers significant improvements in the identification of the semantically meaningful SQL queries that describe the intended keyword query semantics. We provide details of the technique implementation and an extensive experimental evaluation
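The inter-dependency between keyword mappings mentioned here can be illustrated as an assignment problem, which is the kind of problem the Munkres algorithm solves. The Python sketch below is not the paper's technique: it swaps in plain exhaustive search over permutations, which is only practical for a handful of keywords, and it assumes a one-to-one mapping between keywords and schema terms. The function `best_assignment`, the schema terms and all scores are invented for illustration.

```python
from itertools import permutations

def best_assignment(keywords, terms, score):
    """Return the one-to-one keyword -> term mapping with the highest total score.

    Brute-force stand-in for an assignment solver such as Munkres;
    requires len(keywords) <= len(terms). score[i][j] is the (invented)
    affinity between keywords[i] and terms[j].
    """
    best_total, best_map = float("-inf"), None
    for chosen in permutations(range(len(terms)), len(keywords)):
        total = sum(score[i][j] for i, j in enumerate(chosen))
        if total > best_total:
            best_total = total
            best_map = {keywords[i]: terms[j] for i, j in enumerate(chosen)}
    return best_map, best_total

# Hypothetical keyword query and schema terms with made-up affinity scores.
keywords = ["Washington", "Lincoln"]
terms = ["Person.name", "City.name", "Street.name"]
score = [
    [0.9, 0.8, 0.3],  # Washington
    [0.8, 0.3, 0.2],  # Lincoln
]

mapping, total = best_assignment(keywords, terms, score)
print(mapping, total)
```

Taken in isolation, both keywords score highest on `Person.name`. Once the mapping must be one-to-one, the best joint choice sends "Washington" to `City.name` and "Lincoln" to `Person.name`, which is the kind of interaction a per-keyword greedy choice would miss.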
Keymantic: Semantic Keyword-based Searching in Data Integration Systems
We propose the demonstration of Keymantic, a system for keyword-based searching in relational databases that does not require a-priori knowledge of instances held in a database. It finds numerous applications in situations where traditional keyword-based searching techniques are inapplicable due to the unavailability of the database contents for the construction of the required indexes
