Post-processing of association rules
The growing demand for methods of knowledge discovery and analysis in large databases has continuously strengthened research in data mining. Association Rules are among the tasks associated with this area. Several algorithms have been proposed for mining Association Rules, but they generally produce a very large number of rules, which makes the knowledge post-processing phase complex and challenging. Several measures can support this evaluation phase, but an intuitive method for ranking and selecting rules is lacking. Moreover, specific methodologies for selecting rules while considering more than one measure simultaneously could not be found. This thesis proposes, develops, and implements a post-processing methodology for Association Rules. In the proposed methodology, small groups of rules identified as potentially interesting are presented to the expert user for evaluation. To this end, methods and techniques for knowledge post-processing, objective measures for evaluating Association Rules, and rule-generating algorithms were analyzed. From this perspective, experiments were carried out to identify the potential of these measures as filters for Association Rules. A graphical evaluation supported the study of the measures and the specification of the proposed methodology. The novel aspect of the methodology is the use of the Pareto method and the combination of measures to select Association Rules. Finally, an environment for evaluating Association Rules, named ARInE, was implemented, making the proposed methodology usable in practice.
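The idea of selecting rules on several measures at once can be illustrated with a Pareto (non-dominated) filter. The sketch below is only an illustration, not the ARInE implementation: the abstract does not say which measures are combined, so the use of support and confidence, the `Rule` type, and the sample values are all assumptions.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    # Hypothetical rule record; support and confidence are assumed measures.
    name: str
    support: float
    confidence: float


def dominates(a: Rule, b: Rule) -> bool:
    """True if a is at least as good as b on both measures and better on one."""
    at_least_as_good = a.support >= b.support and a.confidence >= b.confidence
    strictly_better = a.support > b.support or a.confidence > b.confidence
    return at_least_as_good and strictly_better


def pareto_front(rules: List[Rule]) -> List[Rule]:
    """Keep the rules that no other rule dominates."""
    return [r for r in rules if not any(dominates(o, r) for o in rules)]


# Made-up example values, for illustration only.
rules = [
    Rule("A -> B", 0.30, 0.60),
    Rule("A -> C", 0.10, 0.90),
    Rule("B -> C", 0.25, 0.55),
    Rule("C -> D", 0.05, 0.40),
]
front = pareto_front(rules)
print([r.name for r in front])
```

Here "B -> C" and "C -> D" are dropped because "A -> B" is better on both measures, while "A -> B" and "A -> C" survive because each wins on one measure. The small surviving group is the kind of set that could then be shown to an expert.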
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
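The difference between the two counting types can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's procedure: the data layout, the function name `cocitation_counts`, and the sample names are hypothetical, and the sketch assumes that each citing paper contributes at most one count per author pair and that co-authors of the same cited work count as co-cited.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors):
    """Count author pairs cited together, using the first max_authors of each work.

    reference_lists: one entry per citing paper; each entry is a list of
    cited works, and each cited work is an ordered list of author names.
    """
    counts = Counter()
    for refs in reference_lists:
        authors = set()
        for work_authors in refs:
            authors.update(work_authors[:max_authors])
        # Each citing paper adds one count per unordered author pair.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


# Made-up citing papers, for illustration only.
papers = [
    [["Smith", "Lee"], ["Garcia"]],
    [["Smith", "Lee"], ["Chen", "Garcia"]],
]
first_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(first_only)
print(first_five)
```

With first-author counting, Garcia is linked to Smith only through the first paper, because Garcia is a second author in the other cited work. Counting up to five authors picks up that second link and also brings Lee into the picture.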
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
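For readers unfamiliar with the two named measures, the sketch below computes the cosine and the Pearson correlation for two cocitation profiles. It illustrates one well-known property (padding both profiles with shared zeros leaves the cosine unchanged but changes the Pearson correlation); whether this is the argument the paper makes is not stated in the abstract, and the profile values are made up.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors with four other authors.
x = [3, 1, 0, 2]
y = [1, 3, 2, 0]

# The same profiles after adding four authors with whom neither is cocited.
x_padded = x + [0, 0, 0, 0]
y_padded = y + [0, 0, 0, 0]

print(cosine(x, y), cosine(x_padded, y_padded))
print(pearson(x, y), pearson(x_padded, y_padded))
```

In this example the Pearson correlation changes sign once the shared zeros are added, while the cosine stays the same.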
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.
koamabayili/VECTRON-author-checklist: VECTRON author checklist
We have done our best to complete the author checklist relating to the use of animals in the hut study. Note that the objective for the hut study was to evaluate the IRS treatment applications for residual efficacy against Anopheles mosquitoes, including the local An. coluzzii mosquito population. Cows were only used to attract mosquitoes into the huts and no tests were carried out directly on the cows. The author checklist is intended for use with studies where experiments are carried out on animals, which is why we have had such difficulty in completing this for the hut study, as many of the questions do not relate to how the cows were used
Possibilities in artificial intelligence in the detection of patterns and prediction of accidents on highways
This thesis investigates possibilities in Artificial Intelligence for detecting patterns and predicting accidents on highways. To this end, different machine learning techniques based on clustering, classification, and link prediction are evaluated, using georeferenced accident data modeled as clustering structures, trees (CART), and networks (artificial neural networks, Bayesian networks, and complex networks). The results revealed that approaches based on complex networks detected more robust cluster structures than traditional clustering techniques, since they take neighborhood concepts into account in the topological structure of the data. Clustering the data reduces the heterogeneity of the databases and yields decision rules (CART) with a higher probability of occurrence and a higher overall accuracy rate. Supervised classification of accident severity, using artificial neural networks and Bayesian networks, made it possible to identify simultaneously the factors contributing to accidents, whether associated with the driver, the road infrastructure variables, or the environmental conditions. However, because it considers the weight of the variables used in the modeling process, classification by Bayesian networks tends to be more realistic and less sensitive to overfitting. Regarding accident prediction, link prediction on bipartite complex networks revealed a high correlation between predicted and observed accidents for a given period. The proposed approach is flexible with respect to the number of variables required for modeling, which allows a wide variety of studies. However, the "simplified modeling", formed by the variables recommended by the Highway Safety Manual (HSM) of the American Association of State Highway and Transportation Officials (AASHTO), yields more precise and accurate predictions, since it considers fundamentally road infrastructure variables, whereas the "general modeling" also considers environmental variables, which vary more in time and space. Up to the present moment, the proposed method is the most suitable for explaining the behavior and dynamic aspects of highway environments. However, the proposed approach was limited by the amount of data explored, as well as by anomalies resulting from construction work on the stretch of highway under analysis. Moreover, it can be applied to problems of different scales and to various case studies. Therefore, by modeling georeferenced bipartite networks, it is possible not only to predict accidents on highways but also to assess the variation in road accident rates and the safety and performance levels of a highway.
