Web log compression
Title from the header. Bibliography: p. 424. Also available in print. ABSTRACT: Web log files record client activity on a particular server, usually as one-line "hits" containing information such as the client's IP address, the date and time of access, the requested file or query, and the download size in bytes. Web logs of popular sites may grow by hundreds of megabytes a day, or even more. Archiving old logs is nevertheless advisable, mainly so that they can be analyzed further, e.g. to detect attacks or other server abuse patterns. In this work we present a specialized, compression-oriented lossless preprocessor for Apache web logs and test it in combination with several popular general-purpose compressors. The test results show that the proposed transform improves the compression efficiency of general-purpose compressors on average by 65% (i.e. 3-fold) in the case of gzip and by 52% in the case of bzip2. KEYWORDS: lossless compression, database compression, web logs, Apache
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
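The difference between the two counting schemes can be illustrated with a small sketch. It is not the study's procedure: the function name `cocitation_counts`, the toy reference lists, and the simplification that two authors are co-cited whenever both appear among the counted authors of one citing paper's references (including co-authors of the same cited work) are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors):
    """Count author co-citations over a set of citing papers.

    reference_lists: one entry per citing paper; each entry is a list of
        cited works, and each cited work is an ordered list of author names.
    max_authors: how many leading authors of each cited work are counted
        (1 = traditional first-author counting, 5 = first-five counting).
    """
    counts = Counter()
    for refs in reference_lists:
        # Collect the counted authors cited by this one paper.
        cited_authors = set()
        for authors in refs:
            cited_authors.update(authors[:max_authors])
        # Every unordered pair of those authors is co-cited once by this paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: two citing papers and their reference lists.
papers = [
    [["Smith", "Lee"], ["Jones", "Kim", "Park"]],
    [["Smith", "Lee"], ["Kim"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With first-author counting, Lee never enters the data at all, whereas first-five counting records Lee as co-cited with Smith and Kim in both papers.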
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often appears as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance. KEYWORDS: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
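One property often raised in this discussion is that the Pearson correlation between two cocitation profiles changes when authors who are cocited with neither of them are added to the data set, whereas the cosine does not. Whether this is the specific argument made in the paper is not stated in the abstract, so the following is only a sketch under that assumption; the profile vectors are invented and the two other measures mentioned in the abstract are not shown.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two cocitation profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation between two cocitation profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical cocitation profiles of two authors with four other authors.
a = [3, 0, 2, 1]
b = [1, 2, 0, 3]

# Extend the data set with six authors who are cocited with neither of them.
a_padded = a + [0] * 6
b_padded = b + [0] * 6

print(cosine(a, b), cosine(a_padded, b_padded))    # identical values
print(pearson(a, b), pearson(a_padded, b_padded))  # sign flips from negative to positive
```

In this toy case the cosine is unaffected by the added zeros, while the Pearson correlation moves from negative to positive even though nothing about the relationship between the two authors has changed.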
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.
Sequential and Parallel Algorithms for Finding Subsequences (Sekwencyjne i równoległe algorytmy znajdowania podciągów)
The first part of this work concerns the longest increasing subsequence (LIS) problem and its variants (a subsequence is obtained from a sequence by removing zero or more symbols). The problem has applications in bioinformatics, e.g., in sequence alignment and in searching for new genes. The first variant considered in this work is the problem of longest increasing subsequences that are extremal in some respect. The next variant is the slope-constrained longest increasing subsequence problem. The last two variants discussed are the longest increasing cyclic subsequence problem (LICS) and the longest increasing subsequence in a sliding window problem (LISW). The algorithms for these last two problems use a cover representation of the sequence, and original cover-merging algorithms are crucial to them.
The second part of this work concerns the longest common subsequence (LCS) problem and its variants. These problems have numerous applications, mainly in sequence comparison. For the transposition-invariant LCS problem (LCTS), several sequential algorithms were proposed; experiments show that they are much faster than the existing algorithms. For the constrained LCS problem (CLCS), several sequential algorithms were also proposed, and they are faster than the known algorithms; moreover, the first bit-parallel algorithm for the CLCS problem was given. For the merged LCS problem (MerLCS), a bit-parallel algorithm tens of times faster than the existing algorithms was proposed. For the LCS, LCTS, and CLCS problems, algorithms for graphics processors were also developed. All the proposed algorithms were analysed, and their worst-case time and space complexities were determined; for some algorithms the average case was analysed as well. The resulting time complexities show that the proposed algorithms are usually also faster than the existing ones in the asymptotic sense.
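To give a flavour of what "bit-parallel" means in this context, here is a minimal sketch of a well-known bit-vector formulation for the length of the plain LCS. It is not one of the thesis's algorithms for CLCS or MerLCS, which the abstract does not describe; the function name `lcs_length_bitparallel` is an assumption, and Python's arbitrary-precision integers stand in for machine words.

```python
def lcs_length_bitparallel(a, b):
    """Length of the longest common subsequence of a and b.

    One bit per position of `a` is kept in the integer V; each symbol of `b`
    updates all positions at once with a few word-level operations.
    """
    m = len(a)
    if m == 0:
        return 0
    mask = (1 << m) - 1

    # match[c] has bit i set exactly when a[i] == c.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    v = mask  # all ones: no matches recorded yet
    for ch in b:
        u = v & match.get(ch, 0)
        # Addition propagates a carry through runs of ones; the result is
        # truncated to m bits to emulate a fixed-width bit vector.
        v = ((v + u) | (v - u)) & mask

    # Each zero bit left in V corresponds to one unit of LCS length.
    return m - bin(v).count("1")


print(lcs_length_bitparallel("AGCAT", "GAC"))
```

The inner loop does a constant number of integer operations per symbol of `b`, which is where the speed-up over the cell-by-cell dynamic programming table comes from when the bit vector fits in a few machine words.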
A Cover-Merging-Based Algorithm for the Longest Increasing Subsequence in a Sliding Window Problem
The longest increasing subsequence (LIS) problem is a well-known combinatorial problem with applications mainly in bioinformatics, where it is used in various projects on DNA sequences. Recently, a number of generalisations of this problem have been proposed. One of them is to find an LIS among all fixed-size windows of the input sequence (LISW). We propose an algorithm for the LISW problem, based on a cover representation of the sequence, that outperforms the existing methods for some classes of input sequences.
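The LISW problem statement can be made concrete with a naive baseline that recomputes the LIS length in every window. This is deliberately not the cover-merging algorithm of the paper, whose details the abstract does not give; the function names are assumptions, and only the length (not the subsequence itself) is returned.

```python
from bisect import bisect_left


def lis_length(seq):
    """Length of the longest strictly increasing subsequence of seq."""
    # tails[k] is the smallest possible last element of an increasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for x in seq:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def lisw_length(seq, w):
    """Largest LIS length over all windows of w consecutive elements."""
    if w <= 0:
        raise ValueError("window size must be positive")
    if len(seq) <= w:
        return lis_length(seq)
    return max(lis_length(seq[i:i + w]) for i in range(len(seq) - w + 1))


data = [3, 1, 4, 1, 5, 9, 2, 6]
print(lisw_length(data, 4))  # best window, e.g. [1, 4, 1, 5] with LIS 1, 4, 5
```

This baseline costs O(n · w log w) time because every window is processed from scratch; reusing work between overlapping windows is the kind of saving a specialised LISW algorithm aims for.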
koamabayili/VECTRON-author-checklist: VECTRON author checklist
We have done our best to complete the author checklist relating to the use of animals in the hut study. Note that the objective for the hut study was to evaluate the IRS treatment applications for residual efficacy against Anopheles mosquitoes, including the local An. coluzzii mosquito population. Cows were only used to attract mosquitoes into the huts and no tests were carried out directly on the cows. The author checklist is intended for use with studies where experiments are carried out on animals, which is why we have had such difficulty in completing this for the hut study, as many of the questions do not relate to how the cows were used
