Modeling and Predicting Temporal Patterns of Web Content Changes
The technologies aimed at Web content discovery, retrieval and management face the compelling need of coping with its highly dynamic nature coupled with complex user interactions. This paper analyzes the temporal patterns of the content changes of three major news websites with the objective of modeling and predicting their dynamics. It has been observed that changes are characterized by a time-dependent behavior with large fluctuations and significant differences across hours and days. To explain this behavior, we represent the change patterns as time series. The trend and seasonal components of the observed time series capture the weekly and daily periodicity, whereas the irregular components take into account the remaining fluctuations. Models based on trigonometric polynomials and ARMA components accurately reproduce the dynamics of the empirical change patterns and provide extrapolations into the future to be used for forecasting.
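As an illustration of the decomposition described in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes hourly change counts covering a whole number of weeks, fits a trigonometric polynomial with one daily (24-hour) and one weekly (168-hour) harmonic, and models the irregular component with a plain AR(1) term as a simplified stand-in for the ARMA components. All function names are hypothetical.

```python
import math


def fit_harmonics(y, periods):
    """Fit a mean plus one cosine/sine pair per period.

    Assumption: len(y) is a multiple of every period, so the harmonics are
    orthogonal over the sample and these projections are the least-squares fit.
    """
    n = len(y)
    mean = sum(y) / n
    coeffs = {}
    for p in periods:
        a = 2.0 / n * sum(v * math.cos(2 * math.pi * t / p) for t, v in enumerate(y))
        b = 2.0 / n * sum(v * math.sin(2 * math.pi * t / p) for t, v in enumerate(y))
        coeffs[p] = (a, b)
    return mean, coeffs


def seasonal(t, mean, coeffs):
    """Evaluate the fitted trigonometric polynomial at time index t."""
    return mean + sum(
        a * math.cos(2 * math.pi * t / p) + b * math.sin(2 * math.pi * t / p)
        for p, (a, b) in coeffs.items()
    )


def fit_ar1(resid):
    """Estimate the AR(1) coefficient from the lag-1 autocorrelation."""
    den = sum(r * r for r in resid)
    if den == 0:
        return 0.0
    return sum(resid[i] * resid[i - 1] for i in range(1, len(resid))) / den


def forecast(y, periods, steps):
    """Extrapolate the seasonal part and add the decaying AR(1) residual."""
    mean, coeffs = fit_harmonics(y, periods)
    resid = [v - seasonal(t, mean, coeffs) for t, v in enumerate(y)]
    phi = fit_ar1(resid)
    n = len(y)
    return [
        seasonal(n + h, mean, coeffs) + phi ** (h + 1) * resid[-1]
        for h in range(steps)
    ]
```

Calling `forecast(counts, periods=(24, 168), steps=24)` on two or more full weeks of hourly counts would return a 24-hour-ahead extrapolation. A fuller treatment would add higher harmonics and fit a proper ARMA model to the residuals.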
Characterization of the evolution of a news Web site
The Web has become a ubiquitous tool for distributing knowledge and information and for conducting business. To exploit the huge potential of the Web as a global information repository, it is necessary to understand its dynamics. These issues are particularly important for news Web sites as they are expected to provide fresh information on current world events to a potentially large user population. This paper presents an experimental study aimed at characterizing and modeling the evolution of a news Web site. We focused on the MSNBC Web site as it is a good representative of its category in terms of structure, news coverage and popularity. Specifically, we analyzed how often and to what extent the content of this site changed and we identified models describing its dynamics. The study has shown that the rate of page creations and updates was characterized by some well-defined patterns that varied as a function of time of day and day of week. In contrast, the content of individual pages changed to differing extents. Most updates involved a very small fraction of their content, whereas very few were more extensive and spread over the whole page. By taking into account all these aspects, we derived analytical models able to accurately capture and reproduce the evolution of the news Web site.
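The time-of-day and day-of-week patterns mentioned in this abstract can be explored with a simple tabulation. The sketch below is an assumption about how one might do this, not the method used in the study: it takes a list of page creation or update timestamps and counts them per (weekday, hour) slot and per hour of day. The function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def change_profile(timestamps):
    """Count change events per (weekday, hour) slot; Monday is weekday 0."""
    return Counter((ts.weekday(), ts.hour) for ts in timestamps)


def hourly_rate(timestamps):
    """Average number of changes per hour of day over the days observed.

    Only days with at least one recorded change are counted as observed,
    which is a simplifying assumption of this sketch.
    """
    if not timestamps:
        return {h: 0.0 for h in range(24)}
    days = {ts.date() for ts in timestamps}
    per_hour = Counter(ts.hour for ts in timestamps)
    return {h: per_hour.get(h, 0) / len(days) for h in range(24)}
```

Comparing `hourly_rate` across weekdays and weekends is one way to see whether the update rate follows the kind of periodic pattern the abstract reports.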
