Precipitation nowcasting with generative diffusion models
In recent years, traditional numerical methods for accurate weather prediction have been increasingly challenged by deep learning methods. Numerous historical datasets used for short- and medium-range weather forecasts are typically organized into a regular spatial grid structure. This arrangement closely resembles images: each weather variable can be visualized as a map or, when considering the temporal axis, as a video. Several classes of generative models, including Generative Adversarial Networks, Variational Autoencoders, and the more recent Denoising Diffusion Models, have largely proved their applicability to the next-frame prediction problem, and it is thus natural to test their performance on weather prediction benchmarks. Diffusion models are particularly appealing in this context, due to the intrinsically probabilistic nature of weather forecasting: what we are really interested in modeling is the probability distribution of weather indicators, whose expected value is the most likely prediction. In our study, we focus on a specific subset of the ERA-5 dataset, which includes hourly data pertaining to Central Europe from the years 2016 to 2021. Within this context, we examine the efficacy of diffusion models in handling the task of precipitation nowcasting, with a lead time of 1 to 3 hours. Our work is conducted in comparison to the performance of well-established U-Net models, as documented in the existing literature. An additional comparative analysis has been carried out against the forecasting capabilities of the CERRA system, part of the Copernicus Climate Change Service. The novelty of our approach, Generative Ensemble Diffusion (GED), lies in its use of a diffusion model to generate a diverse set of possible weather scenarios. These scenarios are then amalgamated into a single prediction in a post-processing phase.
This approach mimics the usual weather forecasting technique of running an ensemble of numerical simulations under slightly different initial conditions, exploiting instead the intrinsic stochasticity of the generative model. In comparison to recent deep learning models addressing the same problem, our approach results in approximately a 25% reduction in the mean squared error. Reverse diffusion, a core concept in our GED approach, is particularly relevant to weather forecasting. In the context of diffusion models, reverse diffusion refers to the process of iteratively refining a noisy initial prediction into a coherent and realistic forecast. By leveraging reverse diffusion, our model effectively simulates the complex temporal dynamics of weather systems, mirroring the inherent uncertainty and variability in weather patterns.
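The ensemble idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_sampler` is a hypothetical stand-in for one stochastic run of a generative model (it is not a diffusion model), the ensemble size of 15 is arbitrary, and the pixel-wise mean is just one possible way to merge the members, since the abstract does not specify the post-processing step.

```python
import random
from statistics import fmean


def toy_sampler(last_frame, rng):
    """Hypothetical stand-in for one stochastic generative run.

    Returns the last observed frame plus Gaussian noise, clipped at zero
    because precipitation cannot be negative. NOT a real diffusion model.
    """
    return [[max(0.0, v + rng.gauss(0.0, 0.1)) for v in row] for row in last_frame]


def ensemble_forecast(last_frame, n_members=15, seed=0):
    """Draw n_members stochastic forecasts and merge them by pixel-wise mean.

    The mean is an assumed merging rule chosen for illustration.
    """
    rng = random.Random(seed)
    members = [toy_sampler(last_frame, rng) for _ in range(n_members)]
    rows, cols = len(last_frame), len(last_frame[0])
    return [
        [fmean(m[i][j] for m in members) for j in range(cols)]
        for i in range(rows)
    ]


# A tiny 2x2 "precipitation map" used as the conditioning frame.
frame = [[0.0, 0.5], [1.0, 2.0]]
forecast = ensemble_forecast(frame)
```

The point of the design is that diversity comes from the sampler's own randomness rather than from perturbing initial conditions, which is the contrast with numerical ensembles drawn in the abstract.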
MAMKit: A Comprehensive Multimodal Argument Mining Toolkit
Multimodal Argument Mining (MAM) is a recent area of research aiming to extend argument analysis and improve discourse understanding by incorporating multiple modalities. Initial results confirm the importance of paralinguistic cues in this field. However, the research community still lacks a comprehensive platform where results can be easily reproduced, and methods and models can be stored, compared, and tested against a variety of benchmarks. To address these challenges, we propose MAMKit, an open, publicly available PyTorch toolkit that consolidates datasets and models, providing a standardized platform for experimentation. MAMKit also includes some new baselines, designed to stimulate research on text and audio encoding and fusion for MAM tasks. Our initial results with MAMKit indicate that advancements in MAM require novel annotation processes to encompass auditory cues effectively.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first five authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
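The difference between the two counting schemes can be made concrete with a minimal sketch. The data layout (each citing paper as a list of cited works, each cited work as an ordered author list), the function name `cocitation_counts`, and the sample names are all assumptions made for illustration. The sketch also counts each author pair at most once per citing paper and treats co-authors of the same cited work as co-cited; the study may define either detail differently.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often two authors are cited together by the same paper.

    citing_papers: list of reference lists; each reference is an ordered
    list of author names (hypothetical layout).
    max_authors: how many leading authors of each cited work get credit
    (1 = traditional first-author counting, 5 = first-five counting).
    """
    counts = Counter()
    for references in citing_papers:
        credited = set()
        for authors in references:
            credited.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


papers = [
    [["Smith", "Lee"], ["Jones", "Lee"]],
    [["Smith"], ["Kim", "Jones"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

On this toy input, first-author counting yields two author pairs, while first-five counting yields five and raises the Jones and Smith count from 1 to 2, because Jones is a second author in the second citing paper's reference list.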
Deep Learning Models for Downscaling of Metereological Variables
Reanalysis datasets play an important role in meteorological and climate research, offering a consistent and long-term record of atmospheric conditions by assimilating past observations with modern forecast models. These datasets are of great utility in various applications, including weather forecasting, climate change research, renewable energy prediction, resource management, air quality risk assessment, and the forecasting of rare climatic events. Among the most prominent reanalysis datasets is the Copernicus Regional Reanalysis for Europe (CERRA), which stands out due to its high-resolution coverage of the European domain. CERRA has demonstrated significant utility across multiple climate-related tasks, providing detailed insights that are essential for precise and localized studies. Despite its advantages, the availability of CERRA lags two years behind the current date, primarily due to the intensive computational demands and the complexities involved in acquiring the necessary external data. To address this temporal gap, this thesis proposes a novel method employing several deep neural models to approximate CERRA downscaling in a data-driven manner without the need for additional external information other than ERA5. By leveraging the lower resolution ERA5 dataset, this research frames the problem as a super-resolution task. The study focuses on downscaling wind speed data over Italy, utilising a model trained on existing and freely available data. The results are encouraging, as the model produces outputs closely resembling the original CERRA data, with validation against in-situ observations confirming its accuracy in approximating ground measurements.
This innovative approach not only demonstrates the potential of deep neural models in overcoming the computational and data acquisition constraints associated with high-resolution reanalysis datasets but also offers a viable solution to improve the timeliness and accessibility of such data.
Automatic Generation of a Knowledge Base with Application to the Sustainable Development Goals
This thesis studies the use of web crawling, web scraping, and Natural Language Processing techniques to automatically build a dataset of documents and a knowledge base of verb-object pairs that can be used for text classification.
After a brief introduction to the techniques used, the generation method is presented, first in a theoretical form that generalizes to any classification based on a set of topics, and then specifically through a case study: the SDG Detector software. In particular, the latter concerns the practical application of the method to build a collection of information useful for classifying documents according to the presence of one or more Sustainable Development Goals.
The classification part is handled by the co-author of this application; the present thesis instead focuses on an analysis of correctness and performance based on the expansion of the dataset and of the resulting knowledge base.
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
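One well-known way in which the two measures differ can be shown with a small sketch. It is offered as an illustration only and is not necessarily the argument made in the paper: the profiles `a` and `b` are made-up cocitation counts, and the padding with zeros stands for adding authors with whom neither profiled author is ever cocited.

```python
import math


def cosine(x, y):
    """Cosine similarity between two equal-length numeric profiles."""
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sd_x = math.sqrt(sum((p - mean_x) ** 2 for p in x))
    sd_y = math.sqrt(sum((q - mean_y) ** 2 for q in y))
    return cov / (sd_x * sd_y)


# Made-up cocitation profiles of two authors over four other authors.
a = [3, 0, 2, 1]
b = [1, 2, 0, 3]

# The same profiles after adding six authors cocited with neither.
padded_a = a + [0] * 6
padded_b = b + [0] * 6
```

With these numbers the cosine is unchanged by the padding, while the Pearson correlation moves from -0.6 to about 0.23, so the apparent relationship between the two authors flips sign merely because unrelated authors were added to the analysis.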
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well supported by the ISI databases. It is argued that increasingly available digital full-text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.
