Study of the applicability of an itemset-based portfolio planner in a multi-market context
Planning stock portfolios for long-term investments is a well-known financial problem. Many data mining and machine learning strategies have been proposed to automatically predict the set of uncorrelated stocks that maximizes long-term portfolio returns. Among others, the use of scalable itemset-based strategies has recently been studied. These strategies can potentially analyze large sets of historical prices corresponding to thousands of stocks in the worldwide market indexes. However, current studies are still limited to single markets. This paper investigates the applicability of itemset-based strategies for planning stock portfolios in a multi-market context. Scaling the analyses towards multi-market scenarios raises several research questions, including the choice of the diversification strategy, the influence of inter-market correlations among stock prices, and the profitability of multi-market strategies compared to single-market ones. This paper aims to answer these questions by considering a state-of-the-art itemset-based approach. The experimental results show that itemset-based strategies focus the generated portfolios on the outperforming markets. Furthermore, the performance of multi-market strategies with sector-based diversification is on average superior or comparable to that of single-market ones.
Infrequent Weighted Itemset Mining Using Frequent Pattern Growth
Frequent weighted itemsets represent correlations that frequently hold in data in which items may be weighted differently. However, in some contexts, e.g., when the need is to minimize a certain cost function, discovering rare data correlations is more interesting than mining frequent ones. This paper tackles the issue of discovering rare and weighted itemsets, i.e., the infrequent weighted itemset (IWI) mining problem. Two novel quality measures are proposed to drive the IWI mining process. Furthermore, two algorithms that perform IWI and Minimal IWI mining efficiently, driven by the proposed measures, are presented. Experimental results show the efficiency and effectiveness of the proposed approach.
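To make the IWI mining problem concrete, the following is a minimal brute-force sketch in Python, not the FP-growth-based algorithms the abstract refers to. The abstract does not define the two quality measures, so the sketch assumes one plausible measure: the weight of an itemset in a transaction is the minimum (or, via `agg=max`, the maximum) of its item weights, and its support is the sum of those weights over the transactions containing it. All function names, the threshold, and the sample data are hypothetical.

```python
from itertools import combinations


def weighted_support(itemset, transactions, agg=min):
    """Sum the aggregated item weights over transactions containing the itemset.

    Each transaction is a dict mapping item -> weight. `agg` is an assumed
    weighting function (min by default), not taken from the abstract.
    """
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += agg(t[item] for item in itemset)
    return total


def infrequent_weighted_itemsets(transactions, max_support, agg=min):
    """Enumerate every itemset whose weighted support is <= max_support.

    Exhaustive enumeration: exponential in the number of items, so this is
    only usable on toy data. Itemsets that never occur have support 0 and
    are therefore reported as infrequent.
    """
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            s = weighted_support(itemset, transactions, agg)
            if s <= max_support:
                result[itemset] = s
    return result


def minimal_only(iwis):
    """Keep the infrequent itemsets that have no infrequent proper subset."""
    return {
        s: v for s, v in iwis.items()
        if not any(set(o) < set(s) for o in iwis)
    }


# Hypothetical toy data: three weighted transactions.
transactions = [
    {"a": 1, "b": 5},
    {"a": 2, "c": 4},
    {"b": 3, "c": 1},
]
iwis = infrequent_weighted_itemsets(transactions, max_support=3)
print(minimal_only(iwis))
```

Restricting the output to minimal itemsets keeps the result small, because every superset of an infrequent itemset under the min-based measure is infrequent as well.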
ViGEO: an Assessment of Vision GNNs in Earth Observation
Satellite missions and Earth Observation (EO) systems are fundamental assets for environmental monitoring, the timely identification of catastrophic events, and the long-term monitoring of both natural resources and human-made assets, such as vegetation, water bodies, and forests as well as buildings. Different EO missions, such as MODIS, Sentinel-1 and Sentinel-2, enable the collection of information on several spectral bands. Given the recent advances in machine learning and computer vision and the availability of labeled data, researchers have demonstrated the feasibility and precision of land-use monitoring systems and remote sensing image classification through the use of deep neural networks. Such systems may help domain experts and governments in constant environmental monitoring, enabling timely intervention in case of catastrophic events (e.g., a forest wildfire in a remote area). Despite the recent advances in the field of computer vision, many works limit their analysis to Convolutional Neural Networks (CNNs) and, more recently, to vision transformers (ViTs). Given the recent successes of Graph Neural Networks (GNNs) on non-graph data, such as time series and images, we investigate the performance of a recent Vision GNN architecture (ViG) applied to the task of land cover classification. The experimental results show that ViG achieves state-of-the-art performance in multiclass and multilabel classification contexts, surpassing both ViT and ResNet on large-scale benchmarks.
QuakeSet: A Dataset and Low-Resource Models to Monitor Earthquakes through Sentinel-1
Earthquake monitoring is necessary to promptly identify the affected areas, the severity of the events, and, finally, to estimate damages and plan the actions needed for the restoration process. The use of seismic stations to monitor the strength and origin of earthquakes is limited when dealing with remote areas (we cannot have global capillary coverage). Identification and analysis of all affected areas is mandatory to support areas not monitored by traditional stations. Using social media images in crisis management has proven effective in various situations. However, they are still limited by the possibility of using communication infrastructures in case of an earthquake and by the presence of people in the area. Moreover, social media images and messages cannot be used to estimate the actual severity of earthquakes and their characteristics effectively. The employment of satellites to monitor changes around the globe grants the possibility of exploiting instrumentation that is not limited by the visible spectrum, the presence of land infrastructures, and people in the affected areas. In this work, we propose a new dataset composed of images taken from Sentinel-1 and a new series of tasks to help monitor earthquakes from a new detailed view. Coupled with the data, we provide a series of traditional machine learning and deep learning models as baselines to assess the effectiveness of ML-based models in earthquake analysis.
Identifying collaborations among researchers: a pattern-based approach
In recent years, a huge number of publications and scientific reports have become available through digital libraries and online databases. Digital libraries commonly provide advanced search interfaces, through which researchers can find and explore the most relevant scientific studies. Even though the publications of a single author can be easily retrieved and explored, understanding how authors have collaborated with each other on specific research topics, and to what extent their collaboration has been fruitful, is in general a challenging task. This paper proposes a new pattern-based approach to analyzing the correlations among the authors of the most influential research studies. To this end, it analyzes publication data retrieved from digital libraries and online databases by means of an itemset-based data mining algorithm. It automatically extracts patterns representing the most relevant collaborations among authors on specific research topics. Patterns are evaluated and ranked according to the number of citations received by the corresponding publications. The proposed approach was validated in a real case study, i.e., the analysis of scientific literature on genomics. Specifically, we first analyzed scientific studies on genomics acquired from the OMIM database to discover correlations between authors and genes or genetic disorders. Then, the reliability of the discovered patterns was assessed using the PubMed search engine. The results show that, for the majority of the mined patterns, the most influential (top-ranked) studies retrieved by performing author-driven PubMed queries cover the same gene or genetic disorder indicated by the top-ranked pattern.
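The pattern extraction and ranking step described above can be illustrated with a small Python sketch. It is only an approximation under stated assumptions: each publication is treated as a transaction holding its authors plus one topic, co-author groups are counted by plain enumeration rather than by the itemset mining algorithm used in the paper, and patterns are ranked by the summed citations of their supporting publications. The function name, the record fields, the thresholds, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def rank_collaborations(publications, min_papers=2, max_size=3):
    """Return (topic, authors, n_papers, citations) tuples, most cited first.

    A pattern is a group of 2..max_size co-authors on one topic. It is kept
    only if at least `min_papers` publications support it (an assumed
    minimum-support threshold).
    """
    stats = defaultdict(lambda: [0, 0])  # (topic, authors) -> [papers, citations]
    for pub in publications:
        authors = sorted(set(pub["authors"]))
        for k in range(2, min(max_size, len(authors)) + 1):
            for group in combinations(authors, k):
                entry = stats[(pub["topic"], group)]
                entry[0] += 1
                entry[1] += pub["citations"]
    patterns = [
        (topic, group, n, cites)
        for (topic, group), (n, cites) in stats.items()
        if n >= min_papers
    ]
    # Rank by total citations, breaking ties by topic and author names.
    patterns.sort(key=lambda p: (-p[3], p[0], p[1]))
    return patterns


# Hypothetical publication records, not taken from OMIM or PubMed.
publications = [
    {"authors": ["Rossi", "Bianchi"], "topic": "topic_x", "citations": 120},
    {"authors": ["Rossi", "Bianchi", "Verdi"], "topic": "topic_x", "citations": 30},
    {"authors": ["Verdi", "Neri"], "topic": "topic_y", "citations": 200},
    {"authors": ["Verdi", "Neri"], "topic": "topic_y", "citations": 15},
    {"authors": ["Rossi", "Neri"], "topic": "topic_y", "citations": 500},
]
for pattern in rank_collaborations(publications):
    print(pattern)
```

In this toy data, the single highly cited paper by Rossi and Neri does not produce a pattern, because one publication falls below the assumed support threshold of two.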
