Revisiting Volgenant-Jonker for Approximating Graph Edit Distance
Although it is agreed that the Volgenant-Jonker (VJ) algorithm provides a fast way to approximate graph edit distance (GED), until now nobody has reported how the VJ algorithm can be tuned for this task. To this end, we revisit VJ and propose a series of refinements that improve both the speed and memory footprint without sacrificing accuracy in the GED approximation. We quantify the effectiveness of these optimisations by measuring distortion between control-flow graphs: a problem that arises in malware matching. We also document an unexpected behavioural property of VJ in which the time required to find shortest paths to unassigned nodes decreases as graph size increases, and explain how this phenomenon relates to the birthday paradox.
Proceedings of 10th IAPR-TC-15 International Workshop, GbRPR 2015, Beijing, China, May 13-15, 2015
Calibrated imputation of numerical data under linear edit restrictions
A common problem faced by statistical offices is that data may be missing from collected data sets. The typical way to overcome this problem is to impute the missing data. The problem of imputing missing data is complicated by the fact that statistical data often have to satisfy certain edit rules and that values of variables sometimes have to sum up to known totals. Standard imputation methods for numerical data as described in the literature generally do not take such edit rules and totals into account. In this paper, we describe algorithms for imputation of missing numerical data that do take edit restrictions into account and that ensure that sums are calibrated to known totals. The methods impute the missing data sequentially, i.e. the variables with missing values are imputed one by one. To assess the performance of the imputation methods, a simulation study is carried out, as well as an evaluation study based on a real dataset.
FURY: Fuzzy unification and resolution based on edit distance
We present a theoretically founded framework for fuzzy unification and resolution based on edit distance over trees. Our framework extends classical unification and resolution conservatively. We prove important properties of the framework and develop the FURY system, which implements the framework efficiently using dynamic programming. We evaluate the framework and system on a large problem in the bioinformatics domain, that of detecting typographical errors in an enzyme name database.
Edit Distance-Based Classification of Symbol Sequences
There are many types of sequences on which classification algorithms are applied. Sequences of symbols with information on the relation between every two direct successors (referred to as link information) are one of these. A common approach for classification of such sequences is to only consider the symbols and disregard the link information. However, this can be at the expense of the quality of the classifications. In this thesis, we show how the edit distance can be used to classify sequences based on their symbols as well as their link information. The edit distance is an alignment-based pairwise dissimilarity metric. Its definition depends on the structural representation of the instances that are compared. The set of edit operations determines in which ways instances can be modified. Applying an edit operation comes with a certain edit cost that is given by its cost function. The edit distance between two instances is the cost of the least expensive sequence of edit operations that transforms the one into the other. The symbols, link information and order of a sequence can be represented by the attributed graph data structure. For every edit operation, a model of its impact has been presented. The cost function of an edit operation is based on the variable(s) on which the model of its impact relies. The cost functions are defined in the edit cost model. Using the parameters of the edit cost model, the definition of the cost functions can be controlled. The edit distance is optimized for classification by finding the values of the edit cost model's parameters that minimize the average intra-class dissimilarity and maximize the average inter-class dissimilarity. Results of experiments conducted on real and artificial data show that classifications based on the edit distance can outperform both models that only incorporate symbols and models that incorporate the symbols, link information and order of sequences.
In addition, the results show that the classification performance is positively correlated with the length of the sequences. A limitation of the presented method is that optimization of the edit distance can be computationally expensive.
Electrical Engineering, Mathematics and Computer Science; Intelligent System
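The definition used in the abstract above (edit distance as the cost of the cheapest sequence of edit operations, with each operation priced by its own cost function) can be illustrated with a minimal sketch. It works on plain symbol sequences rather than the attributed graphs with link information that the thesis describes, and the function name `edit_distance` and the three cost callbacks are hypothetical names chosen for illustration, not taken from the thesis.

```python
from typing import Callable, Sequence


def edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    ins_cost: Callable[[str], float],
    del_cost: Callable[[str], float],
) -> float:
    """Cost of the cheapest edit script turning `a` into `b`.

    Standard dynamic programme over prefixes; the cost callbacks are
    illustrative placeholders for a parameterised edit cost model.
    """
    n, m = len(a), len(b)
    # d[i][j] = cheapest cost of transforming a[:i] into b[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost(a[i - 1])
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j] + del_cost(a[i - 1]),               # delete a[i-1]
                d[i][j - 1] + ins_cost(b[j - 1]),               # insert b[j-1]
                d[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute or match
            )
    return d[n][m]


def unit_sub(x: str, y: str) -> float:
    """Unit-cost substitution: free when the symbols already match."""
    return 0.0 if x == y else 1.0


def unit_indel(x: str) -> float:
    """Unit-cost insertion or deletion."""
    return 1.0
```

With unit costs this reduces to the Levenshtein distance, e.g. `edit_distance("kitten", "sitting", unit_sub, unit_indel, unit_indel)`. Passing different callbacks changes which alignments are cheapest, which is the kind of lever a tunable edit cost model would adjust.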
Mantua - Palazzo del Te - Story of Psyche Fresco on the Vault (School of Giulio Romano)
On back of postcard: 91774 -- Edit Patriarca Perbellini, Via Carlo Poma, Mantov
sotorrent/so-edit-viz: Release for SOTorrent Journal Extension
Visualization of edit and comment events in Stack Overflow threads
Seabed foraging by Antarctic krill: Implications for stock assessment, bentho-pelagic coupling, and the vertical transfer of iron
A compilation of more than 30 studies shows that adult Antarctic krill (Euphausia superba) may frequent benthic habitats year-round, in shelf as well as oceanic waters and throughout their circumpolar range. Net and acoustic data from the Scotia Sea show that in summer 2-20% of the population reside at depths between 200 and 2000 m, and that large aggregations can form above the seabed. Local differences in the vertical distribution of krill indicate that reduced feeding success in surface waters, either due to predator encounter or food shortage, might initiate such deep migrations and result in benthic feeding. Fatty acid and microscopic analyses of stomach content confirm two different foraging habitats for Antarctic krill: the upper ocean, where fresh phytoplankton is the main food source, and deeper water or the seabed, where detritus and copepods are consumed. Krill caught in upper waters retain signals of benthic feeding, suggesting frequent and dynamic exchange between surface and seabed. Krill contained up to 260 nmol iron per stomach when returning from seabed feeding. About 5% of this iron is labile, i.e., potentially available to phytoplankton. Due to their large biomass, frequent benthic feeding, and acidic digestion of particulate iron, krill might facilitate an input of new iron to Southern Ocean surface waters. Deep migrations and foraging at the seabed are significant parts of krill ecology, and the vertical fluxes involved in this behavior are important for the coupling of benthic and pelagic food webs and their elemental repositories.
AlBi-HHU/homo-edit-distance: Initial Release
Implementation of the homo-edit-distance algorithm.
