Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation
counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings
are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that
only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into
account the first five authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, when the same number of top-ranked authors is selected and analyzed, this picture represents fewer specialties in the research field being studied than the one produced through traditional first-author co-citation counting. Reasons for these effects are discussed.
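The difference between the two counting schemes can be illustrated with a small sketch. This is not the study's implementation: the function name `cocitation_counts`, the `max_authors` parameter, and the toy data are hypothetical, and the sketch assumes that co-authors of the same cited work also count as co-cited with each other.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations.

    citing_papers: one reference list per citing paper; each reference is
    a list of author names in byline order (hypothetical input format).
    max_authors: how many leading authors of each cited work are counted
    (1 = traditional first-author counting, 5 = first-five counting).
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for byline in references:
            cited_authors.update(byline[:max_authors])
        # Every unordered pair of authors cited by the same paper is
        # co-cited once (assumption: co-authors of one work included).
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts

# Toy data: two citing papers, each citing two works.
papers = [
    [["Smith", "Lee"], ["Jones", "Kim"]],
    [["Smith"], ["Kim", "Jones"]],
]
first_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(first_only)
print(first_five)
```

In this toy data, Lee never appears under first-author counting, but is co-cited with Smith once the first five authors are counted.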
SGDE: Secure Generative Data Exchange for Cross-Silo Federated Learning
Privacy regulation laws, such as GDPR, impose transparency and security as design pillars for data processing algorithms. In this context, federated learning is one of the most influential frameworks for privacy-preserving distributed machine learning, achieving astounding results in many natural language processing and computer vision tasks. Several federated learning frameworks employ differential privacy to prevent private data leakage to unauthorized parties and malicious attackers. Many studies, however, highlight the vulnerabilities of standard federated learning to poisoning and inference attacks, thus raising concerns about potential risks for sensitive data. To address this issue, we present SGDE, a generative data exchange protocol that improves user security and machine learning performance in a cross-silo federation. The core idea of SGDE is to share data generators, trained on private data with strong differential privacy guarantees, instead of communicating explicit gradient information. These generators synthesize an arbitrarily large amount of data that retains the distinctive features of the private samples but differs substantially from them. In this work, SGDE is tested in a cross-silo federated network on image and tabular datasets, using beta-variational autoencoders as data generators. The results show that including SGDE improves task accuracy and fairness, as well as resilience to the most influential attacks on federated learning.
Variations on the Author
“Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourse; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.
Appropriate Similarity Measures for Author Cocitation Analysis
We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
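As a rough illustration of how the Pearson correlation and the cosine can disagree on cocitation profiles, the sketch below compares the two on hypothetical count vectors. It shows one commonly raised concern (Pearson changes when authors who are cocited with neither profile are added, while the cosine does not); the abstract does not state that this is the authors' own argument, and all data here are made up.

```python
import math

def cosine(x, y):
    """Cosine similarity between two cocitation profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson correlation between two cocitation profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical cocitation counts of two authors with four other authors.
x = [3, 0, 2, 1]
y = [1, 2, 0, 3]

# Add six further authors who are never cocited with either of them.
x_padded = x + [0] * 6
y_padded = y + [0] * 6

print("cosine :", cosine(x, y), cosine(x_padded, y_padded))
print("pearson:", pearson(x, y), pearson(x_padded, y_padded))
```

With these toy vectors the cosine is the same before and after padding, whereas the Pearson correlation changes sign.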
Dispelling the Myths Behind First-author Citation Counts
We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued
use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation
counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more
sophisticated methods.
An exploration of guided image-to-image generative adversarial networks in tyres manufacturing industry
Laurea Magistrale (Master's thesis). Thanks to recent advances in deep generative models, interest in the synthetic generation of realistic images has grown considerably in recent years. Models derived from Generative Adversarial Networks have proved capable of generating realistic-looking samples, often even managing to fool the human eye.
A particular line of research that has led to many impressive works is so-called Image-to-Image (I2I) translation, which seeks, as the name suggests, to produce synthetic samples starting from a conditioning input image. Such models have performed successfully in tasks like style transfer, colourisation, super-resolution and domain adaptation. Still, few real-world applications have been investigated, since these architectures are usually trained on academic datasets rather than under the adverse conditions found in industrial ones.
We have the opportunity to employ such models in the manufacturing context of Pirelli, a world-leading tyre manufacturer, which aims to automate its quality control process by means of a deep learning anomaly detection pipeline. The goal is to support the training of such models by augmenting the company's industrial dataset with generated samples.
This work attempts to learn the translation from a specific CAD drawing to a synthesized RGB acquisition of the corresponding tyre, employing additional categorical features to guide the generation of desired visual elements that are not encoded in the source domain. Our contribution, besides offering an analysis of how such an architecture applies in an industrial setting, addresses directions that are little explored or unexplored in the I2I literature.
First, developing our conditional generative model with a scarce and unbalanced dataset requires applying transfer learning techniques to an I2I model, a largely uncharted area in the literature.
Furthermore, we investigate the possibility of learning a mapping between additional input feature vectors and the latent space of an I2I model, in order to explicitly steer the generation towards samples containing the desired characteristics.
Normalizing flows for anomaly detection : an application to industrial manufacturing quality control
Laurea Magistrale (Master's thesis). Anomaly Detection (AD) of images from the manufacturing industry is a field where
Machine Learning is still difficult to apply. Unlike Object Localization, AD is a
process that requires deep knowledge of the analyzed item, covering not only
its visible features but also its construction method. Industrial standards require a
low rate of false positives, but the presence of noise is unavoidable, and therefore the
results are often unacceptable. Moreover, the datasets available to the public are not
representative of the characteristics found in an authentic industrial environment, such as
the presence of noise, ambiguous anomalies, and differences between items from the same
class.
Our goal is to overcome this challenge. We constructed a new dataset from a real industrial
environment with the help of Pirelli, a leading tire manufacturer. The Pirelli Logo
AD dataset comprises over 3000 samples of tire logos, each consisting of 3 images
taken under different lighting setups.
We considered using Normalizing Flows since they are generative models and possess the
ability to model a true posterior distribution. CS-Flow, one of the most advanced NFs,
was selected to conduct the experiments.
We found that CS-Flow performs poorly, reaching an AUROC of only 0.6927. The subsequent
work consisted of modifying this architecture. We conducted ablation studies on the
feature extractor, the number of coupling flows, and the input. Two new multi-image NF
architectures were created to study the effect of training jointly on images taken under
different lighting: “Concatenation of features” and “Braid Normalizing Flow”.
The best model comprises 16 independent CS-Flow models with 2 coupling blocks each,
every one working on a specific patch of the input image. The feature extractor is reduced to
an 8-layer EfficientNet-B0. This new model reaches 0.8533 AUROC.
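For readers unfamiliar with the AUROC figures quoted above, the metric can be read as the probability that a randomly chosen anomalous sample receives a higher anomaly score than a randomly chosen normal one. The sketch below computes it by direct pairwise comparison. The function name, the scores, and the labels are hypothetical and unrelated to the thesis results or code.

```python
def auroc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    scores: anomaly scores (higher = more anomalous).
    labels: 1 for anomalous samples, 0 for normal samples.
    Ties between an anomalous and a normal sample count as half a win.
    """
    anomalous = [s for s, l in zip(scores, labels) if l == 1]
    normal = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for a in anomalous:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomalous) * len(normal))

# Hypothetical scores: two normal and two anomalous samples.
example = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(example)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives some context for the reported improvement from 0.6927 to 0.8533.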
