Data dimensionality reduction and data fusion for fast characterization of green coffee samples using hyperspectral sensors
Hyperspectral sensors represent a powerful tool for chemical mapping of solid-state samples, since they provide spectral information localized in the image domain in very short times and without the need for sample pretreatment. However, due to the large data size of each hyperspectral image, data dimensionality reduction (DR) is necessary in order to develop hyperspectral sensors for real-time monitoring of large sets of samples with different characteristics. In this work, we focused on DR methods that convert the three-dimensional data array corresponding to each hyperspectral image into a one-dimensional signal (1D-DR), which retains spectral and/or spatial information. In this way, large datasets of hyperspectral images can be converted into matrices of signals, which in turn can be easily processed using suitable multivariate statistical methods. Different 1D-DR methods highlight different aspects of the hyperspectral image dataset. Therefore, in order to investigate their advantages and disadvantages, we compared three 1D-DR methods: average spectrum (AS), single space hyperspectrogram (SSH) and common space hyperspectrogram (CSH). We considered 370 NIR-hyperspectral images of a set of green coffee samples, and the three 1D-DR methods were tested for their effectiveness in sensor fault detection, data structure exploration and sample classification according to coffee variety and to coffee processing method. Principal component analysis and partial least squares-discriminant analysis were used to compare the three separate DR methods. Furthermore, low-level and mid-level data fusion were also employed to test the advantages of using AS, SSH and CSH together.
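As a rough illustration of the simplest of these 1D-DR methods, the sketch below collapses each hypercube into its average spectrum and then performs a low-level fusion by concatenating two blocks of signals. It is only a minimal example on synthetic data: the function names, the array sizes and the unit-norm block scaling are assumptions made here, not details taken from the paper.

```python
import numpy as np

def average_spectrum(cube):
    """Collapse a (rows, cols, bands) hypercube into one mean spectrum."""
    pixels = cube.reshape(-1, cube.shape[-1])  # unfold: one row per pixel
    return pixels.mean(axis=0)

def low_level_fusion(blocks):
    """Concatenate signal blocks column-wise.

    Each block is first scaled to unit Frobenius norm (an assumed choice)
    so that no block dominates only because it has more variables.
    """
    scaled = [b / np.linalg.norm(b) for b in blocks]
    return np.hstack(scaled)

rng = np.random.default_rng(0)
cubes = [rng.random((20, 30, 50)) for _ in range(4)]  # 4 synthetic images
AS = np.vstack([average_spectrum(c) for c in cubes])  # one row per image
other = rng.random((4, 120))  # stand-in for a block of SSH or CSH signals
fused = low_level_fusion([AS, other])

print(AS.shape, fused.shape)
```

Mid-level fusion would follow the same pattern, except that features extracted from each block (for example, scores from a separate model per block) would be concatenated instead of the full signals.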
Chemometrics, imaging and spectroscopy laboratory – Department of Life Sciences, University of Modena and Reggio Emilia
Toward the Development of Combined Artificial Sensing Systems for Food Quality Evaluation: A Review on the Application of Data Fusion of Electronic Noses, Electronic Tongues and Electronic Eyes
Devices known as electronic noses (ENs), electronic tongues (ETs), and electronic eyes (EEs) have been developed in recent years for the in situ study of real matrices with little or no manipulation of the sample. The final goal could be the evaluation of overall quality parameters such as sensory features, indicated by the “smell”, “taste”, and “color” of the sample under investigation, or the quantitative detection of analytes. The output of these sensing systems can be analyzed using multivariate data analysis strategies to relate specific patterns in the signals to the required information. In addition, using suitable data-fusion techniques, the combination of data collected from ETs, ENs, and EEs can provide more accurate information about the sample than any of the individual sensing devices. The purpose of this review is to collect recent advances in the development of combined ET, EN, and EE systems for assessing food quality, paying particular attention to the different data-fusion strategies applied.
Data reduction of hyperspectral images: application to classification problems
Hyperspectral imaging (HSI) makes it possible to acquire in a few seconds large hypercubes,
composed of millions of spectra, which correspond to files often larger than 50 MB.
This wealth of data is the main advantage of HSI, although it causes serious data-handling
problems that complicate the development of efficient industrial applications for
on-line control. Our research group has recently proposed an alternative [1] for handling
datasets composed of tens or hundreds of hyperspectral images, which consists in converting each
hyperspectral image into a signal, called hyperspectrogram, built so as to
account for both spatial and spectral information. It thus becomes possible to transform
datasets composed of a large number of hyperspectral images into two-dimensional matrices of
hyperspectrograms, which in turn can be analyzed using the most common chemometric
methods such as PCA, PLS or PLS-DA.
In this context, we present two applications of hyperspectrograms to the solution of
classification problems. The first application concerns the early detection of surface
defects in different apple varieties, with particular attention to samples in which the defect is
not visible to the naked eye. The 800 acquired hyperspectral images were converted into
hyperspectrograms, thereby reducing the dataset size from 18.6 GB to 4.7 MB.
In addition, variable selection by means of iPLS-DA made it possible to further reduce the
dataset size and to identify the most relevant regions of the signal. The best iPLS-DA model,
calculated using only 30 of the 1200 initial variables, led to a classification efficiency
in prediction on the external test set equal to 89.6%. The second application concerns the
classification of green coffee of different types: Arabica and Robusta. Preliminary tests
showed that PLS-DA classification performed on the hyperspectrograms led to
an efficiency in prediction of 98.3%.
Fast exploration and classification of large hyperspectral image datasets for early bruise detection on apples
Hyperspectral imaging makes it easy to acquire tens of thousands of spectra for a single sample in a few seconds; though valuable, this data richness poses many problems due to the difficulty of handling a representative number of samples together. For this reason, we recently proposed an approach based on reducing each image to a one-dimensional signal, named hyperspectrogram, which accounts for both spatial and spectral information. In this manner, a dataset of hyperspectral images can be easily and quickly converted into a set of signals (2D data matrix), which in turn can be analyzed using classical chemometric techniques. In this work, the hyperspectrograms obtained from a dataset of 800 NIR-hyperspectral images of two different apple varieties were used to discriminate bruised from sound apples using iPLS-DA as variable selection algorithm, which allowed the presence of bruises to be detected efficiently. Moreover, reconstructing the selected variables as images confirmed that the automated procedure led to the exact identification of the spatial features related to the onset of the bruise defect and to its subsequent evolution with time.
Classification of Arabica and Robusta coffee samples subjected to different technological treatments using various image analysis methods
Coffee varietal differentiation based on NIR spectroscopy has been widely investigated in the last
20 years [1-3]. In this work, we have applied hyperspectral imaging in the NIR range (900-1700
nm) for the classification of Arabica and Robusta coffee varieties, considering coffee beans
subjected to different processing methods, i.e., the so-called dry method (to produce natural coffee),
wet method (to produce washed coffee) and a somewhat intermediate processing method, referred to
as polishing method (to produce polished coffee).
PCA has been used as an exploratory technique both on the image mean spectra and on the
hyperspectrograms obtained from the images. The hyperspectrograms are built by compressing the
useful information contained in each hyperspectral image into a signal composed of the frequency
distribution curves of quantities calculated by PCA [4]. This procedure makes it possible to compress the
information conveyed by the hyperspectral images while maintaining both spatial- and
spectral-related features.
The PCA models obtained showed a clear clustering of Arabica and Robusta samples, whereas,
considering the technological treatment, the polished coffee samples are clearly distinguishable
from the others, while natural and washed coffee samples are quite superimposed.
Image mean spectra and hyperspectrograms were then subjected to PLS-DA classification after
preprocessing using SNV followed by mean centering or mean centering only. Concerning the
discrimination of coffee samples between Arabica and Robusta categories, the same value of
classification efficiency in prediction (EFFPRED = 86.3%) has been obtained considering both the
mean spectra and the hyperspectrograms. After forward iPLS-DA variable selection, EFFPRED
increased up to 98.6% for models calculated using the mean spectra and up to 100% for the models
calculated using the hyperspectrograms.
As for the discrimination of the coffee samples into the three natural, polished and washed
processing categories, the PLS-DA models calculated using mean spectra led to EFFPRED values
equal to 81.1%, 95.7% and 49.8%, respectively, while the PLS-DA models calculated using
hyperspectrograms led to EFFPRED values equal to 94.7%, 100% and 92.4%, respectively. In this
case, iPLS-DA variable selection improved the performance of the model calculated on
mean spectra (EFFPRED equal to 82.9%, 98.6% and 86.5%, respectively) and worsened the
performance of the model calculated using hyperspectrograms (EFFPRED equal to 82.9%, 89.3%
and 86.5%, respectively).
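The abstract does not define the classification efficiency it reports. One common definition in the chemometrics literature is the geometric mean of sensitivity and specificity for a given class, and the sketch below assumes that definition. Both the function name and the counts in the example are hypothetical.

```python
import math

def efficiency(tp, fn, tn, fp):
    """Efficiency (%) as the geometric mean of sensitivity and specificity.

    This definition is an assumption; the abstract does not state the formula.
    tp, fn: class samples predicted correctly / incorrectly
    tn, fp: non-class samples predicted correctly / incorrectly
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return 100.0 * math.sqrt(sensitivity * specificity)

# Hypothetical test set: 50 samples of the class, 50 of the other classes.
eff = efficiency(tp=45, fn=5, tn=40, fp=10)
print(round(eff, 2))  # sensitivity 0.9, specificity 0.8
```

Under this definition a model scores high only when it both recognizes the class and rejects the other classes, which is why a per-class value can be low even if overall accuracy looks acceptable.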
Exploration of datasets of hyperspectral images
Hyperspectral images of size usually greater than 50 MB can be easily acquired in very short times,
generally without the need for sample pretreatment. While Multivariate Image Analysis (MIA) tools
can be efficiently used for the exploration of single hyperspectral images or of groups composed of
a limited number (say up to 10) of merged images, the exploration of datasets composed of a large
number (>10) of images is less straightforward. However, a representative sampling of a large
number of specimens is frequently required to correctly estimate both intra- and inter-sample
variability. This implies the acquisition of datasets composed of a large number of hyperspectral
images and several GB in size, especially in those cases where only one or a few samples can be
included in a single image scene. In this context, exploring the dataset by applying MIA to
single images or to subgroups of merged images does not provide a global overview of the
entire dataset variability, nor does it properly highlight the possible presence of outliers, clusters and/or
trends. A fast procedure which can be adopted to deal with this issue consists in computing the
average spectrum of each image, to build a matrix of average spectra of the analyzed hyperspectral
images. Although this approach leads sometimes to satisfactory results (especially when dealing
with homogeneous materials), the information related to spatial variability is lost, and the
hyperspectral image data are turned into “common” (i.e., not spatially resolved) spectral data. By
averaging spectra, for example, the useful information related to the presence of a defect localized
in a relatively narrow image area could be diluted within the massive amount of other “well
behaving” pixels, becoming no longer detectable.
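The dilution effect described above can be checked numerically. The sketch below is a synthetic example in which the image size, defect size and signal drop are arbitrary values chosen for illustration: a defect covering 25 of 10,000 pixels changes the affected pixels by 0.2 units, while the average spectrum moves by only 0.2 × 25/10,000 = 0.0005.

```python
import numpy as np

rng = np.random.default_rng(1)
rows, cols, bands = 100, 100, 40

# Synthetic "sound" image: signal around 1.0 with small noise.
sound = rng.normal(1.0, 0.01, size=(rows, cols, bands))

# Same image with a localized defect: 5 x 5 pixels (0.25% of the area)
# whose signal drops by 0.2 in bands 10..14.
defective = sound.copy()
defective[:5, :5, 10:15] -= 0.2

# Largest change seen at the level of single pixels.
diff_pixel = np.abs(defective - sound).max()

# Largest change seen in the average spectrum of the whole image.
diff_mean = np.abs(defective.mean(axis=(0, 1)) - sound.mean(axis=(0, 1))).max()

print(diff_pixel, diff_mean)
```

In this example the shift in the average spectrum is twenty times smaller than the pixel noise standard deviation (0.01), so variability between images could easily mask it.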
Aiming to develop a fast and easy-to-use tool able to facilitate the exploration of large datasets of
hyperspectral images while maintaining both spectral- and spatial-related information of the
original images, we have proposed an approach which consists in automatically converting each
hyperspectral image into a signal named hyperspectrogram [1]. Essentially, the hyperspectrogram
can be viewed as a fingerprint containing the relevant information brought by the original
hyperspectral image, and is composed of a first part accounting for the spatial information and a
second part accounting for the spectral information. By representing each image with a vector of
a few hundred points, this procedure makes it possible to compare up to hundreds of images simultaneously
by means of common multivariate analysis methods, such as PCA.
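The two-part structure of such a signal could be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes that the first part is made of frequency distribution curves (histograms) of PCA score values and the second part of the corresponding PCA loading vectors. The function name, the number of components and the number of bins are arbitrary choices.

```python
import numpy as np

def hyperspectrogram_sketch(cube, n_pcs=2, n_bins=32):
    """Turn a (rows, cols, bands) hypercube into a 1D signal (simplified)."""
    pixels = cube.reshape(-1, cube.shape[-1])   # unfold: one row per pixel
    centered = pixels - pixels.mean(axis=0)     # mean centering

    # PCA via SVD: rows of vt are the loading vectors.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_pcs] * s[:n_pcs]
    loadings = vt[:n_pcs]

    parts = []
    # Part 1: how pixel scores are distributed within the image.
    for k in range(n_pcs):
        counts, _ = np.histogram(scores[:, k], bins=n_bins)
        parts.append(counts / counts.sum())     # relative frequencies
    # Part 2: spectral information carried by the loadings.
    parts.extend(loadings)
    return np.concatenate(parts)

rng = np.random.default_rng(2)
cube = rng.random((20, 30, 50))                 # synthetic image
signal = hyperspectrogram_sketch(cube)
print(signal.shape)                             # 2*32 + 2*50 points
```

Note that the sign of each principal component is arbitrary, and that bin edges computed separately for each image are not directly comparable. Before signals from different images could be stacked into one matrix, a working implementation would need a sign convention and a consistent binning scheme.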
In order to facilitate the exploration of datasets of hyperspectral images through hyperspectrograms,
we have recently developed a MATLAB Graphical User Interface (GUI), which allows easy
calculation and visualization of hyperspectrograms, exploration of the dataset and visualization of
the features of interest contained within each single sample directly in the original image domain.
Apparatus and method for determining physical and chemical parameters of an inhomogeneous sample by acquiring and processing color images of the sample
The invention consists of a compact, inexpensive and easy-to-use portable device for in-field monitoring of the degree of phenolic ripeness of grapes through the analysis of images acquired using a smartphone.
Practical comparison of sparse methods for classification of Arabica and Robusta coffee species using near infrared hyperspectral imaging
In the present work, sparse-based methods are applied to the analysis of hyperspectral images in order to assess their suitability for variable selection in a classification framework. The key aspect of sparse methods is the possibility of performing variable selection by forcing the model coefficients related to irrelevant variables to zero. In particular, two different sparse classification approaches, i.e. sPCA+kNN and sPLS-DA, were compared with the corresponding classical methods (PCA+kNN and PLS-DA) to classify Arabica and Robusta coffee species. Green coffee samples were analyzed using near infrared hyperspectral imaging, and the average spectra from each hyperspectral image were used to build training and test sets; furthermore, a test image was used to evaluate the performance of the considered methods at pixel level. In our case, sparse methods led to results similar to those of classical methods, with the advantage of yielding more interpretable and parsimonious models. An important result is that variable selection performed with the two different sparse classification approaches converged on the same spectral regions, which implies the chemical relevance of those regions in the discrimination of Arabica and Robusta coffee species.
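A common way to force coefficients to exactly zero is the soft-thresholding operator associated with an L1 penalty. The abstract does not say which penalty was used, so the sketch below only illustrates the general mechanism; the weight vector and the threshold are made-up values.

```python
import numpy as np

def soft_threshold(w, lam):
    """Shrink each coefficient toward zero by lam; small ones become zero."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

# Hypothetical weights for five spectral variables.
w = np.array([0.80, -0.05, 0.02, -0.60, 0.10])
sparse_w = soft_threshold(w, lam=0.15)

# Variables whose coefficient survived the threshold are the selected ones.
selected = np.flatnonzero(sparse_w)
print(sparse_w, selected)
```

Here only the variables at positions 0 and 3 remain, with coefficients shrunk to 0.65 and -0.45. Increasing `lam` gives a more parsimonious model, at the risk of discarding relevant variables.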
Transferring results from NIR-hyperspectral to NIR-multispectral imaging systems: A filter-based simulation applied to the classification of Arabica and Robusta green coffee
Due to the differences in terms of both price and quality, the availability of effective instrumentation to discriminate between Arabica and Robusta coffee is extremely important. To this end, the use of multispectral imaging systems could provide reliable and accurate real-time monitoring at relatively low cost. However, in practice the implementation of multispectral imaging systems is not straightforward: the present work investigates this issue, starting from the outcome of variable selection performed using a hyperspectral system. Multispectral data were simulated considering four commercially available filters matching the selected spectral regions, and used to calculate multivariate classification models with Partial Least Squares-Discriminant Analysis (PLS-DA) and sparse PLS-DA. Proper strategies for the definition of the training set and the selection of the most effective combinations of spectral channels led to satisfactory classification performance (100% classification efficiency in prediction of the test set).
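One plausible way to simulate multispectral channels from hyperspectral data is to weight each spectrum by the transmission curve of each filter. The sketch below assumes Gaussian transmission curves. The wavelength grid, filter centers and bandwidth are hypothetical values chosen for illustration and are not the commercial filters considered in the study.

```python
import numpy as np

def simulate_channels(spectra, wavelengths, centers, fwhm):
    """Simulate one channel per filter as a transmission-weighted average.

    spectra:     (n_samples, n_bands) hyperspectral signals
    wavelengths: (n_bands,) wavelength of each band, in nm
    centers:     filter center wavelengths, in nm (hypothetical here)
    fwhm:        full width at half maximum of each filter, in nm
    """
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    centers = np.asarray(centers, dtype=float)
    # Gaussian transmission curves, shape (n_filters, n_bands).
    t = np.exp(-0.5 * ((wavelengths[None, :] - centers[:, None]) / sigma) ** 2)
    t /= t.sum(axis=1, keepdims=True)   # each filter's weights sum to 1
    return spectra @ t.T                # shape (n_samples, n_filters)

rng = np.random.default_rng(3)
wavelengths = np.linspace(900, 1700, 161)   # assumed 5 nm grid
spectra = rng.random((6, wavelengths.size)) # 6 synthetic spectra
channels = simulate_channels(spectra, wavelengths,
                             centers=[1000, 1200, 1450, 1650], fwhm=50)
print(channels.shape)
```

The resulting four-column matrix can then be passed to the same classification methods used on the full spectra, which allows the two acquisition strategies to be compared on identical samples.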
