Multivariate predictive modeling and validation
Many problems in chemistry involve the prediction of one or more qualitative or quantitative properties based on the experimental data. Examples of such problems involve, for instance, the possibility of predicting protein or lipid content in food matrices based on NIR spectra or of diagnosing the onset of a disease through the MS or NMR analysis of serum samples. In the former case, the property to be predicted is of a quantitative nature, while in the latter, it is discrete (qualitative). This chapter presents the chemometric strategies most commonly used to formulate predictive models, i.e., models that relate one or more dependent variables Y (qualitative or quantitative) to a set of independent variables X
Prediction of viscosity index and pour point in ester lubricants using quantitative structure-property relationship (QSPR)
Since ancient times, lubricants have been applied in different fields of technology and, from the very beginning, there has been wide interest in improving some of their physicochemical properties. The turning point in the development of lubricants came in the twentieth century, when modern synthetic ester base fluids were developed. Unlike “natural” lubricants (fats and mineral oils), they can be modified to optimize specific technological properties; in particular, it is desirable that they present a high viscosity index and a low pour point. Nevertheless, it is not always straightforward to tune these parameters, and different theoretical studies have been pursued in this regard. Above all, valid tools for investigating these types of problems are Quantitative Structure-Property/Activity Relationship (QSPR/QSAR) studies. Starting from these considerations, the aim of the present paper is to investigate, by means of QSPR models, whether it is possible to identify or design ester base lubricants with specific technological characteristics. In particular, a QSPR analysis has been conducted to predict the viscosity index and pour point of 41 ester lubricants by means of partial least squares combined with Leardi's genetic algorithms. The present study has provided satisfactory results from the prediction point of view, and it has led to interesting conclusions from the interpretation viewpoint. In fact, it has highlighted that the viscosity index and, to a lesser extent, the pour point are highly correlated with the geometry, the molecular connectivity and the spatial autocorrelation of the investigated substances. © 2018 Elsevier B.V
Rapid determination of alprazolam in gin tonic cocktails based on the coupling of IR spectroscopy and chemometrics: A feasibility study
"Drug -facilitated sexual assault" (DFSA) is a sexual assault perpetrated against a person rendered unconscious by a substance that changes her/his physical and/or mental condition, such as ethanol or drugs. Several active pharmaceutical ingredients, whether used alone or with alcoholic beverages, can produce anterograde amnesia and loss of inhibition. The most common pharmaceuticals found in DFSAs are GHB (-hydroxybutyric acid), benzodiazepines (Valium, Xanax, or Roipnol), antidepressants (Venlafaxine), muscle relaxants (cyclobenzaprine), antihistamines, sleeping pills (diphenhydramines), hallucinogens, and opioids. Biological samples are typically examined in cases of suspected DFSA; however, occasionally, samples are sent to labs a long period after being collected, jeopardizing the accuracy of the analysis. As a result, in recent years, the focus has shifted to directly detecting the presence of drugs in alcoholic beverages. In light of this, the purpose of the current study is to build a FT-IR-based approach for the determination of alprazolam in a common long drink (gin and tonic). To achieve this goal, pure (Class Pure) and spiked gin tonics (Class Spiked) were analyzed by Fourier -transform infrared spectroscopy (FT-IR). Afterward, two classifiers were used: Sequential preprocessing through ORThogonalization Linear Discriminant Analysis (SPORT-LDA) and Soft Independent Modeling of Class Analogies (SIMCA). Both approaches provided good results: SPORT-LDA achieved a 95% and a 98% accuracy rate (on the external test set of samples) for spiked and pure cocktails, respectively. This corresponds to the misclassification of 5 spiked and 1 pure drinks. The SIMCA model of class pure achieved 98.2% and 91.7% of specificity and sensitivity, respectively, coinciding with 55 pure samples (over 60) correctly accepted and 2 (over 110) erroneously rejected by the model. In conclusion, the SIMCA model of class pure seems preferable, because it minimizes the type II error. 
Eventually, the study was circumscribed to the spiked cocktails and a novel SPORT model was used to quantify alprazolam in spiked cocktails. This provided noteworthy results, in fact, it led to a Root Mean Square Error in Prediction (RMSEP) of 0.95, and a R2pred of 0.98
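The sensitivity and specificity quoted for the SIMCA model can be checked from the sample counts given in the abstract. The short Python sketch below is only an illustration: it assumes that the 2 errors out of 110 are spiked samples that the Class Pure model failed to reject, and the helper name `rate` is hypothetical.

```python
def rate(correct: int, total: int) -> float:
    """Return the percentage of correctly handled samples."""
    return 100.0 * correct / total


# Counts quoted in the abstract for the SIMCA model of Class Pure.
pure_total, pure_accepted = 60, 55
# Assumption: the 2 errors (out of 110) are spiked samples accepted by the model.
spiked_total, spiked_accepted = 110, 2

# Sensitivity: share of pure samples accepted by their own class model.
sensitivity = rate(pure_accepted, pure_total)
# Specificity: share of spiked samples rejected by the Class Pure model.
specificity = rate(spiked_total - spiked_accepted, spiked_total)

print(f"sensitivity = {sensitivity:.1f}%")  # 91.7%
print(f"specificity = {specificity:.1f}%")  # 98.2%
```

Under that assumption, both figures match the values reported in the abstract.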
Authentication of “Avola almonds” by near infrared (NIR) spectroscopy and chemometrics
Avola almond is part of the “Traditional Italian Agri-food Product” (PAT) list, as established by the Italian Ministry of Agricultural, Food, Forestry and Tourism Policies; this endorsement testifies to its status as a high added-value product and, consequently, highlights the need for analytical methodologies suitable for its authentication. For these reasons, the present study investigates the possibility of developing a non-destructive approach for distinguishing almonds cultivated in the Avola area from others of different geographical origin. To fulfil this purpose, 227 almonds, cultivated in the Avola area or in other Italian territories, have been analysed by near infrared (NIR) spectroscopy coupled with Partial Least Squares-Discriminant Analysis (PLS-DA) and Soft Independent Modelling of Class Analogies (SIMCA). The two tested approaches achieved satisfactory results in external validation, indicating that both would represent a suitable tool for the purpose of the study. © 2019 Elsevier Inc
E-Eye Solution for the Discrimination of Common and Niche Celery Ecotypes
Celery (Apium graveolens L.) is a well-known plant that underlies the culinary tradition of different populations. In Italy, several celery ecotypes with unique characteristics are grown by small local producers, and they need to be characterized in order to be protected and safeguarded. The present work aims at developing a fast and non-destructive method for discriminating a common celery (the "Elne" celery) from a typical celery of Abruzzo (Central Italy). The proposed strategy is based on an e-eye tool, which collects images from which colorgrams are derived. Initially, a principal component analysis model was used to investigate trends and outliers in the data. Then, the classification between the common celery (Elne class) and the celery from Torricella Peligna (Torricella class) was achieved by discriminant analysis, conducted by sequential preprocessing through orthogonalization (SPORT) and sequential and orthogonalized covariance selection (SO-CovSel), and by a class-modelling method, soft independent modelling of class analogies (SIMCA). Among these, the highest accuracy was provided by the two discriminant classifiers, both of which achieved a total accuracy of 82% in external validation
Advanced Analytical Tools for the Estimation of Gut Permeability of Compounds of Pharmaceutical Interest
The present study aims at developing a quantitative structure–activity relationship (QSAR) model for the determination of gut permeability of 228 pharmacological drugs at different pH conditions (3, 5, 7.4, 9, intrinsic). As a consequence, five different datasets (according to the diverse permeability shown by the compounds at the different pH values) were handled, with the aim of discriminating compounds as low-permeable or high-permeable. In order to achieve this goal, molecular descriptors for all the investigated compounds were computed and then classification models calculated by means of partial least squares discriminant analysis (PLS-DA). A high predictive capability was achieved for all models, providing correct classification rates in external validation between 80% and 96%. In order to test whether a reduction in the molecular descriptors would improve predictions and provide information about the most relevant variables, a feature selection approach, covariance selection, was used to select the most relevant subsets of predictors. This led to a slight improvement in the predictive accuracies, and it has indicated that the most relevant descriptors for the discrimination of the investigated compounds into low- and high-permeable were associated with the 2D and 3D structures
Identification and quantification of turmeric adulteration in egg-pasta by near infrared spectroscopy and chemometrics
"Egg pasta" is a kind of pasta prepared by adding eggs in the dough; the color of this product is often associated to its quality, as it is proportional to the quantity of egg present in the dough. A possible adulteration on this product is represented by the addition of turmeric (not reported in the label) in the dough. The inclusion of this ingredient (which is minimal, given the strong coloring power of this spice) fraudulently accentuates the yellow color of the product, making it more attractive to the consumer. Given this scenario, the aim of the present work is to develop an analytical approach suitable at detecting the presence of turmeric as an adulterant in egg pasta. One hundred samples of traditional and adulterated egg pasta were analyzed by NIR spectroscopy and PLS-DA (Partial Least Squares Discriminant Analysis) in order to discriminate adulterated and compliant pasta. The classification model provided a total correct classification rate of 97.5% in external validation (40 samples). Eventually, the adulterant was quantified by PLS. This strategy provided satisfying results, achieving a RMSEP (Root Mean Square Error in Prediction) of 0.112 (%-w/w) in external validation
Green Multi-Platform Solution for the Quantification of Levodopa Enantiomeric Excess in Solid-State Mixtures for Pharmacological Formulations
The aim of the present work was to develop a green multi-platform methodology for the quantification of l-DOPA in solid-state mixtures by means of MIR and NIR spectroscopy. In order to achieve this goal, 33 mixtures of racemic and pure l-DOPA were prepared and analyzed. Once spectra were collected, partial least squares (PLS) was used to model the two data blocks individually. Additionally, three different multi-block approaches (mid-level data fusion, sequential and orthogonalized partial least squares, and sequential and orthogonalized covariance selection) were used to handle data from the different platforms simultaneously. The chemometric analysis showed that quantifying the enantiomeric excess of l-DOPA in solid-state enantiomeric mixtures is possible by coupling NIR and PLS and, to a lesser extent, by using MIR. The multi-platform approach provided a higher accuracy than the individual block analysis, indicating that the association of MIR and NIR spectral data, especially by means of SO-PLS, represents a valid solution for the quantification of the l-DOPA excess in enantiomeric mixtures
Chemometric Methods for Spectroscopy-Based Pharmaceutical Analysis
Spectroscopy is widely used to characterize pharmaceutical products or processes, especially because it is rapid, cheap, non-invasive/non-destructive and applicable both off-line and in-/at-/on-line. Spectroscopic techniques produce profiles containing a large amount of information, which can profitably be exploited through multivariate mathematical and statistical (chemometric) techniques. The present paper aims at providing a brief overview of the different chemometric approaches applicable in the context of spectroscopy-based pharmaceutical analysis, discussing both the unsupervised exploration of the collected data and the possibility of building predictive models for both quantitative (calibration) and qualitative (classification) responses
Extension of SO-PLS to multi-way arrays: SO-N-PLS
Multi-way data arrays are becoming more common in several fields of science. For instance, analytical instruments can sometimes collect signals in different modes simultaneously, e.g. fluorescence and LC/GC-MS. Higher-order data can also arise in sensory science, where product scores can be reported as a function of sample, judge and attribute. Another example is process monitoring, where several process variables can be measured over time for several batches. In addition, so-called multi-block data sets, where several blocks of data describe the same set of samples, are becoming more common. Several methods exist for analyzing either multi-way or multi-block data, but little attention has been paid to methods that combine these two data properties. A common procedure is to "unfold" multi-way arrays in order to obtain two-way data tables on which classical multi-block methods can be applied. However, unfolding is known to lead to overfitted models due to increased flexibility in parameter estimation. In this paper we present a novel multi-block regression method that can handle multi-way data blocks. This method is a combination of a multi-block method called Sequential and Orthogonalized-PLS (SO-PLS) and the multi-way version of PLS, N-PLS. The new method is therefore called SO-N-PLS. We have compared the method to Multi-block-PLS (MB-PLS) and SO-PLS on unfolded data. We investigate the hypotheses that SO-N-PLS performs better on small data sets and noisy data, and that SO-N-PLS models are easier to interpret. The hypotheses are investigated by a simulation study and two real data examples: one dealing with regression and one with classification. The simulation study shows that SO-N-PLS predicts better than the unfolded methods when the sample size is small and the data are noisy. This is because it filters out the noise better than MB-PLS and SO-PLS. For the real data examples, the differences in prediction are small, but the multi-way method allows easier interpretation
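The "unfolding" step mentioned in the abstract can be made concrete with a small Python sketch. It assumes a three-way array stored as nested lists (samples × mode 2 × mode 3) and keeps the sample mode as rows; the function name `unfold` and the toy values are hypothetical, and this shows only the rearrangement, not SO-N-PLS itself.

```python
def unfold(tensor):
    """Unfold an I x J x K nested list into an I x (J*K) two-way table.

    Each sample (first mode) becomes one row; its J x K slab is
    concatenated row by row into a single vector of length J*K.
    """
    return [[value for row in sample for value in row] for sample in tensor]


# Toy three-way array: 2 samples, each a 2 x 3 slab (e.g. time x variable).
tensor = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

table = unfold(tensor)
print(table)  # [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
```

After unfolding, each row has J×K columns that a two-way method treats as unrelated variables, which illustrates the added flexibility in parameter estimation that the abstract links to overfitting.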
