QSAR model reproducibility and applicability: a case study of rate constants of hydroxyl radical reaction models applied to Polybrominated Diphenyl Ethers and (Benzo-)Triazoles
The crucial importance of the three central OECD principles for quantitative structure-activity relationship
(QSAR) model validation is highlighted in a case study of tropospheric degradation of volatile organic compounds
(VOCs) by OH, applied to two CADASTER chemical classes (PBDEs and (benzo-)triazoles). The application of any
QSAR model to chemicals without experimental data largely depends on model reproducibility by the user. The reproducibility
of an unambiguous algorithm (OECD Principle 2) is guaranteed by redeveloping MLR models based on both an
updated version of the DRAGON software for molecular descriptor calculation and some freely available online descriptors.
The Genetic Algorithm has confirmed its ability to always select the most informative descriptors independently
of the input pool of variables. The ability of the GA-selected descriptors to model chemicals not used in model development
is verified by three different splittings (random by response, K-ANN and K-means clustering), thus ensuring the
external predictivity of the new models, independently of the training/prediction set composition (OECD Principle 4).
The relevance of checking the structural applicability domain (OECD Principle 3) becomes very evident on comparing
the predictions for CADASTER chemicals, using the new models proposed herein, with those obtained by EPI Suite
QSAR Prediction of Aquatic Toxicity of Triazoles and Benzo-Triazoles
Triazoles and benzo-triazoles (TAZ/BTAZ) are potentially hazardous chemicals that adversely affect humans and other non-target species, and are on the list of substances of very high concern (SVHC) in the European regulation of chemicals REACH.
TAZ/BTAZ are synthetic molecules used in various industrial processes (to obtain pharmaceuticals and agricultural products), and have a wide application as anti-corrosives, cleansing agents for textiles, flame retardants, photographic emulsions, etc. Furthermore they are abundantly used as components of liquid deicing agents for aircraft and airport runways. Because of their wide use they have been found distributed throughout the environment, mainly in water compartments. The amount of experimental data available for these molecules is insufficient for a comprehensive characterization of their environmental and toxicological profile and they have been included among the four classes of chemicals studied in the European FP7 Project CADASTER (CAse studies on the Development and Application of in Silico Techniques for Environmental hazard and Risk assessment).
In this study we used QSAR to investigate and model several endpoints of interest in order to define the potential aquatic toxicological profile of hundreds of TAZ/BTAZ, along with the possible correlations among the toxicity endpoints. The studied endpoints were: LC50 in fish (Oncorhynchus mykiss), EC50 in Daphnia magna, and EC50 in algae (Pseudokirchneriella subcapitata). Theoretical molecular descriptors were calculated with both proprietary and freely available online software (DRAGON, HyperChem, and the CADASTER online platform for the calculation of molecular descriptors, www.cadaster.eu). The endpoints of interest were modeled by multiple linear regression (MLR) fitted by Ordinary Least Squares (OLS), with the Genetic Algorithm used to select the relevant molecular descriptors. The best models were validated for robustness using leave-one-out cross-validation, bootstrap, and scrambling of the responses. External validation was also performed, demonstrating the high predictive ability of the models. The reliability of the predictions was always checked by verifying the chemical applicability domain of the models for new chemicals
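The leave-one-out robustness check named in this abstract can be illustrated with a minimal sketch. It assumes an ordinary least squares MLR with an intercept and reports the cross-validated statistic commonly written Q²(LOO) = 1 - PRESS/TSS. The abstract does not name the statistic it reports, so that choice, the function names (`solve`, `fit_ols`, `predict`, `q2_loo`), and the pure-Python solver are all assumptions made for illustration, not the authors' implementation. Descriptor selection by the Genetic Algorithm is not shown.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... by the normal equations; returns [b0, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, x):
    """Predict the response for one descriptor vector x."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], x))


def q2_loo(X, y):
    """Leave-one-out Q2: refit without each chemical, then predict it."""
    mean_y = sum(y) / len(y)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(y)):
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    tss = sum((v - mean_y) ** 2 for v in y)  # total sum of squares
    return 1.0 - press / tss
```

Calling `q2_loo(X, y)` with `X` as a list of descriptor rows and `y` as the list of measured endpoint values returns a value close to 1 when each left-out chemical is predicted well by the model refitted without it. The normal-equations solver keeps the sketch dependency-free; a numerical library would be the usual choice for real descriptor matrices.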
QSAR Models for Aquatic Toxicity of Triazoles and Benzotriazoles: WP3 Results within the CADASTER Framework
QSAR models for the aquatic toxicity of triazoles and benzotriazoles: results within the CADASTER project
Floods and the Rural Economy of South-West Bengal, c. 1784-1793
Bengal, especially the lower portion of south-west Bengal, is one of the most flood-prone
regions in the world and its backwardness has allegedly been a result of the annual
ravages caused by river spills. Standing crops and habitations are submerged under water
for days, communication is disrupted, and inhabitants are often forced into distress
migration. Economic life becomes most uncertain. Consequently, the settlement of bandhs
and pools was an important aspect of the administrative system in pre-colonial and colonial
times. During the colonial period, ensuring revenue collection became the primary aim of the
East India Company. Hence, by introducing the Permanent Settlement, it took an essential
step and gave an institutionalized form to flood control and embankment construction.
Accordingly, the Company presumed that the zamindar would sit at the apex of a new
agrarian order and affirm private property, generate economic surpluses, and ensure
political stability
Femtosecond Coherent Vibrational Dynamics of Anabaena Sensory Rhodopsin
The photo-induced isomerization of retinal protonated Schiff base (RPSB) inside the protein
pocket is one of the fastest (<ps) and most stereo-selective photochemical reactions in nature. The
ground state structure of the RPSB and its surrounding protein constructions are thought to be the
two most crucial factors to drive this reaction. The investigation of each factor individually was
the main goal of this thesis. Anabaena Sensory Rhodopsin (ASR), a recently discovered microbial
retinal protein, serves as an ideal system for this study as it binds two structural isomers (all-trans:
AT and 13-cis: 13C) of the RPSB within the same protein constructions in its photocycle. In the
present work, the photo-induced dynamics of the RPSB in ASR has been explored with the help
of time resolved coherent vibrational spectroscopic methods, which monitor the photo-induced
sub-ps structural changes of the RPSB. These studies have helped to shed light on the intricate
relationship between electronic and vibrational dynamics of the RPSB.
In the first half of this thesis, a comparative study showed that both electronic and vibrational dynamics
are widely distinct for the AT and 13C isomers of the RPSB in ASR. In particular, the 13C isomer
exhibited more than fivefold faster dynamics than the AT isomer. One possible molecular origin
of this dynamical difference was found by comparing the ground state Raman spectra of the
two isomers. The comparison showed an increase in the amplitude of hydrogen-out-of-plane (HOOP) modes for
the 13C isomer, which is usually considered evidence of distortion in the ground state
structure for the retinal system. The ground state pre-distortion has been reported as a potential
element for the acceleration of the isomerization reaction for the 13C isomer, in analogy with the
cis isomers of visual rhodopsin and bacteriorhodopsin.
The second half of this work explored the role of the parts of the protein helix inside the retinal pocket
as well as those far away from the pocket. In particular, the replacement of the amino acid residues
in the vicinity of the RPSB by point mutation caused an acceleration of the reaction rate for the AT
isomer, but it had only a minor effect for the 13C isomer of the RPSB. Furthermore, the truncation
of the part of the protein embedded in the cytoplasmic region affected the formation of the
primary photoproduct. All these experimental results lead to two major conclusions of this thesis:
(i) the protein constructions govern the retinal isomerization dynamics and (ii) the same protein
cage exerts differential interactions on two structural isomers of the RPSB
A Machine and Deep Learning Framework to Retain Customers Based on Their Lifetime Value
Customer Lifetime Value (CLV) measures the average revenue generated by a customer over the course of their association with the firm. The Recency, Frequency, Monetary (RFM) model is used to calculate the CLV. Recency is how recently a customer made their latest purchase. Frequency is the number of times an item is purchased. Monetary is the amount customers spent on the product. CLV is estimated from the RFM factors of previous customer transactions. This research proposes a Deep Learning Customer Retention Framework that predicts Customer Lifetime Value in order to retain customers through an effective Customer Relationship Management (CRM) strategy. The proposed framework combines clustering and regression models to analyze the significant variables for predicting the lifetime value of customers. Customers are categorized into high, medium, and low profitability levels based on their lifetime value. This research compares Deep Neural Network, Machine Learning, and probabilistic models. The Deep Neural Network is an Artificial Neural Network (ANN). The machine learning models are Linear Regression, Random Forest, and Gradient Boosting. The probabilistic models are Gamma-Gamma and Beta-Geometric/Negative Binomial. The models are compared on their ability to predict the level of customer profitability. Results demonstrate that the Deep Neural Network (DNN) model outperforms the other models, with 71% accuracy. An improved prediction model for CLV and segmentation helps firms plan and decide on relevant CRM strategies for the future, such as customer profitability analysis, cross-selling, and one-to-one marketing
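The three RFM factors described in this abstract can be derived from a transaction log along the following lines. This is a minimal sketch under stated assumptions: the `(customer_id, purchase_date, amount)` tuple layout, the `as_of` reference date, the function name `rfm_table`, and the choice to report Monetary as the summed spend are all hypothetical and are not taken from the paper. How the paper combines the factors into a CLV is not stated in the abstract and is not shown here.

```python
from datetime import date


def rfm_table(transactions, as_of):
    """Aggregate (customer_id, purchase_date, amount) rows into RFM factors.

    Recency   = days between the customer's latest purchase and `as_of`
    Frequency = number of purchases
    Monetary  = total amount spent (an assumed convention)
    """
    latest = {}
    for customer, day, amount in transactions:
        last, count, total = latest.get(customer, (None, 0, 0.0))
        if last is None or day > last:
            last = day
        latest[customer] = (last, count + 1, total + amount)
    return {
        customer: {
            "recency_days": (as_of - last).days,
            "frequency": count,
            "monetary": total,
        }
        for customer, (last, count, total) in latest.items()
    }


# Hypothetical transaction log used only to exercise the function.
example = [
    ("alice", date(2024, 1, 5), 40.0),
    ("alice", date(2024, 3, 1), 60.0),
    ("bob", date(2023, 12, 20), 15.0),
]
table = rfm_table(example, as_of=date(2024, 3, 31))
```

A table of this shape is the kind of input on which clustering (to form the high, medium, and low groups) and regression could then operate.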
Evaluation of CADASTER QSAR models for aquatic toxicity of (benzo)-triazoles and prioritization by consensus
Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents
This paper introduces a deep learning model tailored for document information
analysis, emphasizing document classification, entity relation extraction, and
document visual question answering. The proposed model leverages
transformer-based models to encode all the information present in a document
image, including textual, visual, and layout information. The model is
pre-trained and subsequently fine-tuned for various document image analysis
tasks. The proposed model incorporates three additional tasks during the
pre-training phase, including reading order identification of different layout
segments in a document image, layout segment categorization as per PubLayNet,
and generation of the text sequence within a given layout segment (text block).
The model also incorporates a collective pre-training scheme where losses of
all the tasks under consideration, including pre-training and fine-tuning tasks
with all datasets, are considered. Additional encoder and decoder blocks are
added to the RoBERTa network to generate results for all tasks. The proposed
model achieved impressive results across all tasks, with an accuracy of 95.87%
on the RVL-CDIP dataset for document classification, F1 scores of 0.9306,
0.9804, 0.9794, and 0.8742 on the FUNSD, CORD, SROIE, and Kleister-NDA datasets
respectively for entity relation extraction, and an ANLS score of 0.8468 on the
DocVQA dataset for visual question answering. The results highlight the
effectiveness of the proposed model in understanding and interpreting complex
document layouts and content, making it a promising tool for document analysis
tasks
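For readers unfamiliar with the ANLS score reported for DocVQA, the sketch below shows the metric as it is usually defined: Average Normalized Levenshtein Similarity, where each answer scores 1 minus its normalized edit distance to the closest reference and scores below a threshold are zeroed. The threshold of 0.5 and the lowercasing and whitespace stripping follow the common convention and are assumptions here; the abstract does not describe its evaluation code, and the function names are illustrative.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(prediction, reference, threshold=0.5):
    """1 - normalized edit distance, or 0 when the distance reaches the threshold."""
    p, r = prediction.strip().lower(), reference.strip().lower()
    if not p and not r:
        return 1.0
    nl = levenshtein(p, r) / max(len(p), len(r))
    return 1.0 - nl if nl < threshold else 0.0


def anls(predictions, references, threshold=0.5):
    """Mean over questions of the best similarity against any reference answer."""
    scores = [
        max(similarity(pred, ref, threshold) for ref in refs)
        for pred, refs in zip(predictions, references)
    ]
    return sum(scores) / len(scores)
```

Here `predictions` is a list of answer strings and `references` is a list of lists, one list of accepted answers per question. The thresholding means a nearly correct answer earns partial credit while an unrelated one earns none.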