    QSAR model reproducibility and applicability: a case study of rate constants of hydroxy radical reaction models applied to Polybrominated Diphenyl Ethers and (Benzo-)Triazoles

    The crucial importance of the three central OECD principles for quantitative structure-activity relationship (QSAR) model validation is highlighted in a case study of tropospheric degradation of volatile organic compounds (VOCs) by OH, applied to two CADASTER chemical classes (PBDEs and (benzo-)triazoles). The application of any QSAR model to chemicals without experimental data largely depends on model reproducibility by the user. The reproducibility of an unambiguous algorithm (OECD Principle 2) is guaranteed by redeveloping MLR models based on both an updated version of the DRAGON software for molecular descriptor calculation and some freely available online descriptors. The Genetic Algorithm has confirmed its ability to always select the most informative descriptors, independently of the input pool of variables. The ability of the GA-selected descriptors to model chemicals not used in model development is verified by three different splittings (random by response, K-ANN and K-means clustering), thus ensuring the external predictivity of the new models, independently of the training/prediction set composition (OECD Principle 4). The relevance of checking the structural applicability domain (OECD Principle 3) becomes very evident on comparing the predictions for CADASTER chemicals, using the new models proposed herein, with those obtained by EPI Suite.

    QSAR Prediction of Aquatic Toxicity of Triazoles and Benzo-Triazoles

    Triazoles and benzo-triazoles (TAZ/BTAZ) are potentially hazardous chemicals that adversely affect humans and other non-target species, and are on the list of substances of very high concern (SVHC) in the European regulation of chemicals REACH. TAZ/BTAZ are synthetic molecules used in various industrial processes (to obtain pharmaceuticals and agricultural products), and are widely applied as anti-corrosives, cleansing agents for textiles, flame retardants, photographic emulsions, etc. Furthermore, they are abundantly used as components of liquid deicing agents for aircraft and airport runways. Because of their wide use they have been found distributed throughout the environment, mainly in water compartments. The amount of experimental data available for these molecules is insufficient for a comprehensive characterization of their environmental and toxicological profile, and they have been included among the four classes of chemicals studied in the European FP7 Project CADASTER (CAse studies on the Development and Application of in Silico Techniques for Environmental hazard and Risk assessment). In this study we investigated and modeled by QSAR different endpoints of interest to define the potential aquatic toxicological profile of hundreds of TAZ/BTAZ, and the possible correlations among the different toxicity endpoints. The studied endpoints were: LC50 in fish (Oncorhynchus mykiss), EC50 in Daphnia magna, and EC50 in algae (Pseudokirchneriella subcapitata). Different theoretical molecular descriptors were calculated by different proprietary and freely available online software (DRAGON, Hyperchem, and the CADASTER online platform for the calculation of molecular descriptors; www.cadaster.eu). The endpoints of interest were modeled by multiple linear regression (MLR), and the Genetic Algorithm was used to select the relevant molecular descriptors by the MLR-Ordinary Least Squares (OLS) method. The best models were validated for their robustness using leave-one-out, bootstrap and the scrambling of the responses. External validation was also performed, demonstrating the high predictive ability of the models. The reliability of the predictions was always checked in order to verify the chemical applicability domain of the models to new chemicals.
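
The leave-one-out check mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy descriptor matrix, the helper names (`solve`, `fit_ols`, `q2_loo`) and the Q² formula (1 - PRESS / total sum of squares around the full-data mean) are assumptions chosen for illustration, and the Genetic Algorithm descriptor selection is not reproduced.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Fit an MLR model with intercept via the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    """Predict the response for one descriptor vector x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def q2_loo(X, y):
    """Leave-one-out Q^2: refit without each compound, then predict it."""
    press = 0.0
    for i in range(len(y)):
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_tot

# Synthetic stand-in for two descriptors and one toxicity endpoint.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(30)]
y = [1.5 * a - 0.7 * b + random.gauss(0, 0.1) for a, b in X]
print(round(q2_loo(X, y), 3))
```

Response scrambling can reuse the same function: shuffling `y` and recomputing `q2_loo` should give a much lower value if the original model is not a chance correlation.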

    Floods and the Rural Economy of South-West Bengal, c. 1784-1793

    Bengal, especially the lower portion of south-west Bengal, is one of the most flood-prone regions in the world, and its backwardness has allegedly been a result of the annual ravages caused by river spills. Standing crops and habitations are submerged under water for days, communication is disrupted, and inhabitants are often forced into distress migration. Economic life becomes most uncertain. Consequently, the settlement of bandhs and pools was an important aspect of the administrative system in pre-colonial and colonial times. During the colonial period, ensuring revenue collection became the primary aim of the East India Company. Hence, by introducing the Permanent Settlement, it took an essential step and gave an institutionalized form to flood control and embankment construction. Accordingly, the Company presumed that the zamindar would sit at the apex of a new agrarian order and affirm private property, generate economic surpluses, and ensure political stability.

    Femtosecond Coherent Vibrational Dynamics of Anabaena Sensory Rhodopsin

    The photo-induced isomerization of retinal protonated Schiff base (RPSB) inside the protein pocket is one of the fastest (<ps) and most stereo-selective photochemical reactions in nature. The ground state structure of the RPSB and its surrounding protein constructions are thought to be the two most crucial factors driving this reaction. The investigation of each factor individually was the main goal of this thesis. Anabaena Sensory Rhodopsin (ASR), a recently discovered microbial retinal protein, serves as an ideal system for this study as it binds two structural isomers (all-trans: AT and 13-cis: 13C) of the RPSB within the same protein constructions in its photocycle. In the present work, the photo-induced dynamics of the RPSB in ASR have been explored with the help of time-resolved coherent vibrational spectroscopic methods, which monitor the photo-induced sub-ps structural changes of the RPSB. These studies have helped to shed light on the intricate relationship between electronic and vibrational dynamics of the RPSB. In the first half of this thesis, a comparative study showed that both electronic and vibrational dynamics are markedly distinct for the AT and 13C isomers of the RPSB in ASR. In particular, the 13C isomer exhibited more than fivefold faster dynamics than the AT isomer. One possible molecular origin of this dynamical difference was found by comparing the ground state Raman spectra of the two isomers. The comparison showed an increase in the amplitude of hydrogen-out-of-plane (HOOP) modes for the 13C isomer, which is usually considered evidence of distortion in the ground state structure of the retinal system. The ground state pre-distortion has been reported as a potential element for the acceleration of the isomerization reaction for the 13C isomer, in analogy with the cis isomers of visual rhodopsin and bacteriorhodopsin. The second half of this work explored the role of the part of the protein helix inside the retinal pocket as well as that far away from the pocket. In particular, the replacement of the amino acid residues in the vicinity of the RPSB by point mutation caused an acceleration of the reaction rate for the AT isomer, but it had only a minor effect for the 13C isomer of the RPSB. Furthermore, the truncation of the part of the protein embedded in the cytoplasmic region affected the formation of the primary photoproduct. All these experimental results lead to two major conclusions of this thesis: (i) the protein constructions govern the retinal isomerization dynamics and (ii) the same protein cage exerts differential interactions on two structural isomers of the RPSB.

    A Machine and Deep Learning Framework to Retain Customers Based on Their Lifetime Value

    Customer Lifetime Value (CLV) measures the average revenue generated by a customer over the course of their association with the firm. The Recency Frequency Monetary (RFM) Model is used to calculate the CLV. Recency reflects the customer's most recent purchase. Frequency is the number of times an item is purchased. Monetary is the amount customers spend on the product. CLV is measured using the RFM factors of previous customer transactions. This research proposes a Deep Learning Customer Retention Framework to predict the Customer Lifetime Value in order to retain customers through an effective Customer Relationship Management strategy. The proposed framework combines clustering and regression models to analyze the significant variables for predicting the lifetime value of customers. Customers are categorized into levels such as high-, medium-, and low-profitability customers based on their lifetime value. This research compares Deep Neural Network models, Machine Learning models and Probabilistic models. The Deep Neural Network is an ANN. The machine learning models are Linear Regression, Random Forest, and Gradient Boosting. The probabilistic models are Gamma-Gamma and Beta-Geometric/Negative Binomial. The models are compared in order to predict the level of profitable customers. Results demonstrate that the Deep Neural Network (DNN) model outperforms the other models with 71% accuracy. An improved prediction model for CLV and segmentation helps firms to plan and decide on relevant CRM strategies such as customer profitability analysis, cross-selling and one-to-one marketing for the future.
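
As a rough illustration of the RFM factors described above, the sketch below derives them from a transaction log. The transactions, the `as_of` reference date, the reading of recency as days since the last purchase, and the three-tier split by total spend are all assumptions made for illustration. The paper's own clustering and regression models are not reproduced.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2023, 1, 5), 40.0),
    ("c1", date(2023, 3, 1), 60.0),
    ("c2", date(2022, 11, 20), 15.0),
    ("c3", date(2023, 2, 10), 200.0),
    ("c3", date(2023, 2, 25), 150.0),
    ("c3", date(2023, 3, 10), 250.0),
]

def rfm_table(transactions, as_of):
    """Compute recency (days since last purchase), frequency and monetary per customer."""
    by_customer = defaultdict(list)
    for cid, day, amount in transactions:
        by_customer[cid].append((day, amount))
    table = {}
    for cid, rows in by_customer.items():
        last_purchase = max(day for day, _ in rows)
        table[cid] = {
            "recency_days": (as_of - last_purchase).days,
            "frequency": len(rows),
            "monetary": sum(amount for _, amount in rows),
        }
    return table

def segment(table):
    """Rank customers by total spend and split the ranking into three tiers (assumed rule)."""
    ranked = sorted(table, key=lambda cid: table[cid]["monetary"], reverse=True)
    tiers = ("high", "medium", "low")
    return {cid: tiers[pos * 3 // len(ranked)] for pos, cid in enumerate(ranked)}

rfm = rfm_table(transactions, as_of=date(2023, 3, 31))
segments = segment(rfm)
for cid in sorted(rfm):
    print(cid, rfm[cid], segments[cid])
```

In a fuller pipeline the three RFM columns would be the input features for the clustering and regression steps; the rank-based tiers here only stand in for that stage.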

    Enhancing Document Information Analysis with Multi-Task Pre-training: A Robust Approach for Information Extraction in Visually-Rich Documents

    This paper introduces a deep learning model tailored for document information analysis, emphasizing document classification, entity relation extraction, and document visual question answering. The proposed model leverages transformer-based models to encode all the information present in a document image, including textual, visual, and layout information. The model is pre-trained and subsequently fine-tuned for various document image analysis tasks. The proposed model incorporates three additional tasks during the pre-training phase: reading order identification of different layout segments in a document image, layout segment categorization as per PubLayNet, and generation of the text sequence within a given layout segment (text block). The model also incorporates a collective pre-training scheme in which the losses of all the tasks under consideration, including pre-training and fine-tuning tasks with all datasets, are considered. Additional encoder and decoder blocks are added to the RoBERTa network to generate results for all tasks. The proposed model achieved impressive results across all tasks, with an accuracy of 95.87% on the RVL-CDIP dataset for document classification, F1 scores of 0.9306, 0.9804, 0.9794, and 0.8742 on the FUNSD, CORD, SROIE, and Kleister-NDA datasets respectively for entity relation extraction, and an ANLS score of 0.8468 on the DocVQA dataset for visual question answering. The results highlight the effectiveness of the proposed model in understanding and interpreting complex document layouts and content, making it a promising tool for document analysis tasks.
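
The ANLS score reported for DocVQA can be sketched as follows. This is a minimal illustration of the commonly used definition (Average Normalized Levenshtein Similarity with a 0.5 threshold), not the paper's evaluation code. The lowercasing and whitespace stripping, the function names, and the example strings are assumptions.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def anls(predictions, references, threshold=0.5):
    """Average, over questions, of the best thresholded similarity to any accepted answer."""
    total = 0.0
    for pred, answers in zip(predictions, references):
        p = pred.strip().lower()
        best = 0.0
        for ans in answers:
            a = ans.strip().lower()
            longest = max(len(p), len(a))
            nl = levenshtein(p, a) / longest if longest else 0.0
            # Answers that are too far from the reference score zero.
            score = 1.0 - nl if nl < threshold else 0.0
            best = max(best, score)
        total += best
    return total / len(predictions)

# Hypothetical predictions and accepted answers for two questions.
preds = ["invoce", "abc"]
refs = [["Invoice"], ["xyz"]]
print(anls(preds, refs))
```

The threshold keeps near misses (for example OCR slips) partially credited while treating unrelated answers as wrong.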