    Project files provided as supporting information to the manuscript "Making sense of complex systems through resolution, relevance, and mapping entropy"

    README file to the project files provided as supporting information to the manuscript “Making sense of complex systems through resolution, relevance, and mapping entropy”
    Feb. 25, 2022
    Authors: Roi Holtzman, Marco Giulini and Raffaello Potestio
    ==================================
    The dataset contains the following files:
    - A README file with a description of the pymap program, which describes how different selections of *N* out of *n* degrees of freedom (mappings) affect the amount of information retained about a full data set.
    - The pymap.py program
    - The pymap.yml support file
    - The data.tar tarball with the setup data
    - The results.tar tarball with the output data
    ===

    Project files provided as supporting information to the manuscript "A deep learning approach to the structural analysis of proteins"

    README file to the project files provided as supporting information to the manuscript “A deep learning approach to the structural analysis of proteins”
    Dec. 30, 2018
    Authors: Marco Giulini and Raffaello Potestio
    ==================================
    The dataset contains the following files:
    - datasets.zip: archive containing five .csv files, namely:
        - decoys_cm.csv: all the data for 10728 protein decoys, training set
        - evaluation_cm.csv: all data for 146 proteins in the evaluation set
        - random_CG.csv: 1200 Coulomb matrices. 100 CG models for each protein with 120 amino acids
        - 1e5g_centered_sphere.csv: 100 CG models in which the central atoms in 1e5g are not removed
        - 1e5g_random_sphere.csv: 10 CG models for 10 different (random) locations for the sphere that includes atoms that have to be retained. 100 CG models in total
    - decoys_labels.lab: the labels associated with the 10728 decoys present in the training set
    - evaluation_labels.lab: the labels associated with the 146 pdb files in the evaluation set
    - random_CG_labels.lab: the labels associated with the 6 proteins with 120 amino acids
    - network_development_training: a Python script that performs cross validation and full training of the model
    - saved_networks.zip: folder containing 10 networks; the architecture is included in .json files while weight parameters are inside .hs files
    - pdb_files.zip: folder containing the PDB files that have been employed in the project, namely:
        - pdb_files_len100: pdb files with 100 amino acids
        - pdb_files_len101-110: pdb files with a number of amino acids between 101 and 110
        - decoys: decoys of length 100 extracted from the above folder; name syntax: PDBNAME_decoy_STARTRES_ENDRES.pdb
          Example: 6gsp.pdb will give rise to 6gsp_decoy_0_100.pdb, 6gsp_decoy_1_101.pdb, 6gsp_decoy_2_102.pdb, 6gsp_decoy_3_103.pdb, 6gsp_decoy_4_104.pdb
        - pdb_files_len100: 6 pdb files with 120 amino acids
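The decoy naming rule (PDBNAME_decoy_STARTRES_ENDRES.pdb) can be illustrated with a minimal Python sketch. It assumes that decoys are all contiguous windows of 100 residues, with zero-based start indices, as the 6gsp example suggests. The function name `decoy_names` and the residue count of 104 are hypothetical, chosen only so that the output matches the five file names in the example; this is not the authors' script.

```python
def decoy_names(pdb_name: str, n_residues: int, window: int = 100) -> list[str]:
    """Return one decoy file name per contiguous window of `window` residues.

    Assumption: windows slide by one residue and are labelled START_END with
    END = START + window, following the 6gsp example in the README.
    """
    if n_residues < window:
        return []  # chain too short to yield any decoy
    return [
        f"{pdb_name}_decoy_{start}_{start + window}.pdb"
        for start in range(n_residues - window + 1)
    ]


# Hypothetical input: 104 residues is an assumption made for illustration.
for name in decoy_names("6gsp", 104):
    print(name)
```

Under this reading, a chain of length L between 100 and 110 residues yields L - 100 + 1 decoys.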

    Project files provided as supporting information to the manuscript "How Communication Pathways Bridge Local and Global Conformations in an IgG4 Antibody: a Molecular Dynamics Study"

    June 23, 2021
    Thomas Tarenzi, Marta Rigoli and Raffaello Potestio
    ==================================
    The dataset contains the following folders:
    - contact_area_binding_site: files with the computed surface area, used for the calculation of the contact area between PD-1 and the antibody Fab (Fig. S41).
    - hbonds_ab-pd1: number of hydrogen bonds between the antigen and the antibody, for each holo cluster (Fig. S41).
    - mutual_information: matrices with the computed mutual information, for each pair of residues (Fig. S34, S35, S46). The folder also contains the generalized correlation coefficients (Fig. S43) and the correlation scores (Fig. 4, S44, S45), computed from the mutual information.
    - networks:
        - communities: for each cluster, each residue is assigned to a community within the interaction network (Fig. S32, S33).
        - betweenness: the values of edge betweenness for each cluster (S30, S31).
    - output_clustering: each frame of the apo and holo simulations is assigned a cluster index, on the basis of the structural similarity (Fig. S4).
    - PAD: per-residue values of the PAD parameter, for apo and holo systems (Fig. 3, S39).
    - PCA: principal component analysis for each conformational cluster (Section S2.1).
    - representative_structures: representative structures for each conformational cluster (Fig. 2).
    - r_gyr_antibody: radii of gyration of the antibody, for each cluster (Fig. 2).
    - r_gyr_hinge: radii of gyration of the sole hinge segment, for each cluster (Fig. S36).
    - RMSD_antibody: distributions of the root-mean-square deviation of the antibody, for each cluster (Fig. S5).
    - RMSD_antigen: root-mean-square deviation of the antigen PD-1, for each cluster (Fig. S42).
    - RMSD_binding_site: distributions of the root-mean-square deviation of the residues belonging to the paratope, for each cluster (Fig. S40, S47).
    - RMSD_matrix: root-mean-square deviation between structures belonging to different pairs of clusters (Fig. S9).
    - rmsf_antigen: root-mean-square fluctuation of the antigen PD-1, for each cluster (Fig. S42).
    - rmsf_hinge: difference between the total root-mean-square fluctuations of the two hinge segments, for each cluster (Fig. S38).
    - salt_bridge: distribution of distances between residues R979 and D1377 (Fig. 4).
    - sasa_domains: contact area between Fab and Fc antibody domains, for each cluster (Fig. S8).
    - sasa_hinge: solvent accessible surface area of each hinge segment, for each cluster (Fig. S37).

    Project files provided as supporting information to the manuscript "Kinetics of radiation-induced DNA double-strand breaks through coarse-grained simulations"

    README file to the project files provided as supporting information to the manuscript “Kinetics of radiation-induced DNA double-strand breaks through coarse-grained simulations”
    Authors: Manuel Micheloni, Lorenzo Petrolli, Gianluca Lattanzi and Raffaello Potestio
    ==================================
    The .zip file contains the following folders:
    - DNA_sequence_LAMMPS
        - DNAsequence.txt: this is the LAMMPS data file. It contains the DNA sequence employed for the study.
    - DSB_input_datafile: contains the binary LAMMPS data files that are employed as starting points for the DSB MD simulations. Each file contains the DNA molecule at a certain end-to-end distance (which also gives the name to the binary file).
    - DSB_MD_simulation
        - 0_MD_LAMMPS: contains the LAMMPS simulation scripts. The subfolders are organized according to the logic of the study: for each DNA extension (folders named 1000, 1100, …, 1300), we investigated different DSB motifs (folders named 0, 1, …, 4).
        - 1_DSB_raw_data: contains the relevant information about the MD trajectories. The files are named “out_Ree_bd_n.mat”, where Ree is the DNA end-to-end distance, bd is the DSB distance and n=1,…,Nt is the run index, with Nt the total number of independent MD runs. Each “out”-file contains: i) “free-energy”, which is the internal energy contribution of those nucleotides between the breaks, and ii) POS_left/right_branch, matrices that provide the ids and the positions (xyz) of the nucleotides in the break, along the trajectory. To better understand the distinction between “left/right”, see section “Assessment of the residual contact interface of the DSBs at the rupture time” in the Supplementary data. Finally, “POS_time” contains the length of the simulation run (step*dt).
        - 2_Analysis_scripts: contains the employed analysis scripts.
            - 1_SigmoidalFitting: scripts that perform a sigmoidal fitting procedure [1] on the internal energy profile of the nucleotides between the strand breaks. Each script saves i) the internal energy barriers (activation_free_energy) and ii) the breaking times (breaking_time) of all MD simulations characterized by a certain (Ree,bd). Finally, i) and ii) are averaged: in iii) dE and iv) tau_b we report the respective mean value and standard deviation. All data can be found in 3_DSB_proc_data.
              [1] R P (2022). sigm_fit (https://www.mathworks.com/matlabcentral/fileexchange/42641-sigm_fit), MATLAB Central File Exchange. Retrieved July 2, 2022.
            - 2_DsbDistanceAnalysis: contains the scripts that generate the interaction matrices (in subfolder 1_interaction_matrix) that are employed in the subfolder 2_correlation_BreakTime_NucleotidesDistances. For additional details about the analysis, refer to section “Analysis of the residual interactions at the characteristic time of a DSB rupture” of the article. The scripts “fitting_bd#_DSB_dist_interaction.m” produce “IntMatrix_Ree_bd.mat” data files, which contain i) “interaction_matrix_avg”, representing the average values of the interactions in the bound state of the DNA molecule, and ii) “interaction_matrix_atBreak”, containing the relative distances of all nucleotides at the breaking time for all independent MD simulations characterized by a certain (Ree,bd).
        - 3_DSB_proc_data: the data contained in 1_DSB_raw_data are processed by the scripts in 3_Analysis_scripts and saved in 3_DSB_proc_data. For further details, see the description of 2_Analysis_scripts.
    - TimeScaling
        - 0_MD_LAMMPS: here we provide the LAMMPS script employed to compute the diffusion coefficient of the 3855-bp DNA molecule. Specifically, we acquire the mean-squared displacement (MSD), from which it is possible to extract the diffusion coefficient.
        - 1_Diffusion_data: contains the MSDs for each independent simulation.
    NB: most data are saved in the .mat format, used by MATLAB, a numerical computing environment and proprietary programming language developed by MathWorks.
    RP and MM acknowledge support from the Italian Ministry of Education, University and Research (MIUR) through the FARE grant for the project HAMMOCK (Grant R18ZHWY3NC).
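The README states that the diffusion coefficient is extracted from the MSD but does not say how. A common route, assumed here and not confirmed by the source, is the Einstein relation MSD(t) = 2·d·D·t in d dimensions, with D obtained from the slope of a linear fit. The sketch below is plain Python and does not read the .mat files; the function name `diffusion_coefficient` and the synthetic data are hypothetical.

```python
def diffusion_coefficient(times, msd, dimensions=3):
    """Estimate D from MSD(t) = 2 * dimensions * D * t.

    Uses a least-squares fit of a straight line through the origin.
    Assumption: the supplied points lie in the long-time, linear regime.
    """
    if len(times) != len(msd) or len(times) == 0:
        raise ValueError("times and msd must be non-empty and of equal length")
    numerator = sum(t * m for t, m in zip(times, msd))
    denominator = sum(t * t for t in times)
    if denominator == 0:
        raise ValueError("at least one time point must be non-zero")
    slope = numerator / denominator
    return slope / (2 * dimensions)


# Synthetic, noise-free MSD generated with a made-up D, for illustration only.
true_d = 0.5
times = [float(i) for i in range(1, 11)]
msd = [2 * 3 * true_d * t for t in times]
print(diffusion_coefficient(times, msd))
```

With several independent runs, as in 1_Diffusion_data, one option is to average the MSD curves before fitting. Another is to fit each run separately and report the mean and spread of the resulting D values.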

    Representation and information in molecular modelling


    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
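The difference between the two counting schemes can be made concrete with a minimal Python sketch. The data layout, the function name `cocitation_counts` and the author names are hypothetical. Two further choices are assumptions of this sketch, not details taken from the study: each citing paper contributes at most one count per author pair, and co-authors of the same cited work count as co-cited.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author pairs cited together by the same citing paper.

    citing_papers: list of reference lists; each reference is a list of
    author names in byline order.
    max_authors=1 gives first-author counting; max_authors=5 takes the
    first five authors of each cited work into account.
    """
    counts = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Two made-up citing papers, each citing two works.
papers = [
    [["Smith", "Lee"], ["Chen"]],
    [["Lee", "Smith"], ["Chen", "Park"]],
]
first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(first_author)
print(first_five)
```

In this toy input, first-author counting never links Smith and Lee, because only one of them leads each cited work. Counting the first five authors links them in both citing papers.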

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
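The abstract does not state why the Pearson correlation is unsatisfactory. The Python sketch below illustrates one property often raised in this debate, which may or may not be the argument made in the paper: adding authors who are co-cited with neither of the two compared authors (zero entries in both profiles) changes the Pearson correlation but leaves the cosine unchanged. The profiles are made-up numbers, and the two other measures mentioned in the abstract are not reproduced here.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two cocitation profiles."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation counts of two authors with two other authors.
profile_a = [4, 1]
profile_b = [1, 4]

# The same profiles after four unrelated authors (all zeros) join the data set.
padded_a = profile_a + [0, 0, 0, 0]
padded_b = profile_b + [0, 0, 0, 0]

print(cosine(profile_a, profile_b), cosine(padded_a, padded_b))
print(pearson(profile_a, profile_b), pearson(padded_a, padded_b))
```

In this toy case the Pearson correlation changes sign once the zero entries are added, while the cosine stays the same.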

    Dispelling the Myths Behind First-author Citation Counts

    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore just how different from each other author rankings resulting from different citation counting methods actually are, and to demonstrate the capability of emerging data and tools on the Web in supporting more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well-supported by the ISI databases. It is argued that increasingly available digital full text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.