bonndata (Rheinische Friedrich-Wilhelms-Universität Bonn)
194 research outputs found
Genome sequence and annotation of Victoria cruziana
The genome of a Victoria cruziana plant was sequenced with nanopore long reads. The genome sequence was assembled with Verkko2, scaffolding was conducted with CPhasing, and the gene models were predicted with BRAKER3 and GeMoMa. The functional annotation was predicted based on sequence similarity to well-characterized Arabidopsis thaliana sequences.
Raw data and analysis code for the study “Project 2025 as a technocratic blueprint: A corpus-based linguistic analysis of conservative governance discourse”
This repository contains all data and scripts used for the study “Project 2025 as a Technocratic Blueprint: A Corpus-Based Linguistic Analysis of Conservative Governance Discourse” (Schilling & Fuchs, 2025).
The study investigates the language of the Heritage Foundation’s Project 2025, a 900-page conservative policy blueprint, using methods from Corpus-Assisted Discourse Studies (CADS), Political Discourse Analysis (PDA), and psycholinguistic text analysis. The corpus includes Project 2025 and Democratic and Republican Party platforms (2016–2024).
The dataset includes:
Raw data (/data/raw_data/): full texts of Project 2025 and party platforms in CSV format.
Processed data (/data/raw_data/Project2025_lemmaPOS.csv): tokenized, lemmatized, and POS-tagged text.
Keyness results (/data/keyness/): unigram and bigram keyness calculations (log-likelihood, log-ratio).
Collocation results (/data/collocations/): top 10 adjective, noun, and verb collocates per node.
LIWC results (/data/liwc/): LIWC-22 category scores for each corpus.
Analysis scripts (/code/): R Markdown file (analysis_project2025.Rmd) and two Python scripts for collocation and keyness analysis.
All files are in UTF-8 plain-text format. The dataset contains no personal, sensitive, or proprietary data and derives entirely from publicly accessible political documents.
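The two keyness measures named in the description (log-likelihood and log-ratio) can be sketched as follows. This is a minimal illustration using the formulations common in corpus linguistics, not the repository's own scripts; the function names, the argument names, and the 0.5 zero-frequency adjustment are assumptions made for this example.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness (G2) of one item across two corpora."""
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref
    # Expected frequencies if the item were equally common in both corpora.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    ll = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        # A zero observed frequency contributes nothing to the sum.
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2.0 * ll


def log_ratio(freq_target, size_target, freq_ref, size_ref, zero_adjust=0.5):
    """Binary log of the ratio of relative frequencies (effect size).

    zero_adjust is an assumed smoothing value that replaces a zero
    frequency so the ratio stays defined.
    """
    ft = freq_target if freq_target > 0 else zero_adjust
    fr = freq_ref if freq_ref > 0 else zero_adjust
    return math.log2((ft / size_target) / (fr / size_ref))
```

A log-ratio of 1 means the item is twice as frequent, relative to corpus size, in the target corpus as in the reference corpus; the log-likelihood value indicates how much evidence there is for a difference in frequency.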
Numerical simulation for a one-dimensional radiative transfer equation
This numerical simulation was developed for the diffusion approximation of a one-dimensional stationary radiative transfer equation. Specifically, as the mean free path of the photons tends to zero, the simulation shows the formation of boundary layers and that, in the interior of the domain, the radiation intensity becomes isotropic and solves a Laplace equation. The simulation supports results obtained by analyzing the radiative transfer equation via matched asymptotic expansions.
Database on the costs and benefits of agroforestry in Africa
This dataset captures the economic value of ecosystem services generated by agroforestry systems in Africa. It is based on a systematic literature review (SLR) as well as focus group discussions conducted in selected countries across Africa. The dataset is composed of 164 variables. Descriptive variables indicate details such as source, country and specific location, tree/crop species, sapling survival rate, and spacing between trees. Economic factors such as inflation, deflation, and exchange rate are also listed. The actual economic values are divided into costs and benefits, and the total costs are separated into establishment costs and maintenance costs. Total, establishment, and maintenance costs as well as total benefits are given in USD 2020; the individual value categories are expressed in the currencies used in the original sources for traceability. The cost variables mainly refer to inputs and labour, while the benefits include variables such as income, yield, tree fruits, and sale of tree products.
Data for Modeling the potential distribution of Wesselsbron, Sindbis, and Middelburg viruses and their vectors in Africa under future climatic and land-use changes
This dataset comprises ecological variables and species presence points used to model the current and future potential distribution of Wesselsbron, Sindbis, and Middelburg viruses and five mosquito vectors (Aedes circumluteolus, Aedes mcintoshi, Culex univittatus, Culex pipiens, and Mansonia africana). Ecological data include 19 bioclimatic variables, Normalized Difference Vegetation Index, built-up areas, settlement model grid, human population, forested areas, livestock density, and croplands. Two time periods were selected: current (reference year 2015) and future (2021–2040); the current bioclimatic variables comprise data collected over 1970–2000. For the future, we used the Intergovernmental Panel on Climate Change's Shared Socioeconomic Pathways (SSPs), which are projections of future greenhouse gas emissions and climate. We chose SSP2-4.5 for moderate and SSP5-8.5 for severe conditions. Based on the Coupled Model Intercomparison Project Phase 6, we selected two Global Climate Models (GCMs), IPSL-CM6A-LR and HadGEM3-GC31-LL. For each GCM, we therefore extracted ecological data for the two SSPs for the period 2021–2040. Presence points comprise coordinates of locations of samples from which the species were previously identified.
pandas DataFrames of the DYToMuMu_M-20_CT10_TuneZ2star_v2_8TeV process
This dataset contains pandas DataFrames that represent filtered versions of CMS Open Data (in the form of ROOT files) available on the CERN OpenData Portal.
This dataset specifically contains data from a DYToMuMu process (a Drell-Yan process resulting in two muons in the final state), which is a simulated process created during the 2012 LHC run.
The filtered pandas DataFrames that can be found here contain a total of 121 relevant variables (99 for real collision data). A list of variables is given below; for a full explanation of them, please refer to the following paper (PLACEHOLDER, REFERENCE PAPER HERE):
nEvent, runNum, lumisection, evtNum;
nMuon, vecMuon_PT, vecMuon_Eta, vecMuon_Phi, vecMuon_PTErr, vecMuon_Q, vecMuon_StaPt, vecMuon_StaEta, vecMuon_StaPhi, vecMuon_TrkIso03, vecMuon_EcalIso03, vecMuon_HcalIso03;
nVertex, vecVertex_nTracksfit, vecVertex_ndof, vecVertex_Chi2, vecVertex_X, vecVertex_Y, vecVertex_Z;
nEle, vecEle_PT, vecEle_Eta, vecEle_Phi, vecEle_Q, vecEle_TrkIso03, vecEle_EcalIso03, vecEle_HcalIso03, vecEle_D0, vecEle_Dz;
nTau, vecTau_PT, vecTau_Eta, vecTau_Phi, vecTau_Q, vecTau_RawIso3Hits, vecTau_RawIsoMVA3oldDMwoLT, vecTau_RawIsoMVA3oldDMwLT, vecTau_RawIsoMVA3newDMwoLT, vecTau_RawIsoMVA3newDMwLT;
nPhoton, vecPhoton_PT, vecPhoton_Eta, vecPhoton_Phi, vecPhoton_Hovere, vecPhoton_Sthovere, vecPhoton_HasPixelSeed, vecPhoton_IsConv, vecPhoton_PassElectronVeto;
nMctruth, vecMctruth_PT, vecMctruth_Eta, vecMctruth_Phi, vecMctruth_Id_1, vecMctruth_Id_2, vecMctruth_X_1, vecMctruth_X_2, vecMctruth_PdgId, vecMctruth_Status, vecMctruth_Y, vecMctruth_Mass, vecMctruth_Mothers.first, vecMctruth_Mothers.second;
nJets, vecJet_PT, vecJet_Eta, vecJet_Phi, vecJet_D0, vecJet_Dz, vecJet_nCharged, vecJet_nNeutrals, vecJet_nParticles, vecJet_Beta, vecJet_BetaStar, vecJet_dR2Mean, vecJet_Q, vecJet_Mass, vecJet_Area, vecJet_Energy, vecJet_chEmEnergy, vecJet_neuEmEnergy, vecJet_chHadEnergy, vecJet_neuHadEnergy, vecJet_ID, vecJet_Num, vecJet_mcFlavor, vecJet_GenPT, vecJet_GenEta, vecJet_GenPhi, vecJet_GenMass, vecJet_flavorMatchPT, vecJet_JEC, vecJet_MatchIdx;
nPF, vecPF_PT, vecPF_Eta, vecPF_Phi, vecPF_Mass, vecPF_E, vecPF_Q, vecPF_PfType, vecPF_EcalE, vecPF_HcalE, vecPF_ndof, vecPF_Chi2, vecPF_pvId, vecPF_X, vecPF_Y, vecPF_Z, vecPF_JetNum;
fMET_PT, fMET_Eta, fMET_Phi;
HLT_Mu17_Mu8, HLT_Mu24, HLT_MET120_v, HLT_Ele27, HLT_HT350.
For the datasets containing data from real collisions at the LHC, the following variables are NOT contained:
nMctruth, vecMctruth_PT, vecMctruth_Eta, vecMctruth_Phi, vecMctruth_Id_1, vecMctruth_Id_2, vecMctruth_X_1, vecMctruth_X_2, vecMctruth_PdgId, vecMctruth_Status, vecMctruth_Y, vecMctruth_Mass, vecMctruth_Mothers.first, vecMctruth_Mothers.second;
vecJet_mcFlavor, vecJet_GenPT, vecJet_GenEta, vecJet_GenPhi, vecJet_GenMass, vecJet_flavorMatchPT, vecJet_JEC, vecJet_MatchIdx.
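As a rough illustration of how the muon variables listed above might be used, the sketch below selects events with exactly two oppositely charged muons and computes their invariant mass with the muon mass neglected. It runs on a small toy DataFrame with invented values rather than on the published files. The storage layout is an assumption: the `vec*` columns are taken to hold one list per event, and the helper names `dimuon_mass` and `select_opposite_sign_dimuons` are hypothetical.

```python
import math

import pandas as pd


def dimuon_mass(pt, eta, phi):
    """Invariant mass of two muons in the massless approximation:
    m^2 = 2 * pT1 * pT2 * (cosh(eta1 - eta2) - cos(phi1 - phi2))."""
    m2 = 2.0 * pt[0] * pt[1] * (
        math.cosh(eta[0] - eta[1]) - math.cos(phi[0] - phi[1])
    )
    return math.sqrt(max(m2, 0.0))


# Toy stand-in for a filtered DataFrame; all values are invented, and the
# one-list-per-event layout of the vec* columns is an assumption.
events = pd.DataFrame({
    "nMuon": [2, 1, 2],
    "vecMuon_PT": [[45.0, 45.0], [30.0], [20.0, 10.0]],
    "vecMuon_Eta": [[0.0, 0.0], [1.2], [0.5, -0.5]],
    "vecMuon_Phi": [[0.0, math.pi], [0.3], [1.0, 1.0]],
    "vecMuon_Q": [[1, -1], [1], [1, 1]],
})


def select_opposite_sign_dimuons(df):
    """Keep events with exactly two muons of opposite charge."""
    two = df[df["nMuon"] == 2]
    opposite = two["vecMuon_Q"].apply(lambda q: q[0] * q[1] < 0)
    selected = two[opposite].copy()
    selected["dimuon_mass"] = selected.apply(
        lambda row: dimuon_mass(
            row["vecMuon_PT"], row["vecMuon_Eta"], row["vecMuon_Phi"]
        ),
        axis=1,
    )
    return selected


selected = select_opposite_sign_dimuons(events)
```

In the toy data only the first event passes the selection; the second has a single muon and the third has two muons of the same charge.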
Atlas on slavery in French and Spanish territories of Santo Domingo from the 16th century to the end of the 18th century
The project aims to create an atlas on slavery in Santo Domingo by compiling information on historical events in its French and Spanish territories from the 16th to the late 18th century. It responds to the growing need for deeper research into the history of slavery, particularly within the framework of UNESCO’s ‘Slave Route’ project. While Caribbean countries have increasingly evaluated their slavery-related heritage, results from the Dominican Republic and Haiti remain fragmented due to political and cultural challenges. The atlas consolidates these findings and expands existing perspectives by documenting manieles—refuge settlements of maroons—thus highlighting traces left by enslaved people themselves, in contrast to the dominant focus on colonial remains. By mapping manieles alongside official colonial installations and showing their spatial and temporal development, the atlas helps reveal known and previously unknown communication networks across the island. The results will be made available on a dedicated website, providing a new foundation for future research on slavery in the Caribbean.
Website: https://storymaps.arcgis.com/stories/7fcb77bc1c4d46aba4bb78
5124cccfb
Carbon stocks in aboveground and belowground pools in diverse land uses in the Atlantic Forest in Brazil
This dataset contains the carbon stocks in different pools, including soil up to 300 cm deep, under different land uses in the Atlantic Forest in Brazil. The data were collected in a rural settlement belonging to the MST (Landless Workers' Movement, or Movimento dos Trabalhadores Sem Terra in Portuguese), the Ipanema Settlement, in Iperó, state of São Paulo, Brazil. The farmers implement different approaches to restore the land. This study evaluated carbon sequestration after land restoration, that is, the climate change mitigation potential of different restoration approaches, when deep soil (300 cm) is included in the assessment. The unit of the carbon stocks is Mg ha⁻¹. Soil and roots were collected in 6 layers, as follows: 0-20, 20-40, 40-100, 100-150, 150-200 and 200-300 cm. In addition to soil, carbon stocks were also determined in the aboveground biomass, necromass, fine roots and coarse roots. Six different land uses were assessed:
1) agriculture: areas under long-term (> 25 years) agriculture;
2) agroforestry system in intermediate stage: a successional agroforestry system, which is a mix of native and exotic fruit species planted in rows with cropland between the rows, approximately 5 years old;
3) agroforestry system in advanced stage: the same as land use 2 but at a later stage (approximately 19 years old), where the trees have grown and the canopy has closed, forming a structure similar to a secondary forest;
4) reforestation with mixed native species: native species planted in rows with regular spacing (usually 3 meters wide), around 16 years old;
5) natural regeneration: areas set aside where regeneration of trees took place, approximately 17 years after fencing off;
6) secondary forest: remnant forest patches, without management or targeted human interference.
Land uses 2, 3, 4 and 5 are different restoration approaches and were under agriculture (land use 1) before the land use change.
A framework for the emergence and analysis of language in social learning agents
Neural systems have evolved not only to solve environmental challenges through internal representations but also, under social constraints, to communicate these to conspecifics. In this work, we aim to understand the structure of these internal representations and how they may be optimized to transmit pertinent information from one individual to another. To this end, we build on previous teacher-student communication protocols to analyze the formation of individual and shared abstractions and their impact on task performance. We use reinforcement learning in grid-world mazes where a teacher network passes a message to a student to improve task performance. This framework allows us to relate environmental variables to individual and shared representations. We compress high-dimensional task information into a low-dimensional representational space to mimic natural language features. Consistent with previous results, we find that providing teacher information to the student leads to a higher task completion rate and an ability to generalize to tasks it has not seen before. Further, optimizing message content to maximize student reward improves information encoding, suggesting that an accurate representation in the space of messages requires bi-directional input. These results highlight the role of language as a common representation among agents and its implications for generalization capabilities.
Code of "Expected Effects of a Global Transformation of Agricultural Pest Management"
This dataset provides code to replicate the analysis of Möhring et al. (2025) on the expected effects of a global transformation of agricultural pest management, based on an online survey conducted in 2022 with 517 senior scientific experts from key agricultural regions and disciplines. The assessment framework covers 24 indicators in the economic, human health, food security, social, and environmental domains. The dataset is anonymized. It further contains information on respondent characteristics and, from the literature, covariates for the socio-economic and environmental state of the assessed regions. The data were collected with LimeSurvey to assess the expected effects of a global transformation to pest management with zero or minimal pesticide use. This is a pressing challenge in global agriculture and relates to national and global policy targets on pesticide reduction.
The R code can be used to replicate the results of the analysis.