1056 research outputs found
Dataset for "Impact of microplastic fibres on direct membrane filtration of low-strength primary wastewater"
The paper associated with this dataset describes the effects of microfibres, a common type of microplastics, in direct membrane filtration of wastewater. The dataset contains: a) experimental data from membrane performance testing and calculations (transmembrane pressure, fouling resistance, fouling constants, cake foulant mass, specific cake resistance), and b) material characterisation data for microfibres, membranes and foulants (Fourier Transform Infrared Spectroscopy (FT-IR) spectra, Scanning Electron Microscope (SEM) images, Total Organic Carbon (TOC) concentrations, fluorescence Excitation-Emission Matrices).
The experimental details, including procedures and conditions, are fully described in the associated paper.
This research was supported by the United Kingdom – Iceland Arctic Science Partnership Scheme 2024/2025.
Dataset for "Distortion/Interaction Analysis via Machine Learning"
Machine learning (ML) has previously been applied to predict reaction barriers for a variety of chemical reactions. This is seen as the end point for this type of study; however, post-reaction barrier analysis and energy decomposition approaches can provide insight into chemical reactivity. One such approach that has previously been used to provide information on chemical reactivity, for cycloaddition reactions in particular, is distortion/interaction-activation strain analysis (DIAS). We demonstrate that ML can be coupled with cheap and rapid semi-empirical quantum mechanical methods (SQM) to predict distortion and interaction energies at a fraction of the computational cost associated with running density functional theory (DFT) calculations. This dataset includes all the structural data in the form of Gaussian16 (Revisions A.03 and C.01) output files for the four datasets used in this work and for the literature dataset reactions.
Ground state reactant and transition state geometries for dimethyl malonate Michael addition reactions were built using Schrödinger’s R-Group Enumeration. R-groups were placed on various positions of the Michael acceptor. Once generated, structures were conformationally searched using Schrödinger’s MacroModel (version 12.7) with OPLS3e. The lowest energy conformation for every structure was subsequently optimised with Gaussian16 (Revisions A.03 and C.01) at the AM1 (IEFPCM=Water)//AM1 and wB97X-D/def2-TZVP (IEFPCM=Water)//wB97X-D/def2-TZVP levels.
For distortion/interaction-activation strain calculations, Python code (available on the associated GitHub page: https://github.com/the-grayson-group/distortion-interaction_ML) was used to separate the distorted reactant structures. Single point energies were then calculated with Gaussian16 (Revision C.01) at the AM1 level and at the DFT level of theory used in the original transition structure calculation, in both cases in solvent.
The data have been re-uploaded to correct an error with ts_100_dft.log in the malonate data set.
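As a rough illustration of the decomposition described above, the sketch below splits an activation energy into distortion and interaction terms from single point energies. It is not the code from the associated GitHub repository: the function name `dias`, its argument layout and the example energies are hypothetical, and it assumes the standard definitions in which distortion is the cost of deforming each reactant to its transition-state geometry and interaction is the energy of bringing the distorted reactants together.

```python
HARTREE_TO_KCAL = 627.5095  # conversion factor, hartree -> kcal/mol


def dias(e_ts, e_ground, e_distorted):
    """Split an activation energy into distortion and interaction terms.

    e_ts        : energy of the transition state (hartree)
    e_ground    : energies of the relaxed, isolated reactants (hartree)
    e_distorted : single point energies of the same reactants frozen in
                  their transition-state geometries (hartree)
    Returns a dict of energies in kcal/mol.
    """
    if len(e_ground) != len(e_distorted):
        raise ValueError("need one distorted energy per ground-state reactant")

    # Energy needed to deform every reactant into its TS geometry.
    distortion = sum(d - g for d, g in zip(e_distorted, e_ground))
    # Energy change when the distorted reactants are combined into the TS.
    interaction = e_ts - sum(e_distorted)
    # Barrier relative to the separated, relaxed reactants.
    activation = e_ts - sum(e_ground)

    return {
        "distortion": distortion * HARTREE_TO_KCAL,
        "interaction": interaction * HARTREE_TO_KCAL,
        "activation": activation * HARTREE_TO_KCAL,
    }


# Made-up energies for a two-reactant system, purely for illustration.
result = dias(
    e_ts=-500.00,
    e_ground=[-300.02, -200.01],
    e_distorted=[-300.00, -199.99],
)
```

By construction the activation energy equals the sum of the distortion and interaction terms, which is a useful consistency check when energies are parsed from separate output files.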
Dataset for "The Catalytic Enantioselective [1,2]-Wittig Rearrangement Cascade of Allylic Ethers"
This data set includes output files from the quantum chemical calculations run with Gaussian16 (Revision C.01) that support our computational mechanistic study of the enantioselective [1,2]-Wittig rearrangement of allylic ethers. It also contains three sets of in situ reaction monitoring data (collected by University of St Andrews contributors) and a Python script that fits the rate constants of a first-order kinetics model to the experimental data.
Structures were computed in Gaussian16 (Revision C.01) with ONIOM. The full DFT single point energies were also run in Gaussian16 (Revision C.01). 1H NMR spectroscopy was used to collect the in situ reaction monitoring data.
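For readers unfamiliar with this kind of fit, the sketch below estimates a single first-order rate constant from concentration-time data by linear least squares on ln(c). It is not the script shipped with the dataset, which may fit a multi-step model: the function name `fit_first_order` and the synthetic data are hypothetical, and a single step A → B with c(t) = c0·exp(−k·t) is assumed.

```python
import math


def fit_first_order(times, conc):
    """Fit ln(c) = ln(c0) - k*t by ordinary least squares.

    times : sample times (any consistent unit)
    conc  : positive concentrations measured at those times
    Returns (k, c0), with k in reciprocal time units.
    """
    n = len(times)
    if n < 2 or n != len(conc):
        raise ValueError("need at least two matching (time, conc) points")

    logs = [math.log(c) for c in conc]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n

    # Slope of the regression line through (t, ln c).
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx

    k = -slope
    c0 = math.exp(y_mean - slope * t_mean)
    return k, c0


# Synthetic, noise-free decay used only to exercise the function.
t_obs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
c_obs = [0.5 * math.exp(-0.1 * t) for t in t_obs]
k_fit, c0_fit = fit_first_order(t_obs, c_obs)
```

The log-linear form weights low concentrations more heavily than a direct non-linear fit would, so for noisy NMR integrals a non-linear least-squares routine may be the better choice.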
Dataset for "Public perceptions of biospecimen sampling and uncertainty in the context of personalised nutrition"
This dataset contains survey responses collected via the Prolific platform for a study investigating public attitudes towards personalised nutrition (PN). The study employed a 2 (scientific certainty: certain vs. uncertain) x 3 (biological sample type: urine vs. blood vs. stool) between-subjects quasi-experimental design. Participants were presented with a short vignette where these two conditions were manipulated (more information provided in the associated research publication).
Following participant demographic questions, the datasheet captures responses from a free association task, where individuals generated words in response to the vignette and then rated the affect associated with each word. The dataset also includes a series of outcome measures assessing participant attitudes on key aspects of personalised nutrition. These were measured using existing validated scales and include attitudes towards data security and privacy, the veracity of PN claims, equality of access to PN, perception of benefits, intention to adopt PN, and the perceived efficacy of trust and regulation in the industry.
Data collection methodology is available in the associated open-access publication.
Dataset for "RetroSketch: A Retrospective Method for Measuring Emotions and Presence in Virtual Reality"
This aggregated data file contains data from 140 participants, collected and analysed in the CHI 2025 paper "RetroSketch: A Retrospective Method for Measuring Emotions and Presence in Virtual Reality".
Each participant completed two half-hour sessions and continuous measures for both sessions were aggregated in 60-second intervals, resulting in 31 rows per session and 62 rows per participant. The measures used in this study fall into three groups:
- Pre-measures include demographic information, gaming experience and preferences, personality and gamer traits, baseline emotions and physiology.
- Exposure measures include Retrospective method emotion and keypoint measures, experience sampling method (ESM) measures, and various physiological measures.
- Post-measures include measures of flow state, intrinsic motivation, multimodal presence, and simulator sickness, and participants' qualitative evaluations of RetroSketch and ESM measures.
For the methodology and apparatus of the data collection, please refer to the associated paper.
The study required participants to play one of five VR games (Assetto Corsa Competizione, Garden of the Sea, Half-Life Alyx, I Expect You To Die, and Red Matter) over two 30-minute sessions using a Vive Pro Eye VR headset and controllers.
Physiological measures were collected using the eye tracker in the VR headset (pupillometry), a Shimmer3 GSR+ tethered to a participant's middle and ring fingers (EDA), a Polar H10 HR monitor chest strap (HR and HRV), and a Vive face tracker (facial tracking). All physiological measures were sent over serial and Bluetooth (BLE protocol) to a PC (Intel 13900K, Nvidia RTX 4090 and 64GB of DDR5 RAM) running the Unity data collection application, which recorded all measures at a sample rate of approximately 40-50 Hz using the EmoSense SDK.
R (v4.4.1) was used to analyse the data.
The main aggregate data file consists of the following measures:
Meta Data: Participant ID, session number, condition (Retro or Retro_ESM), and game played.
Pre-measures: Demographics, baseline emotions, big 5 personality traits (B5), immersive tendencies (ITQ), game genre preference, Brain Hex gamer types, Tondello gamer traits, baseline simulator sickness (SSQ), and baseline physiology (calibration).
Exposure measures: Retro emotion measures (aggregated at every 60-second interval, both over the 'Prior' 60s before the interval and over a 'Window' of 30s before and after the interval), Retro keypoint measures, experience sampling method (ESM) measures, and physiological measures aggregated for the same 60-second intervals (both 'Prior' and 'Window').
Post measures: PPL flow state (PPL-FSQ), flow state short (FSS), intrinsic motivation (IMI), multimodal presence (MPS), simulator sickness (SSQ), and qualitative questions comparing RetroSketch and ESM.
For further details about the measures, please refer to the associated paper.
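The 'Prior' and 'Window' aggregation described above can be sketched as follows. This is not the authors' pipeline (their analysis was done in R): the function name `aggregate_intervals`, the input layout of (time, value) pairs, the use of a mean, and the half-open boundary handling are all assumptions. The interval marks are taken to run from 0 to 1800 seconds inclusive, which is one way to arrive at the 31 rows per session stated in the description.

```python
from statistics import fmean


def aggregate_intervals(samples, session_s=1800, step_s=60):
    """Aggregate a continuous measure at fixed interval marks.

    samples : iterable of (time_s, value) pairs for one session
    Returns one row per mark with two means:
      prior  : samples in the step_s seconds before the mark
      window : samples within step_s/2 seconds either side of the mark
    A mean is None when no samples fall in the range.
    """
    samples = list(samples)
    half = step_s / 2
    rows = []
    for mark in range(0, session_s + 1, step_s):
        prior = [v for t, v in samples if mark - step_s <= t < mark]
        window = [v for t, v in samples if mark - half <= t < mark + half]
        rows.append({
            "mark_s": mark,
            "prior": fmean(prior) if prior else None,
            "window": fmean(window) if window else None,
        })
    return rows


# Synthetic 1 Hz signal whose value equals its timestamp, for illustration.
demo = [(t, float(t)) for t in range(1800)]
demo_rows = aggregate_intervals(demo)
```

With this convention the first row has no 'Prior' value, because nothing precedes time zero; how the published file treats that edge is described in the associated paper rather than assumed here.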
Dataset for "Understanding the role of aligned porosity on the intrinsic and extrinsic contributions to the dielectric permittivity of freeze-cast ferroelectrics"
This dataset is a part of the research article 'Understanding the role of aligned porosity on the intrinsic and extrinsic contributions to the dielectric permittivity of freeze-cast ferroelectrics'. It contains comprehensive characterisation data for ferroelectric lead zirconate titanate (PZT) NCE51 ceramic, fabricated using both freeze-casting and a conventional solid-state route. This dataset contains polarisation-electric field hysteresis loops, impedance spectroscopy data and X-ray diffraction (XRD) patterns, which provide insights into how the microstructure of freeze-cast samples affects the functional properties of the porous ceramics.
In addition, the dataset contains the results of a finite element model, demonstrating how the local field distribution differs between the structures produced via freeze-casting and the burnt-out polymer spheres (BURPS) technique, despite having the same relative density. This difference demonstrates how the measured permittivity differs between these microstructures.
The dataset supports further analysis of the processing-microstructure-property relationships in porous ferroelectric ceramics. This may be of interest to researchers working on design and characterisation of advanced ferroelectric composites.
Version 2 restores certain characterisation data omitted from version 1.
Full details of the methodology can be found in Section 2 of the associated research article.
Version 2 introduces the file Freeze-cast_NCE51_prel.xlsx; all other files are unchanged from version 1.
Dataset for "Basal Metabolic Requirements, Biomarkers of Cardiometabolic Health, and Anthropometric Measures of Obesity in Women and Men With Restricted Growth Conditions"
Data included are: participant age, height, body mass, waist and hip circumference, sagittal abdominal diameter, resting metabolic rate, and blood plasma glucose, insulin, triacylglycerol, total cholesterol, high-density lipoprotein-cholesterol, C-reactive protein, leptin and thyroxine concentrations.
Please see the associated paper and README file.
Dataset for “Why AGG is associated with high transgene output: passenger effects and their implications for transgene design”
In bacteria, high A and low G content of the 5′ end of the coding sequence (CDS) promotes low RNA stability, facilitating ribosomal initiation and subsequently a high protein to transcript ratio. Additionally, 5′ NGG codons are suppressive owing to peptidyl-tRNA drop off. It was, therefore, surprising that the first large-scale transgene experiment to interrogate the 5′ effect by codon randomization found the NGG, G-rich codon AGG to be the most associated with high transgene output.
In this study we show that this is not replicated in other large transgene datasets, where AGG and NGG are associated with low efficiency. More generally, there is limited agreement between the first experiment and others. This we find to be a consequence of non-random construct design. The results of this research have implications for both transgene and experimental design.
Please see the associated paper.
Dataset for "RoundaboutHD: High-Resolution Real-World Urban Environment Benchmark for Multi-Camera Vehicle Tracking"
The multi-camera vehicle tracking (MCVT) framework holds significant potential for smart city applications, including anomaly detection, traffic density estimation, and suspect vehicle tracking. However, current publicly available datasets exhibit limitations, such as overly simplistic scenarios, low-resolution footage, and insufficiently diverse conditions, creating a considerable gap between academic research and real-world scenarios. To fill this gap, we introduce RoundaboutHD, a comprehensive, high-resolution multi-camera vehicle tracking benchmark dataset specifically designed to represent real-world roundabout scenarios. RoundaboutHD provides a total of 40 minutes of labelled video footage captured by four non-overlapping, high-resolution (4K resolution, 15 fps) cameras. In total, 512 unique vehicle identities are annotated across different camera views, offering rich cross-camera association data. RoundaboutHD offers temporally consistent video footage and enhanced challenges, including increased occlusions and nonlinear movement inside the roundabout. In addition to the full MCVT dataset, several subsets are also available for object detection, single-camera tracking, and image-based vehicle re-identification (ReID) tasks. Vehicle model information and camera modelling/geometry information are also included to support further analysis. We provide baseline results for vehicle detection, single-camera tracking, image-based vehicle re-identification, and multi-camera tracking. The dataset is publicly available.
The dataset was collected using fixed-position, real-world traffic cameras located in Indiana, USA, provided by an industrial partner under a collaborative agreement. The video footage was captured under natural driving conditions, without experimental interference, to reflect realistic urban traffic patterns. All annotations were manually curated using a custom-built semi-automated labeling toolkit developed specifically for this project.
This tool significantly enhanced annotation efficiency while ensuring high labeling accuracy. The labeling process included object detection, tracking, and identity association across multiple cameras.
No third-party datasets are used.
The data was collected using fixed-position traffic surveillance cameras provided by an industrial partner. Each camera recorded 4K-resolution video at 15 frames per second under real-world traffic conditions. The annotation process was conducted using a custom-built semi-automated labeling toolkit developed in Python, running on Ubuntu 20.04. Key libraries and frameworks used include OpenCV, NumPy, and Matplotlib for visualization and annotation support.
To view and utilize the dataset, users will require basic tools for handling image and text data (e.g., Python with OpenCV) and a machine with sufficient storage and memory to process high-resolution video frames and annotation files. The dataset follows YOLO-style text annotations for detection tasks and CSV-format files for tracking metadata.
For reproducibility, we provide the labeling toolkit and evaluation scripts in the associated GitHub repository, along with documentation detailing the annotation format and dataset structure.
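As a starting point for reading the YOLO-style detection annotations mentioned above, the sketch below converts one annotation line into a pixel-space bounding box. It assumes the common YOLO text convention of `class x_center y_center width height` with coordinates normalised to the range 0 to 1, and a 4K frame of 3840×2160 pixels; both are assumptions, as is the function name `yolo_to_pixels`, so the repository documentation should be checked for the dataset's exact layout.

```python
def yolo_to_pixels(line, img_w=3840, img_h=2160):
    """Convert one YOLO-style annotation line to a pixel bounding box.

    line : "class x_center y_center width height", with the four box
           values normalised by the image width and height
    Returns (class_id, x_min, y_min, x_max, y_max) in pixels.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")

    class_id = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:])

    # Move from centre/size form to corner form, then scale to pixels.
    x_min = (xc - w / 2) * img_w
    y_min = (yc - h / 2) * img_h
    x_max = (xc + w / 2) * img_w
    y_max = (yc + h / 2) * img_h
    return class_id, x_min, y_min, x_max, y_max


# Hypothetical annotation line, not taken from the dataset.
box = yolo_to_pixels("0 0.5 0.5 0.25 0.5")
```

The resulting corner coordinates can be passed to drawing routines such as OpenCV's `cv2.rectangle` after rounding to integers.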
Data collected for "Providing an Eyewitness Testimony as an Individual who Stammers: Examining Accuracy/Completeness and Subjective Experiences"
Stammering may impede an individual's eyewitness testimony and reduce jurors' perceptions of their credibility through a complex interplay of bio-psycho-social factors. However, no research to date has explored this. Three co-produced, mixed-methods studies are reported, investigating the evidential quality, lived experiences and perceived credibility of people who stammer (PWS) as witnesses. In pre-registered Study 1, PWS recalled as much correct information as non-stammering witnesses overall. However, during the free – but not cued – recall interview phase, PWS provided fewer correct details. A reflexive thematic analysis of participants' post-testimony reflections captured how PWS experienced a cyclical relationship between communicative pressure, anxiety over listener misperceptions and stammer severity, which they navigated either by employing avoidance strategies at the expense of testimony or by speaking through their stammer. In pre-registered Study 2, mock jurors rated PWS as less confident yet more likeable and trustworthy than non-stammering witnesses. In Study 3, providing jurors with information about stammering further improved their likeability and trustworthiness but had no impact on perceived confidence. Findings provide new insight into communication disorders in legal contexts – and the unique challenges faced by PWS in particular – demonstrating the need for systemic accommodations and targeted training for legal professionals.
This dataset contains:
* A SAV file with participant demographics, cognitive ability scores, and the data on completeness, errors, and accuracy of participants' testimony accounts in overall, free, and cued recall phases.
* Another SAV file containing the number of correct details and errors, and accuracy (%) of 12 testimony accounts (6 accounts from each group) coded by two independent raters for inter-rater reliability.
* Transcripts (docx) of the semi-structured interviews conducted in Study 1b.
* The post-testimony survey responses (pdf) provided by participants who stammer.
The primary aim of the first part of this study is to examine the accuracy and completeness of accounts in an eyewitness testimony setting between people who stammer and those who do not. Therefore, there will be one between-subject factor (people who stammer vs. people who do not stammer).
Participants will engage in the same research procedure for the first part of the study: they will watch a video of a mock crime and then be interviewed about their memory of it using a standard eyewitness interviewing procedure, in which their free recall of the event is prompted first, followed by questions based on what they freely recalled. The main planned analysis will examine group effects on recall of details across the interview as a whole (i.e., free and cued recall combined). Further exploratory analyses will examine whether there are group differences in the recall of correct details, errors, and accuracy in eyewitness testimony accounts within free and cued recall (respectively).
The second part of this study will use a mixed-methods approach based on an online survey, consisting of both Likert scale questions and open-ended text-based questions. The analysis of participants’ written responses will be primarily exploratory and qualitative. Quantitative (Likert scale) data will also be reported.