TU Wien Research Data
512 research outputs found
Dataset Evaluating Human–Machine Collaboration through a Comparative Analysis of Experts, Machine Learning, and Hybrid Approaches in Real Estate Valuation
<h2>Dataset description</h2>
<p><span>The dataset was collected to support controlled experiments evaluating the predictive performance and efficiency of different residential property valuation approaches. Specifically, it enables a direct comparison between an AI-based price prediction model, human real estate experts, and a hybrid human–machine approach. </span></p>
<p><span>The underlying machine-learning model was trained on 21,736 apartment transactions from Vienna covering the period 2018–2022. This transaction data, originally compiled and processed for the study <em><span>“Location, Location, Location: The Power of Neighborhoods for Apartment Price Predictions Based on Transaction Data”</span></em> published in the ISPRS International Journal of Geo-Information, served as the empirical basis for model development.</span></p>
<p><span>Building on this foundation, the present dataset focuses on the <strong><span>experimental evaluation phase</span></strong> rather than transfer learning. It contains expert assessments of newly built apartments sold in Vienna in 2023, collected under three experimental conditions: (i) limited information, (ii) state-of-the-art expert valuation methods, and (iii) collaboration between experts and the ML model. The dataset further includes the corresponding model predictions and ground-truth transaction prices, enabling a systematic comparison of predictive accuracy and task efficiency across valuation strategies.</span></p>
<p><span>This dataset was used to analyze the relative strengths of standalone ML models, human expertise, and hybrid human–AI collaboration in residential price prediction, with particular emphasis on accuracy, robustness, and time efficiency.</span></p>
<h3>Context and methodology</h3>
<ul>
<li>The data set was created to predict apartment prices 1 to 7 years into the future</li>
<li>The data set was used to test transfer learning capabilities</li>
<li>Data were collected from apartment ownership transactions and enriched with contextual information from OpenStreetMap. The added features were selected based on valuation experience and discussions of potentially relevant factors</li>
<li>All personal data were removed from the expert survey and the transaction data</li>
</ul>
<h3>Technical details</h3>
<ul>
<li>CSV file with raw data; further explanation in ReadMe.txt</li>
<li>Python script to analyse the data: PSFL</li>
</ul>
<h3>Licenses</h3>
<ul>
<li>Data: CC BY 4.0 International</li>
<li>Code: PSFL 2.0</li>
</ul>
Conversational Recommender Systems Using Generative Models (Gen-CRS): Literature Review
<h2>Description</h2>
<p>This dataset contains a curated list of 49 research papers focused on Conversational Recommender Systems using Generative Models (Gen-CRS). The collection covers publications from 2018 to 2025 and reflects the rapid evolution of generative approaches in conversational recommendation scenarios.</p>
<p><br>The dataset was compiled in the context of the literature review “<a href="https://www.researchgate.net/publication/398319946_Conversational_Recommender_Systems_Using_Generative_Models_Gen-CRS_A_Literature_Review" target="_blank" rel="noopener">Conversational Recommender Systems Using Generative Models (Gen-CRS): A Literature Review</a>” and the tutorial “<a href="https://dl.acm.org/doi/10.1145/3705328.3748010" target="_blank" rel="noopener">A Tutorial on Recent Advances in Generative Conversational Recommender Systems</a>”, presented at the <a href="https://recsys.acm.org/recsys25/tutorials/" target="_blank" rel="noopener">ACM RecSys conference 2025</a>. It serves as the bibliographic foundation for both contributions and is intended to support transparency, reproducibility, and further research in this area.</p>
<p>Each entry in the dataset corresponds to a single paper relevant to Gen-CRS, and the selection process, collection methodology, and inclusion criteria are provided in the accompanying literature review paper.</p>
<h2>Dataset Structure</h2>
<p>The dataset is organized in a tabular format, where each row corresponds to a single publication included in the literature collection. Rows contain the essential bibliographic metadata required to identify and retrieve the original paper.</p>
<h3>The dataset includes the following columns:</h3>
<ul>
<li><strong>Paper title:</strong> Full title of the publication</li>
<li><strong>Author(s):</strong> Names of all authors as reported in the original paper</li>
<li><strong>Year:</strong> Year in which the paper was published</li>
<li><strong>Published at:</strong> Conference, workshop, journal, or other venue where the work appeared</li>
<li><strong>Reference Link/DOI:</strong> Persistent link to access the published document (e.g., DOI, publisher URL, or preprint reference)</li>
</ul>
Information Delivery Specification (IDS) dataset for code compliance checking in context of the City of Vienna
<h2>Description</h2>
<p>The published data is research data from the publication "Information Delivery Specification (IDS) for code compliance checking in context of the City of Vienna", which was submitted to the Digital Building Permit conference 2025. This paper discusses the potential of using the open standard Information Delivery Specification (IDS) for automated code compliance checking (ACC).</p>
<p>IDS is an open specification based on XML for defining and verifying information requirements for digital building models in the IFC format (an open format for BIM models). Although IDS is initially intended for checking information requirements, it has the potential to be used for ACC as well. We have explored the possibility of recreating the checking rules implemented in Solibri Office using IDS. Additionally, we considered an existing extension to the IDS schema to improve its functionality.</p>
<p>This dataset contains the results of the mentioned research paper. This includes the created IDS files for the standard and for the extended IDS schema. The extended IDS schema was published in October 2024: <a href="https://doi.org/10.48436/012bs-f6776">https://doi.org/10.48436/012bs-f6776</a></p>
<h3>Technical details</h3>
<ul>
<li>The dataset contains 2 zip files:
<ul>
<li>One containing 8 standard IDS files</li>
<li>One containing 5 extended IDS files</li>
</ul>
</li>
<li>The IDS files cover the OIB guidelines 2, 2.1, 2.2, 2.3 and 4.</li>
<li>The IDS files for OIB guideline 2.3 are divided according to the chapters 2 to 5.</li>
</ul>
<h3>Further details</h3>
<ul>
<li>The 5 IDS files named "extended IDS" correspond to the adapted IDS schema. Therefore, commercial IDS software cannot process these files.</li>
</ul>
LidarScout: Direct Out-of-Core Rendering of Massive Point Clouds
<h2>LidarScout: Direct Out-of-Core Rendering of Massive Point Clouds</h2>
<p>This dataset contains data related to our LidarScout paper at High-Performance Graphics 2025. The paper proposes a method for instantly exploring huge, compressed point clouds (LAZ) without any pre-processing or extra storage. The dataset contains:</p>
<ul>
<li>LidarScout executable for Windows (requires CUDA)</li>
<li>A screen-capture video as MP4</li>
<li>Source code for the viewer (GitHub snapshot from 2025-09-25, <a href="https://github.com/cg-tuwien/lidarscout">https://github.com/cg-tuwien/lidarscout</a>, mostly C++ and CUDA)</li>
<li>Source code for the training (GitHub snapshot from 2025-09-25, <a href="https://github.com/cg-tuwien/lidarscout_training">https://github.com/cg-tuwien/lidarscout_training</a>, mostly Python)</li>
<li>Trained models for all variants as PyTorch checkpoints (.ckpt) and TorchScript (.pt)</li>
<li>Training and testing datasets based on AHN5, Bund_BoraPk, CA13, ID15_Bunds, NZ23_Gisborne, BR17_SaoPaulo, and SwissSurface3D as CSV, PLY, and BIN files.</li>
<li>Testing results for all model variants and all test datasets as PLY and NPY files</li>
</ul>
<p>More detailed information on the folder and file structure can be found in the README of the training repository.</p>
<p>More general information: <a href="https://www.cg.tuwien.ac.at/research/publications/2025/erler-2025-lidarscout/">https://www.cg.tuwien.ac.at/research/publications/2025/erler-2025-lidarscout/</a></p>
<p>There are some scripts provided in the code archive for very interested users; they may require a little bit of tinkering to get them to run.</p>
<h2>Licenses</h2>
<p>The data is licensed under CC-BY 4.0, the code is licensed under MIT.</p>
<h2>Acknowledgements</h2>
<p>We thank the following data set providers: Bunds et al. and Open Topography for the Bund_Bora [BDG+19] and ID15_Bunds [BDG+20] data sets; PG&E and Open Topography for CA13 [Pac13]; The Ministry of Business, Innovation and Employment and Toitū Te Whenua Land Information New Zealand and Open Topography for Gisborne [MoBE24]; The São Paulo City Hall (PMSP) and Open Topography for São Paulo [PM17]; The Bundesamt für Landestopografie swisstopo for swissSURFACE3D [Swi20].</p>
<p>We thank Paul Guerrero, Pedro Hermosilla, and Adam Celarek for their valuable inputs. Further, we thank Stefan Ohrhallinger for running reconstructions with BallMerge [POEM24].<br>This research has been funded by WWTF project ICT22-055 - Instant Visualization and Interaction for Large Point Clouds.</p>
Schniert Schúech affen Tantz (A-Wn_Mus.Hs._18688_n59) Audio recording
<h1>Audio recording of a lute piece from the E-LAUTE project</h1><h2>Overview</h2><p>This dataset contains an audio recording of the piece "Schniert Schúech affen Tantz", a 16th-century lute music piece originally notated in lute tablature, created as part of the E-LAUTE project (<a href="https://e-laute.info/">https://e-laute.info/</a>). The recording preserves historical lute music from the German-speaking regions between 1450 and 1550 and makes it accessible.</p><p>The recording is based on the work with the title "Schniert Schúech affen Tantz" and the id "A-Wn_Mus.Hs._18688_n59" in the e-lautedb. It is found on the page(s) or folio(s) 33v in the source "[Lautentabulatur des Stephan Craus]" with the source-id "A-Wn_Mus.Hs._18688".</p><p>The original source and multiple transcriptions of the work can be found on the E-LAUTE platform: <a href="https://edition.onb.ac.at/fedora/objects/o:lau.A-Wn_Mus.Hs._18688/methods/sdef:TEI/get?mode=n59" target="_blank">https://edition.onb.ac.at/fedora/objects/o:lau.A-Wn_Mus.Hs._18688/methods/sdef:TEI/get?mode=n59</a>.</p><p>Links to the source: <a href="http://data.onb.ac.at/rec/AC14316391" target="_blank">http://data.onb.ac.at/rec/AC14316391</a>, <a href="https://rism.online/sources/600141880" target="_blank">https://rism.online/sources/600141880</a>.</p><h2>Dataset Contents</h2><p>This dataset includes:</p><ul><li><strong>Audio file</strong>: An audio recording of the lute piece in .wav format</li> <li><strong>Metadata file</strong>: A metadata file with detailed information about the recording in .json format</li></ul><h2>About the E-LAUTE Project</h2><p><strong>E-LAUTE: Electronic Linked Annotated Unified Tablature Edition - The Lute in the German-Speaking Area 1450-1550</strong></p><p>The E-LAUTE project creates innovative digital editions of lute tablatures from the German-speaking area between 1450 and 1550. 
This interdisciplinary "open knowledge platform" combines musicology, music practice, music informatics, and literary studies to transform traditional editions into collaborative research spaces.</p><p>For more information, visit the project website: <a href="https://e-laute.info/">https://e-laute.info/</a></p>
[Saltarello] (A-Wn_Mus.Hs._18688_n28) Audio recording
<h1>Audio recording of a lute piece from the E-LAUTE project</h1><h2>Overview</h2><p>This dataset contains an audio recording of the piece "[Saltarello]", a 16th-century lute music piece originally notated in lute tablature, created as part of the E-LAUTE project (<a href="https://e-laute.info/">https://e-laute.info/</a>). The recording preserves historical lute music from the German-speaking regions between 1450 and 1550 and makes it accessible.</p><p>The recording is based on the work with the title "[Saltarello]" and the id "A-Wn_Mus.Hs._18688_n28" in the e-lautedb. It is found on the page(s) or folio(s) 19v-20r in the source "[Lautentabulatur des Stephan Craus]" with the source-id "A-Wn_Mus.Hs._18688".</p><p>The original source and multiple transcriptions of the work can be found on the E-LAUTE platform: <a href="https://edition.onb.ac.at/fedora/objects/o:lau.A-Wn_Mus.Hs._18688/methods/sdef:TEI/get?mode=n28" target="_blank">https://edition.onb.ac.at/fedora/objects/o:lau.A-Wn_Mus.Hs._18688/methods/sdef:TEI/get?mode=n28</a>.</p><p>Links to the source: <a href="http://data.onb.ac.at/rec/AC14316391" target="_blank">http://data.onb.ac.at/rec/AC14316391</a>, <a href="https://rism.online/sources/600141880" target="_blank">https://rism.online/sources/600141880</a>.</p><h2>Dataset Contents</h2><p>This dataset includes:</p><ul><li><strong>Audio file</strong>: An audio recording of the lute piece in .wav format</li> <li><strong>Metadata file</strong>: A metadata file with detailed information about the recording in .json format</li></ul><h2>About the E-LAUTE Project</h2><p><strong>E-LAUTE: Electronic Linked Annotated Unified Tablature Edition - The Lute in the German-Speaking Area 1450-1550</strong></p><p>The E-LAUTE project creates innovative digital editions of lute tablatures from the German-speaking area between 1450 and 1550. 
This interdisciplinary "open knowledge platform" combines musicology, music practice, music informatics, and literary studies to transform traditional editions into collaborative research spaces.</p><p>For more information, visit the project website: <a href="https://e-laute.info/">https://e-laute.info/</a></p>
Dataset for "Slow-Slope Reset Scheme for Highly-Sensitive CMOS Integrate-and-Dump Receiver OEIC"
<h1>Overview</h1>
<p>This repository provides measurement data and evaluated data related to our manuscript "<em>Slow-Slope Reset Scheme for Highly-Sensitive CMOS Integrate-and-Dump Receiver OEIC</em>" by Simon Michael Laube, Christoph Gasser, Kerstin Schneider-Hornstein, and Horst Zimmermann, published in IEEE Access, 2025, DOI: <a title="Slow-Slope Reset Scheme for Highly-Sensitive CMOS Integrate-and-Dump Receiver OEIC" href="https://doi.org/10.1109/ACCESS.2025.3602093">10.1109/ACCESS.2025.3602093</a>.</p>
<h1>Context</h1>
<p>In our study, we present the design and experimental verification of two optoelectronic integrated circuits (OEICs). The main difference between the OEICs is the reset method. The two methods are:</p>
<ol>
<li>Slow-Slope reset, denoted by "Improved"</li>
<li>Rectangular reset, denoted by "Rectangular"</li>
</ol>
<p>We measured the transient output voltage of the OEICs across optical input power with an oscilloscope, and stored the waveforms in HDF5 files. The bit error probability (BER) was evaluated from the transient measurements using post-processing in Python, as explained in our manuscript. Data rates of 100 Mb/s with 80% return-to-zero (RZ) on-off keying (OOK) modulation, and 250 Mb/s with 50% RZ OOK modulation were used for the BER measurements.</p>
<h1>File structure</h1>
<p>All measurement data is provided separately for each OEIC sample and data rate. Two samples were measured for each reset method. Please note that the OEIC samples have individual sample identifiers that are part of the file names. The sample identifiers are:</p>
<ol>
<li>Improved 1: E1_1</li>
<li>Improved 2: D1_1</li>
<li>Rectangular 1: G2_2</li>
<li>Rectangular 2: H2_2</li>
</ol>
<p>The main folders <code>/BER</code>, <code>/powermeter</code>, and <code>/waveforms</code> are provided.</p>
<h2>Waveforms</h2>
<p><code>/waveforms</code> contains raw waveform (transient measurement) data. Waveforms are stored as HDF5 files (.h5 file ending) that contain an internal file system with metadata and data, generated by the oscilloscope. HDF5 files can be read using the free HDFView program, h5py Python library, or other software.</p>
<p>The internal file system within our HDF5 files has the following structure:<br><code>/FileType/KeysightH5FileType</code><br><code>/Frame/TheFrame</code><br><code>/Waveforms</code><br><code> /Channel 1/Channel 1 Data</code><br><code> /Channel 2/Channel 2 Data</code><br><code> /Channel 3/Channel 3 Data</code><br><code> /Function 1/Function 1 Data</code></p>
<p><code>KeysightH5FileType</code> and <code>TheFrame</code> are oscilloscope metadata. The <code>Channel 1</code> sub-folder contains metadata and <code>Channel 1 Data</code>. <code>Channel 1 Data</code> is the raw waveform data of the pseudo-random bit sequence (PRBS) that was used as the input signal of our OEICs. The <code>Channel 2</code> sub-folder contains metadata and <code>Channel 2 Data</code>. The <code>Channel 3</code> sub-folder contains metadata and <code>Channel 3 Data</code>. <code>Channel 2 Data</code> and <code>Channel 3 Data</code> are the raw waveform data of the OEIC output voltage. The output voltage is calculated as the difference between <code>Channel 3 Data</code> and <code>Channel 2 Data</code>, that is<br> output voltage = <code>Channel 3 Data</code> - <code>Channel 2 Data</code><br>The <code>Function 1</code> sub-folder contains metadata and <code>Function 1 Data</code>. <code>Function 1 Data</code> is the OEIC output voltage computed on the oscilloscope, which was used for debugging only. Not all HDF5 files contain the <code>Function 1</code> sub-folder.</p>
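<p>The difference described above can be sketched in Python with the h5py library. This is a minimal sketch, not the script used for the manuscript. The two dataset paths are assumptions read off the indentation of the structure listing, and the sketch assumes the stored samples are already in volts; if the oscilloscope stored raw integer counts, the scaling metadata in each channel sub-folder would have to be applied first.</p>

```python
from typing import Sequence

# Dataset paths inside each HDF5 file (assumed from the structure listing).
CH2_PATH = "/Waveforms/Channel 2/Channel 2 Data"
CH3_PATH = "/Waveforms/Channel 3/Channel 3 Data"


def output_voltage(ch2: Sequence[float], ch3: Sequence[float]) -> list:
    """Return Channel 3 minus Channel 2, sample by sample."""
    if len(ch2) != len(ch3):
        raise ValueError("Channel 2 and Channel 3 must have the same length")
    # Convert to float first so that integer sample types cannot overflow.
    return [float(c3) - float(c2) for c2, c3 in zip(ch2, ch3)]


def load_output_voltage(path: str) -> list:
    """Read both channels from one waveform file and return their difference."""
    import h5py  # third-party; imported here so output_voltage() works without it

    with h5py.File(path, "r") as h5file:
        ch2 = h5file[CH2_PATH][()]  # [()] reads the whole dataset into memory
        ch3 = h5file[CH3_PATH][()]
    return output_voltage(ch2, ch3)
```

<p>For long waveforms, subtracting the two arrays directly with NumPy would be faster; the list-based version is only meant to make the arithmetic explicit.</p>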
<p>The file name structure of the HDF5 files is<br> <code><sample identifier>_<measurement identifier>x1x1x<optical power identifier>x1.h5</code></p>
<p>Here, the sample identifier is the same as explained above; the measurement identifier is an arbitrary text/number; and the optical power identifier connects the power measurement (see below) with the corresponding waveform. For example, the file "<code>G2_2_17Lx1x1x5x1.h5</code>" is the raw waveform of the Rectangular 1 sample, measurement "17L", recorded for the 5th optical power setting.</p>
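<p>A file name can be split into its parts with a small parser. The sketch below is illustrative only: the regular expression is inferred from the example "<code>G2_2_17Lx1x1x5x1.h5</code>" and from the four sample identifiers listed above, and the names <code>WaveformName</code> and <code>parse_waveform_name</code> are hypothetical.</p>

```python
import re
from typing import NamedTuple

# Sample identifiers and their labels, as listed in the file structure section.
SAMPLE_LABELS = {
    "E1_1": "Improved 1",
    "D1_1": "Improved 2",
    "G2_2": "Rectangular 1",
    "H2_2": "Rectangular 2",
}

# Pattern inferred from the example "G2_2_17Lx1x1x5x1.h5" (an assumption).
_NAME_RE = re.compile(
    r"^(?P<sample>[A-Z]\d_\d)_(?P<measurement>.+)x1x1x(?P<power>\d+)x1\.h5$"
)


class WaveformName(NamedTuple):
    sample: str        # e.g. "G2_2"
    label: str         # e.g. "Rectangular 1"
    measurement: str   # e.g. "17L"
    power_index: int   # optical power identifier, e.g. 5


def parse_waveform_name(filename: str) -> WaveformName:
    """Split a waveform file name into sample, measurement, and power index."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected waveform file name: {filename!r}")
    sample = match["sample"]
    if sample not in SAMPLE_LABELS:
        raise ValueError(f"unknown sample identifier: {sample!r}")
    return WaveformName(
        sample, SAMPLE_LABELS[sample], match["measurement"], int(match["power"])
    )
```

<p>Calling <code>parse_waveform_name("G2_2_17Lx1x1x5x1.h5")</code> yields the sample "G2_2" (Rectangular 1), measurement "17L", and optical power identifier 5, matching the example above.</p>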
<h2>Optical Power</h2>
<p><code>/powermeter</code> contains the raw optical power measurement results, as well as the calibration factor that was used to calculate the power incident on the chip. In other words,<br> chip power=raw power * calibration factor.</p>
<p>The optical power measurements are provided in CSV files (.csv file ending), separately for each OEIC and measurement. Within the CSV files, the first column is the optical power identifier of the measurement (see above), and the second column is the respective raw optical power. The file name structure of the CSV files is<br> <code>power_<sample identifier>_<measurement identifier>.csv</code></p>
<p>The calibration factor is provided in a TXT file (.txt file ending), separately for each OEIC and measurement. The TXT file contains only a single floating point number that is the calibration factor. The file name structure of the TXT file is<br> <code>calibration_<sample identifier>_<measurement identifier>.txt</code></p>
<p>Here, the sample identifier and measurement identifier are the same as for the waveform files, as explained above. For example, the files "<code>calibration_G2_2_17L.txt</code>" and "<code>power_G2_2_17L.csv</code>" correspond to all waveforms with the prefix "G2_2_17L", such as the abovementioned "<code>G2_2_17Lx1x1x5x1.h5</code>".</p>
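<p>The relation chip power = raw power * calibration factor can be applied to one measurement as sketched below. The sketch makes several assumptions that the description does not state: the CSV is comma-separated, the optical power identifier is an integer, and any non-numeric row (such as a header) can be skipped. The function names are hypothetical.</p>

```python
import csv
from typing import Dict, TextIO, Tuple


def companion_files(prefix: str) -> Tuple[str, str]:
    """Return the power and calibration file names for a prefix such as 'G2_2_17L'."""
    return f"power_{prefix}.csv", f"calibration_{prefix}.txt"


def read_chip_power(power_csv: TextIO, calibration_txt: TextIO) -> Dict[int, float]:
    """Map each optical power identifier to the power incident on the chip."""
    # The TXT file holds a single floating point number.
    factor = float(calibration_txt.read().strip())
    chip_power = {}
    for row in csv.reader(power_csv):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        try:
            identifier = int(row[0])
            raw_power = float(row[1])
        except ValueError:
            continue  # skip a header row, if present (assumption)
        chip_power[identifier] = raw_power * factor
    return chip_power
```

<p>Typical use would open the two files returned by <code>companion_files("G2_2_17L")</code> inside <code>/powermeter</code> and pass the open file objects to <code>read_chip_power</code>.</p>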
<h2>Bit error probability</h2>
<p><code>/BER</code> contains the evaluated bit error probability of the OEICs. All files within this folder are generated from the raw data provided in <code>/waveforms</code> and <code>/powermeter</code>, using our Python script. Three file types are provided for each sample and data rate:</p>
<ol>
<li>Log files (.log file ending) that document the result of the evaluation. These log files were used to plot Fig. 12 in our manuscript.</li>
<li>Image files (.png file ending) that illustrate the result of the evaluation, similar to Fig. 10 in our manuscript.</li>
<li>A CSV file that contains a results summary (.csv file ending).</li>
</ol>
<p>The log file contains metadata about the evaluation process, the evaluation result (BER), as well as the input file (waveform) and output files of the evaluation. Note that the .tab output files are not provided because they were only used for debugging of our Python script. While most of the log file contents should be self-explanatory, some require special attention:</p>
<ul>
<li>In the "User settings" section we provide settings for the evaluation of the reference PRBS (<code>Channel 1 Data</code> in the HDF5 files). The boolean flag "PRBS inverted" shows whether the PRBS waveform was processed as is, or was logically inverted. The "PRBS detection threshold" is the threshold voltage that was used to digitize the (analog) PRBS waveform. Because the SNR of the PRBS is very high, the threshold itself is uncritical and was auto-detected by our Python script. The "PRBS detection offset" marks the start of the PRBS with respect to the recorded waveform. This is necessary because the recording may start at an arbitrary time, so the first recorded bit is incomplete. The start of the PRBS was auto-detected by our Python script by rising edge detection. The "PRBS detection delay" shows at which time instant each PRBS bit is sampled, with respect to the start of a bit. Typically, the bit should be sampled at the center. For 100 Mb/s with 80% RZ modulation, the center is 4 ns (=PRBS detection delay) after the start of the bit; for 250 Mb/s with 50% RZ modulation, the center is 1 ns after the start of the bit.</li>
<li>In the "Results" section, the result and the optimized settings for the evaluation of the chip output (<code>Channel 2 Data</code> and <code>Channel 3 Data</code> in the HDF5 files) are provided. "Decision threshold" is the threshold (voltage) for bit decision. "CDS delta time" is the time between the two sample instants of correlated double sampling (CDS). "Best BER" is the BER result. The "Static delay" is the coarse delay between PRBS and chip output waveform, given in multiples of the bit period (10 ns at 100 Mb/s, 4 ns at 250 Mb/s). The "Inter-bit delay" is the fine delay between PRBS and chip output waveform, which is always less than the bit period. The sum of static delay and inter-bit delay is the total delay between PRBS and chip output waveform.</li>
</ul>
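<p>The timing figures quoted in the two items above follow from simple arithmetic, sketched here for reference. The function names are hypothetical, and the sketch assumes that the inter-bit delay is given in nanoseconds; the unit actually used in the log files should be checked there.</p>

```python
def bit_period_ns(rate_mbps: float) -> float:
    """Bit period in nanoseconds for a data rate in Mb/s."""
    return 1000.0 / rate_mbps


def prbs_sampling_delay_ns(rate_mbps: float, rz_duty: float) -> float:
    """Center of the RZ pulse, measured from the start of the bit."""
    return bit_period_ns(rate_mbps) * rz_duty / 2.0


def total_delay_ns(static_delay_bits: int, inter_bit_delay_ns: float,
                   rate_mbps: float) -> float:
    """Total delay: static delay (in bit periods) plus inter-bit delay."""
    period = bit_period_ns(rate_mbps)
    if not 0.0 <= inter_bit_delay_ns < period:
        raise ValueError("inter-bit delay must be smaller than the bit period")
    return static_delay_bits * period + inter_bit_delay_ns
```

<p>With these definitions, 100 Mb/s with 80% RZ gives a 10 ns bit period and a sampling delay of 4 ns, and 250 Mb/s with 50% RZ gives a 4 ns bit period and a sampling delay of 1 ns, as stated above.</p>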
<p>The file name structure of the log files is<br> <code><sample identifier>_<measurement identifier>x1x1x<optical power identifier>x1.log</code><br>The file name structure of the image files is<br> <code><sample identifier>_<measurement identifier>x1x1x<optical power identifier>x1.png</code></p>
<p>The results summary CSV contains all optical power settings, the BER results, and the underlying dataset file names.</p>
<p>The file name structure of the CSV is<br> <code>BER_<sample>_<data rate>_<modulation>.csv</code><br>where sample is Improved 1, Improved 2, Rectangular 1, or Rectangular 2, as defined above. Data rate is 100Mbps or 250Mbps, corresponding to 100 Mb/s and 250 Mb/s, respectively. Modulation is 80RZ or 50RZ, corresponding to 80% RZ OOK and 50% RZ OOK, respectively.</p>
<h2>Other data</h2>
<p>Some raw data is given in the manuscript in tabular form. These data are not included in this dataset.</p>
<h2>Sample dataset</h2>
<p>The file "<code>sample.zip</code>" contains a small sample dataset with a single waveform file, which allows you to inspect the basic file structure without downloading the full dataset.</p>
<h1>Licensing</h1>
<p>The dataset consists of raw measurement data and processed data.<br>Raw data is licensed under the <strong>Creative Commons Zero 1.0 Universal (CC0)</strong> license.<br>Processed data is copyrighted and licensed under the <strong>Creative Commons Attribution 4.0 International (CC-BY)</strong> license.<br>All metadata is licensed under the <strong>Creative Commons Attribution 4.0 International (CC-BY)</strong> license.</p>
<p>The following list shows the license attached to the individual files:</p>
<ul>
<li>All files and sub-folders within <code>/waveforms</code>: <strong>CC0</strong> license</li>
<li>All files and sub-folders within <code>/powermeter</code>: <strong>CC0</strong> license</li>
<li>All files and sub-folders within <code>/BER</code>: <strong>CC-BY</strong> license</li>
<li><code>/README.txt</code>: <strong>CC-BY</strong> license</li>
</ul>
ESA CCI SM RZSM Long-term Climate Data Record of Root-Zone Soil Moisture from merged multi-satellite observations
<div>
<p>This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: <a title="ESA CCI SM website" href="https://climate.esa.int/en/projects/soil-moisture/" target="_blank" rel="noopener">https://climate.esa.int/en/projects/soil-moisture/</a></p>
<p>This dataset contains information on the Root Zone Soil Moisture (RZSM) content derived from satellite observations in the microwave domain.</p>
<p>The operational (ACTIVE, PASSIVE, COMBINED) ESA CCI SM products are available at <a href="https://catalogue.ceda.ac.uk/uuid/c256fcfeef24460ca6eb14bf0fe09572/" target="_blank" rel="noopener">https://catalogue.ceda.ac.uk/uuid/c256fcfeef24460ca6eb14bf0fe09572/</a> <em>(Dorigo et al., 2017; Gruber et al., 2019; Preimesberger et al., 2021)</em>.</p>
</div>
<h2>Abstract</h2>
<div>Soil moisture is a key variable in monitoring climate and an important component of the hydrological, carbon, and energy cycles. Satellite products ameliorate the sparsity of field measurements but are inherently limited to observing the near-surface layer, while water available in the unobserved root-zone controls critical processes like plant water uptake and evapotranspiration. A variety of approaches exist for modelling root-zone soil moisture (RZSM), including approximating it from surface layer observations through an infiltration model (<em>Pasik et al., 2023; Wagner et al., 1999; Albergel et al., 2008</em>).</div>
<div>Here, we apply the method described by <em>Pasik et al. (2023)</em> to the COMBINED product of ESA CCI SM v9.2 to derive RZSM and uncertainty estimates in four depth layers of the soil (0-10, 10-40, 40-100, and 0-100 cm) over the period from January 1980 to December 2024 at ~25 km spatial sampling. In situ soil moisture measurements from the International Soil Moisture Network (<em>Dorigo et al., 2021</em>) were used for (global) T-parameter calibration and to quantify the (structural) model error component required to propagate surface measurement uncertainties to the root-zone layers. The 0-1 m layer is a (weighted) average of the other three layers. The dataset has been validated against ERA5 reanalysis RZSM fields, with global median correlations of ~0.6 [-] and ubRMSD <0.04 m³/m³.</div>
<h3>Summary</h3>
<ul>
<li>Global estimates of root-zone soil moisture from 01-1980 to 12-2024 at ~25 km spatial sampling based on the COMBINED product of ESA CCI SM v9.2.</li>
<li>Method: Exponential filter model, calibrated with in situ measurements for 3 depth layers: 0-10, 10-40, 40-100 cm with uncertainty estimates. Additionally, one layer representing the average condition from 0-1 m depth is provided. See <em>Pasik et al. (2023) </em>for more details.</li>
<li>Good agreement with independent reanalysis data (R ~0.6 [-] and ubRMSD <0.04 m³/m³), decreasing performance for deeper layers due to weaker coupling with surface SM.</li>
</ul>
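<p>For orientation, the exponential filter named above is often written in the recursive form of Albergel et al. (2008). The sketch below implements only that generic recursion for a single time series; it is not the processing chain of this dataset, which additionally involves the T-parameter calibration and uncertainty propagation described by Pasik et al. (2023). The function name and the choice of days as the time unit are assumptions.</p>

```python
import math
from typing import List, Sequence


def exponential_filter(times: Sequence[float], surface_sm: Sequence[float],
                       t_param: float) -> List[float]:
    """Recursive exponential filter of a surface soil moisture series.

    times      -- observation times (e.g. in days), strictly increasing
    surface_sm -- surface soil moisture observations at those times
    t_param    -- characteristic time length T, in the same unit as times
    """
    if len(times) != len(surface_sm):
        raise ValueError("times and surface_sm must have the same length")
    if t_param <= 0:
        raise ValueError("t_param must be positive")
    if not surface_sm:
        return []
    filtered = [surface_sm[0]]  # initialise with the first observation
    gain = 1.0                  # initial gain
    for i in range(1, len(surface_sm)):
        dt = times[i] - times[i - 1]
        # The gain shrinks for dense sampling and recovers after long gaps.
        gain = gain / (gain + math.exp(-dt / t_param))
        filtered.append(filtered[-1] + gain * (surface_sm[i] - filtered[-1]))
    return filtered
```

<p>A larger T smooths the surface signal more strongly and delays its response, which is the usual way such a filter represents deeper soil layers.</p>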
<h2>Programmatic (bulk) download</h2>
<p>You can use command-line tools such as <a href="https://www.gnu.org/software/wget/" target="_blank" rel="noopener">wget</a> or <a href="https://curl.se/" target="_blank" rel="noopener">curl</a> to download (and extract) data for multiple years. The following script downloads and extracts the complete data set into the local directory <em>~/Downloads</em> on Linux or macOS systems.</p>
<blockquote>
<div>
<pre>#!/bin/bash<br><br># Set download directory<br>DOWNLOAD_DIR=~/Downloads<br><br>base_url="https://researchdata.tuwien.at/records/tqrwj-t7r58/files"<br><br># Loop through years 1980 to 2024 and download & extract data<br>for year in {1980..2024}; do<br> echo "Downloading $year.zip..."<br> wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"<br> unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"<br> rm "$DOWNLOAD_DIR/$year.zip"<br>done</pre>
</div>
</blockquote>
<h2>Data details</h2>
<div>
<h3>Filename template</h3>
<p>The dataset provides global daily estimates for the 1980-2024 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), each subdirectory containing one netCDF image file for a specific day (DD) and month (MM) of that year in a 2-dimensional (longitude, latitude) grid system (CRS: WGS84). The file name follows the convention:</p>
<blockquote>
<p>ESACCI-SOILMOISTURE-L3S-RZSMV-COMBINED-YYYYMMDD000000-fv09.2.nc</p>
</blockquote>
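<p>The naming convention can be turned into a small helper that builds the file name for a given day. This is a sketch under the assumption that each yearly subdirectory is named by its four-digit year; the function names are hypothetical.</p>

```python
from datetime import date
from pathlib import PurePosixPath


def rzsm_filename(day: date) -> str:
    """File name of the daily image for the given date (product version v9.2)."""
    return f"ESACCI-SOILMOISTURE-L3S-RZSMV-COMBINED-{day:%Y%m%d}000000-fv09.2.nc"


def rzsm_relative_path(day: date) -> PurePosixPath:
    """Path of the daily image relative to the extraction directory.

    Assumes one subdirectory per year, named YYYY.
    """
    return PurePosixPath(f"{day:%Y}") / rzsm_filename(day)
```

<p>For example, <code>rzsm_filename(date(2024, 12, 31))</code> gives <code>ESACCI-SOILMOISTURE-L3S-RZSMV-COMBINED-20241231000000-fv09.2.nc</code>.</p>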
<h3>Data Variables</h3>
<p>Each netCDF file contains 3 coordinate variables</p>
<ul>
<li><strong>lon</strong>: longitude (WGS84), [-180,180] degree W/E</li>
<li><strong>lat</strong>: latitude (WGS84), [-90,90] degree N/S</li>
<li><strong>time: </strong>float, datetime encoded as "number of days since 1970-01-01 00:00:00 UTC"</li>
</ul>
<p> and the following data variables</p>
<ul>
<li><strong>rzsm_1</strong>: (float) Volumetric Root Zone Soil Moisture at 0-10 cm depth</li>
<li><strong>rzsm_2</strong>: (float) Volumetric Root Zone Soil Moisture at 10-40 cm depth</li>
<li><strong>rzsm_3</strong>: (float) Volumetric Root Zone Soil Moisture at 40-100 cm depth</li>
<li><strong>rzsm_1m</strong>: (float) Root Zone Soil Moisture at 0-1 m</li>
<li><strong>uncertainty_1</strong>: (float) Volumetric Root Zone Soil Moisture uncertainty at 0-10 cm depth</li>
<li><strong>uncertainty_2</strong>: (float) Volumetric Root Zone Soil Moisture uncertainty at 10-40 cm depth</li>
<li><strong>uncertainty_3</strong>: (float) Volumetric Root Zone Soil Moisture uncertainty at 40-100 cm depth</li>
</ul>
<p>Additional information for each variable is given in the netCDF attributes.</p>
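<p>The time encoding and the depth layers listed above can be handled with a few lines of standard-library Python. This is a minimal sketch with hypothetical names; it covers only the decoding arithmetic and does not open any file.</p>

```python
from datetime import datetime, timedelta, timezone

# Reference epoch of the time variable, as stated in the variable list.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Depth range in cm (top, bottom) of each soil moisture variable.
LAYER_DEPTH_CM = {
    "rzsm_1": (0, 10),
    "rzsm_2": (10, 40),
    "rzsm_3": (40, 100),
    "rzsm_1m": (0, 100),
}


def decode_time(days_since_epoch: float) -> datetime:
    """Convert 'number of days since 1970-01-01 00:00:00 UTC' to a datetime."""
    return EPOCH + timedelta(days=days_since_epoch)
```

<p>Libraries such as Xarray normally decode CF-style time variables automatically when a file is opened, so a manual conversion like this is mainly useful when reading the raw values with lower-level tools.</p>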
</div>
<h3>Version Changelog</h3>
<p>Changes in v9.2:</p>
<ul>
<li>The COMBINED product of v9.2 is used as input.</li>
<li>The period was extended to 12-2024.</li>
</ul>
<h3>Software to open netCDF files</h3>
<p>These data can be read by any software that supports netCDF files conforming to the Climate and Forecast (CF) metadata conventions, such as:</p>
<ul>
<li><a title="xarray" href="https://github.com/pydata/xarray" target="_blank" rel="noopener">Xarray</a> (Python)</li>
<li><a title="netCDF4" href="https://unidata.github.io/netcdf4-python/" target="_blank" rel="noopener">netCDF4</a> (Python)</li>
<li><a title="esa_cci_sm" href="https://github.com/TUW-GEO/esa_cci_sm" target="_blank" rel="noopener">esa_cci_sm</a> (Python)</li>
<li>Similar tools exist for other programming languages (Matlab, R, etc.)</li>
<li>Software packages and GIS tools can open netCDF files, e.g. <a href="https://code.mpimet.mpg.de/projects/cdo" target="_blank" rel="noopener">CDO</a>, <a href="http://nco.sourceforge.net/" target="_blank" rel="noopener">NCO</a>, <a href="https://www.qgis.org/" target="_blank" rel="noopener">QGIS</a>, ArcGIS</li>
<li>You can also use the GUI software <a href="https://www.giss.nasa.gov/tools/panoply/" target="_blank" rel="noopener">Panoply</a> to view the contents of each file</li>
</ul>
<h2>Related Records</h2>
<p>This record and all related records are part of the <a href="https://researchdata.tuwien.ac.at/communities/soilmoisture-climaterecords/records" target="_blank" rel="noopener">ESA CCI Soil Moisture science data records community</a>.</p>
DAMAP demo Cluster Forschungsdaten EXPO
<p><a href="https://damap.org/">DAMAP</a> is an open-source software tool that can be implemented at research institutions to facilitate the creation of data management plans (DMPs). The tool was created within the BMBWF-funded project <a href="https://forschungsdaten.at/en/fair-data-austria/">FAIR Data Austria</a> and is currently being further developed in the frame of the follow-up project <a href="https://forschungsdaten.at/sharedrdm/">Shared RDM Services & Infrastructure</a>.</p>
<p>The DAMAP team prepared this tutorial to present the open source software tool at the Cluster Forschungsdaten EXPO 2025 on 16 January 2025 in Vienna.</p>
<p>The presentation provides information on</p>
<ul>
<li><span lang="EN-GB">what a DMP tool is and what it should do</span></li>
<li><span lang="EN-GB">what DAMAP is and why you would use it</span></li>
<li><span lang="EN-GB">how to use DAMAP to begin creating a DMP</span></li>
</ul>
PHAIDRA - Infrastruktur und Langzeitarchivierung von Forschungsoutput
<p>Presentation slides from the Cluster Forschungsdaten Expo 2025.</p>