Input Data

Peter Charlton edited this page Mar 8, 2016 · 22 revisions

Several datasets are publicly available for use with RRest, as detailed here. You may also wish to use your own dataset. This page provides an overview of how to provide Input Data to RRest.

Using Publicly Available Datasets

Publicly available datasets are either provided as part of the RRest Project or available from other public sources. Those provided as part of the RRest Project, such as the Vortal Dataset, are already in the format required by the RRest toolbox. For datasets from other public sources, such as the MimicII Dataset, scripts are provided to download and re-format them. These 'data_importer' scripts can be found here.

Once you have downloaded a dataset, you are ready to use RRest. You should specify its location to RRest in the up.paths.root_data_folder variable, in the setup_universal_params.m script.

Using Your Own Datasets

You may alternatively wish to analyse your own dataset with RRest. There are three steps to doing so: formatting your dataset correctly, saving it in the appropriate location, and calling RRest appropriately. These are now explained:

Formatting your dataset

The easiest way to understand how to format your dataset for use with RRest is to look at an example, such as the synthetic dataset presented here.

A dataset must be stored in a Matlab structure array called data to be used with RRest. The structure array should be of dimension [1, n], where n is the number of recordings. The structure array should contain the following fields:

  • id : e.g. data(1).id = '001' , data(2).id = '002' , etc.
  • pleth : e.g. data(1).pleth.fs = 125 , specifying the sampling rate in Hz, and data(1).pleth.signal_e_vlf.y.v = [1,2,3,2,1] , where [1,2,3,2,1] is the row vector of PPG values for this recording.
  • ekg : e.g. data(1).ekg.fs = 125 , specifying the sampling rate in Hz, and data(1).ekg.signal_e_vlf.y.v = [1,2,3,2,1] , where [1,2,3,2,1] is the row vector of ECG values for this recording.

It must also contain reference respiratory data. This can either be specified as the timings of individual breaths, or as a reference respiratory signal. To specify the timings of individual breaths:

  • reference : e.g. data(1).reference.breaths.t = [1.2,3.7,5.1,6.2], where [1.2,3.7,5.1,6.2] is the row vector of breath timings in seconds.

To specify a reference respiratory signal (in this case an impedance, imp, signal):

  • imp : e.g. data(1).imp.signal_e_vlf.y = [1,2,3,2,1] , where [1,2,3,2,1] is the row vector of impedance values for this recording.

In addition, individual recordings can be assigned to groups. For instance, in the synthetic dataset the signals are assigned to baseline wander (bw), amplitude modulation (am) and frequency modulation (fm) groups. A group can be given any string name you wish. If you specify groups, a sub-group analysis will be performed in addition to the analysis of the entire dataset. To specify a group use the following field:

  • group : e.g. data(1).group = 'bw'
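
As a rough illustration, the fields listed above could be assembled as follows. This is only a sketch: the signal values are the placeholder vectors used in the examples above, the two recordings and their group labels are made up, and it assumes that breath timings are used as the reference.

```matlab
% Sketch: build a [1, 2] structure array called data with the fields
% described above. All values are placeholders, not real recordings.
fs     = 125;                 % sampling rate in Hz (example value)
ids    = {'001', '002'};      % one id per recording
groups = {'bw', 'am'};        % optional group labels (illustrative)

data = struct('id', ids);     % a cell array of ids gives a [1, n] struct array
for k = 1:numel(ids)
    data(k).pleth.fs = fs;
    data(k).pleth.signal_e_vlf.y.v = [1,2,3,2,1];     % PPG values
    data(k).ekg.fs = fs;
    data(k).ekg.signal_e_vlf.y.v = [1,2,3,2,1];       % ECG values
    data(k).reference.breaths.t = [1.2,3.7,5.1,6.2];  % breath timings in seconds
    data(k).group = groups{k};
end
```

Building the array from a cell array of ids gives the [1, n] shape directly, so each recording occupies one element of data.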

Saving your dataset in the appropriate location

Take the following steps to ensure that your data is saved in an appropriate location for RRest to be able to find it:

  1. Create a directory called MYDATA

  2. Specify up.paths.root_folder as the directory which contains the newly created MYDATA directory. This variable is set in the setup_universal_params.m script.

  3. Save your dataset as mydata.mat, in the MYDATA directory.
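
The steps above might be scripted along the following lines. This is a minimal sketch: root_folder is a hypothetical location (tempdir is used only so that the example runs anywhere), the one-element data variable is a stand-in for your formatted dataset, and it assumes that the variable stored in mydata.mat should be named data.

```matlab
% Sketch: create the MYDATA directory and save the dataset inside it.
root_folder = tempdir;                        % hypothetical; use your own parent directory
data_dir    = fullfile(root_folder, 'MYDATA');
if ~exist(data_dir, 'dir')
    mkdir(data_dir);                          % step 1: create MYDATA
end

% Step 2 is done by hand: set up.paths.root_folder to the same parent
% directory in setup_universal_params.m.

data = struct('id', {'001'});                 % placeholder for your formatted dataset
save(fullfile(data_dir, 'mydata.mat'), 'data');   % step 3: save as mydata.mat
```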

Calling RRest to analyse your dataset

Use the following command to call RRest when using your own dataset:

RRest('mydata')