
Explanation Input-Feature Unit-Output Correlation Maps (Envelope Activation Correlation) #2

robintibor opened this issue Jun 24, 2021 · 4 comments

@robintibor (Owner) commented Jun 24, 2021

Filter to frequency bands:

    for low_cut_hz, high_cut_hz in filterbands:
        log.info("Compute filterband from {:.1f} to {:.1f}...".format(
            low_cut_hz, high_cut_hz))
        if low_cut_hz > 0 and high_cut_hz < 125:
            filtered = bandpass_topo(train_topo, low_cut_hz,
                high_cut_hz, sampling_rate=250.0, axis=0, filt_order=4)
        elif low_cut_hz == 0:
            filtered = lowpass_topo(train_topo, high_cut_hz,
                sampling_rate=250.0, axis=0, filt_order=4)
        else:
            # branch truncated here; presumably a highpass for bands
            # reaching the Nyquist frequency
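As a rough illustration (not the original implementation), a bandpass helper of this kind can be sketched with SciPy; the filter design below is my assumption:

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Hedged sketch of what bandpass_topo could look like: a 4th-order Butterworth
# bandpass applied zero-phase along the time axis. The original repo's filter
# design may differ in detail.
def bandpass_topo(data, low_cut_hz, high_cut_hz, sampling_rate=250.0,
                  axis=0, filt_order=4):
    nyq = sampling_rate / 2.0
    b, a = butter(filt_order, [low_cut_hz / nyq, high_cut_hz / nyq],
                  btype='bandpass')
    return filtfilt(b, a, data, axis=axis)

x = np.random.randn(1000, 22)  # (time, channels); made-up shape
y = bandpass_topo(x, 7.0, 13.0)
print(y.shape)  # (1000, 22)
```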

Compute envelope (absolute value of the Hilbert transform):

    env = np.abs(scipy.signal.hilbert(batches_topo, axis=2))

Square envelope

(square_before_mean was True in our setting; the envelope was saved to a file and reloaded)

    if square_before_mean:
        env = np.square(env)

⚠️ Note there is possibly one mistake/discrepancy in the paper: We square before averaging (next step), not after ⚠️
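A tiny numeric check of why the order matters:

```python
import numpy as np

env = np.array([1.0, 3.0])
print(np.mean(env ** 2))  # 5.0 -- square before mean (as done here)
print(np.mean(env) ** 2)  # 4.0 -- mean before square
```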

Compute Moving Average of the envelope within the receptive field of the corresponding layer

Basic steps:

1. Determine the receptive field size of the layer:

       def transform_to_meaned_trial_env(env, model, i_layer, train_set,
                                         n_inputs_per_trial):
           all_layers = lasagne.layers.get_all_layers(model)
           layer = all_layers[i_layer]
           field_size = get_receptive_field_size(layer)

2. Average the envelopes within the receptive field using pooling:

       meaned_env = get_meaned_batch_env(env, field_size)
       # which averages via pooling:
       pooled = downsample.max_pool_2d(inputs, ds=(field_size, 1), st=(1, 1),
                                       ignore_border=True, mode='average_exc_pad')

3. Aggregate per-trial envelopes from the per-batch envelopes:

       fb_envs_per_trial = fb_env.reshape(n_trials, n_inputs_per_trial,
                                          fb_env.shape[1], fb_env.shape[2], fb_env.shape[3])
       trial_env = transform_to_trial_acts(fb_envs_per_trial,
                                           [n_inputs_per_trial] * n_trials,
                                           n_sample_preds=n_sample_preds,
                                           n_trial_len=n_trial_len)
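Framework aside, the envelope and moving-average steps above can be sketched in plain NumPy; shapes and the function name below are illustrative, not from the repo:

```python
import numpy as np
from scipy.signal import hilbert

def meaned_envelope(signal, field_size):
    """Squared Hilbert envelope (square_before_mean=True), averaged over a
    sliding window of length field_size along the last (time) axis."""
    env = np.abs(hilbert(signal, axis=-1)) ** 2
    # moving average via cumulative sums; output length: n_samples - field_size + 1
    csum = np.cumsum(np.concatenate(
        [np.zeros(signal.shape[:-1] + (1,)), env], axis=-1), axis=-1)
    return (csum[..., field_size:] - csum[..., :-field_size]) / field_size

x = np.random.randn(2, 4, 1000)  # (trials, channels, time); made-up shape
m = meaned_envelope(x, field_size=10)
print(m.shape)  # (2, 4, 991)
```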

Compute Correlation with Activations

For the trained model:

    topo_corrs = compute_trial_topo_corrs(model, i_layer, train_set,
                                          exp.iterator, trial_env, split_per_class=True)

and for the untrained (randomly initialized) model:

    rand_topo_corrs = compute_trial_topo_corrs(rand_model, i_layer, train_set,
                                               exp.iterator, trial_env, split_per_class=True)

Compute Activations

    trial_acts = compute_trial_acts(model, i_layer, iterator, train_set)

That is, compute per-batch activations and then aggregate them into per-trial activations:

    # compute number of inputs per trial
    i_trial_starts, i_trial_ends = compute_trial_start_end_samples(
        train_set.y, check_trial_lengths_equal=True,
        input_time_length=iterator.input_time_length)
    # +1 since trial end is inclusive
    n_trial_len = i_trial_ends[0] - i_trial_starts[0] + 1
    n_inputs_per_trial = int(np.ceil(n_trial_len / float(iterator.n_sample_preds)))
    log.info("Create theano function...")
    all_layers = lasagne.layers.get_all_layers(model)
    all_out_fn = create_pred_fn(all_layers[i_layer])
    assert iterator.input_time_length == get_input_time_length(model)
    assert iterator.n_sample_preds == get_n_sample_preds(model)
    log.info("Compute activations...")
    all_outs_per_batch = [all_out_fn(batch[0])
                          for batch in iterator.get_batches(train_set, False)]
    batch_sizes = [len(batch[0]) for batch in iterator.get_batches(train_set, False)]
    all_outs_per_batch = np.array(all_outs_per_batch)
    n_trials = len(i_trial_starts)
    log.info("Transform to trial activations...")
    trial_acts = get_trial_acts(all_outs_per_batch,
                                batch_sizes, n_trials=n_trials,
                                n_inputs_per_trial=n_inputs_per_trial,
                                n_trial_len=n_trial_len,
                                n_sample_preds=iterator.n_sample_preds)
    log.info("Done.")

Compute Correlation Envelope and Activations

    topo_corrs = compute_topo_corrs(trial_env, trial_acts)

which internally does:

    flat_trial_env = trial_env.transpose(2, 0, 1, 3).reshape(
        trial_env.shape[0] * trial_env.shape[2],
        trial_env.shape[1] * trial_env.shape[3])
    flat_trial_acts = trial_acts.transpose(1, 0, 2).reshape(
        trial_acts.shape[1], -1)
    # equivalent to:
    # flat_corrs = np.corrcoef(flat_trial_env, flat_trial_acts)
    # relevant_corrs = flat_corrs[:flat_trial_env.shape[0],
    #                             flat_trial_env.shape[0]:]
    relevant_corrs = corr(flat_trial_env, flat_trial_acts)
    topo_corrs = relevant_corrs.reshape(trial_env.shape[2], trial_env.shape[0],
                                        trial_acts.shape[1])
    return topo_corrs
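`corr` here is a helper from the old repo; a minimal NumPy equivalent, consistent with the commented-out `np.corrcoef` variant, could look like this (my sketch, not the repo's code):

```python
import numpy as np

def corr(a, b):
    """Pearson correlation of every row of a with every row of b;
    rows share the same observation axis."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    cov = a @ b.T / a.shape[1]
    return cov / np.outer(a.std(axis=1), b.std(axis=1))

a = np.random.randn(3, 100)
b = np.vstack([a[0], -a[1]])  # identical and sign-flipped rows
c = corr(a, b)                # shape (3, 2)
print(np.isclose(c[0, 0], 1.0), np.isclose(c[1, 1], -1.0))  # True True
```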

In the end these correlations for trained and untrained model will be saved:

    np.save('{:s}.labelsplitted.env_corrs.{:s}'.format(base_name, file_name_end), topo_corrs)
    np.save('{:s}.labelsplitted.env_rand_corrs.{:s}'.format(base_name, file_name_end), rand_topo_corrs)

Once you have these correlations for the trained and untrained model, you can average them across units in a layer and then compute the difference between the trained-model and untrained-model correlations. This is Figure 15 in https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.23730

As a comparison, we also compute the correlations of the envelope with the class labels (no network involved!). These are shown in the rightmost plots of Figure 15, and class-resolved/per class in Figure 14.
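One way such a label baseline can be sketched (made-up shapes, not code from the repo): correlate each channel's trial-averaged mean squared envelope with a per-class binary indicator across trials.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_chans, n_classes = 24, 22, 4
mean_env = rng.standard_normal((n_trials, n_chans))  # stand-in for trial-mean envelopes
labels = rng.integers(0, n_classes, size=n_trials)

label_corrs = np.empty((n_classes, n_chans))
for klass in range(n_classes):
    indicator = (labels == klass).astype(float)  # 1 for trials of this class
    for ch in range(n_chans):
        label_corrs[klass, ch] = np.corrcoef(indicator, mean_env[:, ch])[0, 1]
print(label_corrs.shape)  # (4, 22)
```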

@robintibor robintibor changed the title Explanation Amplitude Correlation Explanation Input-Feature Unit-Output Correlation Maps (Envelope Activation Correlation) Jun 24, 2021
@JulianWisser

How should the activations be interpreted and calculated? Are they different from the layer outputs?

I couldn't follow your calculation of the activations 100%. As I understood from the paper, the activations of one layer are the unit outputs, so the dimension of the activations/unit outputs is the same as the output shape of that layer!?

To calculate the correlation between the mean squared envelopes and the activations, their dimensions have to match. And you said they match exactly, since the envelopes are averaged with the receptive field size of the corresponding layer.

Your deep model contains the following layers with output shapes:

    Layer (type)               Output Shape         Param #

    Expression-1          [-1, 1, 1000, 22]               0
        Conv2d-2          [-1, 25, 991, 22]             275
        Conv2d-3           [-1, 25, 991, 1]          13,750
   BatchNorm2d-4           [-1, 25, 991, 1]              50
    Expression-5           [-1, 25, 991, 1]               0
     MaxPool2d-6           [-1, 25, 330, 1]               0
    Expression-7           [-1, 25, 330, 1]               0
       Dropout-8           [-1, 25, 330, 1]               0
        Conv2d-9           [-1, 50, 321, 1]          12,500
  BatchNorm2d-10           [-1, 50, 321, 1]             100
   Expression-11           [-1, 50, 321, 1]               0
    MaxPool2d-12           [-1, 50, 107, 1]               0
   Expression-13           [-1, 50, 107, 1]               0
      Dropout-14           [-1, 50, 107, 1]               0
       Conv2d-15           [-1, 100, 98, 1]          50,000
  BatchNorm2d-16           [-1, 100, 98, 1]             200
   Expression-17           [-1, 100, 98, 1]               0
    MaxPool2d-18           [-1, 100, 32, 1]               0
   Expression-19           [-1, 100, 32, 1]               0
      Dropout-20           [-1, 100, 32, 1]               0
       Conv2d-21           [-1, 200, 23, 1]         200,000
  BatchNorm2d-22           [-1, 200, 23, 1]             400
   Expression-23           [-1, 200, 23, 1]               0
    MaxPool2d-24            [-1, 200, 7, 1]               0
   Expression-25            [-1, 200, 7, 1]               0
       Conv2d-26              [-1, 4, 1, 1]           5,604
   LogSoftmax-27              [-1, 4, 1, 1]               0
   Expression-28                    [-1, 4]               0

I implemented a Keras version of the model with exactly the same dimensions, only using channels-last ordering and average pooling.

       Layer (type)                 Output Shape              Param #

       conv2d (Conv2D)              (None, 22, 991, 25)       275
       conv2d_1 (Conv2D)            (None, 1, 991, 25)        13775
       average_pooling2d (AveragePo (None, 1, 330, 25)        0
       dropout (Dropout)            (None, 1, 330, 25)        0
       conv2d_2 (Conv2D)            (None, 1, 321, 50)        12550
       batch_normalization (BatchNo (None, 1, 321, 50)        200
       average_pooling2d_1 (Average (None, 1, 107, 50)        0
       dropout_1 (Dropout)          (None, 1, 107, 50)        0
       conv2d_3 (Conv2D)            (None, 1, 98, 100)        50100
       batch_normalization_1 (Batch (None, 1, 98, 100)        400
       average_pooling2d_2 (Average (None, 1, 32, 100)        0
       dropout_2 (Dropout)          (None, 1, 32, 100)        0
       conv2d_4 (Conv2D)            (None, 1, 23, 200)        200200
       batch_normalization_2 (Batch (None, 1, 23, 200)        800
       average_pooling2d_3 (Average (None, 1, 7, 200)         0
       dropout_3 (Dropout)          (None, 1, 7, 200)         0
       conv2d_5 (Conv2D)            (None, 1, 1, 4)           5604
       flatten (Flatten)            (None, 4)                 0
       dense (Dense)                (None, 4)                 20

In Keras I calculate the unit outputs of the pooling layers and the last convolution layer with the following code (these are the ends of the different blocks you mentioned in the paper):

        out_tensor_1 = denseConvNet.get_layer(index=2).output
        out_tensor_2 = denseConvNet.get_layer(index=6).output
        out_tensor_3 = denseConvNet.get_layer(index=10).output
        out_tensor_4 = denseConvNet.get_layer(index=14).output
        out_tensor_pred = denseConvNet.get_layer(index=-3).output

        trained = []
        earlyPredictor_1 = tf.keras.Model(denseConvNet.input, out_tensor_1)
        trained.append(earlyPredictor_1.predict(data_test))
        earlyPredictor_2 = tf.keras.Model(denseConvNet.input, out_tensor_2)
        trained.append(earlyPredictor_2.predict(data_test))
        earlyPredictor_3 = tf.keras.Model(denseConvNet.input, out_tensor_3)
        trained.append(earlyPredictor_3.predict(data_test))
        earlyPredictor_4 = tf.keras.Model(denseConvNet.input, out_tensor_4)
        trained.append(earlyPredictor_4.predict(data_test))
        earlyPredictor_pred = tf.keras.Model(denseConvNet.input, out_tensor_pred)
        trained.append(earlyPredictor_pred.predict(data_test))

Because of that, my unit outputs have the same dimensions as the output shapes of the layers (channels last):

        Block 1                   (24, 1, 330, 25)
        Block 2                   (24, 1, 107, 50)
        Block 3                   (24, 1, 32, 100)
        Block 4                   (24, 1, 7, 200)
        Prediction Layer          (24, 1, 1, 4)

The receptive field size and the resulting envelope shapes of the corresponding layers are as follow:

        Receptive field size of layer 2/Block 1: 10 
        Mean squared envelope shape of layer 2/Block 1: (24, 22, 991, 1)

        Receptive field size of layer 6/Block 2: 39 
        Mean squared envelope shape of layer 6/Block 2:(24, 22, 962, 1)

        Receptive field size of layer 10/Block 3:126 
        Mean squared envelope shape of layer 10/Block 3:(24, 22, 875, 1)

        Receptive field size of layer 14/Block 4: 387 
        Mean squared envelope shape of layer 14/Block 4: (24, 22, 614, 1)

        Receptive field size of pred layer: 927 
        Mean squared envelope shape of pred layer: (24, 22, 74, 1)
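For reference, receptive-field sizes like those above follow from a simple recurrence once strides are converted to dilations for cropped decoding: each layer with kernel size k grows the receptive field by (k - 1) * current_dilation. A sketch (the layer list is my reading of the deep model's temporal kernels, not code from the repo):

```python
def receptive_field_sizes(layers):
    """layers: list of (kernel_size, stride) along the temporal axis.
    Strides accumulate as dilations, as in the cropped model."""
    rf, dilation, sizes = 1, 1, []
    for kernel, stride in layers:
        rf += (kernel - 1) * dilation
        dilation *= stride
        sizes.append(rf)
    return sizes

# 4 blocks of conv (kernel 10) + pool (kernel 3, stride 3), then final conv (kernel 7)
layers = [(10, 1), (3, 3)] * 4 + [(7, 1)]
print(receptive_field_sizes(layers))
# [10, 12, 39, 45, 126, 144, 387, 441, 927]
```

The post-convolution values 10, 39, 126, 387 and the final 927 match the sizes quoted above.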

Can you tell me how to understand the activations if they are not (simply) the outputs of one layer, and how to compute them?

@robintibor
Owner Author

Hi,
the main point that one needs to take into account is cropped decoding. See Figure 4 in https://onlinelibrary.wiley.com/doi/full/10.1002/hbm.23730 or the explanation at https://braindecode.org/auto_examples/plot_bcic_iv_2a_moabb_cropped.html

For cropped decoding without padding, the model produces as many outputs as it has inputs minus receptive field size plus 1 (n_outputs = n_inputs - receptive_field_size + 1).
Therefore one can apply average pooling to the envelope with pooling size equal to the receptive field size (and stride 1) and obtain, in the temporal dimension, exactly as many pooled values as the layer has output units. The average pooling then pools over exactly the same inputs that each output unit used to compute its output.
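A toy numeric check of this counting argument (numbers are illustrative):

```python
import numpy as np

n_inputs = 1000
receptive_field_size = 10
n_outputs = n_inputs - receptive_field_size + 1  # stride-1, no padding

env = np.random.randn(n_inputs) ** 2
# average pooling with window = receptive field size and stride 1
pooled_env = np.convolve(env, np.ones(receptive_field_size) / receptive_field_size,
                         mode='valid')
print(len(pooled_env) == n_outputs)  # True
```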

Concretely, to obtain cropped decoding, one must appropriately replace the max pooling strides by dilations in the subsequent layers.

Code from current braindecode does that here:

https://github.com/braindecode/braindecode/blob/6fe57295e5a1372dfd0abe77c494d1352bd647eb/braindecode/models/util.py#L9-L52

(It was done manually in this old repo, which is potentially more difficult to understand.)

Does that clear things up?

@JulianWisser

Hi,
thanks for the clarification. That helped a lot in understanding the problem, since until now I had only worked with trial-wise training.
You said one only has to "replace the max pooling stride by dilations in the following layers", as is done in the mentioned function.
So there is no need to transform the data that is used as input?

@robintibor
Copy link
Owner Author

No, there is no need to transform the input. In our paper we actually used [-500, +4000] ms for both trial-wise and cropped decoding, which is 1125 timesteps at 250 Hz. [0, 4000] ms should give similar, if slightly worse, results.
