[FieldTrip] low-pass filtering

Wed Jun 27 15:29:53 CEST 2018

Dear Burkhard,

Thank you very much for your input. I totally missed the bandwidth point. I
rerun the cluster based comparing the 500hz lowpass filtered at 40hz and
the downsampled data to 125hz (see code below). As you suggested, the
cluster analysis output is almost identical.

Best,

Diego

%-------------------------------

fsample = 500;
sigma = 0.5;
ferf = 6.0;
Aerf1 = -0.7;
Aerf2 = -0.2;
tau1 = 0.1;
tau2 = 0.05;
t = -0.8:1/fsample:0.8;
signoise = 0.2;
t0 = 0.05;
t0idx = nearest(t,t0);
ntrials = 100;

data1.label{1}='Oz';
data2.label{1}='Oz';

for k=1:ntrials;
  freq1 = normrnd(ferf,sigma);
  freq2 = normrnd(ferf,sigma);
  for tk = 1:size(t,2);
    phi1 = random('uniform',0,pi,1,1);
    phi2 = random('uniform',0,pi,1,1);
    if tk > t0idx;
      Serf1(1,tk) = Aerf2 * exp(1 - ((t(1,tk)-t0)/tau1)) *
sin(2*pi*freq1*(t(1,tk)-t0)+phi1);
      Serf2(1,tk) = Aerf2 * exp(1 - ((t(1,tk)-t0)/tau2)) *
sin(2*pi*freq2*(t(1,tk)-t0)+phi2);
    end
    Snoise1 = signoise.*randn(size(t));
    Snoise2 = signoise.*randn(size(t));
  end
  data1.trial{k}(1,:) = Serf1 + Snoise1;
  data1.time{k} = t;
  data2.trial{k}(1,:) = Serf2 + Snoise2;
  data2.time{k} = t;
end

cfg = [];
cfg.keeptrials = 'yes';
timelock1 = ft_timelockanalysis(cfg, data1);
timelock2 = ft_timelockanalysis(cfg, data2);

diff = timelock1;
diff.avg = timelock1.avg-timelock2.avg;

figure;plot(t,[timelock1.avg' timelock2.avg' diff.avg']);
legend('condition 1', 'condition 2', 'difference');

cfg = [];
cfg.design = [ 1*ones(1,ntrials) 2*ones(1,ntrials) ];
cfg.ivar = 1;
cfg.method = 'montecarlo';
cfg.statistic = 'indepsamplesT';
cfg.correctm = 'cluster';
cfg.numrandomization = 5000;
stat = ft_timelockstatistics(cfg, timelock1, timelock2);

figure;
subplot(4,1,1); plot(stat.time, stat.stat); ylabel('t-value');
title('raw data: unfiltered')
subplot(4,1,2); plot(diff.time, diff.avg); ylabel('avg1-avg2 (uV)');
subplot(4,1,3); semilogy(stat.time, stat.prob); ylabel('prob');
subplot(4,1,4); plot(stat.time, stat.mask); ylabel('significant');

%% low pass 40 Hz
cfg          = [];
cfg.lpfilter = 'yes';
cfg.lpfreq   = 40;
data1_lp40   = ft_preprocessing(cfg, data1);
data2_lp40   = ft_preprocessing(cfg, data2);

cfg = [];
cfg.keeptrials = 'yes';
timelock1 = ft_timelockanalysis(cfg, data1_lp40);
timelock2 = ft_timelockanalysis(cfg, data2_lp40);

diff_lp40 = timelock1;
diff_lp40.avg = timelock1.avg-timelock2.avg;

figure;plot(t,[timelock1.avg' timelock2.avg' diff_lp40.avg']);

cfg = [];
cfg.design = [ 1*ones(1,ntrials) 2*ones(1,ntrials) ];
cfg.ivar = 1;
cfg.method = 'montecarlo';
cfg.statistic = 'indepsamplesT';
cfg.correctm = 'cluster';
cfg.numrandomization = 5000;
stat_lp40 = ft_timelockstatistics(cfg, timelock1, timelock2);

figure;
subplot(4,1,1); plot(stat_lp40.time, stat_lp40.stat); ylabel('t-value');
title('lowpass filtering 40Hz')
subplot(4,1,2); plot(diff_lp40.time, diff_lp40.avg); ylabel('avg1-avg2
(uV)');
subplot(4,1,3); semilogy(stat_lp40.time, stat_lp40.prob); ylabel('prob');
subplot(4,1,4); plot(stat_lp40.time, stat_lp40.mask);
ylabel('significant');

%% boxcar 0.25ms
cfg = [];
cfg.boxcar = 0.25;
data1_box   = ft_preprocessing(cfg, data1);
data2_box   = ft_preprocessing(cfg, data2);

cfg = [];
cfg.keeptrials = 'yes';
timelock1 = ft_timelockanalysis(cfg, data1_box);
timelock2 = ft_timelockanalysis(cfg, data2_box);

diff = timelock1;
diff.avg = timelock1.avg-timelock2.avg;

figure;plot(t,[timelock1.avg' timelock2.avg' diff.avg']);

cfg = [];
cfg.design = [ 1*ones(1,ntrials) 2*ones(1,ntrials) ];
cfg.ivar = 1;
cfg.method = 'montecarlo';
cfg.statistic = 'indepsamplesT';
cfg.correctm = 'cluster';
cfg.numrandomization = 5000;
stat = ft_timelockstatistics(cfg, timelock1, timelock2);

figure;
subplot(4,1,1); plot(stat.time, stat.stat); ylabel('t-value');
title('boxcar filter 0.25s')
subplot(4,1,2); plot(diff.time, diff.avg); ylabel('avg1-avg2 (uV)');
subplot(4,1,3); semilogy(stat.time, stat.prob); ylabel('prob');
subplot(4,1,4); plot(stat.time, stat.mask); ylabel('significant');

%% downsample the data
cfg = [];
cfg.resamplefs = 125;
cfg.demean = 'yes';
data1_res = ft_resampledata(cfg, data1);
data2_res = ft_resampledata(cfg, data2);

cfg = [];
cfg.keeptrials = 'yes';
timelock1 = ft_timelockanalysis(cfg, data1_res);
timelock2 = ft_timelockanalysis(cfg, data2_res);

diff_res = timelock1;
diff_res.avg = timelock1.avg-timelock2.avg;

figure;plot(diff_res.time,[timelock1.avg' timelock2.avg' diff_res.avg']);

cfg = [];
cfg.design = [ 1*ones(1,ntrials) 2*ones(1,ntrials) ];
cfg.ivar = 1;
cfg.method = 'montecarlo';
cfg.statistic = 'indepsamplesT';
cfg.correctm = 'cluster';
cfg.numrandomization = 5000;
stat_res = ft_timelockstatistics(cfg, timelock1, timelock2);

figure;
hold all;subplot(4,1,1); plot(stat_res.time, stat_res.stat);
ylabel('t-value');
title('downsample to 125 Hz')
hold all;subplot(4,1,2); plot(diff_res.time, diff_res.avg);
ylabel('avg1-avg2 (uV)');
hold all;subplot(4,1,3); semilogy(stat_res.time, stat_res.prob);
ylabel('prob');
subplot(4,1,4); plot(stat_res.time, stat_res.mask); ylabel('significant');

%% all in one plot
figure;
subplot(4,1,1); plotyy(stat_res.time, stat_res.stat, stat_lp40.time,
stat_lp40.stat); ylabel('t-value');
subplot(4,1,2); plotyy(diff_res.time, diff_res.avg, diff_lp40.time,
diff_lp40.avg); ylabel('avg1-avg2 (uV)');
subplot(4,1,3); semilogy(stat_res.time, stat_res.prob); hold
all;semilogy(stat_lp40.time, stat_lp40.prob); ylabel('prob');
subplot(4,1,4); plotyy(stat_res.time, stat_res.mask,stat_lp40.time,
stat_lp40.mask); ylabel('significant');
legend('downsampled 125hz','500hz + lpfilt 40hz')

On 27 June 2018 at 13:58, Burkhard Maess <maess at cbs.mpg.de> wrote:

> Hi Irina and Diego,
>
> I am also expecting that the permutation test is rather insensitive to the
> bandwidth of the underlying data. I only would like to comment on the
> resampling issue: as a rule of thumb, often a third of the sampling
> frequency is assumed to represent the available bandwidth. With this ratio
> - the 500Hz sampled data represents about 160Hz bandwidth. Applying a
> lowpass of 40Hz (of similar steepness as the one used during the recording)
> will divide the bandwidth by four. Consequently, dividing the sampling rate
> by 4 is appropriate: 125Hz, which corresponds to delta-t of 8ms. The
> suggested 25ms by the reviewer are related to the 40Hz, but averaging over
> the 25ms would constitute an additional lowpass of poor quality.
> I would therefore recommend to rerun the cluster analysis using subsampled
> data. In principle, you may retrieve four different sets from the
> subsampling routine by taking either always the first, the second, the
> third or the forth sample of the 8ms interval. However, the results of the
> cluster analysis should be almost identical.
>
> best wishes,
> Burkhard
>
>
>
> --
> Dr. Burkhard Maess
> Max Planck Institute for Human Cognitive and Brain Sciences
> Stephanstr. 1a, D-04301 Leipzig, Germany
> phone/fax: +49 341 9940-2526/-2511   mail: maess 'at' cbs.mpg.de,
> http://www.cbs.mpg.de
>
> ----- Original Message -----
> | From: "Diego Lozano-Soldevilla" <dlozanosoldevilla at gmail.com>
> | To: "FieldTrip discussion list" <fieldtrip at science.ru.nl>
> | Sent: Wednesday, 27 June, 2018 12:09:43
> | Subject: Re: [FieldTrip] low-pass filtering
>
> | Hi Irina,
> |
> | On the one hand, the smoothing can indeed influence the cluster size. On
> the
> | other hand, the "type" of smoothing will influence how sensitive are you
> to
> | detect the potential differences too. I would try to illustrate the
> reviewer
> | the trade off between the type of differences you may have (transient vs
> | sustained; regarding the temporal domain) and the filter type that can
> be more
> | appropriate to detect the differences to justify your pipeline. I have
> no idea
> | about the type of ERF you're studying but I hope the example is
> illustrative
> | enough
> |
> | Look at the attached screenshot. In the first column, I simulated a
> transient
> | ERF with added noise (code below) and I ran the non-parametric
> cluster-based
> | permutation test that showed some significant and discontinuous
> differences.
> | The second column represents the same data but with a lowpass filter (40
> Hz)
> | and I find longer significant clusters. If you look at the filtered
> ERFs, the
> | structure is very similar to the noisy, unfiltered version. The third
> column is
> | what I understand is requested by the reviewer. Suddenly, no significant
> | differences are found because the ground truth simulated difference was
> too
> | short and weak to survive the 0.25ms averaging. In case of a long and
> sustained
> | ERF, may be the three plots would had been more congruent.
> |
> | I guess that the permutation part of the cluster-based nonparametric
> test will
> | take into account the non-independecy in the data resulting from low-pass
> | filtering. This non-independency will be the same for the original and
> the
> | permuted surrogates.
> |
> | I hope that helps!
> |
> | Diego
> |
> |
> |
> | %---------------------------------
> |
> | fsample = 600;
> | sigma = 0.5;
> | ferf = 6.0;
> | Aerf1 = -0.7;
> | Aerf2 = -0.2;
> | tau1 = 0.1;
> | tau2 = 0.05;
> | t = -0.8:1/fsample:0.8;
> | signoise = 0.2;
> | t0 = 0.05;
> | t0idx = nearest(t,t0);
> | ntrials = 100;
> |
> | data1.label{1}='Oz';
> | data2.label{1}='Oz';
> |
> | for k=1:ntrials;
> | freq1 = normrnd(ferf,sigma);
> | freq2 = normrnd(ferf,sigma);
> | for tk = 1:size(t,2);
> | phi1 = random('uniform',0,pi,1,1);
> | phi2 = random('uniform',0,pi,1,1);
> | if tk > t0idx;
> | Serf1(1,tk) = Aerf2 * exp(1 - ((t(1,tk)-t0)/tau1)) *
> | sin(2*pi*freq1*(t(1,tk)-t0)+phi1);
> | Serf2(1,tk) = Aerf2 * exp(1 - ((t(1,tk)-t0)/tau2)) *
> | sin(2*pi*freq2*(t(1,tk)-t0)+phi2);
> | end
> | Snoise1 = signoise.*randn(size(t));
> | Snoise2 = signoise.*randn(size(t));
> | end
> | data1.trial{k}(1,:) = Serf1 + Snoise1;
> | data1.time{k} = t;
> | data2.trial{k}(1,:) = Serf2 + Snoise2;
> | data2.time{k} = t;
> | end
> |
> | cfg = [];
> | cfg.keeptrials = 'yes';
> | timelock1 = ft_timelockanalysis(cfg, data1);
> | timelock2 = ft_timelockanalysis(cfg, data2);
> |
> | diff = timelock1;
> | diff.avg = timelock1.avg-timelock2.avg;
> |
> | figure;plot(t,[timelock1.avg' timelock2.avg' diff.avg']);
> | legend('condition 1', 'condition 2', 'difference');
> |
> | cfg = [];
> | cfg.design = [ 1*ones(1,ntrials) 2*ones(1,ntrials) ];
> | cfg.ivar = 1;
> | cfg.method = 'montecarlo';
> | cfg.statistic = 'indepsamplesT';
> | cfg.correctm = 'cluster';
> | cfg.numrandomization = 5000;
> | stat = ft_timelockstatistics(cfg, timelock1, timelock2);
> |
> | figure;
> | subplot(4,1,1); plot(stat.time, stat.stat); ylabel('t-value');
> | title('raw data: unfiltered')
> | subplot(4,1,2); plot(diff.time, diff.avg); ylabel('avg1-avg2 (uV)');
> | subplot(4,1,3); semilogy(stat.time, stat.prob); ylabel('prob');
> | subplot(4,1,4); plot(stat.time, stat.mask); ylabel('significant');
> |
> |
> | %% low pass 40 Hz
> | cfg = [];
> | cfg.lpfilter = 'yes';
> | cfg.lpfreq = 40;
> | data1_lp40 = ft_preprocessing(cfg, data1);
> | data2_lp40 = ft_preprocessing(cfg, data2);
> |
> | cfg = [];
> | cfg.keeptrials = 'yes';
> | timelock1 = ft_timelockanalysis(cfg, data1_lp40);
> | timelock2 = ft_timelockanalysis(cfg, data2_lp40);
> |
> | diff = timelock1;
> | diff.avg = timelock1.avg-timelock2.avg;
> |
> | figure;plot(t,[timelock1.avg' timelock2.avg' diff.avg']);
> |
> |
> | cfg = [];
> | cfg.design = [ 1*ones(1,ntrials) 2*ones(1,ntrials) ];
> | cfg.ivar = 1;
> | cfg.method = 'montecarlo';
> | cfg.statistic = 'indepsamplesT';
> | cfg.correctm = 'cluster';
> | cfg.numrandomization = 5000;
> | stat = ft_timelockstatistics(cfg, timelock1, timelock2);
> |
> | figure;
> | subplot(4,1,1); plot(stat.time, stat.stat); ylabel('t-value');
> | title('lowpass filtering 40Hz')
> | subplot(4,1,2); plot(diff.time, diff.avg); ylabel('avg1-avg2 (uV)');
> | subplot(4,1,3); semilogy(stat.time, stat.prob); ylabel('prob');
> | subplot(4,1,4); plot(stat.time, stat.mask); ylabel('significant');
> |
> |
> | %% boxcar 0.25ms
> | cfg = [];
> | cfg.boxcar = 0.25;
> | data1_box = ft_preprocessing(cfg, data1);
> | data2_box = ft_preprocessing(cfg, data2);
> |
> | cfg = [];
> | cfg.keeptrials = 'yes';
> | timelock1 = ft_timelockanalysis(cfg, data1_box);
> | timelock2 = ft_timelockanalysis(cfg, data2_box);
> |
> | diff = timelock1;
> | diff.avg = timelock1.avg-timelock2.avg;
> |
> | figure;plot(t,[timelock1.avg' timelock2.avg' diff.avg']);
> |
> | cfg = [];
> | cfg.design = [ 1*ones(1,ntrials) 2*ones(1,ntrials) ];
> | cfg.ivar = 1;
> | cfg.method = 'montecarlo';
> | cfg.statistic = 'indepsamplesT';
> | cfg.correctm = 'cluster';
> | cfg.numrandomization = 5000;
> | stat = ft_timelockstatistics(cfg, timelock1, timelock2);
> |
> | figure;
> | subplot(4,1,1); plot(stat.time, stat.stat); ylabel('t-value');
> | title('boxcar filter 0.25s')
> | subplot(4,1,2); plot(diff.time, diff.avg); ylabel('avg1-avg2 (uV)');
> | subplot(4,1,3); semilogy(stat.time, stat.prob); ylabel('prob');
> | subplot(4,1,4); plot(stat.time, stat.mask); ylabel('significant');
> |
> |
> |
> |
> |
> | On 26 June 2018 at 17:16, Simanova, I. (Irina) <
> i.simanova at donders.ru.nl >
> | wrote:
> |
> |
> |
> | Dear experts,
> |
> | We recently submitted a paper where we use a cluster permutation
> analysis of
> | ERFs (testing conditions exchangeability) . The EEG was sampled at 500
> Hz and
> | low-pass filtered at 40 Hz during preprocessing. One of the reviewers has
> | indicated that u sing this temporal resolution for the analysis seems
> | unnecessary given the low-pass filter, which makes each time point not
> | independent due to smoothing. He further asks: " Do all key significant
> | components replicate if the analysis is performed using a mean amplitude
> that
> | is averaged for larger time bins (25 ms bins) instead of individual time
> | points?"
> |
> | I understand that when doing parametric analysis one might want to
> reduce the
> | number of comparisons by averaging ERF samples over time bins. But here
> we
> | solve the MCP with the cluster analysis. But it also seems like the
> reviewer is
> | concerned about non-independecy in the data resulting from low-pass
> filtering.
> | I wanted to check with you: can smoothing time-series indeed compromise
> the
> | cluster-based analysis (e.g. affect the cluster size)?
> |
> | Thank you,
> | Kind regards,
> | Irina
> |
> |
> |
> |
> | _______________________________________________
> | fieldtrip mailing list
> | https://mailman.science.ru.nl/mailman/listinfo/fieldtrip
> | https://doi.org/10.1371/journal.pcbi.1002202
> |
> |
> | _______________________________________________
> | fieldtrip mailing list
> | https://mailman.science.ru.nl/mailman/listinfo/fieldtrip
> | https://doi.org/10.1371/journal.pcbi.1002202
> _______________________________________________
> fieldtrip mailing list
> https://mailman.science.ru.nl/mailman/listinfo/fieldtrip
> https://doi.org/10.1371/journal.pcbi.1002202
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.science.ru.nl/pipermail/fieldtrip/attachments/20180627/8dcdf5af/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot from 2018-06-27 15-11-38.png
Type: image/png
Size: 58548 bytes
Desc: not available
URL: <http://mailman.science.ru.nl/pipermail/fieldtrip/attachments/20180627/8dcdf5af/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: untitled.png
Type: image/png
Size: 13832 bytes
Desc: not available
URL: <http://mailman.science.ru.nl/pipermail/fieldtrip/attachments/20180627/8dcdf5af/attachment-0001.png>