[FieldTrip] Questions regarding cluster-based permutation tests on TFRs

Gabriel Obregon-Henao obregon at uw.edu
Tue Aug 13 09:40:10 CEST 2019


Hi Mats,

After reading in detail about the different methods for correcting for the MCP, I’ve concluded that the best option for my analyses is to adjust for FDR. Where can I find documentation on setting up the configuration fields specific to cfg.correctm = ‘fdr’, and on how to interpret the output fields of the statistical analysis under this option? I want to make sure that I’m running a two-tailed test and that I’m plotting the adjusted p-values. I was also wondering whether there’s a way of running the Benjamini-Hochberg method as opposed to the Benjamini-Yekutieli method?
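To make the question concrete, here is a minimal sketch of the kind of call I have in mind (comparing two sets of single trials with a two-tailed t-test; freqA and freqB are placeholders for TFRs computed with keeptrials = 'yes'):

% minimal sketch: two conditions, trials as units of observation, FDR-corrected
cfg           = [];
cfg.method    = 'analytic';                 % parametric statistic, corrected for the MCP afterwards
cfg.statistic = 'ft_statfun_indepsamplesT'; % two sets of trials
cfg.correctm  = 'fdr';
cfg.tail      = 0;                          % two-tailed
cfg.alpha     = 0.05;
ntrlA = size(freqA.powspctrm,1);            % number of trials per condition
ntrlB = size(freqB.powspctrm,1);
cfg.design = [ones(1,ntrlA) 2*ones(1,ntrlB)];
cfg.ivar   = 1;                             % row of the design coding the condition
stat = ft_freqstatistics(cfg, freqA, freqB);
% stat.mask marks the chan/freq/time points that survive the correction;
% whether stat.prob then contains adjusted or uncorrected p-values is part of my question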

Thanks!

-Gabriel

--
Gabriel Obregon-Henao
PhD Candidate in Neuroscience 
University of Washington

> On Aug 9, 2019, at 4:31 AM, Es, M.W.J. van (Mats) <M.vanEs at donders.ru.nl> wrote:
> 
> Hi Gabriel,
>  
> Please make sure you’re responding to the fieldtrip mailing list instead of my personal e-mail, so that everyone can see the problem and learn from it, and potentially help you out.
>  
> 1.       It is definitely far from trivial to do statistics in the most sensitive way while making sure your conclusions are valid. It is important to think about your analysis during the design of the experiment. You can gain sensitivity from more specific hypotheses (for example by averaging power over a certain time range, frequency range, or both; see the sketch after point 3). You can of course use non-parametric permutation tests and make sure the FWER is controlled, but this is not the most sensitive test. I do want to warn you not to keep doing statistics until something ends up 'significant' (since nothing came out significant in your permutation test), because this highly inflates the false-positive rate.
> 2.       I would not include the baseline in your statistical inference, since it is not of interest to you. If you worry about your baselines being different because you have a blocked design, first just look at the data. If there is only a slow drift in your signal, you can easily remove it with a detrend or a high-pass filter at, for example, 0.1 Hz. If you see that the baselines differ over blocks, or between conditions, then you would probably baseline correct before comparing conditions.
> I'm not sure I understand why you want to compute relative power (vs. baseline) before doing statistical inference. If you want to baseline correct your conditions before statistics, you can just subtract the absolute power values. The 'relchange' normalization option in the ECoG tutorial is used for visualization and interpretation, not for doing statistics. For interpretation, it's easier to use relative numbers than absolute ones (would you understand a xx microvolt/femtotesla increase? How about a 35% increase?).
> 3.       It is possible to statistically compare average TFRs using cluster-based permutation tests, but you do need repetitions in some way: either use trials in a 1st-level statistic (comparing two conditions within one subject), or use subjects as the repetitions (in which case you can use the average TFR per subject); a sketch of both set-ups follows below. For classical analyses (e.g., differences in mean), it is not necessary to have the same number of trials in each condition (please see this thread on the EEGLAB discussion list: https://sccn.ucsd.edu/pipermail/eeglablist/2010/003240.html): the mean amplitude is not biased by the number of trials (the peak amplitude is, for example). So it's indeed important to think about what you're testing, and whether it's valid with different numbers of trials, but in the case of testing the means it is all OK.
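> For concreteness, a rough sketch of the two set-ups in point 3, combined with the window-of-interest idea from point 1 (the latency/frequency windows and variable names are just placeholders, and the details of course depend on your data):
> 
> % option A: trials as repetitions within one subject (1st-level statistic)
> cfg = [];
> cfg.method           = 'montecarlo';
> cfg.statistic        = 'ft_statfun_indepsamplesT';   % two sets of trials
> cfg.correctm         = 'cluster';
> cfg.clusteralpha     = 0.05;
> cfg.clusterstatistic = 'maxsum';
> cfg.numrandomization = 1000;
> cfg.neighbours       = [];            % no channel clustering; use ft_prepare_neighbours if you want it
> cfg.latency          = [0.5 1.5];     % placeholder: restrict to a window of interest (point 1)
> cfg.frequency        = [60 90];       % placeholder frequency band
> % cfg.avgovertime = 'yes'; cfg.avgoverfreq = 'yes';  % or average over the window entirely
> ntrlA = size(freqA.powspctrm,1);      % freqA/freqB computed with keeptrials = 'yes'
> ntrlB = size(freqB.powspctrm,1);
> cfg.design = [ones(1,ntrlA) 2*ones(1,ntrlB)];
> cfg.ivar   = 1;
> stat_trials = ft_freqstatistics(cfg, freqA, freqB);
> 
> % option B: subjects as repetitions (group level), one average TFR per subject and condition
> cfg.statistic = 'ft_statfun_depsamplesT';
> nsubj      = numel(tfrA);             % cell arrays of per-subject average TFRs
> cfg.design = [1:nsubj 1:nsubj; ones(1,nsubj) 2*ones(1,nsubj)];
> cfg.uvar   = 1;                       % subject number
> cfg.ivar   = 2;                       % condition
> stat_group = ft_freqstatistics(cfg, tfrA{:}, tfrB{:});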
>  
> Hope this answers most of your questions.
>  
> Happy computing,
> Mats
> From: Gabriel Obregon-Henao <obregon at uw.edu> 
> Sent: donderdag 8 augustus 2019 23:12
> To: Es, M.W.J. van (Mats) <M.vanEs at donders.ru.nl>
> Subject: Re: [FieldTrip] Questions regarding cluster-based permutation tests on TFRs
>  
> Hi Mats,
>  
> Thank you for your prompt reply. I've implemented the solution that you suggested in response to my first question, and it seems to be working. I've been trying to analyze the data without prior assumptions and/or hypotheses, and I believe this is having a detrimental effect on the sensitivity of my statistical analyses, given the broad spectral and temporal ranges that I'm testing. I've started to think about strategies for constraining my analyses to specific time periods and/or frequency bands, by selecting them in a principled way that avoids circular analysis, but it seems far from a trivial process.
>  
> With regard to your second answer, would it then be best to test whether there is a difference in baseline activity between conditions by including it within the time period of interest (i.e., cfg.latency = 'all')? My experiment does use a blocked design in which two conditions are tested within two different contexts. The assumption of equal baselines seems fair to me within a given context, but it might not hold across the two contexts. Thus, I've been thinking of computing power relative to baseline, but I'm afraid that using a ratio (via 'relchange' or 'db' baseline normalization) might introduce spurious effects given the low SNR at the single-trial level. Should I be concerned about this, and if so, why is 'relchange' baseline normalization implemented in the example of analyzing high-gamma activity in human ECoG? Wouldn't it be better to use 'absolute' baseline normalization? Alternatively, could one use the average baseline across all trials to normalize the power estimates in the activation period at the single-trial level?
>  
> I knew that the order of trial-averaging and baseline normalization produces different results, but other than introducing spurious changes at the single-trial level, I wasn't aware that there was a "right" order for running statistical analyses (especially when using a nonparametric statistical test).
>  
> Finally, I was wondering if it's possible to compare two average TFRs using cluster-based permutation tests, and if so, what would one use for the 'statistic', 'ivar', 'uvar', and 'design' fields? Also, if I were to compare multiple conditions within a subject, do I need to have the same number of trials per condition for computing the F statistic? Is there a way of pulling out interactions from the latter test?
>  
> Thanks!
>  
> --Gabriel
>  
>  
>  
> On Fri, Aug 2, 2019 at 6:15 AM Es, M.W.J. van (Mats) <M.vanEs at donders.ru.nl> wrote:
> Hi Gabriel,
>  
> In response to your first question: yes, it is possible to test your activation period against your shorter baseline period. It is common to average the baseline over time anyway, since you expect it to be relatively static. In that case, you can just repmat the time-averaged baseline power so that it matches the length of the activation period's time window:
> Assuming the dimord is 'chan_freq_time':
> freq_baseline.powspctrm = repmat(mean(freq_baseline.powspctrm,3), [1 1 size(freq_active.powspctrm,3)]);
> Now you can easily compare freq_active and freq_baseline with ft_freqstatistics.
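> If single trials are kept (so the dimord is 'rpt_chan_freq_time' and time is the 4th dimension), the same trick plus the comparison could look roughly like this; the statfun and design below are one possible choice, not the only one:
> 
> % time-average the baseline and tile it to the length of the activation window (time is dim 4 here)
> freq_baseline.powspctrm = repmat(mean(freq_baseline.powspctrm,4), [1 1 1 size(freq_active.powspctrm,4)]);
> freq_baseline.time      = freq_active.time;      % align the time axes so the two structures match
> 
> ntrl = size(freq_active.powspctrm,1);
> cfg  = [];
> cfg.method           = 'montecarlo';
> cfg.statistic        = 'ft_statfun_depsamplesT'; % activation and baseline come from the same trials
> cfg.correctm         = 'cluster';
> cfg.numrandomization = 1000;
> cfg.neighbours       = [];                       % or ft_prepare_neighbours for channel clustering
> cfg.design = [1:ntrl 1:ntrl; ones(1,ntrl) 2*ones(1,ntrl)];
> cfg.uvar   = 1;                                  % trial number
> cfg.ivar   = 2;                                  % activation vs baseline
> stat = ft_freqstatistics(cfg, freq_active, freq_baseline);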
>  
> In response to your second question: baseline normalization is not strictly necessary when comparing two conditions. Assuming that there is no difference between baselines (which is often a fair assumption), you can directly compare the conditions without baseline correcting. Of course, this doesn’t always hold (think of cases in which baselines might differ, for example in certain blocked designs), and is thus dependent on the experimental design.
>  
> In response to your last question: if you use 'absolute', the order of trial-averaging and baseline subtraction makes no difference. For the other normalizations there is a difference, though (try this out yourself; a small numerical example follows below), and doing it in the wrong order might invalidate your statistics.
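> To illustrate that last point with some made-up numbers:
> 
> a = [10 30];  b = [2 5];          % activation and baseline power for two trials (made up)
> % 'absolute' (subtraction): averaging and baselining commute
> mean(a - b)                       % 16.5
> mean(a) - mean(b)                 % 16.5 as well
> % 'relchange' (ratio): they do not commute
> mean((a - b)./b)                  % 4.5
> (mean(a) - mean(b)) / mean(b)     % ~4.71, a different number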
>  
> Hope this clears things up!
> Best,
> Mats
>  
> PhD candidate
> Dynamic Connectivity
> Donders Institute for Brain, Cognition and Behaviour
> e: m.vanes at donders.ru.nl
> a: Kapittelweg 29, 6525 EN Nijmegen
> p: +31(0)24 36 68291
>  
> From: Gabriel Obregon-Henao <obregon at uw.edu> 
> Sent: donderdag 1 augustus 2019 23:47
> To: fieldtrip at science.ru.nl
> Subject: [FieldTrip] Questions regarding cluster-based permutation tests on TFRs
>  
> Hello everyone,
>  
> I've been going over the tutorial on cluster-based permutation tests on time-frequency data and I have a couple of questions I'd really appreciate getting help with. 
>  
> First, I was wondering if it's possible to compare the TFRs in the baseline with multiple non-overlapping segments in the activation period, and if so, how would one go about doing so?
> The lengths of my baseline periods vary between 1 and 1.5 s, but I can only use the last 500 ms or so to avoid activity evoked by the onset of a fixation cross within the ITI. The length of my activation periods, on the other hand, is fixed at 3 s. Since one has to use equal-length intervals from the baseline and activation periods for the statistical test, I can only select a 500 ms segment from my activation period. Is there a workaround for testing the whole activation period against baseline, such as breaking it into multiple 500 ms segments and correcting for the MCP post hoc? Alternatively, could one trick the test by upsampling the baseline period so that it contains the same number of samples as the activation period?
>  
> Second, I was wondering why baseline normalization is not performed in the between-trial and within-subject experiment examples before running the cluster-based permutation tests. This is in contrast to the example on analyzing high-gamma activity from human ECoG recordings. Are there any subtleties in the preprocessing stages that I'm overlooking (e.g., in the cluster-based permutation tests tutorial the epochs are re-centered using the entire length of the epoch)? Does baseline normalization depend on the experimental question being asked, and/or does it affect the sensitivity of the statistical test (I know that it definitely changes the interpretation of the results)?
>  
> Finally, I was wondering whether it matters if one performs baseline normalization (using a method other than 'absolute') at the single-trial level or on the trial average, and how it impacts the results?
>  
> Thanks in advance!
>  
> --Gabriel
>  
> --
> Gabriel Obregon-Henao
> PhD Candidate in Neuroscience
> University of Washington
> 
>  
> --
> Gabriel Obregon-Henao
> PhD Candidate in Neuroscience
> University of Washington

