[FieldTrip] Question about cluster-based permutation tests on linear mixed models
David Groppe
david.m.groppe at gmail.com
Wed Oct 26 19:35:03 CEST 2016
P.S. If you want to explore using FDR control to correct for multiple
comparisons, I would not recommend limiting yourself to FieldTrip's FDR
correction code (fdr.m). It only implements the Benjamini-Yekutieli FDR
control procedure, which is guaranteed to control the FDR at or below the
desired level but tends to be overly conservative in practice. The more
popular FDR control algorithm by Benjamini & Hochberg is not always
guaranteed to control the FDR at or below the desired level, but it is much
less conservative and tends to control the FDR accurately in practice. Here is
some code for the Benjamini & Hochberg algorithm:
https://www.mathworks.com/matlabcentral/fileexchange/27418-fdr-bh
MATLAB's mafdr.m function, which is part of the Bioinformatics Toolbox, also
implements the Benjamini & Hochberg algorithm.
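For reference, here is a minimal sketch of the Benjamini & Hochberg step-up
procedure itself (pvals is a vector of uncorrected p-values, q the desired FDR
level); for a tested implementation use fdr_bh from the link above or, if I
recall the option correctly, mafdr(pvals, 'BHFDR', true):

    q         = 0.05;                         % desired FDR level
    m         = numel(pvals);
    [ps, ord] = sort(pvals(:));               % p-values in ascending order
    crit      = (1:m)' * q / m;               % BH critical values i*q/m
    k         = find(ps <= crit, 1, 'last');  % largest i with p_(i) <= i*q/m
    h         = false(m, 1);                  % h(i) true -> reject H0 for test i
    if ~isempty(k)
        h(ord(1:k)) = true;
    end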
On Tue, Oct 25, 2016 at 3:28 PM, David Groppe <david.m.groppe at gmail.com>
wrote:
> I would definitely recommend running some simulations.
>
> It might be simpler to use bootstrap samples rather than permutations to
> generate your null distribution. Bootstrapping is also asymptotically
> accurate.
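> For illustration, a minimal sketch of a bootstrap null for a two-condition
> difference (hypothetical variable names): under the null hypothesis both
> conditions come from the same distribution, so both bootstrap samples are
> drawn, with replacement, from the pooled data:
>
>   pooled = [xA; xB];                  % xA, xB: trial values per condition
>   nA = numel(xA); nB = numel(xB);
>   nBoot = 1000;
>   dNull = zeros(nBoot, 1);
>   for iBoot = 1:nBoot
>       bA = pooled(randi(numel(pooled), nA, 1));   % resample with replacement
>       bB = pooled(randi(numel(pooled), nB, 1));
>       dNull(iBoot) = mean(bA) - mean(bB);
>   end
>   dObs = mean(xA) - mean(xB);
>   p = mean(abs(dNull) >= abs(dObs));  % two-sided bootstrap p-value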
> -David
>
>
>
> On Tue, Oct 25, 2016 at 1:29 PM, Alik Widge <alik.widge at gmail.com> wrote:
>
>> Thanks, that was super interesting! Was not aware of those.
>>
>> Have been meditating this afternoon on this and related Anderson papers.
>> What's interesting is that he appears to think my suggestion below *would*
>> be asymptotically acceptable -- *if* one specifically permutes the
>> dependent variable (power/ERP observation) rather than permuting each
>> column of the independent variables separately (i.e., if one preserves any
>> correlational structure that exists between the independent variables).
>> That's the Manly (1997) method, and it appears that the only reason it
>> breaks down sometimes is if there's an outlier in the independent variable.
>> This could presumably be a problem in the ecological sciences, for which
>> he's writing, where one can't control things like temperature in a season
>> or numbers of eels that swim past a given sensor. In cognitive
>> neuroscience, where the predictor/independent variables are usually
>> dummy-coded properties of the trial, it seems we might be on firmer
>> ground.
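>> To make the distinction concrete (hypothetical names; y is the n x 1
>> dependent variable, X the n x p matrix of independent variables):
>>
>>   % Manly-style permutation: shuffle the DEPENDENT variable as a whole,
>>   % leaving the correlational structure among the predictors intact,
>>   yPerm = y(randperm(numel(y)));    % ...then refit the model on (X, yPerm).
>>   % What one would NOT do is shuffle each predictor column separately, e.g.
>>   %   for j = 1:size(X, 2), X(:, j) = X(randperm(size(X, 1)), j); end
>>   % since that destroys the correlations between the independent variables.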
>>
>> That's an opinion based on reading and reasoning, of course, and not to be
>> trusted until and unless I or someone else backs it up with some
>> simulated-data experiments...
>>
>>
>> Alik Widge
>> alik.widge at gmail.com
>> (206) 866-5435
>>
>>
>> On Tue, Oct 25, 2016 at 11:30 AM, David Groppe <david.m.groppe at gmail.com>
>> wrote:
>>
>>> Hi Elisabeth and Alik,
>>> Permutation methods applied to multiple regression models are not
>>> generally guaranteed to be accurate, because testing individual terms in
>>> such models (e.g., partial correlation coefficients) requires accurate
>>> knowledge of the other terms in the model (e.g., the slope coefficients for
>>> all the other predictors in the multiple regression). Because such
>>> parameters have to be estimated from the data, permutation tests are only
>>> "asymptotically exact" for such tests (Anderson, 2001; Good, 2005). There
>>> are, however, special cases (e.g., a two-factor ANOVA with two levels of
>>> each factor) where permutation methods do guarantee accuracy.
>>> In lieu of permutation testing, you might want to try using one of
>>> Benjamini and colleagues' false discovery rate (FDR) control algorithms to
>>> correct for multiple comparisons. In my tests on simulated ERP data (Groppe
>>> et al., 2011), FDR correction was nearly as powerful as cluster-based
>>> permutation testing for detecting a very broadly distributed effect (e.g.,
>>> a P300-like effect) and it was far more sensitive than cluster-based
>>> testing for an effect with a very limited distribution (e.g., an N170-like
>>> effect). FDR correction is also very computationally efficient.
>>> Hope this is helpful,
>>> -David
>>>
>>>
>>> Refs:
>>> Anderson, M. J. (2001). Permutation tests for univariate or multivariate
>>> analysis of variance and regression. Canadian Journal of Fisheries and
>>> Aquatic Sciences, 58(3), 626-639.
>>>
>>> Good, P. I. (2005). Permutation, Parametric and Bootstrap Tests of
>>> Hypotheses: A Practical Guide to Resampling Methods for Testing Hypotheses.
>>>
>>> Groppe, D. M., Urbach, T. P., & Kutas, M. (2011). Mass univariate
>>> analysis of event-related brain potentials/fields II: Simulation studies.
>>> Psychophysiology, 48(12), 1726-1737.
>>>
>>>
>>> On Fri, Oct 21, 2016 at 1:38 PM, Elisabeth May <
>>> elisabethsusanne.may at gmail.com> wrote:
>>>
>>>> Dear Eric and Alik,
>>>>
>>>> thanks a lot for your helpful responses!
>>>>
>>>> I will have a close look at the FAQs, Eric, and test the approaches you
>>>> outlined. In any case, I am curious how different the results of simple
>>>> regressions will be compared to the multilevel results of the linear mixed
>>>> models.
>>>>
>>>> Like Alik, I am also curious about other people's opinions on the
>>>> general question of whether there are theoretical reasons against combining
>>>> the approaches as Alik suggested. We also thought about this approach but
>>>> haven't fully tested it yet because of the very long calculation times.
>>>>
>>>> Thanks again and have a nice weekend!
>>>> Elisabeth
>>>>
>>>> 2016-10-20 12:49 GMT+02:00 Alik Widge <alik.widge at gmail.com>:
>>>>
>>>>> Eric, I don't think I understand why you would say "I do not see how
>>>>> these models could be combined with permutation-based inference; they are
>>>>> just different statistical frameworks". As you somewhat hint, the (G)LMM is
>>>>> a regression, and the beta coefficient for the independent variable of
>>>>> interest at each voxel/vertex/sensor x timepoint can be interpreted as "how
>>>>> much does the independent variable explain the brain activity?" In that
>>>>> framework, it seems to me that one could do the following:
>>>>>
>>>>> for n = 1:1000
>>>>>     1) Permute the condition labels (within subjects) of the individual
>>>>>        trials
>>>>>     2) Re-fit the LMM at each (voxel, timepoint), creating a beta map
>>>>>        and a corresponding t-map
>>>>>     3) Threshold and construct the cluster mass statistic as usual
>>>>> end
>>>>> 4) Identify clusters in the original (unpermuted) analysis and report
>>>>>    cluster p-values
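>>>>>
>>>>> For concreteness, a rough, untested sketch of what steps 1-2 could look
>>>>> like at a single (voxel, timepoint), in plain MATLAB with fitlme from the
>>>>> Statistics Toolbox (hypothetical variable names, not FieldTrip code):
>>>>>
>>>>>   % pow:  nTrials x 1 power/ERP values at this (voxel, timepoint)
>>>>>   % cond: nTrials x 1 condition labels (0/1), subj: nTrials x 1 subject IDs
>>>>>   tbl   = table(pow, cond, subj);
>>>>>   lme   = fitlme(tbl, 'pow ~ cond + (1|subj)');
>>>>>   tObs  = lme.Coefficients.tStat(2);       % observed t of the condition effect
>>>>>   nPerm = 1000;
>>>>>   tNull = zeros(nPerm, 1);
>>>>>   for iPerm = 1:nPerm
>>>>>       tblPerm = tbl;
>>>>>       for s = unique(subj)'                % permute labels WITHIN each subject
>>>>>           idx               = find(subj == s);
>>>>>           tblPerm.cond(idx) = cond(idx(randperm(numel(idx))));
>>>>>       end
>>>>>       lmePerm      = fitlme(tblPerm, 'pow ~ cond + (1|subj)');
>>>>>       tNull(iPerm) = lmePerm.Coefficients.tStat(2);
>>>>>   end
>>>>>   p = mean(abs(tNull) >= abs(tObs));       % two-sided permutation p-value
>>>>>
>>>>> In the full procedure one would of course refit at every (voxel, timepoint)
>>>>> within each permutation and compute the cluster-mass statistic on the
>>>>> resulting t-map.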
>>>>>
>>>>>
>>>>> Now, the main thing that has come up when we've tried to do this is
>>>>> that re-fitting a (voxel x time) (G)LMM 1000 times with the standard
>>>>> iterative maximum-likelihood engines is remarkably slow. In FieldTrip, I
>>>>> imagine it would require rewriting at least a statfun, maybe other pieces
>>>>> of the code. (We had an idea that, since the betas should likely vary
>>>>> smoothly over time and space, one could use the output of one fit as the
>>>>> seed for the next, which would speed up convergence.) So it still may not
>>>>> be a good idea in practice, but based on the above, is there actually a
>>>>> *theoretical* reason it wouldn't work?
>>>>>
>>>>>
>>>>> Alik Widge, MD, PhD
>>>>> Director, Translational NeuroEngineering Laboratory
>>>>> Division of Neurotherapeutics, Massachusetts General Hospital
>>>>> Assistant Professor of Psychiatry, Harvard Medical School
>>>>> Clinical Fellow, Picower Institute for Learning & Memory (MIT)
>>>>> awidge at partners.org
>>>>> http://scholar.harvard.edu/awidge/
>>>>> 617-643-2580
>>>>>
>>>>> Alik Widge
>>>>> alik.widge at gmail.com
>>>>> (206) 866-5435
>>>>>
>>>>>
>>>>> On Thu, Oct 20, 2016 at 6:08 AM, Maris, E.G.G. (Eric) <
>>>>> e.maris at donders.ru.nl> wrote:
>>>>>
>>>>>> Note: this is the second time I post this reply, and the reason is
>>>>>> that I forgot to add an appropriate Subject (for findability) to my email
>>>>>> (shame on me…(-;)
>>>>>>
>>>>>> From: Elisabeth May <elisabethsusanne.may at gmail.com>
>>>>>> Subject: [FieldTrip] Question about cluster-based permutation tests on
>>>>>> linear mixed models
>>>>>> Date: 27 September 2016 at 14:46:55 GMT+2
>>>>>> To: <fieldtrip at science.ru.nl>
>>>>>> Reply-To: FieldTrip discussion list <fieldtrip at science.ru.nl>
>>>>>>
>>>>>>
>>>>>> Dear FieldTripers,
>>>>>>
>>>>>> I have a question about the potential use of cluster-based
>>>>>> permutation tests for results obtained using linear mixed models.
>>>>>>
>>>>>> We are working with data from a 10 min EEG experiment on source level
>>>>>> with the aim of quantifying the relationship of brain activity in different
>>>>>> frequency bands with continuous perceptual ratings across 20 subjects in
>>>>>> different experimental conditions. Thus, we have 10 min time courses of
>>>>>> brain activity and ratings for each voxel for different conditions and want
>>>>>> to test a) if there are significant relationships in the single conditions
>>>>>> and b) if these relationships differ between two conditions. To this end, I
>>>>>> have calculated linear mixed models in R using the lme4 toolbox. For both
>>>>>> the single condition relationships and the condition contrasts, they result
>>>>>> in a single t-value (and a corresponding p-value), which is based on
>>>>>> information on both the single subject and the group level (i.e. we perform
>>>>>> a multi-level analysis). However, with more than 2000 voxels, we have a lot
>>>>>> of t-values and are wondering if there is a way to apply cluster-based
>>>>>> tests to correct for multiple comparisons.
>>>>>>
>>>>>> The main problem I see is that I only have one multilevel t-value for
>>>>>> the effect across all subjects, i.e. I don't have single-subject values,
>>>>>> which I could then e.g. randomize between conditions as is normally done
>>>>>> in cluster-based permutation tests. (Or rather, I would be able to extract
>>>>>> single-subject values but would then lose the advantage of the multilevel
>>>>>> analysis.)
>>>>>>
>>>>>> I found an old thread in the mailing list archive where it was
>>>>>> suggested to flip the signs of the t-statistic for cluster-level correction
>>>>>> (https://mailman.science.ru.nl/pipermail/fieldtrip/2012-July/005375.html).
>>>>>> I understand that, in our case, I would do this randomly for all voxels in
>>>>>> each randomization and then build spatial clusters on the resulting (partly
>>>>>> flipped) t-values. However, I am not sure if that is a valid approach based
>>>>>> on the null hypothesis that there are no significant relations in my single
>>>>>> conditions (a) or no significant relationship differences in my condition
>>>>>> contrasts (b).
>>>>>>
>>>>>> For the condition contrasts, I think I would be able to permute the
>>>>>> condition labels as is normally done in cluster-based permutation tests, but
>>>>>> would then have to recalculate the linear mixed models for all voxels in
>>>>>> every permutation. This would result in a very high computational load.
>>>>>>
>>>>>> Does anyone have any experience with this kind of analysis? Would the
>>>>>> flipping of t-values be a valid approach (and if yes, is there anything to
>>>>>> keep in mind in particular)? Can you think of other ways to combine linear
>>>>>> mixed models with a multiple comparison correction on the cluster level?
>>>>>>
>>>>>>
>>>>>> Hi Elisabeth,
>>>>>>
>>>>>> I'm not an expert on linear mixed modelling, at least not with
>>>>>> respect to the different ways in which it can be used to deal with
>>>>>> correlated observations (typically, time series). However, from a
>>>>>> theoretical point of view, I do not see how these models could be combined
>>>>>> with permutation-based inference; they are just different statistical
>>>>>> frameworks. That said, it IS possible to answer your questions ("we
>>>>>> have 10 min time courses of brain activity and ratings for each voxel for
>>>>>> different conditions and want to test a) if there are significant
>>>>>> relationships in the single conditions and b) if these relationships differ
>>>>>> between two conditions.") within the framework of cluster-based permutation
>>>>>> tests. Question b) is the most straightforward, because it amounts to a
>>>>>> cluster-based permutation test using the depsamplesT statfun applied to the
>>>>>> regression coefficients from each of the two conditions. Answering question
>>>>>> a) requires that you bin your ratings into a number of categories, calculate
>>>>>> the trial-averaged EEG data for each of the categories, and test the
>>>>>> difference between them using a cluster-based permutation test with the
>>>>>> depsamplesregrT statfun. Both of these approaches have been described
>>>>>> previously on this discussion list, and for the depsamplesregrT statfun
>>>>>> (your question a), it was Vladimir Litvak who used it first (actually, I
>>>>>> implemented it for him). The approach for question b) is actually a variant
>>>>>> of the general approach for testing interactions using cluster-based
>>>>>> permutation tests.
>>>>>>
>>>>>> Have a look here:
>>>>>> http://www.fieldtriptoolbox.org/faq/how_can_i_test_for_correlations_between_neuronal_data_and_quantitative_stimulus_and_behavioural_variables
>>>>>> and
>>>>>> http://www.fieldtriptoolbox.org/faq/how_can_i_test_an_interaction_effect_using_cluster-based_permutation_tests
>>>>>>
>>>>>> These tutorials provide all the necessary concepts, although they do
>>>>>> not answer your question in a recipe-like fashion.
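>>>>>>
>>>>>> Just to give a flavour of what the cfg for question a) could look like, here
>>>>>> is a very rough, untested sketch (hypothetical variable names; the exact
>>>>>> options are documented in the FAQs above). It assumes the trial-averaged
>>>>>> source data are stored in a cell array binAvg{iBin, iSubj}, one structure
>>>>>> per rating bin and subject:
>>>>>>
>>>>>>   % rough sketch only, not a recipe
>>>>>>   nSubj = 20;                                 % number of subjects
>>>>>>   nBin  = 5;                                  % example number of rating bins
>>>>>>
>>>>>>   cfg                  = [];
>>>>>>   cfg.method           = 'montecarlo';
>>>>>>   cfg.statistic        = 'ft_statfun_depsamplesregrT';
>>>>>>   cfg.correctm         = 'cluster';
>>>>>>   cfg.numrandomization = 1000;
>>>>>>
>>>>>>   % design: row 1 = quantitative variable (bin value), row 2 = subject
>>>>>>   design       = zeros(2, nBin*nSubj);
>>>>>>   design(1, :) = repmat(1:nBin, 1, nSubj);    % bin varies fastest ...
>>>>>>   design(2, :) = repelem(1:nSubj, nBin);      % ... within each subject
>>>>>>   cfg.design   = design;
>>>>>>   cfg.ivar     = 1;
>>>>>>   cfg.uvar     = 2;
>>>>>>
>>>>>>   % binAvg{:} expands column-major (bin fastest), matching the design above
>>>>>>   stat = ft_sourcestatistics(cfg, binAvg{:});
>>>>>>
>>>>>> Depending on the source representation you will probably need additional
>>>>>> options for the spatial clustering; the FAQ pages above cover those details.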
>>>>>>
>>>>>> best,
>>>>>> Eric Maris
>>>>>>
>>>>>>