[FieldTrip] Effect size measure for cluster-based permutation tests

Tue Sep 5 14:12:52 CEST 2017

Dear discussion list readers & contributors (especially Christine Blume),

There have been many questions (not only on the FT discussion list) about the calculation of effect size measures in the context of cluster-based permutation tests. I will continue my reply under the quotes below.

From: Blume Christine <christine.blume at sbg.ac.at>
Subject: Re: [FieldTrip] Effect size measure for cluster-based permutation tests
Date: 4 September 2017 at 14:28:47 GMT+2
To: FieldTrip discussion list <fieldtrip at science.ru.nl>
Reply-To: FieldTrip discussion list <fieldtrip at science.ru.nl>

Hi Alik,

Thanks a lot for your suggestion, which I hoped would prompt more answers. Does anyone have suggestions on how exactly to implement the calculation of an effect size measure?

Best,
Christine

From: fieldtrip-bounces at science.ru.nl On Behalf Of Alik Widge
Sent: Wednesday, 23 August 2017 16:42
To: FieldTrip discussion list
Subject: Re: [FieldTrip] Effect size measure for cluster-based permutation tests

My naive answer, which perhaps will provoke Eric to provide a better one: you have the actual cluster statistic and its permutation distribution under the null hypothesis. It seems as though that distribution could be assumed Gaussian and something like Cohen's d calculated.

On Aug 23, 2017 9:35 AM, "Blume Christine" <christine.blume at sbg.ac.at> wrote:
Dear all,

I came across a question posted by someone about a year ago, which concerned effect size measures for cluster-based permutation tests. Unfortunately, the question does not seem to have been answered…

Q: I am using cluster-based permutation tests (depsamplesT, on time-frequency data) and am wondering how to best calculate an effect size from that.

Best,
Christine

A useful answer to this question depends on what the effect size measure will be used for. Typically, a standardised effect size measure is required to perform a power calculation. A power calculation is possible for a number of parametric statistical tests, such as the t-test and the F-test. As input for this power calculation, Cohen’s d is required. A sensible value for Cohen’s d can sometimes be found in published studies (preferably ones with large sample sizes).

Cohen’s d can easily be obtained from the outcome of a cluster-based permutation test:

  1.  Calculate the non-standardised effect sizes by averaging the (sensor, frequency, time)-specific effects within the cluster of interest. Typically, the (sensor, frequency, time)-specific effects are raw differences between the subject averages for the experimental conditions that are being compared.
  2.  Calculate the standard deviation over the subjects of these non-standardised effect sizes.
  3.  Calculate Cohen’s d by dividing the grand average of the non-standardised effect sizes by the standard deviation obtained in 2.
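
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FieldTrip code: the function name cohens_d_from_cluster, the flattened per-subject layout (one list per subject, with all sensor/frequency/time samples in a single list) and the Boolean cluster_mask are hypothetical choices made for this example. The standard deviation is the sample standard deviation (n - 1 in the denominator).

```python
from statistics import mean, stdev


def cohens_d_from_cluster(cond_a, cond_b, cluster_mask):
    """Cohen's d for a paired design, restricted to one cluster.

    cond_a, cond_b : one list per subject, each holding that subject's
                     average for every (sensor, frequency, time) sample,
                     flattened into a single list (hypothetical layout).
    cluster_mask   : list of booleans, True for samples inside the cluster.
    """
    if len(cond_a) != len(cond_b):
        raise ValueError("both conditions need the same number of subjects")
    if not any(cluster_mask):
        raise ValueError("the cluster mask selects no samples")

    # Step 1: per subject, average the raw condition differences
    # over the samples that belong to the cluster.
    subject_effects = []
    for a, b in zip(cond_a, cond_b):
        diffs = [x - y for x, y, inside in zip(a, b, cluster_mask) if inside]
        subject_effects.append(mean(diffs))

    # Step 2: standard deviation of these effects over subjects.
    sd_over_subjects = stdev(subject_effects)

    # Step 3: grand average divided by the standard deviation.
    return mean(subject_effects) / sd_over_subjects


# Toy data: 3 subjects, 4 samples, the first two samples form the cluster.
cond_a = [[2.0, 4.0, 0.0, 0.0], [3.0, 5.0, 9.0, 9.0], [4.0, 6.0, 1.0, 1.0]]
cond_b = [[1.0, 1.0, 0.0, 0.0]] * 3
mask = [True, True, False, False]

# Per-subject effects within the cluster are 2, 3 and 4.
print(cohens_d_from_cluster(cond_a, cond_b, mask))
```

With real FieldTrip output, the mask would have to be derived from the cluster labels in the statistics structure; how to do that is not shown here.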

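Unfortunately, Cohen’s d calculated in this way will be biased, and therefore cannot be used for a power calculation: the cluster was selected because it showed a large effect in the very data that are then used to estimate the effect size. This type of bias is sometimes denoted as “double dipping”.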

In general, it is extremely challenging to perform power calculations for statistical analyses that involve high-dimensional data. This holds not only for electrophysiological data, but also for fMRI data. To get an idea of the difficulties that one encounters, have a look at this paper from the fMRI community: http://www.biorxiv.org/content/early/2016/04/20/049429. For the analysis of high-dimensional electrophysiological data, quite some statistical work still has to be done.

best,
Eric Maris
