[FieldTrip] Question about cluster-based statistical testing (sum of t-stats or suprathreshold t-stats?)

Artemy Kolchinsky akolchin at indiana.edu
Mon Dec 31 05:03:34 CET 2012

> Importantly, these constants also enter in the permutation distribution
> that is used to evaluate the significance of the maximum cluster-mass
> statistic, to the effect that the Bullmore-style and the Fieldtrip-style
> permutation distributions are shifted versions of each other. As a result,
> the p-values that roll out of the two approaches are identical.
>
> If I understand correctly, having the same resulting p-values could only
> happen if the two methods assign the same rank-ordering to a given set of
> clusters.  But I don't think that is the case. Let's imagine that the
> t-statistic cutoff 'c' is equal to 1, and the data contains two
> suprathreshold clusters (let's say this is a spatial test and the clusters
> are composed of electrodes):
>
> - The first cluster has 10 electrodes, each one with a t-statistic equal
>   to 1.1
> - The second cluster has 2 electrodes, both with a t-statistic equal to 3
>
> As I understand, Bullmore's method would assign cluster 1 a mass of
> 10*(1.1-1) = 1 and cluster 2 a mass of 2*(3-1) = 4, while your method would
> assign cluster 1 a mass of 10*1.1 = 11 and cluster 2 a mass of 2*3 = 6.
> Hence, given a null distribution, it should be possible to choose a
> cluster-based threshold that indicates as significant only cluster 2 under
> Bullmore's method, and only cluster 1 under yours.
>
> I think your reasoning is correct: when the data contain more than one
> suprathreshold cluster, my argument does not apply anymore. Your example
> shows that the Bullmore- and Fieldtrip-style cluster statistics have
> different sensitivities. Thank you for pointing this out. For every test
> statistic, decisions based on the permutation p-value control the
> type-I error rate, but the type-II error rate (the complement of
> sensitivity) depends on the exact test statistic.

Thanks for confirming that. I should note, though, that this is an issue
even in data without multiple suprathreshold clusters.  The same logic as
above -- which shows that the two measures give different ranks to the same
set of clusters -- also applies to the distribution of clusters under the
null hypothesis.  Thus one can imagine a single cluster in the data that
would be judged significant under Fieldtrip's method and not significant
under Bullmore's, or vice versa.  I believe that generally, in comparison to
Bullmore's method, Fieldtrip's method would tend to favor large clusters
(those with many electrodes) when judging significance.
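
To make the single-cluster case concrete, here is a minimal Python sketch.
It is not FieldTrip code: the function names and the null distribution are
made up for illustration, and it assumes each permutation yields exactly one
suprathreshold cluster.  It reuses the cutoff c = 1 and the two cluster
shapes from the quoted example.

```python
def fieldtrip_mass(tvals):
    """Sum of the full t-values in a cluster (Fieldtrip-style, as described above)."""
    return sum(tvals)


def bullmore_mass(tvals, c):
    """Sum of the amounts by which each t-value exceeds the cutoff c (Bullmore-style)."""
    return sum(t - c for t in tvals)


def perm_p_value(observed, null_stats):
    """Fraction of null statistics that are at least as large as the observed one."""
    return sum(s >= observed for s in null_stats) / len(null_stats)


c = 1.0
wide = [1.1] * 10   # 10 electrodes, each with t = 1.1
tall = [3.0] * 2    # 2 electrodes, each with t = 3

# The data contain a single cluster: the wide one.
observed = wide

# Hypothetical null distribution (an assumption, not real data):
# 90 permutations give a one-electrode cluster with t = 1.5,
# 10 permutations give a cluster shaped like `tall`.
null_clusters = [[1.5]] * 90 + [tall] * 10

p_fieldtrip = perm_p_value(
    fieldtrip_mass(observed),
    [fieldtrip_mass(k) for k in null_clusters],
)
p_bullmore = perm_p_value(
    bullmore_mass(observed, c),
    [bullmore_mass(k, c) for k in null_clusters],
)

print("Fieldtrip-style p:", p_fieldtrip)
print("Bullmore-style p: ", p_bullmore)
```

With these made-up numbers the same observed cluster gets p = 0.0 under the
full-sum statistic and p = 0.1 under the excess-over-threshold statistic, so
at alpha = 0.05 only the former would call it significant.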

I personally think of the distinction not so much in terms of controlling
sensitivity, but rather as concerning the definition of what counts as a
cluster of interest.  Though both methods look for spatiotemporally
contiguous regions of electrodes that exceed threshold, for Bullmore the
cluster statistic is the sum of the amounts by which the values exceed the
threshold, while for Fieldtrip it's the sum of the full statistic values
in the region.  I'm quite
interested in the question of which gives more justifiable/better results
in real-world settings, though unfortunately I have not seen any work done
on the matter.  From what I have seen in my brief forays into the extensive
analytic + numerical studies of cluster-based significance testing in the
fMRI literature, in that field they always refer to Bullmore-style clusters.
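
For reference, the two definitions contrasted above can be written side by
side.  The notation is mine, not taken from either package: C is a cluster,
|C| its number of samples, t_i the statistic at sample i, and c the cutoff.

```latex
T_{\mathrm{FT}}(C) = \sum_{i \in C} t_i
\qquad
T_{\mathrm{B}}(C) = \sum_{i \in C} (t_i - c) = T_{\mathrm{FT}}(C) - c\,|C|
```

The two statistics differ by c|C|, which is a constant shift only when the
clusters being compared all have the same size.  In the quoted example,
11 - 1*10 = 1 and 6 - 1*2 = 4, matching the masses computed there.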

Thanks again & happy holidays...