[FieldTrip] Problem with permutation testing and FDR correction

Thu Feb 3 11:24:18 CET 2011

Dear Michael,

Thank you for pointing this out. The origin of the problem is that FT calculates Monte Carlo estimates of the p-values. In practice there is no other way, except for very small studies where enumeration is possible. However, Monte Carlo estimates are useless if the number of draws from the permutation distribution (numpermutations) is very small, because in that case their Monte Carlo confidence interval is very wide.

I propose that we add a Monte Carlo confidence interval for all Monte Carlo p-values that FT calculates. This is actually very easy, and I have described it in a paper together with Jan-Matthijs Schoffelen and Pascal Fries (JNeuroMeth, 2007). It just hasn't found its way into FT yet. I will discuss with Robert how to implement this.
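
To illustrate how wide such an interval is for a small number of draws, here is a minimal sketch in Python. It uses the Wilson score interval for a binomial proportion, which is one common choice and not necessarily the method described in the paper or the one that would end up in FT. The function name and its arguments are hypothetical and are not part of FieldTrip.

```python
import math


def monte_carlo_p_interval(num_extreme, num_permutations, z=1.96):
    """Wilson score interval for a Monte Carlo p-value estimate.

    num_extreme      -- number of permuted statistics at least as extreme
                        as the observed one
    num_permutations -- number of draws from the permutation distribution
    z                -- normal quantile (1.96 gives roughly 95% coverage)

    Hypothetical helper for illustration only; not a FieldTrip function.
    """
    if num_permutations <= 0:
        raise ValueError("num_permutations must be positive")
    if not 0 <= num_extreme <= num_permutations:
        raise ValueError("num_extreme must lie in [0, num_permutations]")

    n = num_permutations
    p_hat = num_extreme / n  # plain Monte Carlo estimate of the p-value

    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z * z / (4 * n * n)
    )
    lower = max(0.0, center - half_width)
    upper = min(1.0, center + half_width)
    return p_hat, lower, upper


# With 10 draws and no permuted value more extreme, the estimate is 0
# but the interval is very wide; with 5500 draws it is much narrower.
print(monte_carlo_p_interval(0, 10))
print(monte_carlo_p_interval(0, 5500))
```

The Wilson interval is used here rather than the simple normal approximation because the latter collapses to zero width when the estimated p-value is exactly 0, which is precisely the case under discussion.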

Best,

Eric Maris

dr. Eric Maris
Donders Institute for Brain, Cognition and Behavior
Radboud University
P.O. Box 9104
6500 HE Nijmegen
The Netherlands
T:+31 24 3612651
Mobile: 06 39584581
F:+31 24 3616066
mailto:e.maris at donders.ru.nl
http://www.nphyscog.com/

> -----Original Message-----
> From: fieldtrip-bounces at donders.ru.nl [mailto:fieldtrip-
> bounces at donders.ru.nl] On Behalf Of Michael Wibral
> Sent: donderdag 3 februari 2011 10:18
> To: fieldtrip at donders.ru.nl
> Subject: [FieldTrip] Problem with permutation testing and FDR
> correction
> 
> Dear Fieldtrip users,
> 
> I think we detected an error with FDR correction and permutation
> testing. When increasing the number of permutations, the number of
> significant voxels goes DOWN, on the other hand when decreasing the
> number of permutations the number of significant voxels goes up. In my
> opinion the relationship should be the other way round. The theoretical
> background is as follows:
> 
> With FDR correction, the best p-value should survive Bonferroni
> correction (if I am not completely mistaken here); the threshold for
> the other p-values then becomes successively less strict.
> Hence, the p-value assigned to the best (most significant) statistical
> result plays a crucial role here. This best p-value in permutation
> tests can never be better than 1/numpermutations, i.e. when I do ten
> permutations, the best p I can possibly get is 0.1 EVEN IF ALL PERMUTED
> VALUES ARE LESS EXTREME. So to test with FDR at 275 sensors and 1
> timepoint at a threshold of 0.05, for anything to become significant
> we need a p-value of 0.05/275 = 0.000181818.... To be able to reach
> this in the best case (remembering that p-values can never be better
> than 1/numpermutations) we need at least (0.000181818....)^-1 = 5500
> permutations. In other words, with anything less than this number of
> permutations we should not be able to get any significant results.
> However, we do in fact get a lot of significant values, at least in the
> fieldtrip versions tested up to 16th of January, e.g. in a
> freqstatistics test on
> 275 sensors, 50 frequencies and 26 timepoints I get 20760 significant
> voxels using only 10 permutations (!!). I assume that in the stats
> module the p-value is simply taken as the fraction of permutations that
> was more extreme than the actual value. This is correct as long as this
> fraction is not 0. In the case of a 0 fraction, however, this "0"
> should be replaced by "1/numpermutations", otherwise you get everything
> significant by just using 10 permutations. An alternative would be to
> issue an ERROR that the number of permutations is insufficient to
> perform the desired test with fdr correction.
> 
> 
> Example: for 4300 source points at 0.05 the number of permutations
> should at least be (0.05/4300)^-1=86000.
> 
> For now one should compute the p-value for a Bonferroni correction
> manually, invert this value and take the resulting number as the number
> of permutations to be at least mathematically on the safe side
> (practically it seems to be advisable to multiply by another factor of
> 100 to have stable results, e.g. 2000 permutations for uncorrected
> testing at 0.05).
> 
> Please disregard this mail if you are sure that this behaviour has been
> fixed in the latest fieldtrip (past 16th of January) versions.
> 
> Michael
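
The arithmetic in the quoted message can be sketched in a few lines of Python. This only reproduces the calculation as stated there (the inverse of the Bonferroni threshold, and replacing a zero fraction by 1/numpermutations); it is not FieldTrip code, and both function names are hypothetical.

```python
import math
from fractions import Fraction


def min_permutations(alpha, num_tests):
    """Smallest number of permutations N such that 1/N <= alpha/num_tests.

    Exact rational arithmetic avoids floating-point rounding in the
    division (e.g. 275 / 0.05). Hypothetical helper for illustration.
    """
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    alpha_exact = Fraction(str(alpha))
    if not 0 < alpha_exact <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    return math.ceil(num_tests / alpha_exact)


def floored_p_value(num_extreme, num_permutations):
    """Fraction of more extreme permutations, with a zero fraction
    replaced by 1/num_permutations as suggested in the message."""
    if num_permutations <= 0:
        raise ValueError("num_permutations must be positive")
    return max(num_extreme, 1) / num_permutations


print(min_permutations(0.05, 275))   # sensors example from the message
print(min_permutations(0.05, 4300))  # source-point example from the message
print(floored_p_value(0, 10))        # zero fraction with 10 permutations
```

Multiplying the result of min_permutations by the additional factor of 100 mentioned in the message is left to the caller, since that factor is a practical rule of thumb rather than part of the calculation.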