[FieldTrip] Statistics (against baseline)

Thu Apr 21 22:10:45 CEST 2011

Dear Field-trippers
First of all, I would like to thank the Fieldtrip mentors, and all the 
contributors.
I find this toolbox more than a toolbox. The website and the active 
mailing list make it stimulating and definitely instructive.
This motivates me to share some reflections and questions with you.

I did some statistics on data from implanted electrodes (ECoG) in humans. 
For this analysis I mainly looked at the time-frequency space, so I first 
ran the following script (with data being the output of the 
/ft_preprocessing/ function):

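load data;
trig = [3 4 5];                          % trigger codes, one per condition
TF_Mwlt_fourier = cell(1, numel(trig));
for cond = 1:numel(trig)
     cfg = [];
     cfg.method = 'wavelet';             % Morlet wavelets
     cfg.output = 'fourier';             % complex coefficients (power and phase)
     cfg.foi = 2:2:50;                   % frequencies of interest (Hz)
     cfg.toi = -0.5:0.02:0.05;           % time points of interest (s)
     cfg.keeptrials = 'yes';             % keep single trials
     cfg.keeptapers = 'yes';             % do not average over tapers
     cfg.width = 5;                      % wavelet width in cycles
     cfg.trials = find(data.trialinfo(:,1) == trig(cond));
     TF_Mwlt_fourier{cond} = ft_freqanalysis(cfg, data);
end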

I used Morlet wavelets because a previous post from Robert recommended 
not using multitapering for PLF
(http://mailman.science.ru.nl/pipermail/fieldtrip/2006-March/000446.html),
and also to facilitate comparison with other studies.

The output being Fourier, I computed power and phase concentration (aka 
PLF or ITC), both calculated at the single-trial level for stats and then 
averaged for /ft_multiplotTFR/:

% here, data is assumed to be one condition's Fourier output,
% e.g. TF_Mwlt_fourier{cond}
powplf_data.pow         = abs(data.fourierspctrm).^2;
powplf_data.powspctrm   = abs(mean(squeeze(powplf_data.pow), 1));
powplf_data.plf         = data.fourierspctrm ./ abs(data.fourierspctrm);
powplf_data.plf_average = abs(mean(squeeze(powplf_data.plf), 1));
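
As a quick sanity check of the power and PLF formulas, here is a minimal 
sketch on synthetic single-channel Fourier coefficients. All names and 
sizes are made up for illustration and nothing here calls FieldTrip. With 
random phases the PLF should be close to 0, and with identical phases it 
should be 1.

```matlab
% Synthetic single-channel Fourier coefficients: trials x freq x time.
% Sizes and variable names are hypothetical, chosen only for this example.
ntrl = 200; nfrq = 3; ntim = 4;
amp  = 1 + rand(ntrl, nfrq, ntim);            % arbitrary amplitudes in [1, 2]

% Case 1: phase uniformly random across trials, so PLF should be near 0.
F_rand = amp .* exp(1i * 2*pi * rand(ntrl, nfrq, ntim));
% Case 2: identical phase on every trial, so PLF should be 1.
F_lock = amp .* exp(1i * pi/4);

% Normalise each coefficient to unit length, average over trials (dim 1),
% then take the length of the resulting mean vector.
plf = @(F) squeeze(abs(mean(F ./ abs(F), 1)));    % freq x time
% Single-trial power, averaged over trials.
pow = @(F) squeeze(mean(abs(F).^2, 1));           % freq x time

plf_rand = plf(F_rand);
plf_lock = plf(F_lock);
pow_lock = pow(F_lock);
```

Note that the PLF ignores amplitude entirely: both cases use the same 
amplitudes and differ only in how consistent the phase is across trials.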

The first statistic I wanted to run was a comparison of the power and 
the PLF for each condition against their respective baseline period.
To do so I applied the following method (based on Delorme and Makeig 
2004) for a given channel:
- draw a value within the baseline period for each trial (independently 
for each time point and frequency).
- average along the trial dimension.
- repeat those steps a thousand times.
- use those thousand repetitions to construct the surrogate distribution.
- count the percentage of surrogate values above (or below) the observed 
post-onset value from the data (at a given latency and frequency).
Significance is then defined using two tails for power (p < 0.025 or 
p > 0.975) and one tail for PLF (p < 0.05).
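
The steps above can be sketched as follows for power on one channel. This 
is only an illustration under stated assumptions, not the function I 
actually used: the data are simulated, all variable names are 
hypothetical, and one surrogate distribution is built per frequency and 
shared across post-onset time points.

```matlab
% Toy single-channel power, trials x freq x time (arbitrary sizes).
ntrl = 60; nfrq = 4;
t_axis = -0.5:0.02:0.5;                   % hypothetical time axis (s)
ntim = numel(t_axis);
pow  = randn(ntrl, nfrq, ntim).^2;        % stand-in for single-trial power
post = t_axis > 0.2;
pow(:, :, post) = pow(:, :, post) + 3;    % injected post-onset increase

bl    = find(t_axis < 0);                 % indices of baseline samples
nboot = 1000;                             % number of surrogate averages

% Surrogate distribution: for every repetition, trial and frequency, pick
% one random baseline sample, then average along the trial dimension.
[tt, ff] = ndgrid(1:ntrl, 1:nfrq);
surr = zeros(nboot, nfrq);
for b = 1:nboot
    pick = bl(randi(numel(bl), ntrl, nfrq));     % ntrl x nfrq time indices
    idx  = sub2ind(size(pow), tt, ff, pick);
    surr(b, :) = mean(pow(idx), 1);
end

% Observed trial average, freq x time.
obs = squeeze(mean(pow, 1));

% Proportion of surrogate values at or above each observed value.
p_hi = zeros(nfrq, ntim);
for f = 1:nfrq
    for k = 1:ntim
        p_hi(f, k) = mean(surr(:, f) >= obs(f, k));
    end
end
sig_increase = p_hi < 0.025;              % observed value in the upper tail
sig_decrease = p_hi > 0.975;              % observed value in the lower tail
```

For PLF the same loop would run on the unit-length phasors, with 
abs(mean(...)) in place of mean(...) for both the surrogate and the 
observed values, keeping only the upper tail (p_hi < 0.05).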

I decided to write my own function because I was not sure that I could 
do it using FieldTrip.
I noticed that there is the /statfun_actvsblT/ that can be specified in 
the /cfg.statistic/ field of /ft_freqstatistics/.
But I have two concerns about it:
- I prefer to use a randomization method for distributional reasons. 
Even if a parametric test seems to be OK for power with my data (which 
appear to be normally distributed),
this is by definition not the case for PLF (which looks more like a gamma 
or F distribution: values are concentrated near zero, with rare values 
toward 1).
- this function averages over the specified baseline time period. This 
averaging step makes more sense to me in the case of ERP analysis, but 
less with time-frequency data.
Especially with PLF, since the average will tend to compress the values 
toward 0.
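
One way to see this compression, as a toy example with made-up values: 
the phase of an oscillation advances over time, so the unit-length 
phasors of a single trial point in different directions at different 
baseline samples, and their time average is shorter than any one of them.

```matlab
% One trial's phase at 25 baseline samples, covering one full cycle
% (hypothetical values, evenly spaced for simplicity).
nbl   = 25;
phase = linspace(0, 2*pi, nbl + 1);
phase = phase(1:nbl);                    % drop the duplicate endpoint
u     = exp(1i * phase);                 % unit-length phasors

len_single = abs(u(1));                  % length of one sampled phasor
len_avg    = abs(mean(u));               % length after averaging over time
```

Here len_single is 1 while len_avg is essentially 0, which is why drawing 
a single baseline sample per trial preserves the single-trial phasor 
length whereas averaging over the baseline window does not.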

I guess that an alternative would be to use /statfun_diff_itc/ with one 
condition being the post-onset period and the other "fake" condition 
being the baseline.
But in this case the length (duration) of the two pools would have to be 
identical, as the time points would be "paired". Am I correct about this, 
or is it more flexible?

Here are the points I would like to discuss:
1) I think I did my analysis the correct way, but it might not be the 
case, so any comments about the method are very welcome.
2) When someone is interested in the phase, is there a "better" method 
to compute the time-frequency transform?
(Or is any method good as long as there is no frequency smoothing?)
3) Could a future option be added to /statfun_actvsblT/ that would allow 
choosing between averaging and taking a random time point within the 
specified time period?
   Maybe this makes less sense for T-stats than in the case of a 
randomization test? And even less when this is not done at the level of 
single trials?

Thanks in advance for your comments.

Manuel

-- 
Manuel Mercier, PhD
Research Fellow

Cognitive Neurophysiology Laboratory,
Children's Evaluation and Rehabilitation Center (CERC),
Departments of Pediatrics
Albert Einstein College of Medicine,
1225 Morris Park Avenue
Bronx, New York, NY 10461

phone: +1 (718) 862 1824
fax: +1 (718) 862 1807
