[FieldTrip] calculating behavioural-power correlation -- follow-up questions

Wed Oct 21 19:54:40 CEST 2015

Hi Xiaoming, hi Arjen,

I've been encountering the same problem. I believe Xiaoming is right when he points out that the permutation step shuffles data across conditions and that this introduces a negative bias in the permutation distribution. I found the same thing when I correlated RT data with TFRs (absolute power). My distribution was shifted strongly to the left and, thus, not a single negative cluster was significant, but every positive one was.

Xiaoming's explanation made a lot of sense to me when I thought about it graphically: Imagine correlating two data vectors, one (x) ranging between .5 and 1, the other (y) between 50 and 100. When plotting this, one gets a cloud of dots in the upper left corner of the diagram. When you then switch the variable assignment of half of the data points (which is what the permutation step seems to do), these dots will be shifted to the lower right corner of the diagram. So no matter what the correlation in the original data, chances are that (given different scaling) after permutation, you get a negative correlation.
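
The picture above can be tried out numerically. The following is only a minimal MATLAB sketch with made-up data: the number of subjects, the uniform random values and the variable names are all assumptions, and it does not reproduce what FieldTrip does internally. It compares swapping the two values within randomly chosen subjects against shuffling which subject's y goes with which subject's x.

```matlab
% Sketch: swapping the variable assignment within random subjects versus
% shuffling the pairing across subjects (all values are made up).
rng(1);                         % fixed seed, for reproducibility
n = 20;                         % hypothetical number of subjects
x = 0.5 + 0.5 * rand(n, 1);     % values between .5 and 1
y = 50 + 50 * rand(n, 1);       % values between 50 and 100, independent of x

% Pearson correlation without toolbox dependencies
pearson = @(a, b) sum((a - mean(a)) .* (b - mean(b))) / ...
    sqrt(sum((a - mean(a)).^2) * sum((b - mean(b)).^2));

nperm = 1000;
rho_swap = zeros(nperm, 1);     % assignment swapped within subjects
rho_pair = zeros(nperm, 1);     % pairing shuffled across subjects
for k = 1:nperm
    flip = rand(n, 1) > 0.5;    % subjects whose two values trade places
    xs = x;
    ys = y;
    xs(flip) = y(flip);
    ys(flip) = x(flip);
    rho_swap(k) = pearson(xs, ys);
    rho_pair(k) = pearson(x, y(randperm(n)));
end
fprintf('mean rho, assignment swapped: %.2f\n', mean(rho_swap));
fprintf('mean rho, pairing shuffled:   %.2f\n', mean(rho_pair));
```

With scales this different, the swapped version should come out strongly negative, while the pairing shuffle should stay close to zero.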

I am not 100% sure about this, so please let me know if I made a mistake.

What I tried instead of 'ft_statfun_correlationT' was using a custom-made statfun to which I pass the RTs via the design matrix. With this, my results looked much better. I am not sure, but I guess this is because there is no shuffling between the two variables in this case.
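
To make the idea concrete, here is a rough sketch of what such a statfun could look like. It is not the function actually used here: the name statfun_corr_design is hypothetical, and the (cfg, dat, design) signature with a returned struct holding a .stat field is assumed from the pattern of FieldTrip's own statfuns. The behavioural values are assumed to sit in row cfg.ivar of the design matrix.

```matlab
function s = statfun_corr_design(cfg, dat, design)
% Hypothetical custom statfun (not part of FieldTrip): correlates every
% row of dat (samples x subjects) with the behavioural values stored in
% row cfg.ivar of the design matrix, and converts rho to a t-value.
behav = design(cfg.ivar, :);                 % 1 x nsubj, e.g. RTs
n     = numel(behav);
dc    = bsxfun(@minus, dat, mean(dat, 2));   % centre each data row
bc    = behav - mean(behav);                 % centre the behaviour
rho   = (dc * bc') ./ sqrt(sum(dc.^2, 2) * sum(bc.^2));
% same rho-to-t conversion as the one quoted further down in this thread
s.stat = rho .* sqrt(n - 2) ./ sqrt(1 - rho.^2);
end
```

With a design like this, permuting the columns of the design presumably only changes which subject's RT is paired with which subject's power, so the two variables are never mixed.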

I would really like to know what the right way of doing this is using just the FieldTrip functions. Is there a way to permute data within variables? I tried cfg.resampling = 'bootstrap', but this is not a permutation, as far as I know.

Thanks!
Martin

________________________________
From: fieldtrip-bounces at science.ru.nl [fieldtrip-bounces at science.ru.nl] on behalf of Arjen Stolk [a.stolk8 at gmail.com]
Sent: Tuesday, 20 October 2015 08:03
To: FieldTrip discussion list
Subject: Re: [FieldTrip] calculating behavioural-power correlation -- follow-up questions

Hey Xiaoming,

It's still pretty hard, for me, to guess on the basis of that MATLAB output what is going on here, what you mean by 'shuffling design matrices', and how that shuffling 'biases the cluster distribution'. As you mention yourself, it could be due to various reasons, and you're open to suggestions and to increasing your understanding. I'd therefore suggest narrowing down the number of potential explanations by simulating what you're doing (using input data for which you know how it should behave), after you've read more about what the design matrix and Monte Carlo statistics are supposed to do. Perhaps the statistics section at the bottom of this page provides a good starting point: http://www.fieldtriptoolbox.org/walkthrough

Hope that helps,
Arjen

2015-10-19 15:56 GMT-07:00 Xiaoming Du <XDu at mprc.umaryland.edu>:
For example, our power values ranged from 1 to 3 (after log transform); my behavioral data ranged from 20 to 90;

Using the above-mentioned script, 14 negative clusters were reported in the variable stat.

stat =

                   prob: [30x50 double]
            posclusters: []
    posclusterslabelmat: [30x50 double]
        posdistribution: [1x1000 double]
            negclusters: [1x14 struct]
    negclusterslabelmat: [30x50 double]
        negdistribution: [1x1000 double]
                cirange: [30x50 double]
                   mask: [30x50 logical]
                   stat: [30x50 double]
                    ref: [30x50 double]
                    rho: [30x50 double]
                 dimord: 'chan_freq'
                   freq: [1x50 double]
                  label: {30x1 cell}
                   time: 2.5000
                    cfg: [1x1 struct]

However, the p-values of those clusters (i.e., stat.negclusters.prob) are all one. The smallest value in stat.negdistribution is far larger in magnitude than the largest negative cluster t-sum. This could be real. However, it is more likely due to the shuffling between the power and behavioral groups. For example, the design matrix [1 1 1 1 2 2 2 2; 1 2 3 4 1 2 3 4] was shuffled to [1 2 2 1 2 2 1 1; 1 2 3 4 1 2 3 4]. After each permutation, for some subjects, their power data were labeled as behavioral data and vice versa. Because of the scale difference between power and behavioral data, large negative correlations were generated by the permutation. This further biased the cluster distribution.
My limited understanding is that, for correlation, each permutation should keep cfg.ivar fixed and only shuffle half of cfg.uvar. For example, permute the design matrix [1 1 1 1 2 2 2 2; 1 2 3 4 1 2 3 4] to [1 1 1 1 2 2 2 2; 1 2 3 4 4 2 3 1]. Therefore, after permutation, one subject's power data corresponds to another subject's behavioral data.
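
The permutation proposed above can be written down in a few lines. This is only a sketch of the proposal on a toy design with four subjects, not of anything FieldTrip does by itself; the variable name permdesign is made up for the illustration.

```matlab
% Sketch of the proposed permutation: ivar stays fixed and only the unit
% numbers of the second condition are shuffled.
n1 = 4;                                       % number of subjects, as in the example
design = zeros(2, 2 * n1);
design(1, :) = [ones(1, n1), 2 * ones(1, n1)];   % row 1: ivar (condition 1 or 2)
design(2, :) = [1:n1, 1:n1];                     % row 2: uvar (subject number)

permdesign = design;
permdesign(2, (n1 + 1):end) = design(2, n1 + randperm(n1));  % shuffle second half of uvar
disp(permdesign);
```

Row 1 is unchanged, so no data point changes its variable; only the pairing between the two halves is reshuffled.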

I am not good at statistics. I would really appreciate any suggestions or comments.

Xiaoming

>>> Arjen Stolk <a.stolk8 at gmail.com> 10/19/2015 6:01 PM >>>
Hey Xiaoming,

Not sure if I understand, but shouldn't the directions of the correlations be independent of the scaling of the two variables? Looking at the code of ft_statfun_correlationT it doesn't seem the conversion from correlation to T value (tstat = rho*(sqrt(max(nunits)-2))/sqrt((1-rho^2))) would result in a direction change either. Perhaps you could try to first manually calculate a correlation between signal power and behavioral power, and see whether anything is behaving unexpectedly?
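
A quick manual check along these lines could look as follows. It is a sketch on simulated values only: the ranges, the noise level and the built-in negative relation are assumptions chosen for illustration. It checks that a positive rescaling of either variable leaves rho unchanged and that the conversion to a t-value keeps the sign. Note that this concerns the correlation of the data as observed, not what happens to it under a particular permutation scheme.

```matlab
% Sketch: correlation is unaffected by positive rescaling, and the
% rho-to-t conversion preserves the sign (simulated values only).
rng(2);                                       % fixed seed, for reproducibility
n = 15;                                       % hypothetical number of subjects
power = 1 + 2 * rand(n, 1);                   % e.g. log power between 1 and 3
behav = 90 - 20 * power + 5 * randn(n, 1);    % built to correlate negatively

pearson = @(a, b) sum((a - mean(a)) .* (b - mean(b))) / ...
    sqrt(sum((a - mean(a)).^2) * sum((b - mean(b)).^2));

rho1  = pearson(power, behav);
rho2  = pearson(1000 * power + 7, 0.01 * behav);   % both variables rescaled
tstat = rho1 * sqrt(n - 2) / sqrt(1 - rho1^2);     % conversion quoted above
fprintf('rho = %.3f, rho after rescaling = %.3f, t = %.2f\n', rho1, rho2, tstat);
```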

Yours,
Arjen

2015-10-19 14:25 GMT-07:00 Xiaoming Du <XDu at mprc.umaryland.edu>:
Dear FieldTrip users,
This is Xiaoming from the University of Maryland, Baltimore. My current project requires calculating behavioral-power correlations across subjects. A similar topic was discussed here earlier this year: http://mailman.science.ru.nl/pipermail/fieldtrip/2015-February/008953.html
Following the suggestions in the above-mentioned thread, I duplicated my power dataset and replaced the power values at each time-frequency point with behavioral data. Therefore, the two datasets have the same structure and dimensions. I used the following script to test whether there are significant clusters of correlations.
cfg = [];
cfg.parameter = 'powspctrm';
cfg.method = 'montecarlo';
cfg.statistic = 'ft_statfun_correlationT';
...
etc
...
design = zeros(2, n1 * 2);          % n1 is the number of subjects
design(1, 1:n1) = 1;                % ivar: first group of datasets
design(1, (n1 + 1):(n1 * 2)) = 2;   % ivar: second group of datasets
design(2, :) = [1:n1, 1:n1];        % uvar: subject number
cfg.design = design;

cfg.ivar = 1;
cfg.uvar = 2;
stat = ft_freqstatistics(cfg, dataBeh{:}, dataDX1{:});
However, it seems that each time the design matrix is permuted, FieldTrip uses the same method as for 'ft_statfun_depsamplesT', meaning cfg.uvar remains the same while cfg.ivar (1 or 2) is randomly assigned to each subject in the design matrix. I confirmed this by uncommenting line 313 (i.e., tmpdesign = design(:,resample(i,:))) in ft_statistics_montecarlo.m, which displays the permuted design matrix on the command line, but please correct me if this is not the case.
In my mind, this kind of permutation will cause trouble when dealing with correlation. For example, in my case, the behavioral data and power data have different scales. The power data are in general much larger than the behavioral data. Assigning behavioral data to the power group, or vice versa, will induce huge negative correlations between the power and behavioral measurements. Therefore, no negative clusters will survive the permutation test.
Please let me know if I have misunderstood something or did anything wrong. Any suggestions will be highly appreciated!
Thanks.
Xiaoming
