[FieldTrip] calculating behavioural-power correlation -- follow-up questions
Arjen Stolk
a.stolk8 at gmail.com
Wed Oct 21 20:22:09 CEST 2015
Hi Martin,
Thanks for thinking along. I've briefly tried to replicate/simplify the
situation you describe using the code below. It works as I would expect,
regardless of whether one variable is scaled differently. But perhaps I'm not
fully capturing the issue and something still goes awry. That is a real
possibility, because ft_statfun_correlationT was only recently implemented
for a specific case and has never really been tested in other situations
(hence it is not well documented on the wiki). Do you think you could use
this example code to replicate the situation you are experiencing?
Yours,
Arjen
% simulate simple timelock structures for multiple subjects
data_brain = [];
data_behav = [];
for j = 1:10
  data_brain{j}.avg    = j;            % increasing across subjects
  data_brain{j}.dimord = 'chan_time';
  data_brain{j}.time   = 1;
  data_brain{j}.label  = {'1'};
  data_behav{j}        = data_brain{j};
  data_behav{j}.avg    = data_brain{j}.avg*-1000+50; % different scale, inverted sign
end

% compute statistics with correlationT
cfg = [];
cfg.method           = 'montecarlo';
cfg.statistic        = 'ft_statfun_correlationT';
cfg.numrandomization = 100;
n1 = 10; % n1 is the number of subjects
design = zeros(2, n1 * 2);
design(1, 1:n1)              = 1;   % first input: data_brain
design(1, (n1 + 1):(n1 * 2)) = 2;   % second input: data_behav
design(2, :) = [1:n1 1:n1];         % subject numbers
cfg.design = design;
cfg.ivar = 1;
cfg.uvar = 2;
stat = ft_timelockstatistics(cfg, data_brain{:}, data_behav{:});
assert(isequal(stat.rho, -1)); % a perfect negative linear transform should give rho of exactly -1
2015-10-21 10:54 GMT-07:00 Krebber, Martin <martin.krebber at charite.de>:
> Hi Xiaoming, hi Arjen,
>
> I've been encountering the same problem. I believe Xiaoming is right when
> he points out that the permutation step shuffles data across conditions and
> that this introduces a negative bias in the distribution. I found the same
> thing when I correlated RT data with TFRs (absolute power). My distribution
> was shifted strongly to the left and, thus, not a single negative cluster
> was significant, but every positive one was.
>
> Xiaoming's explanation made a lot of sense to me when I thought about it
> graphically: imagine correlating two data vectors, one (x) ranging between
> .5 and 1, the other (y) between 50 and 100. When plotting this, one gets
> a cloud of dots in the upper left corner of the diagram. When you then
> switch the variable assignment of half of the data points (which is what
> the permutation step seems to do), these dots will be shifted to the
> lower right corner of the diagram. So no matter what the correlation in the
> original data is, chances are that (given different scaling) after
> permutation you get a negative correlation.
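A minimal sketch (not part of the original thread) of the scenario Martin
describes: two perfectly correlated variables on very different scales, with
the variable assignment swapped for half of the observations, which is roughly
what the across-condition permutation amounts to.

x = linspace(0.5, 1, 10);      % power-like values between .5 and 1
y = linspace(50, 100, 10);     % behaviour-like values between 50 and 100
r = corrcoef(x, y);
rho_orig = r(1, 2);            % +1: perfectly correlated

xp = x; yp = y;
swap = 1:2:10;                 % swap the variable assignment for half of the points
xp(swap) = y(swap);
yp(swap) = x(swap);
r = corrcoef(xp, yp);
rho_perm = r(1, 2);            % strongly negative, purely due to the scale difference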
>
> I am not 100% sure about this, so please let me know if I made a mistake.
>
> What I tried instead of 'ft_statfun_correlationT' was a custom-made
> statfun in which I pass the RTs via the design matrix. With this, my
> results looked much better. I am not sure, but I guess this is because
> there is no shuffling between the two variables in this case.
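A hypothetical sketch of such a statfun (the function name and internals are
invented here, not Martin's actual code), assuming the behavioural scores sit
in the cfg.ivar row of the design matrix and only the brain data is passed to
the statistics function, so nothing is shuffled across variables:

function s = statfun_behavcorr(cfg, dat, design)
% dat    : (nchan*nfreq*ntime) x nsubj matrix of brain data
% design : the cfg.ivar row holds one behavioural score per subject
behav  = design(cfg.ivar, :);
nsubj  = numel(behav);
datc   = bsxfun(@minus, dat, mean(dat, 2));   % demean brain data per bin
behc   = behav - mean(behav);                 % demean behavioural scores
rho    = (datc * behc(:)) ./ (sqrt(sum(datc.^2, 2)) * norm(behc));   % Pearson r per bin
s.stat = rho .* sqrt(nsubj - 2) ./ sqrt(1 - rho.^2);  % same rho-to-T conversion as ft_statfun_correlationT

Whether the resampling then only re-pairs that design row with the (fixed)
data columns depends on how cfg.ivar and cfg.uvar are set up, so this is a
starting point rather than a validated recipe.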
>
> I would really like to know, what is the right way of doing this using
> just the FieldTrip functions. Is there a way to permute data within
> variables? I tried cfg.resampling = 'bootstrap', but this is not a
> permutation, as far as I know.
>
>
> Thanks!
> Martin
>
>
> ------------------------------
> *From:* fieldtrip-bounces at science.ru.nl [fieldtrip-bounces at science.ru.nl]
> on behalf of Arjen Stolk [a.stolk8 at gmail.com]
> *Sent:* Tuesday, 20 October 2015 08:03
> *To:* FieldTrip discussion list
> *Subject:* Re: [FieldTrip] calculating behavioural-power correlation --
> follow-up questions
>
> Hey Xiaoming,
>
> It's still pretty hard for me to guess, on the basis of that MATLAB output,
> what is going on here, what you mean by 'shuffling design matrices', and how
> that shuffling 'biases the cluster distribution'. As you mention yourself, it
> could be due to various reasons, and you're open to suggestions and to
> increasing your understanding. I'd therefore suggest narrowing down the
> number of potential explanations by simulating what you're doing (using
> input data for which you know how it should behave), after you've read more
> about what the design matrix and Monte Carlo statistics are supposed to do.
> Perhaps the statistics section at the bottom of this page provides a good
> starting point:
> http://www.fieldtriptoolbox.org/walkthrough
>
> Hope that helps,
> Arjen
>
> 2015-10-19 15:56 GMT-07:00 Xiaoming Du <XDu at mprc.umaryland.edu>:
>
>> For example, our power values ranged from 1 to 3 (after log transform);
>> my behavioral data ranged from 20 to 90.
>>
>> Using the above-mentioned script, 14 negative clusters were reported in the
>> variable stat.
>>
>> stat =
>>
>> prob: [30x50 double]
>> posclusters: []
>> posclusterslabelmat: [30x50 double]
>> posdistribution: [1x1000 double]
>> negclusters: [1x14 struct]
>> negclusterslabelmat: [30x50 double]
>> negdistribution: [1x1000 double]
>> cirange: [30x50 double]
>> mask: [30x50 logical]
>> stat: [30x50 double]
>> ref: [30x50 double]
>> rho: [30x50 double]
>> dimord: 'chan_freq'
>> freq: [1x50 double]
>> label: {30x1 cell}
>> time: 2.5000
>> cfg: [1x1 struct]
>>
>> However, the p values of those clusters (i.e., stat.negclusters.prob) are
>> all ones. The smallest value in stat.negdistribution is much larger than
>> the largest negative cluster t-sum. This could be real, but it is more
>> likely due to the shuffling between the power and behavioral groups. For
>> example, the design matrix [1 1 1 1 2 2 2 2; 1 2 3 4 1 2 3 4] was shuffled
>> to [1 2 2 1 2 2 1 1; 1 2 3 4 1 2 3 4]. After each permutation, for some
>> subjects, their power data was labeled as behavioral data and vice versa.
>> Because of the scale difference between power and behavioral data, large
>> negative correlations were generated by the permutation, which further
>> biased the cluster distribution.
>> My limited understanding is that, for correlation, each permutation
>> should keep cfg.ivar fixed and only shuffle half of the cfg.uvar. For
>> example, permute the design matrix [1 1 1 1 2 2 2 2; 1 2 3 4 1 2 3 4] to
>> [1 1 1 1 2 2 2 2; 1 2 3 4 4 2 3 1]. Then, after the permutation, one
>> subject's power data corresponds to another subject's behavioral data.
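To make the two resampling schemes concrete, here is a small illustration
(not from the thread) for n1 = 4 subjects:

design      = [1 1 1 1 2 2 2 2;    % ivar: 1 = power, 2 = behavioral data
               1 2 3 4 1 2 3 4];   % uvar: subject number

% what the depsamplesT-style resampling does: exchange the condition labels
% within subjects, so power and behavioral data swap roles for some subjects
perm_across = [1 2 2 1 2 1 1 2;
               1 2 3 4 1 2 3 4];

% what Xiaoming proposes for a correlation: keep the condition labels fixed
% and re-pair the behavioral half across subjects instead
perm_within = [1 1 1 1 2 2 2 2;
               1 2 3 4 4 2 3 1];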
>>
>> I am not good at statistics. It would be really appreciated if you have
>> any suggestions or comments.
>>
>> Xiaoming
>>
>>
>>
>>
>> >>> Arjen Stolk <a.stolk8 at gmail.com> 10/19/2015 6:01 PM >>>
>> Hey Xiaoming,
>>
>> Not sure if I understand, but shouldn't the direction of the
>> correlation be independent of the scaling of the two variables? Looking at
>> the code of ft_statfun_correlationT, it doesn't seem that the conversion
>> from correlation to T value (tstat = rho*(sqrt(max(nunits)-2))/sqrt((1-rho^2)))
>> would result in a direction change either. Perhaps you could first manually
>> calculate the correlation between signal power and the behavioral measure,
>> and see whether anything behaves unexpectedly?
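A minimal, self-contained sketch (values simulated here) of the manual check
Arjen suggests: correlate the power in one channel-frequency bin with the
behavioral scores and apply the same rho-to-T conversion used in
ft_statfun_correlationT.

nsubj = 10;
pow   = 1 + 2*rand(1, nsubj);        % log-power, roughly 1 to 3 as in Xiaoming's data
behav = 20 + 70*rand(1, nsubj);      % behavioral scores, roughly 20 to 90
r     = corrcoef(pow, behav);
rho   = r(1, 2);
tstat = rho*sqrt(nsubj - 2)/sqrt(1 - rho^2);   % the sign of tstat follows the sign of rho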
>>
>> Yours,
>> Arjen
>>
>> 2015-10-19 14:25 GMT-07:00 Xiaoming Du <XDu at mprc.umaryland.edu>:
>>
>>> Dear FieldTrip users,
>>> This is Xiaoming from the University of Maryland Baltimore. My current
>>> project requires calculating behavioral-power correlations across subjects.
>>> A similar topic was discussed here earlier this year:
>>> http://mailman.science.ru.nl/pipermail/fieldtrip/2015-February/008953.html
>>> Following the suggestions in the above-mentioned thread, I duplicated my
>>> power dataset and replaced the power values at each time-frequency point
>>> with the behavioral data, so the two datasets have the same structure and
>>> dimensions. I used the following script to test whether there are
>>> significant clusters of correlations.
>>> cfg = [];
>>> cfg.parameter = 'powspctrm';
>>> cfg.method = 'montecarlo';
>>> cfg.statistic = 'ft_statfun_correlationT';
>>> ...
>>> etc
>>> ...
>>> design = zeros(2, n1 * 2); % n1 is the number of subjects.
>>> design(1,1:n1) = 1;
>>> design(1,(n1 + 1):(n1 * 2)) = 2;
>>> design(2, :) = [1:n1 1:n1];
>>> cfg.design = design;
>>>
>>> cfg.ivar = 1;
>>> cfg.uvar = 2;
>>> stat = ft_freqstatistics(cfg, dataBeh{:}, dataDX1{:});
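For concreteness, the duplication step described above might look roughly
like this (behavScores and nsubj are invented placeholders; dataBeh and
dataDX1 are the variables used in the script):

for s = 1:nsubj
  dataBeh{s} = dataDX1{s};                     % copy the subject's freq structure
  dataBeh{s}.powspctrm(:) = behavScores(s);    % overwrite every TF point with the behavioral score
end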
>>> However, it seems that each time the design matrix is permuted,
>>> FieldTrip uses the same method as for 'ft_statfun_depsamplesT', meaning
>>> cfg.uvar remains the same while cfg.ivar (1 or 2) is randomly assigned to
>>> each subject in the design matrix. I confirmed this by uncommenting
>>> line 313 (i.e., tmpdesign = design(:,resample(i,:))) in
>>> ft_statistics_montecarlo.m, which displays the permuted design matrix in
>>> the command line; please correct me if this is not the case.
>>> In my mind, this kind of permutation will cause trouble when dealing
>>> with correlations. For example, in my case the behavioral data and power
>>> data have different scales; the power values are generally much larger
>>> than the behavioral values. Assigning behavioral data to the power group,
>>> or vice versa, induces huge negative correlations between power and the
>>> behavioral measurement. Therefore, no negative clusters will survive the
>>> permutation test.
>>> Please let me know if I have misunderstood something or did anything
>>> wrong. Any suggestions will be highly appreciated!
>>> Thanks.
>>> Xiaoming
>>>
>>> _______________________________________________
>>> fieldtrip mailing list
>>> fieldtrip at donders.ru.nl
>>> http://mailman.science.ru.nl/mailman/listinfo/fieldtrip
>>>
>>
>>
>