[FieldTrip] Performing statistics based on phase/amplitude correlation

Sat Mar 19 09:07:52 CET 2011

Hi Karl and Roemer,

Thanks for your replies.

@Karl: I know that real() does not give me the phase information
exclusively; sorry, I wasn't clear enough on this. As a first approach
to analysing PAC, I was looking into a measure known in the literature
as envelope-to-signal correlation, which is just the linear
correlation between the amplitude envelope of a high-frequency signal
and a single-frequency filtered low-frequency signal. Taking this as a
first approach allows me to easily compare my algorithms with stuff
that's implemented elsewhere. But you're right, using only the phase
information by angle() would probably be more appropriate; this is
something I'll look into in the future.

@Roemer: thanks for your detailed recipe! I will see where this gets
me, and I know where to find you if it turns out I could use some more
detailed help. Also, enjoy Seattle and the rest of the US! :)

Best,
Eelke

2011/3/19 Roemer van der Meij <r.vandermeij at donders.ru.nl>:
> Hi Eelke,
>
> This is how I would approach the problem:
> 1) compute the inner-product of the phase and amplitude vector per frequency
> pair, per trial (element-wise product and sum)
> - resulting in a single complex number per frequency pair, per trial
> 2) divide this by the product of the norms of both vectors: this will make
> sure the magnitude is normalized and goes from 0 to 1
> 2b) average over trials (you could also do this before step 2)
> 3) shuffle the trials of the amplitude and the trials of phase vector, and
> compute the above for each 'randomly paired trail' (phase of one trial, amp
> of another)
> 4) do this, say a 100 times or so, collecting the magnitudes of the complex
> numbers. This way, you will have 100 numbers per frequency pair, comprising
> a surrogate distribution of coupling
> 5) pick the highest number per frequency pair of the surrogate distribution,
> the 100th percentile, or estimate a normal distribution to pick a higher
> percentile
> 6) use this number per frequency pair to threshold your averaged-over-trials
> coupling, anything above threshold could be considered non-random coupling
> (do keep an eye for evoked effects if your original trials are short, as
> these could be quite similar for the random coupled trials)
>
> This is the way Eric and I approach PAC. If I was in Nijmegen I could
> explain our approach in more detail, but I'm doing a series of lab-visits in
> the US (Seattle atm :))
>
> If this doesn't work in your case, I would fake ft_freqstatistics in
> thinking your correlation data was freqdata, after doing a Fisher
> Z-transform on them.
>
> Hope it helps!
>
> Roemer
>
>
> On 18-Mar-11 05:25, Eelke Spaak wrote:
>
> Hmm, actually my own suggestion of switching around the time and trial
> dimensions seems to be very promising (and makes intuitive sense I
> guess, having thought about it some more).
>
> Best,
> Eelke
>
> 2011/3/18 Eelke Spaak <eelke.spaak at donders.ru.nl>:
>
> Dear FieldTrip community,
>
> At the moment, I am analysing some data to see whether a correlation
> exists between phase and amplitude over different frequencies, and in
> different channels. Computing the signals of interest (i.e., the phase
> and amplitude time series) works fine, and computing the correlation
> is also easily done with Matlab's corrcoef function. Now comes the
> tricky part. I would like to use the FT statistics framework to assess
> the significance of the found correlation, and use clustering in
> frequency-frequency space to correct for multiple comparisons. I
> understand that this requires a statfun_correlationPearson (or
> something like it), and I do not think I'll encounter any problems
> implementing this. Before I start doing so, however, I am not sure how
> to pass the design and the data to the higher-level statistics
> functions, such as ft_freqstatistics.
>
> My data structure is custom-imported, and contains a single 100s-long
> trial, on which the phase/amplitude analysis should be conducted:
>
> data =
>        hdr: [1x1 struct]
>    fsample: 1000
>      trial: {[24x100000 double]}
>       time: {[1x100000 double]}
>      label: {24x1 cell}
>        cfg: [1x1 struct]
>
> The phase and amplitude time series are generated by convolution with
> Morlet wavelets using ft_freqanalysis:
>
>    % perform wavelet convolution for amplitude data
>    cfg = [];
>    cfg.channel = chansAmpl;
>    cfg.method = 'wavelet';
>    cfg.output = 'fourier';
>    cfg.keeptrials = 'yes';
>    cfg.width = 7;
>    cfg.gwidth = 4;
>    cfg.foi = freqAmpl;
>    cfg.toi = data.time{1}; % compute estimate for each sample
>    tfr = ft_freqanalysis(cfg, data);
>
>    % perform wavelet convolution for phase data
>    cfg.channel = chansPh;
>    cfg.foi = freqPh;
>    tfrPh = ft_freqanalysis(cfg, data);
>
> where chansAmpl = {'CH10'}, chansPh = {'CH17'}, freqAmpl = 3:5:98,
> freqPh = 2:2:30. Simply computing the correlation of interest is done
> in the following manner:
>
>    % extract freqs X samples matrices to compute correlations on
>    phSig = real(squeeze(tfrPh.fourierspctrm(1,match_str(tfrPh.label,
> chansPh(k)), :, :)));
>    amplSig = abs(squeeze(tfr.fourierspctrm(1,match_str(tfr.label,
> chansAmpl(l)), :, :)));
>
>    % compute correlations
>    r = corr(phSig', amplSig', 'rows', 'pairwise'); % use only non-NaN rows
>
> The correlation matrix r I get out of this nicely corresponds to the
> results I found by using algorithms implemented elsewhere: so far so
> good. My (unsuccessful) attempts to use the statistics framework have
> been variations on the following:
>
>    % add amplitude spectrum to the TFR (which now only has fourierspctrm)
>    tfr.amplspctrm = abs(tfr.fourierspctrm);
>
>    cfg = [];
>    cfg.method = 'montecarlo';
>    cfg.statistic = 'corr'; % I have made a (now empty) statfun_corr
>    cfg.parameter = 'amplspctrm';
>    cfg.numrandomization = 100;
>    cfg.design = real(tfrPh.fourierspctrm);
>    stat = ft_freqstatistics(cfg, tfr);
>
> For your information, the tfr and cfg fields look like this before the
> call to ft_freqstatistics:
>
> cfg =
>              method: 'montecarlo'
>           statistic: 'corr'
>           parameter: 'amplspctrm'
>    numrandomization: 100
>              design: [4-D double]
>
> tfr =
>            label: {'CH10'}
>           dimord: 'rpttap_chan_freq_time'
>             freq: [3 8 13 18 23 28 33 38 43 48 53 58 63 68 73 78 83 88 93
> 98]
>             time: [1x100000 double]
>    fourierspctrm: [4-D double]
>        cumtapcnt: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
>              cfg: [1x1 struct]
>       amplspctrm: [4-D double]
>
> The statfun_corr gets a dat variable, that appears to be a column
> vector constructed by concatenating all the samples in the amplitude
> spectrum. How should my function treat this? I am aware that
> specifying a 4D design matrix is a bit odd, but it does contain my 15
> regressors X 100000 samples (X 2 singleton dimensions that correspond
> to the 1 trial and 1 channel that I analysed). If I squeeze() the
> matrix before calling ft_freqstatistics, I get an error 'the size of
> the design matrix does not match the number of observations in the
> data'.
>
> I realize my question is a bit vague, but I am somewhat lost as to how
> to proceed. In short, what I want is this: (1) compute a correlation
> (across samples) for each phase time series with each amplitude time
> series (each time series corresponds to a single frequency); (2)
> shuffle the phase time series around, computing correlations on the
> shuffled data to assess significance of the observed correlation; (3)
> prune away correlations that are isolated in frequency-frequency
> space, retain only large clusters of correlations.
>
> All the documentation about the cluster-based statistics is focused on
> comparing variables that differ across trials and/or subjects, and my
> 'problem' might be that I only have a single trial. Chopping up the
> data into 100000 trials, each 1 sample long, is maybe a possibility?
> But it seems a bit artificial.
>
> (It would actually probably not be very difficult to implement my
> desired functionality 'by hand', but I guess it should be possble
> within FT's statistics framework. That would have quite some
> advantages.)
>
> Many thanks in advance for any input you can offer.
>
> Best,
>
> Eelke
>
> _______________________________________________
> fieldtrip mailing list
> fieldtrip at donders.ru.nl
> http://mailman.science.ru.nl/mailman/listinfo/fieldtrip
>
>
> --
> Roemer van der Meij M.Sc.
> PhD student
> Donders Institute for Brain, Cognition and Behaviour
> Centre for Cognition
> Radboud University Nijmegen
> P.O. Box 9104
> 6500 HE Nijmegen
> The Netherlands
> Tel: +31(0)24 3655932
> E-mail: r.vandermeij at donders.ru.nl
>
> _______________________________________________
> fieldtrip mailing list
> fieldtrip at donders.ru.nl
> http://mailman.science.ru.nl/mailman/listinfo/fieldtrip
>