[FieldTrip] Performing statistics based on phase/amplitude correlation

Fri Mar 18 13:25:52 CET 2011

Hmm, actually my own suggestion of switching around the time and trial
dimensions seems to be very promising (and makes intuitive sense I
guess, having thought about it some more).

Best,
Eelke

2011/3/18 Eelke Spaak <eelke.spaak at donders.ru.nl>:
> Dear FieldTrip community,
>
> At the moment, I am analysing some data to see whether a correlation
> exists between phase and amplitude over different frequencies, and in
> different channels. Computing the signals of interest (i.e., the phase
> and amplitude time series) works fine, and computing the correlation
> is also easily done with Matlab's corrcoef function. Now comes the
> tricky part. I would like to use the FT statistics framework to assess
> the significance of the found correlation, and use clustering in
> frequency-frequency space to correct for multiple comparisons. I
> understand that this requires a statfun_correlationPearson (or
> something like it), and I do not think I'll encounter any problems
> implementing this. Before I start doing so, however, I am not sure how
> to pass the design and the data to the higher-level statistics
> functions, such as ft_freqstatistics.
>
> My data structure is custom-imported, and contains a single 100s-long
> trial, on which the phase/amplitude analysis should be conducted:
>
> data =
>        hdr: [1x1 struct]
>    fsample: 1000
>      trial: {[24x100000 double]}
>       time: {[1x100000 double]}
>      label: {24x1 cell}
>        cfg: [1x1 struct]
>
> The phase and amplitude time series are generated by convolution with
> Morlet wavelets using ft_freqanalysis:
>
>    % perform wavelet convolution for amplitude data
>    cfg = [];
>    cfg.channel = chansAmpl;
>    cfg.method = 'wavelet';
>    cfg.output = 'fourier';
>    cfg.keeptrials = 'yes';
>    cfg.width = 7;
>    cfg.gwidth = 4;
>    cfg.foi = freqAmpl;
>    cfg.toi = data.time{1}; % compute estimate for each sample
>    tfr = ft_freqanalysis(cfg, data);
>
>    % perform wavelet convolution for phase data
>    cfg.channel = chansPh;
>    cfg.foi = freqPh;
>    tfrPh = ft_freqanalysis(cfg, data);
>
> where chansAmpl = {'CH10'}, chansPh = {'CH17'}, freqAmpl = 3:5:98,
> freqPh = 2:2:30. Simply computing the correlation of interest is done
> in the following manner:
>
>    % extract freqs X samples matrices to compute correlations on
>    phSig = real(squeeze(tfrPh.fourierspctrm(1,match_str(tfrPh.label,
> chansPh(k)), :, :)));
>    amplSig = abs(squeeze(tfr.fourierspctrm(1,match_str(tfr.label,
> chansAmpl(l)), :, :)));
>
>    % compute correlations
>    r = corr(phSig', amplSig', 'rows', 'pairwise'); % use only non-NaN rows
>
> The correlation matrix r I get out of this nicely corresponds to the
> results I found by using algorithms implemented elsewhere: so far so
> good. My (unsuccessful) attempts to use the statistics framework have
> been variations on the following:
>
>    % add amplitude spectrum to the TFR (which now only has fourierspctrm)
>    tfr.amplspctrm = abs(tfr.fourierspctrm);
>
>    cfg = [];
>    cfg.method = 'montecarlo';
>    cfg.statistic = 'corr'; % I have made a (now empty) statfun_corr
>    cfg.parameter = 'amplspctrm';
>    cfg.numrandomization = 100;
>    cfg.design = real(tfrPh.fourierspctrm);
>    stat = ft_freqstatistics(cfg, tfr);
>
> For your information, the tfr and cfg fields look like this before the
> call to ft_freqstatistics:
>
> cfg =
>              method: 'montecarlo'
>           statistic: 'corr'
>           parameter: 'amplspctrm'
>    numrandomization: 100
>              design: [4-D double]
>
> tfr =
>            label: {'CH10'}
>           dimord: 'rpttap_chan_freq_time'
>             freq: [3 8 13 18 23 28 33 38 43 48 53 58 63 68 73 78 83 88 93 98]
>             time: [1x100000 double]
>    fourierspctrm: [4-D double]
>        cumtapcnt: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
>              cfg: [1x1 struct]
>       amplspctrm: [4-D double]
>
> The statfun_corr gets a dat variable, that appears to be a column
> vector constructed by concatenating all the samples in the amplitude
> spectrum. How should my function treat this? I am aware that
> specifying a 4D design matrix is a bit odd, but it does contain my 15
> regressors X 100000 samples (X 2 singleton dimensions that correspond
> to the 1 trial and 1 channel that I analysed). If I squeeze() the
> matrix before calling ft_freqstatistics, I get an error 'the size of
> the design matrix does not match the number of observations in the
> data'.
>
> I realize my question is a bit vague, but I am somewhat lost as to how
> to proceed. In short, what I want is this: (1) compute a correlation
> (across samples) for each phase time series with each amplitude time
> series (each time series corresponds to a single frequency); (2)
> shuffle the phase time series around, computing correlations on the
> shuffled data to assess significance of the observed correlation; (3)
> prune away correlations that are isolated in frequency-frequency
> space, retain only large clusters of correlations.
>
> All the documentation about the cluster-based statistics is focused on
> comparing variables that differ across trials and/or subjects, and my
> 'problem' might be that I only have a single trial. Chopping up the
> data into 100000 trials, each 1 sample long, is maybe a possibility?
> But it seems a bit artificial.
>
> (It would actually probably not be very difficult to implement my
> desired functionality 'by hand', but I guess it should be possble
> within FT's statistics framework. That would have quite some
> advantages.)
>
> Many thanks in advance for any input you can offer.
>
> Best,
>
> Eelke
>