[FieldTrip] PLV analysis clarification

Fri Mar 25 20:08:14 CET 2011

Hi Matt, Marco and others,

In the following I will address some issues which occurred earlier in  
this thread and try to also address the stuff that has come up since.

Here's my 2 cents:

As Matt and others already found out to their frustration, there is no  
straightforward way (in terms of how fieldtrip can handle the data) to  
carry out a statistical test (across a group of subjects) on data,  
which is the output of ft_connectivityanalysis. I will not talk about  
single subject analyses, but I sensed some confusion here and there,  
in that the 'statfun_indepsamplesZcoh' occasionally popped up. This is  
essentially indeed a function allowing to compute significance of  
coherence differences within a single subject.
Let's return to the statistics across a group case. The dirty hack I  
proposed a while ago in principle should work, but I could imagine  
that it doesn't work flawlessly, and it may need some fine-tuning.  
Having said this, in the past some functionality was built in to  
actually apply this dirty trick but only specifically so on coherence  
data, i.e. the data field called 'cohspctrm'. Internally in the  
ft_freqstatistics code, the field 'label' was created, in which each  
element consisted of a concatenation of 2 label strings, e.g. {'E1 -  
E2'}. The reason why other metrics are not supported, is that the  
aforementioned functionality was built in long before the existence of  
ft_connectivityanalysis, at the time we were mostly doing coherence  
analysis. As such, there is no principled reason why fieldtrip should  
not support similar functionality for other connectivity metrics, and  
it's certainly on the developers' to-do list (in this case:  my to-do  
list).
On a related note, once one manages to 'fool' freqstatistics (or if  
one would like to do a more traditional analysis with freqstatistics)  
and wants to apply clustering in the channel dimension, fieldtrip  
needs to know which channels are neighbours of each other. For this,  
the cfg structure needs to have a field called cfg.neighbours, and if  
this does not exist, fieldtrip tries to create one on the fly, based  
on the specification of the positions of the channels. This  
information is always stored (if known) in the data field called grad  
(for MEG), or elec (for EEG). Apparently, Marco's data does not  
contain this field, most likely causing the error he reported in an  
earlier message. This does not mean that one cannot do clustering in  
this case. It just means that one has to create the neighbourhood  
structure between the channels on his own. There's surely some  
documentation about this on the wiki, so I don't want to enter into  
details about this here.
If one now has scalp level estimates of connectivity between channel  
pairs, the clustering over channels may in principle be done (but not  
yet in fieldtrip as far as I am aware), since the channel dimension is  
now not just defined in 2D (as a projection of the channels onto a  
plane), but in 4-dimensional space. So in such a case one needs to  
specify an empty neighbourhood structure, i.e. cfg.neighbours{k}.label  
= data.label{k} and cfg.neighbours{k}.neighblabel = {} (where k is the  
k'th label in the (in this case) bivariate data, see above. Fieldtrip  
then still allows for clustering over frequencies and time.
Then it is of course the question as to how to generate the  
connectivity data in order to optimally fool ft_freqstatistics, and  
one level higher in the analysis pipeline, it relates to how  
ft_freqanalysis should behave. Let me start from the top, and use  
coherence as an example. Coherence essentially is the frequency domain  
version of the cross-correlation function and as such is constructed  
from the frequency domain representation of the time series. Analogous  
to the correlation which is the covariance between two signals,  
normalised with the variance of the two respective signals, the  
coherence is the cross-spectral density between two signals,  
normalised with the power of the two respective signals, at frequency  
X. The correlation can be loosely seen as a multiplication between two  
time domain signals, the cross-spectral density is obtained by  
(conjugate) multiplication between the frequency domain representation  
of those two signals. This essentially is related to the two different  
(relevant) outputs ft_freqanalysis will provide. Cfg.output =  
'fourier' provides the fourier transforms of the time domain signals,  
and cfg.output='powandcsd' provides the cross-spectral density between  
a specified set of channel combinations, and you get the power spectra  
for free.
Both versions allow for the coherence to be computed but the memory  
requirements at this stage differs. For 'fourier' data the  
dimensionality of the fourierspctrm = 'rpttap_chan_freq_time' (by  
force the single replicates are kept in memory, because replicates may  
only be averaged once the fourierspectra are multiplied (yielding  
auto- and cross-spectra). For 'powandcsd' data the dimensionality of  
crsspctrm is either 'chan_freq_time', or 'rpt_chan_freq_time'.  
Averaging across replicates is now possible (when cfg.keeptrials =  
'no' before calling ft_freqanalysis)  (since the multiplication is now  
performed), yet the 'chan' dimension now represents channel  
combinations, which in the case of cfg.channelcmb = {'all' 'all'}  
leads to (N-1)*N/2 channel combinations. This typically leads to a  
much higher memory usage, particularly because for some connectivity  
metrics (such as plv) one needs to process the single replicates  
before computing the plv, requiring cfg.keeptrials to be 'yes' prior  
to calling ft_freqanalysis. Therefore I would advocate the use of  
cfg.method = 'fourier', as long as the number of trials and tapers is  
not too big.
Next, ft_connectivityanalysis can be called, using the freq data with  
fourier. One can actually specify a cfg.channelcmb at this stage,  
which makes sense if not all combinations are needed, or if the  
requested connectivity metric is symmetric, such as coherence. This  
could be the trick Matt is waiting for, i.e. run ft_freqanalysis with  
output='fourier', and specify cfg.channelcmb = {'all' 'all'} before  
calling ft_connectivityanalysis. Alternatively, you could do a reshape  
on the plvspctrm, and keep only the interesting non-duplicated pairs,  
e.g. indx = reshape(1:length(conn.label).^2,[length(conn.label)  
length(conn.label]); indx = tril(indx,-1); plvspctrm =  
reshape(plvspctrm, [length(conn.label) length(conn.label)  
length(conn.freq) length(conn.time)]); plvspctrm =  
plvspctrm(indx(:),:,:); label = label(indx); label in this case is the  
processed labelcmb field.

I hope that this will be of any help.

Jan-Mathijs

On Mar 23, 2011, at 11:30 PM, Matt Mollison wrote:

> Dear Jan-Mathijs,
>
> Thanks for your reply. I know my post was a large one. I have two  
> further questions that are borne out in more detail below: 1) how  
> would one do connectivity analyses with 'fourier' data (the issue  
> for "next time")? and 2) what's the difference between using  
> 'powandcsd' and 'fourier'?
>
> 1. I'm still confused about the last case that you said to keep for  
> next time (cfg.method='mtmconvol', cfg.output='fourier').
> How do FT users run ft_freqstatistics on ft_connectivityanalysis  
> data (processed as output='fourier') in a within-subjects experiment  
> if the stats throw an error because of "chan_chan" in dimord? It  
> seems like one must reshape plvspctrm/cohspctrm and modify dimord.  
> However, for me, reshaping plvspctrm produces a dimension that is  
> more than twice as large as it should be.
> size(data_conn.plvspctrm)=129   129     9    33 (i.e.,  
> chan_chan_freq_time)
> If I then reshape with
> siz = size(data_conn.plvspctrm);
> size(reshape(data_conn.plvspctrm,siz(1)*siz(2),siz(3),siz(4))) =  
> 16641           9          33
> and (16641-129)/2 = 8256, which are the nchoosek(129,2)=8256 channel  
> combinations that I *should* have. So it seems that  
> ft_connectivityanalysis is producing duplicate channel combinations  
> (i.e., 'E1_E2' and 'E2_E1' when those should be equal) plus auto- 
> correlations ('E1_E1'), and making some imagesc plots of the data  
> confirms this.
> People here are surely running PLV/coherence analyses (there have  
> been questions about single-subject analyses with  
> statfun_indepsamplesZcoh, which requires fourier input), so there  
> must be a way to use fourier input.
>
> The only thing I can figure is to instead use cfg.output='powandcsd'  
> and implement the label-field creation trick you suggested. However,  
> this doesn't seem like it's the only option because  
> statfun_indepsamplesZcoh requires 'fourier'. Nonetheless, I tried  
> using 'powandcsd' with the label trick and I was able to run  
> ft_freqstatistics (finally!! doing both a t-test and a  
> clusteranalysis), but it seems like FT could/should deal with the  
> issue more elegantly instead of needing the trick. Maybe all coh/plv  
> analyses ever done with FT have required this hack and maybe the  
> correct reshape for fourier data still needs to be implemented, both  
> of which are completely fine with me, but in that case I'm curious  
> what hacks other people are using.
>
> 2. Despite being successful with powandcsd data, I still can't  
> figure out how to use fourier data. That leads me to ask, what's the  
> functional different between fourier and powandcsd output? Or  
> rather, what can/can't I do with complex Fourier spectra? Will FT  
> perform "better" (faster?) with one or the other? Are they used in  
> different situations (it seems like I should be able to use either  
> for PLV/coherence, especially since statfun_indepsamplesZcoh  
> requires fourier)? If I also want to look at power, can I calculate  
> it from the complex Fourier spectra? Sorry for my lack of knowledge  
> in this department.
>
> Thanks again for all the information.
> Matt
>
>
> On Wed, Mar 16, 2011 at 2:25 AM, jan-mathijs schoffelen <jan.schoffelen at donders.ru.nl 
> > wrote:
> Dear Matt,
>
> As usual, thanks for your detailed report (although one also  
> shouldn't exaggerate ;o) ).
> Reading your mail and attachment, it seems there are several  
> features/problems in the code that restrict your analysis.
> In the following, I pasted your script and will provide some comments
>
> =================================================================
> cfg_freq.method = 'wavelet';
> cfg_freq.output = 'powandcsd';
> cfg_freq.channelcmb = {'all','all'}; % 129 channels
>
> --------------------------------------
> ft_freqanalysis
> data_freq =
>        label: {129x1 cell}
>       dimord: 'chan_freq_time'
>         freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>         time: [1x33 double]
>    powspctrm: [129x9x33 double]
>     labelcmb: {8256x2 cell}
>    crsspctrm: [8256x9x33 double]
>          cfg: [1x1 struct]
>
> ft_connectivityanalysis with 'plv'
> data_conn =
>      labelcmb: {8256x2 cell}
>       dimord: 'chan_freq_time'
>    plvspctrm: [8256x9x33 double]
>         freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>         time: [1x33 double]
>          cfg: [1x1 struct]
>
> --------------------------------------
> FAILURE: can't run ft_freqgrandaverage on data_conn because there is  
> no label field
>
> This has indeed been noted by you and posted on bugzilla. We didn't  
> find time to work on it yet, but as a general remark  
> ft_freqgrandaverage (and ft_timelockgrandaverage and  
> ft_sourcegrandaverage) were designed to work on univariate data  
> only, i.e. it fails on data which comes out of  
> ft_connectivityanalysis. As such there is no other reason for it  
> than a historical one (the averaging functions were there long  
> before we started doing serious connectivity stuff). Of course it  
> should be possible to combine the data into one structure and do  
> statistics later on, although I guess the clustering in the channel  
> dimension becomes really complicated. Note, also that I pointed out  
> to you that as such ft_XXXgrandaverage is not mandatory before  
> calling ft_XXXstatistics, you can also input a cell-array of data  
> structures. Yet, this will not help you because then you probably  
> encounter a crash later on.
>
> SOLUTION for now: rename plvspctrm into powspctrm and create a label- 
> field, e.g. by   for i =1:size(data_conn.labelcmb,1)  
> data_conn.label{k} = [data_conn.labelcmb{i, 
> 1},'_',data_conn.labelcmb{i,2}]; end (and remove the labelcmb field)
>
>
>
> --------------------------------------
> FAILURE: can't run ft_singleplotTFR on data_conn
>
> cfg_ft.cohrefchannel = 'E3';
> cfg_ft.channel = 'E60';
> selected 1408 channels for plvspctrm
> ??? Error using ==> ft_channelselection at 72
> data with non-unique channel names is not supported
>
> Error in ==> ft_singleplotTFR at 290
> selchannel = ft_channelselection(cfg.channel, data.label);
>
> The plotting functions keep haunting me. I thought that with my  
> recent overhaul I made them somewhat more robust for plotting  
> connectivity data.
> It would be helpful if you create a bug out of this specific  
> section, and add an attachment containing data_conn, and cfg_ft)
>
> --------------------------------------
> FAILURE: can't run ft_topoplotTFR on data_conn
>
> creating layout from data.elec
> creating layout for egi128 system
> selected 1408 channels for plvspctrm
> Warning: Duplicate x-y data points detected: using average of the z  
> values.
> > In griddata at 108
>  In ft_plot_topo at 146
>  In ft_topoplotER at 747
>  In ft_topoplotTFR at 116
> ??? Error using ==> griddata at 122
> Not enough unique sample points specified.
>
> Error in ==> ft_plot_topo at 146
> [Xi,Yi,Zi] = griddata(chanX', chanY, dat, xi', yi, interpmethod); %  
> interpolate the topographic data
>
> Error in ==> ft_topoplotER at 747
>     
> ft_plot_topo 
> (chanX,chanY,datavector,'interpmethod',cfg.interpolation,...
>
> Error in ==> ft_topoplotTFR at 116
> cfg=ft_topoplotER(cfg, varargin{:});
> =================================================================
> =================================================================
>
> You requested only a single pair to be plotted (1 cohrefchannel and  
> 1 channel). A topography does not makes sense. FieldTrip does not  
> detect this, but the low-level interpolation function starts  
> complaining.
>
> SOLUTION: don't call ft_topoplotTFR if you want to look at 1 TFR  
> only, as specified by the cfg. In that case it should be possible to  
> specify cfg.interactive = 'yes', allowing you to toggle back and  
> forth between topographies and selected sensors.
>
>
> =================================================================
> cfg_freq.method = 'wavelet';
> cfg_freq.output = 'fourier'
>
> cfg_freq.channelcmb = {'E3','E60';'E3','E62';'E11','E60';'E11','E62'};
> cfg_freq.channel = unique(cfg_freq.channelcmb);
>
> have to set keeptapers to 'no' even though (as far as I know) there  
> are no tapers used in wavelets (see ft_freqanalysis line 313)
>
> cfg_freq.keeptrials = 'no';
> cfg_freq.keeptapers = 'no';
>
> --------------------------------------
> ft_freqanalysis
> data_freq =
>            label: {4x1 cell}
>           dimord: 'chan_freq_time'
>             freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>             time: [1x33 double]
>    fourierspctrm: [4x9x33 double]
>              cfg: [1x1 struct]
>
> --------------------------------------
> FAILURE: can't run ft_connectivityanalysis on data_freq because  
> there is no cumtapcnt field
>
> ??? Reference to non-existent field 'cumtapcnt'.
>
> Error in ==> ft_checkdata>fixcsd at 844
>  sumtapcnt = [0;cumsum(data.cumtapcnt(:))];
>
> Error in ==> ft_checkdata at 594
>    data = fixcsd(data, cmbrepresentation, channelcmb);
>
> Error in ==> univariate2bivariate at 29
>        data    = ft_checkdata(data, 'cmbrepresentation', 'full');
>
> Error in ==> ft_connectivityanalysis at 234
>          [data, powindx, hasrpt] = univariate2bivariate(data,  
> 'fourierspctrm', 'crsspctrm', dtype, 'cmb', cfg.channelcmb);
> =================================================================
> =================================================================
>
> Thanks for noticing this. I never considered anybody to use the  
> wavelet method before calling ft_connectivityanalysis. Why would you  
> want to use it in the first place if there's mtmconvol ;o) ?
>
> SOLUTION: Don't use wavelet (and post a bug on bugzilla stating this  
> specific problem: output to waveletanalysis cannot be processed by  
> ft_connectivityanalysis).
>
> =================================================================
>
> cfg_freq.method = 'mtmconvol';
> cfg_freq.output = 'powandcsd';
> cfg_freq.channelcmb = {'E3','E60';'E3','E62';'E11','E60';'E11','E62'};
> cfg_freq.channel = unique(cfg_freq.channelcmb);
>
>
> cfg_freq.keeptrials = 'no';
> cfg_freq.keeptapers = 'no';
> cfg_freq.taper = 'hanning';
>
> --------------------------------------
> ft_freqanalysis
> data_freq =
>        label: {4x1 cell}
>       dimord: 'chan_freq_time'
>         freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>         time: [1x33 double]
>    powspctrm: [4x9x33 double]
>     labelcmb: {4x2 cell}
>    crsspctrm: [4x9x33 double]
>    cumtapcnt: [173x9 double]
>          cfg: [1x1 struct]
>
> ft_connectivityanalysis with 'plv'
> data_conn =
>     labelcmb: {4x2 cell}
>       dimord: 'chan_freq_time'
>    plvspctrm: [4x9x33 double]
>         freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>         time: [1x33 double]
>          cfg: [1x1 struct]
> --------------------------------------
> FAILURE: can't run ft_freqgrandaverage on data_conn because there is  
> no label field
>
>
> See above.
>
> --------------------------------------
> SUCCESS: ft_singleplotTFR successfully makes a plot
>
>
> Hurrah
>
>
> --------------------------------------
> FAILURE: can't run ft_freqstatistics on data_conn because there's no  
> label field
>
> ??? Reference to non-existent field 'label'.
>
> Error in ==> prepare_timefreq_data>forcedimord at 509
>  output.label  = input.label;
>
> Error in ==> prepare_timefreq_data at 87
>  [remember{c}, hascrsspctrm] = forcedimord(varargin{c});
>
> Error in ==> statistics_wrapper at 217
>  [cfg, data] = prepare_timefreq_data(cfg, varargin{:});
>
> Error in ==> ft_freqstatistics at 127
> [stat, cfg] = statistics_wrapper(cfg, varargin{:});
> =================================================================
> =================================================================
>
>
> See above
>
> =================================================================
>
> cfg_freq.method = 'mtmconvol';
> cfg_freq.output = 'fourier';
> cfg_freq.channelcmb = {'E3','E60';'E3','E62';'E11','E60';'E11','E62'};
> cfg_freq.channel = unique(cfg_freq.channelcmb);
>
>
> cfg_freq.keeptrials = 'no';
> cfg_freq.keeptapers = 'no';
> cfg_freq.taper = 'hanning';
>
> --------------------------------------
> ft_freqanalysis
> data_freq =
>            label: {4x1 cell}
>           dimord: 'chan_freq_time'
>             freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>             time: [1x33 double]
>    fourierspctrm: [4x9x33 double]
>        cumtapcnt: [173x9 double]
>              cfg: [1x1 struct]
>
> --------------------------------------
> FAILURE: can't run ft_connectivityanalysis on data_freq
>
> ??? Subscripted assignment dimension mismatch.
>
> Error in ==> ft_checkdata>fixcsd at 851
>        crsspctrm(p,:,:,m,k) = (tmpdat*tmpdat')./data.cumtapcnt(p);
>
> Error in ==> ft_checkdata at 594
>    data = fixcsd(data, cmbrepresentation, channelcmb);
>
> Error in ==> univariate2bivariate at 29
>        data    = ft_checkdata(data, 'cmbrepresentation', 'full');
>
> Error in ==> ft_connectivityanalysis at 234
>          [data, powindx, hasrpt] = univariate2bivariate(data,  
> 'fourierspctrm', 'crsspctrm', dtype, 'cmb', cfg.channelcmb);
> =================================================================
>
> This seems to be a bug in ft_freqanalysis. It should be very  
> forbidden to run ft_freqanalysis with cfg.keeptrials = 'no' and  
> cfg.output = 'fourier'. Just doesn't make sense. Could  you file  
> this as a bug please?
>
>
> =================================================================
> =================================================================
>
> cfg_freq.method = 'mtmconvol';
> cfg_freq.output = 'fourier';
> cfg_freq.channelcmb = {'E3','E60';'E3','E62';'E11','E60';'E11','E62'};
> cfg_freq.channel = unique(cfg_freq.channelcmb);
>
>
> cfg_freq.keeptrials = 'yes';
> cfg_freq.keeptapers = 'yes';
> cfg_freq.taper = 'hanning';
>
> --------------------------------------
> ft_freqanalysis
> data_freq =
>            label: {4x1 cell}
>           dimord: 'rpttap_chan_freq_time'
>             freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>             time: [1x33 double]
>    fourierspctrm: [4-D double]
>        cumtapcnt: [173x9 double]
>              cfg: [1x1 struct]
>
> ft_connectivityanalysis with 'plv'
> data_conn =
>        label: {4x1 cell}
>       dimord: 'chan_chan_freq_time'
>    plvspctrm: [4-D double]
>         freq: [2.9630 3.7037 4.4444 5.1852 5.9259 6.6667 7.4074  
> 8.1481 8.8889]
>         time: [1x33 double]
>          dof: [173 173 173 173 173 173 173 173 173]
>          cfg: [1x1 struct]
>
> --------------------------------------
> FAILURE: can't run ft_freqgrandaverage on data_conn
>
> ??? Error using ==> ft_freqgrandaverage at 170
> unsupported dimord
>
> --------------------------------------
> SUCCESS: ft_singleplotTFR makes a plot
>
> --------------------------------------
> FAILURE: can't do ft_freqstatistics on data_conn because of  
> chan_chan in dimord
>
> ??? Error using ==> size
> Dimension argument must be a positive integer scalar within indexing  
> range.
>
> Error in ==> prepare_timefreq_data>forcedimord at 540
>    Nchan = size(output.dat, chandim);
>
> Error in ==> prepare_timefreq_data at 87
>  [remember{c}, hascrsspctrm] = forcedimord(varargin{c});
>
> Error in ==> statistics_wrapper at 217
>  [cfg, data] = prepare_timefreq_data(cfg, varargin{:});
>
> Error in ==> ft_freqstatistics at 127
> [stat, cfg] = statistics_wrapper(cfg, varargin{:});
>
> Error in ==> mm_ft_ttestTFR at 228
>  cfg_ana.(vs_str) = eval(sprintf('ft_freqstatistics(cfg_ft, 
> %s);',subj_str));
> =================================================================
>
> Let's keep this one for the next time...
>
> Best,
>
> Jan-Mathijs
> _______________________________________________
> fieldtrip mailing list
> fieldtrip at donders.ru.nl
> http://mailman.science.ru.nl/mailman/listinfo/fieldtrip
>
> _______________________________________________
> fieldtrip mailing list
> fieldtrip at donders.ru.nl
> http://mailman.science.ru.nl/mailman/listinfo/fieldtrip

Dr. J.M. (Jan-Mathijs) Schoffelen
Donders Institute for Brain, Cognition and Behaviour,
Centre for Cognitive Neuroimaging,
Radboud University Nijmegen, The Netherlands
J.Schoffelen at donders.ru.nl
Telephone: 0031-24-3614793

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.science.ru.nl/pipermail/fieldtrip/attachments/20110325/0f82b207/attachment-0002.html>