[FieldTrip] Cluster-Statistics on wPLI data

Tzvetan Popov tzvetan.popov at uni-konstanz.de
Thu Jun 4 15:23:01 CEST 2015

Dear Nick et al.,

it is also important to consider alternative possibilities to reduce the degrees of freedom based on the existing knowledge about the data. For example:
Case 1- The research question.
Consider the case of mid-frontal theta in the context of error related design. Base on what we already know from the existing literature it is probably save to reduce the  128x128x70x50 matrix  to a 128x128 where the time-freq tile of interest will be something from .3 to 1 seconds ~6 Hz. 
Case 2- The data itself.
Consider continuous resting state recording and long-range connectivity as a research question. The time dimension doesn’t really make sense and from what we know, or believe to know about long range communication, this is likely to happen in the lower frequency bands. With a peak in the power spectrum as an additional including criterion it is likely that the frequency dimension would be reduced to ~ 10 Hz. 
Case 3- The preceding analysis
Admittedly not always yet more often than not connectivity analysis follows extensive analysis on sensor level where most likely relevant time-frequency tiles were already defined while controlling for MCP. In that case one will go for the connectivity patterns concerning these preceding findings thereby reducing the time-freq dimension again.

These are just a few examples where you will have a chan x chan matrix to begin with. Whether you are interested in the mean connective value of a given channel or channel(seed) to all other channels both approaches will give a 128 x 1 matrix that can be further evaluated in the context of a permutation approach or whatever preferred method.


> Hi Tom and Jake, thanks very much for adding momentum to this.  I haven't yet looked into spectral graph clustering, and really haven't been thinking of it terms of graph theory but it now seems silly that I haven't. 
> If I'm understanding the suggestions correctly, the idea is to use the eigenvectors of the connectivity measure (WPLI in this case) to determine which nodes/voxels to cluster together.  So nodes with similar graphs will be clustered together.  This would be analogous to the graph bisection problem where k=2.  The problem with mixture models is the arbitrary nature of how many clusters to use, and without a way to determine the statistical likelihood of a given cluster this is a real problem for determining the validity of a given cluster.  
> Jake thanks for the resources, I'm beginning to dig into them and they seem promising.  My understanding is that graph bisection divides the graph into two groups that have the minimum number of connections between them.  So this could be used to cluster nodes that have similar topographies together, but it's not clear the best division would occur in the first bisection, instead possibly in subsequent bisections.  Is this where the recursive algorithm comes in?  I found this paper of yours ' Spectral Techniques for Graph Bisection in Genetic Algorithms', but it doesn't seem to get past the initial bisection.  
> As I said, I haven't really been thinking of the problem in graph theory terms, but if my understanding is correct the current data set I'm working with could have an adjacency matrix as large as 128x128x70x50 (chan x chan x time x frequency), and as small as 128x128 for a single time x frequency bin.  
> Nick
> On 6/2/15 12:38 PM, Jacob Martin wrote:
>> Hi Tom and Nick,
>> Here's a paper i wrote which may help give some background on spectral clustering:
>> "Ranks and representations for spectral graph bisection"
>> I also have a genetic algorithm which uses recursive spectral graph bisection to cluster graphs using the ideas from the above paper.  It is written in Java and is quite competitive and fast compared to other algorithms.  Nick, perhaps i could lend a hand. 
>> I suggest looking up Chris Walshaw's partition archive for lots of examples of algorithms.   I had a few minimum bisection records in there a while back with an older algorithm, but haven't looked at it in years or tried with my new algorithms yet. 
>> How big are your adjacency matrices?  Note that the minimum bisection problem is NP complete. 
>> Best
>> Jake
>> --
>> Jacob Martin, PhD
>> Center for Brain and Cognition
>> CerCo CNRS
>> Toulouse, France
>> On 02 Jun 2015, at 13:33, Tom Holroyd <tomh at kurage.nimh.nih.gov> wrote:
>> Several people have been doing things with spectral clustering,
>> involving the graph Laplacian, wherein we say that each voxel of the
>> brain is a node in a graph, and various connectivity measures connect
>> the nodes. The Laplacian of the graph has many interesting properties,
>> and the first K eigenvalues can be used to create clusters.
>> I'm afraid I don't know how to do permutation stats on that, but
>> k-means clustering of the graph eigenvectors can have random starting
>> points ...
>> On Tue, 02 Jun 2015 09:53:42 -0600
>> Nick Ketz <nick.ketz at gmail.com> wrote:
>>> Mathis, you have hit on a much larger problem.  Tzvetan is right to 
>>> suggest calculating each grid-point's connectivity with all other 
>>> grid-points, which gives you a connectivity topography map for each 
>>> grid-point.  The next logical step, which I've been struggling with,
>>> is to somehow cluster these topographies together, giving you
>>> clusters of neighboring grid-point topographies that are similar.  Is
>>> there any work being done to solve this problem?
>>> I have been struggling with this problem for some time, that is: how
>>> to do cluster based permutation statistics in chan x chan bins of 
>>> connectivity  measures.  Ideally I want to do this in a full chan x
>>> chan x freq x time cluster analysis.  I've searched on the list for
>>> related topics; posted, probably incoherently, my own issues; and
>>> generally scanned the list and the literature for any sort of
>>> relevant topics trying to broach this problem, but am still stuck.
>>> The standard 3d chan x freq x time clustering uses a neighborhood 
>>> structure to define what channels are considered neighbors, and uses 
>>> numeric adjacency to cluster time and frequency. With chan x chan
>>> data however, there unfortunately isn't an intuitive way to think
>>> about neighborhood for the pairwise connectivity.  There are
>>> neighbors surrounding the reference grid-point, and there are
>>> neighbors in each grid-point's connectivity topography, i.e.
>>> neighboring grid-points that have similar/significant connectivity
>>> values.  How do you combine the two in a sensible way?  And even more
>>> challenging how do you do statistics on them?
>>> I have been considering several approaches, including
>>> multi-dimensional scaling of these grid-point connectivity
>>> topographies, or just a correlation between topographies, and then
>>> using those values to select which reference grid-points to consider
>>> as similar enough to average together.  This then gives you a
>>> reference cluster of grid-points and allows you to do standard
>>> cluster permutation statistics on the average of the reference
>>> grid-points.  How exactly to determine which reference grid-points
>>> are 'similar enough' to be consider clusters is the tricky part, i.e.
>>> how can you statistically test if they should be considered a cluster
>>> or not?
>>> I've been working in isolation on this problem for some time, and
>>> would love a discussion on the topic.  I'm hoping this message spurs
>>> other list lurkers to at least commiserate with me, but if possible
>>> to also post whatever approaches they have taken to solve the problem
>>> of clustering pairwise connectivity data.
>>> Nick
>>>> On 6/2/15 4:00 AM, fieldtrip-request at science.ru.nl wrote:
>>>> Hi Mathis, alternatively you could use ft_sourcestatistics. For
>>>> this you should reduce your 600 x 600 x freq matrix to freq of
>>>> interest first. Next, the resulting 600 x 600 matrix you?d reduce
>>>> to 600 x 1 where each grid point has a mean connectivity value to
>>>> all possible grid points. >From then on you could use
>>>> ft_sourcestatistics and ft_sourceinterpolate to visualize. Good
>>>> luck, Tzvetan
>>>>>> Dear Fieldtrip Users,
>>>>>> I am a first-year PhD student and have been lurking here for a
>>>>>> long time (finding lots of useful answers), it's about time I ask
>>>>>> my first question:
>>>>>> I am trying to compare connectivity in source space (~25
>>>>>> subjects, between two conditions, ~600 virtual channels on a
>>>>>> 1.5cm grid) using ft_freqstatistics with a cluster-based
>>>>>> permutation test (see code below).
>>>>>> The original input datasets have the dimensions: chan x chan x
>>>>>> freq. I tried restructuring them to chancmb x freq
>>>>>> (seehttp://mailman.science.ru.nl/pipermail/fieldtrip/2014-February/007620.html)
>>>>>> before doing ft_freqstatistics, but the function then takes
>>>>>> forever appending the (admittedly big) datasets.
>>>>>> I then tried feeding them to ft_freqstatistics without
>>>>>> restructuring, which didn't throw any errors. However, I noticed
>>>>>> that the resulting clusters consist of adjacent cells in the chan
>>>>>> x chan matrix, which doesn't make much sense since channels in
>>>>>> adjacent cells are not necessarily spatially adjacent. I suspect
>>>>>> that ft_freqstatistic assumes the 3-D input to contain a temporal
>>>>>> dimension and therefore tries to build temporally adjacent
>>>>>> clusters. Could any of the fieldtrip developers comment on this?
>>>>>> Does anyone in the FT-community have any advice on how to
>>>>>> statistically evaluate connectivity data using the permutation
>>>>>> approach?
>>>>>> Thanks for any input, best wishes,
>>>>>> Mathis Kaiser
>>> _______________________________________________
>>> fieldtrip mailing list
>>> fieldtrip at donders.ru.nl
>>> http://mailman.science.ru.nl/mailman/listinfo/fieldtrip
> _______________________________________________
> fieldtrip mailing list
> fieldtrip at donders.ru.nl
> http://mailman.science.ru.nl/mailman/listinfo/fieldtrip

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.science.ru.nl/pipermail/fieldtrip/attachments/20150604/20d12f66/attachment-0002.html>

More information about the fieldtrip mailing list