[FieldTrip] impact of skewed power distributions on data analysis

Wed Dec 14 10:22:00 CET 2016

In terms of statistics, what matters is the distribution of the values on which the test is actually performed. For a paired-samples t-test comparing two conditions, it is the distribution of the difference values that has to be normal. When the two conditions have similarly non-normal distributions, the distribution of their differences is often close to normal, which poses no complication for a regular parametric test.
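
The point about difference values can be checked with a small simulation. The following is only a sketch in plain Python (not FieldTrip code), assuming two hypothetical conditions drawn independently from the same log-normal distribution; the parameters, sample size, and seed are arbitrary choices for illustration.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the cubed SD."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * sd ** 3)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Two hypothetical conditions with the same right-skewed (log-normal) shape.
cond_a = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
cond_b = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]

# The paired t-test operates on these differences, not on the raw values.
diff = [a - b for a, b in zip(cond_a, cond_b)]

skew_a = skewness(cond_a)
skew_b = skewness(cond_b)
skew_diff = skewness(diff)
print(f"skewness A: {skew_a:.2f}, B: {skew_b:.2f}, A-B: {skew_diff:.2f}")
```

Each condition is clearly right-skewed, while the differences are close to symmetric. Symmetry alone does not make the differences normal (their tails can still be heavier than Gaussian), so this illustrates the argument rather than proving it for real data.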

The non-parametric tests offered in FieldTrip indeed do not assume normality, so you should have no problem there either.

From: fieldtrip-bounces at science.ru.nl [mailto:fieldtrip-bounces at science.ru.nl] On Behalf Of Alik Widge
Sent: Tuesday, December 13, 2016 3:10 PM
To: FieldTrip discussion list <fieldtrip at science.ru.nl>
Subject: Re: [FieldTrip] impact of skewed power distributions on data analysis

In this, Teresa is right and we have observed this in our own EEG data -- depending on one's level of noise and number of trials/patients, the mean can be a very poor estimator of central tendency. My students are still arguing about what we really want to do with it, but at least one of them has shifted to using the median as a matter of course for baseline normalization.
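
As a rough illustration of why the median can be the safer baseline estimate, here is a minimal sketch in plain Python (not FieldTrip code). The function name `relative_change` and the power values are made up for the example; the baseline contains one noisy sample.

```python
import statistics

def relative_change(value, baseline, center=statistics.median):
    """Relative change of `value` from the central value of `baseline`."""
    ref = center(baseline)
    return (value - ref) / ref

# Made-up baseline power values with a single large outlier.
baseline_power = [1.0, 1.0, 1.0, 1.0, 100.0]
trial_power = 2.0

by_mean = relative_change(trial_power, baseline_power, center=statistics.fmean)
by_median = relative_change(trial_power, baseline_power)

print(f"mean baseline:   {by_mean:+.2f}")    # -0.90
print(f"median baseline: {by_median:+.2f}")  # +1.00
```

With the mean as reference, a trial value that is twice the typical baseline appears as a 90% decrease; with the median it appears as a 100% increase.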

Alik Widge
alik.widge at gmail.com
(206) 866-5435

On Mon, Dec 12, 2016 at 6:45 PM, Teresa Madsen <tmadsen at emory.edu> wrote:
That may very well be true; to be honest, I haven't looked that deeply into the stats offerings yet. However, my plan is to express each electrode's experimental data in terms of change from their respective baseline recordings before attempting any group averaging or statistical testing, and this problem shows up first in the baseline correction step, where FieldTrip averages raw power over time.

~Teresa

On Mon, Dec 12, 2016 at 4:56 PM Nicholas A. Peatfield <nick.peatfield at gmail.com> wrote:
Correct me if I'm wrong, but, if you are using the non-parametric statistics implemented by fieldtrip, the data does not need to be normally distributed.

On 12 December 2016 at 13:39, Teresa Madsen <tmadsen at emory.edu> wrote:
No, sorry, that's not what I meant, but thanks for giving me the opportunity to clarify. Of course everyone is familiar with the 1/f pattern across frequencies, but power across time (and, according to the poster, also across space) also has an extremely skewed, negative exponential distribution. I probably confused everyone by trying to show too much data in my figure, but each color represents the distribution of power values for a single frequency over time, using a histogram and a line above with circles at the mean +/- one standard deviation.

My main point was that the mean is not representative of the central tendency of such an asymmetrical distribution of power values over time. It's even more obvious which is more representative of their actual distributions when I plot e^mean(logpower) on the raw plot and log(mean(rawpower)) on the log plot, but that made the figure even more busy and confusing.
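
The comparison between e^mean(logpower) and mean(rawpower) can be sketched as follows. This is plain Python on simulated data, not the recordings discussed here; the log-normal parameters, sample size, and seed are assumptions chosen for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated stand-in for raw power at one frequency over time.
raw_power = [rng.lognormvariate(0.0, 1.0) for _ in range(10000)]
log_power = [math.log(p) for p in raw_power]

arith_mean = statistics.fmean(raw_power)          # mean(rawpower)
geo_mean = math.exp(statistics.fmean(log_power))  # e^mean(logpower)
median = statistics.median(raw_power)

print(f"arithmetic mean:  {arith_mean:.3f}")
print(f"e^mean(logpower): {geo_mean:.3f}")
print(f"median:           {median:.3f}")
```

For skewed data like this, e^mean(logpower) (the geometric mean) lands near the median, while the arithmetic mean is pulled upward by the long right tail.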

I hope that helps,
Teresa

On Mon, Dec 12, 2016 at 3:47 PM Nicholas A. Peatfield <nick.peatfield at gmail.com> wrote:
Hi Teresa,

I think what you are discussing is the 1/f scaling of the power spectrum. This is one of the reasons that comparisons are made within a band (e.g. alpha to alpha) and not between bands (e.g. alpha to gamma). The assumption is that within a band there should be a relative change against baseline, and this is what the statistics are performed on. That is, the baseline is assumed to be the mean for a specific frequency and not a mean across frequencies.

And this leads to another point: when you are selecting a frequency range to do the non-parametric statistics on, you should not use the whole 1-64 Hz range but break it up based on the bands.

Hope my interpretation of your point is correct. I sent it individually, as I wanted to ensure I followed your point.

Cheers,

Nick

On 12 December 2016 at 08:23, Teresa Madsen <tmadsen at emory.edu> wrote:
FieldTrippers,

While analyzing my data for the annual Society for Neuroscience meeting, I developed a concern that was quickly validated by another poster (full abstract copied and linked below) focusing on the root of the problem:  neural oscillatory power is not normally distributed across time, frequency, or space.  The specific problem I had encountered was in baseline-correcting my experimental data, where, regardless of cfg.baselinetype, ft_freqbaseline depends on the mean power over time.  However, I found that the distribution of raw power over time is so skewed that the mean was not a reasonable approximation of the central tendency of the baseline power, so it made most of my experimental data look like it had decreased power compared to baseline.  The more I think about it, the more I realize that averaging is everywhere in the way we analyze neural oscillations (across time points, frequency bins, electrodes, trials, subjects, etc.), and many of the standard statistics people use also rely on assumptions of normality.
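
The effect described above, where most data look decreased relative to a mean baseline, can be reproduced in a small simulation. This sketch is plain Python and does not call ft_freqbaseline; it assumes baseline power follows an exponential distribution, for which about 63% (1 - 1/e) of samples fall below the mean.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the run is repeatable

# Simulated baseline power, assumed exponentially distributed over time.
baseline = [rng.expovariate(1.0) for _ in range(10000)]

mean_power = statistics.fmean(baseline)
median_power = statistics.median(baseline)

# Fraction of samples that would count as "decreased" against each reference.
below_mean = sum(p < mean_power for p in baseline) / len(baseline)
below_median = sum(p < median_power for p in baseline) / len(baseline)

print(f"fraction below the mean:   {below_mean:.3f}")
print(f"fraction below the median: {below_median:.3f}")
```

Even data drawn from the very same distribution as the baseline fall below the mean reference well over half the time, whereas the median splits them evenly.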

The most obvious solution for me was to log transform the data first, as it appears to be fairly log normal, and I always use log-scale visualizations anyway.  Erik Peterson, middle author on the poster, agreed that this would at least "restore (some) symmetry to the error distribution."  I used a natural log transform, sort of arbitrarily to differentiate from the standard decibel transform included in FieldTrip as cfg.baselinetype = 'db'.  The following figures compare the 2 distributions across several frequency bands (using power values from a wavelet spectrogram obtained from a baseline LFP recorded in rat prelimbic cortex).  The lines at the top represent the mean +/- one standard deviation for each frequency band, and you can see how those descriptive stats are much more representative of the actual distributions in the log scale.
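
To illustrate why the mean +/- one standard deviation is more representative on the log scale, here is a minimal sketch in plain Python on simulated log-normal data (not the LFP recording shown in the figure; parameters and seed are arbitrary).

```python
import math
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable
power = [rng.lognormvariate(0.0, 1.0) for _ in range(10000)]

# Mean +/- one SD computed directly on raw power.
raw_mean = statistics.fmean(power)
raw_sd = statistics.stdev(power)
raw_lower = raw_mean - raw_sd  # can be negative, which is impossible for power

# Mean +/- one SD computed on natural-log power, mapped back to raw units.
log_power = [math.log(p) for p in power]
log_mean = statistics.fmean(log_power)
log_sd = statistics.stdev(log_power)
log_lower = math.exp(log_mean - log_sd)
log_upper = math.exp(log_mean + log_sd)

# Share of samples inside the back-transformed log-scale interval.
inside = sum(log_lower <= p <= log_upper for p in power) / len(power)

print(f"raw scale: {raw_lower:.2f} to {raw_mean + raw_sd:.2f}")
print(f"log scale: {log_lower:.2f} to {log_upper:.2f} (covers {inside:.0%})")
```

Natural-log and decibel values differ only by the constant factor 10/ln(10), roughly 4.34, so the shape of the distribution is the same under either transform.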

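[Figure: image001.png, histograms of raw and log-transformed power per frequency band, with lines at the top marking the mean +/- one standard deviation]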

For my analysis, I also calculated a z-score on the log transformed power to assess how my experimental data compared to the variability of the noise in a long baseline recording from before conditioning, rather than a short pre-trial baseline period, since I find that more informative than any of FieldTrip's built-in baseline types.  I'm happy to share the custom functions I wrote for this if people think it would be a useful addition to FieldTrip.  I can also share more about my analysis and/or a copy of the poster, if anyone wants more detail - I just didn't want to make this email too big.
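
A z-score on log-transformed power against a separate baseline recording could look roughly like this. It is a plain-Python sketch, not the custom functions mentioned above and not part of FieldTrip; the name `log_zscore` and the use of the sample standard deviation are assumptions.

```python
import math
import statistics

def log_zscore(power, baseline_power):
    """Z-score natural-log power against the log power of a baseline recording."""
    log_base = [math.log(p) for p in baseline_power]
    mu = statistics.fmean(log_base)
    sigma = statistics.stdev(log_base)  # sample SD; an assumption of this sketch
    return [(math.log(p) - mu) / sigma for p in power]

# Toy baseline whose log values are -1, 0 and 1 (mean 0, sample SD 1).
baseline = [math.exp(x) for x in (-1.0, 0.0, 1.0)]

# Experimental values with log power 0 and 2.
z = log_zscore([1.0, math.exp(2.0)], baseline)
print([round(v, 6) for v in z])
```

In this toy case the experimental values sit 0 and 2 baseline standard deviations above the baseline mean on the log scale.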

Mostly, I'm just hoping to start some discussion here as to how to address this.  I searched the wiki<http://www.fieldtriptoolbox.org/development/zscores>, listserv<https://mailman.science.ru.nl/pipermail/fieldtrip/2006-December/000773.html> archives<https://mailman.science.ru.nl/pipermail/fieldtrip/2010-March/002718.html>, and bugzilla<http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=1574> for anything related and came up with a few topics surrounding normalization and baseline correction, but only skirting this issue.  It seems important, so I want to find out whether others agree with my approach or already have other ways of avoiding the problem, and whether FieldTrip's code needs to be changed or just documentation added, or what?

Thanks for any insights,
Teresa

271.03 / LLL17 - Neural oscillatory power is not Gaussian distributed across time<http://www.abstractsonline.com/pp8/#!/4071/presentation/24150>
Authors
*L. IZHIKEVICH, E. PETERSON, B. VOYTEK;
Cognitive Sci., UCSD, San Diego, CA
Disclosures
 L. Izhikevich: None. E. Peterson: None. B. Voytek: None.
Abstract
Neural oscillations are important in organizing activity across the human brain in healthy cognition, while oscillatory disruptions are linked to numerous disease states. Oscillations are known to vary by frequency and amplitude across time and between different brain regions; however, this variability has never been well characterized. We examined human and animal EEG, LFP, MEG, and ECoG data from over 100 subjects to analyze the distribution of power and frequency across time, space and species. We report that between data types, subjects, frequencies, electrodes, and time, an inverse power law, or negative exponential distribution, is present in all recordings. This is contrary to, and not compatible with, the Gaussian noise assumption made in many digital signal processing techniques. The statistical assumptions underlying common algorithms for power spectral estimation, such as Welch's method, are being violated resulting in non-trivial misestimates of oscillatory power. Different statistical approaches are warranted.

--
Teresa E. Madsen, PhD
Research Technical Specialist:  in vivo electrophysiology & data analysis
Division of Behavioral Neuroscience and Psychiatric Disorders
Yerkes National Primate Research Center
Emory University
Rainnie Lab, NSB 5233
954 Gatewood Rd. NE
Atlanta, GA 30329
(770) 296-9119
braingirl at gmail.com
https://www.linkedin.com/in/temadsen

_______________________________________________
fieldtrip mailing list
fieldtrip at donders.ru.nl
https://mailman.science.ru.nl/mailman/listinfo/fieldtrip

--
Nicholas Peatfield, PhD

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 38279 bytes
Desc: image001.png
URL: <http://mailman.science.ru.nl/pipermail/fieldtrip/attachments/20161214/9a8ed7a6/attachment-0002.png>