[FieldTrip] impact of skewed power distributions on data analysis

Alik Widge alik.widge at gmail.com
Tue Dec 20 02:01:11 CET 2016


Indeed, in a separate thread several months back, Michael Cohen suggested
precisely that paper.

On Dec 19, 2016 5:07 PM, "Nicholas A. Peatfield" <nick.peatfield at gmail.com>
wrote:

> I think this paper is relevant to this discussion.
>
> Grandchamp, R., & Delorme, A. (2011). Single-Trial Normalization for
> Event-Related Spectral Decomposition Reduces Sensitivity to Noisy Trials. *Frontiers
> in Psychology*, *2*, 236. http://doi.org/10.3389/fpsyg.2011.00236
>
> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3183439/
>
>
>
> On 19 December 2016 at 13:08, Teresa Madsen <tmadsen at emory.edu> wrote:
>
>> I appreciate everyone's feedback, but I still wonder if something is
>> being missed.  I understand that the non-normally distributed power values
>> may be less of an issue when performing non-parametric stats or even a
>> paired-samples t-test that looks at difference values which may be normal
>> even when the raw data isn't.  However, my concern comes into play even
>> before these statistical comparisons are made, whenever any averaging is
>> done to freq-type data across times, frequencies, trials, electrodes,
>> subjects, etc.  That means any time any of these configuration options are
>> used for any of these functions, and probably more:
>>
>> ft_freqanalysis:      cfg.keeptrials or cfg.keeptapers = 'no';
>> ft_freqgrandaverage:  cfg.keepindividual = 'no';
>> ft_freqstatistics:    cfg.avgoverchan, cfg.avgovertime, or cfg.avgoverfreq = 'yes';
>> ft_freqbaseline:      cfg.baseline = anything but 'no'
>>
>> In each case, if raw power values are averaged, the result will be
>> positively skewed.  Maybe it's not a huge problem if all of the data is
>> treated identically, but the specific case that triggered my concern was in
>> ft_freqbaseline, where the individual time-frequency bins are compared to
>> the mean over time for the baseline period.  For example, when using
>> cfg.baselinetype = 'db', as Giuseppe Pellizzer suggested, the output freq
>> data does indeed have a more normal distribution over time, but the mean
>> over the baseline time period is performed *before* the log transform, when
>> the distribution is still highly skewed:
>>
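>>   % mean raw power over the baseline window, replicated across all time bins
>>   meanVals = repmat(nanmean(data(:,:,baselineTimes), 3), [1 1 size(data, 3)]);
>>   % decibel change relative to that baseline mean
>>   data = 10*log10(data ./ meanVals);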
>>
>> That's what I had originally done when analyzing data for my SfN poster,
>> when I realized that the background noise, which shouldn't have changed
>> much from baseline, was mostly showing a change from baseline of about -3 dB.
>>
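To make the size of this effect concrete, here is a minimal sketch in plain MATLAB (no FieldTrip calls). It assumes, purely for illustration, that baseline power at one frequency is exponentially distributed over time, which is the negative exponential shape described in the poster abstract. The variable names are hypothetical, and the samples are deterministic quantiles rather than real data.

```matlab
% Deterministic "samples" from a unit-mean exponential distribution,
% taken at evenly spaced quantiles (assumed shape, not real data).
n = 1000;
u = ((1:n) - 0.5) / n;           % quantile levels in (0, 1)
basePower = -log(1 - u);         % inverse CDF of the exponential distribution

meanPower   = mean(basePower);              % arithmetic mean, as in the 'db' snippet above
medianPower = median(basePower);            % median
geoPower    = exp(mean(log(basePower)));    % geometric mean: the mean taken after the log

% Where a "typical" baseline sample sits, in dB relative to the arithmetic mean
dbMedian = 10 * log10(medianPower / meanPower);
dbGeo    = 10 * log10(geoPower / meanPower);
fprintf('median: %.2f dB, geometric mean: %.2f dB\n', dbMedian, dbGeo);
```

Under this assumption both values come out negative (roughly -1.6 dB for the median and -2.5 dB for the geometric mean), so unchanged background activity normalized by the arithmetic mean tends to read as a decrease.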
>> Now, I've realized I'm seeing this as more of a problem than others
>> because of another tweak I made, which was to use a long, separate baseline
>> recording to normalize my trial data, rather than a short pre-trial period
>> as ft_freqbaseline is designed to do.  Averaging a few hundred milliseconds
>> for a baseline power estimate might be okay because overlapping time points
>> in the original data are used to calculate those power values anyway,
>> probably making them less skewed, but also (it seems to me) more arbitrary
>> and prone to error.  I already offered my custom function BLnorm.m to one
>> person who was asking about this issue of normalizing to a separate
>> baseline recording, and I would be happy to contribute it to FieldTrip if
>> others would appreciate it.
>>
>> Since a few people suggested using the median, and it is also suggested
>> in Cohen's textbook
>> <https://mitpress.mit.edu/books/analyzing-neural-time-series-data> as an
>> alternative measure of the central tendency for skewed raw power values, I
>> wonder if the simplest fix might be to add an option to select mean or
>> median in each of the functions listed above.  Another possibility would be
>> adding an option to transform the power values upon output from
>> ft_freqanalysis.
>>
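As a rough sketch of what such an option could look like, the plain-MATLAB example below selects the baseline estimate with a switch. The `method` variable is hypothetical and is not an existing FieldTrip configuration field; the chan x freq x time layout and the synthetic values are likewise assumptions for illustration.

```matlab
% Synthetic power, chan x freq x time (assumed layout, not real data)
nChan = 2; nFreq = 3; nTime = 10;
data = ones(nChan, nFreq, nTime);
data(:, :, 3) = 50;                  % one large outlier inside the baseline window
baselineTimes = false(1, nTime);
baselineTimes(1:5) = true;           % first five time bins form the baseline

method = 'median';                   % hypothetical option: 'mean' or 'median'
switch method
  case 'mean'
    baseVals = mean(data(:, :, baselineTimes), 3);
  case 'median'
    baseVals = median(data(:, :, baselineTimes), 3);
  otherwise
    error('unknown method: %s', method);
end

% Decibel change relative to the chosen baseline estimate
% (bsxfun keeps this compatible with releases that lack implicit expansion)
dbData = 10 * log10(bsxfun(@rdivide, data, baseVals));
```

With the single outlier in the baseline window, the median estimate stays at 1, so unchanged bins come out at 0 dB; switching `method` to 'mean' gives a baseline of 10.8 and pushes the same bins to about -10.3 dB.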
>> Would anyone else find such changes useful?
>>
>> Thanks,
>> Teresa
>>
>>
>> On Wed, Dec 14, 2016 at 4:22 AM, Herring, J.D. (Jim) <
>> J.Herring at donders.ru.nl> wrote:
>>
>>> In terms of statistics it is the distribution of values that you do the
>>> statistics on that matters. In case of a paired-samples t-test when
>>> comparing two conditions, it is the distribution of difference values that
>>> has to be normally distributed. The distribution of difference values is
>>> often normal given two similarly non-normal distributions, offering no
>>> complications for a regular parametric test.
>>>
>>>
>>>
>>> The non-parametric tests offered in FieldTrip indeed do not assume
>>> normality, so you should have no problem there either.
>>>
>>>
>>>
>>>
>>>
>>> *From:* fieldtrip-bounces at science.ru.nl [mailto:fieldtrip-bounces at science.ru.nl]
>>> *On Behalf Of *Alik Widge
>>> *Sent:* Tuesday, December 13, 2016 3:10 PM
>>> *To:* FieldTrip discussion list <fieldtrip at science.ru.nl>
>>> *Subject:* Re: [FieldTrip] impact of skewed power distributions on data
>>> analysis
>>>
>>>
>>>
>>> In this, Teresa is right and we have observed this in our own EEG data
>>> -- depending on one's level of noise and number of trials/patients, the
>>> mean can be a very poor estimator of central tendency. My students are
>>> still arguing about what we really want to do with it, but at least one of
>>> them has shifted to using the median as a matter of course for baseline
>>> normalization.
>>>
>>>
>>> Alik Widge
>>> alik.widge at gmail.com
>>> (206) 866-5435
>>>
>>>
>>>
>>> On Mon, Dec 12, 2016 at 6:45 PM, Teresa Madsen <tmadsen at emory.edu>
>>> wrote:
>>>
>>> That may very well be true; to be honest, I haven't looked that deeply
>>> into the stats offerings yet. However, my plan is to express each
>>> electrode's experimental data in terms of change from its respective
>>> baseline recording before attempting any group averaging or statistical
>>> testing, and this problem shows up first in the baseline correction step,
>>> where FieldTrip averages raw power over time.
>>>
>>> ~Teresa
>>>
>>>
>>>
>>> On Mon, Dec 12, 2016 at 4:56 PM Nicholas A. Peatfield <
>>> nick.peatfield at gmail.com> wrote:
>>>
>>> Correct me if I'm wrong, but if you are using the non-parametric
>>> statistics implemented by FieldTrip, the data do not need to be normally
>>> distributed.
>>>
>>>
>>>
>>> On 12 December 2016 at 13:39, Teresa Madsen <tmadsen at emory.edu> wrote:
>>>
>>> No, sorry, that's not what I meant, but thanks for giving me the
>>> opportunity to clarify. Of course everyone is familiar with the 1/f pattern
>>> across frequencies, but power values across time (and, according to the
>>> poster, also across space) follow an extremely skewed, negative
>>> exponential distribution as well. I probably confused everyone by trying to
>>> show too much data in my figure, but each color represents the distribution
>>> of power values for a single frequency over time, using a histogram and a
>>> line above with circles at the mean +/- one standard deviation.
>>>
>>> My main point was that the mean is not representative of the central
>>> tendency of such an asymmetrical distribution of power values over time.
>>> It's even more obvious which summary statistic better represents the
>>> actual distributions when I plot e^mean(logpower) on the raw plot and
>>> log(mean(rawpower)) on the log plot, but that made the figure even more
>>> busy and confusing.
>>>
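The relationship between those two quantities can be sketched in a few lines of plain MATLAB. The example assumes, for illustration only, power values whose logarithm is symmetric around zero; `logPower` and `rawPower` are hypothetical names, not the variables behind the figure.

```matlab
% Power values that are symmetric on the log scale (assumed, not real data)
logPower = linspace(-2, 2, 401);     % natural-log power, symmetric around 0
rawPower = exp(logPower);

arithMean = mean(rawPower);          % mean of raw power
geoMean   = exp(mean(logPower));     % e^mean(logpower), back on the raw scale
medPower  = median(rawPower);        % median of raw power

% On the raw scale the geometric mean coincides with the median here,
% while the arithmetic mean is pulled upward by the long right tail.
fprintf('mean %.3f, e^mean(log) %.3f, median %.3f\n', arithMean, geoMean, medPower);
```

Because the logarithm is concave, e^mean(logpower) can never exceed mean(rawpower); the more skewed the raw values, the larger the gap.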
>>> I hope that helps,
>>> Teresa
>>>
>>>
>>>
>>> On Mon, Dec 12, 2016 at 3:47 PM Nicholas A. Peatfield <
>>> nick.peatfield at gmail.com> wrote:
>>>
>>> Hi Teresa,
>>>
>>>
>>>
>>> I think what you are discussing is the 1/f power scaling of the power
>>> spectrum. This is one of the reasons that comparisons are made within
>>> a band (e.g. alpha to alpha) and not between bands (e.g. alpha to gamma).
>>> The assumption is that within a band there should be a relative change
>>> against baseline, and this is what the statistics are performed on.
>>> That is, baseline correction is assumed to use the mean for a specific
>>> frequency and not a mean across frequencies.
>>>
>>>
>>>
>>> This leads to another point: when you are selecting a frequency range to
>>> do the non-parametric statistics on, you should not use 1-64 Hz as a
>>> whole but break it up based on the bands.
>>>
>>>
>>>
>>> Hope my interpretation of your point is correct. I sent it individually,
>>> as I wanted to ensure I followed your point.
>>>
>>>
>>>
>>> Cheers,
>>>
>>>
>>>
>>> Nick
>>>
>>>
>>>
>>>
>>>
>>> On 12 December 2016 at 08:23, Teresa Madsen <tmadsen at emory.edu> wrote:
>>>
>>> FieldTrippers,
>>>
>>>
>>>
>>> While analyzing my data for the annual Society for Neuroscience meeting,
>>> I developed a concern that was quickly validated by another poster (full
>>> abstract copied and linked below) focusing on the root of the problem:
>>>  neural oscillatory power is not normally distributed across time,
>>> frequency, or space.  The specific problem I had encountered was in
>>> baseline-correcting my experimental data, where, regardless of
>>> cfg.baselinetype, ft_freqbaseline depends on the mean power over time.
>>> However, I found that the distribution of raw power over time is so skewed
>>> that the mean was not a reasonable approximation of the central tendency of
>>> the baseline power, so it made most of my experimental data look like it
>>> had decreased power compared to baseline.  The more I think about it, the
>>> more I realize that averaging is everywhere in the way we analyze neural
>>> oscillations (across time points, frequency bins, electrodes, trials,
>>> subjects, etc.), and many of the standard statistics people use also rely
>>> on assumptions of normality.
>>>
>>>
>>>
>>> The most obvious solution for me was to log transform the data first, as
>>> it appears to be fairly log normal, and I always use log-scale
>>> visualizations anyway.  Erik Peterson, middle author on the poster, agreed
>>> that this would at least "restore (some) symmetry to the error
>>> distribution."  I used a natural log transform, sort of arbitrarily to
>>> differentiate from the standard decibel transform included in FieldTrip as
>>> cfg.baselinetype = 'db'.  The following figures compare the 2 distributions
>>> across several frequency bands (using power values from a wavelet
>>> spectrogram obtained from a baseline LFP recorded in rat prelimbic
>>> cortex).  The lines at the top represent the mean +/- one standard
>>> deviation for each frequency band, and you can see how those descriptive
>>> stats are much more representative of the actual distributions in the log
>>> scale.
>>>
>>>
>>>
>>>
>>>
>>>
>>> For my analysis, I also calculated a z-score on the log transformed
>>> power to assess how my experimental data compared to the variability of the
>>> noise in a long baseline recording from before conditioning, rather than a
>>> short pre-trial baseline period, since I find that more informative than
>>> any of FieldTrip's built-in baseline types.  I'm happy to share the custom
>>> functions I wrote for this if people think it would be a useful addition to
>>> FieldTrip.  I can also share more about my analysis and/or a copy of the
>>> poster, if anyone wants more detail - I just didn't want to make this email
>>> too big.
>>>
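A minimal sketch of that kind of normalization, in plain MATLAB, might look as follows. This is not the custom function mentioned in this thread; the array layout (freq x time), the variable names, and the synthetic values are all assumptions made for illustration.

```matlab
% Synthetic power values, freq x time (assumed layout, not real data)
nFreq = 3; nBase = 200; nTrial = 20;
t = (1:nBase) / nBase;
basePower  = exp(bsxfun(@times, (1:nFreq)', sin(2*pi*5*t)));  % long baseline recording
trialPower = 2 * basePower(:, 1:nTrial);                      % trial data: doubled power

% Baseline statistics are computed on log power, separately per frequency
logBase = log(basePower);
mu    = mean(logBase, 2);            % mean of log power over baseline time
sigma = std(logBase, 0, 2);          % standard deviation of log power over baseline time

% z-score of log-transformed trial power against the baseline distribution
zTrial = bsxfun(@rdivide, bsxfun(@minus, log(trialPower), mu), sigma);
```

Taking the mean and standard deviation after the log transform keeps both statistics on the scale where the values are closer to symmetric.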
>>>
>>>
>>> Mostly, I'm just hoping to start some discussion here as to how to
>>> address this.  I searched the wiki
>>> <http://www.fieldtriptoolbox.org/development/zscores>, listserv
>>> <https://mailman.science.ru.nl/pipermail/fieldtrip/2006-December/000773.html>
>>>  archives
>>> <https://mailman.science.ru.nl/pipermail/fieldtrip/2010-March/002718.html>,
>>> and bugzilla <http://bugzilla.fieldtriptoolbox.org/show_bug.cgi?id=1574> for
>>> anything related and came up with a few topics surrounding normalization
>>> and baseline correction, but only skirting this issue.  It seems important,
>>> so I want to find out whether others agree with my approach or already have
>>> other ways of avoiding the problem, and whether FieldTrip's code needs to
>>> be changed or just documentation added, or what?
>>>
>>>
>>>
>>> Thanks for any insights,
>>>
>>> Teresa
>>>
>>>
>>>
>>>
>>> 271.03 / LLL17 - Neural oscillatory power is not Gaussian distributed
>>> across time
>>> <http://www.abstractsonline.com/pp8/#!/4071/presentation/24150>
>>>
>>> *Authors*
>>>
>>> **L. IZHIKEVICH*, E. PETERSON, B. VOYTEK;
>>> Cognitive Sci., UCSD, San Diego, CA
>>>
>>> *Disclosures*
>>>
>>>  *L. Izhikevich:* None. *E. Peterson:* None. *B. Voytek:* None.
>>>
>>> *Abstract*
>>>
>>> Neural oscillations are important in organizing activity across the
>>> human brain in healthy cognition, while oscillatory disruptions are linked
>>> to numerous disease states. Oscillations are known to vary by frequency and
>>> amplitude across time and between different brain regions; however, this
>>> variability has never been well characterized. We examined human and animal
>>> EEG, LFP, MEG, and ECoG data from over 100 subjects to analyze the
>>> distribution of power and frequency across time, space and species. We
>>> report that between data types, subjects, frequencies, electrodes, and
>>> time, an inverse power law, or negative exponential distribution, is
>>> present in all recordings. This is contrary to, and not compatible with,
>>> the Gaussian noise assumption made in many digital signal processing
>>> techniques. The statistical assumptions underlying common algorithms for
>>> power spectral estimation, such as Welch's method, are being violated
>>> resulting in non-trivial misestimates of oscillatory power. Different
>>> statistical approaches are warranted.
>>>
>>>
>>>
>>> --
>>>
>>> Teresa E. Madsen, PhD
>>> Research Technical Specialist:  *in vivo *electrophysiology & data
>>> analysis
>>>
>>> Division of Behavioral Neuroscience and Psychiatric Disorders
>>> Yerkes National Primate Research Center
>>>
>>> Emory University
>>>
>>> Rainnie Lab, NSB 5233
>>> 954 Gatewood Rd. NE
>>> Atlanta, GA 30329
>>>
>>> (770) 296-9119
>>>
>>> braingirl at gmail.com
>>>
>>> https://www.linkedin.com/in/temadsen
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> fieldtrip mailing list
>>> fieldtrip at donders.ru.nl
>>> https://mailman.science.ru.nl/mailman/listinfo/fieldtrip
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Nicholas Peatfield, PhD
>>
>>
>>
>> --
>> Teresa E. Madsen, PhD
>> Division of Behavioral Neuroscience and Psychiatric Disorders
>> Yerkes National Primate Research Center
>> Emory University
>> Rainnie Lab, NSB 5233
>> 954 Gatewood Rd. NE
>> Atlanta, GA 30329
>> (770) 296-9119
>>
>>
>
>
>
> --
> Nicholas Peatfield, PhD
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 38279 bytes
Desc: not available
URL: <http://mailman.science.ru.nl/pipermail/fieldtrip/attachments/20161219/4524e156/attachment-0002.png>

