[FieldTrip] brainvision reading with ft_preprocessing slow
Roy Cox
roycox.roycox at gmail.com
Mon Sep 9 15:48:34 CEST 2024
hi all,
it was a preallocation issue in read_brainvision_eeg.m (specifically for
binary-vectorized files). should now be fixed.
Roy
On Sat, Sep 7, 2024 at 1:13 PM Schoffelen, J.M. (Jan Mathijs) via fieldtrip
<fieldtrip at science.ru.nl> wrote:
> Hi Roy,
>
> I didn’t study the relevant pieces of code in detail, but I would suspect
> that EEGLAB requires the same reading heuristics as Fieldtrip’s brainvision
> reader. The only thing that I can see in the Fieldtrip reader function for
> datafiles that are both ‘binary’ and ‘vectorized’, is that there is no
> memory pre-allocation for the ‘dat’ variable, which is read in a for-loop
> across channels. Re-allocation of memory in each iteration of a for-loop
> significantly slows down code execution, particularly if the matrices are
> big.
> I suggest that you test this locally, by properly pre-allocating the
> memory for the dat variable (you can get inspired by the other sections of
> the code), and once you have verified that this would be a good
> improvement, we would be happy to receive a PR for this.
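For illustration, the pattern described above can be sketched as follows (the sizes and the read_one_channel function are hypothetical stand-ins, not the actual FieldTrip code):

```matlab
% Growing a matrix inside a loop forces MATLAB to re-allocate and copy
% the array on every iteration; pre-allocating avoids this entirely.
% (Hypothetical sizes; read_one_channel is a stand-in for the
% per-channel read in read_brainvision_eeg.m.)

nchan    = 64;    % channels in the recording
nsamples = 1e6;   % samples per channel

% slow: dat grows by one row per channel, re-allocating each time
dat = [];
for i = 1:nchan
  dat = [dat; read_one_channel(i)]; %#ok<AGROW>
end

% fast: allocate the full matrix once, then fill it in place
dat = zeros(nchan, nsamples);
for i = 1:nchan
  dat(i, :) = read_one_channel(i);
end
```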
>
> Best wishes,
> Jan-Mathijs
>
>
> On 6 Sep 2024, at 15:51, Roy Cox via fieldtrip <fieldtrip at science.ru.nl>
> wrote:
>
> hi Konstantinos,
>
> thanks, tried your suggestion, but using ft_definetrial is just as slow as
> regular ft_preprocessing when using trialdef.triallength = Inf, and even
> slower when setting trialdef.triallength = 1 (or some other number).
>
> additional testing showed that the slow reading only occurs when the
> Brainvision subformat (data orientation) is "VECTORIZED". when the
> Brainvision subformat is "MULTIPLEXED", FieldTrip's reading is much faster
> and on par with eeglab's import functionality. guessing that the "MULTIPLEXED"
> orientation prevents FieldTrip from reading blockwise in the first
> place. so as a workaround, I could ensure that all our Brainvision data is
> multiplexed to begin with (but many other users might not have that option).
>
> while I see the potential advantages of reading data blockwise when not
> all data is required for further processing, I think that loading complete
> continuous data into memory is a very typical use case that I'm surprised
> is not supported. At least for vectorized Brainvision data, blockwise
> reading tremendously slows down rather than speeds up the pipeline
> (and I'm curious whether this affects other data formats too...)
>
> thanks again for your input!
> Roy
>
> On Fri, Sep 6, 2024 at 3:28 AM Konstantinos Tsilimparis via fieldtrip <
> fieldtrip at science.ru.nl> wrote:
>
>> Hi Roy,
>>
>>
>>
>> I believe you’re correct that FieldTrip reads data in blocks. For
>> example, it processes the first 5 seconds from all 64 EEG channels, then
>> moves on to the next 5 seconds, and so forth. From my experience with MEG
>> data, reading the data in blocks does seem to improve the speed of data
>> importing. Unfortunately, I’m not as familiar with how EEGLAB handles data
>> imports.
>>
>>
>>
>> One advantage of FieldTrip that might speed up data reading is that it
>> allows you to selectively read the raw data according to pre-defined trial
>> triggers (see https://www.fieldtriptoolbox.org/tutorial/preprocessing/).
>> In contrast, in EEGLAB the whole dataset has to be loaded into memory
>> before it can be processed.
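Trigger-based reading along those lines looks roughly like this (the dataset name and marker values are placeholders; see the linked preprocessing tutorial for details):

```matlab
% Sketch: read only the segments around predefined triggers, so the
% full file never has to be loaded. 'Stimulus'/'S  1' are placeholder
% BrainVision marker names; adjust to your recording.
cfg = [];
cfg.dataset             = 'mydata.vhdr';
cfg.trialdef.eventtype  = 'Stimulus';
cfg.trialdef.eventvalue = 'S  1';
cfg.trialdef.prestim    = 0.5;  % seconds before each trigger
cfg.trialdef.poststim   = 1.0;  % seconds after each trigger
cfg  = ft_definetrial(cfg);     % builds the trial matrix cfg.trl
data = ft_preprocessing(cfg);   % reads only the defined segments
```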
>>
>>
>>
>> Given that sleep data often requires loading continuous data without
>> triggers, one potential solution could be to segment the continuous data
>> into one-second “fake” trials while reading from disk (see:
>> https://www.fieldtriptoolbox.org/tutorial/continuous/#segmenting-continuous-data-into-one-second-pieces).
>> This is just a thought, and I’m not sure if it will speed up your analysis
>> in the end, but it might be worth a try!
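A rough sketch of that suggestion (the dataset name is a placeholder; see the linked continuous-data tutorial):

```matlab
% Sketch: cut trigger-less continuous data into 1-s "fake" trials
% while reading from disk ('mydata.vhdr' is a placeholder).
cfg = [];
cfg.dataset              = 'mydata.vhdr';
cfg.trialdef.triallength = 1;    % seconds per segment
cfg.trialdef.ntrials     = Inf;  % take as many segments as fit
cfg  = ft_definetrial(cfg);
data = ft_preprocessing(cfg);
```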
>>
>>
>>
>> Best,
>>
>> Konstantinos
>>
>>
>>
>>
>>
>> *From:* fieldtrip <fieldtrip-bounces at science.ru.nl> *On Behalf Of *Roy
>> Cox via fieldtrip
>> *Sent:* Tuesday, September 3, 2024 1:25 PM
>> *To:* FieldTrip discussion list <fieldtrip at science.ru.nl>
>> *Cc:* Roy Cox <roycox.roycox at gmail.com>
>> *Subject:* [FieldTrip] brainvision reading with ft_preprocessing slow
>>
>>
>>
>> hi all,
>>
>>
>>
>> I noticed that FieldTrip's ft_preprocessing is extremely slow reading in
>> our Brainvision files (high-density sleep, about 8 GB). I've compared it to
>> eeglab's import tool for Brainvision (bva-io 1.71).
>>
>>
>>
>> %---eeglab (all FieldTrip paths removed)
>>
>> tic
>> [bv_folder,bv_file]=fileparts(bf_file);
>> EEG=pop_loadbv(bv_folder,[bv_file '.vhdr']);
>> t_eeglab=toc
>>
>>
>>
>> t_eeglab =
>>
>> 40.3622
>>
>>
>>
>> %--fieldtrip (all EEGLAB paths removed)
>>
>> tic
>> cfg=[];
>> cfg.dataset=bf_file;
>> cfg.continuous = 'yes';
>> cfg.readbids ='no';
>> data_ft=ft_preprocessing(cfg);
>> t_ft=toc
>>
>>
>>
>> t_ft =
>>
>> 458.0801
>>
>>
>>
>> So that's tenfold slower.
>>
>>
>>
>> One clue that may be of help: as we keep the data on a remote server I
>> can monitor the ethernet throughput. Whereas eeglab's reading function
>> quickly jumps to 1 Mbps (presumably fetching the file with all available
>> bandwidth), Fieldtrip's reading operation barely leads to an observable
>> increase in traffic. Without knowing anything, I would speculate that data
>> is fetched piecewise, e.g., by channel or block, leading to noticeable
>> slowing for larger files.
>>
>>
>>
>> Is there anything that can be done to speed up reading data?
>>
>>
>>
>> Regards,
>>
>> Roy
>> _______________________________________________
>> fieldtrip mailing list
>> https://mailman.science.ru.nl/mailman/listinfo/fieldtrip
>> https://doi.org/10.1371/journal.pcbi.1002202
>>