[FieldTrip] brainvision reading with ft_preprocessing slow

Sat Sep 7 13:06:20 CEST 2024

Hi Roy,

I didn’t study the relevant pieces of code in detail, but I would suspect that EEGLAB requires the same reading heuristics as FieldTrip’s brainvision reader. The only thing that I can see in the FieldTrip reader function for datafiles that are both ‘binary’ and ‘vectorized’ is that there is no memory pre-allocation for the ‘dat’ variable, which is read in a for-loop across channels. Re-allocation of memory in each iteration of a for-loop significantly slows down code execution, particularly if the matrices are big.
I suggest that you test this locally, by properly pre-allocating the memory for the dat variable (you can get inspired by the other sections of the code), and once you have verified that this would be a good improvement, we would be happy to receive a PR for this.
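
To make the pre-allocation idea concrete, here is a minimal sketch. It is not the actual FieldTrip reader code: the names nchan, nsamples and dat_grow are made up for illustration, and a constant vector stands in for the per-channel read from disk.

```matlab
% Sketch only: growing 'dat' inside the channel loop versus pre-allocating it.
% The right-hand side stands in for the data read from disk for one channel.
nchan    = 8;      % hypothetical number of channels
nsamples = 1000;   % hypothetical number of samples per channel

% Without pre-allocation: dat_grow gains one row per iteration, so the
% matrix has to be re-allocated and copied every time it grows.
dat_grow = [];
for ch = 1:nchan
  dat_grow(ch, :) = ch * ones(1, nsamples); %#ok<SAGROW>
end

% With pre-allocation: the full nchan x nsamples matrix is reserved once,
% and each iteration only fills in an existing row.
dat = zeros(nchan, nsamples);
for ch = 1:nchan
  dat(ch, :) = ch * ones(1, nsamples);
end
```

Both loops produce the same matrix; only the memory handling differs. Whether this accounts for the full slowdown would still have to be verified by timing the real reader with tic/toc.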

Best wishes,
Jan-Mathijs

On 6 Sep 2024, at 15:51, Roy Cox via fieldtrip <fieldtrip at science.ru.nl> wrote:

hi Konstantinos,

thanks, tried your suggestion, but using ft_definetrial is just as slow as regular ft_preprocessing when using trialdef.triallength = Inf, and even slower when setting trialdef.triallength = 1 (or some other number).

additional testing showed that the slow reading only occurs when the Brainvision subformat (data orientation) is "VECTORIZED". when the Brainvision subformat is "MULTIPLEXED", FieldTrip's reading is much faster and on par with eeglab's import functionality. guessing that "MULTIPLEXED" orientation prevents FieldTrip from reading in blockwise in the first place. so as a workaround, I could ensure that all our Brainvision data is multiplexed to begin with (but many other users might not have that option).
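
For reference, a small sketch of how the two orientations differ on disk, using in-memory vectors instead of real files. This is not FieldTrip's code; all variable names are made up for illustration, and whether this layout difference explains the slowdown is speculation.

```matlab
% Sketch only: the same small data set stored in the two orientations.
nchan    = 3;
nsamples = 4;
dat = reshape(1:nchan*nsamples, nchan, nsamples);   % channels x samples

% MULTIPLEXED: ch1 s1, ch2 s1, ch3 s1, ch1 s2, ... (sample after sample)
multiplexed = dat(:)';
% VECTORIZED: all samples of ch1, then all samples of ch2, ...
vectorized  = reshape(dat', 1, []);

% Read samples 2..3 of all channels from each layout.
begsample = 2;
endsample = 3;
n = endsample - begsample + 1;

% Multiplexed: the requested block is one contiguous chunk.
idx     = (begsample-1)*nchan + (1:n*nchan);
blk_mux = reshape(multiplexed(idx), nchan, n);

% Vectorized: one chunk per channel, i.e. one seek and one read per channel.
blk_vec = zeros(nchan, n);                  % pre-allocated
for ch = 1:nchan
  offset = (ch-1)*nsamples + begsample - 1; % start of this channel's chunk
  blk_vec(ch, :) = vectorized(offset + (1:n));
end
```

Both blk_mux and blk_vec equal dat(:, 2:3), but the vectorized case needs a loop over channels to get there.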

while I see the potential advantages of reading data blockwise when not all data is required for further processing, I think that loading complete continuous data into memory is a very typical use case that I'm surprised is not supported. At least for vectorized Brainvision data, blockwise reading is tremendously slowing down rather than speeding up the pipeline (and curious whether this affects other data formats too...)

thanks again for your input!
Roy

On Fri, Sep 6, 2024 at 3:28 AM Konstantinos Tsilimparis via fieldtrip <fieldtrip at science.ru.nl> wrote:
Hi Roy,

I believe you’re correct that FieldTrip reads data in blocks. For example, it processes the first 5 seconds from all 64 EEG channels, then moves on to the next 5 seconds, and so forth. From my experience with MEG data, reading the data in blocks does seem to improve the speed of data importing. Unfortunately, I’m not as familiar with how EEGLAB handles data imports.

One advantage of FieldTrip that might speed up data reading is that it allows you to selectively read the raw data according to pre-defined trial triggers (see https://www.fieldtriptoolbox.org/tutorial/preprocessing/). In contrast, in EEGLAB the whole dataset has to be loaded into memory before it can be processed.

Given that sleep data often requires loading continuous data without triggers, one potential solution could be to segment the continuous data into one-second "fake" trials while reading from disk (see: https://www.fieldtriptoolbox.org/tutorial/continuous/#segmenting-continuous-data-into-one-second-pieces). This is just a thought, and I’m not sure if it will speed up your analysis in the end, but it might be worth a try!
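
As a rough sketch of what such one-second segments look like, the following builds a trial definition matrix by hand, with one row per segment holding begin sample, end sample and offset. The sampling rate and recording length are hypothetical values; the tutorial linked above shows the documented way of doing this.

```matlab
% Sketch only: a FieldTrip-style trl matrix for consecutive 1-second segments.
fsample  = 500;     % hypothetical sampling rate in Hz
nsamples = 2750;    % hypothetical recording length in samples

seglen = round(1 * fsample);          % samples per one-second segment
nseg   = floor(nsamples / seglen);    % an incomplete last segment is dropped

begsample = (0:nseg-1)' * seglen + 1; % first sample of each segment
endsample = begsample + seglen - 1;   % last sample of each segment
offset    = zeros(nseg, 1);           % no shift of the time axis

trl = [begsample endsample offset];   % nseg x 3
```

A matrix like this could be passed as cfg.trl to ft_preprocessing, so that the data is read segment by segment.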

Best,
Konstantinos

From: fieldtrip <fieldtrip-bounces at science.ru.nl> On Behalf Of Roy Cox via fieldtrip
Sent: Tuesday, September 3, 2024 1:25 PM
To: FieldTrip discussion list <fieldtrip at science.ru.nl>
Cc: Roy Cox <roycox.roycox at gmail.com>
Subject: [FieldTrip] brainvision reading with ft_preprocessing slow

hi all,

I noticed that FieldTrip's ft_preprocessing is extremely slow reading in our Brainvision files (high-density sleep, about 8 GB). I've compared it to eeglab's import tool for Brainvision (bva-io1.71)

%---eeglab (all FieldTrip paths removed)
tic
[bv_folder,bv_file]=fileparts(bf_file);
EEG=pop_loadbv(bv_folder,[bv_file '.vhdr']);
t_eeglab=toc

t_eeglab =

    40.3622

%--fieldtrip (all EEGLAB paths removed)
tic
cfg=[];
cfg.dataset=bf_file;
cfg.continuous = 'yes';
cfg.readbids ='no';
data_ft=ft_preprocessing(cfg);
t_ft=toc

t_ft =

  458.0801

So that's more than tenfold slower.

One clue that may be of help: as we keep the data on a remote server I can monitor the ethernet throughput. Whereas eeglab's reading function quickly jumps to 1 Mbps (presumably fetching the file with all available bandwidth), FieldTrip's reading operation barely leads to an observable increase in traffic. Without knowing anything, I would speculate that data is fetched piecewise, e.g., by channel or block, leading to noticeable slowing for larger files.

Is there anything that can be done to speed up reading data?

Regards,
Roy
_______________________________________________
fieldtrip mailing list
https://mailman.science.ru.nl/mailman/listinfo/fieldtrip
https://doi.org/10.1371/journal.pcbi.1002202
