mef_tools#


mef_tools.io#

class mef_tools.io.MefReader(session_path, password2=None)#

Class to handle reading of MEF files.

__version__#

Version of the MefReader class.

Type:

str

property channels#

Returns a list of all channels present in the session.

Returns:

List of channels.

Return type:

list

close()#

Closes the MEF session.

get_annotations(channel=None)#

Returns annotations for a specified channel. If no channel is specified, returns annotations for all channels.

Parameters:

channel (str, optional) – Name of the channel. If not provided, method returns annotations for all channels.

Returns:

List of annotations.

Return type:

list

get_channel_info(channel=None)#

Returns information for a given channel. If no channel is specified, returns information for all channels.

Parameters:

channel (str, optional) – Name of the channel. If not provided, method returns info for all channels.

Returns:

Channel info or list of channel info.

Return type:

dict or list

get_data(channels, t_stamp1=None, t_stamp2=None)#

Returns processed data for specified channels and time stamps.

Parameters:
  • channels (int64, str, list, or numpy.ndarray) – Channels for which to return data.

  • t_stamp1 (int64, optional) – Start time stamp. If not provided, method uses the earliest time stamp.

  • t_stamp2 (int64, optional) – End time stamp. If not provided, method uses the latest time stamp.

Returns:

Array of processed data.

Return type:

numpy.ndarray

get_property(property_name, channel=None)#

Returns the specified property for a given channel. If no channel is specified, returns the property for all channels.

Parameters:
  • property_name (str) – Name of the property.

  • channel (str, optional) – Name of the channel. If not provided, method returns property for all channels.

Returns:

Property or list of properties.

Return type:

list or str

get_raw_data(channels, t_stamp1=None, t_stamp2=None)#

Returns raw data for specified channels and time stamps.

Parameters:
  • channels (int64, str, list, or numpy.ndarray) – Channels for which to return data.

  • t_stamp1 (int64, optional) – Start time stamp. If not provided, method uses the earliest time stamp.

  • t_stamp2 (int64, optional) – End time stamp. If not provided, method uses the latest time stamp.

Returns:

Array of raw data.

Return type:

numpy.ndarray

property properties#

Returns a list of all unique properties across all channels in the session.

Returns:

List of unique properties.

Return type:

list

class mef_tools.io.MefWriter(session_path, overwrite=False, password1=None, password2=None, verbose=False)#

MefWriter is a utility class for writing data in the MEF3 format. The class allows easy writing and appending of data to existing MEF3 files.

session_path#

The path of the MEF3 session to be written.

overwrite#

A boolean flag that if set to True, allows overwriting of existing files. Default is False.

password1#

The password for level 1 encryption. Default is None. This password is needed only when creating the session.

password2#

The password for level 2 encryption. Default is None. This password is required for any read/write operation of an existing session.

verbose#

A boolean flag that if set to True, enables verbose mode. Default is False.

property data_units#

Getter for the units of the data.

Returns:

The units of the data.

Return type:

str

get_mefblock_len(fs)#

Get the length of a MEF block.

Parameters:

fs (float) – Sampling frequency of the data.

Returns:

Length of the MEF block.

Return type:

int

property max_nans_written#

Getter for the maximum number of NaN values allowed to be written. NaNs that are written as values will be written as the maximum value of the data type. Recommended value is 0, which will not allow any NaN values to be written. The signal will be split into data blocks based on the NaN values. This might cause poor data compression if a lot of NaN segments are present in the data.

Returns:

The maximum number of NaN values allowed to be written.

Return type:

int

property mef_block_len#

Getter for the MEF block length. A longer MEF block gives better compression at the cost of higher memory usage.

Returns:

The MEF block length.

Return type:

int

property record_offset#

Getter for the offset of the record.

Returns:

The offset of the record.

Return type:

int

write_annotations(annotations, channel=None)#

Writes annotations to a session or channel. New annotations are created, or appended to existing ones, as needed. The input DataFrame must have the structure described below.

Parameters:
  • annotations (pandas.DataFrame) – DataFrame with the required columns: time [uUTC timestamp], type [str specified in pymef: Note or EDFA], text [str], and an optional duration [usec].

  • channel (str, optional) – If given, annotations are written at the channel level; otherwise they are written at the session level.

write_data(data_write, channel, start_uutc, sampling_freq, end_uutc=None, precision=None, new_segment=False, discont_handler=True, reload_metadata=True)#

General method for writing any data to the session. The method automatically handles both new channel data and appending to existing channel data. The discont_handler flag can be used to fragment the data into smaller intervals that are written in sequence, with NaN intervals skipped.

Parameters:
  • data_write (np.ndarray) – Data to be written. Data will be scaled and translated to int32 automatically if the precision parameter is not given.

  • channel (str) – name of the stored channel

  • start_uutc (int64) – uutc timestamp of the first sample

  • sampling_freq (float) – only 0.1 Hz resolution is tested

  • end_uutc (int, optional) – uUTC timestamp of the end of the data. If less data is provided than end_uutc - start_uutc implies, a NaN gap will be inserted into the data.

  • precision (int, optional) – Number of decimal places to preserve. Data are multiplied by 10**precision before writing, and the scale factor is stored in the metadata. Used for transforming data to int32; can be positive or 0 (no change in scale, only loss of decimals).

  • new_segment (bool, optional) – Whether a new MEF3 segment should be created.

  • discont_handler (bool, optional) – Disconnected segments will be stored in intervals if the gap in the data is larger than the max_nans_written property.

  • reload_metadata (bool, optional) – Default: True. Controls whether metadata are reloaded after new data are written; frequent calls of the write method on short signals can slow down writing. When False, appending is not protected by the end-time check, but writing is faster. Metadata are always reloaded when a new segment is created.

Returns:

out – True on success

Return type:

bool

mef_tools.io.check_data_integrity(original_data, converted_data, precision)#

Check the integrity of the original data against the converted data.

Parameters:
  • original_data (array-like) – The original data before conversion.

  • converted_data (array-like) – The data after conversion.

  • precision (int) – The precision used during the conversion process.

Returns:

result_bin – True if all close, else False.

Return type:

bool

Notes

This function checks the integrity of the original data against the converted data. It converts the converted data back to the original scale, excludes NaNs, and checks if the original and reconverted data are close to each other within a specified tolerance. The check is performed using numpy’s allclose function with a tolerance of 0.1^(precision-1).
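The check described in these notes can be sketched in plain Python. This is a minimal illustration and not the library's implementation: the helper name `is_conversion_close` is hypothetical, and it applies a purely absolute tolerance, whereas `numpy.allclose` also includes a relative term.

```python
import math


def is_conversion_close(original, converted, precision):
    """Hypothetical helper: compare original floats with their scaled integers."""
    tolerance = 0.1 ** (precision - 1)  # tolerance stated in the notes above
    scale = 10 ** precision
    for orig_value, conv_value in zip(original, converted):
        if math.isnan(orig_value):
            continue  # NaNs are excluded from the comparison
        restored = conv_value / scale  # back to the original scale
        if abs(orig_value - restored) > tolerance:
            return False
    return True


original = [1.234, float("nan"), -0.5]
print(is_conversion_close(original, [1234, 0, -500], precision=3))
print(is_conversion_close(original, [1300, 0, -500], precision=3))
```

The second call uses a deliberately wrong converted value to show the check failing.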

mef_tools.io.check_int32_dynamic_range(x_min, x_max, alpha)#

Checks whether the scaled range of the input values falls within the dynamic range of int32.

Parameters:
  • x_min (float or int) – The minimum value of the input.

  • x_max (float or int) – The maximum value of the input.

  • alpha (float or int) – The scaling factor applied to the input range.

Returns:

Returns True if the scaled range falls within the dynamic range of int32. Otherwise, returns False.

Return type:

bool

Notes

This function checks whether the input range, when scaled by a factor of alpha, falls within the dynamic range of the int32 datatype. If the scaled range exceeds the dynamic range of int32, the function returns False. If the scaled range falls within the dynamic range of int32, the function returns True.
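A rough sketch of such a range check is shown below. The name `fits_int32` is hypothetical, the sketch assumes a positive `alpha`, and the exact boundary handling in the library may differ.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_int32(x_min, x_max, alpha):
    """Hypothetical helper: True if [x_min, x_max] scaled by alpha fits in int32.

    Assumes alpha > 0, so scaling keeps x_min below x_max.
    """
    return INT32_MIN <= x_min * alpha and x_max * alpha <= INT32_MAX


print(fits_int32(-10, 10, 1000))    # small range, modest scaling
print(fits_int32(-1e9, 1e9, 1000))  # scaled range far exceeds int32
```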

mef_tools.io.convert_data_to_int32(data, precision=None)#

Converts the input data to int32 type, optionally scaling it by a specified factor.

Parameters:
  • data (array-like) – The input data.

  • precision (int, optional) – The scaling factor (expressed as a power of 10) to apply to the data. If not provided, it will be inferred using the infer_conversion_factor function.

Returns:

data_int32 – The input data converted to int32 type and scaled by the specified factor.

Return type:

ndarray

Notes

This function converts the input data to int32 type. If a scaling factor (precision) is provided, it is used to scale the data before conversion. If no scaling factor is provided, the function infers an optimal factor using the infer_conversion_factor function.

The data is first rounded to the specified number of decimal places, then multiplied by 10 to the power of the precision factor, and finally cast to int32 type.

If the specified precision is less than 0 or not an integer, a warning is printed and the precision is set to 0, meaning no scaling is applied.
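The round, scale, and cast steps can be illustrated without NumPy. The function below is a hypothetical stand-in, not the library code: it returns plain Python integers, does not check the int32 range, does not handle NaNs, and rounds the scaled value before casting (an assumption made here to avoid floating-point truncation surprises).

```python
def to_scaled_ints(values, precision):
    """Hypothetical helper mirroring the round -> scale -> cast steps."""
    if not isinstance(precision, int) or precision < 0:
        print("Warning: invalid precision, falling back to 0")
        precision = 0
    factor = 10 ** precision
    # Round to `precision` decimals, scale by 10**precision, then cast to int.
    return [int(round(round(v, precision) * factor)) for v in values]


print(to_scaled_ints([0.5, -1.25, 2.0], precision=3))
print(to_scaled_ints([1.6, -2.4], precision=-1))  # invalid precision -> 0
```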

mef_tools.io.create_pink_noise(fs, seg_len, low_bound, up_bound)#

Creates a pink noise signal.

Parameters:
  • fs (int) – Sampling frequency of the signal.

  • seg_len (int) – Length of the segment for which pink noise is to be generated.

  • low_bound (float) – Lower bound for the amplitude of the generated noise.

  • up_bound (float) – Upper bound for the amplitude of the generated noise.

Returns:

The generated pink noise signal.

Return type:

numpy.ndarray

Raises:

ValueError – If the requested segment length results in too many samples.

mef_tools.io.find_intervals_binary_vector(input_bin_vector, fs, start_uutc, samples_of_nans_allowed=None)#

Detects continuous intervals of ones in a binary vector and returns their start and stop times.

Parameters:
  • input_bin_vector (array-like) – The input binary vector.

  • fs (int) – The sampling frequency of the data.

  • start_uutc (int) – The start time of the data in microseconds since Unix Epoch.

  • samples_of_nans_allowed (int, optional) – The maximum number of consecutive zeros (NaNs) that are considered part of an interval. If not provided, it defaults to the sampling frequency.

Returns:

connected_detected_intervals – A DataFrame containing the start and stop times (in samples and microseconds) of the continuous intervals of ones in the input binary vector.

Return type:

DataFrame

Notes

This function processes a binary vector and detects continuous intervals of ones. It considers an interval to continue over a stretch of zeros (NaNs) if their number does not exceed a specified limit (samples_of_nans_allowed).

The function returns a DataFrame containing the start and stop times of each detected interval, both in number of samples and in microseconds since Unix Epoch.

The function first extends the input vector with a zero at both ends, then calculates the difference between consecutive elements. The positions where this difference equals 1 correspond to the starts of intervals of ones, while the positions where it equals -1 correspond to their ends. The function then merges intervals that are closer to each other than samples_of_nans_allowed and calculates the corresponding start and stop times.
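The pad, diff, and merge steps can be sketched in plain Python as follows. This is an illustration only: `find_intervals` is a hypothetical name, it returns tuples instead of a DataFrame, stop indices are exclusive, and merging gaps of at most `max_gap` samples is an assumption about the boundary condition.

```python
def find_intervals(bits, fs, start_uutc, max_gap):
    """Hypothetical helper: (start_sample, stop_sample, start_uutc, stop_uutc)."""
    padded = [0] + [int(b) for b in bits] + [0]
    diffs = [padded[i + 1] - padded[i] for i in range(len(padded) - 1)]
    starts = [i for i, d in enumerate(diffs) if d == 1]   # rising edges
    stops = [i for i, d in enumerate(diffs) if d == -1]   # falling edges (exclusive)

    merged = []
    for start, stop in zip(starts, stops):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], stop)  # bridge a short gap of zeros
        else:
            merged.append((start, stop))

    def to_uutc(sample):
        # Sample index -> microseconds since the Unix epoch
        return start_uutc + round(sample / fs * 1e6)

    return [(a, b, to_uutc(a), to_uutc(b)) for a, b in merged]


intervals = find_intervals([1, 1, 0, 1, 0, 0, 0, 1], fs=2,
                           start_uutc=1_000_000, max_gap=1)
for row in intervals:
    print(row)
```

With `max_gap=1`, the single zero at index 2 is bridged, while the run of three zeros splits the signal into two intervals.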

mef_tools.io.infer_conversion_factor(data)#

Infers the optimal conversion factor to scale the input data.

Parameters:

data (array-like) – The input data.

Returns:

precision – The optimal conversion factor for scaling the input data.

Return type:

int

Notes

This function infers the optimal conversion factor for scaling the input data to bring it within the dynamic range of int32. It initially calculates the mean of the absolute differences of the data and scales it up until the mean reaches a threshold value. Then it checks if the range of the scaled data falls within the dynamic range of int32, and if not, it reduces the scaling factor until the scaled data is within the dynamic range of int32.

If the input data has high dynamic range, this function might decrease the scaling factor to avoid saturation. In this case, a warning message will be printed indicating the decreased precision.
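The two-phase search described above (scale up until the mean step reaches a threshold, then scale down until the data fit into int32) might look roughly like this. Everything specific here is an assumption made for illustration: the name `infer_precision`, the threshold of 100, and the cap of 9 are not taken from the library.

```python
INT32_MAX = 2**31 - 1


def infer_precision(values, threshold=100.0, max_precision=9):
    """Hypothetical sketch; threshold and max_precision are assumed values."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_step = sum(diffs) / len(diffs)

    # Phase 1: increase precision until the mean step reaches the threshold.
    precision = 0
    while mean_step * 10**precision < threshold and precision < max_precision:
        precision += 1

    # Phase 2: decrease precision until the scaled peak fits into int32.
    peak = max(abs(v) for v in values)
    while precision > 0 and peak * 10**precision > INT32_MAX:
        precision -= 1
    return precision


print(infer_precision([0.001, 0.002, 0.004]))
print(infer_precision([1e8, 1e8 + 0.003, 1e8 + 0.006]))  # high dynamic range
```

The second call shows the case mentioned in the notes: small steps on top of a large offset force the precision back down to avoid saturation.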

mef_tools.io.scale_signal(data, a, b)#

Scales a signal to a specified range.

Parameters:
  • data (numpy.ndarray) – The input signal to scale.

  • a (float) – The lower bound of the desired range.

  • b (float) – The upper bound of the desired range.

Returns:

The input signal, rescaled to the range [a, b].

Return type:

numpy.ndarray

Notes

This function performs a linear transformation of the input data such that the minimum value becomes a and the maximum value becomes b.
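The linear transformation can be written out in a few lines. The sketch below uses plain Python lists and a hypothetical name, `rescale`; raising an error for a constant signal is a choice made here, not documented library behavior.

```python
def rescale(values, a, b):
    """Hypothetical helper: map min(values) to a and max(values) to b."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant signal")
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]


print(rescale([0, 5, 10], -1, 1))
```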

mef_tools.io.voss(nrows, ncols=32)#

Generates pink noise using the Voss-McCartney algorithm.

nrows: number of values to generate. ncols: number of random sources to add.

returns: NumPy array
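For orientation, the core idea of the Voss-McCartney approach is to sum several random sources that are refreshed at different rates. The following standard-library sketch is a simplified variant written for illustration; it is not the code behind `voss`, and the name `voss_mccartney_sketch` and the `seed` parameter are hypothetical.

```python
import random


def voss_mccartney_sketch(n_values, n_sources=16, seed=None):
    """Hypothetical sketch: sum of random sources refreshed at octave rates."""
    rng = random.Random(seed)
    sources = [rng.random() for _ in range(n_sources)]
    total = sum(sources)
    output = []
    for counter in range(1, n_values + 1):
        # The number of trailing zero bits selects which source to refresh,
        # so source k changes roughly once every 2**(k + 1) samples.
        k = (counter & -counter).bit_length() - 1
        if k < n_sources:
            new_value = rng.random()
            total += new_value - sources[k]
            sources[k] = new_value
        output.append(total)
    return output


noise = voss_mccartney_sketch(1000, n_sources=16, seed=0)
print(len(noise))
```

Slowly changing sources contribute low-frequency power and fast ones contribute high-frequency power, which gives the approximately 1/f spectrum.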

Readme#



# MEF_Tools

This package provides tools for easier [Multiscale Electrophysiology Format (MEF)](https://doi.org/10.1016%2Fj.jneumeth.2009.03.022) data saving and reading. See the example below and [documentation](https://mef-tools.readthedocs.io/en/latest/?badge=latest).

## Multiscale Electrophysiology Format (MEF)

[Multiscale Electrophysiology Format (MEF)](https://doi.org/10.1016%2Fj.jneumeth.2009.03.022) is a specialized file format designed for storing electrophysiological data. This format is capable of storing multiple channels of data in a single file, with each channel storing a time series of data points.

MEF is particularly useful for handling large volumes of electrophysiological data, as it employs a variety of techniques such as lossless and lossy compression, data encryption and data de-identification to make the storage and transmission of such data more efficient and secure.

Python’s pymef library provides a set of tools for working with MEF files, including reading from and writing to these files. Below are examples demonstrating the use of these tools.

## Dependencies

- [meflib](msel-source/meflib) - binaries are included in the pymef package
- [pymef](msel-source/pymef)
- [numpy](https://numpy.org/)
- [pandas](https://pandas.pydata.org/)

## Installation

See installation instructions [INSTALL.md](INSTALL.md).

## License

This software is licensed under the Apache-2.0 License. See [LICENSE](xmival00/MEF_Tools) file in the root directory of this project.

## Reference

  • Brinkmann BH, Bower MR, Stengel KA, Worrell GA, Stead M. Large-scale electrophysiology: acquisition, compression, encryption, and storage of big data. J Neurosci Methods. 2009;180(1):185‐192. doi:10.1016/j.jneumeth.2009.03.022

## Examples

First, we need to import the necessary libraries:

```python
import os
import time
import numpy as np
import pandas as pd
from mef_tools.io import MefWriter, MefReader, create_pink_noise
```

Next, we define the path to our MEF file, and the amount of data (in seconds) we want to write:

```python
session_name = 'session'
session_path = os.getcwd() + f'/{session_name}.mefd'
mef_session_path = session_path
secs_to_write = 30
```

We also need to specify the start and end times of our data in uUTC time. uUTC time is the number of microseconds since January 1, 1970, 00:00:00 UTC. We can use the [time](https://docs.python.org/3/library/time.html) library to convert between UTC time and other time formats. In this example, we will use the current time as the start time, and the start time plus the number of seconds we want to write as the end time:

```python
start_time = int(time.time() * 1e6)
end_time = int(start_time + 1e6*secs_to_write)
```
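When the recording start is known as a calendar date rather than "now", the conversion to and from uUTC can be done with the standard `datetime` module. The helper names below are hypothetical and not part of mef_tools; the sketch assumes timezone-aware datetimes.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def datetime_to_uutc(moment):
    """Hypothetical helper: timezone-aware datetime -> microseconds since epoch."""
    return (moment - EPOCH) // timedelta(microseconds=1)


def uutc_to_datetime(uutc):
    """Hypothetical helper: microseconds since epoch -> UTC datetime."""
    return EPOCH + timedelta(microseconds=uutc)


recording_start = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
start_uutc = datetime_to_uutc(recording_start)
print(start_uutc)
print(uutc_to_datetime(start_uutc))
```

Integer arithmetic on `timedelta` objects avoids the rounding that can occur when a floating-point timestamp is multiplied by 1e6.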

With our file path and timing details set, we can now create our MefWriter instance:

```python
pass1 = 'pass1'  # password needed for writing to file
pass2 = 'pass2'  # password needed for every read/write operation
Wrt = MefWriter(session_path, overwrite=True, password1=pass1, password2=pass2)
Wrt.max_nans_written = 0
Wrt.data_units = 'mV'
```

We then create some test data to write to our file:

```python
fs = 500
low_b = -10
up_b = 10
data_to_write = create_pink_noise(fs, secs_to_write, low_b, up_b)
```

This data is written to a channel in our MEF file:

```python
channel = 'channel_1'
precision = 3
Wrt.write_data(data_to_write, channel, start_time, fs, precision=precision)
```

## Appending Data to an Existing MEF File

To append data to an existing MEF file, we first need to create a new writer:

```python
secs_to_append = 5
discont_length = 3
# Start the appended data after a gap of discont_length seconds
append_time = end_time + int(discont_length*1e6)
append_end = append_time + int(1e6*secs_to_append)  # keep timestamps as integers
data = create_pink_noise(fs, secs_to_append, low_b, up_b)
Wrt2 = MefWriter(session_path, overwrite=False, password1=pass1, password2=pass2)
Wrt2.write_data(data, channel, append_time, fs)
```

## Creating a New Segment in the MEF File

To create a new segment, we simply need to change the new_segment flag to True:

```python
secs_to_write_seg2 = 10
gap_time = 3.36*1e6
newseg_time = int(append_end + gap_time)  # uUTC timestamps should be integers
newseg_end = newseg_time + int(1e6*secs_to_write_seg2)
data = create_pink_noise(fs, secs_to_write_seg2, low_b, up_b)
# Insert NaN gaps to simulate missing samples
data[30:540] = np.nan
data[660:780] = np.nan
Wrt2.write_data(data, channel, newseg_time, fs, new_segment=True)
```

We can also write data to a new channel with inferred precision:

```python
channel = 'channel_2'
Wrt2.write_data(data, channel, newseg_time, fs, new_segment=True)
```

## Writing Annotations to the MEF File

Annotations can also be added to the MEF file at both the session and channel levels. Here’s an example of how to do this:

```python
start_time = start_time
end_time = start_time + 1e6 * 300
offset = start_time - 1e6
starts = np.arange(start_time, end_time, 2e6)
text = ['test'] * len(starts)
types = ['Note'] * len(starts)
note_annotations = pd.DataFrame(data={'time': starts, 'text': text, 'type': types})
Wrt2.write_annotations(note_annotations)

starts = np.arange(start_time, end_time, 1e5)
text = ['test'] * len(starts)
types = ['EDFA'] * len(starts)
duration = [10025462] * len(starts)
note_annotations = pd.DataFrame(data={'time': starts, 'text': text, 'type': types, 'duration': duration})
Wrt2.write_annotations(note_annotations, channel=channel)
```

## Reading from an MEF File

In this example, we create a MefReader instance, print out the properties of the MEF file, and then read the first 10 seconds of data from each channel. The data from each channel is appended to a list.

```python
Reader = MefReader(session_path, password2=pass2)
signals = []

properties = Reader.properties
print(properties)

for channel in Reader.channels:
    start_time = Reader.get_property('start_time', channel)
    end_time = Reader.get_property('end_time', channel)
    x = Reader.get_data(channel, start_time, start_time+10*1e6)
    signals.append(x)
```