mef_tools#
mef_tools.io#
- class mef_tools.io.MefReader(session_path, password2=None)#
Class to handle reading of MEF files.
- __version__#
Version of the MefReader class.
- Type:
str
- property channels#
Returns a list of all channels present in the session.
- Returns:
List of channels.
- Return type:
list
- close()#
Closes the MEF session.
- get_annotations(channel=None)#
Returns annotations for a specified channel. If no channel is specified, returns annotations for all channels.
- Parameters:
channel (str, optional) – Name of the channel. If not provided, method returns annotations for all channels.
- Returns:
List of annotations.
- Return type:
list
- get_channel_info(channel=None)#
Returns information for a given channel. If no channel is specified, returns information for all channels.
- Parameters:
channel (str, optional) – Name of the channel. If not provided, method returns info for all channels.
- Returns:
Channel info or list of channel info.
- Return type:
dict or list
- get_data(channels, t_stamp1=None, t_stamp2=None)#
Returns processed data for specified channels and time stamps.
- Parameters:
channels (int64, str, list, or numpy.ndarray) – Channels for which to return data.
t_stamp1 (int64, optional) – Start time stamp. If not provided, method uses the earliest time stamp.
t_stamp2 (int64, optional) – End time stamp. If not provided, method uses the latest time stamp.
- Returns:
Array of processed data.
- Return type:
numpy.ndarray
- get_property(property_name, channel=None)#
Returns the specified property for a given channel. If no channel is specified, returns the property for all channels.
- Parameters:
property_name (str) – Name of the property.
channel (str, optional) – Name of the channel. If not provided, method returns property for all channels.
- Returns:
Property or list of properties.
- Return type:
list or str
- get_raw_data(channels, t_stamp1=None, t_stamp2=None)#
Returns raw data for specified channels and time stamps.
- Parameters:
channels (int64, str, list, or numpy.ndarray) – Channels for which to return data.
t_stamp1 (int64, optional) – Start time stamp. If not provided, method uses the earliest time stamp.
t_stamp2 (int64, optional) – End time stamp. If not provided, method uses the latest time stamp.
- Returns:
Array of raw data.
- Return type:
numpy.ndarray
- property properties#
Returns a list of all unique properties across all channels in the session.
- Returns:
List of unique properties.
- Return type:
list
- class mef_tools.io.MefWriter(session_path, overwrite=False, password1=None, password2=None, verbose=False)#
MefWriter is a utility class for writing data in the MEF3 format. The class allows easy writing and appending of data to existing MEF3 files.
- session_path#
The path of the MEF3 session to be written.
- overwrite#
A boolean flag that if set to True, allows overwriting of existing files. Default is False.
- password1#
The password for level 1 encryption. Default is None. This password is needed only when creating the session.
- password2#
The password for level 2 encryption. Default is None. This password is required for any read/write operation of an existing session.
- verbose#
A boolean flag that if set to True, enables verbose mode. Default is False.
- property data_units#
Getter for the units of the data.
- Returns:
The units of the data.
- Return type:
str
- get_mefblock_len(fs)#
Get the length of a MEF block.
- Parameters:
fs (float) – Sampling frequency of the data.
- Returns:
Length of the MEF block.
- Return type:
int
- property max_nans_written#
Getter for the maximum number of NaN values allowed to be written. NaNs that are written as values will be written as the maximum value of the data type. Recommended value is 0, which will not allow any NaN values to be written. The signal will be split into data blocks based on the NaN values. This might cause poor data compression if a lot of NaN segments are present in the data.
- Returns:
The maximum number of NaN values allowed to be written.
- Return type:
int
- property mef_block_len#
Getter for the MEF block length. A larger MEF block length gives better compression but higher memory usage.
- Returns:
The MEF block length.
- Return type:
int
- property record_offset#
Getter for the offset of the record.
- Returns:
The offset of the record.
- Return type:
int
- write_annotations(annotations, channel=None)#
Writes annotations to a session or channel. Handles both new annotations and appending to existing annotations. The input data must have the required structure.
- Parameters:
annotations (pandas.DataFrame) – DataFrame with the columns: time [uUTC timestamp], type [str, one of the types specified in pymef: Note or EDFA], text [str], and an optional duration [usec].
channel (str, optional) – Name of the channel. If provided, annotations are written at the channel level.
- write_data(data_write, channel, start_uutc, sampling_freq, end_uutc=None, precision=None, new_segment=False, discont_handler=True, reload_metadata=True)#
General method for writing any data to the session. It automatically handles both new channel data and appending to existing channel data. The discont_handler flag can be used to fragment the data into smaller intervals that are written in sequence, with NaN intervals skipped.
- Parameters:
data_write (np.ndarray) – Data to be written. The data are scaled and translated to int32 automatically if the precision parameter is not given.
channel (str) – Name of the stored channel.
start_uutc (int64) – uUTC timestamp of the first sample.
sampling_freq (float) – Sampling frequency. Only 0.1 Hz resolution is tested.
end_uutc (int, optional) – uUTC timestamp of the end of the data. If less data is provided than end_uutc - start_uutc implies, a NaN gap is inserted into the data.
precision (int, optional) – Number of decimal places to preserve. Data are multiplied by 10**precision before writing, and the scale factor is stored in the metadata. Used for transforming data to int32; can be positive, or 0 for no change in scale (only the decimals are lost).
new_segment (bool, optional) – Whether a new MEF3 segment should be created.
discont_handler (bool, optional) – If True, disconnected segments are stored as separate intervals when the gap in the data is longer than the max_nans_written property.
reload_metadata (bool, optional) – Default: True. Controls reloading of metadata after writing new data. Frequent calls to the write method on short signals can slow down writing. When False, appending is not protected by the end-time check, but writing is faster. Metadata are always reloaded when a new segment is created.
- Returns:
out – True on success
- Return type:
bool
- mef_tools.io.check_data_integrity(original_data, converted_data, precision)#
Check the integrity of the original data against the converted data.
- Parameters:
original_data (array-like) – The original data before conversion.
converted_data (array-like) – The data after conversion.
precision (int) – The precision used during the conversion process.
- Returns:
result_bin – True if all close, else False.
- Return type:
bool
Notes
This function checks the integrity of the original data against the converted data. It converts the converted data back to the original scale, excludes NaNs, and checks if the original and reconverted data are close to each other within a specified tolerance. The check is performed using numpy’s allclose function with a tolerance of 0.1^(precision-1).
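The check described in these notes can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: `roundtrip_ok` is a hypothetical helper, it applies only an absolute tolerance (numpy's allclose also has a relative term), and it assumes the converted values were produced by multiplying by 10**precision.

```python
import math


def roundtrip_ok(original, converted, precision):
    """Hypothetical stand-in for the integrity check described above."""
    scale = 10 ** precision
    tol = 0.1 ** (precision - 1)  # tolerance quoted in the notes
    for orig, conv in zip(original, converted):
        if math.isnan(orig):
            continue  # NaNs are excluded from the comparison
        # Bring the converted value back to the original scale and compare.
        if abs(orig - conv / scale) > tol:
            return False
    return True


original = [0.1234, float("nan"), -2.5]
# Values scaled by 10**3; the NaN slot holds an arbitrary placeholder.
converted = [123, 2147483647, -2500]
print(roundtrip_ok(original, converted, precision=3))
```

With precision=3 the tolerance is 0.01, so the loss of the fourth decimal in the first sample is accepted.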
- mef_tools.io.check_int32_dynamic_range(x_min, x_max, alpha)#
Checks whether the scaled range of the input values falls within the dynamic range of int32.
- Parameters:
x_min (float or int) – The minimum value of the input.
x_max (float or int) – The maximum value of the input.
alpha (float or int) – The scaling factor applied to the input range.
- Returns:
Returns True if the scaled range falls within the dynamic range of int32. Otherwise, returns False.
- Return type:
bool
Notes
This function checks whether the input range, when scaled by a factor of alpha, falls within the dynamic range of the int32 datatype. If the scaled range exceeds the dynamic range of int32, the function returns False. If the scaled range falls within the dynamic range of int32, the function returns True.
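A minimal sketch of such a range check follows. The helper name `fits_int32` is hypothetical, and the sketch assumes that both scaled bounds are compared against the int32 limits; the library may implement the comparison differently.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def fits_int32(x_min, x_max, alpha):
    """Hypothetical check: do both scaled bounds fit into int32?"""
    return INT32_MIN <= x_min * alpha and x_max * alpha <= INT32_MAX


# A +/-10 signal scaled by 10**3 fits easily; scaled by 10**9 it does not.
print(fits_int32(-10, 10, 10**3))
print(fits_int32(-10, 10, 10**9))
```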
- mef_tools.io.convert_data_to_int32(data, precision=None)#
Converts the input data to int32 type, optionally scaling it by a specified factor.
- Parameters:
data (array-like) – The input data.
precision (int, optional) – The scaling factor (expressed as a power of 10) to apply to the data. If not provided, it will be inferred using the infer_conversion_factor function.
- Returns:
data_int32 – The input data converted to int32 type and scaled by the specified factor.
- Return type:
ndarray
Notes
This function converts the input data to int32 type. If a scaling factor (precision) is provided, it is used to scale the data before conversion. If no scaling factor is provided, the function infers an optimal factor using the infer_conversion_factor function.
The data is first rounded to the specified number of decimal places, then multiplied by 10 to the power of the precision factor, and finally cast to int32 type.
If the specified precision is less than 0 or not an integer, a warning is printed and the precision is set to 0, meaning no scaling is applied.
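The round, scale, and cast steps can be sketched without numpy. `to_int32_sketch` is a hypothetical name, and the sketch makes simplifying assumptions: it takes an explicit precision (no inference), does no int32 range checking or NaN handling, and rounds the scaled product to guard against floating-point representation error, whereas a plain cast would truncate.

```python
def to_int32_sketch(values, precision=0):
    """Hypothetical illustration of the scaling steps described above."""
    if not isinstance(precision, int) or precision < 0:
        print("Warning: invalid precision, falling back to 0 (no scaling)")
        precision = 0
    scale = 10 ** precision
    # Round to `precision` decimals, scale by 10**precision, then make integers.
    return [int(round(round(v, precision) * scale)) for v in values]


print(to_int32_sketch([1.2344, -0.5004], precision=3))
```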
- mef_tools.io.create_pink_noise(fs, seg_len, low_bound, up_bound)#
Creates a pink noise signal.
- Parameters:
fs (int) – Sampling frequency of the signal.
seg_len (int) – Length of the segment for which pink noise is to be generated.
low_bound (float) – Lower bound for the amplitude of the generated noise.
up_bound (float) – Upper bound for the amplitude of the generated noise.
- Returns:
The generated pink noise signal.
- Return type:
numpy.ndarray
- Raises:
ValueError – If the requested segment length results in too many samples.
- mef_tools.io.find_intervals_binary_vector(input_bin_vector, fs, start_uutc, samples_of_nans_allowed=None)#
Detects continuous intervals of ones in a binary vector and returns their start and stop times.
- Parameters:
input_bin_vector (array-like) – The input binary vector.
fs (int) – The sampling frequency of the data.
start_uutc (int) – The start time of the data in microseconds since Unix Epoch.
samples_of_nans_allowed (int, optional) – The maximum number of consecutive zeros (NaNs) that are considered part of an interval. If not provided, it defaults to the sampling frequency.
- Returns:
connected_detected_intervals – A DataFrame containing the start and stop times (in samples and microseconds) of the continuous intervals of ones in the input binary vector.
- Return type:
DataFrame
Notes
This function processes a binary vector and detects continuous intervals of ones. It considers an interval to continue over a stretch of zeros (NaNs) if their number does not exceed a specified limit (samples_of_nans_allowed).
The function returns a DataFrame containing the start and stop times of each detected interval, both in number of samples and in microseconds since Unix Epoch.
The function first extends the input vector with a zero at both ends, then calculates the difference between consecutive elements. The positions where this difference equals 1 correspond to the starts of intervals of ones, while the positions where it equals -1 correspond to their ends. The function then merges intervals that are closer to each other than samples_of_nans_allowed and calculates the corresponding start and stop times.
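The pad, difference, and merge steps can be sketched in plain Python. The names `find_intervals` and `to_uutc` are hypothetical, the result is a list of tuples rather than a DataFrame, stop indices are exclusive, and gaps are merged when they are at most `max_gap` samples long; the library's exact boundary conventions may differ.

```python
def find_intervals(bits, max_gap=0):
    """Return (start, stop) sample indices of runs of ones; stop is exclusive."""
    padded = [0] + [1 if b else 0 for b in bits] + [0]
    diffs = [padded[i + 1] - padded[i] for i in range(len(padded) - 1)]
    starts = [i for i, d in enumerate(diffs) if d == 1]   # 0 -> 1 transitions
    stops = [i for i, d in enumerate(diffs) if d == -1]   # 1 -> 0 transitions
    merged = []
    for start, stop in zip(starts, stops):
        # Bridge gaps of zeros that are short enough to tolerate.
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], stop)
        else:
            merged.append((start, stop))
    return merged


def to_uutc(sample_index, fs, start_uutc):
    """Convert a sample index to microseconds since the Unix epoch."""
    return start_uutc + int(round(sample_index * 1e6 / fs))


vector = [1, 1, 0, 1, 0, 0, 0, 1, 1]
print(find_intervals(vector, max_gap=1))
```

Here the single zero at index 2 is bridged, while the three zeros at indices 4 to 6 split the signal into two intervals.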
- mef_tools.io.infer_conversion_factor(data)#
Infers the optimal conversion factor to scale the input data.
- Parameters:
data (array-like) – The input data.
- Returns:
precision – The optimal conversion factor for scaling the input data.
- Return type:
int
Notes
This function infers the optimal conversion factor for scaling the input data to bring it within the dynamic range of int32. It initially calculates the mean of the absolute differences of the data and scales it up until the mean reaches a threshold value. Then it checks if the range of the scaled data falls within the dynamic range of int32, and if not, it reduces the scaling factor until the scaled data is within the dynamic range of int32.
If the input data has high dynamic range, this function might decrease the scaling factor to avoid saturation. In this case, a warning message will be printed indicating the decreased precision.
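The two phases described in these notes can be sketched as follows. This is not the library's implementation: `infer_precision_sketch` is a hypothetical name, and the `threshold` and `max_precision` values are assumptions chosen for illustration, since the notes do not state the actual threshold.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def infer_precision_sketch(values, threshold=1000.0, max_precision=15):
    """Hypothetical two-phase search for a power-of-10 scaling exponent."""
    if len(values) < 2:
        raise ValueError("need at least two samples")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)

    # Phase 1: raise the exponent until the scaled mean step reaches the threshold.
    precision = 0
    while mean_diff * 10**precision < threshold and precision < max_precision:
        precision += 1

    # Phase 2: lower it again while the scaled range would overflow int32.
    lo, hi = min(values), max(values)
    while precision > 0 and not (
        INT32_MIN <= lo * 10**precision and hi * 10**precision <= INT32_MAX
    ):
        precision -= 1
    return precision


print(infer_precision_sketch([0.0, 1.0, 0.0, 1.0]))
```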
- mef_tools.io.scale_signal(data, a, b)#
Scales a signal to a specified range.
- Parameters:
data (numpy.ndarray) – The input signal to scale.
a (float) – The lower bound of the desired range.
b (float) – The upper bound of the desired range.
- Returns:
The input signal, rescaled to the range [a, b].
- Return type:
numpy.ndarray
Notes
This function performs a linear transformation of the input data such that the minimum value becomes a and the maximum value becomes b.
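The linear transformation maps each sample x to a + (x - min) * (b - a) / (max - min). A plain-Python sketch, using the hypothetical name `rescale` and operating on a list rather than a numpy array:

```python
def rescale(values, a, b):
    """Linearly map values so that min(values) -> a and max(values) -> b."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant signal cannot be rescaled to a range")
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]


print(rescale([0, 5, 10], -1, 1))
```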
- mef_tools.io.voss(nrows, ncols=32)#
Generates pink noise using the Voss-McCartney algorithm.
- Parameters:
nrows – Number of values to generate.
ncols – Number of random sources to add.
- Return type:
numpy.ndarray
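For orientation, one common form of the Voss-McCartney idea is sketched below in plain Python. It is an assumption-laden illustration rather than the code behind `voss`: the name `voss_mccartney`, the `seed` parameter, and the rule for choosing which source to refresh (the lowest set bit of the sample counter) are choices made for this sketch.

```python
import random


def voss_mccartney(n, n_sources=16, seed=None):
    """Sum of random sources, where source k is refreshed every 2**(k+1) samples."""
    rng = random.Random(seed)
    sources = [rng.random() for _ in range(n_sources)]
    total = sum(sources)
    out = []
    for i in range(1, n + 1):
        # The position of the lowest set bit of the counter picks one source,
        # so low-index sources change often and high-index sources rarely.
        k = (i & -i).bit_length() - 1
        if k < n_sources:
            new_value = rng.random()
            total += new_value - sources[k]
            sources[k] = new_value
        out.append(total)
    return out


samples = voss_mccartney(1000, n_sources=16, seed=0)
print(len(samples))
```

Summing sources that update at octave-spaced rates gives a spectrum that falls off approximately as 1/f.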
Readme#
# MEF_Tools
This package provides tools for easier [Multiscale Electrophysiology Format (MEF)](https://doi.org/10.1016%2Fj.jneumeth.2009.03.022) data saving and reading. See the example below and [documentation](https://mef-tools.readthedocs.io/en/latest/?badge=latest).
## Multiscale Electrophysiology Format (MEF)
[Multiscale Electrophysiology Format (MEF)](https://doi.org/10.1016%2Fj.jneumeth.2009.03.022) is a specialized file format designed for storing electrophysiological data. This format is capable of storing multiple channels of data in a single file, with each channel storing a time series of data points.
MEF is particularly useful for handling large volumes of electrophysiological data, as it employs a variety of techniques such as lossless and lossy compression, data encryption and data de-identification to make the storage and transmission of such data more efficient and secure.
Python’s pymef library provides a set of tools for working with MEF files, including reading from and writing to these files. Below are examples demonstrating the use of these tools.
## Dependencies
- [meflib](msel-source/meflib) - binaries are included in the pymef package
- [pymef](msel-source/pymef)
- [numpy](https://numpy.org/)
- [pandas](https://pandas.pydata.org/)
## Installation
See the installation instructions in [INSTALL.md](INSTALL.md).
## License
This software is licensed under the Apache-2.0 License. See the [LICENSE](xmival00/MEF_Tools) file in the root directory of this project.
## Reference
Brinkmann BH, Bower MR, Stengel KA, Worrell GA, Stead M. Large-scale electrophysiology: acquisition, compression, encryption, and storage of big data. J Neurosci Methods. 2009;180(1):185‐192. doi:10.1016/j.jneumeth.2009.03.022
## Examples
First, we need to import the necessary libraries:
```python
import os
import time
import numpy as np
import pandas as pd
from mef_tools.io import MefWriter, MefReader, create_pink_noise
```
Next, we define the path to our MEF file, and the amount of data (in seconds) we want to write:
```python
session_name = 'session'
session_path = os.getcwd() + f'/{session_name}.mefd'
mef_session_path = session_path
secs_to_write = 30
```
We also need to specify the start and end times of our data in uUTC time. uUTC time is the number of microseconds since January 1, 1970, 00:00:00 UTC. We can use the [time](https://docs.python.org/3/library/time.html) library to convert between UTC time and other time formats. In this example, we will use the current time as the start time, and the start time plus the number of seconds we want to write as the end time:
```python
start_time = int(time.time() * 1e6)
end_time = int(start_time + 1e6*secs_to_write)
```
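Converting between uUTC integers and timezone-aware datetimes can be done with the standard library alone. The helper names `uutc_to_datetime` and `datetime_to_uutc` below are hypothetical and not part of mef_tools; the sketch uses integer arithmetic so that microseconds are preserved exactly.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uutc_to_datetime(uutc):
    """Microseconds since the Unix epoch -> timezone-aware UTC datetime."""
    return EPOCH + timedelta(microseconds=int(uutc))


def datetime_to_uutc(dt):
    """Timezone-aware datetime -> integer microseconds since the Unix epoch."""
    return (dt - EPOCH) // timedelta(microseconds=1)


stamp = datetime_to_uutc(datetime(2020, 1, 1, tzinfo=timezone.utc))
print(stamp)
print(uutc_to_datetime(stamp).isoformat())
```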
With our file path and timing details set, we can now create our MefWriter instance:
```python
pass1 = 'pass1' # password needed for writing to file
pass2 = 'pass2' # password needed for every read/write operation
Wrt = MefWriter(session_path, overwrite=True, password1=pass1, password2=pass2)
Wrt.max_nans_written = 0
Wrt.data_units = 'mV'
```
We then create some test data to write to our file:
```python
fs = 500
low_b = -10
up_b = 10
data_to_write = create_pink_noise(fs, secs_to_write, low_b, up_b)
```
This data is written to a channel in our MEF file:
```python
channel = 'channel_1'
precision = 3
Wrt.write_data(data_to_write, channel, start_time, fs, precision=precision)
```
## Appending Data to an Existing MEF File
To append data to an existing MEF file, we first need to create a new writer:
```python
secs_to_append = 5
discont_length = 3
append_time = end_time + int(discont_length*1e6)
append_end = int(append_time + 1e6*secs_to_append)  # keep uUTC timestamps as integers
data = create_pink_noise(fs, secs_to_append, low_b, up_b)
Wrt2 = MefWriter(session_path, overwrite=False, password1=pass1, password2=pass2)
Wrt2.write_data(data, channel, append_time, fs)
```
## Creating a New Segment in the MEF File
To create a new segment, we simply need to change the new_segment flag to True:
```python
secs_to_write_seg2 = 10
gap_time = 3.36*1e6
newseg_time = int(append_end + gap_time)  # keep uUTC timestamps as integers
newseg_end = int(newseg_time + 1e6*secs_to_write_seg2)
data = create_pink_noise(fs, secs_to_write_seg2, low_b, up_b)
data[30:540] = np.nan
data[660:780] = np.nan
Wrt2.write_data(data, channel, newseg_time, fs, new_segment=True)
```
We can also write data to a new channel with inferred precision:
```python
channel = 'channel_2'
Wrt2.write_data(data, channel, newseg_time, fs, new_segment=True)
```
## Writing Annotations to the MEF File
Annotations can also be added to the MEF file at both the session and channel levels. Here’s an example of how to do this:
```python
# Session-level Note annotations, one every 2 seconds
start_time = start_time
end_time = start_time + 1e6 * 300
offset = start_time - 1e6
starts = np.arange(start_time, end_time, 2e6)
text = ['test'] * len(starts)
types = ['Note'] * len(starts)
note_annotations = pd.DataFrame(data={'time': starts, 'text': text, 'type': types})
Wrt2.write_annotations(note_annotations)

# Channel-level EDFA annotations with a duration in microseconds
starts = np.arange(start_time, end_time, 1e5)
text = ['test'] * len(starts)
types = ['EDFA'] * len(starts)
duration = [10025462] * len(starts)
note_annotations = pd.DataFrame(data={'time': starts, 'text': text, 'type': types, 'duration': duration})
Wrt2.write_annotations(note_annotations, channel=channel)
```
## Reading from an MEF File
In this example, we create a MefReader instance, print out the properties of the MEF file, and then read the first 10 seconds of data from each channel. The data from each channel is appended to a list.
```python
Reader = MefReader(session_path, password2=pass2)
signals = []

properties = Reader.properties
print(properties)

for channel in Reader.channels:
    start_time = Reader.get_property('start_time', channel)
    end_time = Reader.get_property('end_time', channel)
    x = Reader.get_data(channel, start_time, start_time+10*1e6)
    signals.append(x)
```