Preprocessing

brainmaze_eeg.preprocessing.channel_data_rate_thresholding(x: ndarray[tuple[Any, ...], dtype[float64]], threshold_data_rate: float = 0.1)

Masks entire channels (sets to NaN) based on data availability.

Computes the data rate (the proportion of non-NaN values) of each channel. Channels whose data rate is at or below threshold_data_rate are set entirely to NaN. This removes channels with excessive missing data as a quality-control step.

Parameters:

x (np.ndarray) – Input signal array, expected to be [n_channels, n_samples] or [n_samples]. May contain NaN values.
threshold_data_rate (float, optional) – Data rate cutoff, expressed as a proportion of non-NaN samples. Channels with a data rate at or below this value are masked. Default is 0.1.

Returns:

The input signal with every channel at or below the data rate threshold set entirely to NaN. Same shape as the input, including 1D inputs.

Return type:

np.ndarray

Raises:

ValueError – If input is not 1D or 2D.
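The thresholding rule described above can be illustrated with a minimal NumPy sketch. The function name sketch_data_rate_mask is hypothetical and this is not the library's implementation; it only reproduces the documented behavior (data rate = fraction of non-NaN samples, channels at or below the threshold become all NaN).

```python
import numpy as np

def sketch_data_rate_mask(x, threshold_data_rate=0.1):
    """Illustrative re-implementation of the documented rule (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim not in (1, 2):
        raise ValueError("input must be 1D or 2D")
    x2d = np.atleast_2d(x).copy()
    # Data rate: fraction of non-NaN samples in each channel.
    data_rate = 1.0 - np.isnan(x2d).mean(axis=1)
    # Channels at or below the threshold are masked completely.
    x2d[data_rate <= threshold_data_rate, :] = np.nan
    return x2d.reshape(x.shape)

rng = np.random.default_rng(0)
sig = rng.standard_normal((2, 100))
sig[1, :95] = np.nan  # channel 1 keeps only 5% of its samples
out = sketch_data_rate_mask(sig, threshold_data_rate=0.1)
```

In this example channel 0 is returned unchanged, while channel 1 (data rate 0.05) is returned as all NaN.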

brainmaze_eeg.preprocessing.detect_flat_line_segments(x: ndarray[tuple[Any, ...], dtype[float64]], fs: float, window_s: float = 0.5, threshold: float = 5e-07)

Detects flat-line segments in the signal based on low variability.

Splits the signal into short segments of window_s seconds and flags a segment as flat when the mean absolute difference between its consecutive samples falls below the threshold, meaning the signal is constant or nearly constant. Returns a boolean mask with one entry per segment.

Parameters:

x (np.ndarray) – Input signal [n_channels, n_samples] or [n_samples], can have NaNs.
fs (float) – Sampling frequency (Hz).
window_s (float, optional) – Analysis window duration in seconds. Default is 0.5.
threshold (float, optional) – Threshold on the mean absolute difference; segments below it are flagged as flat-line. Default is 5e-07 (0.5e-6).

Returns:

Boolean mask [n_channels, n_segments] or [n_segments]. True indicates a flat-line detected in that specific segment and channel.

Return type:

np.ndarray

Raises:

ValueError – If input is not 1D or 2D.
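A minimal sketch of the mean-absolute-difference criterion, using NumPy only. The name sketch_flat_line_segments is hypothetical, and two details are assumptions rather than documented behavior: a partial trailing segment is dropped, and segments containing NaNs are never flagged.

```python
import numpy as np

def sketch_flat_line_segments(x, fs, window_s=0.5, threshold=5e-7):
    """Illustrative flat-line check per segment (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim not in (1, 2):
        raise ValueError("input must be 1D or 2D")
    x2d = np.atleast_2d(x)
    seg_len = int(round(window_s * fs))
    n_seg = x2d.shape[1] // seg_len  # assumption: drop a partial last segment
    segs = x2d[:, :n_seg * seg_len].reshape(x2d.shape[0], n_seg, seg_len)
    # Mean absolute difference of consecutive samples inside each segment.
    mad = np.abs(np.diff(segs, axis=2)).mean(axis=2)
    mask = mad < threshold  # NaN compares as False, so NaN segments stay unflagged
    return mask[0] if x.ndim == 1 else mask

rng = np.random.default_rng(0)
fs = 100.0
sig = rng.standard_normal(200)
sig[100:] = 1.0  # second half of the recording is a constant value
flat = sketch_flat_line_segments(sig, fs)  # four 0.5 s segments
```

Here the first two segments contain noise and the last two are constant, so only the last two entries of flat are True.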

brainmaze_eeg.preprocessing.detect_outlier_segments(x: ndarray[tuple[Any, ...], dtype[float64]], fs: float, window_s: float = 0.5, threshold: float = 10)

Detects time segments containing amplitude outliers.

Identifies segments with sudden, large amplitude deflections. A robust, percentile-based amplitude threshold is derived from the signal, and each short time window containing samples that exceed it is flagged. Returns a boolean mask with one entry per segment.

Parameters:

x (np.ndarray) – Input signal [n_channels, n_samples] or [n_samples], can have NaNs.
fs (float) – Sampling frequency (Hz).
window_s (float, optional) – Analysis window duration in seconds. Default is 0.5.
threshold (float, optional) – Multiplier for percentile range to set threshold. Default is 10.

Returns:

Boolean mask [n_channels, n_segments] or [n_segments]. True indicates amplitude outliers detected in that specific segment and channel.

Return type:

np.ndarray

Raises:

ValueError – If input is not 1D or 2D.
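One way a percentile-based outlier rule of this kind could look is sketched below. The documentation does not state which percentiles are used or how the threshold is centered, so the 25th/75th percentile range, the median centering, and the name sketch_outlier_segments are all assumptions made for illustration.

```python
import numpy as np

def sketch_outlier_segments(x, fs, window_s=0.5, threshold=10.0, pct=(25.0, 75.0)):
    """Illustrative percentile-range outlier rule (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim not in (1, 2):
        raise ValueError("input must be 1D or 2D")
    x2d = np.atleast_2d(x)
    # Assumed robust scale: range between two percentiles of each channel.
    lo, hi = np.nanpercentile(x2d, pct, axis=1, keepdims=True)
    center = np.nanmedian(x2d, axis=1, keepdims=True)
    limit = threshold * (hi - lo)
    is_outlier = np.abs(x2d - center) > limit  # NaN samples compare as False
    seg_len = int(round(window_s * fs))
    n_seg = x2d.shape[1] // seg_len
    segs = is_outlier[:, :n_seg * seg_len].reshape(x2d.shape[0], n_seg, seg_len)
    mask = segs.any(axis=2)  # flag a segment if any sample exceeds the limit
    return mask[0] if x.ndim == 1 else mask

rng = np.random.default_rng(0)
fs = 100.0
sig = rng.standard_normal(200)
sig[120] = 100.0  # single large deflection inside the third segment
outliers = sketch_outlier_segments(sig, fs)
```

With unit-variance noise the limit is roughly 13.5, so only the segment containing the injected spike is flagged.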

brainmaze_eeg.preprocessing.detect_powerline_segments(x: ndarray[tuple[Any, ...], dtype[float64]], fs: float, window_s: float = 0.5, powerline_freq: float = 60, threshold: float = 1000)

Detects time segments within each channel affected by powerline noise.

Within each short time window, compares the spectral power at the powerline frequency and its harmonics with the power in the 2-40 Hz band. Segments where this ratio exceeds the threshold are flagged. A partial segment at the end of the signal is dropped. Returns a boolean mask with one entry per segment.

Parameters:

x (np.ndarray) – Input signal [n_channels, n_samples] or [n_samples], can have NaNs.
fs (float) – Sampling frequency (Hz).
window_s (float, optional) – Analysis window duration in seconds. Default is 0.5.
powerline_freq (float, optional) – Fundamental powerline frequency. Harmonics also checked. Default is 60 Hz.
threshold (float, optional) – Power ratio threshold for flagging segments. Default is 1000.

Returns:

Boolean mask [n_channels, n_segments] or [n_segments]. True indicates powerline noise detected in that specific segment and channel.

Return type:

np.ndarray

Raises:

ValueError – If input is not 1D or 2D.
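The power-ratio idea can be sketched with a plain FFT. This is not the library's estimator: the name sketch_powerline_ratio, the 2 Hz half-width around each harmonic, and the use of mean power in both bands are assumptions, so the default threshold of 1000 does not carry over to this ratio. NaNs are not handled here.

```python
import numpy as np

def sketch_powerline_ratio(x, fs, window_s=0.5, powerline_freq=60.0):
    """Illustrative line-to-EEG-band power ratio per segment (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    seg_len = int(round(window_s * fs))
    n_seg = x.shape[-1] // seg_len  # partial last segment is dropped
    segs = x[..., :n_seg * seg_len].reshape(x.shape[:-1] + (n_seg, seg_len))
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    power = np.abs(np.fft.rfft(segs, axis=-1)) ** 2
    # Bins near the fundamental and every harmonic below Nyquist.
    line_bins = np.zeros(freqs.shape, dtype=bool)
    for harmonic in np.arange(powerline_freq, fs / 2, powerline_freq):
        line_bins |= np.abs(freqs - harmonic) <= 2.0  # assumed half-width
    band_bins = (freqs >= 2.0) & (freqs <= 40.0)
    return power[..., line_bins].mean(axis=-1) / power[..., band_bins].mean(axis=-1)

rng = np.random.default_rng(0)
fs = 500.0
t = np.arange(1000) / fs
sig = rng.standard_normal(1000)
sig[500:] += 10.0 * np.sin(2 * np.pi * 60.0 * t[500:])  # 60 Hz hum in second half
ratio = sketch_powerline_ratio(sig, fs)
flags = ratio > 50.0  # threshold chosen for this sketch only
```

The two clean segments give a ratio near 1, while the two contaminated segments give a ratio in the hundreds, so only the latter are flagged.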

brainmaze_eeg.preprocessing.detect_stim_segments(x: ndarray[tuple[Any, ...], dtype[float64]], fs: float, window_s: float = 1, threshold: float = 2000, freq_band: Tuple[float, float] = (80, 110))

Detects stimulation artifacts using spectral analysis of the difference signal.

Identifies artifacts by checking for high spectral power in a specific high-frequency band (e.g., 80-110 Hz) within short time windows of the differenced signal. A partial segment at the end of the signal is dropped. Returns a boolean mask of detected segments together with the calculated power sums.

Parameters:

x (np.ndarray) – Input signal [n_channels, n_samples] or [n_samples], can have NaNs.
fs (float) – Sampling frequency (Hz).
window_s (float, optional) – Analysis window duration in seconds. Default is 1.
threshold (float, optional) – Power sum threshold for flagging artifacts. Default is 2000.
freq_band (tuple, optional) – Frequency range (low, high in Hz) for artifact power check. Default is (80, 110).

Returns:

detected_stim (np.ndarray): Boolean mask [n_channels, n_segments] or [n_segments]. True indicates artifact detected.
psd_sum (np.ndarray): Sum of spectral power in freq_band per segment/channel.

Return type:

tuple[np.ndarray, np.ndarray]

Raises:

ValueError – If input is not 1D or 2D.
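A rough sketch of the band-power computation on the differenced signal follows. The name sketch_stim_band_power, the periodogram scaling, and the relative threshold used in the example are assumptions; the library's absolute threshold of 2000 applies to its own power estimate, not to this one. NaNs are not handled here.

```python
import numpy as np

def sketch_stim_band_power(x, fs, window_s=1.0, freq_band=(80.0, 110.0)):
    """Illustrative band power of the first difference, per segment (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # First difference; prepend keeps the number of samples unchanged.
    dx = np.diff(x, axis=-1, prepend=x[..., :1])
    seg_len = int(round(window_s * fs))
    n_seg = dx.shape[-1] // seg_len  # partial last segment is dropped
    segs = dx[..., :n_seg * seg_len].reshape(dx.shape[:-1] + (n_seg, seg_len))
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    power = np.abs(np.fft.rfft(segs, axis=-1)) ** 2 / seg_len  # assumed scaling
    in_band = (freqs >= freq_band[0]) & (freqs <= freq_band[1])
    return power[..., in_band].sum(axis=-1)

rng = np.random.default_rng(0)
fs = 500.0
t = np.arange(1500) / fs
sig = rng.standard_normal(1500)
sig[500:1000] += 20.0 * np.sin(2 * np.pi * 95.0 * t[500:1000])  # artifact in 2nd second
psd_sum = sketch_stim_band_power(sig, fs)
detected = psd_sum > 10.0 * np.median(psd_sum)  # relative threshold, sketch only
```

Only the middle one-second segment, which contains the 95 Hz component, stands out in psd_sum and is flagged.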

brainmaze_eeg.preprocessing.detection_dilatation(mask: ndarray, extend_left: int = 2, extend_right: int = 2)

Expands detected regions in a boolean/integer mask using binary dilation.

Applies binary dilation to a mask, effectively widening the regions marked as True (or 1) by a specified number of positions to the left and right. This is useful post-detection to add a buffer around flagged segments, accounting for potential edge effects. Returns the expanded mask.

Parameters:

mask (np.ndarray) – 1D or 2D boolean/int mask [n_channels, n_segments] or [n_segments]. True/1 indicates detection.
extend_left (int, optional) – Positions to extend the True region to the left. Default is 2.
extend_right (int, optional) – Positions to extend the True region to the right. Default is 2.

Returns:

The expanded mask as an integer array of 0/1 values. Same shape as input.

Return type:

np.ndarray

Raises:

ValueError – If input mask is not 1D or 2D.
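The effect of asymmetric dilation can be reproduced with simple array shifts. The sketch below uses NumPy only and a hypothetical name, sketch_dilate; the library may rely on a different routine internally.

```python
import numpy as np

def sketch_dilate(mask, extend_left=2, extend_right=2):
    """Illustrative asymmetric dilation along the last axis (hypothetical helper)."""
    mask = np.asarray(mask)
    if mask.ndim not in (1, 2):
        raise ValueError("mask must be 1D or 2D")
    m = np.atleast_2d(mask).astype(bool)
    out = m.copy()
    n = m.shape[1]
    # Extend to the left: position i becomes True if position i + k was True.
    for k in range(1, min(extend_left, n - 1) + 1):
        out[:, :n - k] |= m[:, k:]
    # Extend to the right: position i becomes True if position i - k was True.
    for k in range(1, min(extend_right, n - 1) + 1):
        out[:, k:] |= m[:, :n - k]
    return out.astype(int).reshape(mask.shape)

mask = np.array([0, 0, 0, 1, 0, 0, 0, 0])
dilated = sketch_dilate(mask, extend_left=1, extend_right=2)
# dilated is [0, 0, 1, 1, 1, 1, 0, 0]
```

A single detection at index 3 grows by one position to the left and two to the right, which is the buffer behavior described above.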

brainmaze_eeg.preprocessing.filter_powerline(x: ndarray[tuple[Any, ...], dtype[float64]], fs: float, powerline_freq: float = 60)

Removes powerline noise using a notch filter and handles NaNs.

Applies a notch filter at the specified powerline frequency. NaNs are temporarily imputed with the median before filtering, and the original NaN locations are restored in the output. Ringing artifacts may appear near the original NaN gaps and near sharp signal transitions.

Parameters:

x (np.ndarray) – Input signal [n_channels, n_samples] or [n_samples], can have NaNs.
fs (float) – Sampling frequency (Hz).
powerline_freq (float, optional) – Frequency of noise to remove. Default is 60 Hz.

Returns:

Filtered signal with powerline noise attenuated. Original NaN locations are preserved. Same shape as input.

Return type:

np.ndarray

Raises:

ValueError – If input is not 1D or 2D.
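The impute, filter, restore pattern described above might look like the following sketch. It assumes SciPy is available and uses scipy.signal.iirnotch with zero-phase filtering via scipy.signal.filtfilt; the filter type, the quality factor q=30, and the name sketch_notch_with_nans are assumptions, not the library's documented choices.

```python
import numpy as np
from scipy.signal import filtfilt, iirnotch

def sketch_notch_with_nans(x, fs, powerline_freq=60.0, q=30.0):
    """Illustrative NaN-aware notch filtering (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    nan_mask = np.isnan(x)
    # Temporarily fill gaps with the per-channel median so the filter can run.
    med = np.nanmedian(x, axis=-1, keepdims=True)
    filled = np.where(nan_mask, med, x)
    b, a = iirnotch(powerline_freq, q, fs=fs)  # assumed quality factor
    y = filtfilt(b, a, filled, axis=-1)
    y[nan_mask] = np.nan  # restore the original gaps
    return y

fs = 500.0
t = np.arange(2000) / fs
sig = np.sin(2 * np.pi * 5.0 * t) + 2.0 * np.sin(2 * np.pi * 60.0 * t)
sig[1600:1650] = np.nan  # simulated dropout
filtered = sketch_notch_with_nans(sig, fs)
```

Away from the gap, the 60 Hz component is strongly attenuated while the 5 Hz component passes nearly unchanged; samples close to the gap can still show the ringing mentioned above.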

brainmaze_eeg.preprocessing.mask_segments_with_nans(x: ndarray[tuple[Any, ...], dtype[float64]], segment_mask: ndarray[tuple[Any, ...], dtype[float64]], fs: float, segment_len_s: float)

Masks (sets to NaN) signal segments specified by a boolean mask.

Applies a pre-computed boolean/integer mask to set corresponding time segments in the signal to NaN. Converts segment indices from the mask to sample indices to apply masking. Returns a copy of the input signal with artifactual segments replaced by NaN.

Parameters:

x (np.ndarray) – Input signal [n_channels, n_samples] or [n_samples].
segment_mask (np.ndarray) – Boolean/int mask [n_channels, n_segments] or [n_segments]. True/1 flags segments to mask.
fs (float) – Sampling frequency (Hz).
segment_len_s (float) – Duration of each segment in seconds. Must match the segment length at which the mask was computed.

Returns:

Copy of input signal with specified segments replaced by NaN. Same shape.

Return type:

np.ndarray

Raises:

ValueError – If input is not 1D or 2D, or if the mask dimensions do not match the input.
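The conversion from segment indices to sample indices can be sketched by repeating each mask entry segment_len_s * fs times. The name sketch_mask_segments is hypothetical, and leaving samples beyond the last full segment untouched is an assumption.

```python
import numpy as np

def sketch_mask_segments(x, segment_mask, fs, segment_len_s):
    """Illustrative segment-to-sample NaN masking (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    segment_mask = np.asarray(segment_mask)
    if x.ndim not in (1, 2) or segment_mask.ndim != x.ndim:
        raise ValueError("input must be 1D or 2D and match the mask dimensions")
    seg_len = int(round(segment_len_s * fs))
    # Expand each segment flag to cover all samples of that segment.
    sample_mask = np.repeat(segment_mask.astype(bool), seg_len, axis=-1)
    out = x.copy()
    n = min(out.shape[-1], sample_mask.shape[-1])
    # Samples past the last full segment are left as they are (assumption).
    out[..., :n][sample_mask[..., :n]] = np.nan
    return out

fs = 100.0
sig = np.zeros((2, 220))
seg_mask = np.array([[0, 1, 0, 0],
                     [0, 0, 0, 1]])
masked = sketch_mask_segments(sig, seg_mask, fs, segment_len_s=0.5)
```

Channel 0 loses samples 50 to 99 and channel 1 loses samples 150 to 199; the input array itself is not modified.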

brainmaze_eeg.preprocessing.replace_nans_with_median(x: ndarray[tuple[Any, ...], dtype[float64]])

Imputes NaN values per channel with the median of valid data.

Computes the median of the non-NaN samples of each channel independently and fills that channel's NaNs with it. Because the median is insensitive to extreme values, the fill value is robust to artifacts in the valid data. Returns the processed signal and a boolean mask indicating the original NaN locations.

Parameters:

x (np.ndarray) – Input signal array [n_channels, n_samples] or [n_samples], can have NaNs.

Returns:

processed_signal (np.ndarray): Signal with NaNs replaced by channel medians.
mask (np.ndarray): Boolean mask where True indicates original NaN positions.

Return type:

Tuple[np.ndarray, np.ndarray]

Raises:

ValueError – If input is not 1D or 2D.
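A minimal NumPy sketch of per-channel median imputation, under the hypothetical name sketch_replace_nans_with_median. It is not the library's code; in particular, a channel consisting only of NaNs stays NaN here, which the documentation does not specify.

```python
import numpy as np

def sketch_replace_nans_with_median(x):
    """Illustrative per-channel median imputation (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if x.ndim not in (1, 2):
        raise ValueError("input must be 1D or 2D")
    nan_mask = np.isnan(x)
    # Median of the valid samples along the time axis, one value per channel.
    med = np.nanmedian(x, axis=-1, keepdims=True)
    filled = np.where(nan_mask, med, x)
    return filled, nan_mask

sig = np.array([[1.0, np.nan, 3.0, 5.0],
                [np.nan, 2.0, 2.0, np.nan]])
filled, nan_mask = sketch_replace_nans_with_median(sig)
# filled is [[1, 3, 3, 5], [2, 2, 2, 2]]
```

The returned mask allows the original gaps to be restored after a later processing step, for example with filled[nan_mask] = np.nan.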
