emsarray.masking
Common functions for working with dataset masks.
Masks are used when clipping datasets to a smaller geographic subset,
such as when calling Convention.clip().
- mask_grid_dataset(dataset, mask, work_dir, **kwargs)
Apply a mask to a two-dimensional grid dataset, such as
CFGrid1D and CFGrid2D, or datasets with multiple grids such as ArakawaC.
- Parameters:
dataset – The Dataset instance to mask.
mask – The mask to apply. Different types of datasets need different masks.
work_dir – An empty directory where temporary files can be stored while applying the mask. The returned dataset will be built from files inside this directory, so callers must save the returned dataset before deleting this directory.
kwargs – Any extra kwargs are passed to open_mfdataset when assembling the new, clipped dataset.
- Returns:
Dataset – The masked dataset.
- mask_grid_data_array(mask, data_array)
Apply a mask to a single data array. A mask dataset contains one or more mask data arrays. The mask to apply is selected by comparing dimensions: the first mask found whose dimensions are a subset of the data array dimensions is used.
- Parameters:
mask (xarray.Dataset) – The mask dataset.
data_array (xarray.DataArray) – The DataArray to mask.
- Returns:
xarray.DataArray – A new DataArray with any masked values replaced with _FillValue. The returned data array will be the same shape as the input data array. If no appropriate mask is found, the original data array is returned unmodified.
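The selection rule described above can be sketched in plain Python. This is only an illustration of the "first mask whose dimensions are a subset" rule, not emsarray's implementation; the helper name `sketch_select_mask` and the mask and dimension names are hypothetical.

```python
def sketch_select_mask(mask_dims, data_array_dims):
    """Return the name of the first mask whose dimensions are a subset of
    the data array dimensions, or None if no mask matches.

    Hypothetical helper for illustration; not part of emsarray.
    """
    wanted = set(data_array_dims)
    for name, dims in mask_dims.items():
        if set(dims) <= wanted:
            return name
    # No suitable mask: per the docs, the data array is returned unmodified.
    return None


# Hypothetical mask layout: mask variable name -> dimension names
masks = {
    "face_mask": ("j_centre", "i_centre"),
    "node_mask": ("j_node", "i_node"),
}

selected = sketch_select_mask(masks, ("time", "k", "j_centre", "i_centre"))
unmatched = sketch_select_mask(masks, ("time", "record"))
```

Here `selected` names the mask that shares both horizontal dimensions with the data array, while `unmatched` is `None` because neither mask fits.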
- find_fill_value(data_array)
Float-typed variables can easily be masked. If they don’t already have a fill value, they can be masked using NaN without issue. However, some int-typed variables without a fill value can't be masked automatically.
- Parameters:
data_array (xarray.DataArray) – The DataArray to find an appropriate _FillValue for.
- Returns:
fill value – A numpy scalar value appropriate to use as the fill value in the data array.
For masked arrays, this will be numpy.ma.masked. Note that xarray itself does not use masked arrays, but is compatible with them.
For data arrays that already have a _FillValue used by xarray, numpy.nan is returned. xarray substitutes numpy.nan for every _FillValue when opening files.
For data arrays that have been opened with mask_and_scale=False, the existing _FillValue is returned.
If the data array has a float dtype, numpy.nan is returned.
If none of the above are true, a ValueError is raised.
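The order of these rules can be restated as a small numpy-only sketch. It is not emsarray's code: the helper `sketch_find_fill_value` is hypothetical, and it assumes that a decoded `_FillValue` lives in a data array's `encoding` while an undecoded one (opened with `mask_and_scale=False`) stays in `attrs`. Plain dicts stand in for both here.

```python
import numpy


def sketch_find_fill_value(values, encoding=None, attrs=None):
    """Illustrative restatement of the documented rules; not emsarray's code.

    `encoding` and `attrs` are plain dicts standing in for the matching
    attributes of an xarray.DataArray (an assumption of this sketch).
    """
    encoding = encoding or {}
    attrs = attrs or {}
    if isinstance(values, numpy.ma.MaskedArray):
        return numpy.ma.masked
    if "_FillValue" in encoding:
        # Assumed: xarray already decoded the fill value to NaN on open.
        return numpy.nan
    if "_FillValue" in attrs:
        # Assumed: opened with mask_and_scale=False, raw fill value kept.
        return attrs["_FillValue"]
    if numpy.issubdtype(values.dtype, numpy.floating):
        return numpy.nan
    raise ValueError("No appropriate fill value found")


float_fill = sketch_find_fill_value(numpy.array([1.0, 2.0]))
raw_fill = sketch_find_fill_value(numpy.array([1, 2]), attrs={"_FillValue": -999})
```

An integer array with no fill value in either dict falls through every rule and raises `ValueError`, which is the int-typed case the description warns about.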
- calculate_grid_mask_bounds(mask)
Calculate the included bounds of a mask dataset for each dimension.
- Parameters:
mask (xarray.Dataset) – The mask dataset, which should contain one or more boolean data arrays.
- Returns:
dict – A dict of {dimension_name: slice(min_index, max_index)}. This dict can be passed directly to an xarray.Dataset.isel() call to crop a dataset to the bounds of a mask.
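For a single boolean array, the bounds calculation might look like the following numpy sketch. The helper `sketch_mask_bounds` and the dimension names are hypothetical, and the sketch assumes each slice stops one past the last True index so that slicing keeps that cell. The real function works on a whole mask dataset and may differ in detail.

```python
import numpy


def sketch_mask_bounds(mask_values, dims):
    """Return {dimension_name: slice} covering every True cell of a boolean array.

    Hypothetical helper for illustration; assumes at least one True value.
    """
    bounds = {}
    for axis, name in enumerate(dims):
        # Collapse every other axis to see which indices along `axis` are used.
        other_axes = tuple(i for i in range(mask_values.ndim) if i != axis)
        hits = numpy.flatnonzero(mask_values.any(axis=other_axes))
        # Assumed convention: the stop index is exclusive, as in Python slicing.
        bounds[name] = slice(int(hits[0]), int(hits[-1]) + 1)
    return bounds


mask_values = numpy.zeros((4, 5), dtype=bool)
mask_values[1:3, 2:4] = True

bounds = sketch_mask_bounds(mask_values, ("j", "i"))
cropped = mask_values[bounds["j"], bounds["i"]]
```

The `cropped` array has shape (2, 2) and is entirely True, which is the same cropping that passing the dict to `isel()` would perform on a dataset.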
- smear_mask(arr, pad_axes)
Take a boolean numpy array and a list indicating which axes to smear along. Return a new array, expanded along the axes, with the boolean values smeared accordingly.
This is a half-baked convolution operator where the pad_axes parameter is used to build the kernel.
- Parameters:
arr (numpy.ndarray) – A boolean numpy.ndarray.
pad_axes (list of bool) – A list of booleans indicating which axes to smear along.
- Returns:
numpy.ndarray – The smeared array. For every axis where pad_axes was True, the array will be one element larger.
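One way to picture the smearing is as a logical OR of the array with a copy of itself shifted by one cell along each padded axis. The numpy sketch below follows that reading. `sketch_smear_mask` is a hypothetical name and this is not emsarray's implementation, although it agrees with the documented examples for this function.

```python
import numpy


def sketch_smear_mask(arr, pad_axes):
    """OR a boolean array with a one-cell-shifted copy along each chosen axis.

    Hypothetical helper for illustration; not emsarray's implementation.
    """
    out = numpy.asarray(arr, dtype=bool)
    for axis, pad in enumerate(pad_axes):
        if not pad:
            continue
        widths = [(0, 0)] * out.ndim
        widths[axis] = (0, 1)
        in_place = numpy.pad(out, widths)  # original values, False appended
        widths[axis] = (1, 0)
        shifted = numpy.pad(out, widths)   # values moved one cell along the axis
        out = in_place | shifted
    return out


arr = numpy.array([
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
])
smeared_one = sketch_smear_mask(arr, [False, True])
smeared_both = sketch_smear_mask(arr, [True, True])
```

Each padded axis grows by one element, so `smeared_one` has shape (3, 6) and `smeared_both` has shape (4, 6).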
Examples
>>> arr
array([[0, 0, 1, 0, 0],
       [0, 1, 0, 1, 0],
       [1, 0, 0, 0, 1]])
Smear along the y-axis:
>>> smear_mask(arr, [False, True])
array([[0, 0, 1, 1, 0, 0],
       [0, 1, 1, 1, 1, 0],
       [1, 1, 0, 0, 1, 1]])
Smear along both axes:
>>> smear_mask(arr, [True, True])
array([[0, 0, 1, 1, 0, 0],
       [0, 1, 1, 1, 1, 0],
       [1, 1, 1, 1, 1, 1],
       [1, 1, 0, 0, 1, 1]])
- blur_mask(arr, size=1)
Take a boolean numpy array and blur it, such that all indexes neighbouring a True value in the input array are True in the output array. The output array will have the same shape as the input array.
- Parameters:
arr (numpy.ndarray) – A boolean array to blur.
size (int) – The kernel size to use when blurring. In the output array, a cell is True if the input has a True value within size cells of it along any axis.
- Returns:
numpy.ndarray – The blurred array.
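The blur can be sketched as a sliding-window OR over a zero-padded copy of the input. The numpy sketch below assumes a square window of side `2 * size + 1`, so diagonal neighbours count, which is what the documented example for this function shows. `sketch_blur_mask` is a hypothetical name, not emsarray's implementation.

```python
import itertools

import numpy


def sketch_blur_mask(arr, size=1):
    """Mark every cell that has a True value within `size` cells on every axis.

    Hypothetical helper for illustration; assumes a square (2 * size + 1) window.
    """
    arr = numpy.asarray(arr, dtype=bool)
    padded = numpy.pad(arr, size)  # False border of width `size` on all sides
    out = numpy.zeros_like(arr)
    # OR together every shifted view of the padded array.
    for offsets in itertools.product(range(2 * size + 1), repeat=arr.ndim):
        window = tuple(slice(o, o + n) for o, n in zip(offsets, arr.shape))
        out |= padded[window]
    return out


arr = numpy.array([
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
])
blurred = sketch_blur_mask(arr)
```

The output keeps the input shape because the padding is discarded again by the window slices.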
Examples
>>> arr = numpy.array([
...     [1, 0, 0, 0, 0],
...     [0, 0, 0, 0, 0],
...     [0, 0, 0, 1, 0],
...     [0, 0, 0, 0, 1]])
>>> blur_mask(arr)
array([[1, 1, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1],
       [0, 0, 1, 1, 1]])