emsarray.operations.point_extraction#

Subset a dataset at a set of points.

extract_dataframe() takes a pandas DataFrame, subsets the dataset at the point specified in each row, and merges the dataset with the dataframe. The points extracted will form the coordinates for the new dataset.

extract_points() takes a list of Shapely Points, subsets the dataset at these points, and returns a new dataset without any associated geometry. This is useful if you want to add your own metadata to the subset dataset.

If any of the supplied points do not intersect the dataset geometry, a NonIntersectingPoints exception is raised. The exception includes the indices of the points that do not intersect.

emsarray extract-points is a command line interface to extract_dataframe().

Functions#

extract_dataframe(dataset, dataframe, coordinate_columns, *, point_dimension='point')#

Extract the points listed in a pandas DataFrame, and merge the remaining columns into the dataset.

Parameters
  • dataset (xarray.Dataset) – The dataset to extract point data from.

  • dataframe (pandas.DataFrame) – A dataframe with longitude and latitude columns, and possibly other columns.

  • coordinate_columns (tuple of str, str) – The names of the longitude and latitude columns in the dataframe.

  • point_dimension (Hashable, optional) – The name of the new dimension to create in the dataset. Defaults to "point".

Returns

xarray.Dataset – A new dataset that only contains data at the given points, plus any new columns present in the dataframe.

Example

import emsarray
import pandas as pd
from emsarray.operations import point_extraction

ds = emsarray.tutorial.open_dataset('gbr4')
df = pd.DataFrame({
    'lon': [152.807, 152.670, 153.543],
    'lat': [-24.9595, -24.589, -25.488],
    'name': ['a', 'b', 'c'],
})
point_data = point_extraction.extract_dataframe(
    ds, df, ['lon', 'lat'])
point_data

<xarray.Dataset>
Dimensions:  (k: 47, point: 3, time: 1)
Coordinates:
    zc       (k) float32 ...
  * time     (time) datetime64[ns] 2022-05-11T14:00:00
    lon      (point) float64 152.8 152.7 153.5
    lat      (point) float64 -24.96 -24.59 -25.49
Dimensions without coordinates: k, point
Data variables:
    botz     (point) float32 ...
    eta      (time, point) float32 ...
    salt     (time, k, point) float32 ...
    temp     (time, k, point) float32 ...
    name     (point) object 'a' 'b' 'c'
Attributes: (12/14)
    ...

extract_points(dataset, points, *, point_dimension='point')#

Drop all data except for cells that intersect the given points. Return a new dataset with a new dimension named point_dimension, with the same size as the number of points, containing only data at those points.

The returned dataset has no coordinate information.

Parameters
  • dataset (xarray.Dataset) – The dataset to extract point data from.

  • points (list of shapely.geometry.Point) – The points to select.

  • point_dimension (Hashable, optional) – The name of the new dimension to index points along. Defaults to "point".

Returns

xarray.Dataset – A subset of the input dataset that only contains data at the given points. The dataset will only contain the values, without any coordinate information.
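
Example

Because the dataset returned by extract_points() carries no coordinate information, you can attach your own labels afterwards. The following is a minimal sketch of one way to do that, not part of emsarray itself: the helper names split_named_points and extract_named_points, and the name coordinate, are hypothetical and chosen for illustration. The emsarray and Shapely imports are done inside the function so the plain-Python helper can be used on its own.

```python
def split_named_points(named_points):
    """Split a {name: (lon, lat)} mapping into parallel lists.

    Hypothetical helper for this sketch; not part of emsarray.
    """
    names = list(named_points)
    coords = [named_points[name] for name in names]
    return names, coords


def extract_named_points(dataset, named_points, point_dimension="point"):
    """Extract data at named points and label each point with its name.

    Hypothetical wrapper around extract_points(); assumes `dataset` is an
    xarray.Dataset that emsarray understands.
    """
    # Imported lazily so split_named_points() works without these packages.
    from shapely.geometry import Point
    from emsarray.operations import point_extraction

    names, coords = split_named_points(named_points)
    points = [Point(lon, lat) for lon, lat in coords]

    point_data = point_extraction.extract_points(
        dataset, points, point_dimension=point_dimension)

    # Attach our own metadata along the new point dimension.
    return point_data.assign_coords({"name": (point_dimension, names)})
```

With the tutorial dataset from the extract_dataframe() example, this could be called as extract_named_points(ds, {'a': (152.807, -24.9595), 'b': (152.670, -24.589)}).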

Exceptions#

exception NonIntersectingPoints(indices, points)#

Raised when a point to extract does not intersect the dataset geometry.

indices#

The indices of the points that do not intersect.

points#

The non-intersecting points.
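
Example

The indices attribute makes it possible to recover from a failed extraction. The sketch below shows one possible approach, assuming you would rather drop the offending points and retry than abort: the helper names drop_indices and extract_intersecting_points are hypothetical, and the code assumes indices is an iterable of integer positions into the supplied list of points.

```python
def drop_indices(items, indices):
    """Return `items` without the entries at the given positions.

    Hypothetical helper for this sketch; not part of emsarray.
    """
    bad = {int(i) for i in indices}
    return [item for i, item in enumerate(items) if i not in bad]


def extract_intersecting_points(dataset, points):
    """Extract data at `points`, skipping any that miss the dataset geometry.

    Returns the points that were actually used, together with the
    extracted dataset. Hypothetical wrapper around extract_points().
    """
    # Imported lazily so drop_indices() works without emsarray installed.
    from emsarray.operations import point_extraction

    try:
        return points, point_extraction.extract_points(dataset, points)
    except point_extraction.NonIntersectingPoints as err:
        # Assumption: err.indices holds integer positions into `points`.
        kept = drop_indices(points, err.indices)
        if not kept:
            # Nothing intersects the dataset, so re-raise the original error.
            raise
        return kept, point_extraction.extract_points(dataset, kept)
```

Returning the retained points alongside the dataset keeps the two aligned, since positions along the point dimension shift once points are dropped.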