pyresample.ewa package¶
Submodules¶
pyresample.ewa.dask_ewa module¶
Dask-friendly implementation of the EWA resampling algorithm.
The DaskEWAResampler class implements the Elliptical Weighted Averaging (EWA) resampling algorithm in a per-chunk processing scheme. This lets common dask configuration (number of workers, chunk size, etc.) control how much data is processed, and therefore held in memory, at any one time. When an input array chunk will not be used, this implementation should avoid loading or computing the data for that chunk. When an output chunk has no data in it, this implementation should avoid creating arrays and using memory for it until necessary.
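The idea of skipping unused input chunks can be illustrated with a small pure-Python sketch. It is not the resampler's actual code: the `ChunkExtent` type and both helper functions are hypothetical names made up for this illustration, and the sketch assumes each chunk's extent is already known in output grid column/row coordinates.

```python
from typing import NamedTuple


class ChunkExtent(NamedTuple):
    """Hypothetical bounding box of one input chunk, in output grid coordinates."""
    col_min: float
    col_max: float
    row_min: float
    row_max: float


def chunk_overlaps_grid(extent, grid_cols, grid_rows):
    """Return True if any part of the chunk can land inside the output grid."""
    return (extent.col_max >= 0 and extent.col_min < grid_cols
            and extent.row_max >= 0 and extent.row_min < grid_rows)


def chunks_to_process(extents, grid_cols, grid_rows):
    """Return the indexes of the chunks worth loading and resampling."""
    return [index for index, extent in enumerate(extents)
            if chunk_overlaps_grid(extent, grid_cols, grid_rows)]


extents = [
    ChunkExtent(-50.0, -10.0, 0.0, 20.0),   # entirely left of the grid
    ChunkExtent(-5.0, 30.0, 10.0, 40.0),    # partially inside the grid
    ChunkExtent(20.0, 80.0, 120.0, 160.0),  # entirely below a 100-row grid
]
selected = chunks_to_process(extents, grid_cols=100, grid_rows=100)
```

Only the chunks listed in `selected` would need their data loaded; the others can be dropped before any computation is scheduled.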
- class pyresample.ewa.dask_ewa.DaskEWAResampler(source_geo_def, target_geo_def)¶
Bases: BaseResampler
Resample using an elliptical weighted averaging algorithm.
This algorithm does not use caching or any externally provided data mask (unlike the ‘nearest’ resampler).
This algorithm works under the assumption that the data is observed one scan line at a time. However, good results can still be achieved for non-scan based data provided rows_per_scan is set to the number of rows in the entire swath or by setting it to None.
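To make the role of rows_per_scan concrete, the sketch below groups swath rows into scans. The `scan_row_ranges` helper is a hypothetical name used only for illustration. Treating None as "the whole swath is one scan" follows the description above, and treating 0 the same way follows the convenience documented for the resample() method. How the library handles a row count that is not a multiple of rows_per_scan is not covered here; the sketch simply truncates the last group.

```python
def scan_row_ranges(total_rows, rows_per_scan=None):
    """Return (start, stop) row ranges, one per scan (illustrative only)."""
    if not rows_per_scan:
        # None or 0: treat the entire swath as one large scan.
        rows_per_scan = total_rows
    return [(start, min(start + rows_per_scan, total_rows))
            for start in range(0, total_rows, rows_per_scan)]


# A 40-row swath from an instrument that observes 10 rows per scan.
scans = scan_row_ranges(40, rows_per_scan=10)

# Non-scan based data: the whole swath is handled as a single scan.
whole_swath = scan_row_ranges(40)
```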
- compute(data, cache_id=None, rows_per_scan=None, chunks=None, fill_value=None, weight_count=10000, weight_min=0.01, weight_distance_max=1.0, weight_delta_max=1.0, weight_sum_min=-1.0, maximum_weight_mode=None, **kwargs)¶
Resample the data according to the precomputed X/Y coordinates.
- precompute(cache_dir=None, rows_per_scan=None, persist=False, **kwargs)¶
Generate row and column arrays and store them for later use.
- resample(data, cache_dir=None, mask_area=None, rows_per_scan=None, persist=False, chunks=None, fill_value=None, weight_count=10000, weight_min=0.01, weight_distance_max=1.0, weight_delta_max=1.0, weight_sum_min=-1.0, maximum_weight_mode=None)¶
Resample using an elliptical weighted averaging algorithm.
This algorithm does not use caching or any externally provided data mask (unlike the ‘nearest’ resampler). See the DaskEWAResampler class docstring for more information on how the algorithm works.
Note
This sets the default of ‘mask_area’ to False since it is not currently needed in EWA resampling.
- Parameters
data (numpy.ndarray, dask.array.Array, xarray.DataArray) – Raster data to be resampled. Can be a numpy array, dask array, or xarray DataArray backed by a numpy or dask array. If the data is a numpy or dask array then only 2D (y, x) arrays are permitted. DataArray objects may be 2D or 3D where the third dimension is named “bands”. Note that regardless of the input type, data is converted to a dask array for internal processing and converted back to the original data type on return.
cache_dir (str, None) – Not used by this resampler.
mask_area (bool, None) – Not used by this resampler.
rows_per_scan (int, None) – Number of array rows that represent a single scan of the instrument. If None (default), then the .attrs of the source swath longitude and latitude data is checked for this value if they are DataArray objects. Otherwise, this value must be provided. Decent results may be possible if this value is set to the total number of rows in the array. As a convenience, providing 0 will result in the total number of rows being used.
persist (bool) – Whether to persist (as in dask) the computations during precompute or compute them on the fly during compute. Persisting allows the resampler to determine which input chunks will overlap with the target area. This can greatly reduce the number of tasks and checks that will need to be computed in cases where it is known that only a small amount of input data will fall into the output area.
chunks (tuple, int, dict, string) – Chunk size of resulting dask array. See normalize_chunks() for more information.
fill_value (int, float) – Output value when no data is present. Defaults to numpy.nan for float types or the maximum value for any integer types.
weight_count (int) – number of elements to create in the gaussian weight table. Default is 10000. Must be at least 2.
weight_min (float) – the minimum value to store in the last position of the weight table. Default is 0.01, which, with a weight_distance_max of 1.0 produces a weight of 0.01 at a grid cell distance of 1.0. Must be greater than 0.
weight_distance_max (float) – distance in grid cell units at which to apply a weight of weight_min. Default is 1.0. Must be greater than 0.
weight_delta_max (float) – maximum distance in grid cells in each grid dimension over which to distribute a single swath cell. Default is 1.0.
weight_sum_min (float) – minimum weight sum value. Cells whose weight sums are less than weight_sum_min are set to the grid fill value. Default is EPSILON.
maximum_weight_mode (bool) – If False (default), a weighted average of all swath cells that map to a particular grid cell is used. If True, the swath cell having the maximum weight of all swath cells that map to a particular grid cell is used. This option should be used for coded/category data, e.g. snow cover.
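The relationship between weight_count, weight_min, and weight_distance_max can be sketched as a small Gaussian lookup table. This is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, and indexing the table by squared distance is an assumption of this sketch. What it does reproduce is the documented property that the last table entry equals weight_min at a distance of weight_distance_max.

```python
import math


def build_weight_table(weight_count=10000, weight_min=0.01, weight_distance_max=1.0):
    """Build a Gaussian weight table (illustrative sketch, hypothetical helper)."""
    if weight_count < 2:
        raise ValueError("weight_count must be at least 2")
    if weight_min <= 0 or weight_distance_max <= 0:
        raise ValueError("weight_min and weight_distance_max must be greater than 0")
    # Assumption: entries are evenly spaced in squared distance.
    q_max = weight_distance_max ** 2
    alpha = -math.log(weight_min) / q_max
    return [math.exp(-alpha * q_max * i / (weight_count - 1))
            for i in range(weight_count)]


def lookup_weight(table, distance, weight_distance_max=1.0):
    """Look up the weight for a distance in grid cells; 0.0 beyond the maximum."""
    q = distance ** 2
    q_max = weight_distance_max ** 2
    if q > q_max:
        return 0.0
    return table[round(q / q_max * (len(table) - 1))]


table = build_weight_table()
center_weight = table[0]   # weight at distance 0
edge_weight = table[-1]    # weight at distance weight_distance_max
half_weight = lookup_weight(table, 0.5)
```

With the defaults, the weight falls from 1.0 at the cell center to 0.01 at one grid cell away.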
pyresample.ewa.ewa module¶
EWA algorithms operating on numpy arrays.
Remap data into the output grid using elliptical weighted averaging (the “fornav” step).
This algorithm works under the assumption that the data is observed one scan line at a time. However, good results can still be achieved for non-scan based data if rows_per_scan is set to the number of rows in the entire swath or to None.
- Parameters
cols (numpy array) – Column location for each input swath pixel (from ll2cr)
rows (numpy array) – Row location for each input swath pixel (from ll2cr)
area_def (AreaDefinition) – Grid definition to be mapped to
data_in (numpy array or tuple of numpy arrays) – Swath data to be remapped to output grid
rows_per_scan (int or None, optional) – Number of data rows for every observed scanline. If None then the entire swath is treated as one large scanline.
fill (float/int or None, optional) – If data_in is made of numpy arrays then this represents the fill value used to mark invalid data pixels. This value will also be used in the output array(s). If None, then np.nan will be used for float arrays and -999 will be used for integer arrays.
out (numpy array or tuple of numpy arrays, optional) – Specify a numpy array to be written to for each input array. This can be used as an optimization by providing np.memmap arrays or other array-like objects.
weight_count (int, optional) – number of elements to create in the gaussian weight table. Default is 10000. Must be at least 2
weight_min (float, optional) – the minimum value to store in the last position of the weight table. Default is 0.01, which, with a weight_distance_max of 1.0 produces a weight of 0.01 at a grid cell distance of 1.0. Must be greater than 0.
weight_distance_max (float, optional) – distance in grid cell units at which to apply a weight of weight_min. Default is 1.0. Must be greater than 0.
weight_delta_max (float, optional) – maximum distance in grid cells in each grid dimension over which to distribute a single swath cell. Default is 10.0.
weight_sum_min (float, optional) – minimum weight sum value. Cells whose weight sums are less than weight_sum_min are set to the grid fill value. Default is EPSILON.
maximum_weight_mode (bool, optional) – If False (default), a weighted average of all swath cells that map to a particular grid cell is used. If True, the swath cell having the maximum weight of all swath cells that map to a particular grid cell is used. This option should be used for coded/category data, e.g. snow cover.
- Returns
(valid grid points, output arrays) – The valid_grid_points tuple holds the number of output grid pixels that were written with valid data. The second element in the tuple is a tuple of output grid numpy arrays for each input array. If there was only one input array provided then the returned tuple is simply the single points integer and single output grid array.
- Return type
tuple of integer tuples and numpy array tuples
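The difference between the weighted average and maximum_weight_mode, and the role of weight_sum_min, can be shown for a single output grid cell. This is a minimal sketch rather than the library's code: `combine_cell` is a hypothetical helper, and applying the weight-sum check in both modes is a choice made by this sketch.

```python
def combine_cell(values, weights, weight_sum_min, fill, maximum_weight_mode=False):
    """Combine the swath cells that map to one grid cell (illustrative only)."""
    weight_sum = sum(weights)
    if not values or weight_sum < weight_sum_min:
        # Too little contribution: mark the grid cell as invalid.
        return fill
    if maximum_weight_mode:
        # Keep the value of the swath cell with the largest weight.
        best = max(range(len(values)), key=lambda i: weights[i])
        return values[best]
    # Weighted average of all contributing swath cells.
    return sum(v * w for v, w in zip(values, weights)) / weight_sum


# Continuous data: averaging is appropriate.
averaged = combine_cell([1.0, 3.0], [0.75, 0.25], weight_sum_min=0.01, fill=-999)

# Category data (e.g. snow cover codes 2 and 7): averaging would invent a code,
# so the value with the largest weight is kept instead.
category = combine_cell([2, 7], [0.2, 0.6], weight_sum_min=0.01, fill=-999,
                        maximum_weight_mode=True)

# Weight sum below weight_sum_min: the cell receives the fill value.
empty = combine_cell([5.0, 6.0], [0.001, 0.002], weight_sum_min=0.01, fill=-999)
```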
- pyresample.ewa.ewa.ll2cr(swath_def, area_def, fill=nan, copy=True)¶
Map input swath pixels to output grid column and rows.
- Parameters
swath_def (SwathDefinition) – Navigation definition for swath data to remap
area_def (AreaDefinition) – Grid definition to be mapped to
fill (float, optional) – Fill value used in longitude and latitude arrays
copy (bool, optional) – Create a copy of the longitude and latitude arrays (default: True)
- Returns
(swath_points_in_grid, cols, rows) (tuple of integer, numpy array, numpy array) – Number of points from the input swath overlapping the destination area and the column and row arrays to pass to fornav.
Note
ll2cr uses the pyproj library, which is limited to 64-bit float navigation arrays in order to avoid additional copying or casting of data types.
Module contents¶
Code for resampling using the Elliptical Weighted Averaging (EWA) algorithm.
The logic and original code for this algorithm were translated from the software package “MODIS Swath 2 Grid Toolbox” or “ms2gt” created by the NASA National Snow & Ice Data Center (NSIDC):
Since the project has slowed down, Terry Haran has maintained the package and made updates available:
The ms2gt C executables “ll2cr” and “fornav” were rewritten for the Polar2Grid software package created by the Space Science and Engineering Center (SSEC)/Cooperative Institute for Meteorological Satellite Studies. David Hoese rewrote them as a combination of C++ and Cython to make them more Python friendly, and they were then copied and modified here in pyresample. The rewrite of “ll2cr” also included an important switch from the “mapx” library to the more popular and capable pyproj (PROJ.4) library.
The EWA algorithm consists of two parts, “ll2cr” and “fornav”, which are described below.
ll2cr¶
The “ll2cr” process is the first step in the EWA algorithm. It stands for “latitude/longitude to column/row”. Its main purpose is to convert input longitude and latitude coordinates to column and row coordinates of the destination grid. These coordinates are then used in the next step “fornav”.
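The conversion can be sketched for the simplest possible case. The real ll2cr projects coordinates with pyproj according to the target AreaDefinition; the sketch below instead assumes a plain longitude/latitude grid with square cells and an origin at the grid's upper-left corner, so no projection is needed. The `lonlat_to_colrow` function and all of its parameters are hypothetical names used only for this illustration.

```python
import math


def lonlat_to_colrow(lons, lats, origin_lon, origin_lat, cell_size,
                     grid_cols, grid_rows, fill=math.nan):
    """Convert lon/lat pairs to fractional grid columns/rows (simplified sketch)."""
    cols, rows = [], []
    points_in_grid = 0
    for lon, lat in zip(lons, lats):
        if math.isnan(lon) or math.isnan(lat):
            # Invalid navigation stays invalid in the output.
            cols.append(fill)
            rows.append(fill)
            continue
        col = (lon - origin_lon) / cell_size
        row = (origin_lat - lat) / cell_size  # rows increase southward
        cols.append(col)
        rows.append(row)
        if 0 <= col < grid_cols and 0 <= row < grid_rows:
            points_in_grid += 1
    return points_in_grid, cols, rows


# Three swath pixels mapped onto a 20x20 grid of 0.25 degree cells
# whose upper-left corner is at 10E, 50N.
points_in_grid, cols, rows = lonlat_to_colrow(
    lons=[10.0, 10.5, 25.0], lats=[50.0, 49.5, 49.0],
    origin_lon=10.0, origin_lat=50.0, cell_size=0.25,
    grid_cols=20, grid_rows=20)
```

The third pixel falls outside the grid, so it still gets a column and row but does not count toward `points_in_grid`. The resulting column and row arrays play the role of the `cols` and `rows` inputs described for the “fornav” step.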