The main product delivered by ESMF is the ESMF library that allows application developers to write programs based on the ESMF API. In addition to the programming library, ESMF distributions come with a small set of applications that are of general interest to the community. These applications utilize the ESMF library to implement features such as printing general information about the ESMF installation, or generating regrid weight files. The provided ESMF applications are intended to be used as standard command line tools.
The bundled ESMF applications are built and installed during the usual ESMF installation process, which is described in detail in the ESMF User's Guide section "Building and Installing the ESMF". After the installation the applications will be located in the ESMF_APPSDIR directory, which can be found as a Makefile variable in the esmf.mk file. The esmf.mk file can be found in the ESMF_INSTALL_LIBDIR directory after a successful installation. The ESMF User's Guide discusses the esmf.mk mechanism to access the bundled applications in more detail in section "Using Bundled ESMF Applications".
The following sections provide in-depth documentation of the bundled ESMF applications. In addition, each application supports the standard --help command line argument, providing a brief description of how to invoke the program.
The ESMF_Info application prints basic information about the ESMF installation to stdout.
The application usage is as follows:
ESMF_Info [--help]

where
  --help    prints a brief usage message
This section describes the offline regridding application provided by ESMF. Regridding, also called remapping or interpolation, is the process of changing the grid that underlies data values while preserving qualities of the original data. Different kinds of transformations are appropriate for different problems. Regridding may be needed when communicating data between Earth system model components such as land and atmosphere, or between different data sets to support operations such as visualization.
Regridding can be broken into two stages. The first stage is generation of an interpolation weight matrix that describes how points in the source grid contribute to points in the destination grid. The second stage is the multiplication of values on the source grid by the interpolation weight matrix to produce values on the destination grid. This occurs through a parallel sparse matrix multiply.
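The second stage can be written compactly as a matrix-vector product. The symbols below are notation introduced here for illustration and do not appear in the ESMF interfaces: s_j is the value at source point j, d_i is the value at destination point i, W_ij is the interpolation weight, and N_s and N_d are the numbers of source and destination points.

```latex
d_i = \sum_{j=1}^{N_s} W_{ij}\, s_j , \qquad i = 1, \dots, N_d
```

Because each destination point typically depends on only a few nearby source points, most of the W_ij are zero, which is why the matrix is stored and applied in sparse form.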
There are two options for accessing ESMF regridding functionality: integrated and offline. Integrated regridding is a process whereby interpolation weights are generated via subroutine calls during the execution of the user's code. The integrated regridding can also perform the parallel sparse matrix multiply. In other words, ESMF integrated regridding allows a user to perform the whole process of interpolation within their code. For a further description of ESMF integrated regridding please see Section . In contrast to integrated regridding, offline regridding is a process whereby interpolation weights are generated by a separate ESMF application, not within the user code. The ESMF offline regridding application generates only the interpolation matrix; the user is responsible for reading in this matrix and doing the actual interpolation (multiplication by the sparse matrix) in their code. The rest of this section further describes ESMF offline regridding.
For a discussion of installing and accessing ESMF applications such as this one, please see the beginning of this part of the reference manual (Section II). For the quickest approach to building and accessing the applications, please refer to the ``Building and using bundled ESMF applications'' section in the ESMF User's Guide.
As described above, this tool reads in two grid files and outputs weights for interpolation between the two grids. The input and output files are all in NetCDF format. The grid files are either in the same format (Section 9.4) as is used as input to SCRIP [3], or in the ESMF unstructured grid format. The weight file is in the same format (Section 9.5) as is output by SCRIP. The interpolation weights can be generated with the bilinear, patch, or first-order conservative methods described below. Masking is supported for 2D logically rectangular grids (i.e. with grid_rank=2) in the SCRIP format. This application can generate regrid weights from a global or regional source grid to a global or regional destination grid. It assumes that the source and destination grids are on a sphere and that the coordinates given in the files are latitude and longitude values. The coordinates can be in either degrees or radians (this is indicated by the ``units'' attribute attached to the coordinate variable). As is true with many global models, this application currently assumes the latitude and longitude refer to positions on a perfect sphere, as opposed to a more complex and accurate representation of the earth's true shape such as would be used in a GIS. (ESMF's current user base doesn't require this level of detail in representing the earth's shape, but it could be added in the future if necessary.) This file-based regrid weight generation application runs in parallel. It is used in the ESMF_RegridWeightGenCheck external demo, which can serve as an example of its use.
This application requires the NetCDF library to read the grid files and write out the weight files in NetCDF format. It also requires the LAPACK library to generate the patch regridding weights. To compile ESMF with the NetCDF and LAPACK libraries, please refer to the ``Third Party Libraries'' section in the ESMF User's Guide.
Internally this application uses the ESMF public API to generate the interpolation weights. If a source or destination grid is logically rectangular, then ESMF_GridCreate() is used to create an ESMF_Grid object. The cell center coordinates of the input grid are put into the center stagger location (ESMF_STAGGERLOC_CENTER). In addition, the corner coordinates are also put into the corner stagger location (ESMF_STAGGERLOC_CORNER), for conservative regridding. The method ESMF_MeshCreate() is used to create an ESMF_Mesh object, if the source or destination grid is a cubed sphere grid or an unstructured grid. When making this call, the flag convert3D is set to TRUE to convert the 2D coordinates into 3D Cartesian coordinates. Currently, ESMF only supports triangle or quadrilateral element types for a 2D Mesh. Therefore, when the cells in an unstructured grid contain more than four edges, they are broken into multiple triangle elements before ESMF_MeshCreate() is called to create the ESMF_Mesh object. After the calculation of the weight matrix based on the broken up cells, the matrix entries for the triangles are merged together, so that the output matrix is in terms of the original cells. Internally ESMF_FieldRegridStore() is used to generate the weight table and indices table representing the interpolation matrix.
The regridding occurs in 3D to avoid problems with periodicity and with the pole singularity. This application supports four options for handling the pole region (i.e. the empty area above the top row of the source grid or below the bottom row of the source grid). The first option is to leave the pole region empty (``-p none''). In this case, if a destination point lies above the top row or below the bottom row of the source grid, it will fail to map, yielding an error (unless ``-i'' is specified). With the next two options, the pole region is handled by constructing an artificial pole in the center of the top and bottom row of grid points and then filling in the region from this pole to the edges of the source grid with triangles. The pole is located at the average of the positions of the points surrounding it, but moved outward to be at the same radius as the rest of the points in the grid. The difference between these two artificial pole options is the value used at the pole. The default pole option (``-p all'') sets the value at the pole to be the average of the values of all of the grid points surrounding the pole. For the other option (``-p N''), the user chooses a number N from 1 to the number of source grid points around the pole. For each destination point, the value at the pole is then the average of the N source points surrounding that destination point. For the last pole option (``-p teeth'') no artificial pole is constructed; instead, the pole region is covered by connecting points across the top and bottom row of the source Grid into triangles. Because this makes the top and bottom of the source sphere flat, a big enough difference between the size of the source and destination pole regions can still result in unmapped destination points. Only pole option ``none'' is currently supported with the conservative interpolation method (i.e. ``-m conserve'').
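The construction of the artificial pole can be sketched in a few lines of Python. This is an illustration of the description above, not ESMF source code: the function names are hypothetical, the coordinates are assumed to be in degrees, and only the ``-p all'' value rule is shown.

```python
import math

def to_cartesian(lat_deg, lon_deg, radius=1.0):
    """Convert latitude/longitude in degrees to 3D Cartesian coordinates."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    return (radius * math.cos(lat) * math.cos(lon),
            radius * math.cos(lat) * math.sin(lon),
            radius * math.sin(lat))

def artificial_pole(row_points, radius=1.0):
    """Average the 3D positions of the row of points next to the pole,
    then move the result outward to the sphere radius."""
    pts = [to_cartesian(lat, lon, radius) for lat, lon in row_points]
    mean = [sum(p[i] for p in pts) / len(pts) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in mean))
    return tuple(radius * c / norm for c in mean)

def pole_value_all(row_values):
    """Value at the pole for the 'all' option: average of the row values."""
    return sum(row_values) / len(row_values)

# Four points on the top row at 85 degrees north (hypothetical data).
top_row = [(85.0, 0.0), (85.0, 90.0), (85.0, 180.0), (85.0, 270.0)]
pole = artificial_pole(top_row)
value = pole_value_all([1.0, 2.0, 3.0, 4.0])
```

Working in 3D Cartesian coordinates means the average is well defined even though the longitudes wrap around the pole.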
Masking is supported for grids generated from a SCRIP file where the grid_rank=2 (i.e. 2D logically rectangular grids). Masking is currently not supported for unstructured grids. If the variable ``grid_imask'' is set to 0 for a grid point, then that point is considered masked out and won't be used in the weights generated by the application.
If a destination point can't be mapped because it falls outside the unmasked source grid, then the default behavior of the application is to stop with an error. By specifying ``-i'' or the equivalent ``-ignore_unmapped'' the user can cause the application to ignore unmapped destination points. In this case, the output matrix won't contain entries for the unmapped destination points.
This regridding application can be used to generate bilinear, patch, or first-order conservative interpolation weights. The default interpolation method is bilinear. The algorithm used by this application to generate the bilinear weights is the standard one found in many textbooks. Each destination point is mapped to a location in the source Mesh, and the position of the destination point relative to the source points surrounding it is used to calculate the interpolation weights.
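As a minimal sketch of the textbook formula, the Python below computes bilinear weights for a destination point inside a single quadrilateral that has already been mapped to the unit square. The function names and the local coordinates (xi, eta) are assumptions made for this illustration; how ESMF maps a destination point into a source cell on the sphere is not shown here.

```python
def bilinear_weights(xi, eta):
    """Weights for the four corners of the unit square, ordered
    counterclockwise starting at (0, 0). xi and eta lie in [0, 1]."""
    return [(1.0 - xi) * (1.0 - eta),  # corner (0, 0)
            xi * (1.0 - eta),          # corner (1, 0)
            xi * eta,                  # corner (1, 1)
            (1.0 - xi) * eta]          # corner (0, 1)

def bilinear_interpolate(corner_values, xi, eta):
    """Weighted sum of the corner values."""
    weights = bilinear_weights(xi, eta)
    return sum(w * v for w, v in zip(weights, corner_values))

# Corner values sampled from f(x, y) = x; the result at xi = 0.25 is 0.25.
result = bilinear_interpolate([0.0, 1.0, 1.0, 0.0], 0.25, 0.5)
```

The four weights always sum to one, so a constant source field is reproduced exactly.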
This application can also be used to generate patch interpolation weights. Patch interpolation is the ESMF version of a technique called ``patch recovery'' commonly used in finite element modeling [1] [2]. It typically results in better approximations to values and derivatives when compared to bilinear interpolation. Patch interpolation works by constructing multiple polynomial patches to represent the data in a source element. For 2D grids, these polynomials are currently 2nd degree 2D polynomials. The interpolated value at the destination point is the weighted average of the values of the patches at that point.
The patch interpolation process works as follows. For each source element containing a destination point we construct a patch for each corner node that makes up the element (e.g. 4 patches for quadrilateral elements, 3 for triangular elements). To construct a polynomial patch for a corner node we gather all the elements around that node. (Note that this means that the patch interpolation weights depend on the source element's nodes and on the nodes of all elements neighboring the source element.) We then use a least squares fitting algorithm to choose the set of coefficients for the polynomial that produces the best fit for the data in the elements. This polynomial will give a value at the destination point that fits the source data in the elements surrounding the corner node. We then repeat this process for each corner node of the source element, generating a new polynomial for each set of elements. To calculate the value at the destination point we do a weighted average of the values of each of the corner polynomials evaluated at that point. The weight for a corner's polynomial is the bilinear weight of the destination point with regard to that corner. The patch method has a larger stencil than the bilinear method, so the patch weight matrix can be correspondingly larger than the bilinear matrix (e.g. for a quadrilateral grid the patch matrix is around 4x the size of the bilinear matrix). This can be an issue when performing a regrid weight generation operation close to the memory limit on a machine.
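The steps above can be illustrated with a deliberately simplified one-dimensional analogue. This sketch is an assumption-laden illustration, not the ESMF algorithm: elements are intervals with two corner nodes, each patch is a quadratic fitted by least squares to the nodes of the elements touching a corner node, and the blending weights are the linear (1D bilinear) weights. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_quadratic(xs, fs):
    """Least squares fit of c0 + c1*x + c2*x**2 via the normal equations."""
    basis = [[1.0, x, x * x] for x in xs]
    ata = [[sum(b[i] * b[j] for b in basis) for j in range(3)]
           for i in range(3)]
    atf = [sum(b[i] * f for b, f in zip(basis, fs)) for i in range(3)]
    return solve_linear(ata, atf)

def patch_interpolate(xs, fs, k, x):
    """Interpolate at x inside the interior element [xs[k], xs[k+1]]."""
    t = (x - xs[k]) / (xs[k + 1] - xs[k])
    value = 0.0
    # One patch per corner node, blended with the linear weight of that corner.
    for node, weight in ((k, 1.0 - t), (k + 1, t)):
        lo, hi = node - 1, node + 2  # nodes of the elements touching `node`
        c = fit_quadratic(xs[lo:hi], fs[lo:hi])
        value += weight * (c[0] + c[1] * x + c[2] * x * x)
    return value

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
fs = [x * x for x in xs]          # source data sampled from f(x) = x**2
value = patch_interpolate(xs, fs, 1, 1.5)
```

Because every patch is a quadratic, source data that is itself quadratic is reproduced at the destination point up to rounding error, which linear interpolation would not achieve.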
First-order conservative interpolation [4] is also available as a regridding method. This method will typically have a larger interpolation error than the previous two methods, but will do a much better job of preserving the value of the integral of data between the source and destination grid. In this method the value across each source cell is treated as a constant. The weights for a particular destination cell are the areas of intersection of each source cell with the destination cell, divided by the area of the destination cell. Areas in this case are the great circle areas of the polygons which make up the cells (the cells around each center are defined by the corner coordinates in the grid file).
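The weight formula can be illustrated with a one-dimensional analogue in which cells are intervals and lengths stand in for the great circle areas. This is a sketch under that simplifying assumption, with hypothetical function names; it is not how ESMF computes intersections on the sphere.

```python
def overlap(a, b):
    """Length of the intersection of intervals a and b."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def conservative_weights(src_cells, dst_cells):
    """Return {(dst_index, src_index): weight}, where the weight is
    the intersection length divided by the destination cell length."""
    weights = {}
    for i, d in enumerate(dst_cells):
        d_len = d[1] - d[0]
        for j, s in enumerate(src_cells):
            ov = overlap(s, d)
            if ov > 0.0:
                weights[(i, j)] = ov / d_len
    return weights

def remap(weights, src_values, n_dst):
    """Apply the weights to cell-constant source values."""
    dst = [0.0] * n_dst
    for (i, j), w in weights.items():
        dst[i] += w * src_values[j]
    return dst

src_cells = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
dst_cells = [(0.0, 2.0), (2.0, 4.0)]
src_values = [1.0, 2.0, 3.0, 4.0]
w = conservative_weights(src_cells, dst_cells)
dst_values = remap(w, src_values, len(dst_cells))

# The integral (value times cell length) is the same on both grids.
src_integral = sum(v * (c[1] - c[0]) for v, c in zip(src_values, src_cells))
dst_integral = sum(v * (c[1] - c[0]) for v, c in zip(dst_values, dst_cells))
```

When the source cells fully cover a destination cell, the weights for that cell sum to one, which is what preserves the integral.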
The interpolation weights generated by this application are output to a NetCDF file (specified by the "-w" or "--weight" keywords). The format of this file is the same as that generated by SCRIP. See Section 9.5 for a description of the format. Note that the sequence of the weights in the file can vary with the number of processors used to run the application. This means that two weight files generated by using different numbers of processors can contain exactly the same interpolation matrix, but can appear different in a direct line by line comparison (such as would be done by ncdiff).
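One way to compare two such files independently of the ordering is to sort the matrix entries before comparing them. The sketch below assumes the row, col and S arrays have already been read from each file into Python lists (reading NetCDF is not shown), and that each (row, col) pair appears at most once; the function names and the tolerance are hypothetical.

```python
def canonical(row, col, s):
    """Sort matrix entries by (row, col) so ordering no longer matters."""
    return sorted(zip(row, col, s))

def same_matrix(a, b, tol=1e-12):
    """Compare two (row, col, S) triples of lists as sparse matrices."""
    ca, cb = canonical(*a), canonical(*b)
    if len(ca) != len(cb):
        return False
    return all(ra == rb and cola == colb and abs(wa - wb) <= tol
               for (ra, cola, wa), (rb, colb, wb) in zip(ca, cb))

# The same matrix written in two different orders (hypothetical data).
file_1 = ([1, 1, 2], [1, 2, 2], [0.5, 0.5, 1.0])
file_2 = ([2, 1, 1], [2, 2, 1], [1.0, 0.5, 0.5])
identical = same_matrix(file_1, file_2)
```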
The command line arguments are all keyword based. Both the long keyword prefixed with '--' and the one-character short keyword prefixed with '-' are supported. The format to run the application is as follows:
ESMF_RegridWeightGen [--help] [--version]
                     [--source|-s] src_grid_filename
                     [--destination|-d] dst_grid_filename
                     [--weight|-w] out_weight_file
                     [--method|-m] [bilinear|patch|conserve]
                     [--pole|-p] [none|all|teeth|1|2|..]
                     [--ignore_unmapped|-i]
                     --src_type [SCRIP|ESMF]
                     --dst_type [SCRIP|ESMF]
                     -t [SCRIP|ESMF]
                     -r
                     --src_regional
                     --dst_regional
                     --64bit_offset

where
  --help - Print the usage message and exit.
  --version - Print ESMF version and license information and exit.
  --source or -s - a required argument specifying the source grid file name
  --destination or -d - a required argument specifying the destination grid file name
  --weight or -w - a required argument specifying the output regridding weight file name
  --method or -m - an optional argument specifying which interpolation method is used. The value can be one of the following:
      bilinear - for bilinear interpolation, also the default method if not specified.
      patch - for patch recovery interpolation
      conserve - for first-order conservative interpolation
  --pole or -p - an optional argument indicating what to do with the pole. The value can be one of the following:
      none - No pole, the source grid ends at the top (and bottom) row of nodes specified in <source grid>.
      all - Construct an artificial pole placed in the center of the top (or bottom) row of nodes, but projected onto the sphere formed by the rest of the grid. The value at this pole is the average of all the pole values. This is the default option.
      teeth - No new pole point is constructed, instead the holes at the poles are filled by constructing triangles across the top and bottom row of the source Grid. This can be useful because no averaging occurs, however, because the top and bottom of the sphere are now flat, for a big enough mismatch between the size of the destination and source pole regions, some destination points may still not be able to be mapped to the source Grid.
      <N> - Construct an artificial pole placed in the center of the top (or bottom) row of nodes, but projected onto the sphere formed by the rest of the grid. The value at this pole is the average of the N source nodes next to the pole and surrounding the destination point (i.e. the value may differ for each destination point). Here N ranges from 1 to the number of nodes around the pole.
  --ignore_unmapped or -i - ignore unmapped destination points. If not specified the default is to stop with an error if an unmapped point is found.
  --src_type - an optional argument specifying the source grid file type. The value could be either SCRIP or ESMF. Currently, the ESMF file type is only available for the unstructured grid. The default option is SCRIP.
  --dst_type - an optional argument specifying the destination grid file type. The value could be either SCRIP or ESMF. Currently, the ESMF file type is only available for the unstructured grid. The default option is SCRIP.
  -t - an optional argument specifying the file types for both the source and the destination grid files. The default option is SCRIP. If both -t and --src_type or --dst_type are given at the same time and they disagree with each other, an error message will be generated.
  -r - an optional argument specifying that the source and destination grids are regional grids. If the argument is not given, the grids are assumed to be global.
  --src_regional - an optional argument specifying that the source is a regional grid and the destination is a global grid.
  --dst_regional - an optional argument specifying that the destination is a regional grid and the source is a global grid.
  --64bit_offset - an optional argument specifying that the weight file will be created in the NetCDF 64-bit offset format to allow variables larger than 2GB. Note the 64-bit offset format is not supported in the NetCDF version earlier than 3.6.0. An error message will be generated if this flag is specified while the application is linked with a NetCDF library earlier than 3.6.0.
The example below shows the command to generate a set of conservative interpolation weights between a global SCRIP format source grid file (src.nc) and a global SCRIP format destination grid file (dst.nc). The weights are written into file w.nc. In this case the ESMF library and applications have been compiled using an MPI parallel communication library (e.g. setting ESMF_COMM to openmpi) to enable it to run in parallel. To demonstrate running in parallel the mpirun script is used to run the application in parallel on 4 processors.
mpirun -np 4 ./ESMF_RegridWeightGen -s src.nc -d dst.nc -m conserve -w w.nc
The next example does the same thing as the previous one, with three changes. First, the source grid is regional (``--src_regional''). Second, bilinear interpolation (``-m bilinear'') is used; because bilinear is the default, ``-m bilinear'' could also be omitted. Third, some of the destination points are expected not to be found in the source grid; the user accepts this and wants those points to be left out of the weight file instead of causing an error (``-i'').
mpirun -np 4 ./ESMF_RegridWeightGen -i --src_regional -s src.nc -d dst.nc \
       -m bilinear -w w.nc
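When the application is driven from a script, the command line can be assembled programmatically. The Python helper below is a hypothetical convenience wrapper, not part of ESMF; it uses only the flags documented above and assumes that mpirun is the MPI launcher and that the executable is in the current directory, as in the two examples.

```python
def regrid_command(src, dst, weights, method="bilinear", nprocs=1,
                   ignore_unmapped=False, src_regional=False):
    """Build the argument list for ESMF_RegridWeightGen."""
    if method not in ("bilinear", "patch", "conserve"):
        raise ValueError("unknown interpolation method: " + method)
    cmd = []
    if nprocs > 1:
        cmd += ["mpirun", "-np", str(nprocs)]
    cmd.append("./ESMF_RegridWeightGen")
    if ignore_unmapped:
        cmd.append("-i")
    if src_regional:
        cmd.append("--src_regional")
    cmd += ["-s", src, "-d", dst, "-m", method, "-w", weights]
    return cmd

# Reproduces the two examples above as argument lists.
first = regrid_command("src.nc", "dst.nc", "w.nc",
                       method="conserve", nprocs=4)
second = regrid_command("src.nc", "dst.nc", "w.nc", nprocs=4,
                        ignore_unmapped=True, src_regional=True)
```

The resulting list can be passed to subprocess.run from the Python standard library to launch the application.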
A SCRIP format grid file is a NetCDF file and the header of a sample grid file is shown as follows:
netcdf remap_grid_T42 {
dimensions:
        grid_size = 8192 ;
        grid_corners = 4 ;
        grid_rank = 2 ;
variables:
        int grid_dims(grid_rank) ;
        double grid_center_lat(grid_size) ;
                grid_center_lat:units = "radians" ;
        double grid_center_lon(grid_size) ;
                grid_center_lon:units = "radians" ;
        int grid_imask(grid_size) ;
                grid_imask:units = "unitless" ;
        double grid_corner_lat(grid_size, grid_corners) ;
                grid_corner_lat:units = "radians" ;
        double grid_corner_lon(grid_size, grid_corners) ;
                grid_corner_lon:units = "radians" ;

// global attributes:
                :title = "T42 Gaussian Grid" ;
}
The grid_size dimension is the total number of cells in the grid; grid_rank refers to the number of dimensions. grid_rank is 2 for a 2D logically rectangular grid and 1 for an unstructured grid. The integer array grid_dims gives the number of grid cells along each dimension. The number of corners (vertices) in each grid cell is given by grid_corners. Note that if your grid has a variable number of corners on grid cells, then you should set grid_corners to be the highest value and use redundant points on cells with fewer corners. The grid corner coordinates must be written in an order which traces the outside of a grid cell in a counterclockwise order.
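The two corner rules above can be sketched in Python. The helpers are hypothetical and make two assumptions that go beyond the text: redundant points are produced by repeating the last corner, and the orientation check uses the planar shoelace formula on (lon, lat) pairs, which is only meaningful for small cells that do not cross the longitude seam or contain a pole.

```python
def pad_corners(corners, grid_corners):
    """Pad a cell's corner list to grid_corners entries by repeating
    the last corner (one possible choice of redundant points)."""
    if len(corners) > grid_corners:
        raise ValueError("cell has more corners than grid_corners")
    return list(corners) + [corners[-1]] * (grid_corners - len(corners))

def is_counterclockwise(corners):
    """True if the (lon, lat) corners trace the cell counterclockwise,
    based on the sign of the planar shoelace area."""
    twice_area = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return twice_area > 0.0

# A triangular cell stored in a grid whose grid_corners is 4.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
padded = pad_corners(triangle, 4)
```

Repeated corners add zero-length edges, so they change neither the area nor the orientation of the cell.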
The integer array grid_imask is used to mask out grid cells which should not participate in the regridding. The array should be zero for any points that do not participate in the regridding and one for all other points. Coordinate arrays provide the latitudes and longitudes of cell centers and cell corners. The unit of the coordinates can be either "radians" or "degrees".
The regridding weight output file is in NetCDF format and contains some grid information from each grid as well as the regridding indices and weights. The following is the header of a sample output weight file that was generated by regridding a logically rectangular 2D grid to a triangle mesh unstructured grid:
netcdf t42mpas-bilinear {
dimensions:
        n_a = 8192 ;
        n_b = 20480 ;
        n_s = 42456 ;
        nv_a = 4 ;
        nv_b = 3 ;
        num_wgts = 1 ;
        src_grid_rank = 2 ;
        dst_grid_rank = 1 ;
variables:
        int src_grid_dims(src_grid_rank) ;
        int dst_grid_dims(dst_grid_rank) ;
        double yc_a(n_a) ;
                yc_a:units = "degrees" ;
        double yc_b(n_b) ;
                yc_b:units = "radians" ;
        double xc_a(n_a) ;
                xc_a:units = "degrees" ;
        double xc_b(n_b) ;
                xc_b:units = "radians" ;
        double yv_a(n_a, nv_a) ;
                yv_a:units = "degrees" ;
        double xv_a(n_a, nv_a) ;
                xv_a:units = "degrees" ;
        double yv_b(n_b, nv_b) ;
                yv_b:units = "radians" ;
        double xv_b(n_b, nv_b) ;
                xv_b:units = "radians" ;
        int mask_a(n_a) ;
                mask_a:units = "unitless" ;
        int mask_b(n_b) ;
                mask_b:units = "unitless" ;
        double area_a(n_a) ;
                area_a:units = "square radians" ;
        double area_b(n_b) ;
                area_b:units = "square radians" ;
        double frac_a(n_a) ;
                frac_a:units = "unitless" ;
        double frac_b(n_b) ;
                frac_b:units = "unitless" ;
        int col(n_s) ;
        int row(n_s) ;
        double S(n_s) ;

// global attributes:
                :title = "ESMF Offline Regridding Weight Generator" ;
                :normalization = "destarea" ;
                :map_method = "Bilinear remapping" ;
                :conventions = "NCAR-CSM" ;
                :domain_a = "T42_grid.nc" ;
                :domain_b = "grid-dual.nc" ;
                :grid_file_src = "T42_grid.nc" ;
                :grid_file_dst = "grid-dual.nc" ;
                :CVS_revision = "5.3.0 beta snapshot" ;
}
Variables ending with "_a" belong to the source grid and those ending with "_b" belong to the destination grid. For instance, xc_a and yc_a correspond to the grid_center_lon and grid_center_lat variables in the source grid file. The grid information includes the center and corner coordinates and the grid mask array from the input grid file, together with the grid area and grid frac arrays calculated by ESMF_RegridWeightGen. The grid area array is currently computed only by the conservative remapping option; its values are set to zero for bilinear and patch remapping. For conservative remapping, the grid frac array returns the area fraction of the grid cell which participates in the remapping. For bilinear and patch remapping, the destination grid frac array is one where the grid point participates in the remapping and zero otherwise. For bilinear and patch remapping, the source grid frac array is always set to zero.
The indices and weights generated by ESMF_FieldRegridStore() are stored in the output file as variables col, row and S. col and row hold the indices of the source and destination grid cells, respectively. Both are one-dimensional arrays whose length is given by the dimension n_s. S is the weight that multiplies the source value indicated by col; the product is added to the destination value indicated by row to build the final interpolated value at the destination.
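The multiplication described above can be sketched as a short Python loop. The function name is hypothetical, the arrays are assumed to have been read from the weight file already (reading NetCDF is not shown), and the indices are assumed to be 1-based, following the Fortran convention used by SCRIP; verify this against your own weight file before relying on it.

```python
def apply_weights(row, col, s, src_values, n_b):
    """Compute destination values from source values using the
    sparse matrix entries (row, col, S) of a weight file."""
    dst_values = [0.0] * n_b
    for r, c, w in zip(row, col, s):
        # Assumption: row and col are 1-based, so shift to 0-based lists.
        dst_values[r - 1] += w * src_values[c - 1]
    return dst_values

# Three source cells mapped onto two destination cells (hypothetical data).
row = [1, 1, 2]
col = [1, 2, 3]
s = [0.5, 0.5, 1.0]
dst = apply_weights(row, col, s, [10.0, 20.0, 30.0], n_b=2)
```

Because each entry is accumulated independently, the result does not depend on the order in which the entries appear in the file.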