The main product delivered by ESMF is the ESMF library that allows application developers to write programs based on the ESMF API. In addition to the programming library, ESMF distributions come with a small set of applications that are of general interest to the community. These applications utilize the ESMF library to implement features such as printing general information about the ESMF installation, or generating regrid weight files. The provided ESMF applications are intended to be used as standard command line tools.
The bundled ESMF applications are built and installed during the usual ESMF installation process, which is described in detail in the ESMF User's Guide section "Building and Installing the ESMF". After the installation the applications will be located in the ESMF_APPSDIR directory, which can be found as a Makefile variable in the esmf.mk file. The esmf.mk file can be found in the ESMF_INSTALL_LIBDIR directory after a successful installation. The ESMF User's Guide discusses the esmf.mk mechanism to access the bundled applications in more detail in section "Using Bundled ESMF Applications".
The following sections provide in-depth documentation of the bundled ESMF applications. In addition, each application supports the standard --help command line argument, providing a brief description of how to invoke the program.
The ESMF_Info application prints basic information about the ESMF installation to stdout.
The application usage is as follows:
ESMF_Info [--help]

where
  --help    prints a brief usage message
This section describes the offline regridding application provided by ESMF. Regridding, also called remapping or interpolation, is the process of changing the grid that underlies data values while preserving qualities of the original data. Different kinds of transformations are appropriate for different problems. Regridding may be needed when communicating data between Earth system model components such as land and atmosphere, or between different data sets to support operations such as visualization.
Regridding can be broken into two stages. The first stage is generation of an interpolation weight matrix that describes how points in the source grid contribute to points in the destination grid. The second stage is the multiplication of values on the source grid by the interpolation weight matrix to produce values on the destination grid. This occurs through a parallel sparse matrix multiply.
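The second stage can be sketched in a few lines of serial Python. This is only an illustration of the sparse matrix multiply, not ESMF's parallel implementation. The function name `apply_weights`, the 1-based index convention, and the sample triples are assumptions made for this sketch.

```python
def apply_weights(rows, cols, weights, src_field, n_dst):
    """Multiply a sparse weight matrix, given as (row, col, weight)
    triples, by a source field to obtain the destination field."""
    dst_field = [0.0] * n_dst
    for r, c, w in zip(rows, cols, weights):
        # Indices are assumed to be 1-based in this sketch.
        dst_field[r - 1] += w * src_field[c - 1]
    return dst_field


# Hypothetical example: 3 source points, 2 destination points.
src = [10.0, 20.0, 30.0]
rows = [1, 1, 2, 2]      # destination index of each matrix entry
cols = [1, 2, 2, 3]      # source index of each matrix entry
wts = [0.5, 0.5, 0.25, 0.75]
dst = apply_weights(rows, cols, wts, src, n_dst=2)
print(dst)
```

Each matrix entry adds one weighted source value to one destination value, so the cost scales with the number of nonzero weights rather than with the full matrix size.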
There are two options for accessing ESMF regridding functionality: integrated and offline. Integrated regridding is a process whereby interpolation weights are generated via subroutine calls during the execution of the user's code. Integrated regridding can also perform the parallel sparse matrix multiply. In other words, ESMF integrated regridding allows a user to perform the whole process of interpolation within their code. For a further description of ESMF integrated regridding please see Section . In contrast, offline regridding is a process whereby interpolation weights are generated by a separate ESMF application, not within the user code. The ESMF offline regridding application generates only the interpolation matrix; the user is responsible for reading in this matrix and doing the actual interpolation (multiplication by the sparse matrix) in their code. The rest of this section further describes ESMF offline regridding.
For a discussion of installing and accessing ESMF applications such as this one, please see the beginning of this part of the reference manual (Section II). For the quickest approach to just building and accessing the applications, please refer to the "Building and using bundled ESMF applications" Section in the ESMF User's Guide.
As described above, this tool reads in two grid files and outputs weights for interpolation between the two grids. The input and output files are all in NetCDF format. The grid files can be defined in four different formats: the SCRIP format (Section 9.4) as is used as an input to SCRIP [3], the GRIDSPEC Tile grid file format (Section 9.6) following the CF metadata conventions, the ESMF unstructured grid format (Section 9.5) or, in the ESMF 5.3.0 release, the proposed CF unstructured grid format UGRID (Section 9.7). GRIDSPEC is a proposed CF extension for the annotation of complex Earth system grids. In 5.3.0, only a single tile grid file for a rectangular lat/lon grid is supported. For UGRID, only the 2D flexible mesh topology with mixed triangles and quadrilaterals is currently supported.
The weight file is in the same format (Section 9.8) as is output by SCRIP. The interpolation weights can be generated with the bilinear, patch, or first-order conservative methods described below. Masking is supported for 2D logically rectangular (i.e. with grid_rank=2) grids in the SCRIP format. This application can do regrid weight generation from a global or regional source grid to a global or regional destination grid. It assumes that the source and destination grids are on a sphere and that the coordinates given in the files are latitude and longitude values. The coordinates can be either in degrees or in radians (this is indicated by the "units" attribute attached to the coordinate variable). As is true with many global models, this application currently assumes the latitude and longitude refer to positions on a perfect sphere, as opposed to a more complex and accurate representation of the earth's true shape such as would be used in a GIS system. (ESMF's current user base doesn't require this level of detail in representing the earth's shape, but it could be added in the future if necessary.) This file-based regrid weight generation application is parallel. This application is used in the ESMF_RegridWeightGenCheck external demo, which can serve as an example of its use.
This application requires the NetCDF library to read the grid files and write out the weight files in NetCDF format. In addition, it requires LAPACK library calls to generate the patch regridding weights. To compile ESMF with the NetCDF library and the LAPACK library, please refer to the "Third Party Libraries" Section in the ESMF User's Guide for more information.
Internally this application uses the ESMF public API to generate the interpolation weights. If a source or destination grid is logically rectangular, then ESMF_GridCreate() is used to create an ESMF_Grid object. The cell center coordinates of the input grid are put into the center stagger location (ESMF_STAGGERLOC_CENTER). In addition, the corner coordinates are also put into the corner stagger location (ESMF_STAGGERLOC_CORNER) for conservative regridding. The method ESMF_MeshCreate() is used to create an ESMF_Mesh object, if the source or destination grid is a cubed sphere grid or an unstructured grid. When making this call, the flag convert3D is set to TRUE to convert the 2D coordinates into 3D Cartesian coordinates. Currently, ESMF only supports triangle or quadrilateral element types for a 2D Mesh. Therefore, when the cells in an unstructured grid contain more than four edges, they are broken into multiple triangle elements before ESMF_MeshCreate() is called to create the ESMF_Mesh object. After the calculation of the weight matrix based on the broken up cells, the matrix entries for the triangles are merged together, so that the output matrix is in terms of the original cells. Internally ESMF_FieldRegridStore() is used to generate the weight table and indices table representing the interpolation matrix.
The regridding occurs in 3D to avoid problems with periodicity and with the pole singularity. This application supports four options for handling the pole region (i.e. the empty area above the top row of the source grid or below the bottom row of the source grid). The first option is to leave the pole region empty ("-p none"). In this case, if a destination point lies above the top row or below the bottom row of the source grid, it will fail to map, yielding an error (unless "-i" is specified). With the next two options, the pole region is handled by constructing an artificial pole in the center of the top and bottom row of grid points and then filling in the region from this pole to the edges of the source grid with triangles. The pole is located at the average of the positions of the points surrounding it, but moved outward to be at the same radius as the rest of the points in the grid. The difference between these two artificial pole options is the value used at the pole. The default pole option ("-p all") sets the value at the pole to be the average of the values of all of the grid points surrounding the pole. For the other option ("-p N"), the user chooses a number N from 1 to the number of source grid points around the pole. For each destination point, the value at the pole is then the average of the N source points surrounding that destination point. For the last pole option ("-p teeth") no artificial pole is constructed. Instead, the pole region is covered by connecting points across the top and bottom row of the source Grid into triangles. As this makes the top and bottom of the source sphere flat, a big enough difference between the size of the source and destination pole regions can still result in unmapped destination points. Only pole option "none" is currently supported with the conservative interpolation method (i.e. "-m conserve").
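The placement of the artificial pole described above can be illustrated with a short sketch. It assumes a unit sphere and a hypothetical top row of eight points at 87.5 degrees latitude. The helper names `lonlat_to_xyz` and `artificial_pole` are invented for this illustration and are not ESMF routines.

```python
import math


def lonlat_to_xyz(lon_deg, lat_deg, radius=1.0):
    """Convert longitude/latitude in degrees to 3D Cartesian coordinates."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    return (radius * math.cos(lat) * math.cos(lon),
            radius * math.cos(lat) * math.sin(lon),
            radius * math.sin(lat))


def artificial_pole(points, radius=1.0):
    """Average the 3D positions of the row of points, then move the
    result outward so it lies at the same radius as the grid."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    norm = math.sqrt(sum(c * c for c in mean))
    return tuple(radius * c / norm for c in mean)


# Hypothetical top row: 8 points at 87.5 degrees north.
top_row = [lonlat_to_xyz(lon, 87.5) for lon in range(0, 360, 45)]
pole = artificial_pole(top_row)

# With "-p all" the pole value is the average over the whole row.
row_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
pole_value_all = sum(row_values) / len(row_values)
```

For a row that is symmetric around the rotation axis, the averaged point lies on the axis below the surface, and the rescaling places it at the geographic pole.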
Masking is supported for both the logically rectangular grids and the unstructured grids. If the grid file is in the SCRIP format, the variable "grid_imask" is used as the mask. If the value is set to 0 for a grid point, then that point is considered masked out and won't be used in the weights generated by the application. If the grid file is in the ESMF format, the variable "elementMask" is used as the mask. For a grid defined in the GRIDSPEC Tile grid or in the UGRID convention, there is no mask variable defined. However, a GRIDSPEC or a UGRID file may contain both the grid definition and the data. The grid mask is usually constructed using the missing values defined in the data variable. The regridding application provides the argument "--src_missingvalue" or "--dst_missingvalue" for users to specify the variable name from where the mask can be constructed.
If a destination point can't be mapped because it falls outside the unmasked source grid, then the default behavior of the application is to stop with an error. By specifying "-i" or the equivalent "--ignore_unmapped" the user can cause the application to ignore unmapped destination points. In this case, the output matrix won't contain entries for the unmapped destination points.
This regridding application can be used to generate bilinear, patch, or first-order conservative interpolation weights. The default interpolation method is bilinear. The algorithm used by this application to generate the bilinear weights is the standard one found in many textbooks. Each destination point is mapped to a location in the source Mesh, and the position of the destination point relative to the source points surrounding it is used to calculate the interpolation weights.
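As a minimal sketch of the textbook bilinear formula, assume the destination point has already been mapped to local coordinates (xi, eta) in the unit square spanned by a quadrilateral source cell. The mapping step itself is not shown, and the function name `bilinear_weights` and the corner values are hypothetical.

```python
def bilinear_weights(xi, eta):
    """Weights of the four corners of a quadrilateral cell for a point
    at local coordinates (xi, eta), both in [0, 1]. Corners are taken in
    counterclockwise order starting at the local origin."""
    return [(1.0 - xi) * (1.0 - eta),  # corner at (0, 0)
            xi * (1.0 - eta),          # corner at (1, 0)
            xi * eta,                  # corner at (1, 1)
            (1.0 - xi) * eta]          # corner at (0, 1)


w = bilinear_weights(0.25, 0.5)
corner_values = [1.0, 2.0, 4.0, 3.0]   # hypothetical source values
value = sum(wi * vi for wi, vi in zip(w, corner_values))
```

The four weights always sum to one, so a constant source field is reproduced exactly at any destination point inside the cell.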
This application can also be used to generate patch interpolation weights. Patch interpolation is the ESMF version of a technique called "patch recovery" commonly used in finite element modeling [1] [2]. It typically results in better approximations to values and derivatives when compared to bilinear interpolation. Patch interpolation works by constructing multiple polynomial patches to represent the data in a source element. For 2D grids, these polynomials are currently 2nd degree 2D polynomials. The interpolated value at the destination point is the weighted average of the values of the patches at that point.
The patch interpolation process works as follows. For each source element containing a destination point we construct a patch for each corner node that makes up the element (e.g. 4 patches for quadrilateral elements, 3 for triangular elements). To construct a polynomial patch for a corner node we gather all the elements around that node. (Note that this means that the patch interpolation weights depend on the source element's nodes, and the nodes of all elements neighboring the source element.) We then use a least squares fitting algorithm to choose the set of coefficients for the polynomial that produces the best fit for the data in the elements. This polynomial will give a value at the destination point that fits the source data in the elements surrounding the corner node. We then repeat this process for each corner node of the source element, generating a new polynomial for each set of elements. To calculate the value at the destination point we do a weighted average of the values of each of the corner polynomials evaluated at that point. The weight for a corner's polynomial is the bilinear weight of the destination point with regard to that corner. The patch method has a larger stencil than the bilinear method. For this reason the patch weight matrix can be correspondingly larger than the bilinear matrix (e.g. for a quadrilateral grid the patch matrix is around 4x the size of the bilinear matrix). This can be an issue when performing a regrid weight generation operation close to the memory limit on a machine.
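The least squares step for a single corner patch can be sketched as follows. This is an illustration only, not ESMF's internal algorithm (which relies on LAPACK). It fits a 2nd degree 2D polynomial to hypothetical planar node data by solving the normal equations with plain Gaussian elimination. The function names and the sample data are assumptions.

```python
def fit_quadratic_patch(points, values):
    """Least squares fit of
    c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2 to (x, y) samples."""
    rows = [[1.0, x, y, x * x, x * y, y * y] for x, y in points]
    n = 6
    # Normal equations (A^T A) c = A^T b as an augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * v for r, v in zip(rows, values))]
           for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, n):
            factor = aug[k][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[k][j] -= factor * aug[col][j]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


def eval_patch(c, x, y):
    """Evaluate the fitted polynomial at (x, y)."""
    return c[0] + c[1] * x + c[2] * y + c[3] * x * x + c[4] * x * y + c[5] * y * y


# Hypothetical nodes of the elements around one corner (3x3 block).
nodes = [(float(i), float(j)) for i in range(3) for j in range(3)]
data = [1.0 + 2.0 * x - y + 0.5 * x * x for x, y in nodes]
coeffs = fit_quadratic_patch(nodes, data)
patch_value = eval_patch(coeffs, 0.5, 1.5)
```

In the full method, one such patch value would be computed per corner of the source element and the results combined using the bilinear weights of the destination point.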
First-order conservative interpolation [4] is also available as a regridding method. This method will typically have a larger local interpolation error than the previous two methods, but will do a much better job of preserving the value of the integral of data between the source and destination grid. In this method the value across each source cell is treated as a constant. The weights for a particular destination cell are the area of intersection of each source cell with the destination cell divided by the area of the destination cell. Areas in this case are calculated by connecting the corner coordinates of each grid cell (obtained from the grid file) with great circles. If the user doesn't specify the user areas option ("--user_areas"), then the conservation will hold for the great circle areas calculated by ESMF (and these are output to the weight file). This means the following equation will hold: sum-over-all-source-cells(Vsi*A'si) = sum-over-all-destination-cells(Vdj*A'dj), where V is the variable being regridded and A' is the area of a cell as calculated by ESMF. The subscripts s and d refer to source and destination values, and the i and j are the source and destination grid cell indices (flattening the arrays to 1 dimension). If the user does specify the user areas option, then the conservation will be adjusted to work for the areas provided by the user in the grid files (and these areas are output to the weight file). This means the following equation will hold: sum-over-all-source-cells(Vsi*Asi) = sum-over-all-destination-cells(Vdj*Adj), where A is the area of a cell as provided by the user.
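The weight definition and the resulting conservation property can be checked on a deliberately simplified one-dimensional analog, in which cells are intervals and "area" is interval length. The real application intersects spherical polygons bounded by great circles. The cell layouts and function names below are hypothetical.

```python
def overlap(a, b):
    """Length of the intersection of two intervals (lo, hi)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def conservative_weights(src_cells, dst_cells):
    """Weight (j, i) = intersection of source cell i with destination
    cell j, divided by the size of destination cell j."""
    weights = {}
    for j, d in enumerate(dst_cells):
        dst_size = d[1] - d[0]
        for i, s in enumerate(src_cells):
            shared = overlap(s, d)
            if shared > 0.0:
                weights[(j, i)] = shared / dst_size
    return weights


src_cells = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
dst_cells = [(0.0, 1.5), (1.5, 4.0)]
src_vals = [1.0, 2.0, 3.0, 4.0]

w = conservative_weights(src_cells, dst_cells)
dst_vals = [sum(w.get((j, i), 0.0) * src_vals[i]
                for i in range(len(src_cells)))
            for j in range(len(dst_cells))]

# Both sums correspond to sum(V * A) in the equations above.
src_total = sum(v * (c[1] - c[0]) for v, c in zip(src_vals, src_cells))
dst_total = sum(v * (c[1] - c[0]) for v, c in zip(dst_vals, dst_cells))
```

Because the destination cells here are fully covered by the source cells, the two totals agree to rounding error.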
Note that since the conservative method assumes great circle edges to cells, the edges of a cell won't necessarily be the same as a straight line in latitude longitude space. For small edges, this difference will be small, but for long edges it could be significant. This means if the user expects cell edges to be straight lines in latitude longitude space, they should avoid using one large cell with long edges to compute an average over a region (e.g. over an ocean basin). The user should also avoid using cells which contain one edge that runs half way or more around the earth, because the regrid weight calculation assumes the edge follows the shorter great circle path. Also, there isn't a unique great circle edge defined between points on the exact opposite side of the earth from one another (antipodal points). However, the user can work around both of these problems by breaking the long edge into two smaller edges by inserting an extra node, or by breaking the large target grid cells into two or more smaller grid cells. This allows the application to resolve the ambiguity in edge direction.
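One way to choose the extra node when splitting a long edge is the great circle midpoint of its two end points, sketched below. This is a suggestion for preparing grid files, not something the application does automatically. The function names and the unit-sphere assumption are specific to this sketch.

```python
import math


def to_xyz(lon_deg, lat_deg):
    """Longitude/latitude in degrees to a unit vector."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def great_circle_midpoint(p, q):
    """Midpoint of the shorter great circle arc between two
    (lon, lat) points given in degrees."""
    a, b = to_xyz(*p), to_xyz(*q)
    mid = [a[k] + b[k] for k in range(3)]
    norm = math.sqrt(sum(c * c for c in mid))
    if norm < 1e-12:
        # Antipodal end points: the great circle is not unique.
        raise ValueError("antipodal points have no unique midpoint")
    x, y, z = (c / norm for c in mid)
    lon = math.degrees(math.atan2(y, x))
    lat = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    return lon, lat


# A quarter-circumference edge along the equator.
mid = great_circle_midpoint((0.0, 0.0), (90.0, 0.0))
```

For antipodal end points the sketch raises an error, which mirrors the ambiguity described above: the user has to pick the intermediate node by hand.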
It is important to note that the current implementation of conservative regridding doesn't normalize the interpolation weights by the destination fraction. This means that for a destination grid which only partially overlaps the source grid the destination field which is output from the regrid operation should be divided by the corresponding destination fraction to yield the true interpolated values for cells which are only partially covered by the source grid. The fraction also needs to be included when computing the total source and destination integrals.
The following pseudo-code shows how to compute the total source integral (src_total) given the source field values (src_field), the source area (src_area) called area_a in the weight file, and the source fraction (src_frac) called frac_a in the weight file:
src_total=0.0
for each source element i
   src_total=src_total+src_field(i)*src_area(i)*src_frac(i)
end for
The following pseudo-code shows how to compute the total destination integral (dst_total) given the destination field values (dst_field) resulting from the sparse matrix multiply of the weights in the weight file by the source field, the destination area (dst_area) called area_b in the weight file, and the destination fraction (dst_frac) called frac_b in the weight file. It also shows how to adjust the destination field (dst_field) resulting from the sparse matrix multiply by the fraction (dst_frac) called frac_b in the weight file:
dst_total=0.0
for each destination element i
   if (dst_frac(i) not equal to 0.0) then
      dst_total=dst_total+dst_field(i)*dst_area(i)
      dst_field(i)=dst_field(i)/dst_frac(i)
      ! If mass computed here after dst_field adjust, would need to be:
      ! dst_total=dst_total+dst_field(i)*dst_area(i)*dst_frac(i)
   end if
end for
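The two pseudo-code loops translate directly into Python. The sketch below assumes the field, area, and fraction arrays have already been loaded into plain lists (reading area_a, frac_a, area_b, and frac_b from the weight file is not shown). The function names and the sample numbers are hypothetical.

```python
def source_integral(src_field, src_area, src_frac):
    """Total source integral, including the source fraction."""
    return sum(f * a * fr for f, a, fr in zip(src_field, src_area, src_frac))


def destination_integral(dst_field, dst_area, dst_frac):
    """Return the total destination integral and a copy of dst_field
    divided by the destination fraction wherever it is nonzero."""
    total = 0.0
    adjusted = list(dst_field)
    for i, frac in enumerate(dst_frac):
        if frac != 0.0:
            # Integral uses the field before the fraction adjustment.
            total += dst_field[i] * dst_area[i]
            adjusted[i] = dst_field[i] / frac
    return total, adjusted


# Hypothetical data: the second source cell is only half used, the
# first destination cell is 75% covered, the second not covered at all.
src_total = source_integral([2.0, 4.0], [1.0, 1.0], [1.0, 0.5])
dst_total, dst_adjusted = destination_integral(
    [2.0, 0.0], [2.0, 1.0], [0.75, 0.0])
```

In this example both totals equal 4.0, and the adjusted value of the partially covered destination cell is the average of the source data over the covered part only.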
The interpolation weights generated by this application are output to a NetCDF file (specified by the "-w" or "--weight" keywords). The format of this file is the same as that generated by SCRIP. See Section 9.8 for a description of the format.
Note that the sequence of the weights in the file can vary with the number of processors used to run the application. This means that two weight files generated by using different numbers of processors can contain exactly the same interpolation matrix, but can appear different in a direct line by line comparison (such as would be done by ncdiff).
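An order-independent comparison can be sketched by sorting the matrix entries before comparing them. The sketch assumes each weight file has already been read into three parallel lists of row index, column index, and weight (the NetCDF reading step is not shown). The function names and the tolerance are assumptions.

```python
def canonical(rows, cols, weights):
    """Sort the (row, col, weight) triples so entry order is irrelevant."""
    return sorted(zip(rows, cols, weights))


def same_matrix(a, b, tol=1e-12):
    """Compare two sparse matrices, each given as (rows, cols, weights)."""
    entries_a, entries_b = canonical(*a), canonical(*b)
    if len(entries_a) != len(entries_b):
        return False
    return all(ra == rb and ca == cb and abs(wa - wb) <= tol
               for (ra, ca, wa), (rb, cb, wb) in zip(entries_a, entries_b))


# Hypothetical output of two runs on different processor counts.
run_a = ([1, 1, 2], [1, 2, 2], [0.5, 0.5, 1.0])
run_b = ([2, 1, 1], [2, 2, 1], [1.0, 0.5, 0.5])
print(same_matrix(run_a, run_b))
```

The sketch relies on each (row, column) pair appearing at most once per file.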
The command line arguments are all keyword based. Both the long keyword prefixed with '--' and the one character short keyword prefixed with '-' are supported. The format to run the application is as follows:
ESMF_RegridWeightGen
        [--help]
        [--version]
        [--source|-s] src_grid_filename
        [--destination|-d] dst_grid_filename
        [--weight|-w] out_weight_file
        [--method|-m] [bilinear|patch|conserve]
        [--pole|-p] [none|all|teeth|1|2|..]
        [--ignore_unmapped|-i]
        --src_type [SCRIP|GRIDSPEC|ESMF|UGRID]
        --dst_type [SCRIP|GRIDSPEC|ESMF|UGRID]
        -t [SCRIP|GRIDSPEC|ESMF|UGRID]
        -r
        --src_regional
        --dst_regional
        --64bit_offset
        --src_meshname dummy_var_name
        --dst_meshname dummy_var_name
        --src_missingvalue var_name
        --dst_missingvalue var_name
        --user_areas

where
  --help - Print the usage message and exit.
  --version - Print ESMF version and license information and exit.
  --source or -s - a required argument specifying the source grid file name.
  --destination or -d - a required argument specifying the destination grid file name.
  --weight or -w - a required argument specifying the output regridding weight file name.
  --method or -m - an optional argument specifying which interpolation method is used. The value can be one of the following:
        bilinear - for bilinear interpolation, also the default method if not specified.
        patch - for patch recovery interpolation.
        conserve - for first-order conservative interpolation.
  --pole or -p - an optional argument indicating what to do with the pole. The value can be one of the following:
        none - No pole, the source grid ends at the top (and bottom) row of nodes specified in <source grid>.
        all - Construct an artificial pole placed in the center of the top (or bottom) row of nodes, but projected onto the sphere formed by the rest of the grid. The value at this pole is the average of all the pole values. This is the default option.
        teeth - No new pole point is constructed. Instead, the holes at the poles are filled by constructing triangles across the top and bottom row of the source Grid. This can be useful because no averaging occurs. However, because the top and bottom of the sphere are now flat, for a big enough mismatch between the size of the destination and source pole regions, some destination points may still not be able to be mapped to the source Grid.
        <N> - Construct an artificial pole placed in the center of the top (or bottom) row of nodes, but projected onto the sphere formed by the rest of the grid. The value at this pole is the average of the N source nodes next to the pole and surrounding the destination point (i.e. the value may differ for each destination point). Here N ranges from 1 to the number of nodes around the pole.
  --ignore_unmapped or -i - ignore unmapped destination points. If not specified, the default is to stop with an error if an unmapped point is found.
  --src_type - an optional argument specifying the source grid file type. The value can be one of SCRIP, GRIDSPEC, ESMF or UGRID. A SCRIP file can describe either a structured or an unstructured grid. GRIDSPEC is only for structured grids defined in the CF convention. The ESMF and UGRID file types are only available for unstructured grids. The default option is SCRIP.
  --dst_type - an optional argument specifying the destination grid file type. The value can be one of SCRIP, GRIDSPEC, ESMF or UGRID. A SCRIP file can describe either a structured or an unstructured grid. GRIDSPEC is only for structured grids defined in the CF convention. The ESMF and UGRID file types are only available for unstructured grids. The default option is SCRIP.
  -t - an optional argument specifying the file types for both the source and the destination grid files. The default option is SCRIP. If both -t and --src_type or --dst_type are given at the same time and they disagree with each other, an error message will be generated.
  -r - an optional argument specifying that the source and destination grids are regional grids. If the argument is not given, the grids are assumed to be global.
  --src_regional - an optional argument specifying that the source is a regional grid and the destination is a global grid.
  --dst_regional - an optional argument specifying that the destination is a regional grid and the source is a global grid.
  --64bit_offset - an optional argument specifying that the weight file will be created in the NetCDF 64-bit offset format to allow variables larger than 2GB. Note that the 64-bit offset format is not supported in NetCDF versions earlier than 3.6.0. An error message will be generated if this flag is specified while the application is linked with a NetCDF library earlier than 3.6.0.
  --src_meshname - this argument is required if the source grid type is UGRID. It defines the dummy variable name that has all the topology information stored in its attributes.
  --dst_meshname - this argument is required if the destination grid type is UGRID. It defines the dummy variable name that has all the topology information stored in its attributes.
  --src_missingvalue - an optional argument that defines the variable name in the source grid file if the file type is GRIDSPEC. The regridder will generate a mask using the missing values of the data variable. The missing value is defined using an attribute called "_FillValue" or "missing_value".
  --dst_missingvalue - an optional argument that defines the variable name in the destination grid file if the file type is GRIDSPEC. The regridder will generate a mask using the missing values of the data variable. The missing value is defined using an attribute called "_FillValue" or "missing_value".
  --user_areas - an optional argument specifying that the conservation is adjusted to hold for the user areas provided in the grid files. If not specified, then the conservation will hold for the ESMF calculated (great circle) areas. Whichever areas the conservation holds for are output to the weight file.
The example below shows the command to generate a set of conservative interpolation weights between a global SCRIP format source grid file (src.nc) and a global SCRIP format destination grid file (dst.nc). The weights are written into file w.nc. In this case the ESMF library and applications have been compiled using an MPI parallel communication library (e.g. setting ESMF_COMM to openmpi) to enable it to run in parallel. To demonstrate running in parallel the mpirun script is used to run the application in parallel on 4 processors.
mpirun -np 4 ./ESMF_RegridWeightGen -s src.nc -d dst.nc -m conserve -w w.nc
The next example shows the command to do the same thing as the previous example except for three changes. The first change is that this time the source grid is regional ("--src_regional"). The second change is that for this example bilinear interpolation ("-m bilinear") is being used. Because bilinear is the default, we could also omit the "-m bilinear". The third change is that in this example some of the destination points are expected to not be found in the source grid, but the user is ok with that and just wants those points to not appear in the weight file instead of causing an error ("-i").
mpirun -np 4 ./ESMF_RegridWeightGen -i --src_regional -s src.nc -d dst.nc \
       -m bilinear -w w.nc
The default grid file format is SCRIP. To use a grid file in another format, you need to use the switches "--src_type", "--dst_type" or "-t". For example, if the source grid is in UGRID format and the destination grid is in GRIDSPEC format, the command to run the application is:
mpirun -np 4 ./ESMF_RegridWeightGen -s src.nc -d dst.nc \
       -m conserve -w w.nc --src_type UGRID --dst_type GRIDSPEC \
       --src_meshname mesh_dummy
Since the source grid is a UGRID, an additional argument "--src_meshname" needs to be provided. This is the dummy variable used to define all the mesh topology information in the grid file.
The last example shows how to use the missing values of a data variable to generate the grid mask for a GRIDSPEC file and use user defined area for the conservative regridding.
mpirun -np 4 ./ESMF_RegridWeightGen -s src.nc -d dst.nc -m conserve \
       -w w.nc --src_type GRIDSPEC --src_missingvalue datavar \
       --user_areas
In the above example, "datavar" is the variable name defined in the source grid that will be used to construct the mask using its missing values.
A SCRIP format grid file is a NetCDF file and the header of a sample grid file is shown as follows:
netcdf remap_grid_T42 {
dimensions:
      grid_size = 8192 ;
      grid_corners = 4 ;
      grid_rank = 2 ;

variables:
      int grid_dims(grid_rank) ;
      double grid_center_lat(grid_size) ;
         grid_center_lat:units = "radians" ;
      double grid_center_lon(grid_size) ;
         grid_center_lon:units = "radians" ;
      int grid_imask(grid_size) ;
         grid_imask:units = "unitless" ;
      double grid_corner_lat(grid_size, grid_corners) ;
         grid_corner_lat:units = "radians" ;
      double grid_corner_lon(grid_size, grid_corners) ;
         grid_corner_lon:units = "radians" ;

// global attributes:
      :title = "T42 Gaussian Grid" ;
}
The grid_size dimension is the total number of cells in the grid; grid_rank refers to the number of dimensions. grid_rank is 2 for a 2D logically rectangular grid and 1 for an unstructured grid. The integer array grid_dims gives the number of grid cells along each dimension. The number of corners (vertices) in each grid cell is given by grid_corners. Note that if your grid has a variable number of corners on grid cells, then you should set grid_corners to be the highest value and use redundant points on cells with fewer corners.
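Padding a cell that has fewer corners than grid_corners can be sketched as below. Repeating the last corner is one possible choice of redundant point and is an assumption of this sketch, as are the function name and the sample triangle.

```python
def pad_corners(cell_corners, grid_corners):
    """Pad a list of (lon, lat) corners with redundant copies of the
    last corner until it has grid_corners entries."""
    if len(cell_corners) > grid_corners:
        raise ValueError("cell has more corners than grid_corners")
    missing = grid_corners - len(cell_corners)
    return list(cell_corners) + [cell_corners[-1]] * missing


# A triangular cell stored in a grid whose grid_corners is 4.
triangle = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
padded = pad_corners(triangle, 4)
```

The padded cell keeps its shape because the added point coincides with an existing corner.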
The integer array grid_imask is used to mask out grid cells which should not participate in the regridding. The array should be zero for any points that do not participate in the regridding and one for all other points. Coordinate arrays provide the latitudes and longitudes of cell centers and cell corners. The unit of the coordinates can be either "radians" or "degrees".
Both the SCRIP grid file format and the SCRIP weight file format work with the SCRIP 1.4 tools.
ESMF also supports a more general unstructured grid file format for describing meshes. In the ESMF file format, the node coordinates are defined in a separate array nodeCoords. nodeCoords is a two-dimensional array of dimension (nodeCount,coordDim). For a 2D Grid, coordDim is 2. nodeCoords(:,1) contains the longitude coordinates and nodeCoords(:,2) contains the latitude coordinates. The same order applies to centerCoords. The indices to the nodeCoords array are used in the element connectivity array elementConn, and they are 1-based. In the SCRIP format, by contrast, the node coordinates and the connectivity are combined into the grid_corner_lon and grid_corner_lat arrays. Note that the elementConn array must be defined in an order such that the nodes it references trace the outside of a grid cell in a counterclockwise order.
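The 1-based indexing and the counterclockwise requirement can be checked with a signed area test. The sketch below treats longitude and latitude as planar coordinates, which is an assumption that only holds for small cells that do not cross the date line or contain a pole. The function names and the sample cell are hypothetical.

```python
def signed_area(node_coords, conn):
    """Shoelace formula over the nodes of one element.
    node_coords holds (lon, lat) pairs; conn holds 1-based node indices."""
    pts = [node_coords[i - 1] for i in conn]   # convert to 0-based
    total = 0.0
    for k in range(len(pts)):
        x0, y0 = pts[k]
        x1, y1 = pts[(k + 1) % len(pts)]
        total += x0 * y1 - x1 * y0
    return 0.5 * total


def is_counterclockwise(node_coords, conn):
    """True when the element's nodes are listed counterclockwise."""
    return signed_area(node_coords, conn) > 0.0


node_coords = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
good_conn = [1, 2, 3, 4]   # counterclockwise
bad_conn = [1, 4, 3, 2]    # clockwise
```

An element that fails the test can be repaired by reversing the order of its node indices.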
The ESMF format is more general than the SCRIP format because it supports higher dimension coordinates and more general topologies. The following is a sample header of a mesh described in the ESMF format.
netcdf ne4np4-esmf {
dimensions:
      nodeCount = 866 ;
      elementCount = 936 ;
      maxNodePElement = 4 ;
      coordDim = 2 ;

variables:
      double nodeCoords(nodeCount, coordDim) ;
         nodeCoords:units = "degrees" ;
      int elementConn(elementCount, maxNodePElement) ;
         elementConn:long_name = "Node Indices that define the element connectivity" ;
         elementConn:_FillValue = -1 ;
      byte numElementConn(elementCount) ;
         numElementConn:long_name = "Number of nodes per element" ;
      double centerCoords(elementCount, coordDim) ;
         centerCoords:units = "degrees" ;
      double elementArea(elementCount) ;
         elementArea:units = "radians^2" ;
         elementArea:long_name = "area weights" ;
      int elementMask(elementCount) ;
         elementMask:_FillValue = -9999. ;

// global attributes:
      :gridType = "unstructured" ;
      :version = "0.9" ;
      :inputFile = "ne4np4-pentagons.nc" ;
      :timeGenerated = "Fri Apr 16 16:05:24 2010" ;
}
In ESMF 5.3.0, two new NetCDF file formats are added to ESMF_RegridWeightGen, i.e. the CF GRIDSPEC convention for logically rectangular lat/lon grid and the CF UGRID convention for the unstructured grid.
GRIDSPEC is a draft proposal to extend the Climate and Forecast (CF) metadata conventions for the representation of gridded data for Earth System Models. The original GRIDSPEC standard was proposed by V. Balaji and Z. Liang of GFDL. It was further developed by D. Kinley and A. Pletzer of Tech X (see ref). The ESMF implementation is based on the GRIDSPEC document last updated on 2/9/2012.
GRIDSPEC extends the current CF convention to support grid mosaics, i.e., a grid consisting of multiple logically rectangular grid tiles. It also provides a mechanism for storing a grid dataset in multiple files. Therefore, it introduces different types of files, such as a mosaic file that defines the multiple tiles and their connectivity, a host file that aggregates the grid mosaic and the data files, and a tile file for a single tile grid definition.
Currently, ESMF only supports Grid creation and regridding from a single tile grid file. A tile file is a CF compliant NetCDF file for a logically rectangular lat/lon grid based on the CF Metadata Conventions V1.6. An example grid file is shown below. The cell center coordinates are defined in the variables whose attribute standard_name or long_name is set to either latitude or longitude. The latitude and the longitude variables are one-dimensional arrays if the grid is a regular lat/lon grid, and two-dimensional arrays if the grid is curvilinear. The bound coordinate variables define the bound or the corner coordinates of a cell. The bound variable name is specified in the bounds attribute of the latitude and longitude variables. In the following example, the latitude bound variable is lat_bnds and the longitude bound variable is lon_bnds. The bound variables are 2D arrays for a regular lat/lon grid and 3D arrays for a curvilinear grid. The first dimension of the bound array is 2 for a regular lat/lon grid and 4 for a curvilinear grid. The bound coordinates for a curvilinear grid are defined in counterclockwise order. In the example below, the grid is a regular lat/lon grid, thus the coordinate variables are 1D and the bound variables are 2D with the first dimension equal to 2. The bound coordinates will be read in and stored in an ESMF Grid object as the corner stagger coordinates when doing a conservative regrid.
Since a GRIDSPEC tile file does not have a way to specify the grid mask, the mask is usually derived from the missing values stored in a data variable. ESMF_RegridWeightGen provides an option for users to derive the grid mask from a data variable's missing values. The missing value is defined by the variable attribute missing_value or _FillValue. If the value of a data point is equal to the missing value, the grid mask for that grid point is set to 0; otherwise, it is set to 1. In the following grid, the variable so can be used to derive the grid mask. A data variable can be 2D, 3D or 4D. For example, it may have additional depth and time dimensions. The first and second dimensions of the data variable (counted in Fortran order, i.e., the last two dimensions in the CDL example) are assumed to be the longitude and latitude dimensions. ESMF_RegridWeightGen uses the first 2D slice of the data values to derive the grid mask.
netcdf single_tile_grid {
dimensions:
    time = 1 ;
    bound = 2 ;
    lat = 181 ;
    lon = 360 ;
variables:
    double lat(lat) ;
        lat:bounds = "lat_bnds" ;
        lat:units = "degrees_north" ;
        lat:long_name = "latitude" ;
        lat:standard_name = "latitude" ;
    double lat_bnds(lat, bound) ;
    double lon(lon) ;
        lon:bounds = "lon_bnds" ;
        lon:long_name = "longitude" ;
        lon:standard_name = "longitude" ;
        lon:units = "degrees_east" ;
    double lon_bnds(lon, bound) ;
    float so(time, lat, lon) ;
        so:standard_name = "sea_water_salinity" ;
        so:units = "psu" ;
        so:missing_value = 1.e+20f ;
}
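The mask rule described above (0 where the data equals the missing value, 1 elsewhere) can be sketched in a few lines of Python. This is only an illustration of the rule, not the code used by ESMF_RegridWeightGen; the function name derive_mask and the small 2 x 3 slice standing in for the first 2D slice of so are hypothetical.

```python
def derive_mask(values, missing_value):
    """Return a mask with 0 where a value equals missing_value and 1 elsewhere."""
    return [[0 if v == missing_value else 1 for v in row] for row in values]


# Hypothetical 2 x 3 (lat x lon) slice of a variable such as "so",
# using the missing_value from the example file.
MISSING = 1.0e20
so_slice = [
    [35.1, MISSING, 34.8],
    [MISSING, 34.9, 35.0],
]
mask = derive_mask(so_slice, MISSING)
print(mask)
```

When reading real files, the comparison should be made in the variable's own precision (the example stores so as a 32-bit float), since a 32-bit missing value does not compare equal to the 64-bit literal 1.0e20.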
The UGRID file format is a proposed extension to the CF metadata conventions for the unstructured grid data model. The latest proposal can be found here. The proposal is still evolving; the Mesh creation API and ESMF_RegridWeightGen in ESMF 5.3.0 are based on the version updated on March 14, 2012.
In the UGRID proposal, a 1D, 2D, or 3D mesh topology can be defined for an unstructured grid. Currently, ESMF only supports the 2D flexible mesh topology, where each cell (a.k.a. "face" as defined in the UGRID document) in the mesh may have a different number of corner nodes. The main addition of the UGRID extension is a dummy variable that defines the mesh topology and its connectivity. This additional variable has a required attribute standard_name with value "mesh_topology". In addition, it has three more required attributes: dimension, node_coordinates and face_node_connectivity. The value of the dimension attribute should be 2 for a 2D mesh. The value of the node_coordinates attribute is the names of the node longitude and latitude variables. The value of the face_node_connectivity attribute is the name of the variable that defines the corner node indices for each face of the mesh.
In the following example, the dummy mesh topology variable is fvcom_mesh. As described above, its standard_name attribute has to be mesh_topology and dimension attribute has to be 2 for a 2D mesh. It defines the node coordinate variable names to be lon and lat. It also specifies the face/node connectivity variable name as nv.
The variable nv is a two-dimensional array that defines the node indices of each face. The first dimension (counted in Fortran order, i.e., the dimension listed last in the CDL example) defines the maximal number of nodes for each face. In this example, it is a triangle mesh, so the number of nodes per face is 3. Since each face may have a different number of corner nodes, some of the cells may have fewer nodes than the specified dimension. In that case, the unused entries are filled with the missing value defined by the attribute _FillValue. The nodes are listed in counterclockwise order. An optional attribute start_index defines whether the node index is 1-based or 0-based.
The coordinate variables follow the CF metadata convention for coordinates. They are 1D arrays with the attribute standard_name set to either latitude or longitude. The units of the coordinates can be either degrees or radians.
The UGRID files may also contain data variables. The data may be located at the nodes or at the faces. Two additional attributes are introduced in the UGRID extension for the data variables: location and mesh. The location attribute defines where the data is located; it can be either face or node. The mesh attribute defines which mesh topology the variable belongs to, since multiple mesh topologies may be defined in one file. ESMF_RegridWeightGen uses a data variable located on the faces to derive the masks for the mesh cells in the same way as for a GRIDSPEC file. Currently, ESMF only supports masks on the cells, not on the nodes.
When creating an ESMF Mesh from a UGRID file, the user has to provide the name of the mesh topology variable to ESMF_MeshCreate().
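To make the padding and indexing rules concrete, the following Python sketch turns a face/node connectivity table into per-face lists of 0-based node indices, dropping fill entries and applying start_index. It is a hedged illustration rather than ESMF code: the function name faces_to_zero_based, the fill value of -1, and the sample table (one quadrilateral and one triangle, so a maximum of four nodes per face) are all hypothetical.

```python
def faces_to_zero_based(connectivity, fill_value, start_index):
    """Convert padded face/node connectivity to lists of 0-based node indices.

    connectivity: one row per face, padded with fill_value for faces that
                  have fewer nodes than the maximum.
    start_index:  0 or 1, as given by the start_index attribute in the file.
    """
    faces = []
    for row in connectivity:
        # Drop padding entries, then shift to 0-based indexing.
        faces.append([node - start_index for node in row if node != fill_value])
    return faces


FILL = -1  # hypothetical _FillValue
nv = [
    [1, 2, 3, 4],      # quadrilateral face
    [2, 5, 3, FILL],   # triangular face, padded to four entries
]
faces = faces_to_zero_based(nv, FILL, start_index=1)
print(faces)
```

The node order within each face is preserved, so the counterclockwise ordering required by the format carries over to the output.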
netcdf FVCOM_grid2d {
dimensions:
    node = 417642 ;
    nele = 826866 ;
    three = 3 ;
    time = 1 ;
variables:
    // Mesh topology
    int fvcom_mesh;
        fvcom_mesh:standard_name = "mesh_topology" ;
        fvcom_mesh:dimension = 2. ;
        fvcom_mesh:node_coordinates = "lon lat" ;
        fvcom_mesh:face_node_connectivity = "nv" ;
    int nv(nele, three) ;
        nv:standard_name = "face_node_connectivity" ;
        nv:start_index = 1. ;
    // Mesh node coordinates
    float lon(node) ;
        lon:standard_name = "longitude" ;
        lon:units = "degrees_east" ;
    float lat(node) ;
        lat:standard_name = "latitude" ;
        lat:units = "degrees_north" ;
    // Data variable
    float ua(time, nele) ;
        ua:standard_name = "barotropic_eastward_sea_water_velocity" ;
        ua:missing_value = -999. ;
        ua:location = "face" ;
        ua:mesh = "fvcom_mesh" ;
    float va(time, nele) ;
        va:standard_name = "barotropic_northward_sea_water_velocity" ;
        va:missing_value = -999. ;
        va:location = "face" ;
        va:mesh = "fvcom_mesh" ;
}
The regridding weight output file is in NetCDF format and contains some grid information from each grid as well as the regridding indices and weights. The following is the header of a sample output weight file that was generated by regridding a logically rectangular 2D grid to a triangle mesh unstructured grid:
netcdf t42mpas-bilinear {
dimensions:
    n_a = 8192 ;
    n_b = 20480 ;
    n_s = 42456 ;
    nv_a = 4 ;
    nv_b = 3 ;
    num_wgts = 1 ;
    src_grid_rank = 2 ;
    dst_grid_rank = 1 ;
variables:
    int src_grid_dims(src_grid_rank) ;
    int dst_grid_dims(dst_grid_rank) ;
    double yc_a(n_a) ;
        yc_a:units = "degrees" ;
    double yc_b(n_b) ;
        yc_b:units = "radians" ;
    double xc_a(n_a) ;
        xc_a:units = "degrees" ;
    double xc_b(n_b) ;
        xc_b:units = "radians" ;
    double yv_a(n_a, nv_a) ;
        yv_a:units = "degrees" ;
    double xv_a(n_a, nv_a) ;
        xv_a:units = "degrees" ;
    double yv_b(n_b, nv_b) ;
        yv_b:units = "radians" ;
    double xv_b(n_b, nv_b) ;
        xv_b:units = "radians" ;
    int mask_a(n_a) ;
        mask_a:units = "unitless" ;
    int mask_b(n_b) ;
        mask_b:units = "unitless" ;
    double area_a(n_a) ;
        area_a:units = "square radians" ;
    double area_b(n_b) ;
        area_b:units = "square radians" ;
    double frac_a(n_a) ;
        frac_a:units = "unitless" ;
    double frac_b(n_b) ;
        frac_b:units = "unitless" ;
    int col(n_s) ;
    int row(n_s) ;
    double S(n_s) ;

// global attributes:
        :title = "ESMF Offline Regridding Weight Generator" ;
        :normalization = "destarea" ;
        :map_method = "Bilinear remapping" ;
        :conventions = "NCAR-CSM" ;
        :domain_a = "T42_grid.nc" ;
        :domain_b = "grid-dual.nc" ;
        :grid_file_src = "T42_grid.nc" ;
        :grid_file_dst = "grid-dual.nc" ;
        :CVS_revision = "5.3.0 beta snapshot" ;
}
Variables ending in "_a" belong to the source grid and those ending in "_b" belong to the destination grid. For instance, xc_a and yc_a correspond to the grid_center_lon and grid_center_lat variables in the source grid file. The grid information includes the center and corner coordinates and the grid mask arrays (mask_a and mask_b) from the input grid files. The grid areas (area_a and area_b) are either provided by the user or computed by ESMF_RegridWeightGen. The grid area array is only output when the conservative remapping option is used. The values of the area array are set to zero for bilinear and patch remappings. The grid frac arrays (frac_a and frac_b) are calculated by ESMF_RegridWeightGen. For conservative remapping, the grid frac array gives the area fraction of the grid cell that participates in the remapping. For bilinear and patch remapping, the destination grid frac array is one where the grid point participates in the remapping and zero otherwise. For bilinear and patch remapping, the source grid frac array is always set to zero.
The indices and weights generated by ESMF_FieldRegridStore() are stored in the output file as the variables col, row and S. The variables col and row hold the indices of the source and destination grid cells, respectively. Each is a one-dimensional array whose length is defined by the dimension n_s. S holds the weights: each weight is multiplied by the source value indicated by col, and the product is added to the destination value indicated by row. The sum of these contributions is the final interpolated value at the destination.
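The way col, row and S combine can be sketched as a sparse matrix-vector product. The Python example below is an illustration under stated assumptions, not ESMF code: the function name apply_weights and the sample arrays are hypothetical, and the indices in col and row are assumed to be 1-based.

```python
def apply_weights(col, row, weights, src, n_b):
    """Accumulate dst[row] += weight * src[col] over all n_s weight entries.

    Assumption: col and row are 1-based indices into the flattened source
    and destination fields, so 1 is subtracted for Python indexing.
    """
    dst = [0.0] * n_b
    for c, r, w in zip(col, row, weights):
        dst[r - 1] += w * src[c - 1]
    return dst


# Hypothetical weights mapping 3 source points onto 2 destination points.
src = [10.0, 20.0, 30.0]
col = [1, 2, 2, 3]
row = [1, 1, 2, 2]
S = [0.5, 0.5, 0.25, 0.75]
dst = apply_weights(col, row, S, src, n_b=2)
print(dst)
```

Destination points that appear in no row entry keep the initial value of zero in this sketch; an application would typically handle such unmapped points separately.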