4 Processes

The ESMF development environment has several defining characteristics. First, both the ESMF Core Team and the JST are distributed. This makes incorporating simple, efficient communication mechanisms into the development process essential. Second, the JST and Core Team work on a range of different platforms, at sites that don't have the time, resources, or inclination to install demanding packages. Collaboration tools that require no purchase or installation before use are essential. Finally, ESMF is committed to open development. As much as possible, the ESMF team tries to keep the workings of the project - metrics, support and bug lists, schedules, task lists, source code, you name it - visible to the broad community.

4.1 Software Process Model

The ESMF software development cycle is based on the staged delivery model [mcconnell96]. The steps in this software development model are:

Software Concept: Collect and itemize the high-level requirements of the system and identify the basic functions that the system must perform.
Requirements Analysis: Write and review a requirements document - a detailed statement of the scientific and computational requirements for the software.
Architectural Design: Define a high-level software architecture that outlines the functions, relationships, and interfaces for major components. Write and review an architecture document.
Stage 1, 2, ..., n: Repeat the following steps, creating a potentially releasable product at the end of each stage. Each stage produces a more robust, complete version of the software.
- Detailed Design: Create a detailed design document and API specification. Incorporate the interface specification into a detailed design document and review the design.
- Code Construction and Unit Testing: Implement the interface, debug and unit test.
- System Testing: Assemble the complete system and verify that the code satisfies all requirements.
- Release: Create a potentially releasable product, including User's Guide and User's Reference Manual. Code produced at intermediate stages will frequently be used internally.
Code Distribution and Maintenance: Official public release of the software and beginning of the maintenance phase.

We have customized and extended this standard model to suit the ESMF project. At this stage of ESMF development, we are in the iterative design/implement/release cycle. Below are a few notes on earlier stages.

4.2 ESMF Process History

4.2.1 Software Concept

Participants in the ESMF project completed the Software Concept stage in the process of developing a unified set of proposals. A summary of the high-level requirements of ESMF - a statement of project scope and vision - is included in the General Requirements part of the ESMF Requirements Document [bib:ESMFreqdoc]. This was a successful effort in defining the scope of the project and agreeing to an overall design strategy.

4.2.2 Requirements Analysis

The ESMF Team spent about six months at the start of the project producing the ESMF Requirements Document. This outlined the major ESMF capabilities necessary to meet project milestones and achieve project goals. The second part of the document was a detailed requirements specification for each functionality class included in the framework. This document also included a discussion of the process that was used to initially collect requirements. The Requirements Document was a useful reference for the development team, especially for new developers coming in from outside of the Earth science domain. However, as the framework matured, support requests and the Change Review Board process took precedence in defining development tasks and setting priorities. The Requirements Document was bundled with the ESMF source distribution through version 2; it was removed with version 3.

4.2.3 Architectural Design

The project had difficulty with the Architecture Document. The comments received back on the completed work, informally and from a peer review body, indicated that the presentation of the document was ineffective at conveying how the ESMF worked. Although the document was full of detailed and complex diagrams, the terminology and diagrams were oriented to software engineers and were not especially scientist-friendly. The detailed diagrams also made the document difficult to maintain. This experience helped to guide the ESMF project towards more user-oriented documents, but it also left a gap in the documentation that has taken time to fill.

4.3 Ongoing Development

The following are processes the ESMF team is actively following. These guidelines apply to core team developers and outside contributors who will be checking code into the main ESMF repository.

All design and code reviews are telecons held with the JST. Telecons are scheduled with the Core Team Manager, put on the ESMF calendar on the home page of the ESMF website, and announced on the esmf_jst@cgd.ucar.edu list.

4.3.1 Telecon Etiquette

When you call in, it's nice to give your name at the first opportunity. Telecon hosts will make an effort to introduce people on the JST calls, especially first-timers. Please don't put the telecon on hold (we sometimes get telecon-stopping music or beeps this way).

Within a week or so after the telecon, the host (the developer if it's a design or code review) is expected to send out a summary to esmf_jst@cgd.ucar.edu with the date of the call, the participants, and notes or conclusions.

4.3.2 Design Reviews

Introductory telecon(s). The point here is to scope out the problem at hand. These calls cover the following, as they apply.
- Understand the capability needed and review requirements.
- Discuss design alternatives.
- Survey and discuss any existing packages that cover the new functionality.
- Discuss potential use test cases.
- Figure out which customers will be involved in use test case development and identify customers that are likely to provide important input (sometimes we offer a friendly reminder service to these folks before relevant telecons).
For these introductory discussions, any form of material is fine - diagrams, slides, plain text ramblings, lists of questions, ...
Initial design review(s). The document presented should be in the format of the ESMF Reference Manual, either in plain text or in LaTeX/ProTeX. This is so the document can be incorporated into project documentation after implementation. The initial review document should include at least the following sections:
- Class description.
- Use and Examples. Here the examples begin with the very simplest cases.
- API sufficient to cover the examples. This is because it can be difficult to follow the examples (e.g. tell what arguments and options are) without the basic API entries accompanying them.
This step is iterated until developers and customers converge.
Full telecon review(s). The developer should prepare the API specification using LaTeX and ProTeX following the conventions in the Reference Manual. Most of the Reference Manual section(s) for the new or modified class(es), including Class Options and Restrictions and Future Work, should be available at the time of this review. Diagrams should be ready here too.
Use test case telecon review. For each major piece of functionality, a use test case is prepared in collaboration with customers and executed before release. The use test case is performed on a realistic number of processors and with realistic input data sets.
It doesn't have to work (and probably won't) before it's reviewed, but it needs to work before the functionality appears in a release. The developer checks it into the top-level use_test_case directory on SourceForge and prepares an HTML page outlining it for the Test & Validation page on the ESMF website. Unlike unit and system tests, use test cases aren't distributed with the ESMF source.

4.3.3 Implementation and Test Before Internal Release

Code should be written in accordance with the interface specifications agreed to during design reviews and the coding conventions described in Section .

There is an internal release checklist on the Test & Validation page of the ESMF website that contains an exhaustive listing of developer and tester responsibilities for releases. For additional discussion of test and validation procedures, see Section 4.4.

The developer is responsible for working with the tester(s) to make sure that the following are present before an internal release:

100% unit and system test coverage of new interfaces, with the exception of interfaces where type/kind/rank is heavily overloaded. All arguments tested at least once.
Use test cases work

4.3.4 Implementation and Test Before Public Release

There is a public release checklist on the Test & Validation page of the ESMF website that contains an exhaustive listing of developer and tester responsibilities for releases. For additional discussion of test and validation procedures, see Section 4.4.

Same as for internal release, plus:

Design and Implementation Notes section for Reference Manual complete.
Developer and tester ensure that test coverage for new interfaces is sufficient, implementing any additional tests to make it so. This includes testing of options and tests for error handling and recovery.

4.3.5 Code Check-In

Developers are encouraged to check their changes into the repository as they complete them, as frequently as possible without breaking the existing code base.

Both core and contributors should test on at least three compilers before commit.
For core team developers, a mail should go out to esmf_core@cgd.ucar.edu before check-in for very large commits and for commits that will break the HEAD. For contributors a mail should go out to esmf_core@cgd.ucar.edu before ANY commit.
No code commits should be made between 0:00 and 4:00 Mountain Time. During this time the regression test scripts are checking out code, and any commits will lead to inconsistent test results that are hard to interpret.
Core team developers can be set up to receive email from SourceForge for every check-in, by writing esmf_support@ucar.edu with the request.

To accomplish the first item on the list after a commit of source code, an email can be sent to esmftest@cgd.ucar.edu with the exact subject "Run_ESMF_Test_Build". The mailbox is checked every quarter hour on the quarter hour. This email initiates a test on pluto that builds and installs ESMF with four compilers (g95, gfortran, lahey, and nag), with ESMF_BOPT set to "g" and "O".

When the test is started, an email with the subject "ESMF_Test_Builds_Pluto_started" is sent to esmf_core@cgd.ucar.edu, with a time stamp in the body of the message. If a test is already running, an email with the subject "ESMF_Test_Builds_Pluto_not_started" is sent with "Test not started, a test is already running." in the body. The test that is running will run to completion; a new test will NOT be queued up. A new "Run_ESMF_Test_Build" email must be sent when the running test is completed.

When the test is completed, an email with the subject "ESMF_Test_Builds_Pluto" and the test results is sent to esmf-test@lists.sourceforge.net and esmf_test@cgd.ucar.edu. The test results also appear on the Regression Test webpage under the "ESMF_Test_Builds" link towards the top of the page.
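The request handling described above can be summarized in a short Python sketch. It is an illustration only and not the script that runs on pluto: the function names are hypothetical, and ignoring mails with any other subject is an assumption, since the text only says that the subject must match exactly.

```python
from itertools import product

TRIGGER_SUBJECT = "Run_ESMF_Test_Build"
COMPILERS = ("g95", "gfortran", "lahey", "nag")
BOPT_VALUES = ("g", "O")


def build_matrix():
    """Every (compiler, ESMF_BOPT) combination that one test run builds."""
    return list(product(COMPILERS, BOPT_VALUES))


def reply_subject(subject, test_running):
    """Subject of the notification mail, or None if the request is ignored.

    Assumption: a mail whose subject is not exactly TRIGGER_SUBJECT is ignored.
    """
    if subject != TRIGGER_SUBJECT:
        return None
    if test_running:
        # The running test finishes; the new request is NOT queued.
        return "ESMF_Test_Builds_Pluto_not_started"
    return "ESMF_Test_Builds_Pluto_started"
```

With four compilers and two ESMF_BOPT settings, one request covers eight build configurations.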

4.3.6 Code Reviews

All significant chunks of externally contributed code are reviewed by the JST. It's usual to do the code review after check-in. The code review should be scheduled with the Core Team Manager when the code is checked in, and the code review held before the next release.
We also do code reviews with core team members, as desired/required by the JST.

4.3.7 Releases

The ESMF produces internal releases and public releases based on the schedule generated by the CRB. Every public release is preceded by an internal release three months prior, for the purpose of beta testing. During those three months, bugs may be fixed and documentation improved, but no new functionality may be added. Occasionally the Core Team releases an internal release that does not become a public release. This would happen, for example, when major changes are being made to ESMF and user input is needed for multiple preliminary versions of the software.

The Integrator tags new system versions with coherent changes prior to release. The tagging convention for public and internal releases is described in Section .

Prior to release all ESMF software is regression-tested on all platforms and interfaces. The Integrator is responsible for regression testing, identifying problems and notifying the appropriate developers, and collecting and sending out Release Notes and Known Bugs.

ESMF releases are announced on the esmf_jst@cgd.ucar.edu mailing list and are posted on the ESMF website. Source code is available for download from the ESMF website and from the main ESMF SourceForge site.

4.3.8 Backups

The backup strategy for each entity of the ESMF project is as follows:

ESMF CVS source
Run rsync daily on the ESMF cvs repository and roll a tarball. On Sundays roll a tarball with a date stamp and move it to the Pluto archive directory.
ESMF GIT source
Run rsync daily on the ESMF git repository and roll a tarball. On Sundays roll a tarball with a date stamp and move it to the Pluto archive directory.
ESMFCONTRIB CVS source
Run rsync daily on the ESMFCONTRIB cvs repository and roll a tarball. On Sundays roll a tarball with a date stamp and move it to the Pluto archive directory.
ESMF website
On Sundays make a tarball with a date stamp of the ESMF website and move it to the Pluto archive directory.

To conserve storage space, the complete set of backup files is retained only for the current year and the prior year. For earlier years, only one backup file every six months is retained. For example, the retained ESMF CVS backup files for 2010 through 2013 are:

20100103.esmf-cvsroot.tar.gz
20100606.esmf-cvsroot.tar.gz
20110102.esmf-cvsroot.tar.gz
20110605.esmf-cvsroot.tar.gz
20120101.esmf-cvsroot.tar.gz
All of 2012 and 2013

Once a year in January, the backup files of the year before the prior year are cleaned up. For example, in January 2014 all of the backup files for 2012 and 2013 would be in the archive, so the 2012 backup files are cleaned up and only the six-month backup files are retained.
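The retention policy can be sketched in Python as follows. This is a minimal illustration, not the actual cleanup script. It assumes that the "6 month" files are the first backups made in January and in June of each year, which is inferred from the file listing above; the function names are hypothetical.

```python
from datetime import date


def backup_name(day):
    """File name of a dated ESMF CVS tarball, following the listing above."""
    return f"{day:%Y%m%d}.esmf-cvsroot.tar.gz"


def dates_to_keep(backup_dates, current_year):
    """Select the dated backups that survive the yearly cleanup."""
    keep = []
    half_years_seen = set()
    for day in sorted(backup_dates):
        if day.year >= current_year - 1:
            # Current and prior year: every weekly backup is retained.
            keep.append(day)
        elif day.month in (1, 6) and (day.year, day.month) not in half_years_seen:
            # Older years (assumption): keep the first January and June backup.
            half_years_seen.add((day.year, day.month))
            keep.append(day)
    return keep
```

Calling dates_to_keep with current_year=2013 on a set of weekly backups from 2010 onward keeps the two six-month files for 2010 and 2011 and everything from 2012 and 2013.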

4.4 Testing and Validation

ESMF software is subject to the following tests:

Unit tests, which are simple per-class tests.
Test Harness, which runs parameter-space-spanning tests similar to the unit tests.
System tests, which generally involve inter-component interactions.
Use test cases (UTCs), which are tests at realistic problem sizes (e.g., large data sets, processor counts, grids).
Examples that range from simple to complex.
Beta testing through preliminary releases.

Unit tests, system tests, and examples are distributed with the ESMF software. UTCs, because of their size, are stored and distributed separately. Tests are run nightly, following a weekly schedule, on a wide variety of platforms. Beta testing of ESMF software is done by providing an Internal Release to customers three months before public release.

The ESMF team keeps track of test coverage on a per-method basis. This information is on the Metrics page under the Development link on the navigation bar.

Testing information is stored on a Test and Validation web page, under the Development link on the ESMF web site. This web page includes:

separate web pages for each system test and UTC;
links to the Developer's Guide, SourceForge Tracker, Requirements Spreadsheet, and any other pertinent information; and
separate web page for automated regression test information and results.

The ESMF is designed to run on several target platforms, in different configurations, and is required to interoperate with many combinations of application software. Thus our test strategy includes the following.

Tests are executed on as many target platforms as possible.
Tests are executed on a variety of programming paradigms (e.g. pure shared memory, pure distributed memory and a mix of both).
Tests are executed in multiple configurations (e.g. uni-processor, multi-processor).
The result of each test is a PASS/FAIL.
In some cases, for floating point comparisons, an epsilon value will be used.
Tests are implemented for each language interface that is supported.
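The PASS/FAIL rule with an epsilon for floating point comparisons can be sketched as below. This is a Python illustration under the assumption of an absolute tolerance; the function name is hypothetical, and the text above does not say whether ESMF tests use an absolute or a relative epsilon.

```python
def check(actual, expected, epsilon=0.0):
    """Return "PASS" if actual lies within epsilon of expected, else "FAIL".

    Assumption: epsilon is an absolute tolerance. With epsilon=0.0 the
    comparison is exact, as for integer results.
    """
    return "PASS" if abs(actual - expected) <= epsilon else "FAIL"
```

For example, an exact comparison of 0.1 + 0.2 against 0.3 fails because of rounding, while the same comparison with a small epsilon such as 1e-12 passes.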

4.4.1 Unit Tests

Each class in the framework is associated with a suite of unit tests. Typically the unit tests are stored in one file per class, and are located near the corresponding source code in a test directory. The framework make system will have an option to build and run unit tests. The user has the option of building either a "sanity check" type test or an exhaustive suite. The exhaustive tests include tests of many functionalities and a variety of valid and invalid input values. The sanity check tests are a minimum set of tests to indicate whether, for example, the software has been installed correctly. It is the responsibility of the software developer to write and execute the unit tests. Unit tests are distributed with the framework software.

To achieve adequate unit testing, developers shall attempt to meet the following goals.

Individual procedures will be evaluated with at least one unit test function. However, as many test functions as necessary will be implemented to assure that each procedure works properly.
Developers should unit test their code to the degree possible before it is checked into the repository. It is assumed that developers will use stubs as necessary.
Variables are tested for acceptable range and precision.
Variables are tested for a range of valid values, including boundary values.
Unit tests should verify that error handling works correctly.

4.4.1.1 Writing Unit Tests

Unit tests usually test a single argument of a method to make it easier to identify the bug when a unit test fails. There are several steps to writing a unit test. First, each unit test must be labeled with one of the following tags:

NEX_UTest - This tag signifies a non-exhaustive test. These tests are always run and are considered to be sanity tests; they usually consist of creating and destroying a specific class.
EX_UTest - This tag signifies an exhaustive unit test. These tests are more rigorous and are run when the ESMF_EXHAUSTIVE environment variable is set to ON. These unit tests must be placed between the #ifdef ESMF_EXHAUSTIVE and #endif directives in the unit test file.
NEX_UTest_Multi_Proc_Only - These are non-exhaustive multi-processor unit tests that will not be run when the run_unit_tests_uni or unit_tests_uni targets are specified.
EX_UTest_Multi_Proc_Only - These are exhaustive multi-processor unit tests that will not be run when the run_unit_tests_uni or unit_tests_uni targets are specified.

Note that when the NEX_UTest_Multi_Proc_Only or EX_UTest_Multi_Proc_Only tags are used, all the unit tests in the file must be labeled as such. You may not mix these tags with the other tags. In addition, verify that the makefile does not allow the unit tests with these tags to be run uni-processor.

Second, a string is specified describing the test, for example:

	write(name, *) "Grid Destroy Test"

Third, a string to be printed when the test fails is specified, for example:

	write(failMsg, *) "Did not return ESMF_SUCCESS"

Fourth, the ESMF_Test subroutine is called to determine the test results, for example:

	call ESMF_Test((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)

The following two tests are good examples of how unit tests should be written. The first test verifies that getting the attribute count from a Field returns ESMF_SUCCESS, while the second verifies that the attribute count is correct. These two tests could be combined into one with a logical AND statement when calling ESMF_Test, but breaking the tests up allows you to identify the source of the bug immediately.

      !------------------------------------------------------------------------
      !EX_UTest
      ! Getting Attribute count from a Field
      call ESMF_FieldGetAttributeCount(f1, count, rc=rc)
      write(failMsg, *) "Did not return ESMF_SUCCESS"
      write(name, *) "Getting Attribute count from a Field "
      call ESMF_Test((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)

      !------------------------------------------------------------------------
      !EX_UTest
      ! Verify Attribute Count Test
      write(failMsg, *) "Incorrect count"
      write(name, *) "Verify Attribute count from a Field "
      call ESMF_Test((count.eq.0), name, failMsg, result, ESMF_SRCLINE)

      !------------------------------------------------------------------------

Sometimes a unit test is written expecting a subset of the processors to fail the test. To handle this case, the unit test must verify results from each processor as in the unit test below:

    !------------------------------------------------------------------------
    !EX_UTest
    ! Verify that the rc is correct on all pets.
    write(failMsg, *) "Did not return FAILURE  on PET 1, SUCCESS otherwise"
    write(name, *) "Verify rc of a Gridded Component Test"
    if (localPet==1) then
      call ESMF_Test((rc.eq.ESMF_FAILURE), name, failMsg, result, ESMF_SRCLINE)
    else
      call ESMF_Test((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)
    endif

    !------------------------------------------------------------------------

Some tests may require that a loop be written to verify multiple results. The following is an example of how a single tag, NEX_UTest, is used instead of a tag for each loop iteration.

 !-----------------------------------------------------------------------------
  !NEX_UTest
  write(name, *) "Verifying data in Array via Fortran array pointer access"
  write(failMsg, *) "Incorrect data detected"
  looptest = .true.
  do i = -12, -6
    j = i + 12 + lbound(fptr, 1)
    print *, fptr(j), fdata(i)
    if (fptr(j) /= fdata(i)) looptest = .false.
  enddo
  call ESMF_Test(looptest, name, failMsg, result, ESMF_SRCLINE)
  !-----------------------------------------------------------------------------

4.4.1.2 Analyzing unit test results

When unit tests are run, a Perl script prints out the test results as shown in Section "Running ESMF Unit Tests" in the ESMF User's Guide. To print out the test results, the Perl script must determine the number of unit tests in each test file and the number of processors executing the unit test. It determines the number of tests by counting the EX_UTest, NEX_UTest, EX_UTest_Multi_Proc_Only, or NEX_UTest_Multi_Proc_Only tags in the test source file, whichever is appropriate for the test being run. To determine the number of processors, it counts the number of "NUMBER_OF_PROCESSORS" strings in the unit test output Log file. The script then counts the number of PASS and FAIL strings in the test Log file. The Perl script first divides the number of PASS strings by the number of processors. If the quotient is not a whole number, the script concludes that the test crashed. If the quotient is a whole number, the script then divides the number of FAIL strings by the number of processors. The sum of the two quotients must equal the total number of tests; if it does not, the test is marked as crashed.
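The counting logic can be sketched in Python as follows. This is an illustration of the rules described above, not the Perl script itself. The function names, the whole-word tag matching, and the treatment of a Log file without any "NUMBER_OF_PROCESSORS" string as a crash are assumptions.

```python
import re


def count_tags(source_text, tags):
    """Count occurrences of the given test tags in a unit test source file.

    Tags are matched as whole identifiers so that EX_UTest is not also
    counted inside NEX_UTest or EX_UTest_Multi_Proc_Only (assumption about
    how the real script tells the tags apart).
    """
    total = 0
    for tag in tags:
        pattern = r"(?<![A-Za-z_])" + re.escape(tag) + r"(?![A-Za-z_])"
        total += len(re.findall(pattern, source_text))
    return total


def analyze(source_text, log_text, tags):
    """Return (status, passed, failed); status is "OK" or "CRASHED"."""
    n_tests = count_tags(source_text, tags)
    n_procs = log_text.count("NUMBER_OF_PROCESSORS")
    n_pass = log_text.count("PASS")
    n_fail = log_text.count("FAIL")
    # A count that does not divide evenly by the processor count means
    # that at least one processor did not report every test.
    if n_procs == 0 or n_pass % n_procs or n_fail % n_procs:
        return ("CRASHED", 0, 0)
    passed, failed = n_pass // n_procs, n_fail // n_procs
    if passed + failed != n_tests:
        return ("CRASHED", 0, 0)
    return ("OK", passed, failed)
```

A test file with three tags that runs on two processors is therefore expected to produce six PASS or FAIL strings in total; five PASS strings would be reported as a crash.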

4.4.1.3 Disabling unit tests

Sometimes in the software development process it becomes necessary to disable one or more unit tests. To disable a unit test, two lines need to be modified. First, the line calling "ESMF_Test" must be commented out. Second, the NEX_UTest, EX_UTest, NEX_UTest_Multi_Proc_Only and EX_UTest_Multi_Proc_Only tags must be modified so that they are not found by the Perl script that analyzes the test results. The recommended way to modify these tags is to replace the first underscore with "_disable_", thus NEX_UTest becomes NEX_disable_UTest.
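The recommended tag change is a single string substitution. The Python sketch below only illustrates the naming rule; the function name is hypothetical.

```python
def disable_tag(tag):
    """Replace the first underscore of a test tag with "_disable_"."""
    return tag.replace("_", "_disable_", 1)
```

For example, disable_tag("EX_UTest_Multi_Proc_Only") gives "EX_disable_UTest_Multi_Proc_Only", which no longer matches any of the four tags.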

4.4.1.4 Benchmarking Unit Tests

Benchmark testing is included in the ESMF regression tests to detect any unexpected change in the performance of the software. This capability is available to developers. Developers can run the unit tests and save their execution times to be used as a benchmark for future unit test runs.

The following section now appears in the output of "gmake info".

 
--------------------------------------------------------------
 * ESMF Benchmark directory and parameters *
ESMF_BENCHMARK_PREFIX:    ./DEFAULTBENCHMARKDIR
ESMF_BENCHMARK_TOLERANCE: 3%
ESMF_BENCHMARK_THRESHOLD_MSEC: 500
 
--------------------------------------------------------------

The steps for using the benchmarking test tool are as follows:

After building the unit tests, execute "gmake run_unit_tests" and verify that all tests pass. It is not recommended that failing tests be benchmarked.
Set "BENCHMARKINSTALL = YES" and execute "gmake run_unit_tests_benchmark". This causes the unit test stdout files to be copied to the "DEFAULTBENCHMARKDIR" directory. The elapsed times of these unit tests are now the benchmark. The default of DEFAULTBENCHMARKDIR is $ESMF_DIR/DEFAULTBENCHMARKDIR. It is advised that the benchmarking directory be outside the ESMF structure, to allow the developer to benchmark different versions of the software. The benchmark directory can be changed by setting ESMF_BENCHMARK_PREFIX.
Run the unit tests a second time.
To compare the elapsed times of this unit test run to the benchmarked run, set "BENCHMARKINSTALL = NO", and execute "gmake run_unit_tests_benchmark".

According to the default settings above, the benchmarking test will only analyze unit tests that run 500 msecs (ESMF_BENCHMARK_THRESHOLD_MSEC) or longer. If a unit test runs 3 percent (ESMF_BENCHMARK_TOLERANCE) or more beyond the benchmarked unit test, it will be flagged as failing the benchmark test. The developer may change these parameters as desired. The following is an example of the output of running "gmake run_unit_tests_benchmark":

 



The following unit tests with a threshold of 500 msecs. passed the 3% 
 tolerance benchmark test:

PASS: src/Infrastructure/DELayout/tests/ESMF_DELayoutWorkQueueUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldCreateGetUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldRegridCsrvUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldRegridXGUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldStressUTest.F90
PASS: src/Infrastructure/TimeMgr/tests/ESMF_CalRangeUTest.F90
PASS: src/Infrastructure/VM/tests/ESMF_VMBarrierUTest.F90
PASS: src/Infrastructure/VM/tests/ESMF_VMUTest.F90
PASS: src/Infrastructure/XGrid/tests/ESMF_XGridMaskingUTest.F90
PASS: src/Infrastructure/XGrid/tests/ESMF_XGridUTest.F90
PASS: src/Superstructure/Component/tests/ESMF_CompTunnelUTest.F90


The following unit tests with a threshold of 500 msecs. failed the 3% 
 tolerance benchmark test:

FAIL: src/Infrastructure/Field/tests/ESMF_FieldRegridUTest.F90
      Test elapsed time: 4331.446 msec.
      Benchmark elapsed time: 2958.47675 msec.
      Increase: 46.41%

FAIL: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleRegridUTest.F90
      Test elapsed time: 2051.05675 msec.
      Benchmark elapsed time: 1920.42125 msec.
      Increase: 6.8%

FAIL: src/Infrastructure/LogErr/tests/ESMF_LogErrUTest.F90
      Test elapsed time: 2986.40425 msec.
      Benchmark elapsed time: 2583.36775 msec.
      Increase: 15.6%



Found 167 exhaustive multi-processor unit tests files, of those with a 
 threshold of 500 msecs. 11 passed the 3% tolerance benchmark test, and 3 failed.

Benchmark install date: Thu Jun  4 13:26:55 MDT 2015

Note that only the unit tests that have an elapsed time of 500 msecs. or greater are listed. In addition, the date when the benchmark install was completed is displayed.

When a unit test run is benchmarked, it is written to a directory such as "BENCHMARKDIR/test/testg/Darwin.gfortran.64.mpich2.default/". Therefore, elapsed times can only be compared between identical configurations.

To implement the benchmarking tool, the unit tests were modified to record the elapsed time of each PET. The stdout file of each unit test contains lines such as the following:

ESMF_GridItemUTest.stdout: PET 0 Test Elapsed Time 5.7840000000000007 msec.
ESMF_GridItemUTest.stdout: PET 1 Test Elapsed Time 5.7259999999999982 msec.
ESMF_GridItemUTest.stdout: PET 2 Test Elapsed Time 6.6200000000000010 msec.
ESMF_GridItemUTest.stdout: PET 3 Test Elapsed Time 5.7190000000000021 msec.

The benchmarking tool uses the average of the four elapsed times to determine the test results since the elapsed times of each PET can vary.
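The comparison can be sketched in Python as follows. This is an illustration of the rules described in this section, not the benchmarking tool itself. The function names are hypothetical, and applying the threshold to the elapsed time of the current run (rather than to the benchmarked run) is an assumption.

```python
import re

# Matches lines such as "PET 0 Test Elapsed Time 5.7840000000000007 msec."
ELAPSED = re.compile(r"PET\s+\d+\s+Test Elapsed Time\s+([0-9.]+)\s+msec")


def mean_elapsed_msec(stdout_text):
    """Average the per-PET elapsed times recorded in a unit test stdout file."""
    times = [float(m.group(1)) for m in ELAPSED.finditer(stdout_text)]
    if not times:
        raise ValueError("no elapsed-time lines found")
    return sum(times) / len(times)


def benchmark_status(test_msec, bench_msec, tolerance_pct=3.0, threshold_msec=500.0):
    """Return (status, increase in percent) for one unit test.

    Tests faster than the threshold are not analyzed (assumption: the
    threshold applies to the current run).
    """
    if test_msec < threshold_msec:
        return ("SKIPPED", None)
    increase = (test_msec - bench_msec) / bench_msec * 100.0
    return ("FAIL" if increase >= tolerance_pct else "PASS", increase)
```

Applied to the first failing test in the sample output above (4331.446 msec against a benchmark of 2958.47675 msec), the increase evaluates to about 46.41 percent, which matches the reported value.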

4.4.2 Examples

The examples are written to help users understand a specific use of an ESMF capability. The examples appear as text in the ESMF Reference Manual, so care must be taken to ensure that the correct portions of the examples appear in the document. LaTeX tags have been created to designate which portions of the examples are visible in the document.

BOE and EOE are used between text describing the example. BOC and EOC are used between actual working code that appears in the Reference Manual. Below is an example of how the tags are used:

!-------------------------------- Example -----------------------------
!>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%
!BOE
!\subsubsection{Get Grid and Array and other information from a Field}
!\label{sec:field:usage:field_get_default}
!
!  A user can get the internal {\tt ESMF\_Grid} and {\tt ESMF\_Array}
!  from a {\tt ESMF\_Field}.  Note that the user should not issue any destroy command
!  on the retrieved grid or array object since they are referenced
!  from within the {\tt ESMF\_Field}. The retrieved objects should be used
!  in a read-only fashion to query additional information not directly
!  available through the {\tt ESMF\_FieldGet()} interface.
!
!EOE

!BOC
    call ESMF_FieldGet(field, grid=grid, array=array, &
        typekind=typekind, dimCount=dimCount, staggerloc=staggerloc, &
        gridToFieldMap=gridToFieldMap, &
        ungriddedLBound=ungriddedLBound, ungriddedUBound=ungriddedUBound, &
        totalLWidth=totalLWidth, totalUWidth=totalUWidth, &
        name=name, &
        rc=rc)
!EOC
    if(rc .ne. ESMF_SUCCESS) finalrc = ESMF_FAILURE
    print *, "Field Get Grid and Array example returned"

    call ESMF_FieldDestroy(field, rc=rc)
    if(rc .ne. ESMF_SUCCESS) finalrc = ESMF_FAILURE
!>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%

Note that any code or text that is not contained within the tag pairs does not appear in the Reference Manual.
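The selection rule can be illustrated with a short Python sketch. It only models which lines fall inside the tag pairs; it is not how ProTeX is implemented, it does not reproduce the conversion of the selected lines to LaTeX, and the function name is hypothetical.

```python
def documented_lines(source_text):
    """Return the lines enclosed by !BOE/!EOE or !BOC/!EOC tag pairs."""
    kept = []
    inside = False
    for line in source_text.splitlines():
        marker = line.strip()
        if marker in ("!BOE", "!BOC"):
            inside = True       # start of a documented text or code span
        elif marker in ("!EOE", "!EOC"):
            inside = False      # end of the span
        elif inside:
            kept.append(line)
    return kept
```

In the example above, the call to ESMF_FieldGet is selected, while the return code checks and the call to ESMF_FieldDestroy after !EOC are not.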

Most examples can be run on multiple processors or a single processor. Those examples should have the tag "ESMF_EXAMPLE" as a comment in the body of the example file. If the example can only run on multiple processors, use the tag "ESMF_MULTI_PROC_EXAMPLE" instead.

4.4.2.1 Disabling examples

When an example is removed from the makefile, the "ESMF_EXAMPLE" or "ESMF_MULTI_PROC_EXAMPLE" tags must be modified so that the example is not flagged as failed. The recommended way to modify these tags is to replace the first underscore with "_disable_", thus "ESMF_EXAMPLE" becomes "ESMF_disable_EXAMPLE".

4.4.3 System Tests

System tests are written to test functionality that spans several classes. The following areas should be addressed in system testing.

Design omissions (e.g. incomplete or incorrect behaviors).
Associations between objects (e.g. fields, grids, bundles).
Control and infrastructure (e.g. couplers, time management, error handling).
Feature interactions or side effects when multiple features are used simultaneously.

The system tester should issue a test log after each software release is tested, which is recorded on the Test and Validation web page. The test log shall include: a test ID number, a software release ID number, testing environment descriptions, a list of test cases executed, results, and any unexpected events. Bugs should be documented in the SourceForge Bug Tracker and any bug fixes shall be validated.

4.4.3.1 Writing System Tests

System tests should contain the following sections:

Create - Create Components, Couplers, Clock, Grids, States etc.
Register - Register Components and the initialize, run and finalize subroutines.
Initialize - Initialize as needed.
Run - Run the test.
Finalize - Verify results.
Destroy - Destroy all classes.

Most system tests can be run on either a single processor or multiple processors. Those system tests should have the tag "ESMF_SYSTEM_TEST" as a comment in the body of the system test. If the system test can only run on multiple processors, use the tag "ESMF_MULTI_PROC_SYSTEM_TEST" instead.

At the end of the system test, it is recommended that the ESMF_TestGlobal subroutine be used to gather test results from all processors and print out a single PASS/FAIL message instead of individual PASS/FAIL messages from all the processors. After the test is written, it must be documented on the ESMF Test & Validation web page:

http://www.earthsystemmodeling.org/developers/test/system/

4.4.3.2 Disabling system tests

When a system test is removed from the makefile, the "ESMF_SYSTEM_TEST" or "ESMF_MULTI_PROC_SYSTEM_TEST" tags must be modified so that the system test is not counted as failed. The recommended way to modify these tags is to replace the first underscore with "_disable_", thus "ESMF_SYSTEM_TEST" becomes "ESMF_disable_SYSTEM_TEST".

4.4.4 Test Harness

The Test Harness is a highly configurable test control system for conducting thorough testing of the Regridding and Redistribution processes. The Test Harness consists of a single shared executable and a collection of customizable resource files that define an ensemble of test configurations tailored to each ESMF class. The Test Harness is integrated into the Unit test framework, enabling the Test Harness to be built and run as part of the Unit tests. The test results are reported to a single standard-out file which is located with the unit test results.

See section for a complete discussion of the test harness.

4.4.4.1 Analyzing Test Harness results

When the Test Harness completes a run, the results from the ensemble of tests are reported in two ways. The first is analogous to the unit test reporting: because the test harness is run as part of the unit tests, a summary of the results is recorded just as with the unit tests. In addition to the standard unit test reporting, the test harness can also produce a human-readable report. The report consists of a concise summary of the test configuration along with the test results. The test configuration is described in terms of the Field Taxonomy syntax and user-provided strings. The intent is not to provide an exhaustive description of the test, but rather to provide a useful description of the failed tests.

Consider another example similar to the previous one, where two descriptor strings describe an ensemble of regridding tests. The first uses the patch method and the second uses bilinear interpolation.

[ B1 G1; B2 G2 ] =P=> [ B1 G1; B2 G2 ] 
[ B1 G1; B2 G2 ] =B=> [ B1 G1; B2 G2 ]

Suppose the associated specifier files indicate that the source grid is rectilinear and is 100 X 50 in size. The destination grid is also rectilinear and is 80 X 20 in size. Both grids are block distributed in two ways, 1 X NPETS and NPETS X 1. Suppose also that the first dimension of both the source and destination grids is periodic. If the test succeeds for the bilinear regridding, but fails for one of the patch regridding configurations, the reported results could look something like

SUCCESS: [B1 G1; B2 G2 ] =B=> [B1 G1; B2 G2 ] 
FAILURE: [B1{1} G1{100}+P; B2{npets} G2{50} ] =P=> [B1{1} G1{80}+P; B2{npets} G2{20} ] 
     failure at line 101 of test.F90
SUCCESS: [ B1{npets} G1{100} +P; B2{1} G2{50} ] =P=> [ B1{npets} G1{80}+P; B2{1} G2{20} ]

The report indicates that all the test configurations for the bilinear regridding are successful. This is indicated by the key word SUCCESS which is followed by the successful problem descriptor string. Since all of the tests in the first case pass, there is no need to include any of the specifier information. For the second ensemble of tests, one configuration passed, while the other failed. In this case, since there is a mixture of successes and failures, the report includes specifier information for all configurations to help indicate the source of the test failure. The supplemental information, while not a complete problem description since it lacks items such as the physical coordinates of the grid and the nature of the test field, includes information crucial to isolating the failed test.
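A report in this form is easy to summarize mechanically. The following is a minimal Python sketch, assuming only the layout shown in the sample above (a SUCCESS or FAILURE keyword, a colon, then the descriptor string); the function name summarize_report is hypothetical and is not part of the Test Harness.

```python
REPORT = """\
SUCCESS: [B1 G1; B2 G2 ] =B=> [B1 G1; B2 G2 ]
FAILURE: [B1{1} G1{100}+P; B2{npets} G2{50} ] =P=> [B1{1} G1{80}+P; B2{npets} G2{20} ]
     failure at line 101 of test.F90
SUCCESS: [ B1{npets} G1{100} +P; B2{1} G2{50} ] =P=> [ B1{npets} G1{80}+P; B2{1} G2{20} ]
"""


def summarize_report(report: str) -> dict:
    """Count SUCCESS/FAILURE entries and collect the failed descriptors."""
    counts = {"SUCCESS": 0, "FAILURE": 0}
    failures = []
    for line in report.splitlines():
        keyword, sep, descriptor = line.partition(":")
        keyword = keyword.strip()
        # Lines without a recognized keyword (e.g. the "failure at line"
        # detail line) are skipped.
        if sep and keyword in counts:
            counts[keyword] += 1
            if keyword == "FAILURE":
                failures.append(descriptor.strip())
    return {"counts": counts, "failures": failures}


summary = summarize_report(REPORT)
print(summary["counts"])
for descriptor in summary["failures"]:
    print("failed:", descriptor)
```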

4.4.5 Use Test Cases (UTCs)

Use Test Cases are problems of realistic size created to test the ESMF software. They were initiated when the ESMF team and its users saw that ESMF capabilities could often pass simple system tests but then fail in the field on real customer problems. UTCs have realistic processor counts, data set sizes, and grid and data array sizes. UTCs are listed on the Test & Validation page of the ESMF website. They are not distributed with the ESMF software; instead they are stored in a separate module in the main repository called use_test_cases.

4.4.6 Beta Testing

ESMF software is released in a beta form, as an Internal Release, three months before it is publicly released. This gives users a chance to test the software and report back any problems to support.

4.4.7 Automated Regression Tests

The purpose of regression testing is to reveal faults caused by new or modified software (e.g. side effects, incompatibility between releases, and bad bug fixes). Regression tests regularly exercise all interfaces of the code on all target platforms. The regression test results for the last two weeks can be found here. This web page provides a complete, color-coded, current view of the state of the trunk ESMF software; options for sorting by platform or compiler are provided. A similar test results web page for the branch is also available. Clicking on any of the cells will display the specific test report for that day. Hovering over a test name, e.g. Blues gfortran, will reveal notes particular to that platform/compiler. Clicking on the test name will take you to the home page of the platform.

The platforms that run the regression tests email the test results to a server that updates the test results web page. A script checks for test reports every 15 minutes and updates the web page. The time of the last update appears on the web page.

4.4.8 Investigating Test Failures

The regression test results web page provides a color-coded view of the state of the software. When a developer finds that a test fails on a particular platform with a particular compiler, sometimes the bug is readily identified and fixed. At other times, however, the developer may want to know whether the test fails on other platforms and whether the failure is related to a compiler, MPI configuration, or optimized/debug execution. To find out from the web page alone, the developer would need to click through the cells of every platform, searching for the results of that particular test.

A tool was created to allow developers to query the test results for a specific test on a specific date, as long as it is within two weeks of the current date. The developer may send a query test results message to the following email address: esmftest@cgd.ucar.edu. The subject of the email must be exactly "Test_Results_Query". The body of the email message must be "Test:" followed by the test name and "Date" followed by the desired date. The date format must be a three-letter month and a day number. If the day has two digits (greater than 9), insert one space between the month and day, e.g. Apr 25. If the day is a single digit, insert two spaces between the month and day, e.g. Apr  4.

Test:ESMF_FieldBundleSMMUTest.F90
Date:Feb  8
   or
Date Feb 28
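The spacing rule for the date is easy to get wrong by hand, so a small helper may be useful. The following Python sketch is illustrative only: the function names are hypothetical, and it assumes the "Date:" form (with a colon) used in the first example above. The month abbreviations are hard-coded so the output does not depend on the system locale.

```python
from datetime import date

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

QUERY_SUBJECT = "Test_Results_Query"  # must match exactly


def format_query_date(day: date) -> str:
    """Three-letter month, then the day right-aligned in two columns.

    A single-digit day is therefore preceded by two spaces in total,
    and a two-digit day by one space.
    """
    return f"{MONTHS[day.month - 1]} {day.day:2d}"


def build_query(test_name: str, day: date) -> tuple:
    """Return the (subject, body) pair for a test results query."""
    body = f"Test:{test_name}\nDate:{format_query_date(day)}\n"
    return QUERY_SUBJECT, body


subject, body = build_query("ESMF_FieldBundleSMMUTest.F90", date(2024, 2, 8))
print(subject)
print(body)
```

The year passed to date() is arbitrary here, since the query format carries only the month and day.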

This mailbox is checked every quarter hour on the quarter hour, and the results are emailed to esmf_test@cgd.ucar.edu. The subject of the results email for this example would be:

        ESMF_FieldBundleSMMUTest.F90 test results for Feb  8

The body of the email would be as follows:

	ESMF_Blues_PGI:PASS: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Blues_PGI:PASS: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Blues_PGI:CRASHED: mpich3/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Blues_PGI:PASS: mpich3/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Blues_PGI:PASS: openmpi/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Blues_PGI:PASS: openmpi/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Discover_g95:PASS: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Discover_g95:PASS: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Haumea_g95:PASS: mpich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Haumea_g95:PASS: mpich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Haumea_g95:PASS: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Haumea_g95:PASS: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Pluto_g95:FAIL: mpich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Pluto_g95:FAIL: mpich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Pluto_g95:FAIL: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
	ESMF_Pluto_g95:FAIL: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90

Note that if the date of the query is the current day, the developer should query periodically during the day since the test results are being updated as platforms report their test results. If a test crashes it can be because another test hung and the test in question did not run.
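When the results email is long, the non-passing configurations can be extracted with a short script. The following Python sketch assumes the four colon-separated fields shown in the sample body above (platform/compiler, status, MPI/build configuration, source path); the function name non_passing is hypothetical.

```python
from collections import defaultdict

SAMPLE_BODY = """\
ESMF_Blues_PGI:PASS: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Blues_PGI:CRASHED: mpich3/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Pluto_g95:FAIL: mpich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Pluto_g95:FAIL: mpich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
"""


def non_passing(body: str) -> dict:
    """Map each platform to its (configuration, status) pairs that did not pass."""
    problems = defaultdict(list)
    for line in body.splitlines():
        parts = [part.strip() for part in line.split(":", 3)]
        if len(parts) != 4:
            continue  # not a result line
        platform, status, config, _path = parts
        if status != "PASS":
            problems[platform].append((config, status))
    return dict(problems)


for platform, entries in non_passing(SAMPLE_BODY).items():
    print(platform, entries)
```

Grouping by platform makes it easier to see whether a failure follows the platform, the MPI implementation, or the debug/optimized build.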

This tool is also useful when a developer adds a new test: after the nightly tests run, the developer can run a query to quickly see the test results.

4.4.9 Building the Documentation

As software development progresses, the documentation is updated, built and posted at http://earthsystemmodeling.org/docs/nightly/develop/dev_guide/

The documents are built daily in the early morning, and the results of the builds are posted at http://earthsystemmodeling.org/doc/

Developers can update these documents by checking them out from the repository and submitting the edited files. To have the new version of the documents posted on the web, the developer must send a request to the following email address: esmftest@cgd.ucar.edu. The subject of the email indicates which document to build and post. The following is the list of subjects that have been implemented:

Build_Dev_Guide_Doc
Build_NUOPC_Doc
Build_Ref_Doc
Build_ESMPy_Doc
Build_CICE_NUOPC_CAP_Doc
Build_HYCOM_NUOPC_CAP_Doc
Build_LIS_NUOPC_CAP_Doc
Build_MOM_NUOPC_CAP_Doc
Build_WRFHYRO_NUOPC_CAP_Doc

A script checks for document build requests every quarter hour on the quarter hour. When a request is found, a document build is started. On successful completion, the document is updated on the web and the document build results are updated. An email will be sent to esmf_test@cgd.ucar.edu and esmf-test@lists.sourceforge.net when the build is done.
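Because the subject line selects the document to build, it can help to validate it before sending. The following Python sketch builds such a request with the standard library email package. It is an illustration, not an ESMF tool: the function name and the sender address are hypothetical, the message body is left unset because only the subject is specified above, and actually sending the message is not shown.

```python
from email.message import EmailMessage

BUILD_ADDRESS = "esmftest@cgd.ucar.edu"

# Subjects listed above as implemented.
BUILD_SUBJECTS = frozenset({
    "Build_Dev_Guide_Doc",
    "Build_NUOPC_Doc",
    "Build_Ref_Doc",
    "Build_ESMPy_Doc",
    "Build_CICE_NUOPC_CAP_Doc",
    "Build_HYCOM_NUOPC_CAP_Doc",
    "Build_LIS_NUOPC_CAP_Doc",
    "Build_MOM_NUOPC_CAP_Doc",
    "Build_WRFHYRO_NUOPC_CAP_Doc",
})


def build_doc_request(subject: str, sender: str) -> EmailMessage:
    """Create a document build request, rejecting unknown subjects."""
    if subject not in BUILD_SUBJECTS:
        raise ValueError(f"not an implemented build subject: {subject!r}")
    message = EmailMessage()
    message["From"] = sender
    message["To"] = BUILD_ADDRESS
    message["Subject"] = subject
    return message


request = build_doc_request("Build_Ref_Doc", "developer@example.org")
print(request["To"], request["Subject"])
```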

4.4.10 Testing for Releases

We provide two types of tar files, the ESMF source and the shared libraries of the supported platforms. Consequently, there are two test procedures followed before placing the tar files on the ESMF download website.

The Source Code Test Procedure is followed on all the supported platforms for the particular release.

Verify that the source code builds in both BOPT=g and BOPT=O.
Verify that the ESMF_COUPLED_FLOW demonstration executes successfully.
Verify that the unit tests run successfully, and that there are no NON-EXHAUSTIVE unit test failures.
Verify that all system tests run successfully.

The Shared Libraries Test Procedure is also followed on all supported platforms for a release.

Change to the CoupledFlowEx directory and execute gmake. Verify that the demo runs successfully.
Change to the CoupledFlowSrc directory and execute gmake then gmake run. Verify that the demo runs successfully.
Change to the examples directory and execute gmake and gmake run. Verify that the example runs successfully.

4.5 User Support

4.5.1 Roles

The Advocate is the staff person assigned to a particular code, e.g. GEOS-5. See section 2.1.1 for a full definition and list of responsibilities. The Handler is the staff person assigned to solve a support ticket. The Advocate and the Handler may be the same person or they may be different. See section 2.1.1 for a complete definition and list of responsibilities.

4.5.2 Support Categories

New is a request that has not been replied to. Closed is a request that has been fixed to the user's satisfaction. Pending is a request that has been fixed to the Handler's satisfaction but has not yet been approved by the user.

4.5.3 Summary Work Flow

Message received.
The Integrator, or in his absence the Support Lead, generates a SourceForge Bug, Feature, or Support Request ticket.
If the request contains more than one topic, the Integrator will open multiple tickets, one per topic. This can be done initially if obvious, or later if more research indicates it is necessary.
- The top line of the entry should be WHO: <Requester Name>.
- Indicate the institution and model if known.
- Keep title of initial email and the title of the SF ticket the same or close enough to be able to determine they are one and the same.
- Assign the ticket to the staff person best able to solve the ticket's issue.
Initial contact is made by:
- The Handler assigned by the Integrator in the ticket.
- The Support Lead if the Handler will be unavailable for more than a week.
The Handler works to solve the ticket's issues. He or she will communicate periodically with the ticket's originator and will keep the rest of the Core team informed of the ticket's progress at the monthly ticket review meetings. Once the issue has been solved, the ticket will be marked pending by the Handler.
At this point, the Handler contacts the originator to gauge their satisfaction with the solution. If the originator is satisfied, the ticket may be closed, and the mail folder on the IMAP server moved from Open to Closed by the Support Lead. If the customer does not respond, an attempt at contact will be made once a month for two months. If after this period, the originator still does not reply, a pending ticket may be closed with final notification to the originator.

4.5.4 General Guidelines for Handling Tickets

Include title and ticket number on all correspondence.
Make initial contact within 48 hours even if just to say message received.
The email address for ticket originators can be found in either freeCRM or the mail archive. Do not hesitate to contact the Support Lead if a required email address cannot be found.
Copy esmf_support@ucar.edu on all replies.
Bugs that are fixed should be marked Closed, and Fixed. They should never be deleted.
Bugs that are duplicates should be marked Deleted, and Duplicate.
If the main issue in a Bug, Feature Request, or Support Request has not been implemented it should stay Open.
Users are always notified via email when their ticket is being closed even if they have been unresponsive.
If the solution to a ticket involves a test code, this should be incorporated into the code body as a standard test. It should not be sent to the user as an unofficial code fragment.
If the solution to a ticket involved changes to the code, the user should be given a stable beta snapshot. The user should not be directed to the HEAD, which is inherently unstable.
If a ticket involves an older version of the code and a computing environment that the current distribution runs on, the ticket should be considered for closure when there is no means of testing or fixing the older code.
The Handler is responsible for changing the status of tickets assigned to them.

4.5.5 esmf_support@ucar.edu Mail Archives

The Support Lead manages the archive of esmf_support@ucar.edu email traffic and is responsible for the creation of ticket folders, component folders, and the proper placement of mail messages. The archive is located on the main CISL IMAP server and can be accessed by any Core member. Contact the Support Lead if you wish to have your local mail client enabled to view the archive. The IMAP archive will have the following appearance:

Component Name
- Open
  - Numbered Ticket Folder
  - Numbered Ticket Folder
  - Numbered Ticket Folder
- Closed
  - Numbered Ticket Folder
  - Numbered Ticket Folder
  - Numbered Ticket Folder
Component Name

The following rules apply to the above:

Email messages will be filed by component and number.
A folder labeled with the request number will be created.
This folder will then be placed in the component's Open folder until closed.
The Support Lead will copy each related email message to its numbered folder.
When a ticket has been closed, the Support Lead will move the numbered folder from the component's Open folder to its Closed folder.
There will be only one New folder, in which highly active tickets may be placed for easier filing at the discretion of the Support Lead.

4.5.6 INFO:Code (subject) mail messages

Advocates need to share the information they have received from their codes with the rest of the Core team. This will be done by sending an email to esmf_support@ucar.edu with a subject line labeled INFO: Code e.g. INFO: CCSM, INFO: GEOS-5. These messages will be filed on the IMAP server (see above section) under the code referenced. All information about a code that is general and not related to a specific support request will be archived in this manner.

4.5.7 freeCRM

A client relationship management tool (freeCRM http://www.freecrm.com) is being used to archive codes, their affiliated contacts, degree of componentization, issues, and applicable funding information if known. The following is a list of roles and responsibilities associated with this software:

Advocates are responsible for the accuracy and completeness of all information associated with codes to which they are assigned. This information includes a pull down menu that specifies the state of the code's ESMF'ization. This piece of information is critical and needs to be updated whenever an Advocate updates his or her codes. Other information includes type of code, parent agency etc. This information will be reviewed on a semi-annual basis.
The Integrator is responsible for creating a backup of all freeCRM data on a monthly basis.
The Core Team Manager is responsible for the accuracy and completeness of all funding related information.
The Support Lead is responsible for creating code 'companies' and informing the Integrator of any additions so that the backup scripts can be modified. He or she is additionally responsible for conducting semi-annual quality control checks of all information in the system.
All team members are responsible for updating and adding to the list of contacts.

4.5.8 Annual Code Contact

Once a year, all codes in the freeCRM database will be contacted in order to gauge their development progress and to update our component metrics. This process will contain the following steps:

Advocates will log in to freeCRM and get a list of all their codes. This list will be emailed to esmf_core@cgd.ucar.edu.
Advocates will review their list and determine which codes on the list need to be contacted. Contact is not needed if sufficient information is already known about a code.
Advocates will review all the information contained in freeCRM concerning their assigned codes AND review all the esmf_support traffic for the last year.
Advocates will draft the contact email and send it to esmf_core@cgd.ucar.edu to be reviewed by the Core Team Manager. Once corrected, the Advocates may send their email. Since this is a group level effort, the email message may be signed “The ESMF Team” if desired.
The Support Lead will track the draft and completed emails as well as the responses and will provide a report to the Core Team Manager at the end of the process.
As responses come in, the Advocates are responsible for updating the information in freeCRM.
The Support Lead will tally the results and update the components page on the ESMF Web site and will also update the components metric chart.

4.5.9 Dealing with Applications that use ESMF

More and more applications are being distributed with embedded ESMF interfaces. It may be difficult to determine whether a reported problem with one of these applications is related to an incorrect ESMF implementation, a true ESMF bug, or an issue within the parent model. The following are several definitions:

End User: A person who downloads or otherwise receives an application that contains ESMF code. While they may be trying to modify this application, they were not the person or persons who originally inserted ESMF into the application. Most likely, they will be entirely unfamiliar with ESMF.
Application Developer: The person or persons who took a model, inserted ESMF code, and made the resulting application available to others.

The following are some guidelines for dealing with such Applications that use ESMF:

For support requests related to applications that include ESMF, our primary contact for resolving the request should be the developers of the distributed application and not the End User. As such, every effort should be made to identify and contact the developers of the distributed application in order to make them aware of the reported issue and to get them actively engaged in resolution of the problem. Additionally, they should be cc'ed on all correspondence with the End User.
During the resolution of the issue, it will be necessary to cc all email traffic to the End User. In dealings with the End User, emphasize that the ESMF group is committed to any user of ESMF regardless of source. That commitment is predicated, however, on participation of the application developers.
The Handler should establish which version of ESMF the application is using.
The Handler should try to determine whether the ESMF code in question was modified in any way by the Application Developers.
The Handler should try to determine whether the code in question has ESMF interface names but is not ESMF code. The time manager in WRF falls into this category. It has ESMF interfaces but was not developed by us.
It will be solely the Core Team's discretion whether or not to support older versions of ESMF, ESMF code that has been modified by others, or code that uses ESMF interface names but was developed entirely separate from ESMF.
In no way should the Handler try running the End User's code.
In the event that the developers of the distributed application are unknown, unreachable, or uncooperative, the End User must be politely informed that the group cannot troubleshoot code belonging to another group. This will have to be handled with a degree of sensitivity because it is likely that the End User has already tried to contact the application developers without success.

esmf_support@ucar.edu