4 Processes
The ESMF development environment has several defining characteristics.
First, both the ESMF Core Team and the JST are distributed. This makes
incorporating simple, efficient communication mechanisms into the
development process essential. Second, the JST and Core Team work on
a range of different platforms, at sites that don't have the time, resources,
or inclination to install demanding packages. Collaboration
tools that require no purchase or installation before use are essential.
Finally, ESMF is committed to open development. As much as possible,
the ESMF team tries to keep the workings of the project - metrics,
support and bug lists, schedules, task lists, source code, you name it -
visible to the broad community.
The ESMF software development cycle is based on the staged
delivery model [McConnell 1996]. The steps in this software development
model are:
- Software Concept: Collect and itemize the high-level requirements of the system and identify the basic functions that the system must perform.
- Requirements Analysis: Write and review a requirements document - a detailed statement of the scientific and computational requirements for the software.
- Architectural Design: Define a high-level software architecture that outlines the functions, relationships, and interfaces for major components. Write and review an architecture document.
- Stage 1, 2, ..., n: Repeat the following steps, creating a potentially releasable product at the end of each stage. Each stage produces a more robust, complete version of the software.
- Detailed Design: Create a detailed design document and API specification. Incorporate the interface specification
into a detailed design document and review the design.
- Code Construction and Unit Testing: Implement the interface,
debug, and unit test.
- System Testing: Assemble the complete system and verify that the
code satisfies all requirements.
- Release: Create a potentially releasable product, including a User's Guide and User's Reference Manual. Frequently, code produced at intermediate stages will be used internally.
- Code Distribution and Maintenance: Official public release
of the software and beginning of the maintenance phase.
We have customized and extended this standard model to suit the ESMF project.
At this stage of ESMF development, we are in the iterative design/implement/release
cycle. Below are a few notes on earlier stages.
Participants in the ESMF project completed the Software Concept
stage in the process of developing a unified set of proposals.
A summary of the high-level requirements of ESMF - a statement of project
scope and vision - is included in the General Requirements
part of the ESMF Requirements Document.
This was a successful effort in defining the scope of the project
and agreeing to an overall design strategy.
The ESMF Team spent about six months at the start of the project
producing the ESMF Requirements Document.
This outlined the major ESMF capabilities necessary to meet project milestones
and achieve project goals. The second part of the document was a detailed
requirements specification for each functionality class included in
the framework. This document also included a discussion of the
process that was used to initially collect requirements.
The Requirements Document was a useful reference for the development
team, especially for new developers coming in from outside of the
Earth science domain. However, as the framework matured, support
requests and the Change Review Board process took precedence in defining
development tasks and setting priorities. The Requirements
Document is bundled with the ESMF source distribution through version 2;
with version 3 it was removed.
The project had difficulty with the Architecture Document.
The comments received back on the completed work, informally and
from a peer review body, indicated that the presentation of the document
was ineffective at conveying how the ESMF worked. Although the
document was full of detailed and complex diagrams, the terminology and
diagrams were oriented to software engineers and were not especially
scientist-friendly. The detailed diagrams also made the document difficult
to maintain. This experience helped to guide the ESMF project
towards more user-oriented documents, but it also left a gap in the
documentation that has taken time to fill.
The following are the processes that the ESMF team actively follows. These
guidelines apply to core team developers and to outside contributors
who will be checking code into the main ESMF repository.
All design and code reviews are held as telecons with the JST.
Telecons are scheduled with the Core Team Manager, put on the ESMF
calendar on the home page of the ESMF website, and announced on the
esmf_jst@cgd.ucar.edu list.
When you call in, it's nice to give your name at the first opportunity.
Telecon hosts will make an effort to introduce people on the JST calls,
especially first-timers. Please don't put the telecon on hold (we sometimes
get telecon-stopping music or beeps this way).
Within a week or so after the telecon, the host (the developer if it's a design
or code review) is expected to send out a summary to esmf_jst@cgd.ucar.edu
with the date of the call, the participants, and notes or conclusions.
- Introductory telecon(s). The point here is to scope out the problem
at hand. These calls cover the following, as they apply.
- Understand the capability needed and review requirements.
- Discuss design alternatives.
- Survey and discuss any existing packages that cover the new functionality.
- Discuss potential use test cases.
- Figure out which customers will be involved in use test case development and identify customers that are likely to provide important input
(sometimes we offer a friendly reminder service to these folks before relevant
telecons).
For these introductory discussions, any form of material is fine - diagrams,
slides, plain text ramblings, lists of questions, ...
- Initial design review(s). The document presented should be in the format
of the ESMF Reference Manual, either in plain text or in LaTeX/ProTeX.
This is so the document can be incorporated into project documentation
after implementation. The initial review document should include at least
the following sections:
- Class description.
- Use and Examples. Here the examples begin with the very simplest
cases.
- API sufficient to cover the examples. This is because it can be difficult
to follow the examples (e.g., to tell what the arguments and options are) without the
basic API entries accompanying them.
This step is iterated until developers and customers converge.
- Full telecon review(s). The developer should prepare the API specification
using LaTeX and ProTeX following the conventions in the Reference Manual
(a minimal ProTeX sketch follows this list).
Most of the Reference Manual section(s) for the new or modified class(es),
including Class Options and Restrictions and Future Work, should be available
at the time of this review. Diagrams should be ready here too.
- Use test case telecon review. For each major piece of functionality, a use
test case is prepared in collaboration with customers and executed before
release. The use test case is performed on a realistic number of processors
and with realistic input data sets.
It doesn't have to work (and probably won't) before it's reviewed, but it
needs to work before the functionality appears in a release. The developer
checks it into the top-level use_test_case directory on SourceForge and
prepares an HTML page outlining it for the Test & Validation page on the
ESMF website. Unlike unit and system tests, use test cases aren't
distributed with the ESMF source.
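For reference, below is a minimal sketch of a ProTeX-documented entry point. It assumes the standard !BOP/!EOP markers used in the ESMF source; the routine and class shown are hypothetical.
!BOP
! !IROUTINE: ESMF_ClassSolve - One-line description (hypothetical routine)
!
! !INTERFACE:
      subroutine ESMF_ClassSolve(object, option, rc)
!
! !ARGUMENTS:
      type(ESMF_Class), intent(inout)         :: object
      integer,          intent(in),  optional :: option
      integer,          intent(out), optional :: rc
!
! !DESCRIPTION:
!     Full description of the routine, its options and restrictions,
!     written so it can be extracted into the Reference Manual.
!EOP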
Code should be written in accordance with the interface specifications
agreed to during design reviews and the coding conventions described
elsewhere in this Guide.
There is an internal release checklist on the Test & Validation page of the
ESMF website that contains an exhaustive listing of developer and tester
responsibilities for releases. For additional discussion
of test and validation procedures, see Section 4.4.
The developer is responsible for working with the tester(s) to make sure
that the following are present before an internal release:
- 100% unit and system test coverage of new interfaces, with the exception
of interfaces where type/kind/rank is heavily overloaded. All arguments tested
at least once.
- Working use test cases.
There is a public release checklist on the Test & Validation page of the ESMF
website that contains an exhaustive listing of developer and tester
responsibilities for releases. For additional discussion
of test and validation procedures, see Section 4.4.
Same as for internal release, plus:
- Design and Implementation Notes section for Reference Manual complete.
- Developer and tester ensure that test coverage for new interfaces is sufficient, implementing any additional tests to make it so. This includes testing of options and tests for error handling and recovery.
Developers are encouraged to check their changes into the repository
as they complete them, as frequently as possible without breaking the
existing code base.
- Both core and contributors should test on at least three compilers before commit.
- For core team developers, a mail should go out to esmf_core@cgd.ucar.edu
before check-in for very large commits and for commits that will break the
HEAD. For contributors a mail should go out to esmf_core@cgd.ucar.edu before
ANY commit.
- No code commits should be made between 0:00 and 4:00 Mountain Time. During
this time the regression test scripts are checking out code, and any commits
will lead to inconsistent test results that are hard to interpret.
- Core team developers can be set up to receive email from SourceForge
for every check-in, by writing esmf_support@ucar.edu with the request.
To accomplish the first item on the list after a commit of source code, an email can be sent
to esmftest@cgd.ucar.edu with the exact subject "Run_ESMF_Test_Build". The mailbox is checked
every quarter hour on the quarter hour. This email initiates a test on pluto that
builds and installs ESMF with four compilers: g95, gfortran, lahey, and nag, with
ESMF_BOPT set to "g" and "O".
When the test is started, an email with the subject "ESMF_Test_Builds_Pluto_started"
is sent to esmf_core@cgd.ucar.edu with a time stamp in the body of the message.
If a test is already running, an email with the subject "ESMF_Test_Builds_Pluto_not_started"
is sent with "Test not started, a test is already running." in the body.
The running test will run to completion; a new test will NOT be queued up. A new
"Run_ESMF_Test_Build" email must be sent after the running test has completed.
When the test is completed, an email with the subject "ESMF_Test_Builds_Pluto" containing
the test results is sent to esmf-test@lists.sourceforge.net and esmf_test@cgd.ucar.edu.
The test results will also appear on the Regression Test web page under the "ESMF_Test_Builds"
link towards the top of the page.
- All significant chunks of externally contributed code are reviewed
by the JST. It's usual to do the code review after check-in. The code review should
be scheduled with the Core Team Manager when the code is checked in, and
the code review held before the next release.
- We also do code reviews with core team members, as desired/required
by the JST.
The ESMF project produces internal releases and public releases based
on the schedule generated by the CRB. Every public release
is preceded by an internal release three months prior, for the
purpose of beta testing. During those three months, bugs may be
fixed and documentation improved, but no new functionality may be
added. Occasionally the Core Team releases an internal release
that does not become a public release. This would happen, for
example, when major changes are being made to ESMF and user
input is needed for multiple preliminary versions of the software.
The Integrator tags new system versions with coherent changes prior
to release. The tagging convention for public and internal releases is
described elsewhere in this Guide.
Prior to release all ESMF software is regression-tested on all platforms
and interfaces. The Integrator is responsible for regression testing,
identifying problems and notifying the appropriate developers, and
collecting and sending out Release Notes and Known Bugs.
ESMF releases are announced
on the esmf_jst@cgd.ucar.edu mailing list and are posted on the ESMF
website. Source code is available for download from the ESMF
website and from the main ESMF SourceForge site.
The backup strategy for each entity of the ESMF project is as follows:
- ESMF CVS source
Run rsync daily on the ESMF cvs repository and roll a tarball.
On Sundays roll a tarball with a date stamp and move it to the Pluto archive directory.
- ESMF GIT source
Run rsync daily on the ESMF git repository and roll a tarball.
On Sundays roll a tarball with a date stamp and move it to the Pluto archive directory.
- ESMFCONTRIB CVS source
Run rsync daily on the ESMFCONTRIB cvs repository and roll a tarball.
On Sundays roll a tarball with a date stamp and move it to the Pluto archive directory.
- ESMF website
On Sundays make a tarball with a date stamp of the ESMF website and move it to the Pluto archive directory.
To conserve disk space, only the backup files for the current year and the prior year are retained in full. For years before that, only two backup files per year, roughly six months apart, are retained. For example, the retained ESMF CVS backup files for 2010 through 2013 are:
20100103.esmf-cvsroot.tar.gz
20100606.esmf-cvsroot.tar.gz
20110102.esmf-cvsroot.tar.gz
20110605.esmf-cvsroot.tar.gz
20120101.esmf-cvsroot.tar.gz
plus all of the 2012 and 2013 files.
Once a year in January, the backup files from the year before the prior year are cleaned up. For example, in January 2014 all of the 2012 and 2013 backup files would be archived; the 2012 backup files would then be cleaned up so that only the two semi-annual files are retained.
4.4 Testing and Validation
ESMF software is subject to the following tests:
- Unit tests, which are simple per-class tests.
- Testing Harness, which runs parameter-space-spanning tests similar to the unit tests.
- System tests, which generally involve inter-component interactions.
- Use test cases (UTCs), which are tests at realistic problem
sizes (e.g., large data sets, processor counts, grids).
- Examples that range from simple to complex.
- Beta testing through preliminary releases.
Unit tests, system tests, and examples are distributed with the
ESMF software. UTCs, because of their size, are
stored and distributed separately. Tests are run nightly,
following a weekly schedule, on a wide variety of platforms.
Beta testing of ESMF software is done by providing an Internal Release
to customers three months before public release.
The ESMF team keeps track of test coverage on a per-method basis.
This information is on the Metrics page under the Development
link on the navigation bar.
Testing information is stored on a Test and Validation web page,
under the Development link on the ESMF
web site. This web page includes:
- separate web pages for each system test and UTC;
- links to the Developer's Guide, SourceForge Tracker, Requirements
Spreadsheet, and any other pertinent information; and
- separate web page for automated regression test information and results.
The ESMF is designed to run on several target platforms, in different
configurations, and is required to interoperate with many combinations
of application software. Thus our test strategy includes the following.
- Tests are executed on as many target platforms as possible.
- Tests are executed on a variety of programming paradigms
(e.g. pure shared memory, pure distributed memory, and a mix of both).
- Tests are executed in multiple configurations (e.g. uni-processor,
multi-processor).
- The result of each test is a PASS/FAIL.
- In some cases, for floating point comparisons, an epsilon value
will be used (see the sketch after this list).
- Tests are implemented for each language interface that is
supported.
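For the floating point case, a hedged sketch of an epsilon-based comparison is shown below. The variables "computed", "expected", and "eps" are hypothetical and assumed to be declared and set earlier in the test; only the ESMF_Test call follows the conventions described later in this section.
! Hypothetical epsilon-based floating point comparison; "computed",
! "expected", and "eps" are assumed to be declared and set in the test.
write(failMsg, *) "Result differs from expected value by more than eps"
write(name, *) "Floating point result verification Test"
call ESMF_Test((abs(computed-expected) <= eps*abs(expected)), &
               name, failMsg, result, ESMF_SRCLINE)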
Each class in the framework is associated with a suite of unit tests.
Typically the unit tests are stored in one file per class, and are
located near the corresponding source code in a test directory. The
framework make system has an option to build and run unit tests.
The user has the option of building either a "sanity check" type test
or an exhaustive suite. The exhaustive tests include tests of many
functionalities and a variety of valid and invalid input values. The sanity
check tests are a minimum set of tests to indicate whether, for example, the
software has been installed correctly. It is the responsibility of the
software developer to write and execute the unit tests. Unit tests
are distributed with the framework software.
To achieve adequate unit testing, developers shall attempt to meet the following goals.
- Individual procedures will be evaluated with at least one unit
test function. However, as many test functions as necessary will be
implemented to assure that each procedure works properly.
- Developers should unit test their code to the degree possible
before it is checked into the repository. It is assumed that
developers will use stubs as necessary.
- Variables are tested for acceptable range and precision.
- Variables are tested for a range of valid values, including boundary
values.
- Unit tests should verify that error handling works correctly.
Unit tests usually test a single argument of a method to make it easier to
identify the bug when a unit test fails.
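As a hedged illustration of the error handling goal, the fragment below passes an intentionally invalid argument and expects a return code other than ESMF_SUCCESS. The routine and argument are hypothetical; the tagging and ESMF_Test conventions it uses are described in the remainder of this section.
!------------------------------------------------------------------------
!EX_UTest
! Error handling for an invalid option (ESMF_ClassSolve and badValue
! are hypothetical)
call ESMF_ClassSolve(object, option=badValue, rc=rc)
write(failMsg, *) "Did not return an error code for an invalid option"
write(name, *) "Invalid option error handling Test"
call ESMF_Test((rc.ne.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)
!------------------------------------------------------------------------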
There are several steps to writing a unit test.
First, each unit test must be labeled with one of the following tags:
- NEX_UTest - This tag signifies a non-exhaustive test. These tests are always run and
are considered to be sanity tests; they usually consist of creating and destroying a specific class.
- EX_UTest - This tag signifies an exhaustive unit test. These tests are more rigorous and
are run when the ESMF_EXHAUSTIVE environment variable is set to ON. These unit tests must be between the #ifdef ESMF_EXHAUSTIVE
and #endif definitions in the unit test file.
- NEX_UTest_Multi_Proc_Only - These are non-exhaustive multi-processor unit tests that will not be
run when the run_unit_tests_uni or unit_test_uni targets are specified.
- EX_UTest_Multi_Proc_Only - These are exhaustive multi-processor unit tests that will not be
run when the run_unit_tests_uni or unit_tests_uni targets are specified.
Note that when the NEX_UTest_Multi_Proc_Only or EX_UTest_Multi_Proc_Only tags are used, all the unit tests in
the file must be labeled as such. You may not mix these tags with the other tags. In addition, verify that the makefile
does not allow the unit tests with these tags to be run uni-processor.
Second, a string is specified describing the test, for example:
write(name, *) "Grid Destroy Test"
Third, a string to be printed when the test fails is specified, for example:
write(failMsg, *) "Did not return ESMF_SUCCESS"
Fourth, the ESMF_Test subroutine is called to determine the test results, for example:
call ESMF_Test((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)
The following two tests are good examples of how unit tests should be written.
The first test verifies that getting the attribute count from a Field returns ESMF_SUCCESS, while
the second verifies that the attribute count is correct. These two tests could be combined into one
with a logical AND statement when calling ESMF_Test (see the sketch after the example), but breaking the tests up allows you
to identify the source of the bug immediately.
!------------------------------------------------------------------------
!EX_UTest
! Getting Attribute count from a Field
call ESMF_FieldGetAttributeCount(f1, count, rc=rc)
write(failMsg, *) "Did not return ESMF_SUCCESS"
write(name, *) "Getting Attribute count from a Field "
call ESMF_Test((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)
!------------------------------------------------------------------------
!EX_UTest
! Verify Attribute Count Test
write(failMsg, *) "Incorrect count"
write(name, *) "Verify Attribute count from a Field "
call ESMF_Test((count.eq.0), name, failMsg, result, ESMF_SRCLINE)
!------------------------------------------------------------------------
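For comparison, the combined form mentioned above would collapse the two checks into a single call, as in the sketch below, though a failure would then not indicate which of the two checks was at fault.
call ESMF_Test((rc.eq.ESMF_SUCCESS).and.(count.eq.0), &
               name, failMsg, result, ESMF_SRCLINE)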
Sometimes a unit test is written expecting a subset of the processors to fail the test. To
handle this case, the unit test must verify results from each processor as in the unit test below:
!------------------------------------------------------------------------
!EX_UTest
! Verify that the rc is correct on all pets.
write(failMsg, *) "Did not return FAILURE on PET 1, SUCCESS otherwise"
write(name, *) "Verify rc of a Gridded Component Test"
if (localPet==1) then
call ESMF_Test((rc.eq.ESMF_FAILURE), name, failMsg, result, ESMF_SRCLINE)
else
call ESMF_Test((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)
endif
!------------------------------------------------------------------------
Some tests may require that a loop be written to verify multiple results. The following is
an example of how a single tag, NEX_UTest, is used instead of a tag for each loop iteration.
!-----------------------------------------------------------------------------
!NEX_UTest
write(name, *) "Verifying data in Array via Fortran array pointer access"
write(failMsg, *) "Incorrect data detected"
looptest = .true.
do i = -12, -6
j = i + 12 + lbound(fptr, 1)
print *, fptr(j), fdata(i)
if (fptr(j) /= fdata(i)) looptest = .false.
enddo
call ESMF_Test(looptest, name, failMsg, result, ESMF_SRCLINE)
!-----------------------------------------------------------------------------
When unit tests are run, a Perl script prints out the test results as shown in
Section "Running ESMF Unit Tests" in the ESMF User's Guide. To print out the test results,
the Perl script must determine the number of unit tests in each test file and the number of
processors executing the unit test. It determines the number of tests by counting the
EX_UTest, NEX_UTest, EX_UTest_Multi_Proc_Only, or NEX_UTest_Multi_Proc_Only
tags in the test source file, whichever is appropriate for the test being run.
To determine the number of processors, it counts the number of "NUMBER_OF_PROCESSORS" strings in the
unit test output Log file. The script then counts the number of PASS and FAIL strings in the
test Log file.
The Perl script first divides the number of PASS strings by the number of processors. If the
quotient is not a whole number, the script concludes that the test crashed. If the quotient
is a whole number, the script then divides the number of FAIL strings by the number of processors.
The sum of the two quotients must equal the total number of tests; if it does not, the test is
marked as crashed.
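The same bookkeeping is transcribed below into a small hedged Fortran sketch; the function and its arguments are hypothetical, and the actual tool is the Perl script described above.
! Hypothetical transcription of the Perl script's crash detection logic.
! npass/nfail count PASS/FAIL strings in the Log file, nprocs is the
! processor count, and ntests the number of unit test tags in the source.
logical function testRunCrashed(npass, nfail, nprocs, ntests)
  integer, intent(in) :: npass, nfail, nprocs, ntests
  if (mod(npass, nprocs) /= 0) then
    testRunCrashed = .true.   ! PASS count is not a whole number per PET
  else
    ! the two per-processor quotients must account for every test
    testRunCrashed = (npass/nprocs + nfail/nprocs /= ntests)
  end if
end function testRunCrashed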
Sometimes in the software development process it becomes necessary to disable one or more unit tests.
To disable a unit test, two lines need to be modified. First, the line calling "ESMF_Test" must be commented out.
Second, the test's NEX_UTest, EX_UTest, NEX_UTest_Multi_Proc_Only, or EX_UTest_Multi_Proc_Only tag must be modified
so that it is not found by the Perl script that analyzes the test results.
The recommended way to modify these tags is to replace the first underscore with "_disable_", thus NEX_UTest becomes
NEX_disable_UTest.
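For example, disabling the Field attribute count test shown earlier involves exactly these two changes:
!------------------------------------------------------------------------
!EX_disable_UTest
! Getting Attribute count from a Field (test disabled)
call ESMF_FieldGetAttributeCount(f1, count, rc=rc)
write(failMsg, *) "Did not return ESMF_SUCCESS"
write(name, *) "Getting Attribute count from a Field "
!call ESMF_Test((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)
!------------------------------------------------------------------------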
Benchmark testing is included in the ESMF regression tests to detect any unexpected
change in the performance of the software. This capability is available to developers.
Developers can run the unit tests and save their execution
times to be used as a benchmark for future unit test runs.
The following section now appears in the output of "gmake info".
--------------------------------------------------------------
* ESMF Benchmark directory and parameters *
ESMF_BENCHMARK_PREFIX: ./DEFAULTBENCHMARKDIR
ESMF_BENCHMARK_TOLERANCE: 3%
ESMF_BENCHMARK_THRESHOLD_MSEC: 500
--------------------------------------------------------------
The steps for using the benchmarking test tool are as follows:
- After building the unit tests, execute "gmake run_unit_tests" and
verify that all tests pass. It is not recommended that failing tests be benchmarked.
- Set "BENCHMARKINSTALL = YES" and execute "gmake run_unit_tests_benchmark".
This will cause the unit tests stdout files to be copied to the "DEFAULTBENCHMARKDIR"
directory. The elapsed times of these unit tests are the now the benchmark.
The default of DEFAULTBENCHMARKDIR is $ESMF_DIR/DEFAULTBENCHMARKDIR.
It is advised that the benchmarking directory be outside the ESMF structure, to allow
the developer to benchmark different versions of the software. The benchmark directory
can be changed by setting ESMF_BENCHMARK_PREFIX.
- Run the unit tests a second time.
- To compare the elapsed times of this unit test run to the benchmarked run, set
"BENCHMARKINSTALL = NO", and execute "gmake run_unit_tests_benchmark".
According to the default settings above, the benchmarking test will only analyze unit tests that run
500 msecs (ESMF_BENCHMARK_THRESHOLD_MSEC)
or longer. If a unit test runs 3 percent (ESMF_BENCHMARK_TOLERANCE) or more beyond the benchmarked
unit test, it will be flagged as failing the benchmark test. The developer may change these parameters
as desired. The following is an example of the output of running "gmake run_unit_tests_benchmark":
The following unit tests with a threshold of 500 msecs. passed the 3%
tolerance benchmark test:
PASS: src/Infrastructure/DELayout/tests/ESMF_DELayoutWorkQueueUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldCreateGetUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldRegridCsrvUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldRegridXGUTest.F90
PASS: src/Infrastructure/Field/tests/ESMF_FieldStressUTest.F90
PASS: src/Infrastructure/TimeMgr/tests/ESMF_CalRangeUTest.F90
PASS: src/Infrastructure/VM/tests/ESMF_VMBarrierUTest.F90
PASS: src/Infrastructure/VM/tests/ESMF_VMUTest.F90
PASS: src/Infrastructure/XGrid/tests/ESMF_XGridMaskingUTest.F90
PASS: src/Infrastructure/XGrid/tests/ESMF_XGridUTest.F90
PASS: src/Superstructure/Component/tests/ESMF_CompTunnelUTest.F90
The following unit tests with a threshold of 500 msecs. failed the 3%
tolerance benchmark test:
FAIL: src/Infrastructure/Field/tests/ESMF_FieldRegridUTest.F90
Test elapsed time: 4331.446 msec.
Benchmark elapsed time: 2958.47675 msec.
Increase: 46.41%
FAIL: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleRegridUTest.F90
Test elapsed time: 2051.05675 msec.
Benchmark elapsed time: 1920.42125 msec.
Increase: 6.8%
FAIL: src/Infrastructure/LogErr/tests/ESMF_LogErrUTest.F90
Test elapsed time: 2986.40425 msec.
Benchmark elapsed time: 2583.36775 msec.
Increase: 15.6%
Found 167 exhaustive multi-processor unit tests files, of those with a
threshold of 500 msecs. 11 passed the 3% tolerance benchmark test, and 3 failed.
Benchmark install date: Thu Jun 4 13:26:55 MDT 2015
Note that only the unit tests that have an elapsed time of 500 msecs. or greater are listed. In addition, the date when
the benchmark install was completed is displayed.
When a unit test run is benchmarked, it is written to a directory such as
"BENCHMARKDIR/test/testg/Darwin.gfortran.64.mpich2.default/".
Therefore unit test elapsed times can only be compared between identical configurations.
To implement the benchmarking tool, the unit tests were modified to record the elapsed time of each PET.
The stdout file of each unit test includes lines such as:
ESMF_GridItemUTest.stdout: PET 0 Test Elapsed Time 5.7840000000000007 msec.
ESMF_GridItemUTest.stdout: PET 1 Test Elapsed Time 5.7259999999999982 msec.
ESMF_GridItemUTest.stdout: PET 2 Test Elapsed Time 6.6200000000000010 msec.
ESMF_GridItemUTest.stdout: PET 3 Test Elapsed Time 5.7190000000000021 msec.
The benchmarking tool uses the average of the per-PET elapsed times to determine the test results,
since the elapsed times of the individual PETs can vary.
The examples are written to help users understand a specific use of an
ESMF capability. The examples appear as text in the ESMF Reference Manual; therefore
care must be taken to ensure that the correct portions of the examples appear in the
document. LaTeX tags have been created to designate which portions of the
examples are visible in the document.
BOE and EOE surround text describing the example. BOC and EOC surround
actual working code that appears in the Reference Manual. Below is an example of how
the tags are used:
!-------------------------------- Example -----------------------------
!>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%
!BOE
!\subsubsection{Get Grid and Array and other information from a Field}
!\label{sec:field:usage:field_get_default}
!
! A user can get the internal {\tt ESMF\_Grid} and {\tt ESMF\_Array}
! from a {\tt ESMF\_Field}. Note that the user should not issue any destroy command
! on the retrieved grid or array object since they are referenced
! from within the {\tt ESMF\_Field}. The retrieved objects should be used
! in a read-only fashion to query additional information not directly
! available through the {\tt ESMF\_FieldGet()} interface.
!
!EOE
!BOC
call ESMF_FieldGet(field, grid=grid, array=array, &
typekind=typekind, dimCount=dimCount, staggerloc=staggerloc, &
gridToFieldMap=gridToFieldMap, &
ungriddedLBound=ungriddedLBound, ungriddedUBound=ungriddedUBound, &
totalLWidth=totalLWidth, totalUWidth=totalUWidth, &
name=name, &
rc=rc)
!EOC
if(rc .ne. ESMF_SUCCESS) finalrc = ESMF_FAILURE
print *, "Field Get Grid and Array example returned"
call ESMF_FieldDestroy(field, rc=rc)
if(rc .ne. ESMF_SUCCESS) finalrc = ESMF_FAILURE
!>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%>%
Note that any code or text that is not contained within the tag pairs does not appear in the
Reference Manual.
Most examples can be run on multiple processors or a single processor. Those examples should
have the tag, "ESMF_EXAMPLE" as a comment in the body of the example file. If the example
can only run on multiple processors then use the tag, "ESMF_MULTI_PROC_EXAMPLE".
When an example is removed from the makefile, the "ESMF_EXAMPLE" or
"ESMF_MULTI_PROC_EXAMPLE" tag must be modified so that the example is not flagged as failed.
The recommended way to modify these tags is to replace the first underscore with "_disable_";
thus "ESMF_EXAMPLE" becomes "ESMF_disable_EXAMPLE".
System tests are written to test functionality that spans several
classes. The following areas should be addressed in system testing.
- Design omissions (e.g. incomplete or incorrect behaviors).
- Associations between objects (e.g. fields, grids, bundles).
- Control and infrastructure (e.g. couplers, time management, error handling).
- Feature interactions or side effects when multiple features are used
simultaneously.
The system tester should issue a test log after each software release is tested,
which is recorded on the Test and Validation web page. The test log shall
include: a test ID number, a software release ID number, testing environment
descriptions, a list of test cases executed, results, and any unexpected
events. Bugs should be documented in the SourceForge Bug Tracker, and
any bug fixes shall be validated.
System tests should contain the following sections:
- Create - Create Components, Couplers, Clock, Grids, States etc.
- Register - Register Components and the initialize, run and finalize subroutines.
- Initialize - Initialize as needed.
- Run - Run the test.
- Finalize - Verify results.
- Destroy - Destroy all classes.
Most system tests can be run on multiple processors or a single processor. Those system tests should
have the tag, "ESMF_SYSTEM_TEST" as a comment in the body of the system test. If the system test
can only run on multiple processors then use the tag, "ESMF_MULTI_PROC_SYSTEM_TEST".
At the end of the system test it is recommended that the ESMF_TestGlobal subroutine be used to gather
test results from all processors and print out a single PASS/FAIL message instead
of individual PASS/FAIL messages from all the processors.
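A skeletal system test following these sections might look like the sketch below. The component handling is abbreviated, the names are hypothetical, and ESMF_TestGlobal is assumed to take the same argument list as ESMF_Test.
! Hypothetical system test skeleton (names are illustrative only)
! Create - Create Components, Couplers, Clock, Grids, States, etc.
comp = ESMF_GridCompCreate(name="model", rc=rc)
! Register - Register Components and the init/run/finalize subroutines.
call ESMF_GridCompSetServices(comp, userRoutine=SetServices, rc=rc)
! Initialize - Initialize as needed.
call ESMF_GridCompInitialize(comp, rc=rc)
! Run - Run the test.
call ESMF_GridCompRun(comp, rc=rc)
! Finalize - Verify results.
call ESMF_GridCompFinalize(comp, rc=rc)
! Destroy - Destroy all classes.
call ESMF_GridCompDestroy(comp, rc=rc)
! Gather results from all PETs and report a single PASS/FAIL message.
write(failMsg, *) "System test failure"
write(name, *) "Hypothetical system test"
call ESMF_TestGlobal((rc.eq.ESMF_SUCCESS), name, failMsg, result, ESMF_SRCLINE)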
After the test is written it must be documented on the ESMF Test & Validation
web page:
http://www.earthsystemmodeling.org/developers/test/system/
When a system test is removed from the makefile, the "ESMF_SYSTEM_TEST" or
"ESMF_MULTI_PROC_SYSTEM_TEST" tag must be modified so that the system test is not counted as failed.
The recommended way to modify these tags is to replace the first underscore with "_disable_";
thus ESMF_SYSTEM_TEST becomes ESMF_disable_SYSTEM_TEST.
The Test Harness is a highly configurable test control system for conducting
thorough testing of the Regridding and Redistribution processes. The Test Harness
consists of a single shared executable and a collection of customizable resource
files that define an ensemble of test configurations tailored to each ESMF class.
The Test Harness is integrated into the Unit test framework, enabling
the Test Harness to be built and run as part of the Unit tests. The test results
are reported to a single standard-out file which is located with the unit test
results.
See the Test Harness section of this Guide for a complete discussion of the test harness.
When the Test Harness completes a run, the results from the ensemble of tests are
reported in two ways. The first is analogous to the unit test reporting: since the
test harness is run as part of the unit tests, a summary of the results is recorded
just as with the unit tests. In addition to the standard unit test reporting, the
test harness is also able to produce a human readable report. The report consists
of a concise summary of the test configuration along with the test results. The test
configuration is described in terms of the Field Taxonomy syntax and user provided
strings. The intent is not to provide an exhaustive description of the test, but
rather to provide a useful description of the failed tests.
Consider an example in which two descriptor strings describe an ensemble of
regridding tests. The first uses the patch method and the
second uses bilinear interpolation.
[ B1 G1; B2 G2 ] =P=> [ B1 G1; B2 G2 ]
[ B1 G1; B2 G2 ] =B=> [ B1 G1; B2 G2 ]
Suppose the associated specifier files indicate that the source grid is rectilinear
and is 100 X 50 in size, and that the destination grid is also rectilinear and is 80 X 20
in size. Both grids are block distributed in two
ways, 1 X NPETS and NPETS X 1, and the first dimension of both the
source and destination grids is periodic. If the test succeeds for the bilinear
regridding, but fails for one of the patch regridding configurations, the reported results
could look something like:
SUCCESS: [B1 G1; B2 G2 ] =B=> [B1 G1; B2 G2 ]
FAILURE: [B1{1} G1{100}+P; B2{npets} G2{50} ] =P=> [B1{1} G1{80}+P; B2{npets} G2{20} ]
failure at line 101 of test.F90
SUCCESS: [ B1{npets} G1{100} +P; B2{1} G2{50} ] =P=> [ B1{npets} G1{80}+P; B2{1} G2{20} ]
The report indicates that all the test configurations for the bilinear regridding
are successful. This is indicated by the key word SUCCESS which is followed by the
successful problem descriptor string. Since all of the tests in the first case pass,
there is no need to include any of the specifier information. For the second
ensemble of tests, one configuration passed, while the other failed. In this case,
since there is a mixture of successes and failures, the report includes
specifier information for all configurations to help indicate the source of the
test failure. The supplemental information, while not a complete problem description
since it lacks items such as the physical coordinates of the grid and the nature of
the test field, includes information crucial to isolating the failed test.
Use Test Cases are problems of realistic size created to test the ESMF
software. They were initiated when the ESMF team and its users saw that
often ESMF capabilities could pass simple system tests but would fail
out in the field, for real customer problems. UTCs have realistic
processor counts, data set sizes, and grid and data array sizes. UTCs are
listed on the Test & Validation page of the ESMF website. They
are not distributed with the ESMF software; instead they are stored in
a separate module in the main repository called use_test_cases.
ESMF software is released in a beta form, as an Internal Release,
three months before it is publicly released. This gives users
a chance to test the software and report back any problems to
support.
The purpose of regression testing is to reveal faults caused by new
or modified software (e.g. side effects, incompatibility between
releases, and bad bug fixes).
Regression tests regularly exercise all interfaces of the code on
all target platforms.
The regression test results for the last two weeks can be found
on the regression test results web page.
This web page provides a complete color-coded current view of the state of the trunk ESMF software; sorting options by platform or compiler are provided.
A similar test results web page for the branch is also available.
Clicking on any of the cells will display the specific test report for that day.
Hovering over a test name (e.g., Blues gfortran) will reveal notes particular to that platform/compiler combination. Clicking on the test name will take you to the home page of the platform.
The platforms that run the regression tests email the test results to a server that updates the test results web page. A script checks for test reports every 15 minutes and updates the web page. The time of the last update appears on the web page.
When a developer finds that a test fails on a particular platform with a particular compiler, sometimes the bug is readily identified and fixed.
Other times the developer may want to know if the test fails on other platforms, and whether the failure is related to a compiler, MPI configuration, or optimized/debug execution.
The developer would otherwise need to click through all the cells of the different platforms searching for the results of that particular test.
A tool was created to allow the developers to query the test results for a specific test for a specific date, as long as it is within two weeks of the current date.
The developer may send a query test results message to the following email address: esmftest@cgd.ucar.edu
The subject of the email must be exactly "Test_Results_Query". The body of the email message must be "Test:" followed by the test name and "Date" followed by the desired date. The date format must be a three-letter month and a day number. If the day is two digits (greater than 9), insert one space between the month and day, e.g. Apr 25. If the day is a single digit, insert two spaces between the month and day, e.g. Apr  4.
Test:ESMF_FieldBundleSMMUTest.F90
Date:Feb 8
or
Date Feb 28
This mailbox is checked every quarter hour on the quarter hour, and the results are emailed to esmf_test@cgd.ucar.edu.
For this example, the subject of the results email would be:
ESMF_FieldBundleSMMUTest.F90 test results for Feb 8
The body of the email would be as follows:
ESMF_Blues_PGI:PASS: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Blues_PGI:PASS: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Blues_PGI:CRASHED: mpich3/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Blues_PGI:PASS: mpich3/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Blues_PGI:PASS: openmpi/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Blues_PGI:PASS: openmpi/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Discover_g95:PASS: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Discover_g95:PASS: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Haumea_g95:PASS: mpich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Haumea_g95:PASS: mpich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Haumea_g95:PASS: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Haumea_g95:PASS: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Pluto_g95:FAIL: mpich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Pluto_g95:FAIL: mpich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Pluto_g95:FAIL: mvapich2/g: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
ESMF_Pluto_g95:FAIL: mvapich2/O: src/Infrastructure/FieldBundle/tests/ESMF_FieldBundleSMMUTest.F90
Note that if the date of the query is the current day, the developer should query periodically during the day since
the test results are being updated as platforms report their test results.
If a test crashes it can be because another test hung and the test in question did not run.
Another instance where this tool is useful is when a developer adds a new test: after the nightly tests run, the developer can run a query to quickly see the test results.
As software development progresses, the documentation is updated, built, and posted at
http://earthsystemmodeling.org/docs/nightly/develop/dev_guide/
The documents are built daily in the early morning, and the results of the builds are posted at
http://earthsystemmodeling.org/doc/
These documents can be updated by the developers by checking out the documents from the repository and submitting the edited files. To have the new version of the documents posted on the web, the developer must send a request to the following email address: esmftest@cgd.ucar.edu. The subject of the email indicates which document to build and post.
The following is the list of subjects that have been implemented:
- Build_Dev_Guide_Doc
- Build_NUOPC_Doc
- Build_Ref_Doc
- Build_ESMPy_Doc
- Build_CICE_NUOPC_CAP_Doc
- Build_HYCOM_NUOPC_CAP_Doc
- Build_LIS_NUOPC_CAP_Doc
- Build_MOM_NUOPC_CAP_Doc
- Build_WRFHYRO_NUOPC_CAP_Doc
A script checks for document build requests every quarter hour on the quarter hour. A document build is started, and on successful completion the document on the web and the document build results are updated. An email is sent to esmf_test@cgd.ucar.edu and esmf-test@lists.sourceforge.net when the build is done.
We provide two types of tar files, the ESMF source and the shared
libraries of the supported platforms. Consequently, there are two test
procedures followed before placing the tar files on the ESMF download website.
The Source Code Test Procedure is followed on all the supported
platforms for the particular release.
- Verify that the source code builds in both BOPT=g and BOPT=O.
- Verify that the ESMF_COUPLED_FLOW demonstration executes successfully.
- Verify that the unit tests run successfully, and that there are no NON-EXHAUSTIVE unit test failures.
- Verify that all system tests run successfully.
The Shared Libraries Test Procedure is also followed on all supported
platforms for a release.
- Change to the CoupledFlowEx directory and execute gmake. Verify that the demo runs successfully.
- Change to the CoupledFlowSrc directory and execute gmake then gmake run. Verify that the demo runs successfully.
- Change to the examples directory and execute gmake and gmake run. Verify that the example runs successfully.
4.5 User Support
The Advocate is the staff person assigned to a particular code, e.g. GEOS-5. See section 2.1.1 for a full definition and list of responsibilities.
The Handler is the staff person assigned to solve a support ticket. The Advocate and the Handler may be the same person or they may be different. See section 2.1.1 for a complete definition and list of responsibilities.
- New is a request that has not been replied to.
- Closed is a request that has been fixed to the user's satisfaction.
- Pending is a request that has been fixed to the Handler's satisfaction but has not yet been approved by the user.
- Message received.
- The Integrator or in his absence the Support Lead, generates a SourceForge Bug, Feature, or Support Request ticket.
- If the request contains more than one topic, then the Integrator will open multiple tickets, one per topic. This can be done initially if obvious, or later if more research indicates it is necessary.
- The top line of the entry should be WHO: <Requester Name>.
- Indicate the institution and model if known.
- Keep title of initial email and the title of the SF ticket the
same or close enough to be able to determine they are one and the same.
- Assign the ticket to the staff person best able to solve the ticket's issue.
- Initial contact is made by:
- The Handler assigned by the Integrator in the ticket.
- The Support Lead if the Handler will be unavailable for more than a week.
- The Handler works to solve the ticket's issues. He or she will communicate
periodically with the ticket's originator and will keep the rest of the Core team
informed of the ticket's progress at the monthly ticket review meetings. Once the
issue has been solved, the ticket will be marked pending by the Handler.
- At this point, the Handler contacts the originator to gauge their satisfaction with
the solution. If the originator is satisfied, the ticket may be closed, and the mail
folder on the IMAP server moved from Open to Closed by the Support Lead. If the customer
does not respond, an attempt at contact will be made once a month for two months.
If after this period, the originator still does not reply, a pending ticket may be closed
with final notification to the originator.
- Include title and ticket number on all correspondence.
- Make initial contact within 48 hours even if just to say message received.
- The email address for ticket originators can be found in either freeCRM or the mail archive. Do not hesitate to contact the Support Lead if a required email address can not be found.
- Copy esmf_support@ucar.edu on all replies.
- Bugs that are fixed should be marked Closed, and Fixed. They should never be deleted.
- Bugs that are duplicates should be marked Deleted, and Duplicate.
- If the main issue in a Bug, Feature Request, or Support Request has not been implemented it should stay Open.
- Users are always notified via email when their ticket is being closed even if they have been unresponsive.
- If the solution to a ticket involves a test code, this should be incorporated into the code body as a standard test. It should not be sent to the user as an unofficial code fragment.
- If the solution to a ticket involved changes to the code, the user should be given a stable beta snapshot. The user should not be directed to the HEAD, which is inherently unstable.
- If a ticket involves an older version of the code and a computing environment that the current distribution runs on, the ticket should be considered for closure when there is no means of testing or fixing the older code.
- The Handler is responsible for changing the status of tickets assigned to them.
The Support Lead manages the archive of esmf_support@ucar.edu email traffic and is responsible for the creation of ticket folders, component folders, and the proper placement of mail messages. The archive is located on the main CISL IMAP server and can be accessed by any Core member. Contact the Support Lead if you wish to have your local mail client enabled to view the archive. The IMAP archive has the following structure:
- Component Name
- Open
- Numbered Ticket Folder
- Numbered Ticket Folder
- Numbered Ticket Folder
- Closed
- Numbered Ticket Folder
- Numbered Ticket Folder
- Numbered Ticket Folder
- Component Name
The following rules apply to the above:
- Email messages will be filed by component and number.
- A folder labeled with the request number will be created.
- This folder will then be placed in the component's Open folder until the ticket is closed.
- The Support Lead will copy each related email message to its numbered folder.
- When a ticket has been closed, the Support Lead will move the numbered folder from the component's Open folder to its Closed folder.
- There will be only one New folder, in which highly active tickets may be placed
for easier filing at the discretion of the Support Lead.
4.5.6 INFO:Code (subject) mail messages
Advocates need to share the information they have received from their codes with the rest
of the Core team. This will be done by sending an email to esmf_support@ucar.edu with a
subject line labeled INFO: Code e.g. INFO: CCSM, INFO: GEOS-5. These messages will be
filed on the IMAP server (see above section) under the code referenced. All information
about a code that is general and not related to a specific support request will be archived
in this manner.
A client relationship management tool (freeCRM http://www.freecrm.com) is being used
to archive codes, their affiliated contacts, degree of componentization, issues, and
applicable funding information if known. The following is a list of roles and
responsibilities associated with this software:
- Advocates are responsible for the accuracy and completeness of all information
associated with codes to which they are assigned. This information includes a pull
down menu that specifies the state of the code's ESMF'ization. This piece of
information is critical and needs to be updated whenever an Advocate updates his or
her codes. Other information includes type of code, parent agency etc. This
information will be reviewed on a semi-annual basis.
- The Integrator is responsible for creating a back up of all freeCRM data on a
monthly basis.
- The Core Team Manager is responsible for the accuracy and completeness of all
funding related information.
- The Support Lead is responsible for creating code 'companies' and informing
the Integrator of any additions so that the back up scripts can be modified. He or
she is additionally responsible for conducting semi-annual quality control checks of
all information in the system.
- All team members are responsible for updating and adding to the list of
contacts.
Once a year all codes in the freeCRM data base will be contacted in order to gauge
their development progress, and to update our component metrics. This process will
contain the following steps:
- Advocates will log in to freeCRM and get a list of all their codes.
This list will be emailed to esmf_core@cgd.ucar.edu.
- Advocates will review their list and determine which codes on the list need to be
contacted. Contact is not needed if sufficient knowledge about a code is already
available.
- Advocates will review all the information contained in freeCRM concerning their
assigned codes AND review all the esmf_support traffic for the last year.
- Advocates will draft the contact email and send it to esmf_core@cgd.ucar.edu to be
reviewed by the Core Team Manager. Once corrected, the Advocates may send their email. Since
this is a group level effort, the email message may be signed “The ESMF Team” if
desired.
- The Support Lead will track the draft and completed emails as well as the
responses and will provide a report to the Core Team Manager at the end of the
process.
- As responses come in, the Advocates are responsible for updating the
information in freeCRM.
- The Support Lead will tally the results and update the components page on the
ESMF Web site and will also update the components metric chart.
More and more applications are being distributed with embedded ESMF interfaces. It may be
difficult to determine if a reported problem with one of these applications is related
to an incorrect ESMF implementation, a true ESMF bug, or an issue within the parent model.
The following are several definitions:
- End User: A person who downloads or otherwise receives an application that contains
ESMF code. While they may be trying to modify this application, they were not the person
or persons who originally inserted ESMF into the application. Most likely, they will be
entirely unfamiliar with ESMF.
- Application Developer: The person or persons who took a model, inserted ESMF code,
and made the resulting application available to others.
The following are some guidelines for dealing with applications that use ESMF:
- For support requests related to applications that include ESMF, our primary contact
for resolving the request should be the developers of the distributed application and not
the End User. As such, every effort should be made to identify and contact the developers
of the distributed application in order to make them aware of the reported issue and to get
them actively engaged in resolution of the problem. Additionally, they should be cc'ed on
all correspondence with the End User.
- During the resolution of the issue, it will be necessary to cc all email traffic to the
End User. In dealings with the End User, emphasize that the ESMF group is committed to any user
of ESMF regardless of source. That commitment is predicated, however, on the participation of the
application developers.
- The Handler should establish which version of ESMF the application is using.
- The Handler should try and determine whether the ESMF code in question was modified in any
way by the Application Developers.
- The Handler should try and determine whether the code in question has ESMF interface names
but is not ESMF code. The time manager in WRF falls into this category. It has ESMF interfaces
but was not developed by us.
- It will be solely the Core Team's discretion whether or not to support older versions of ESMF,
ESMF code that has been modified by others, or code that uses ESMF interface names but was developed
entirely separate from ESMF.
- In no way should the Handler try running the End User's code.
- In the event that the developers of the distributed application are
unknown, unreachable, or uncooperative, the End User must be politely informed that the group
cannot troubleshoot code belonging to another group. This will have to be handled with a
degree of sensitivity, because it is likely that the End User has already tried to contact the
application developers without success.