How to write a dr2xml ping file
===============================
A dr2xml ping file describes, for every variable name requested by CMIP6, the corresponding
model field in the XIOS namespace (i.e. as described by some ‘field definition’ specific to the model). It
is used to interface the field references generated by dr2xml with the “native” field definitions of the
model. Its syntax is that of an XIOS ‘field definition’ item.
Example: the CMIP6 request includes ‘tos’, the sea surface temperature, while Nemo sends the field ‘sst’
to XIOS. Here is an example of a ping file entry that relates ‘sst’ to ‘tos’ and
converts the units from Celsius to Kelvin:

.. code-block:: html

   <field id="CMIP_tos" field_ref="sst"> sst + 273.15 </field>

where ‘CMIP_’ is the prefix that you have defined in lab_and_model_settings. If you use an
empty string as prefix, you have to take care that this does not generate a name collision with some XIOS
field identifier already used by the model.
“Home” variables do not have to be defined in ping files, but only in the model native field_def (without
any prefix).
The ping file can also be used to tell dr2xml that some DR-requested variables are not to be
produced, either because the model cannot produce them or because the lab does not want to. Any field
definition entry whose ‘field ref’ begins with dummy will not be included in output files by
dr2xml (this feature will actually be implemented in the next version of dr2xml).
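The dummy convention described above can be illustrated outside dr2xml with a few lines of Python. This is only a minimal sketch, not dr2xml code: the sample entries, the ‘CMIP_’ prefix, the field_ref values ‘ssh’ and ‘dummy_XY’, and the helper name split_ping_entries are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical ping file content; ids and field_refs are illustrative only.
PING_SAMPLE = """
<field_definition>
  <field id="CMIP_tos" field_ref="sst"> sst + 273.15 </field>
  <field id="CMIP_sos" field_ref="dummy_XY" />
  <field id="CMIP_zos" field_ref="ssh" />
</field_definition>
"""


def split_ping_entries(xml_text):
    """Return (produced, skipped) lists of field ids.

    An entry counts as skipped when its field_ref begins with 'dummy',
    following the convention described in the text.
    """
    root = ET.fromstring(xml_text)
    produced, skipped = [], []
    for field in root.iter("field"):
        ref = field.get("field_ref", "")
        target = skipped if ref.startswith("dummy") else produced
        target.append(field.get("id"))
    return produced, skipped


produced, skipped = split_ping_entries(PING_SAMPLE)
print(produced)  # ['CMIP_tos', 'CMIP_zos']
print(skipped)   # ['CMIP_sos']
```

Such a check can help you see how many entries of a ping file still point to a dummy reference before running dr2xml.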
To help you write the ping file, a skeleton per realm is provided in the directory output_samples/,
where the exhaustive list of all variables for all MIPs at all priority levels is provided as XIOS ‘field ids’.
Since your lab is very likely not involved in all MIPs, nor concerned with all tiers and all
variable priority levels, you may wish to regenerate these ping file templates to reduce their length
and avoid having to keep numerous dummy entries. For that purpose, the Python notebook
create_ping_files.ipynb (available in the dr2xml/ repository) guides you through
creating ping file templates with some user control (including accounting for a list of excluded
variables). If you are not familiar with notebooks, you can alternatively use the equivalent classical Python
script create_ping_files.py (available in the dr2xml/doc/ repository).
Next, you have to identify, for each prefixed MIP variable (in the ‘field id’ namespace),
the associated model variable (in the ‘field ref’ namespace), and use it in place of ‘dummy’ when
it exists; see the sample ping file template below:
.. code-block:: html
On the basis of this example, you have to take care:

- that the field_ref ('sst') is the name of a field already known to XIOS and actually sent by
  the model (either unconditionally, or when the model uses ‘xios_field_is_active’ to
  know whether it should send it);
- that the field id consists of the prefix followed only by a so-called MIP variable name
  (above: 'tos'), as listed at http://clipc-services.ceda.ac.uk/dreq/index/var.html; in some
  cases, you may add a suffix (see below);
- that, for variables required on ocean transects, or as ocean zonal means or ocean
  sections, you provide as ‘field ref’ the model variable with the relevant shape;
- that, for variables defined at both half and full model levels, you append the suffix
  ‘_half’ to the (prefix + CMIP variable name) when defining the values at half levels.

What you do not have to manage is:

- describing spatial operations that are explicitly specified by the so-called CMOR
  variables, e.g. zonal means from lat-lon grids, or extracting profiles or values at given
  locations (e.g. ‘rlu’, requested in table cfSites as profiles at sampled locations and on
  atmospheric levels, or ‘ta’, requested on 3 pressure levels in table em1hr)
  [-not yet implemented-];
- describing the vertical interpolations from atmospheric model levels to pressure or height
  levels; this applies only to the dimension sets 'alt16', 'alt40' and 'plev*' [-not yet available-];
- describing time operations (averages, min, max, climatologies) or variables such as tasmin or
  tasmax; these are handled automatically in the generated field_def.

The ping file offers some facilities:

- when the units of the native model variable do not match the expected standard, you can
  code the units conversion in the ping file as an XIOS arithmetic operation (cf. the conversion
  from ‘degC’ to ‘K’ in the example above);
- you may derive a requested variable by some arithmetic on XIOS-known fields, e.g.
  (where uppercase ids are native model ‘field ids’):

  .. code-block:: html

     WG1_ISBA + WGI1_ISBA

- you may decide that your model is more cost-effective than XIOS for computing some
  vertical interpolation; in that case, you can indicate it by a ping file entry whose
  identifier is e.g. CV_ta_plev19, i.e. the concatenation of your prefix, the CMIP varname and
  the dimension label (as stated in the Data Request at
  http://clipc-services.ceda.ac.uk/dreq/index/grids.html); this works for interpolation to
  pressure and height levels [-not yet implemented-];
- the same applies to zonal means, with the suffix ‘_zm’, which should come after any vertical
  interpolation suffix (e.g. CV_tas_zm or CV_ta_plev19_zm) [-not yet implemented-].
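The naming rule for such identifiers (prefix, then CMIP varname, then optional dimension label, then optional ‘_zm’) can be sketched as a small helper. The function ping_field_id and its parameters are hypothetical names introduced here for illustration; they are not part of dr2xml.

```python
def ping_field_id(prefix, mip_var, dim_label=None, zonal_mean=False):
    """Build a ping file field id following the naming rule in the text.

    prefix     -- the lab-chosen prefix, e.g. 'CV_'
    mip_var    -- the MIP variable name, e.g. 'ta'
    dim_label  -- optional Data Request dimension label, e.g. 'plev19'
    zonal_mean -- if True, append '_zm' after any interpolation suffix
    """
    field_id = prefix + mip_var
    if dim_label:
        field_id += "_" + dim_label
    if zonal_mean:
        field_id += "_zm"
    return field_id


print(ping_field_id("CV_", "ta", dim_label="plev19"))                   # CV_ta_plev19
print(ping_field_id("CV_", "tas", zonal_mean=True))                     # CV_tas_zm
print(ping_field_id("CV_", "ta", dim_label="plev19", zonal_mean=True))  # CV_ta_plev19_zm
```

The three calls reproduce the identifiers quoted in the list above.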
In addition, dr2xml addresses a number of shortcomings of the CMIP6 Data Request beta.45:

- in some cases, the DR does not contain enough information to derive a variable from another
  one, such as ta850, the temperature at the 850 hPa pressure level; the ping file templates
  include some field_defs for that [-not yet implemented-];
- ambiguous MIP variable names: 64 MIP variable names are ambiguous in the sense that
  the set of corresponding CMOR variables is not homogeneous regarding the area part of
  the ‘cell_method’. In that case, dr2xml suffixes the MIP variable name with a shorthand for the
  area type, as derived from the table below; this occurs in both file_def files and ping files. So,
  you may have to fill in several consecutive ping file lines with the same content if you think the
  actual geophysical field is the same.

  .. list-table:: Suffixes relative to cell-methods
     :widths: 25 25
     :header-rows: 1

     * - If cell_method includes
       - Automatic suffix
     * - where floating_ice_shelf
       - _fixf
     * - where grounded_ice_shelf
       - _gisf
     * - where snow over sea_ice area
       - _sosi
     * - where ice_free_sea over area
       - _ifs
     * - where land
       - _land
     * - where sea_ice
       - _si
     * - where sea
       - _sea
     * - where snow
       - _snow
     * - where cloud
       - _cloud
     * - where landuse
       - _lu
     * - where ice_shelf
       - _isf
The list of such MIP varnames is: nep tnpeo hfss lai albisccp mrlsl hfgeoubed
treeFrac mrsos sisnthick cVeg topg fbddtdife parasolRefl rlds lwsnl
snc snm snw hfgeou o2sat fddtdisi rlus cWood prra agesno ts cMisc
grassFrac prsn fbddtdic fbddtdin fbddtdip cSoil sbl orog cLitter
prveg tpf fLuc fbddtalk fddtdife fddtalk pctisccp mrros lithk sootsn
mrro tas tsn tran rsds hfdsn pflw fddtdic fddtdin fddtdip fbddtdisi
rsus cProduct sftgif hfls dms
Example: 'hfss' , the Surface Upward Sensible Heat Flux, is ambiguous (the variable related
to the whole atmospheric mesh appears in table Amon, but the Ice Sheet part only in table
LImongre), while 'hfssIs' (also Ice Sheet part in table LImon) is not ambiguous:
.. code-block:: html
- in table Omon, there are some references to 'zfull' and 'zhalf' instead of 'zfullo' and 'zhalfo'
  [-no special processing is done by dr2xml for that yet-].
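As an illustration of the suffix table above, here is a minimal Python sketch that maps a cell_methods string to its area suffix. It is not the dr2xml implementation: the function name area_suffix, the substring-matching strategy and the sample cell_methods strings are assumptions; the clause-to-suffix pairs are copied as they appear in the table. Longer clauses are tested first so that, for instance, ‘where sea_ice’ is not mistaken for ‘where sea’.

```python
# Clause-to-suffix pairs copied from the table in this document.
AREA_SUFFIXES = {
    "where floating_ice_shelf": "_fixf",
    "where grounded_ice_shelf": "_gisf",
    "where snow over sea_ice area": "_sosi",
    "where ice_free_sea over area": "_ifs",
    "where land": "_land",
    "where sea_ice": "_si",
    "where sea": "_sea",
    "where snow": "_snow",
    "where cloud": "_cloud",
    "where landuse": "_lu",
    "where ice_shelf": "_isf",
}


def area_suffix(cell_methods):
    """Return the suffix for the first matching clause, or '' if none matches.

    Assumption: plain substring matching, longest clause first, so that a
    clause that is a prefix of another one cannot shadow it.
    """
    for clause in sorted(AREA_SUFFIXES, key=len, reverse=True):
        if clause in cell_methods:
            return AREA_SUFFIXES[clause]
    return ""


# Hypothetical cell_methods strings, for illustration only.
print(area_suffix("area: mean where sea_ice time: mean"))  # _si
print(area_suffix("area: mean where landuse time: mean"))  # _lu
print(area_suffix("area: time: mean"))                     # (empty string)
```

With such a helper, the id of an ambiguous variable in a ping file would be the prefix, the MIP variable name and the returned suffix concatenated in that order.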