How to write a dr2xml ping file
A dr2xml ping file states, for each variable name requested by CMIP6, the corresponding model field in the XIOS namespace (i.e. as described by a ‘field definition’ specific to the model). It interfaces the field references generated by dr2xml with the model's “native” field definitions. Its syntax is that of an XIOS ‘field definition’ item.
Example: the CMIP6 request includes ‘tos’, the sea surface temperature, while Nemo sends the field ‘sst’ to XIOS. Here is an example of a ping file that relates ‘sst’ to ‘tos’ and converts the units from Celsius to Kelvin:
<field_definition>
<field id="CMIP_tos" field_ref="sst"> sst + 273.15 </field>
</field_definition>
where ‘CMIP_’ is the prefix that you have defined in lab_and_model_settings. If the prefix is an empty string, take care that this does not cause a name collision with an XIOS field identifier already used by the model.
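To make the role of the prefix concrete, here is a minimal Python sketch that reads a ping file like the one above and recovers, for each MIP variable, the model field and the optional arithmetic expression. It is only an illustration, not dr2xml's own code: the function name read_ping is hypothetical, and the prefix value is assumed to match the one set in lab_and_model_settings.

```python
import xml.etree.ElementTree as ET

# The ping file from the example above, as a string.
PING_XML = """
<field_definition>
  <field id="CMIP_tos" field_ref="sst"> sst + 273.15 </field>
</field_definition>
"""

# Assumption: this matches the prefix defined in lab_and_model_settings.
PREFIX = "CMIP_"


def read_ping(xml_text, prefix):
    """Return {MIP variable name: (model field_ref, expression or None)}.

    Hypothetical helper, for illustration only.
    """
    mapping = {}
    root = ET.fromstring(xml_text)
    for field in root.iter("field"):
        field_id = field.get("id", "")
        if not field_id.startswith(prefix):
            continue  # not an entry written with this prefix
        # The element text, if any, is the XIOS arithmetic expression.
        expression = (field.text or "").strip() or None
        mip_name = field_id[len(prefix):]
        mapping[mip_name] = (field.get("field_ref"), expression)
    return mapping


ping = read_ping(PING_XML, PREFIX)
for mip_name, (model_field, expression) in ping.items():
    print(mip_name, "<-", model_field, "| expression:", expression)
```

With an empty prefix, every field id would be read as a MIP variable name, which is why a collision with a native XIOS identifier becomes possible.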
“Home” variables do not have to be defined in ping files, but only in the model's native field_def (without any prefix).
The ping file can also be used to tell dr2xml that some DR-requested variables are not to be produced, either because the model cannot produce them or because the lab does not want to. Any field definition entry whose ‘field ref’ begins with dummy will not be included in output files by dr2xml (this feature will actually be implemented in the next version of dr2xml).
To help you write the ping file, a skeleton per realm is provided in directory output_samples/, where the exhaustive list of all variables for all MIPs at all priority levels is provided as XIOS ‘field ids’.
Since your lab is very likely not involved in all MIPs, nor concerned with all tiers and all variable priority levels, you may wish to regenerate these ping file templates to shorten them and avoid keeping numerous dummy entries. For that purpose, the Python notebook create_ping_files.ipynb (available in the dr2xml/ repository) guides you through creating ping file templates with some user control (including accounting for a list of excluded variables). If you are not familiar with notebooks, you can alternatively use the equivalent classical Python script create_ping_files.py (available in the dr2xml/doc/ repository).
Next, for each prefixed MIP variable (in the ‘field id’ namespace), you have to identify the associated model variable (in the ‘field ref’ namespace), if one exists, and use it to replace ‘dummy’; see the sample ping file template below:
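The rule on dummy entries can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not the dr2xml implementation; the function name split_entries and the value ‘dummy_not_wanted’ are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical ping file: one real mapping and two dummy entries.
PING_XML = """
<field_definition>
  <field id="CMIP6_tos" field_ref="sst" />
  <field id="CMIP6_sos" field_ref="dummy" />
  <field id="CMIP6_zos" field_ref="dummy_not_wanted" />
</field_definition>
"""


def split_entries(xml_text):
    """Split field ids into (produced, skipped) according to the dummy rule."""
    produced, skipped = [], []
    for field in ET.fromstring(xml_text).iter("field"):
        field_ref = field.get("field_ref", "")
        # Any field_ref beginning with 'dummy' marks a variable not to be output.
        if field_ref.startswith("dummy"):
            skipped.append(field.get("id"))
        else:
            produced.append(field.get("id"))
    return produced, skipped


produced, skipped = split_entries(PING_XML)
print("produced:", produced)
print("skipped:", skipped)
```

Because only the beginning of the ‘field ref’ is tested, a lab could, under this reading, append a note after ‘dummy’ to record why a variable is not produced.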
<field_definition>
<field id="CMIP6_tos" field_ref="dummy" />
<field id="CMIP6_sos" field_ref="dummy" />
</field_definition>
On the basis of this example, you have to take care:
that the field_ref (‘sst’) is the name of a field already known to XIOS, and actually sent by the model (either unconditionally, or when the model uses ‘xios_field_is_active’ to know whether it should send it);
that you use in the field id, attached to the prefix, only the so-called MIP variable names (above: ‘tos’) listed at http://clipc-services.ceda.ac.uk/dreq/index/var.html; in some cases, you may add a suffix (see below);
for variables required on ocean transects, or as ocean zonal means or ocean sections, that you provide as ‘field ref’ the model variable with the relevant shape;
for variables that are defined both at half and full model levels, that you use the suffix ‘_half’ after the (prefix + CMIP variable name) to define the values at half levels.
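Filling in a template and checking the first point of the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the mapping MODEL_FIELDS, the set KNOWN_TO_XIOS (standing for the field ids declared in the model's native field_def) and the function fill_template are all hypothetical, and none of this is part of dr2xml.

```python
import xml.etree.ElementTree as ET

# Ping file template as generated, with only dummy entries.
TEMPLATE_XML = """<field_definition>
  <field id="CMIP6_tos" field_ref="dummy" />
  <field id="CMIP6_sos" field_ref="dummy" />
</field_definition>"""

# Hypothetical lab choices: prefixed MIP variable id -> native model field.
MODEL_FIELDS = {"CMIP6_tos": "sst"}

# Hypothetical set of field ids known to XIOS from the native field_def.
KNOWN_TO_XIOS = {"sst", "sss"}


def fill_template(template_xml, model_fields, known_ids):
    """Replace 'dummy' by the chosen model field, refusing unknown fields."""
    root = ET.fromstring(template_xml)
    for field in root.iter("field"):
        field_ref = model_fields.get(field.get("id"))
        if field_ref is None:
            continue  # no model counterpart chosen: keep the dummy entry
        if field_ref not in known_ids:
            raise ValueError(
                "%s is not a field known to XIOS (entry %s)"
                % (field_ref, field.get("id"))
            )
        field.set("field_ref", field_ref)
    return root


filled = fill_template(TEMPLATE_XML, MODEL_FIELDS, KNOWN_TO_XIOS)
print(ET.tostring(filled, encoding="unicode"))
```

Entries with no counterpart in the mapping are deliberately left as ‘dummy’, so the result remains a valid ping file at every stage of the work.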
What you do not have to manage is:
describing the spatial operations that are explicitly described by the so-called CMOR variables, e.g. zonal means from lat-lon grids, or extracting profiles or values at given locations (e.g. ‘rlu’ requested in table cfSites as profiles on sampled locations and on atmospheric levels, or ‘ta’ requested on 3 pressure levels in table em1hr) [-not yet implemented-];
describing the vertical interpolations from atmospheric model levels to pressure or height levels; this applies only to the dimension sets ‘alt16’, ‘alt40’ and ‘plev*’ [-not yet available-];
describing time operations (averages, min, max, climatologies) or variables such as tasmin or tasmax; these are handled automatically in the generated field_def.
The ping file offers some facilities:
when the units of the native model variable do not match the expected standard, you can code the units conversion in the ping file as an XIOS arithmetic operation (cf. the conversion from ‘degC’ to ‘K’ in the example above);
you may derive a requested variable by arithmetic on XIOS-known fields, e.g. as in (where uppercase ids are native model ‘field ids’):
<field id="CMIP_mrsos" field_ref="WG1_ISBA"> WG1_ISBA + WGI1_ISBA </field>
you may decide that your model is more cost-effective than XIOS for computing some vertical interpolation; in that case, you may indicate this with a ping-file entry whose identifier is e.g. CV_ta_plev19, i.e. the concatenation of your prefix, the CMIP varname and the dimension label (as stated in the Data Request at http://clipc-services.ceda.ac.uk/dreq/index/grids.html); this works for interpolation to pressure and height levels [-not yet implemented-];
the same applies to zonal means, with the suffix ‘_zm’, which should come after any vertical interpolation suffix (e.g. CV_tas_zm, or CV_ta_plev19_zm) [-not yet implemented-].
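The naming convention for such identifiers can be summarised by a small Python helper. This is a sketch of the convention as described above only; the function ping_id is hypothetical, and the corresponding dr2xml features are flagged above as not yet implemented.

```python
def ping_id(prefix, varname, dim_label=None, zonal_mean=False):
    """Build a ping-file identifier: prefix + varname [+ _dimlabel] [+ _zm].

    Hypothetical helper illustrating the naming convention only.
    """
    parts = [prefix + varname]
    if dim_label:
        # Dimension label as given in the Data Request, e.g. 'plev19'.
        parts.append(dim_label)
    if zonal_mean:
        # The zonal-mean suffix comes after any vertical interpolation suffix.
        parts.append("zm")
    return "_".join(parts)


print(ping_id("CV_", "ta", dim_label="plev19"))
print(ping_id("CV_", "tas", zonal_mean=True))
print(ping_id("CV_", "ta", dim_label="plev19", zonal_mean=True))
```

The three calls reproduce the identifiers quoted in the text: CV_ta_plev19, CV_tas_zm and CV_ta_plev19_zm.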
In addition, dr2xml addresses a number of shortcomings of the CMIP6 Data Request beta.45:
in some cases, there is not enough information in the DR to derive a variable from another one, such as for ta850, the temperature at pressure level 850 hPa; the ping_file templates include some field_defs for that [-not yet implemented-];
ambiguous MIP variable names: 64 MIP variable names are ambiguous in the sense that the set of corresponding CMOR Variables is not homogeneous regarding the area part of the ‘cell_method’. In that case, dr2xml will suffix the MIP variable name with a shortcut for the area type, as listed in the table below; this occurs both in file_def files and ping files. So, you may have to fill in some consecutive ping file lines with the same content if you think the actual geophysical field is the same.
| if cell_method includes: | Automatic suffix is: |
|---|---|
| where floating_ice_shelf | _fixf |
| where grounded_ice_shelf | _gisf |
| where snow over sea_ice area | _sosi |
| where ice_free_sea over area | _ifs |
| where land | _land |
| where sea_ice | _si |
| where sea | _sea |
| where snow | _snow |
| where cloud | _cloud |
| where landuse | _lu |
| where ice_shelf | _isf |
The list of such MIP varnames is: nep tnpeo hfss lai albisccp mrlsl hfgeoubed treeFrac mrsos sisnthick cVeg topg fbddtdife parasolRefl rlds lwsnl snc snm snw hfgeou o2sat fddtdisi rlus cWood prra agesno ts cMisc grassFrac prsn fbddtdic fbddtdin fbddtdip cSoil sbl orog cLitter prveg tpf fLuc fbddtalk fddtdife fddtalk pctisccp mrros lithk sootsn mrro tas tsn tran rsds hfdsn pflw fddtdic fddtdin fddtdip fbddtdisi rsus cProduct sftgif hfls dms
Example: ‘hfss’, the Surface Upward Sensible Heat Flux, is ambiguous (the variable for the whole atmospheric mesh appears in table Amon, but the Ice Sheet part only in table LImongre), while ‘hfssIs’ (also the Ice Sheet part, in table LImon) is not ambiguous:
<field id="hfss_landIce" field_ref="H_ISBA_P3" />
<field id="hfss" field_ref="H" />
<field id="hfssIs" field_ref="H_ISBA_P3" />
in table Omon, there are some references to ‘zfull’ and ‘zhalf’ instead of ‘zfullo’ and ‘zhalfo’ [-no special processing is done by dr2xml for that yet-].
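Coming back to the ambiguous MIP variable names, the correspondence between ‘cell_method’ and automatic suffix can be sketched in Python as follows. This is not the dr2xml code: the function area_suffix is hypothetical, and the choice of testing the longest pattern first is an assumption made here so that, for instance, ‘where landuse’ is not caught by ‘where land’, nor ‘where sea_ice’ by ‘where sea’. The sample cell_methods strings are illustrative.

```python
# Patterns and suffixes copied from the table for ambiguous MIP variable names.
AREA_SUFFIXES = {
    "where floating_ice_shelf": "_fixf",
    "where grounded_ice_shelf": "_gisf",
    "where snow over sea_ice area": "_sosi",
    "where ice_free_sea over area": "_ifs",
    "where land": "_land",
    "where sea_ice": "_si",
    "where sea": "_sea",
    "where snow": "_snow",
    "where cloud": "_cloud",
    "where landuse": "_lu",
    "where ice_shelf": "_isf",
}


def area_suffix(cell_methods):
    """Return the suffix for a cell_methods string, or '' if none applies.

    Assumption: the longest matching pattern wins, so that a pattern which
    is the beginning of a longer one does not shadow it.
    """
    for pattern in sorted(AREA_SUFFIXES, key=len, reverse=True):
        if pattern in cell_methods:
            return AREA_SUFFIXES[pattern]
    return ""


# Illustrative cell_methods strings (not taken from a specific DR table).
for methods in (
    "area: mean where land time: mean",
    "area: mean where landuse time: mean",
    "area: mean where sea_ice time: mean",
    "area: time: mean",
):
    print(repr(methods), "->", repr(area_suffix(methods)))
```

A ping file entry for an ambiguous variable would then use the prefix, the MIP variable name and this suffix as its field id.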