cmiputil.dds module
Module to parse DDS (Dataset Descriptor Structure) used in OPeNDAP.
DDS
For the definition of DDS, see the OPeNDAP User Guide. In this module, we change the notation of the DDS syntax as follows:

declarations := list(declaration)
declaration := Var | Struct
Struct := stype { declarations } (name | name arr)
stype := Dataset|Structure|Sequence|Grid
Grid := Grid { ARRAY: declaration MAPS: declarations } (name | name arr)
Var := btype (name | name arr)
btype := Byte|Int32|UInt32|Float64|String|Url| …
arr := [integer] | [name = integer]

As the syntax above shows, a Struct can contain other Structs recursively, so a DDS forms a tree structure. The root of the tree must be a single “Dataset”.
In this module, each element of the syntax above is implemented as a class.
Basic Usage
The text form of a DDS can be obtained with, for example,
ESGFDataInfo.getDDS(). Pass it to parse_dataset() to
get the tree structure. The root of the tree is a Dataset
instance, and you can access the nodes and leaves of the tree by dot
notation (see also the ‘Example’ section below):
ds = parse_dataset(text=sample1)
ds.tas # Grid('tas', array=Var('tas', ...), maps={'time':..., 'lat':..., 'lon':...})
ds.tas.array.arr[0] # Arr('time', 8412)
Example
>>> sample1 = '''
... Dataset {
... Float64 lat[lat = 160];
... Float64 lat_bnds[lat = 160][bnds = 2];
... Float64 lon[lon = 320];
... Float64 lon_bnds[lon = 320][bnds = 2];
... Float64 height;
... Float64 time[time = 8412];
... Float64 time_bnds[time = 8412][bnds = 2];
... Grid {
... ARRAY:
... Float32 tas[time = 8412][lat = 160][lon = 320];
... MAPS:
... Float64 time[time = 8412];
... Float64 lat[lat = 160];
... Float64 lon[lon = 320];
... } tas;
... } CMIP6.CMIP.MRI.MRI-ESM2-0.piControl.r1i1p1f1.Amon.tas.gn.tas.20190222.aggregation.1;'''
>>> sample1_struct = Dataset(
... 'CMIP6.CMIP.MRI.MRI-ESM2-0.piControl.r1i1p1f1.Amon.tas.gn.tas.20190222.aggregation.1',
... {
... 'lat':
... Var('lat', 'Float64', arr=[Arr('lat', 160)]),
... 'lat_bnds':
... Var('lat_bnds', 'Float64', arr=[Arr('lat', 160),
... Arr('bnds', 2)]),
... 'lon':
... Var('lon', 'Float64', arr=[Arr('lon', 320)]),
... 'lon_bnds':
... Var('lon_bnds', 'Float64', arr=[Arr('lon', 320),
... Arr('bnds', 2)]),
... 'height':
... Var('height', 'Float64'),
... 'time':
... Var('time', 'Float64', arr=[Arr('time', 8412)]),
... 'time_bnds':
... Var('time_bnds', 'Float64', arr=[Arr('time', 8412),
... Arr('bnds', 2)]),
... 'tas':
... Grid('tas',
... array=Var(
... 'tas',
... 'Float32',
... arr=[Arr('time', 8412),
... Arr('lat', 160),
... Arr('lon', 320)]),
... maps={
... 'time': Var('time', 'Float64', arr=[Arr('time', 8412)]),
... 'lat': Var('lat', 'Float64', arr=[Arr('lat', 160)]),
... 'lon': Var('lon', 'Float64', arr=[Arr('lon', 320)])
... })
... })
>>> sample1_struct == parse_dataset(sample1)
True
>>> from cmiputil import dds
>>> sample2 = '''
... Dataset {
... Int32 catalog_number;
... Sequence {
... String experimenter;
... Int32 time;
... Structure {
... Float64 latitude;
... Float64 longitude;
... } location;
... Sequence {
... Float64 depth;
... Float64 salinity;
... Float64 oxygen;
... Float64 temperature;
... } cast;
... } station;
... } data;
... '''
>>> sample2_struct = Dataset(
... 'data', {
... 'catalog_number':
... Var('catalog_number', 'Int32'),
... 'station':
... Sequence(
... 'station', {
... 'experimenter':
... Var('experimenter', 'String'),
... 'time':
... Var('time', 'Int32'),
... 'location':
... Structure(
... 'location', {
... 'latitude': Var('latitude', 'Float64'),
... 'longitude': Var('longitude', 'Float64')
... }),
... 'cast':
... Sequence(
... 'cast', {
... 'depth': Var('depth', 'Float64'),
... 'salinity': Var('salinity', 'Float64'),
... 'oxygen': Var('oxygen', 'Float64'),
... 'temperature': Var('temperature', 'Float64')
... })
... })
... })
>>> sample2_struct == parse_dataset(sample2)
True
>>> sample3 = '''
... Dataset {
... Structure {
... Float64 lat;
... Float64 lon;
... } location;
... Structure {
... Int32 minutes;
... Int32 day;
... Int32 year;
... } time;
... Float64 depth[500];
... Float64 temperature[500];
... } xbt-station;
... '''
>>> sample3_struct = Dataset(
... 'xbt-station', {
... 'location':
... Structure('location', {
... 'lat': Var('lat', 'Float64'),
... 'lon': Var('lon', 'Float64')
... }),
... 'time':
... Structure(
... 'time', {
... 'minutes': Var('minutes', 'Int32'),
... 'day': Var('day', 'Int32'),
... 'year': Var('year', 'Int32')
... }),
... 'depth':
... Var('depth', 'Float64', arr=[Arr('', 500)]),
... 'temperature':
... Var('temperature', 'Float64', arr=[Arr('', 500)])
... })
>>> sample3_struct == parse_dataset(sample3)
True

class dds.Arr(name='', val=None, text=None)
Bases: object

Class for arr.

arr := [integer] | [name = integer]

As a text form:

text = '[time = 8412]'
text = '[500]'

Example

>>> text = '[lat = 160];'
>>> Arr(text=text)
Arr('lat', 160)
>>> text = '[500];'
>>> Arr(text=text)
Arr('', 500)

text_formatted(indent=None, linebreak=None)
Text form of arr.
indent and linebreak are dummy arguments here.

property text

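The two text forms of arr, [integer] and [name = integer], can be recognised with a single regular expression. The following is a minimal standalone sketch, not the parser used by cmiputil: the names ARR_PATTERN and parse_arr are hypothetical, and the result is a plain tuple rather than an Arr instance.

```python
import re

# Hypothetical pattern (an assumption, not taken from cmiputil):
# matches "[integer]" or "[name = integer]".
ARR_PATTERN = re.compile(r"\[\s*(?:(\w+)\s*=\s*)?(\d+)\s*\]")


def parse_arr(text):
    """Return (name, value) for the first arr found in text, or None."""
    m = ARR_PATTERN.search(text)
    if m is None:
        return None
    # An unnamed dimension yields an empty name, mirroring Arr('', 500).
    return (m.group(1) or "", int(m.group(2)))


print(parse_arr("[lat = 160];"))  # ('lat', 160)
print(parse_arr("[500];"))        # ('', 500)
```
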
class dds.BType
Bases: enum.Enum

Values for Var.btype.

Byte = 'Byte'
Float32 = 'Float32'
Float64 = 'Float64'
Int16 = 'Int16'
Int32 = 'Int32'
String = 'String'
UInt32 = 'UInt32'
Url = 'Url'

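Because BType is an enum.Enum whose values are the type names themselves, a type name read from DDS text can be checked by value lookup. The sketch below redefines a standalone copy of the enum with the members listed above so that it runs without cmiputil; the helper is_valid_btype is hypothetical and is not part of the module.

```python
from enum import Enum


class BType(Enum):
    """Standalone copy of the members listed above, for illustration only."""
    Byte = 'Byte'
    Float32 = 'Float32'
    Float64 = 'Float64'
    Int16 = 'Int16'
    Int32 = 'Int32'
    String = 'String'
    UInt32 = 'UInt32'
    Url = 'Url'


def is_valid_btype(name):
    """Return True if name is the value of a BType member (hypothetical helper)."""
    try:
        BType(name)  # lookup by value raises ValueError for unknown names
    except ValueError:
        return False
    return True


print(is_valid_btype('Float64'))   # True
print(is_valid_btype('Float128'))  # False
```
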
class dds.Dataset(name='', decl=None, text=None)
Bases: dds.Struct

Class for Dataset.

See Struct.

stype = 'Dataset'

class dds.Decl(name='')
Bases: object

Class for declaration, that is, the base class for Var and Struct. No need to use this class explicitly.

declaration := Var | Struct

class dds.Decls
Bases: dict

Class for declarations.

declarations := list(declaration)

In this module, declarations are expressed as a dict, not a list. At this point, this class is just an alias for dict.

class dds.Grid(name='', array=None, maps=None, text=None)
Bases: dds.Struct

Class for Grid.

Grid := Grid { ARRAY: declaration MAPS: declarations } (name | name arr)

Examples

>>> text = '''
... Grid {
... ARRAY:
... Float32 tas[time = 8412][lat = 160][lon = 320];
... MAPS:
... Float64 time[time = 8412];
... Float64 lat[lat = 160];
... Float64 lon[lon = 320];
... } tas;'''
>>> Grid(text=text)
Grid('tas', array=Var('tas', 'Float32', arr=[Arr('time', 8412), Arr('lat', 160), Arr('lon', 320)]), maps={'time': Var('time', 'Float64', arr=[Arr('time', 8412)]), 'lat': Var('lat', 'Float64', arr=[Arr('lat', 160)]), 'lon': Var('lon', 'Float64', arr=[Arr('lon', 320)])})

If text is not None, other attributes are overridden by the result of parse().

stype = 'Grid'

property text
Text to construct this instance.

class dds.SType
Bases: enum.Enum

Values for Struct.stype.

Dataset = 'Dataset'
Grid = 'Grid'
Sequence = 'Sequence'
Structure = 'Structure'

class dds.Sequence(name='', decl=None, text=None)
Bases: dds.Struct

Class for Sequence.

See Struct.

Examples

>>> text = '''
... Sequence {
... Float64 depth;
... Float64 salinity;
... Float64 oxygen;
... Float64 temperature;
... } cast;'''
>>> Sequence(text=text)
Sequence('cast', {'depth': Var('depth', 'Float64'), 'salinity': Var('salinity', 'Float64'), 'oxygen': Var('oxygen', 'Float64'), 'temperature': Var('temperature', 'Float64')})

stype = 'Sequence'

class dds.Struct(name='', decl=None, text=None)
Bases: dds.Decl

Class for struct, that is, the base class for Structure, Sequence, Grid and Dataset. Do not use this directly.

struct := stype { declarations } var
stype := Dataset|Structure|Sequence|Grid

You can access items of self.decl as if they were attributes of this class, via dot notation.

Examples

>>> text = '''
... Sequence {
... Float64 depth;
... Float64 salinity;
... Float64 oxygen;
... Float64 temperature;
... } cast;'''
>>> s = Sequence(text=text)
>>> s.salinity
Var('salinity', 'Float64')

>>> text = '''
... Dataset {
... Int32 catalog_number;
... Sequence {
... String experimenter;
... Int32 time;
... Structure {
... Float64 latitude;
... Float64 longitude;
... } location;
... } station;
... } data;'''
>>> d = parse_dataset(text)
>>> d.station.location.latitude
Var('latitude', 'Float64')

If text is not None, other attributes are overridden by the result of parse() or left untouched.

parse(text)
Parse text to construct Struct.
If the given text is not valid for the subclass, the instance is left as a ‘null’ instance.

stype = None

property text
Text to construct this instance.

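The dot-notation access that Struct provides over self.decl can be imitated with __getattr__. The following is a standalone sketch under stated assumptions, not cmiputil's code: the Node class is a hypothetical stand-in for a Struct-like node, and plain strings stand in for Var instances.

```python
class Node:
    """Hypothetical stand-in for a Struct-like node; not part of cmiputil."""

    def __init__(self, name, decl=None):
        self.name = name
        self.decl = dict(decl or {})

    def __getattr__(self, key):
        # __getattr__ runs only when normal attribute lookup fails,
        # so real attributes such as name and decl are not affected.
        decl = self.__dict__.get("decl", {})
        try:
            return decl[key]
        except KeyError:
            raise AttributeError(key) from None


# Build a small tree shaped like the Dataset example: data.station.location
location = Node("location", {"latitude": "Float64 latitude"})
station = Node("station", {"location": location})
data = Node("data", {"station": station})

print(data.station.location.latitude)  # Float64 latitude
```

Raising AttributeError for unknown keys keeps hasattr() and getattr() with a default working as usual on such a node.
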
class dds.Structure(name='', decl=None, text=None)
Bases: dds.Struct

Class for Structure.

See Struct.

stype = 'Structure'

class dds.Var(name='', btype=None, arr=None, text=None)
Bases: dds.Decl

Class for Var.

Var := basetype (name | name arr)

Raises
TypeError – if btype or arr is invalid

If text is not None, other attributes are overridden by the result of parse().

text_formatted(indent=None, linebreak=None)
Formatted text expression of this instance.
indent and linebreak are dummy arguments here.

property text
Text to construct this instance.

dds.check_braces_matching(text)
Check whether the braces ({ and }) in the given text match.
Raises ValueError if they do not match.

Examples

>>> text = 'Dataset{varline} hoge'
>>> check_braces_matching(text) # True

>>> text = 'Struct{ Sequence{Var} fuga }} hoge'
>>> check_braces_matching(text)
Traceback (most recent call last):
    ...
ValueError: braces do not match: too many right braces: 1 more.

>>> text = 'Struct{ Sequence{{Var} fuga } hoge'
>>> check_braces_matching(text)
Traceback (most recent call last):
    ...
ValueError: braces do not match: too many left braces: 1 more.

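A check of this kind can be sketched by comparing the counts of left and right braces. The function below is a standalone illustration, not the implementation of check_braces_matching: the name check_braces and its messages are hypothetical, and it only compares totals, so it does not detect braces that appear in the wrong order.

```python
def check_braces(text):
    """Raise ValueError unless text has as many '{' as '}' (hypothetical helper)."""
    diff = text.count("{") - text.count("}")
    if diff > 0:
        raise ValueError(f"too many left braces: {diff} more")
    if diff < 0:
        raise ValueError(f"too many right braces: {-diff} more")


check_braces("Dataset{varline} hoge")  # balanced: no exception is raised

try:
    check_braces("Struct{ Sequence{Var} fuga }} hoge")
except ValueError as err:
    print(err)  # too many right braces: 1 more
```
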
dds.parse_arrdecls(text)
Parse text containing multiple Arr definitions and return a list of them.
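
Collecting several arr definitions from one declaration can be sketched with re.finditer. This is a standalone illustration rather than the code behind parse_arrdecls: the pattern and the find_arrs helper are hypothetical, and the result is a list of (name, value) tuples instead of Arr instances.

```python
import re

# Hypothetical pattern (an assumption, not taken from cmiputil):
# matches "[integer]" or "[name = integer]".
ARR_PATTERN = re.compile(r"\[\s*(?:(\w+)\s*=\s*)?(\d+)\s*\]")


def find_arrs(text):
    """Return a list of (name, value) tuples, one per arr found in text."""
    return [(m.group(1) or "", int(m.group(2)))
            for m in ARR_PATTERN.finditer(text)]


print(find_arrs("Float32 tas[time = 8412][lat = 160][lon = 320];"))
# [('time', 8412), ('lat', 160), ('lon', 320)]
```
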