user308827 user308827 - 2 months ago 13
Python Question

Error on using xarray open_mfdataset function

I am trying to combine multiple netCDF files with the same dimensions, their dimensions are as follows:

OrderedDict([(u'lat', <type 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 720
), (u'lon', <type 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), (u'time', <type 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 96
), (u'nv', <type 'netCDF4._netCDF4.Dimension'>: name = 'nv', size = 2
)])
OrderedDict([(u'lat', <type 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 720
), (u'lon', <type 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), (u'time', <type 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 96
), (u'nv', <type 'netCDF4._netCDF4.Dimension'>: name = 'nv', size = 2
)])
OrderedDict([(u'lat', <type 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 720
), (u'lon', <type 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), (u'time', <type 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 96
), (u'nv', <type 'netCDF4._netCDF4.Dimension'>: name = 'nv', size = 2
)])
OrderedDict([(u'lat', <type 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 720
), (u'lon', <type 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), (u'time', <type 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 96
), (u'nv', <type 'netCDF4._netCDF4.Dimension'>: name = 'nv', size = 2
)])
OrderedDict([(u'lat', <type 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 720
), (u'lon', <type 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), (u'time', <type 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 96
), (u'nv', <type 'netCDF4._netCDF4.Dimension'>: name = 'nv', size = 2
)])
OrderedDict([(u'lat', <type 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 720
), (u'lon', <type 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), (u'time', <type 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 96
), (u'nv', <type 'netCDF4._netCDF4.Dimension'>: name = 'nv', size = 2
)])
OrderedDict([(u'lat', <type 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 720
), (u'lon', <type 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), (u'time', <type 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 96
), (u'nv', <type 'netCDF4._netCDF4.Dimension'>: name = 'nv', size = 2
)])


However, on using open_mfdataset, I get this error:

xr.open_mfdataset(path_file, decode_times=False)

*** ValueError: cannot infer dimension to concatenate: supply the ``concat_dim`` argument explicitly


How to fix this error? My dimensions are the same in all the files

Answer

This error message is probably arising because you have two files with the same variables and coordinate values, and xarray doesn't know whether it should stack them together along a new dimension or simply check to make sure none of the values conflict.

It would be nice if explicitly calling open_mfdataset with concat_dim=None disabled all attempts at concatenation. This change should make it into the next release of xarray (v0.9.0).

In the meantime, you can work around this by opening the files individually and merging them explicitly, e.g.,

def open_mfdataset_merge_only(paths, **kwargs):
    if isinstance(paths, basestring):
        paths = sorted(glob(paths))
    return xr.merge([xr.open_dataset(path, **kwargs) for path in paths])

Under the covers, this is basically all that open_mfdataset is doing.

Comments