Draft: Resolve "change to_xarray from DataArray to Dataset"
Additions to the points of issue #7:
- according to benchmarks and discussions (ref1, ref2, ref3), `odc-stac` may be faster than `stackstac` in wide-area cases.
- `odc-stac` can mosaic automatically with `groupby="solar_day"`.
- `stackstac` was a side project (see comment) that does not seem to be actively maintained anymore (last PR merged in Aug. 2024). Given some evolutions of STAC (e.g. issue 262), it would pin pystac to an old version. Thus, an alternative seems necessary to go forward, and it could be `odc-stac`.
`odc-stac` still has a few limitations:
- it does not forward the full STAC metadata: this affects the "bijection" between ItemCollection and xarray
- it does not automatically apply scale and offset: https://github.com/opendatacube/odc-stac/issues/55
- in my tests (2023), `odc-stac` was much slower than `stackstac` at building the delayed xarray datacube, as it would read the header of each asset file
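
For the scale and offset point, a minimal sketch of what a post-load step could look like, assuming the per-band values have already been extracted from the collection metadata into a plain dict. The helper name `apply_scale_offset` and the `band_meta` layout are hypothetical (not part of `odc-stac` or simplestac), and the data are synthetic:

```python
import numpy as np
import xarray as xr


def apply_scale_offset(ds, band_meta):
    """Return a copy of ds with `value * scale + offset` applied per band.

    band_meta is a hypothetical mapping {band_name: {"scale": s, "offset": o}};
    bands without an entry are left numerically unchanged (scale=1, offset=0).
    """
    out = ds.copy()
    with xr.set_options(keep_attrs=True):  # keep per-band attrs through the arithmetic
        for name in ds.data_vars:
            meta = band_meta.get(name, {})
            scale = meta.get("scale", 1.0)
            offset = meta.get("offset", 0.0)
            out[name] = ds[name].astype("float64") * scale + offset
    return out


# Synthetic stand-in for a loaded Dataset (one variable per band).
raw = xr.Dataset(
    {
        "B04": (("y", "x"), np.array([[1000, 2000]], dtype="int16")),
        "B08": (("y", "x"), np.array([[3000, 4000]], dtype="int16")),
    }
)
band_meta = {
    "B04": {"scale": 0.0001, "offset": -0.1},
    "B08": {"scale": 0.0001, "offset": -0.1},
}
scaled = apply_scale_offset(raw, band_meta)
```

The input Dataset is not modified, so the raw digital numbers stay available if needed.
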
TODO:
- check the impact of `xarray.Dataset` on the other methods/functions:
  - keeping `stackstac`, the impact is small but has a few consequences: the Dataset needs to be converted to a DataArray to:
    - filter on asset extra_fields, e.g. `subxr.sel(band=(subxr.common_name=="nir"))` does not work anymore
    - plot several bands and times as facets, as faceting with multiple variables is not available for `xarray.Dataset`: `subxr.to_array("band").plot(row="time", col="band")`
    - use numerical indexes, e.g. `subxr.to_array("band")[:,:2,:,:]`
  - using `odc.stac`:
    - makes some methods using id unusable as is, e.g. write
    - would need some mechanism/dev to apply band scale and offset when present in collection metadata
- replace `stackstac` by `xpystac` to allow choice of the backend engine
- replace `.sel(band=band)` by `[band]`
- add option for dataarray output
- update major version of simplestac
- make a simple comparison `stackstac` vs `odc-stac` before including `odc-stac` in the documentation
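
To illustrate the Dataset/DataArray consequences listed in the TODO, here is a small sketch on synthetic data. It assumes that per-band fields such as `common_name` end up in each variable's `attrs`, which may not match what `to_xarray` actually produces; the band names and shapes are made up:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a Dataset output: one variable per band.
times = np.array(["2024-01-01", "2024-01-02", "2024-01-03"], dtype="datetime64[ns]")
dims = ("time", "y", "x")
ds = xr.Dataset(
    {
        "B04": (dims, np.zeros((3, 3, 4)), {"common_name": "red"}),
        "B08": (dims, np.ones((3, 3, 4)), {"common_name": "nir"}),
    },
    coords={"time": times},
)

# Equivalent of .sel(band=band) on a Dataset: index by variable name.
nir = ds["B08"]

# Filtering on a per-band field (assumed here to live in variable attrs).
nir_names = [
    name for name, var in ds.data_vars.items()
    if var.attrs.get("common_name") == "nir"
]
nir_only = ds[nir_names]

# Back to a single DataArray with a leading "band" dimension,
# e.g. for faceted plots or positional indexing.
da = ds.to_array("band")          # dims: (band, time, y, x)
first_two_times = da[:, :2, :, :]  # first two steps along the second dimension
```

With the DataArray form, a call like `da.plot(row="time", col="band")` becomes possible again (matplotlib required).
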
Closes #7
Edited by DE BOISSIEU FLORIAN