anu_ctlab_io.zarr
Read and write data from/to the ANU CTLab zarr data format.
This is an optional extra module, and must be explicitly installed to be used (e.g., pip install anu_ctlab_io[zarr]).
Package Contents
- class anu_ctlab_io.zarr.OMEZarrVersion(*args, **kwds)
Bases: enum.Enum
OME-Zarr specification version to use when writing.
- v05 = '0.5'
- anu_ctlab_io.zarr.dataset_to_zarr(dataset, path, datatype=None, dataset_id=None, ome_zarr_version=OMEZarrVersion.v05, max_shard_size_mb=None, history=None, chunk_size_mb=None, chunks='auto', shards='auto', create_array_kwargs=None, dimension_separator_threshold=64, **extra_attrs)
Write a Dataset to Zarr format.
By default, chunks and shards use power-of-two square or cubic shapes targeting 32**3 and 512**3 elements respectively. Shapes are trimmed to the array dimensions, and shards are rounded up to chunk multiples.
- Parameters:
dataset (anu_ctlab_io._dataset.Dataset) – The Dataset to write.
path (pathlib.Path | str) – Path to write the Zarr store.
datatype (anu_ctlab_io._datatype.DataType | str | None) – The data type identifier. If None, attempts to infer from dataset.
dataset_id (str | None) – Unique identifier for the dataset. Auto-generated if not provided.
ome_zarr_version (OMEZarrVersion | None) – OME-Zarr specification version to use. Set to OMEZarrVersion.v05 (default) to write OME-Zarr V0.5 group format. Set to None to write a simple Zarr V3 array with mango metadata.
max_shard_size_mb (float | None) – Maximum shard size in MB for Zarr v3 sharding. Deprecated and ignored. Passing this emits a warning and leaves layout selection to chunks/shards.
history (anu_ctlab_io._parse_history.History | None) – Dictionary of history entries to add. Keys should be identifiers, values are history strings.
chunk_size_mb (float | None) – Target chunk size in MB for automatic chunking. Deprecated and ignored. Passing this emits a warning and leaves layout selection to chunks/shards.
chunks (ChunkSpec) – Chunk shape as a tuple (e.g., (10, 512, 512)), an int (target number of elements), or 'auto'. Can be provided on its own to write a non-sharded Zarr array, or together with shards to use the sharding codec. An integer specifies the target number of elements for an automatically derived layout. 'auto' uses a default target corresponding to 32**3 or 256**2 elements. To write without sharding, pass shards=None explicitly. A value of 0 in a shape tuple means "span this axis": the full array axis for unsharded writes, or the full shard axis when shards is also provided.
shards (ShardSpec) – Shard shape as a tuple (e.g., (100, 512, 512)), an int (target number of elements), or 'auto'. May be provided together with an explicit chunks tuple. Providing an explicit shard shape with chunks='auto' is an error. An integer specifies the target number of elements for an automatically derived layout. Use None to disable sharding, or 'auto' to use the default target of 512**3 or 8192**2 elements. A value of 0 in a shape tuple means "span the full array axis". When an explicit shard shape is provided, the user is responsible for ensuring shard shapes are evenly divisible by chunk shapes.
create_array_kwargs (dict[str, Any] | None) – Additional keyword arguments to pass to zarr.create_array(). For example, to set compression: create_array_kwargs={'compressors': [ZstdCodec(level=5)]}.
dimension_separator_threshold (int | None) – Use '/' as the chunk key dimension separator when the number of chunks exceeds this threshold; otherwise use '.'. None uses the Zarr default of '/'.
extra_attrs (Any) – Additional attributes to include in mango metadata.
- Return type:
None
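The layout rules described for chunks, shards, and dimension_separator_threshold can be illustrated with a small standalone sketch. This is not the library's implementation: the helper names (expand_zeros, trim_to_array, round_up_to_multiple, pick_separator) are hypothetical, the order "trim first, then round shards up" is an assumption, and "number of chunks" is assumed to mean the total chunk count of the array.

```python
def expand_zeros(shape, span):
    """Replace each 0 in an explicit shape tuple with the matching extent of `span`.

    `span` is the array shape for unsharded writes, or the shard shape
    when a shard shape is also given (per the `chunks` description).
    """
    return tuple(s if c == 0 else c for c, s in zip(shape, span))


def trim_to_array(shape, array_shape):
    """Clamp each extent so it does not exceed the array dimension."""
    return tuple(min(c, a) for c, a in zip(shape, array_shape))


def round_up_to_multiple(shard_shape, chunk_shape):
    """Round each shard extent up to the next whole multiple of the chunk extent."""
    return tuple(-(-s // c) * c for s, c in zip(shard_shape, chunk_shape))


def pick_separator(n_chunks, threshold):
    """Choose the chunk key separator as described for dimension_separator_threshold."""
    if threshold is None:
        return "/"  # documented fallback when the threshold is None
    return "/" if n_chunks > threshold else "."


array_shape = (100, 600, 600)

# Default cubic targets: 32**3 elements per chunk, 512**3 per shard.
chunk_shape = trim_to_array((32, 32, 32), array_shape)
shard_shape = round_up_to_multiple(
    trim_to_array((512, 512, 512), array_shape), chunk_shape
)

print(chunk_shape)  # (32, 32, 32)
print(shard_shape)  # (128, 512, 512)

# A 0 entry spans the full array axis when no shards are given ...
print(expand_zeros((0, 512, 512), array_shape))     # (100, 512, 512)
# ... and the full shard axis when a shard shape is provided.
print(expand_zeros((0, 256, 256), (50, 512, 512)))  # (50, 256, 256)
```

Note that rounding a trimmed shard up to a chunk multiple can make it larger than the array along that axis (128 versus 100 above); whether the library leaves the shard that way is not stated in this reference.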
- anu_ctlab_io.zarr.dataset_from_zarr(path, **kwargs)
Loads a Dataset from the path to a zarr.
This method is used by Dataset.from_path; prefer calling that constructor directly.
- Parameters:
path (pathlib.Path) – The path to the zarr to be loaded.
kwargs (Any) – Currently this method consumes no kwargs, but will pass provided kwargs to dask.Array.from_path.
- Return type:
anu_ctlab_io._dataset.Dataset
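A typical round trip might look like the sketch below, which loads a dataset through Dataset.from_path and writes it back out with an explicit chunk and shard layout. It assumes the zarr extra is installed and that Dataset is importable from the top-level anu_ctlab_io package (this page only shows it under anu_ctlab_io._dataset). The function name rewrite_zarr and the layout values are illustrative only.

```python
from pathlib import Path


def rewrite_zarr(src, dst):
    """Load a dataset from `src` and write it to `dst` as OME-Zarr (hypothetical helper)."""
    # Imports are deferred so this module can be imported even when the
    # optional zarr extra is not installed.
    from anu_ctlab_io import Dataset  # assumption: re-exported at package top level
    from anu_ctlab_io.zarr import dataset_to_zarr

    # Dataset.from_path is the preferred entry point over dataset_from_zarr.
    dataset = Dataset.from_path(Path(src))

    # Explicit layout: each shard extent is a whole multiple of the chunk
    # extent, which is the caller's responsibility for explicit shapes.
    dataset_to_zarr(
        dataset,
        Path(dst),
        chunks=(32, 32, 32),
        shards=(128, 512, 512),
    )
```

Leaving chunks and shards at their 'auto' defaults hands layout selection to the library, which is usually the simpler choice unless a downstream reader needs a specific chunk shape.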