anu_ctlab_io.zarr

Read and write data in the ANU CTLab Zarr data format.

This module is an optional extra and must be installed explicitly before use (e.g., pip install anu_ctlab_io[zarr]).

Package Contents

class anu_ctlab_io.zarr.OMEZarrVersion(*args, **kwds)

Bases: enum.Enum

OME-Zarr specification version to use when writing.

v05 = '0.5'

anu_ctlab_io.zarr.dataset_to_zarr(dataset, path, datatype=None, dataset_id=None, ome_zarr_version=OMEZarrVersion.v05, max_shard_size_mb=None, history=None, chunk_size_mb=None, chunks='auto', shards='auto', create_array_kwargs=None, dimension_separator_threshold=64, **extra_attrs)

Write a Dataset to Zarr format.

By default, chunks and shards use power-of-two square or cubic shapes. Chunks target 32**3 elements (256**2 for square shapes) and shards target 512**3 elements (8192**2 for square shapes). Shapes are trimmed to the array dimensions, and shards are rounded up to chunk multiples.
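The default layout rule can be illustrated with a small sketch. This is not the library's implementation: the helper names cubic_shape and round_up_to_multiple are hypothetical, and the sketch assumes the side length is the largest power of two whose ndim-th power fits within the target.

```python
def cubic_shape(array_shape, target_elements):
    """Power-of-two square/cubic shape near target_elements, trimmed to the array.

    Hypothetical helper for illustration only.
    """
    ndim = len(array_shape)
    # Largest power-of-two side whose ndim-th power does not exceed the target.
    side = 1
    while (side * 2) ** ndim <= target_elements:
        side *= 2
    # Trim each axis to the array dimension.
    return tuple(min(side, dim) for dim in array_shape)


def round_up_to_multiple(shard_shape, chunk_shape):
    """Round each shard axis up to the next multiple of the chunk axis."""
    return tuple(-(-s // c) * c for s, c in zip(shard_shape, chunk_shape))


array_shape = (100, 2000, 2000)
chunks = cubic_shape(array_shape, 32**3)
shards = round_up_to_multiple(cubic_shape(array_shape, 512**3), chunks)
print(chunks)  # (32, 32, 32)
print(shards)  # (128, 512, 512)
```

In this example the first shard axis is trimmed from 512 to 100 and then rounded up to 128, the next multiple of the 32-element chunk axis.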

Parameters:
  • dataset (anu_ctlab_io._dataset.Dataset) – The Dataset to write.

  • path (pathlib.Path | str) – Path to write the Zarr store.

  • datatype (anu_ctlab_io._datatype.DataType | str | None) – The data type identifier. If None, attempts to infer from dataset.

  • dataset_id (str | None) – Unique identifier for the dataset. Auto-generated if not provided.

  • ome_zarr_version (OMEZarrVersion | None) – OME-Zarr specification version to use. Set to OMEZarrVersion.v05 (default) to write OME-Zarr V0.5 group format. Set to None to write a simple Zarr V3 array with mango metadata.

  • max_shard_size_mb (float | None) – Maximum shard size in MB for Zarr v3 sharding. Deprecated and ignored. Passing this emits a warning and leaves layout selection to chunks/shards.

  • history (anu_ctlab_io._parse_history.History | None) – Dictionary of history entries to add. Keys should be identifiers, values are history strings.

  • chunk_size_mb (float | None) – Target chunk size in MB for automatic chunking. Deprecated and ignored. Passing this emits a warning and leaves layout selection to chunks/shards.

  • chunks (ChunkSpec) – Chunk shape as a tuple (e.g., (10, 512, 512)), an int, or 'auto'. An integer specifies the target number of elements for an automatically derived layout; 'auto' uses a default target corresponding to 32**3 or 256**2 elements. Chunks can be provided together with shards to use the sharding codec. To write a non-sharded Zarr array, pass shards=None explicitly. A value of 0 in a shape tuple means “span this axis”: the full array axis for unsharded writes, or the full shard axis when shards is also provided.

  • shards (ShardSpec) – Shard shape as a tuple (e.g., (100, 512, 512)), an int, 'auto', or None. An explicit shard tuple may be provided together with an explicit chunks tuple; providing it with chunks='auto' is an error. An integer specifies the target number of elements for an automatically derived layout. Use None to disable sharding, or 'auto' to use the default target of 512**3 or 8192**2 elements. A value of 0 in a shape tuple means “span the full array axis”. When an explicit shard shape is provided, the caller is responsible for ensuring that it is evenly divisible by the chunk shape.

  • create_array_kwargs (dict[str, Any] | None) – Additional keyword arguments to pass to zarr.create_array(). For example, to set compression: create_array_kwargs={'compressors': [ZstdCodec(level=5)]}.

  • dimension_separator_threshold (int | None) – Use '/' as the chunk key dimension separator when the number of chunks exceeds this threshold; otherwise use '.'. None uses the Zarr default of '/'.

  • extra_attrs (Any) – Additional attributes to include in mango metadata.
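The shape and separator rules in the parameter list above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: resolve_zeros, is_evenly_divisible and choose_separator are hypothetical names, and the sketch assumes the threshold is compared against the number of stored grid cells (shards when sharding is used, chunks otherwise), which the documentation does not spell out.

```python
from math import prod


def resolve_zeros(shape, full_extent):
    """Replace each 0 in shape with the matching axis length of full_extent."""
    return tuple(full if size == 0 else size for size, full in zip(shape, full_extent))


def is_evenly_divisible(shard_shape, chunk_shape):
    """Check the requirement that every shard axis is a multiple of the chunk axis."""
    return all(s % c == 0 for s, c in zip(shard_shape, chunk_shape))


def choose_separator(array_shape, stored_shape, threshold=64):
    """Pick '/' above the threshold and '.' otherwise; None keeps '/'."""
    if threshold is None:
        return "/"
    # Number of grid cells, counting partial cells at the array edges.
    count = prod(-(-dim // size) for dim, size in zip(array_shape, stored_shape))
    return "/" if count > threshold else "."


array_shape = (100, 2000, 2000)
# For shards, 0 spans the full array axis.
shards = resolve_zeros((0, 512, 512), array_shape)
# With shards given, a 0 in chunks spans the full shard axis.
chunks = resolve_zeros((10, 0, 0), shards)

print(shards)                                 # (100, 512, 512)
print(chunks)                                 # (10, 512, 512)
print(is_evenly_divisible(shards, chunks))    # True
print(choose_separator(array_shape, shards))  # '.' (16 grid cells)
print(choose_separator(array_shape, chunks))  # '/' (160 grid cells)
```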

Return type:

None

anu_ctlab_io.zarr.dataset_from_zarr(path, **kwargs)

Load a Dataset from a Zarr store at the given path.

This function is used by Dataset.from_path; prefer calling that constructor directly.

Parameters:
  • path (pathlib.Path) – The path to the Zarr store to be loaded.

  • kwargs (Any) – Currently this function consumes no kwargs itself, but will pass provided kwargs to dask.Array.from_path.

Return type:

anu_ctlab_io._dataset.Dataset
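A possible round trip using the functions documented on this page is sketched below. The wrapper convert_to_zarr is a hypothetical name, and the sketch assumes that Dataset can be imported from the top-level anu_ctlab_io package (this page only shows it as anu_ctlab_io._dataset.Dataset). The imports are deferred into the function so that the snippet can be loaded even when the optional zarr extra is not installed.

```python
from pathlib import Path


def convert_to_zarr(source, destination):
    """Read a dataset from source, write it as OME-Zarr, and read it back.

    Hypothetical wrapper for illustration only.
    """
    # Assumption: Dataset is exported from the top-level package.
    from anu_ctlab_io import Dataset
    from anu_ctlab_io.zarr import OMEZarrVersion, dataset_to_zarr

    dataset = Dataset.from_path(Path(source))
    # chunks and shards are left at their 'auto' defaults.
    dataset_to_zarr(
        dataset,
        Path(destination),
        ome_zarr_version=OMEZarrVersion.v05,
    )
    # Dataset.from_path is the preferred entry point over dataset_from_zarr.
    return Dataset.from_path(Path(destination))
```

Calling convert_to_zarr with an input path and an output path returns the Dataset re-read from the newly written store. Pass shards=None to dataset_to_zarr instead if a non-sharded array is wanted.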