stouputils.collections.z_array module

array_to_disk(
data: NDArray[Any] | zarr.Array[Any],
delete_input: bool = True,
more_data: NDArray[Any] | zarr.Array[Any] | None = None,
) → tuple[zarr.Array[Any], str, int]

Store large NumPy arrays on disk as zarr arrays for efficient storage and access.

Zarr is a simpler alternative to np.memmap that adds chunked storage and compression.

Parameters:
  • data (NDArray | zarr.Array) – The data to save/load as a zarr array

  • delete_input (bool) – Whether to delete the input data after creating the zarr array

  • more_data (NDArray | zarr.Array | None) – Additional data to append to the zarr array

Returns:

The zarr array, the directory path, and the total size in bytes

Return type:

tuple[zarr.Array, str, int]

Examples

>>> import numpy as np
>>> data = np.random.rand(1000, 1000)
>>> zarr_array = array_to_disk(data)[0]
>>> zarr_array.shape
(1000, 1000)
>>> more_data = np.random.rand(500, 1000)
>>> longer_array, dir_path, total_size = array_to_disk(zarr_array, more_data=more_data)