flox.rechunk_for_cohorts

flox.rechunk_for_cohorts(array, axis, labels, force_new_chunk_at, chunksize=None, ignore_old_chunks=False, debug=False)

Rechunks array so that each new chunk contains groups that always occur together.

Parameters:
array : dask.array.Array

Array to rechunk.

axis : int

Axis to rechunk

labels : np.array

1D group labels to align chunks with. This routine works well when labels has repeating patterns, e.g. 1, 2, 3, 1, 2, 3, 4, 1, 2, 3, though the pattern does not have to contain sequences.

force_new_chunk_at : Sequence

Labels at which we always start a new chunk. For the example labels array, this would be 1.

chunksize : int, optional

Nominal chunk size. The chunk size is exceeded when the label in force_new_chunk_at is less than chunksize//2 elements away. If None, uses the median chunk size along axis.

Returns:
dask.array.Array

Rechunked array.
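
Examples:

The role of force_new_chunk_at can be illustrated with a small pure-Python sketch. This is not flox's implementation: `sketch_chunks` is a hypothetical helper that only starts a new chunk at every forced label and ignores chunksize and the existing chunks of the array. It shows the chunk sizes one might expect for the example labels above.

```python
def sketch_chunks(labels, force_new_chunk_at):
    """Hypothetical helper (not part of flox): chunk sizes obtained by
    starting a new chunk at every label listed in force_new_chunk_at."""
    forced = set(force_new_chunk_at)
    # Index 0 always opens the first chunk; each later forced label opens another.
    starts = [0] + [i for i, lab in enumerate(labels) if i > 0 and lab in forced]
    ends = starts[1:] + [len(labels)]
    return tuple(end - start for start, end in zip(starts, ends))


labels = [1, 2, 3, 1, 2, 3, 4, 1, 2, 3]
chunks = sketch_chunks(labels, force_new_chunk_at=[1])
print(chunks)  # (3, 4, 3)
```

Each resulting chunk holds one full repetition of the pattern, so the groups that occur together stay in the same chunk. The real function additionally takes chunksize into account, so its output may differ from this sketch.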