`dandi.support.digests`#

Provides helpers to compute digests (MD5, etc.) of files.

class dandi.support.digests.Digester(digests: list[str] = <factory>, blocksize: int = 65536)[source]#

Helper to compute multiple digests in one pass for a file

blocksize: int = 65536#

Chunk size (in bytes) by which to consume a file.

digest_funcs: list[Callable[[], Hasher]]#

digests: list[str]#

List of supported algorithm labels, such as md5 or sha1.
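The single-pass idea behind this class can be illustrated with the standard library alone. The following is a minimal sketch, not the `Digester` implementation: the function name `multi_digest` and its default algorithm list are hypothetical, and only the `blocksize` default of 65536 is taken from the signature above.

```python
import hashlib


def multi_digest(path, algorithms=("md5", "sha1", "sha256"), blocksize=65536):
    """Return {algorithm: hex digest} for a file, reading it only once.

    Hypothetical helper for illustration; not part of dandi.
    """
    # One hasher object per requested algorithm label.
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as fp:
        # Consume the file in fixed-size chunks so that large files
        # are never loaded into memory in full.
        while chunk := fp.read(blocksize):
            # Feed the same chunk to every hasher before reading the next.
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Reading each chunk once and updating every hasher with it avoids re-reading the file per algorithm, which is where most of the cost lies for large files.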

dandi.support.digests.checksum_zarr_dir(files: dict[str, tuple[str, int]], directories: dict[str, tuple[str, int]]) → str[source]#

Calculate the Zarr checksum of a directory only from information about the files and subdirectories immediately within it.

Parameters:

files – A mapping from names of files in the directory to pairs of their MD5 digests and sizes
directories – A mapping from names of subdirectories in the directory to pairs of their Zarr checksums and the sum of the sizes of all files recursively within them
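To make the shape of the `files` argument concrete, here is a minimal sketch that builds such a mapping for the files immediately inside a directory. The helper name `immediate_file_entries` is hypothetical and not part of dandi; the `directories` mapping is not built here because its values are themselves Zarr checksums.

```python
import hashlib
import os


def immediate_file_entries(dirpath):
    """Map each file directly inside dirpath to (MD5 hex digest, size in bytes).

    Hypothetical helper for illustration; subdirectories are skipped.
    """
    entries = {}
    with os.scandir(dirpath) as it:
        for entry in it:
            if entry.is_file():
                # Reading the whole file is acceptable for a sketch;
                # chunked reading would suit very large files better.
                with open(entry.path, "rb") as fp:
                    data = fp.read()
                entries[entry.name] = (hashlib.md5(data).hexdigest(), len(data))
    return entries
```

The returned dict has the `dict[str, tuple[str, int]]` shape that the `files` parameter describes.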

dandi.support.digests.get_dandietag(filepath: str | Path) → DandiETag[source]#

dandi.support.digests.get_digest(filepath: str | Path, digest: str = 'sha256') → str[source]#

dandi.support.digests.get_zarr_checksum(path: Path, known: dict[str, str] | None = None) → str[source]#

Compute the Zarr checksum for a file or directory tree.

If the digests for any files in the Zarr are already known, they can be passed in the known argument, which must be a dict mapping slash-separated paths relative to the root of the Zarr to hex digests.
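A mapping in the form expected by `known` could be assembled as in the sketch below. The helper name `build_known` is hypothetical, and the use of MD5 for the hex digests is an assumption based on the MD5 digests that `checksum_zarr_dir` takes for files.

```python
import hashlib
from pathlib import Path


def build_known(zarr_root):
    """Map slash-separated paths relative to zarr_root to hex digests.

    Hypothetical helper for illustration; MD5 is assumed as the digest.
    """
    root = Path(zarr_root)
    known = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # as_posix() yields forward slashes on every platform.
            key = path.relative_to(root).as_posix()
            known[key] = hashlib.md5(path.read_bytes()).hexdigest()
    return known
```

Given the signature above, the result would be passed as `get_zarr_checksum(root, known=known)`. Using `as_posix()` matters on Windows, where the native separator would otherwise produce backslash-separated keys.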

dandi.support.digests.md5file_nocache(filepath: str | Path) → str[source]#

Compute the MD5 digest of a file without caching via fscacher. Caching has been shown to slow things down for the large numbers of files typically present in Zarrs.
