dandi organize

dandi [<global options>] organize [<options>] [<path> ...]

(Re)organize files according to their metadata.

The purpose of this command is to take advantage of metadata contained in *.nwb files to provide datasets with consistently-named files whose names reflect the data they contain.

*.nwb files are organized into a hierarchy of subfolders, one per “subject”: for example, sub-0001 for an *.nwb file whose Subject group has subject_id=0001. Each file in a subject-specific subfolder follows the pattern:

sub-<subject_id>[_key-<value>][_mod1+mod2+...].nwb

where the following keys are considered if present in the data:

  • ses - session_id

  • tis - tissue_sample_id

  • slice - slice_id

  • cell - cell_id

and modX are “modalities”, identified from the neural data types detected in the file (such as “ecephys” or “icephys”), per extensions found in nwb-schema definitions.

In addition, an “obj” key with a value corresponding to the crc32 checksum of “object_id” is added if the aforementioned keys and the list of modalities are not sufficient to disambiguate different files.
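
The naming pattern above can be illustrated with a minimal Python sketch. It is not dandi's implementation: organized_path and KEYS are hypothetical names, the placement of the “obj” key before the modalities follows the pattern as written, and rendering the crc32 checksum as eight hex digits is an assumption (the real tool may encode it differently).

```python
import zlib

# Key prefixes and the metadata fields they stand for, in the order listed above.
KEYS = [
    ("ses", "session_id"),
    ("tis", "tissue_sample_id"),
    ("slice", "slice_id"),
    ("cell", "cell_id"),
]


def organized_path(meta, modalities=(), with_obj=False):
    """Build 'sub-<id>/sub-<id>[_key-<value>][_mod1+mod2+...].nwb'.

    Hypothetical helper for illustration only.
    """
    subject = meta["subject_id"]
    parts = [f"sub-{subject}"]
    for key, field in KEYS:
        value = meta.get(field)
        if value:  # a key is only considered when present in the data
            parts.append(f"{key}-{value}")
    if with_obj:
        # Assumption: the checksum is shown as 8 hex digits.
        checksum = zlib.crc32(meta["object_id"].encode("ascii"))
        parts.append(f"obj-{checksum:08x}")
    if modalities:
        parts.append("+".join(modalities))
    return f"sub-{subject}/" + "_".join(parts) + ".nwb"


print(organized_path({"subject_id": "0001", "session_id": "20200101"}, ["ecephys"]))
# prints: sub-0001/sub-0001_ses-20200101_ecephys.nwb
```

Passing with_obj=True mimics the case where the other keys and modalities do not tell two files apart, so a checksum of object_id is appended.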

You can visit https://dandiarchive.org for a growing collection of (re)organized dandisets.

Options

-d, --dandiset-path <dir>

The root directory of the Dandiset to organize files under. If not specified, the Dandiset under the current directory is assumed. For ‘simulate’ mode, the target Dandiset/directory must not exist.

-f, --files-mode [dry|simulate|copy|move|hardlink|symlink|auto]

How to relocate the files.

  • auto [default] — The first of symlink, hardlink, and copy that is supported by the local filesystem

  • dry — No action is performed, suggested renames are printed

  • simulate — A hierarchy of empty files at --dandiset-path is created. Note that the previous layout should be removed prior to this operation.
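
The fallback order described for the auto mode (symlink, then hardlink, then copy) can be sketched in Python as follows. This is an illustration under stated assumptions, not dandi's code: relocate_auto is a hypothetical name, and treating any OSError as “not supported by the local filesystem” is a simplification.

```python
import os
import shutil


def relocate_auto(src, dst):
    """Try symlink, then hardlink, then copy; return the mode that worked.

    Hypothetical helper for illustration only.
    """
    if os.path.lexists(dst):
        raise FileExistsError(f"{dst!r} already exists")
    os.makedirs(os.path.dirname(os.path.abspath(dst)), exist_ok=True)
    attempts = [
        ("symlink", lambda: os.symlink(os.path.abspath(src), dst)),
        ("hardlink", lambda: os.link(src, dst)),
        ("copy", lambda: shutil.copy2(src, dst)),
    ]
    for mode, action in attempts:
        try:
            action()
            return mode
        except (OSError, NotImplementedError):
            # Assumption: a failure means this mechanism is unsupported here,
            # so fall through to the next one.
            continue
    raise OSError(f"could not relocate {src!r} to {dst!r}")
```

The return value reports which mechanism succeeded, which is useful when the result depends on the filesystem holding the Dandiset.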

--invalid [fail|warn]

What to do if files without sufficient metadata are encountered [default: fail]

--media-files-mode [copy|move|symlink|hardlink]

How to relocate video files referenced by NWB files [default: symlink]

--required-field <field>

Force a given field to be included in the organized filename of any file for which it is nonempty. Can be specified multiple times.

The valid field names are:

  • subject_id (already required by default)

  • session_id

  • tissue_sample_id

  • slice_id

  • cell_id

  • probe_ids

  • obj_id

  • modalities (already required by default)

  • extension (already required by default)

--update-external-file-paths

Rewrite the external_file arguments of ImageSeries in NWB files. The new values will correspond to the new locations of the video files after being organized. This option requires --files-mode to be “copy” or “move”.
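
When the command is driven from a script, the option constraints documented above can be checked before the command runs. The following Python sketch assembles an argument list using only the options listed in this section; organize_argv and FILES_MODES are hypothetical names, not part of dandi.

```python
# Values accepted by --files-mode, as listed above.
FILES_MODES = ("dry", "simulate", "copy", "move", "hardlink", "symlink", "auto")


def organize_argv(paths, dandiset_path=None, files_mode="auto", invalid="fail",
                  required_fields=(), update_external_file_paths=False):
    """Assemble an argument list for `dandi organize`.

    Hypothetical helper for illustration only.
    """
    if files_mode not in FILES_MODES:
        raise ValueError(f"unknown --files-mode: {files_mode!r}")
    if invalid not in ("fail", "warn"):
        raise ValueError(f"unknown --invalid value: {invalid!r}")
    if update_external_file_paths and files_mode not in ("copy", "move"):
        raise ValueError(
            "--update-external-file-paths requires --files-mode copy or move"
        )
    argv = ["dandi", "organize", "--files-mode", files_mode, "--invalid", invalid]
    if dandiset_path is not None:
        argv += ["--dandiset-path", dandiset_path]
    for field in required_fields:
        argv += ["--required-field", field]  # the option may be repeated
    if update_external_file_paths:
        argv.append("--update-external-file-paths")
    return argv + list(paths)
```

The resulting list can be passed to subprocess.run(..., check=True); starting with files_mode="dry" prints the suggested renames without performing any action.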

Development Options

The following options are intended only for development & testing purposes. They are only available if the DANDI_DEVEL environment variable is set to a nonempty value.

--devel-debug

Do not use pyout callbacks, do not swallow exceptions, do not parallelize.