Catalog records the name, location and size of files and directories, and identifies and removes duplicate files in a file storage system. It is a command line tool that works across a single directory, multiple directories or an entire file system spanning multiple storage systems.
Because large batches of text, picture and video files and directories are retrieved from multiple third-party sources, this tool is an integral part of our machine learning processes, where it reduces later processing requirements.
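The cataloging and duplicate-detection behavior described above can be illustrated with a minimal Python sketch. It is not Catalog's actual implementation: the function names (`catalog`, `file_digest`, `find_duplicates`, `remove_duplicates`) are hypothetical, and treating two files as duplicates when they have the same size and the same SHA-256 digest is an assumption about what "duplicate" means here.

```python
import hashlib
import os
from collections import defaultdict


def catalog(roots):
    """Walk each root directory and record name, location and size of every regular file."""
    entries = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks and special files so each entry is a real file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                entries.append(
                    {"name": name, "location": dirpath, "size": os.path.getsize(path)}
                )
    return entries


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(entries):
    """Group files by size first, then by content digest (assumed duplicate criterion)."""
    by_size = defaultdict(list)
    for entry in entries:
        by_size[entry["size"]].append(os.path.join(entry["location"], entry["name"]))

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so no hashing is needed
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_digest(path)].append(path)
        groups.extend(sorted(group) for group in by_digest.values() if len(group) > 1)
    return groups


def remove_duplicates(groups, dry_run=True):
    """Keep the first path of each group; delete the rest unless dry_run is True."""
    removed = []
    for group in groups:
        for path in group[1:]:
            if not dry_run:
                os.remove(path)
            removed.append(path)
    return removed
```

Under these assumptions, `remove_duplicates(find_duplicates(catalog(["/data/a", "/data/b"])))` lists the files that would be deleted without touching them, since `dry_run` defaults to `True`; the two paths are placeholders. Comparing sizes before hashing avoids reading files that cannot have a duplicate, which matters when the batches contain large video files.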