tozan

Index filesystem by creating metadata database

Go through the files under a given directory, generate a hash of each file (SHA1 by default), and store the hashes in an SQLite database (in memory by default). If a file is already listed in the database, its entry is updated.
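The flow described above can be sketched roughly as follows. This is only an illustration and not how tozan is implemented: it assumes node:crypto in place of OpenSSL and a plain Map in place of the SQLite database, and the function names listFiles, indexFile and indexDirectory are hypothetical.

```javascript
// Illustrative sketch only: node:crypto and a Map stand in for OpenSSL and SQLite.
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');

// Recursively collect regular files under `dir` (hypothetical helper).
function listFiles(dir, ignoreDotFiles = false) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (ignoreDotFiles && entry.name.startsWith('.')) {
      continue; // Skip dot files and dot directories entirely
    }
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...listFiles(full, ignoreDotFiles));
    } else if (entry.isFile()) {
      found.push(full);
    }
  }
  return found;
}

// Hash one file and insert or overwrite its entry, keyed by file path.
function indexFile(index, filepath, algorithm = 'sha1') {
  const hash = crypto.createHash(algorithm).update(fs.readFileSync(filepath)).digest('hex');
  index.set(filepath, { hash, size: fs.statSync(filepath).size });
}

// Index every file under `dir`; passing an existing index updates its entries.
function indexDirectory(dir, options = {}, index = new Map()) {
  for (const filepath of listFiles(dir, options.ignoreDotFiles)) {
    indexFile(index, filepath, options.algorithm);
  }
  return index;
}
```

Because entries are keyed by file path, running indexDirectory again with the same index overwrites the existing entry instead of adding a duplicate, which mirrors the update behaviour mentioned above.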

Please note that the minimum supported version of Node.js is 22.11.0, which is the active Long Term Support (LTS) version.

Background for the project name

The name of the project (Tozan, 当山) is for honouring the legacy of a certain master from the Ryukyu archipelago, Japan, who contributed to the martial arts that we today know as karate and ryukyu kobujutsu.

Read more about why these martial arts are important for me at karatejukka.fi.

Installation

Install via npm, as a global command line utility:

[sudo] npm install --global tozan

Please note that on Linux with sudo, some of the dependencies might fail to install, which can in some cases be fixed with sudo npm install --global --unsafe-perm tozan. See more details about the unsafe-perm option at docs.npmjs.com.

The SHA hash is calculated with OpenSSL, specifically with its openssl dgst command, hence it needs to be available in the PATH.

The existence of OpenSSL can be checked with the command openssl version, which should output something similar to (example in macOS):

LibreSSL 2.8.3

In case the installed OpenSSL does not support the default hashing algorithm (SHA-1), the hash algorithm needs to be defined via the --hash command line option. The supported digest algorithms can be listed with the command openssl list -digest-algorithms.
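For illustration, checking for OpenSSL and hashing a single file with openssl dgst from Node.js could look something like the sketch below. The helper names parseDigestOutput, digestWithOpenssl and opensslVersion are hypothetical and are not part of tozan, and the parser assumes the usual "ALGORITHM(file)= hex" output format of openssl dgst.

```javascript
const { execFileSync } = require('node:child_process');

// Extract the hex digest from `openssl dgst` output such as
// "SHA1(file.txt)= da39a3ee...": the digest is the hex run after the last "=".
function parseDigestOutput(output) {
  const match = /=\s*([0-9a-f]+)\s*$/i.exec(output);
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: hash a file with the openssl binary found in PATH.
function digestWithOpenssl(filepath, algorithm = 'sha1') {
  const output = execFileSync('openssl', ['dgst', `-${algorithm}`, filepath], {
    encoding: 'utf8'
  });
  return parseDigestOutput(output);
}

// Returns the output of `openssl version`, or null when openssl cannot be run.
function opensslVersion() {
  try {
    return execFileSync('openssl', ['version'], { encoding: 'utf8' }).trim();
  } catch (error) {
    return null;
  }
}
```

Calling opensslVersion() before scanning gives an early and clear failure when OpenSSL is missing from the PATH.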

Command line options

The easiest way to see the supported options is to print the help output:

tozan --help

The most recent major version prints output similar to the following:

tozan [options] <directory>

  -h, --help              Help and usage instructions
  -V, --version           Version number
  -D, --database String   SQLite database to use - default: :memory:
  -H, --hash String       Hashing algorithm understood by OpenSSL - default: sha1
  -i, --ignore-dot-files  Ignore files and directories that begin with a dot

Version 6.0.0

For more information on the possible database file options, see sqlite3 documentation for the filename parameter.

Using programmatically

First install as a dependency:

npm install --save tozan

Use in a Node.js script:

import tozan from 'tozan';

tozan('directory-for-scanning', {
  ignoreDotFiles: true, // Ignore files and directories that begin with a dot
  algorithm: 'sha512', // Hash algorithm to use
  database: 'tozan-meta.sqlite' // Possible database file to be used with SQLite
});

The clearest example of the usage is the command line interface itself.

Speed comparison between hashing algorithms

These numbers are from running time node bin/tozan.js --hash [algorithm] node_modules with different algorithms. At the time, the node_modules folder contained a total of 11410 files.

| Algorithm   | Time       |
| ----------- | ---------- |
| md4         | 1m 11.409s |
| md5         | 1m 16.059s |
| sha1        | 1m 13.361s |
| sha256      | 1m 12.263s |
| sha384      | 1m 15.404s |
| sha512      | 1m 11.746s |
| streebog512 | 1m 11.888s |
| whirlpool   | 1m 8.089s  |

The differences are small: the slowest run (md5) took about 8 seconds longer than the fastest (whirlpool). Feel free to add and update the comparison with more data and more alternatives.
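The timings above cover the whole run, including process start-up and reading the files. To get a rough feel for the cost of the digest calculation alone, something like the following sketch could be used. It is not the method used for the table above: it assumes node:crypto instead of the openssl command, and compareAlgorithms is a hypothetical helper.

```javascript
const crypto = require('node:crypto');

// Time `rounds` digests of `data` for each algorithm.
// Algorithms that the local Node.js/OpenSSL build does not provide are skipped.
function compareAlgorithms(algorithms, data, rounds = 100) {
  const results = [];
  for (const algorithm of algorithms) {
    try {
      crypto.createHash(algorithm); // Throws when the algorithm is unsupported
    } catch {
      continue;
    }
    const start = process.hrtime.bigint();
    for (let i = 0; i < rounds; i++) {
      crypto.createHash(algorithm).update(data).digest('hex');
    }
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    results.push({ algorithm, elapsedMs });
  }
  return results;
}

// 1 MiB of random input, hashed 100 times per algorithm
const sample = crypto.randomBytes(1024 * 1024);
console.table(compareAlgorithms(['md5', 'sha1', 'sha256', 'sha384', 'sha512'], sample));
```

The absolute numbers depend on the machine and the OpenSSL build, so they are only useful for comparing algorithms against each other on the same system.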

Contributing

The first thing to do is to file an issue. Then possibly open a Pull Request for solving the given issue. ESLint is used for linting the code. Please run it with:

npm install
npm run lint

Unit tests are written with tape and can be executed with npm test. Code coverage is inspected with nyc and can be executed with npm run coverage after running npm test. Please make sure it is over 90% at all times.

Version history

Changes happening across different versions and upcoming changes are tracked in the CHANGELOG.md file.

License

Licensed under the MIT license.

Copyright (c) Juga Paazmaya paazmaya@yahoo.com
