gagle/node-binary-reader

binary-reader

Buffered binary reader with a fluent api

This module is a wrapper around the fs.read() function. It keeps the last chunk of bytes read from disk in an internal buffer, which minimizes the number of I/O calls: if the requested bytes are already in the buffer, no I/O call is made and the bytes are copied directly from the buffer. It also implements a fluent interface, which reduces the nesting of asynchronous callbacks.
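
To make the buffering idea concrete, here is a minimal sketch that uses only Node's built-in fs module. It is not the module's implementation: it uses the synchronous fs.readSync() for brevity, and the names CHUNK, readByte and ioCalls are invented for illustration. It caches the last chunk read and goes back to disk only when the requested offset falls outside that chunk.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a small sample file so the sketch is self-contained.
const file = path.join(os.tmpdir(), 'binary-reader-sketch.bin');
fs.writeFileSync(file, Buffer.from([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]));

const CHUNK = 4;                  // illustrative buffer size (assumption)
const chunk = Buffer.alloc(CHUNK);
let chunkStart = -1;              // file offset of chunk[0]; -1 means nothing cached
let chunkLength = 0;              // number of valid bytes in the chunk
let ioCalls = 0;                  // counts the reads that actually hit the disk

const fd = fs.openSync(file, 'r');

// Return the byte at `offset`, refilling the cached chunk only on a miss.
function readByte(offset) {
  const cached = chunkStart >= 0 &&
    offset >= chunkStart && offset < chunkStart + chunkLength;
  if (!cached) {
    chunkLength = fs.readSync(fd, chunk, 0, CHUNK, offset);
    chunkStart = offset;
    ioCalls++;
  }
  return chunk[offset - chunkStart];
}

const bytes = [2, 3, 4, 5, 9].map(readByte);
fs.closeSync(fd);
fs.unlinkSync(file);

console.log(bytes, ioCalls); // five bytes requested, two disk reads performed
```

Offsets 2 to 5 are served by a single disk read; only offset 9 forces a second one.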

What are its uses?

Anything that needs to read big binary files to extract just a little portion of data, e.g. metadata readers: music, images, fonts, etc.

Benefits:

  • Read big binary files without caring about how to retrieve the data and without implementing your own internal cursor system.
  • Avoid callback nesting. It uses a very lightweight and fast asynchronous series control flow library: deferred-queue.
  • Simplify error handling.
  • It is lazy! It delays the open and read calls until they are necessary, i.e. br.open(file).seek(50).close() does nothing.

How does it work?

To keep things simple, there are 5 cases, depending on the buffer position and the range of bytes that you want to read. These cases only apply if the buffer size is smaller than the file size; otherwise the whole file is read into memory with a single I/O call.

Suppose a buffer size of 5 bytes (green background).
The pointer p is the cursor and it points to the first byte to read.
The pointer e is the end and it points to the last byte to read.
The x bytes are not in memory. They need to be read from disk.
The y bytes are already in memory. No need to read them again.

For the sake of simplicity, assume that the x group of bytes is shorter than the buffer size. When it is longer, the binary reader makes all the calls necessary to read every byte.
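
The split between x bytes (on disk) and y bytes (in memory) can be sketched as a small range computation. This is an illustration under assumptions, not the library's code: splitRange is a hypothetical helper, ranges are half-open ([start, end), so the end is e + 1 in the notation above), and the outcomes it produces are not guaranteed to match the five cases one-to-one.

```javascript
// Illustrative only: split a requested byte range into the part already held
// in the buffer window (the y bytes) and the parts that must be read from
// disk (the x bytes). All ranges are half-open: [start, end).
function splitRange(bufStart, bufEnd, reqStart, reqEnd) {
  const hitStart = Math.max(bufStart, reqStart);
  const hitEnd = Math.min(bufEnd, reqEnd);
  if (hitStart >= hitEnd) {
    // No overlap with the buffer: the whole request comes from disk.
    return { inMemory: null, fromDisk: [[reqStart, reqEnd]] };
  }
  const fromDisk = [];
  if (reqStart < hitStart) fromDisk.push([reqStart, hitStart]); // before the buffer
  if (hitEnd < reqEnd) fromDisk.push([hitEnd, reqEnd]);         // after the buffer
  return { inMemory: [hitStart, hitEnd], fromDisk };
}

// The buffer window holds file offsets 10..14 (5 bytes).
console.log(splitRange(10, 15, 12, 14)); // fully in memory, no I/O needed
console.log(splitRange(10, 15, 13, 18)); // head in memory, tail from disk
console.log(splitRange(10, 15, 20, 23)); // nothing cached, all from disk
```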


module.open(path[, options]) : Reader

Returns a new Reader. The reader is lazy, so the file is not opened until the first read() call.

Options:

  • highWaterMark - Number
    The buffer size. Default is 16KB.

Reader

The reader uses a fluent interface. Chain the operations synchronously and close the file at the end. The operations are executed in series and asynchronously. If an error occurs, an error event is emitted, the pending tasks are cancelled and the file is closed automatically.

The read() and seek() functions receive a callback. This callback is executed when the current operation finishes and before the next one starts. If you need to stop executing the subsequent tasks, because you've got an error or for any other reason, you must call cancel(). Calling close() does not work here because it would be enqueued behind the pending tasks, and what you need is to close the file immediately. For example:

br.open (file)
    .on ("error", function (error){
      console.error (error);
    })
    .on ("close", function (){
      ...
    })
    .read (1, function (bytesRead, buffer){
      //The subsequent tasks are not executed
      this.cancel ();
    })
    .read (1, function (){
      //This is never executed
    })
    .close ();
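
The following is a rough sketch of how a chained, series task queue with an immediate cancel() could behave. It is not the library's implementation (which relies on deferred-queue); TaskChain, add and run are hypothetical names, and the tasks here call next synchronously so the result can be inspected right away.

```javascript
// Hypothetical sketch of a fluent, series task queue. NOT the library's code.
class TaskChain {
  constructor() {
    this.tasks = [];        // pending tasks, run in FIFO order
    this.cancelled = false;
    this.log = [];          // names of the tasks that actually ran
  }

  // Enqueue a task; returning `this` is what makes chaining possible.
  add(name, fn) {
    this.tasks.push({ name, fn });
    return this;
  }

  // Takes effect immediately: it is not enqueued, and it drops pending tasks.
  cancel() {
    this.cancelled = true;
    this.tasks.length = 0;
  }

  // Run the tasks one at a time; each task calls `next` when it has finished.
  run(done) {
    const next = () => {
      if (this.cancelled || this.tasks.length === 0) return done(this.log);
      const task = this.tasks.shift();
      this.log.push(task.name);
      task.fn.call(this, next);
    };
    next();
  }
}

let executed;
new TaskChain()
  .add('read-1', function (next) { this.cancel(); next(); })
  .add('read-2', function (next) { next(); })
  .add('close', function (next) { next(); })
  .run((log) => { executed = log; });

console.log(executed); // only the first task ran
```

Because cancel() empties the queue directly instead of adding a task to it, the tasks chained after the first read never start.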

Events

close

Arguments: none.

Emitted when the reader is closed or cancelled without an error.

error

Arguments: error.

Emitted when an error occurs.

Methods

Reader#cancel([error]) : undefined

Stops the reader immediately. This operation is not deferred: it cancels all the pending tasks and closes the file automatically. If you pass an error, it is forwarded to the error event and no close event is emitted.

This function is mostly useful when you run some arbitrary code between tasks, that code fails, and you therefore need to close the reader.

br.open (file)
    .on ("error", function (error){
      console.error (error);
    })
    .on ("close", function (){
      ...
    })
    .read (1, function (bytesRead, buffer, cb){
      var me = this;
      asyncFn (function (error){
        if (error){
          //The error is forwarded to the "error" event
          //No "close" event is emitted if you pass an error
          me.cancel (error);
        }else{
          //Proceed with the next task
          cb ();
        }
      });
    })
    .read (1, function (){
      ...
    })
    .close ();
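
The event rule described above (an error passed to cancel() goes to the error event and no close event is emitted) can be modelled with Node's EventEmitter. This is a stand-in written for illustration: FakeReader and its pending list are hypothetical and do not touch any file.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the reader's event behaviour; not the library's code.
class FakeReader extends EventEmitter {
  constructor() {
    super();
    this.pending = ['read', 'read', 'close']; // tasks still waiting to run
  }

  cancel(error) {
    this.pending.length = 0;        // drop the pending tasks immediately
    if (error) {
      this.emit('error', error);    // error given: only "error" is emitted
    } else {
      this.emit('close');           // no error: "close" is emitted
    }
  }
}

const seen = [];
const reader = new FakeReader()
  .on('error', (error) => seen.push('error: ' + error.message))
  .on('close', () => seen.push('close'));

reader.cancel(new Error('bad header'));
console.log(seen); // [ 'error: bad header' ]
```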

Reader#close() : Reader

Closes the reader.

This operation is deferred: it is enqueued in the list of pending tasks.

In the following example, the close operation is executed after the read operation, so the reader reads 1 byte and then closes the file.

br.open (file)
    .on ("error", function (error){
      console.error (error);
    })
    .on ("close", function (){
      ...
    })
    .read (1, function (){ ... })
    .close ();

Reader#isEOF() : Boolean

Checks whether the internal cursor has reached the end of the file. Once it has, subsequent reads return an empty buffer. This operation is not deferred; it is executed immediately.

In this example the cursor is moved to the last byte, so it is not yet at the end of the file. It reaches the end after the read.

var r = br.open (file)
    .on ("error", function (error){
      console.error (error);
    })
    .on ("close", function (){
      ...
    })
    .seek (0, { end: true }, function (){
      console.log (r.isEOF ()); //false
    })
    .read (1, function (){
      console.log (r.isEOF ()); //true
    })
    .close ();

Reader#read(bytes, callback) : Reader

Reads data and moves the cursor forward automatically. The callback receives three arguments: the number of bytes that have been read, a buffer with the raw data, and a callback that allows asynchronous operations between tasks. The buffer is not a view but a new instance, so you can modify its content without altering the internal buffer.

This operation is deferred: it is enqueued in the list of pending tasks.

For example:

br.open (file)
    .on ("error", function (error){
      console.error (error);
    })
    .on ("close", function (){
      ...
    })
    .read (1, function (bytesRead, buffer, cb){
      //Warning! If you use the "cb" argument you must call it or the reader
      //will hang up
      process.nextTick (cb);
    })
    .read (1, function (){ ... })
    .close ();
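
The difference between a view and a new instance is a property of Node's Buffer, and it can be checked without the reader. The sketch below uses only the built-in Buffer API; `internal` is just a stand-in for the reader's internal buffer.

```javascript
// Node.js Buffer semantics only; this does not use the binary-reader module.
const internal = Buffer.from([1, 2, 3, 4, 5]);      // stands in for the internal buffer

const view = internal.subarray(0, 2);               // shares memory with `internal`
const copy = Buffer.from(internal.subarray(0, 2));  // new allocation, independent

view[0] = 99; // writing through the view also changes internal[0]
copy[1] = 77; // writing to the copy leaves internal[1] untouched

console.log(internal[0], internal[1]); // 99 2
```

Handing out a copy, as read() is documented to do, means a caller can never corrupt the cached bytes this way.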

Reader#seek(position[, whence][, callback]) : Reader

Moves the cursor along the file.

This operation is deferred: it is enqueued in the list of pending tasks.

The whence parameter tells the reader where to move the cursor from; it is the reference point. It has 3 options: start, current, end.

For example, to move the cursor from the start:

seek (0, { start: true });
seek (0);

By default the cursor is referenced from the start of the file.

To move the cursor from the current position:

seek (5, { current: true });
seek (-5, { current: true });

The cursor can be moved with positive and negative offsets.

To move the cursor from the end:

seek (3, { end: true });

This will move the cursor to the fourth byte from the end of the file.
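
The three reference points can be summarized as plain arithmetic. The helper below is hypothetical and inferred only from the examples in this README (for instance, an offset of 0 from the end lands on the last byte); the library's own bounds handling is not shown here and may differ.

```javascript
// Hypothetical helper inferred from the examples in this section; the
// library's own arithmetic and bounds checks may differ.
function resolveSeek(position, whence, cursor, size) {
  whence = whence || { start: true };
  if (whence.current) return cursor + position; // relative to the current cursor
  if (whence.end) return size - 1 - position;   // 0 from the end is the last byte
  return position;                              // default: from the start
}

const size = 10; // illustrative file size in bytes
console.log(resolveSeek(0, undefined, 7, size));          // 0
console.log(resolveSeek(5, { current: true }, 2, size));  // 7
console.log(resolveSeek(-5, { current: true }, 7, size)); // 2
console.log(resolveSeek(3, { end: true }, 0, size));      // 6, the fourth byte from the end
```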

Reader#size() : Number

Returns the size of the file. This operation is not deferred; it is executed immediately.


Reader#tell() : Number

Returns the position of the cursor. This operation is not deferred; it is executed immediately.

br.open (file)
    .on ("error", function (error){
      console.error (error);
    })
    .on ("close", function (){
      ...
    })
    .seek (0, { end: true }, function (){
      console.log (this.tell () === this.size () - 1); //true
    })
    .read (1, function (){
      console.log (this.tell () === this.size ()); //true
      console.log (this.isEOF ()); //true
    })
    .close ();
