Using the filesystem as a searchable database.
Rocket-Store is a high-performance solution for simple data storage and retrieval. It takes advantage of modern file systems' exceptionally advanced caching mechanisms.
It's packaged in a single file to include, with few dependencies.
result = await rs.post("cars","Mercedes",{owner:"Lisa Simpson",reg:"N3RD"});
result = await rs.get("cars","*",rs._ORDER_DESC);
result = await rs.delete("cars","*cede*");
- Extremely fast
- Very reliable
- Very small footprint.
- Very flexible.
- One dependency
- Works without configuration or setup.
- Data stored in JSON format
- Configurable
- Also available for PHP
- Has a session store module for express
- Asynchronous mutation safe
- Can be imported as CommonJS or as an ES module
Just run:
npm install rocket-store
Common JS
const rs = require('rocket-store').default;
Module
import * as store from "rocket-store";
const rs = await store.Rocketstore();
// ---- ALTERNATIVE ----
// it can also be accessed and launched with the 'default' method
const rs = await store.default();
Rocket-Store does not require initialization:
- The storage area defaults to the OS temp dir.
- When trying to get a non-existent collection, the reply is that no records were found.
- When posting to a non-existent collection, it is created.
However, you can set the storage area and data format to use with the options function, before doing any operation on the data.
Rocket-Store was made to replace a more complex database, in a setting that required a low footprint and high performance.
Rocket-Store is intended to store and retrieve records/documents, organized in collections, using a key.
Terms used:
- Collection: name of a collection of records. (Like an SQL table)
- Record: the data stored. (Like an SQL row)
- Data storage area: area/directory where collections are stored. (Like an SQL database)
- Key: every record has exactly one unique key, which is also its file name, so the same restrictions apply. Searches use the same wildcards as file name searches.
Compare Rocket-Store, SQL and file system terms:
Rocket-Store | SQL | File system |
---|---|---|
storage area | database | data directory root |
collection | table | directory |
key | key | file name |
record | row | file |
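To make the mapping in the table concrete, here is a minimal sketch of how a collection and a key could translate into a file path. The subdirectory name rocket-store-demo and the helper recordPath are assumptions made for illustration. They are not taken from the library.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical storage area: a subdirectory of the OS temp dir.
// The library's real default directory name may differ.
const storageArea = path.join(os.tmpdir(), "rocket-store-demo");

// storage area -> root directory, collection -> directory, key -> file name
function recordPath(collection, key) {
  return path.join(storageArea, collection, key);
}

console.log(recordPath("cars", "Mercedes"));
```

Seen this way, an exact key lookup is a single file access, while a wildcard search has to look at the names in the collection directory.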
Stores a record in a collection identified by a unique key
post(string <collection>, string <key>, mixed <record> [, integer options])
Collection name to contain the records.
Key uniquely identifying the record
No path separators, wildcards etc. are allowed in collection names and keys. Illegal characters are silently stripped off.
Options
- _ADD_AUTO_INC: Add an auto incremented sequence to the beginning of the key
- _ADD_GUID: Add a Globally Unique IDentifier to the key
Returns an object containing the result of the operation:
- count : number of records affected (1 on success)
- key: string containing the actual key used
If the key already exists, the record will be replaced.
If no key is given, an auto-incremented sequence is used as key.
If the function fails for any reason, an error is thrown.
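Since illegal characters are stripped silently, the key returned by post may differ from the key you passed in. The sketch below shows the idea. The exact set of characters removed here is an assumption. The library's own rules may be stricter or looser, and sanitizeName is a hypothetical helper.

```javascript
// Assumed set of illegal characters: path separators, wildcards,
// control characters and a few others that are unsafe in file names.
function sanitizeName(name) {
  return String(name).replace(/[\/\\*?:"<>|\x00-\x1f]/g, "");
}

console.log(sanitizeName("BMW_740li")); // unchanged: BMW_740li
console.log(sanitizeName("../cars/*")); // separators and wildcard removed: ..cars
```

For this reason it is safer to read the key property of the reply than to assume the key was stored as given.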
Find and retrieve records, in a collection.
get([string <collection> [, string <filename with wildcards> [, integer <option flags>]]])
Collection to search. If no collection name is given, get will return a list of database assets: collections, sequences etc.
Key to search for. Can be mixed with wildcards '*' and '?'. An undefined or empty key is the equivalent of '*'
Options:
- _ORDER : Results returned are ordered alphabetically ascending.
- _ORDER_DESC : Results returned are ordered alphabetically descending.
- _KEYS : Return keys only (no records)
- _COUNT : Return record count only
Returns an object containing:
- count : number of records affected
- key : array of keys
- result : array of records
NB: wildcards are very expensive on large datasets with most file systems. (On a regular PC with more than 10^7 records in the collection, it might take up to a second to retrieve one record with a wildcard, whereas one might retrieve up to 100,000 records with an exact key match.)
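The cost difference comes from the fact that a wildcard has to be tested against every key in the collection. The following sketch shows one way to match '*' and '?' against a list of keys. It is an illustration only, not the library's implementation, and wildcardToRegExp and matchKeys are hypothetical names.

```javascript
// Translate a wildcard pattern into an anchored regular expression.
function wildcardToRegExp(pattern) {
  // Escape regex metacharacters, leaving '*' and '?' for the next step.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp("^" + source + "$");
}

// Every key is tested, so the cost grows with the size of the collection.
function matchKeys(keys, pattern) {
  const re = wildcardToRegExp(pattern || "*"); // empty key is equivalent to '*'
  return keys.filter((key) => re.test(key));
}

const keys = ["Mercedes", "1-BMW_740li", "2-BMW_740li"];
console.log(matchKeys(keys, "*BMW*"));
console.log(matchKeys(keys, "*cede*"));
```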
Delete one or more records whose keys match.
delete([string <collection> [,string <key with wildcards>]])
Collection to search. If no collection is given, THE WHOLE DATABASE IS DELETED!
Key to search for. Can be mixed with wildcards '*' and '?'. If no key is given, THE ENTIRE COLLECTION INCLUDING SEQUENCES IS DELETED!
Returns an object containing:
- count : number of records or collections affected
Configuration options are given as an object, which can be passed on initialization or to the options function. The object can have these options:
Common JS
const rs = require('rocket-store').default;
await rs.options({
data_storage_area : "/home/rddb/webapp",
data_format : rs._FORMAT_JSON,
check_files : rs._FILECHECK_DEFAULT
});
Module
import * as store from "rocket-store";
// Create the instance first, so that its constants can be referenced
const rs = await store.Rocketstore();
await rs.options({
data_storage_area : "/home/rddb/webapp",
data_format : rs._FORMAT_JSON,
check_files : rs._FILECHECK_DEFAULT,
});
Option | Values |
---|---|
data_storage_area | The directory where the database resides. The default is to use a subdirectory of the temporary directory provided by the operating system. If that doesn't work, the DOCUMENT_ROOT directory is used. |
data_format | Specify which format the records are stored in. Values are: _FORMAT_NATIVE (default) and _FORMAT_JSON (use JSON data format). |
check_files | Specify how strictly collection names are checked. Values are: _FILECHECK_DEFAULT (default) and _FILECHECK_LOW (simpler approach). |
Common JS
// Initialize (Not required)
const rs = require('rocket-store').default;
// POST a record
result = await rs.post("cars", "Mercedes_Benz_GT_R", {owner: "Lisa Simpson"});
// GET a record
result = await rs.get("cars", "*");
console.log(result);
Module
// Initialize (Not required)
import * as store from "rocket-store";
const rs = await store.Rocketstore();
// POST a record
result = await rs.post("cars", "Mercedes_Benz_GT_R", {owner: "Lisa Simpson"});
// GET a record
result = await rs.get("cars", "*");
console.log(result);
The above example will output this:
{
count: 1,
key: [ 'Mercedes_Benz_GT_R' ],
result: [
{ owner: 'Lisa Simpson' }
]
}
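The reply keeps keys and records in two parallel arrays. Assuming that key[i] belongs to result[i], as the outputs in this document suggest, a small helper can pair them up. The helper name toEntries is hypothetical, and the reply object is copied from the output above so that the sketch runs without the library.

```javascript
// Reply copied from the example output above.
const result = {
  count: 1,
  key: ["Mercedes_Benz_GT_R"],
  result: [{ owner: "Lisa Simpson" }],
};

// Assumption: key[i] is the key of result[i].
function toEntries(reply) {
  return reply.key.map((key, i) => ({ key, record: reply.result[i] }));
}

for (const { key, record } of toEntries(result)) {
  console.log(`${key} is owned by ${record.owner}`);
}
```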
File names must always be unique. If you have more than one instance of a file name, you can add an auto incremented sequence to the name:
await rs.post("cars", "BMW_740li", { owner: "Greg Onslow" }, rs._ADD_AUTO_INC);
await rs.post("cars", "BMW_740li", { owner: "Sam Wise" }, rs._ADD_AUTO_INC);
await rs.post("cars", "BMW_740li", { owner: "Bill Bo" }, rs._ADD_AUTO_INC);
result = await rs.get("cars", "*");
console.log(result);
The above will output this:
{
count: 3,
key: [
'1-BMW_740li',
'2-BMW_740li',
'3-BMW_740li'
],
result: [
{ owner: 'Greg Onslow' },
{ owner: 'Sam Wise' },
{ owner: 'Bill Bo' }
]
}
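The keys above consist of a sequence number, a dash and the key that was given. The sketch below mimics that layout with an in-memory counter per collection. It is an assumption-based illustration: the library has to persist its sequences, which this sketch does not do, and autoIncKey is a hypothetical name.

```javascript
// One counter per collection, kept in memory only (illustration).
const sequences = new Map();

function autoIncKey(collection, key) {
  const next = (sequences.get(collection) || 0) + 1;
  sequences.set(collection, next);
  // With no key given, the sequence number alone is used as the key.
  return key ? `${next}-${key}` : String(next);
}

console.log(autoIncKey("cars", "BMW_740li")); // 1-BMW_740li
console.log(autoIncKey("cars", "BMW_740li")); // 2-BMW_740li
```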
Another option is to add a GUID to the key. The GUID is a combination of a timestamp and a random sequence, formatted in accordance with RFC 4122 (valid, but slightly less random).
If IDs are generated more than 1 millisecond apart, they are 100% unique. If two IDs are generated at shorter intervals, the likelihood of a collision is at most 1 in 10^15.
await rs.post("cars", "BMW_740li", { owner: "Greg Onslow" }, rs._ADD_GUID);
await rs.post("cars", "BMW_740li", { owner: "Sam Wise" }, rs._ADD_GUID);
await rs.post("cars", "BMW_740li", { owner: "Bill Bo" }, rs._ADD_GUID);
result = await rs.get("cars", "*");
console.log(result);
The above will output this:
{
count: 3,
key: [
'16b4ffd8-87a0-4000-839f-ea5dd495b000-BMW_740li',
'16b4ffd8-87b0-4000-8032-45d788fac000-BMW_740li',
'16b4ffd8-87b0-4000-839f-95bd498f5000-BMW_740li'
],
result: [
{ owner: 'Greg Onslow' },
{ owner: 'Sam Wise' },
{ owner: 'Bill Bo' }
]
}
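To illustrate how a timestamp and a random sequence can be combined into a string with the RFC 4122 layout, here is a rough sketch. The field layout is loosely modelled on the keys shown above and is an assumption. It is not the library's actual algorithm, and timestampGuid is a hypothetical name.

```javascript
const crypto = require("node:crypto");

function timestampGuid(now = Date.now()) {
  // Millisecond timestamp as 11 hex digits, padded to 12 with a zero.
  const time = now.toString(16).padStart(11, "0").slice(-11) + "0";
  const rand = crypto.randomBytes(8).toString("hex"); // 16 hex digits
  // RFC 4122 variant: the two top bits of this digit are 1 and 0.
  const variant = ((parseInt(rand[0], 16) & 0x3) | 0x8).toString(16);
  return [
    time.slice(0, 8),
    time.slice(8, 12),
    "4000", // version digit 4, remaining digits left at zero
    variant + rand.slice(1, 4),
    rand.slice(4, 13) + "000",
  ].join("-");
}

console.log(`${timestampGuid()}-BMW_740li`);
```

Because the timestamp comes first, keys generated later sort after keys generated earlier.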
const dataset = {
  Gregs_BMW_740li : { owner: "Greg Onslow" },
  Lisas_Mercedes_Benz_GT_R : { owner: "Lisa Simpson" },
  Bills_BMW_740li : { owner: "Bill Bo" },
};
// Post records concurrently, but wait after every 20 posts,
// so that the number of pending operations stays bounded.
let promises = [];
for(const key in dataset){
  promises.push(rs.post("cars", key, dataset[key]));
  if(promises.length >= 20){
    await Promise.all(promises);
    promises = [];
  }
}
// Wait for the remaining posts to finish
if(promises.length > 0)
  await Promise.all(promises);
result = await rs.get("cars", "*");
console.log(result);
The above example might output this:
{ count: 3,
key:[
'Lisas_Mercedes_Benz_GT_R',
'Gregs_BMW_740li',
'Bills_BMW_740li',
],
result: [
{ owner: 'Lisa Simpson' },
{ owner: 'Greg Onslow' },
{ owner: 'Bill Bo' },
]
}
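The batching pattern used for the mass insert can be wrapped in a reusable helper. This is a sketch under stated assumptions: inBatches is a hypothetical helper, and fakePost is a stand-in for rs.post so that the example runs without the library.

```javascript
// Run task for every item, waiting for each batch to finish
// before the next batch is started.
async function inBatches(items, batchSize, task) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize).map((item) => task(item));
    results.push(...(await Promise.all(batch)));
  }
  return results;
}

// Stand-in for rs.post (assumption), storing records in memory.
const fakeStore = new Map();
async function fakePost(collection, key, record) {
  fakeStore.set(`${collection}/${key}`, record);
  return { count: 1, key };
}

const cars = {
  Gregs_BMW_740li: { owner: "Greg Onslow" },
  Lisas_Mercedes_Benz_GT_R: { owner: "Lisa Simpson" },
  Bills_BMW_740li: { owner: "Bill Bo" },
};

inBatches(Object.entries(cars), 20, ([key, record]) => fakePost("cars", key, record))
  .then((replies) => console.log(replies.length, "records posted"));
```

Note that Promise.all rejects as soon as one post fails, so an error in any post ends the run at that batch.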
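// Get records with keys matching a wildcard pattern
result = await rs.get("cars", "*BMW*");
// Same search, ordered by key, descending
result = await rs.get("cars", "*BMW*", rs._ORDER_DESC);
// Get a list of collections and sequences
result = await rs.get();
// Delete records with matching keys
await rs.delete("cars", "*BMW*");
// Delete the whole collection, including its sequences
await rs.delete("cars");
// Delete the entire database
await rs.delete();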
This was made with Node version 11. A compromise was struck to compensate for the immaturity of the Node file system library: there is no proper glob functionality to filter a directory search at a low level. Instead, an array of all entries is read.
This consumes a lot of memory with a large database. There is no avoiding that, short of improving upon the Node file system library. This is beyond my intentions at this time. I hope it will be remedied by the Node core team.
Since the memory will be used anyway, it is applied to improve the speed of key searches, by keeping the read keys in memory between searches, as a key cache.
A drawback of this is that collection names are restricted to valid variable names, as well as valid directory names.
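The idea of reading all directory entries once and keeping them in memory can be sketched as follows. This is not the library's code: keyCache, listKeys and rememberKey are hypothetical names, and a Map is used here for simplicity.

```javascript
const fs = require("node:fs/promises");

// Collection directory -> array of keys read from disk (illustration).
const keyCache = new Map();

// Read the directory only on the first request, then serve from memory.
async function listKeys(collectionDir) {
  if (!keyCache.has(collectionDir)) {
    try {
      keyCache.set(collectionDir, await fs.readdir(collectionDir));
    } catch (err) {
      if (err.code !== "ENOENT") throw err;
      keyCache.set(collectionDir, []); // missing collection: no records
    }
  }
  return keyCache.get(collectionDir);
}

// Every write has to update the cache too, or searches will miss the key.
function rememberKey(collectionDir, key) {
  const keys = keyCache.get(collectionDir);
  if (keys && !keys.includes(key)) keys.push(key);
}
```

The trade-off is the one described above: memory use grows with the number of keys, and files changed by another process are not seen until the cache is rebuilt.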
Another issue is that file locking is yet to be implemented in Node. Therefore, a time-consuming locking mechanism is implemented with symlinks.
Both solutions will hopefully be changed, as node matures.
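A symlink works as a lock because creating one fails if the name already exists, so only one caller can succeed. Below is a minimal sketch of that technique, assuming a POSIX-like system (on Windows, creating symlinks may require extra privileges). The names tryLock, unlock and withLock are hypothetical and not part of the library.

```javascript
const fs = require("node:fs/promises");

// Try to take the lock. The link target is irrelevant and may dangle.
async function tryLock(lockPath) {
  try {
    await fs.symlink("locked", lockPath);
    return true;
  } catch (err) {
    if (err.code === "EEXIST") return false; // someone else holds the lock
    throw err;
  }
}

async function unlock(lockPath) {
  await fs.unlink(lockPath);
}

// Poll until the lock is free, run fn, and always release the lock.
async function withLock(lockPath, fn, retryMs = 5) {
  while (!(await tryLock(lockPath))) {
    await new Promise((resolve) => setTimeout(resolve, retryMs));
  }
  try {
    return await fn();
  } finally {
    await unlock(lockPath);
  }
}
```

The polling loop is what makes this approach time consuming under contention. A process that dies while holding the lock also leaves the symlink behind, which a real implementation has to detect and clean up.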
Benchmarks are performed with 1 million records in a single collection.
Before rewrite
System | Mass insert | exact key search | wildcard search | no hit | delete |
---|---|---|---|---|---|
Debian, i7 3rd gen, SSD | 69000/sec. | 87000/sec. | 14.6/sec. | 123000/sec. | 525/sec. |
Raspberry Pi Zero | 561/sec. | 96/sec. | 0.27/sec. | 147/sec. | 10.3/sec. |
After rewrite
Benchmark test system: i7 3rd gen on SSD

Test | Result |
---|---|
Mass insert | 10697 /sec |
Exact key search | 79745 /sec |
Exact random key search no hit | 26042 /sec |
Wildcard random key search 2 hits | 112 /sec |
Wildcard random delete 2 hits | 109 /sec |
Wildcard random key search no hit | 152 /sec |
Exact random delete | 1563 /sec |

Mass delete of the test data: 11.377 s
- I appreciate all kinds of contribution.
- Don't hesitate to submit an issue report on github. But please provide a reproducible example.
- Code should look good and compact, and be covered by a test case or example.
- Please don't change the formatting style laid out without a good reason. I know it's not the most common standard, but it's a rather efficient one.
First run npm install typescript -g, then run npm run build.
In the examples folder, there are two folders: CommonJS and Module.
Enter one of them and run the command npm i && npm run test.
0.10.19
- Correct check_files type declaration
- Update README.md file: added links to Rocket Store ports in other languages
- Added correct examples by folders
0.10.18
- Added new option for checking file names
0.10.17
- Correction of file path in utils for MODULE compilation
- Update dependencies
0.10.15
- Unification in import / requirement of the Rocket Store package
0.10.11 - 0.10.14
- Config workflow
- Bug fixes
- Update Docs
- Auto compile before publish code
- Update examples
0.10.10
- Code rewrite updating it to the latest standard, removing a dependency.
- Now it can be imported as CommonJS and as Module in your project.
0.10.9
- Removed fs-extra module
0.10.8
- removed unneeded module sanitise-filename
0.10.7
- Bug fix: Wildcard search on windows OS failed to find valid keys.
0.10.6
- Bug fix: Corrupted or invalid files now return an empty record, instead of throwing an error.
0.10.5
- Repository version correction
0.10.4
- Bug fix: Asynchronous integrity of records failed. Circumvents a bug in fs-extra
0.10.3
- Bug fix: Option data_storage_area was ignored.
0.10.2
- Data storage directory is now set immediately. An error is thrown later, if creation fails.
0.10.1
- Refactoring of get methods
- Added get flags _COUNT and _KEYS
0.9.4:
- Added Globally Unique IDentifier option to key generation. post flag: _ADD_GUID
0.9.3:
- Cache update duplicate bug fix.
0.9.2:
- Minor fixes and rewrites