by: Stefan Kruger
Note: this is a proof of concept; it's not battle tested or supported in any way. If you find bugs (of which there will be plenty), do let us know – or better, consider a pull request.
The fastest way to deploy Cloudant Envoy to Bluemix is to click the Deploy to Bluemix button below.
[Deploy to Bluemix](https://bluemix.net/deploy?repository=https://github.com/cloudant-labs/envoy)
Don't have a Bluemix account? If you haven't already, you'll be prompted to sign up for a Bluemix account when you click the button. Sign up, verify your email address, then return here and click the Deploy to Bluemix button again. Your new credentials let you deploy to the platform and also to code online with Bluemix and Git. If you have questions about working in Bluemix, find answers in the Bluemix Docs.
Cloudant Envoy is a Node.js application on top of the Express.js framework. To install, clone the repo and run `npm install`. The Envoy server needs admin credentials for the backing Cloudant database, and it expects the following environment variables to be set:
export PORT=8001
export MBAAS_DATABASE_NAME='dbname'
export COUCH_HOST='https://key:passwd@account.cloudant.com'
After those variables are set, you can start the Envoy server with `npm start`. Note that the port is the port that Envoy will listen on, not the port of the Cloudant server.
- PORT - the port number Envoy will listen on. When running in Bluemix, Envoy detects the Cloud Foundry port assigned to this app automatically. When running locally, you'll need to provide your own, e.g. `export PORT=8001`
- COUCH_HOST - the URL of the Cloudant service to connect to. Not required in Bluemix, as the attached Cloudant service is detected automatically. `COUCH_HOST` is required when running locally, e.g. `export COUCH_HOST='https://key:passwd@account.cloudant.com'`
- MBAAS_DATABASE_NAME - the name of the Cloudant database to use. Defaults to `mbaas`
- LOG_FORMAT - the type of logging to output. One of `combined`, `common`, `dev`, `short`, `tiny`, `off`. Defaults to `off` (see https://www.npmjs.com/package/morgan).
- DEBUG - see debugging section
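As a rough illustration of how these settings fit together, the following sketch collects them from an environment object and applies the defaults listed above. The function name `loadConfig`, the returned property names, and the validation of `LOG_FORMAT` are assumptions made for this example; they are not taken from the Envoy source, and the Bluemix auto-detection is not modelled.

```javascript
// Hypothetical helper: gather Envoy-style settings for a local run.
// Variable names and defaults follow the list above; everything else is illustrative.
function loadConfig(env) {
  const allowedLogFormats = ['combined', 'common', 'dev', 'short', 'tiny', 'off'];
  const logFormat = env.LOG_FORMAT || 'off';
  if (!allowedLogFormats.includes(logFormat)) {
    throw new Error('Unsupported LOG_FORMAT: ' + logFormat);
  }
  if (!env.COUCH_HOST) {
    throw new Error('COUCH_HOST must be set when running locally');
  }
  const port = parseInt(env.PORT, 10);
  if (Number.isNaN(port)) {
    throw new Error('PORT must be set to a number when running locally');
  }
  return {
    port: port,
    couchHost: env.COUCH_HOST,
    databaseName: env.MBAAS_DATABASE_NAME || 'mbaas',
    logFormat: logFormat
  };
}

// In a real Node.js process you would pass process.env instead of a literal.
const config = loadConfig({
  PORT: '8001',
  COUCH_HOST: 'https://key:passwd@account.cloudant.com'
});
console.log(config.databaseName, config.logFormat);
```

Failing early on a missing `COUCH_HOST` or `PORT` is a design choice of this sketch; it simply makes a misconfigured local run obvious at start-up.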
Debugging messages are controlled by the `DEBUG` environment variable. To see detailed debugging output outlining the API calls being made between Envoy and Cloudant, set the `DEBUG` environment variable to `cloudant,nano`, e.g.
export DEBUG=cloudant,nano
node app.js
or
DEBUG=cloudant,nano node app.js
Cloudant has the potential to be an ideal backend for a mobile application. It is scalable, it syncs, and being schema-free it can cope with the frequent data changes that tend to happen in mobile development.
However, Cloudant was never designed to be an mbaas – a complete mobile application backend, and comparisons with dedicated mobile application backends such as Facebook's (now defunct) Parse stack highlight our shortcomings in this area. Some of the problematic areas include:
- Authorisation

  Many use cases suitable for a database-backed mobile app require record-level access controls to ensure that each user can only see and update their own data. This problem is compounded by the need for analytics across the whole data set. The currently recommended solution of a database per user and replication of all user databases into a single analytics database is not viable as the number of users grows beyond a certain number.

- Authentication

  Every mobile enterprise application will likely need to tap into an existing user database, be it a local LDAP server or a third-party OAuth2 provider, like Facebook or Google.

- Unreliable networks

  A mobile app needs to carry on working in the face of unreliable networks. Cloudant's approach is to use its excellent replication capabilities to do bi-directional sync to a local data store on the device, but this is only viable for small data sets as mobile devices by their very nature have limited storage facilities.
There are many different ways to address these issues: client side, server side, or in a middle layer. We propose that a thin middleware gateway application be constructed. The reason for this is that it would allow the mobile-specific functionality to be kept separate from the database itself. The mbaas aspect can be developed independently, which means less complexity in the core, and the development time and cost can be spread across more people.
Cloudant Envoy is a thin gateway server application that sits between a Cloudant database and a mobile application. It implements document-level auth, users (and the beginnings of groups; not yet complete). This would provide a way around the first and the third problem areas as described above: each app would be backed by a single database instead of a database per user, and reads and changes would be filtered by user identity.
It is important to understand what this isn't. This isn't intended to be a new replicator, or even a way of providing features for other use cases: the intention is to make us more competitive in the mobile sphere. By its very nature (e.g. millions of simultaneous users) this layer needs to be as thin as it can be in order not to become a bottleneck.
The Envoy server should not present an undue load on the underlying Cloudant cluster: if a million sync requests imply a million changes feeds, we're no better off than the million user databases replicating to a single analytics database. In order to make this robust it may be necessary to have a single changes feed follower pushing data onto an external message queue which can be scaled separately.
Here's what this currently does:
- Implement a per-document access rights model and the corresponding CRUD API calls
- Ensure that the replication-specific end points respect the access rights model
- CORS
Here's what's currently outstanding:
- Extend Query to always implicitly search based on user
- Implement a new set of API end points to allow the registration of a new user
- Implement auth against third-party authentication services
- Groups concept is not fully fleshed out
Central to the proposed solution is to add a new private field into each document which carries access rights information:
{
  "_id": "c3065e59c9fa54cc81b5623fa06902f0",
  "_rev": "1-9f7a5dd995bf4953bdb53f22f9b73558",
  "com.cloudant.meta": {
    "auth": {
      "users": [ "harry", "hermione", "ron" ],
      "groups": [ ]
    }
  },
  "age": 5,
  "type": "owl"
}
This states that the users `harry`, `hermione` and `ron` can all read, write and delete this document. The `com.cloudant.meta` field will be inserted on create, maintained on updates, but removed before a document is returned in response to a client request. Obviously, this field will be visible from the Cloudant console and in responses to client requests which go to the underlying database directly, bypassing the new layer.
The `groups` list will allow groups of users to be granted access. For the initial release, there will only be a single, fixed group called `public`, which grants read-only access to the document to every user.
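The access rules described so far can be sketched as two small predicates. This is a minimal illustration, not Envoy's implementation: the function names `authOf`, `canRead` and `canWrite` are hypothetical, and the sketch assumes that a document with no `com.cloudant.meta` field is accessible to nobody.

```javascript
// Return the auth block of a stored document, tolerating missing pieces.
// Assumption: a document without access rights information belongs to no one.
function authOf(doc) {
  const meta = doc['com.cloudant.meta'] || {};
  const auth = meta.auth || {};
  return { users: auth.users || [], groups: auth.groups || [] };
}

// Read access: the user is listed, or the document is in the fixed "public" group.
function canRead(doc, user) {
  const auth = authOf(doc);
  return auth.users.includes(user) || auth.groups.includes('public');
}

// Write and delete access: the user must be listed; "public" is read-only.
function canWrite(doc, user) {
  return authOf(doc).users.includes(user);
}

const owl = {
  _id: 'c3065e59c9fa54cc81b5623fa06902f0',
  'com.cloudant.meta': { auth: { users: ['harry', 'hermione', 'ron'], groups: [] } },
  age: 5,
  type: 'owl'
};
console.log(canRead(owl, 'ron'), canWrite(owl, 'draco'));
```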
We do not expose views in this new layer: client data access will need to be via Query only. This is vital.
If I am user `harry` and I create a new document, the assumption is that I am the sole user with access rights:
curl 'https://harry:alohomora@hogwarts.com/creatures' \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{ "age": 456, "type": "thestral" }'
will result in the following document being written to the database:
{
  "_id": "0d711609b3ab27a9069e7da766d93334",
  "_rev": "1-42261671e23759c51e7f0899ee99418d",
  "com.cloudant.meta": {
    "auth": {
      "users": [ "harry" ],
      "groups": [ ]
    }
  },
  "age": 456,
  "type": "thestral"
}
and if I read the document with
curl 'https://harry:alohomora@hogwarts.com/creatures/0d711609b3ab27a9069e7da766d93334'
the result should be
{
  "_id": "0d711609b3ab27a9069e7da766d93334",
  "_rev": "1-42261671e23759c51e7f0899ee99418d",
  "age": 456,
  "type": "thestral"
}
If `hermione` were now to request this document, she should get a `401 Unauthorized` response.
CouchDB uses certain special fields that start with an underscore to denote metadata. Ideally, we'd use something like "_auth" for our purpose, but CouchDB will strip out any underscored fields it doesn't recognise. For this reason, we use the field name `com.cloudant.meta`, as we don't want to modify the behaviour of the CouchDB underneath. The consequence of this is that client documents may not contain a field called `com.cloudant.meta`.
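The create and read behaviour described above, together with the reserved field name, could look roughly like this. The helper names `addOwner` and `stripMeta` are hypothetical, and rejecting a client document that already carries the reserved field is one possible policy assumed for the sketch, not something the text prescribes.

```javascript
const META_FIELD = 'com.cloudant.meta';

// On create: return a copy of the client's document with the creator as sole owner.
// Assumption: a client document that already uses the reserved field is rejected.
function addOwner(doc, user) {
  if (Object.prototype.hasOwnProperty.call(doc, META_FIELD)) {
    throw new Error('Documents may not contain a field called ' + META_FIELD);
  }
  return Object.assign({}, doc, {
    [META_FIELD]: { auth: { users: [user], groups: [] } }
  });
}

// On read: return a copy without the access rights field.
function stripMeta(doc) {
  const copy = Object.assign({}, doc);
  delete copy[META_FIELD];
  return copy;
}

const stored = addOwner({ age: 456, type: 'thestral' }, 'harry');
console.log(JSON.stringify(stored));
console.log(JSON.stringify(stripMeta(stored)));
```

Both helpers return copies rather than mutating their argument, so the stored and client-facing versions of a document never share state.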
With this in place we can tackle the other problem: subset or filtered replication. Given that we now have a single database backing the app used by multiple users, we need to ensure that mobile sync also obeys the access rules. This means that we need to ensure that the `changes`, `bulk_docs` and `revs_diff` end points also respect the authentication rules.
We'd need to implement the following parts of the CouchDB CRUD API.
Reading a document (a `GET` to `/{db}/{docid}`): if the requesting user is not in the `users` list and the document does not have the group `public` listed, the request should fail with a `401 Unauthorized` response.
Writing a document (a `POST` to `/{db}/{docid}`): if the requesting user isn't in the `users` list, the request should fail with a `401 Unauthorized` response. Note that the `public` group grants read access only.
Creating a document: create the new document with the creating user in the `users` list.
Bulk requests (`bulk_docs`) should behave like the current end point, but where new documents are created, they should have the creating user added to the `users` field, and where documents are provided with `{_id, _rev}` these should be subjected to the same authorisation check as a `POST` to `/{db}/{docid}`.
Note: this is potentially a performance problem as we need to check ownership of every document in the list that is given with `{_id, _rev}`. It may be possible to implement this efficiently by first requesting all docs representing updates using `_all_docs?keys=[key1, key2, ..., keyN]` (or the POST version, rather) and validating the auth details.
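The bulk check outlined above might be structured as follows. This is a sketch under stated assumptions: the name `authoriseBulk`, the shape of its return value and the `'unauthorized'` error string are illustrative, and `existingById` is assumed to have been built beforehand from a single `_all_docs` lookup that returns the stored documents (for example with `include_docs=true`).

```javascript
// incomingDocs: the "docs" array sent by the client.
// existingById: stored documents (including com.cloudant.meta) keyed by _id.
// Returns the documents that may be written and a list of rejected ids.
function authoriseBulk(incomingDocs, existingById, user) {
  const allowed = [];
  const rejected = [];
  for (const doc of incomingDocs) {
    const isUpdate = doc._id !== undefined && doc._rev !== undefined;
    if (!isUpdate) {
      // New document: the creating user becomes the sole owner.
      allowed.push(Object.assign({}, doc, {
        'com.cloudant.meta': { auth: { users: [user], groups: [] } }
      }));
      continue;
    }
    const storedDoc = existingById[doc._id];
    const meta = storedDoc && storedDoc['com.cloudant.meta'];
    const users = (meta && meta.auth && meta.auth.users) || [];
    if (users.includes(user)) {
      // Update by an owner: carry the stored access rights over to the new revision.
      allowed.push(Object.assign({}, doc, { 'com.cloudant.meta': meta }));
    } else {
      rejected.push({ id: doc._id, error: 'unauthorized' });
    }
  }
  return { allowed: allowed, rejected: rejected };
}
```

Checking every update against one pre-fetched map keeps the cost at a single extra round trip per bulk request, rather than one lookup per document.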
The changes feed should be filtered according to the same rules as a `GET` to `/{db}/{docid}`: only return changes related to documents where the requesting user is either listed in the `users` list, or the document has the `public` group ownership.
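One way to picture the filtering is as a pass over rows from the changes feed. The sketch below assumes the feed was requested with `include_docs=true` so that each row carries the stored document in `row.doc`; the function name `filterChanges` is hypothetical, and a production version would have to work on a stream rather than an in-memory array.

```javascript
// Keep only the rows the user is allowed to see, and strip the access
// rights field from each document before it is returned to the client.
function filterChanges(rows, user) {
  return rows
    .filter(function (row) {
      const meta = (row.doc && row.doc['com.cloudant.meta']) || {};
      const auth = meta.auth || {};
      const users = auth.users || [];
      const groups = auth.groups || [];
      return users.includes(user) || groups.includes('public');
    })
    .map(function (row) {
      const doc = Object.assign({}, row.doc);
      delete doc['com.cloudant.meta'];
      return Object.assign({}, row, { doc: doc });
    });
}
```

Note that a row whose document carries no access rights field (for instance a deletion stub that has lost it) is dropped by this sketch; how such rows should be handled is left open here.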
RevsDiff should check the returned list according to the same rules as a `GET` to `/{db}/{docid}`: only return entries related to documents where the requesting user is either listed in the `users` list, or the document has the `public` group ownership.
Index creation: creates a Cloudant Query index, supplementing the index to allow the ownership of a document to be queried.
Querying: queries using Cloudant Query only return the querying user's documents.
In order to respect groups, the query would have to be modified like so:
{
  "selector": {
    "$or": [
      { "auth.users": { "$elemMatch": { "$eq": "USER" } } },
      { "auth.groups": { "$elemMatch": { "$eq": "public" } } }
    ]
  }
}
to ensure that the user can only see the documents they are authorised to see. The current implementation only uses auth.users.
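A gateway would presumably need to combine this ownership clause with whatever selector the client sent. The sketch below shows one way to do that with Cloudant Query's `$and` operator; the function name `withOwnership` is hypothetical, the field paths `auth.users` and `auth.groups` are copied from the example above, and this is not a description of how Envoy currently rewrites queries.

```javascript
// Wrap a client-supplied selector so that it can only match documents
// the user owns or that belong to the fixed "public" group.
function withOwnership(selector, user) {
  const ownership = {
    $or: [
      { 'auth.users': { $elemMatch: { $eq: user } } },
      { 'auth.groups': { $elemMatch: { $eq: 'public' } } }
    ]
  };
  return { $and: [selector, ownership] };
}

const query = { selector: withOwnership({ type: 'thestral' }, 'harry') };
console.log(JSON.stringify(query, null, 2));
```

Wrapping with `$and` leaves the client's selector untouched, so the client cannot override the ownership clause by supplying its own `$or`.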