The nYnab web API protocol

rienafairefr edited this page Mar 26, 2017 · 3 revisions

Reverse-engineering process

I've set up a Fiddler proxy between my browser and the internet, with HTTPS decryption. I could then see the calls made to the app.youneedabudget.com website, the API endpoints and the protocol. I've reverse-engineered a protocol that seems to correctly read and write nYnab data; I'll try to explain the final result here.

The endpoint and communication

The 'https://app.youneedabudget.com/api/v1/catalog' endpoint is used for authentication, and for back-and-forth communication with the API once authenticated. It expects data as a JSON-encoded dictionary and returns a JSON-encoded dictionary. The returned data can contain an 'error' field indicating a potential error, the type of error, etc.; look at the nYnabConnection::dorequest method for details.

Authentication

It is a simple user/password authentication scheme: pynYnab needs the user and password in cleartext in order to log in to nYnab on behalf of a user. The login request is a POST to the endpoint, with the X-YNAB-Device-Id header set to a randomly generated UUID, and the login request data added as a JSON-encoded dictionary:

{"email": email, "password": password, "remember_me": true, "device_info": {"id": device_id}}

We get back a session_token, which we then use in all requests that read or modify data by adding it as the 'X-Session-Token' header. We also get the user in the 'user' field; we store its id for future use.
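The login flow above can be sketched as follows. This is a minimal sketch, not the actual pynYNAB code: the endpoint URL and field names come from the description above, but the helper name build_login_request and the example credentials are hypothetical.

```python
import json
import uuid

# Endpoint from the protocol description above.
ENDPOINT = "https://app.youneedabudget.com/api/v1/catalog"

def build_login_request(email, password):
    """Build the headers and JSON body for the login POST.

    The device id is a randomly generated UUID, sent both in the
    X-YNAB-Device-Id header and inside device_info in the body.
    """
    device_id = str(uuid.uuid4())
    headers = {"X-YNAB-Device-Id": device_id}
    body = {
        "email": email,
        "password": password,
        "remember_me": True,
        "device_info": {"id": device_id},
    }
    return headers, json.dumps(body)

# Example usage (the actual POST to ENDPOINT is omitted here).
headers, payload = build_login_request("user@example.com", "hunter2")
```

On a successful response, the session_token from the reply would then be attached to every subsequent request as the X-Session-Token header.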

The objects

Many, if not all, objects in the nYnab realm are identified by an id which is a UUID. The web app is organised around a root nYnab object that stores entities, grouped in two categories: budget entities and catalog entities. Budget entities are things like transactions, categories, monthly budgets, etc.; catalog entities are things like users, which budget is opened by which user, and user settings.

The protocol

All the back-and-forth between client and server is done through the changed_entities field. Let's say we want to retrieve data from the server for the catalog object. The request dictionary has fields:

opname, starting_device_knowledge, ending_device_knowledge, device_knowledge_of_server, changed_entities

with opname = syncCatalogData, and the other fields depend on client/server knowledge

Knowledge

Knowledge is basically a way for the web app to permit asynchronous modifications of the server data: the server data can be modified by different clients with different levels of knowledge about it, and the knowledge mechanism ensures there can be no data corruption on the server. There are two knowledges, the client knowledge and the server knowledge. When we receive data from the server, the server tells us which knowledge this data corresponds to. Then, when the client sends back modified data, if device_knowledge_of_server doesn't match what the server has on record about what the client knows, the request will fail. Server knowledge doesn't seem to be a monotonically increasing value like a number of seconds since a certain time; it looks more like a random value, possibly a hash of the overall user data, seemingly akin to the SHA-1 hash that git uses to uniquely identify a commit.

When the catalog or budget objects are not yet synced (just created), the client knowledge is 0. So we send starting_device_knowledge=0, ending_device_knowledge=0, device_knowledge_of_server=0 and changed_entities={}
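An initial sync request, as described above, can be sketched like this. The field names and the syncCatalogData opname come from the text; the helper name initial_sync_request is hypothetical.

```python
def initial_sync_request(opname):
    """Request dictionary for a first sync, when the client knows nothing yet.

    All three knowledge values are 0 and no entities have been changed
    locally, so changed_entities is empty.
    """
    return {
        "opname": opname,
        "starting_device_knowledge": 0,
        "ending_device_knowledge": 0,
        "device_knowledge_of_server": 0,
        "changed_entities": {},
    }

# For the catalog root object:
request = initial_sync_request("syncCatalogData")
```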

sync

Once we send the request with the appropriate values, we receive back a sync_data dictionary describing the change in the state of the server, with the fields

server_knowledge_of_device, current_server_knowledge, changed_entities

server_knowledge_of_device is the knowledge the server already has about the client. For a freshly created client with a random device-id this value will be 0, but for subsequent requests it will increase as the client pushes modifications to the server. current_server_knowledge is the value we talked about before, seemingly random. changed_entities is a dictionary containing lists of all the data that the server determined the client doesn't know about.
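A client consuming this sync_data would record both knowledge values for the next round trip, roughly like the sketch below. The field names come from the text; the KnowledgeState class is a hypothetical illustration, not pynYNAB's actual bookkeeping.

```python
class KnowledgeState:
    """Tracks the two knowledge values across sync round trips."""

    def __init__(self):
        self.device_knowledge = 0  # what the server knows about us
        self.server_knowledge = 0  # the server's current (opaque) knowledge

    def apply_sync_data(self, sync_data):
        """Record the knowledge values from a sync response and return
        the entities the server says we don't know about yet."""
        self.device_knowledge = sync_data["server_knowledge_of_device"]
        self.server_knowledge = sync_data["current_server_knowledge"]
        return sync_data["changed_entities"]
```

On the next request, device_knowledge_of_server would be filled in from server_knowledge, so the server can verify the client is up to date.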

api_dict

What nYnab sends on the wire is not Python objects, it's a serialization. Some values are scaled or formatted to be sent, and some conversion is needed to get the actual 'business' values: e.g. dates are passed as YYYY-MM-DD strings, and currency amounts are passed as integers (scaled by 100 or 1000 relative to the actual decimal money values), etc.

So we have our changed_entities; what do we do with it? It's a dictionary, with one entry for each attribute in the budget or catalog object. For example, for catalog objects the fields are

ce_user_budgets,ce_user_settings,ce_budget_versions,ce_users,ce_budgets 

see schema/roots.py for the Budget schema

Each value in the dictionary corresponds to a list of entities that need to be merged into the client data. In pynYNAB I chose to let the object management be done by a sqlalchemy backend with an in-memory sqlite database, so it's relatively easy to take all the incoming data, add it to the db, and have all the relationships handled automatically. The special entity attribute is_tombstone signifies a deleted entity.
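Stripped of the sqlalchemy machinery, the merge step amounts to an upsert-or-delete per entity. This sketch uses plain dictionaries keyed by entity id instead of a database; merge_changed_entities is a hypothetical name, while is_tombstone and the id field come from the description above.

```python
def merge_changed_entities(store, changed_entities):
    """Merge server entities into a local store.

    store maps an entity-type name (e.g. 'ce_transactions') to a dict of
    {entity_id: entity}. Entities flagged is_tombstone are deleted;
    all others are inserted or updated by id.
    """
    for entity_type, entities in changed_entities.items():
        table = store.setdefault(entity_type, {})
        for entity in entities:
            if entity.get("is_tombstone"):
                table.pop(entity["id"], None)  # deleted on the server
            else:
                table[entity["id"]] = entity   # new or updated
    return store
```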

pushing modifications

To push new or modified entities, we send the server the same request dictionary, but this time starting_device_knowledge and ending_device_knowledge are no longer 0. starting_device_knowledge is the device knowledge before the modifications, and ending_device_knowledge is increased. In pynYNAB the value is increased by the total number of modified or created objects; I'm not sure this is how the web app does it, but it apparently works.

In pynYNAB, modification tracking (knowing which modifications to send to the server) is done through snapshotting: the clear_changed_entities method takes a snapshot of the state of the client data, another snapshot is taken just before pushing, and the difference is computed and sent.

A safeguard was added: when calling client.push, an expected_delta parameter is required, to ensure that we push only what we actually modified. It happened to some people that pynYNAB considered all their transaction data as modified (this happened when nYNAB migrated from a *100 to a *1000 scaling for currency values), so it pushed all their data as modified, losing hundreds of transactions. Now, if we add a single entity we do client.push(1); if we modify 5 entities we do client.push(5).
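Putting the push rules together: knowledge bookkeeping plus the expected_delta safeguard. This is a hypothetical sketch of the behaviour described above, not pynYNAB's actual client.push; the field names and the "increase by the number of changed objects" rule come from the text.

```python
def build_push_request(opname, device_knowledge, device_knowledge_of_server,
                       changed_entities, expected_delta):
    """Build a push request, guarded by the expected_delta safeguard.

    ending_device_knowledge is increased by the number of created or
    modified entities (pynYNAB's convention; the web app may differ).
    Refuses to build the request if the computed delta does not match
    what the caller expects, to avoid accidental mass pushes.
    """
    delta = sum(len(entities) for entities in changed_entities.values())
    if delta != expected_delta:
        raise ValueError(
            "refusing to push %d changed entities, expected %d"
            % (delta, expected_delta))
    return {
        "opname": opname,
        "starting_device_knowledge": device_knowledge,
        "ending_device_knowledge": device_knowledge + delta,
        "device_knowledge_of_server": device_knowledge_of_server,
        "changed_entities": changed_entities,
    }
```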