Key paths are ambiguous, there is no trivial way to serialize a unique entity key! #59

Closed
dani-corie opened this issue Feb 17, 2018 · 13 comments · Fixed by #474 or #480
Labels
api: datastore Issues related to the googleapis/nodejs-datastore API. help wanted We'd love to have community involvement on this issue. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@dani-corie

dani-corie commented Feb 17, 2018

Hey... So I've read a few issues on this. For example there's "Get Unique Entity Key String" about "urlsafe" keys. There the apparently accepted solution was to serialize and encode the key path...

However, key paths in nodejs are ambiguous! See Key IDs Are Coming Back with String Values.

Environment details

  • OS: Ubuntu Linux 16.04
  • Node.js version: 8.9.4
  • npm version: 5.6.0
  • @google-cloud/datastore version: 1.3.3

Steps to reproduce

const key = db.key(["Post", "31337", "Comment", db.int("999999999999999999")]);
console.log(JSON.stringify(key));
// {"id":"999999999999999999","kind":"Comment","parent":{"name":"31337","kind":"Post","path":["Post","31337"]},"path":["Post","31337","Comment","999999999999999999"]}

const rekey = db.key(key.path);
console.log(JSON.stringify(rekey));
// {"name":"999999999999999999","kind":"Comment","parent":{"name":"31337","kind":"Post","path":["Post","31337"]},"path":["Post","31337","Comment","999999999999999999"]}

It's quite clear that any queries relying on the above serialized key path will fail!

I'm not sure if I can just JSON.parse() a key and use it in a query... If yes, that might be a solution, though this serialized format is ridiculously verbose to use as a foreign key or even as a transmission format.

I can work around this by NOT using entity groups AT ALL (cutting out one of the ways I could optimize a Datastore db), and only having references to fixed Kinds (coming from a relational background I can live with this one)... In this case, I can always just store or send a numeric Id (as a decimal string due to JS number limitations). But still, it's kinda painful compared to having a globally unique serializable Id I could easily use for caching, references, etc...

Thanks!

@stephenplusplus
Contributor

Do you have any suggestions? Let's say we had a new method, "key.serialize()" (or something). What would it return that would be acceptable?

@stephenplusplus stephenplusplus added priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. Not an issue. labels Feb 19, 2018
@beaulac
Contributor

beaulac commented Feb 19, 2018

@daniel-jozsef

You only need to stringify the key's path property to get a UID (unless you're using separate namespaces):

const uid = JSON.stringify(key.path);

// Naive way to recover `int`s is to just parse them as such if the string is numeric
const parsedPath = JSON.parse(uid).map(e => /^[0-9]+$/.test(e) ? ds.int(e) : e);

const keyFromUid = ds.key(parsedPath);

You may be interested in a tool I made to address my own 'wishlist' for DS keys. Specifically the keyToUid and uidToKey functions.

@dani-corie
Author

@beaulac yes using regular expressions can be a workaround, thanks for the idea. :) It did occur to me, but it's somewhat ugly, and numeric names would break it. If I don't use names, I even considered treating every even place of the path array as ds.int.
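To make the "numeric names would break it" point concrete, here is a minimal sketch in plain JavaScript that does not need the Datastore client. The `asInt` wrapper is only a hypothetical stand-in for whatever `ds.int` returns, and `guessPath` is an illustrative name, not a library function.

```javascript
// Hypothetical stand-in for ds.int(): marks a path element as a numeric ID.
const asInt = (value) => ({ int: String(value) });

// The regex heuristic: any all-digit string is assumed to be an ID.
function guessPath(serializedPath) {
  return JSON.parse(serializedPath).map((e) => (/^[0-9]+$/.test(e) ? asInt(e) : e));
}

// Original key: "31337" is a *name*, 999999999999999999 is an *ID*.
const originalPath = ["Post", "31337", "Comment", asInt("999999999999999999")];

// What key.path looks like once serialized: both identifiers are plain strings.
const uid = JSON.stringify(["Post", "31337", "Comment", "999999999999999999"]);

const guessed = guessPath(uid);
console.log(guessed[1]); // { int: '31337' } (the name was turned into an ID)
console.log(JSON.stringify(guessed) === JSON.stringify(originalPath)); // false
```

The recovered path differs from the original as soon as any name happens to consist only of digits.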

A framework solution would be nice though, this is among the simplest use cases.

@stephenplusplus I'd expect path to be descriptive. Maybe a db.id object should be added that wraps db.int, and serializes as { id: value }.

A built-in base64 key would be comfortable, but I see ambiguous paths as a huge issue by itself.
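As a rough illustration of the `{ id: value }` idea, the sketch below serializes a key into a flat path in which IDs are wrapped and names stay plain strings. It operates on plain objects shaped like the JSON output in the report (`kind`, `id` or `name`, `parent`), not on real `Key` instances, and it assumes complete keys only. `toTaggedPath` and `fromTaggedPath` are hypothetical helper names.

```javascript
// Sketch only: works on plain { kind, id | name, parent } objects.
function toTaggedPath(key) {
  const tagged = [];
  for (let k = key; k; k = k.parent) {
    // IDs are wrapped as { id: "..." }; names stay plain strings.
    const identifier = k.id !== undefined ? { id: String(k.id) } : k.name;
    tagged.unshift(k.kind, identifier);
  }
  return tagged;
}

// Inverse: rebuild the nested { kind, id | name, parent } shape.
function fromTaggedPath(tagged) {
  let key;
  for (let i = 0; i < tagged.length; i += 2) {
    const identifier = tagged[i + 1];
    const node = { kind: tagged[i] };
    if (identifier !== null && typeof identifier === "object") node.id = identifier.id;
    else node.name = identifier;
    if (key) node.parent = key;
    key = node;
  }
  return key;
}

const key = {
  kind: "Comment",
  id: "999999999999999999",
  parent: { kind: "Post", name: "31337" },
};

const wire = JSON.stringify(toTaggedPath(key));
console.log(wire);
// ["Post","31337","Comment",{"id":"999999999999999999"}]

const restored = fromTaggedPath(JSON.parse(wire));
console.log(JSON.stringify(restored) === JSON.stringify(key)); // true
```

Because the ID/name distinction is carried in the data itself, the path survives a JSON round trip without guessing.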

@dani-corie
Author

dani-corie commented Feb 20, 2018

I think the google cloud node sdk as a whole needs a robust solution to integers.

JS not having integers is a ridiculously big issue by itself... These piecemeal solutions such as db.int (which serializes into string, so breaks on serialize-then-parse) make working with the library somewhat painful.

I'd say adopting a bigint library from the community, or just deciding on an unambiguous serialization format (that deserializes back into the original object without loss) is what a real solution would look like.
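For readers less familiar with the underlying limitation, this short sketch shows why the 18-digit ID from the report cannot travel as a JavaScript number. It uses the built-in `BigInt`, which needs Node.js 10.4 or later (newer than the Node.js 8.9.4 listed in the report), so treat it as an illustration rather than something the library did at the time.

```javascript
const raw = "999999999999999999";

// A JS number is an IEEE 754 double: integers are exact only up to 2^53 - 1.
const asNumber = Number(raw);
console.log(Number.isSafeInteger(asNumber)); // false
console.log(String(asNumber));               // 1000000000000000000

// BigInt keeps every digit...
const asBigInt = BigInt(raw);
console.log(asBigInt.toString());            // 999999999999999999

// ...but JSON.stringify throws on BigInt values, so a decimal string is
// still needed on the wire.
const wire = JSON.stringify({ id: asBigInt.toString() });
console.log(BigInt(JSON.parse(wire).id) === asBigInt); // true
```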

@stephenplusplus
Contributor

It would be nice if we could change key.path, but I think it might be dangerous without being certain how it's being used in the wild.

What about a new property, let's say pathSerialized (more creative names welcome)?

const key = ds.key(['Post', '31337', 'Comment', ds.int("999999999999999999")])

key.pathSerialized
// {
//   kind: 'Comment',
//   id: '999999999999999999',
//   parent: {
//     kind: 'Post',
//     name: '31337'
//   }
// }

// To re-use:
const recreatedKey = ds.key(key.pathSerialized)
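
The proposed shape can be derived from what `JSON.stringify(key)` already emits by dropping the redundant `path` arrays. The following is a sketch under that assumption; `toPathSerialized` is a hypothetical helper working on plain objects, not an existing library method.

```javascript
// Hypothetical helper: keep kind, id/name and parent, drop the ambiguous `path`.
function toPathSerialized(key) {
  const out = { kind: key.kind };
  if (key.id !== undefined) out.id = String(key.id);
  if (key.name !== undefined) out.name = key.name;
  if (key.parent) out.parent = toPathSerialized(key.parent);
  return out;
}

// The JSON from the original report, parsed back into a plain object.
const reported = JSON.parse(
  '{"id":"999999999999999999","kind":"Comment","parent":{"name":"31337","kind":"Post","path":["Post","31337"]},"path":["Post","31337","Comment","999999999999999999"]}'
);

const serialized = toPathSerialized(reported);
console.log(JSON.stringify(serialized));
// {"kind":"Comment","id":"999999999999999999","parent":{"kind":"Post","name":"31337"}}
```

Unlike the array path, this form still says which identifier is an `id` and which is a `name` after a JSON round trip.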

@stephenplusplus stephenplusplus added type: enhancement and removed type: question Request for information or clarification. Not an issue. labels Feb 20, 2018
@dani-corie
Author

@stephenplusplus that sounds like a solution.

Maybe it could be called "objectpath", as it's an object tree as opposed to an array. This would also let datastore.key() seamlessly accept both these and array paths; it just needs to check Array.isArray().

@stephenplusplus
Contributor

Sounds good, thanks for the suggestion! There are a few other things that are going to take priority, so I'll mark this with a help wanted label in case anyone wants to pick it up before I can.

@stephenplusplus stephenplusplus added the help wanted We'd love to have community involvement on this issue. label Feb 20, 2018
@beaulac
Contributor

beaulac commented Feb 21, 2018

How would the Datastore.key factory/constructor differentiate such an objectpath from the current options object?

const options = {
  namespace: "...",
  path: ["kind", "id"]
}

What if instead the Key class provided methods for (de)serialization from a GQL Key string? This is a completely unambiguous key, as it additionally (optionally) provides the projectId and namespace, while being less verbose as a serialized format than JSON.

An added benefit is that it provides a quick way to look entities up in the Datastore Viewer, which is invaluable while debugging :)

@stephenplusplus stephenplusplus removed the priority: p2 Moderately-important priority. Fix may not be included in next release. label Feb 21, 2018
@stephenplusplus
Contributor

That sounds pretty great! It looks like an example could be (please correct if wrong):

path.keyString = 'KEY([Building:C, Floor:1, Room:123])'

Do you know of a reliable way we could differentiate between IDs and names when creating a Key object from that input?

@dani-corie
Author

How does GQL do it? If it needs to have access to db metadata, then it's not fully descriptive.

@dani-corie
Author

Also, I wonder if Datastore allows entities of the same Kind with both names and Ids... If yes, then it's a real clusterf...

@beaulac
Contributor

beaulac commented Feb 21, 2018

@stephenplusplus
Here is an example GQL Key:

KEY(parentKind,1234,childKind,'childName')
  • Kinds (even positioned elements) are unquoted strings. e.g. parentKind
  • Identifiers (odd positioned elements) are either
    • Quoted (' or ") strings e.g. 'childName'
    • Unquoted integers e.g. 1234

GQL key syntax is defined in the GQL docs as:

<Key> := KEY "("
    [ "PROJECT" "(" <string-literal> ")" "," ]
    [ "NAMESPACE" "(" <string-literal> ")" "," ]
    <key-path-element>+, ")"

<key-path-element> :=
  <kind> "," ( <integer-literal> | <string-literal> )

(integer/string literals defined here)

When specifying a project and/or namespace, path parsing starts after the definition of a PROJECT or NAMESPACE. In these examples, parentKind is treated as being in position 0 for the purposes of determining whether a path element is even or odd.

  • KEY(PROJECT('my-project'),NAMESPACE('ns'),parentKind,1234,childKind,'childName')
  • KEY(PROJECT('my-project'),parentKind,1234,childKind,'childName')
  • KEY(NAMESPACE('ns'),parentKind,1234,childKind,'childName')
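
The grammar above is small enough to sketch a formatter and parser in plain JavaScript. This is only an illustration under simplifying assumptions: kinds are bare identifiers, string literals contain no quotes or escape sequences, and IDs are kept as decimal strings to avoid precision loss. `formatGqlKey` and `parseGqlKey` are hypothetical names, not part of `@google-cloud/datastore`.

```javascript
// Sketch only: no escape sequences, kinds limited to [A-Za-z_][A-Za-z0-9_]*.
function formatGqlKey({ projectId, namespace, path }) {
  const quote = (s) => {
    if (/['\\]/.test(s)) throw new Error("escaping is not handled in this sketch");
    return `'${s}'`;
  };
  const parts = [];
  if (projectId) parts.push(`PROJECT(${quote(projectId)})`);
  if (namespace) parts.push(`NAMESPACE(${quote(namespace)})`);
  for (const { kind, id, name } of path) {
    // IDs are emitted unquoted and names quoted: that is what keeps them apart.
    parts.push(kind, id !== undefined ? String(id) : quote(name));
  }
  return `KEY(${parts.join(",")})`;
}

function parseGqlKey(text) {
  const outer = /^\s*KEY\s*\(([\s\S]*)\)\s*$/.exec(text);
  if (!outer) throw new Error("not a KEY(...) literal");
  let rest = outer[1];
  const result = { path: [] };

  // Optional PROJECT('...') and NAMESPACE('...') prefixes, in that order.
  for (const [keyword, field] of [["PROJECT", "projectId"], ["NAMESPACE", "namespace"]]) {
    const m = new RegExp(
      `^\\s*${keyword}\\s*\\(\\s*(?:'([^']*)'|"([^"]*)")\\s*\\)\\s*,`
    ).exec(rest);
    if (m) {
      result[field] = m[1] !== undefined ? m[1] : m[2];
      rest = rest.slice(m[0].length);
    }
  }

  // Remaining text: one or more "kind, identifier" pairs.
  const element = /\s*([A-Za-z_]\w*)\s*,\s*(?:(\d+)|'([^']*)'|"([^"]*)")\s*(,|$)/y;
  let pos = 0;
  let separator = ",";
  while (pos < rest.length) {
    element.lastIndex = pos;
    const m = element.exec(rest);
    if (!m) throw new Error(`unexpected input at offset ${pos}`);
    const [, kind, id, single, double] = m;
    // Unquoted digits are an ID (kept as a string); quoted text is a name.
    result.path.push(
      id !== undefined ? { kind, id } : { kind, name: single !== undefined ? single : double }
    );
    separator = m[5];
    pos = element.lastIndex;
  }
  if (result.path.length === 0 || separator === ",") {
    throw new Error("expected a kind/identifier pair");
  }
  return result;
}

const parsed = parseGqlKey(
  "KEY(PROJECT('my-project'),NAMESPACE('ns'),parentKind,1234,childKind,'childName')"
);
console.log(parsed.path);
// [ { kind: 'parentKind', id: '1234' }, { kind: 'childKind', name: 'childName' } ]

// The key from the original report keeps its name/ID distinction:
console.log(formatGqlKey({
  path: [{ kind: "Post", name: "31337" }, { kind: "Comment", id: "999999999999999999" }],
}));
// KEY(Post,'31337',Comment,999999999999999999)
```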

@daniel-jozsef

  • No DB metadata is needed to parse a key.
  • Yes, Datastore allows mixed names + ids for the same entity kind.
    These two properties are connected: Datastore does not enforce a schema for an entity's key (beyond syntax) because there is no predefined metadata for entities. It's very flexible, and indeed, as you're implying, with great power comes great responsibility 😉.

@dani-corie
Author

So the entire gql key is going to be a monolithic string in js? I guess it would work fine...

Though handling int64s is something that definitely needs a comprehensive solution.

@JustinBeckwith JustinBeckwith added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed type: enhancement labels Jun 5, 2019
@stephenplusplus stephenplusplus self-assigned this Aug 16, 2019
hermanbanken added a commit to hermanbanken/nodejs-datastore that referenced this issue Aug 27, 2019
stephenplusplus pushed a commit to hermanbanken/nodejs-datastore that referenced this issue Oct 21, 2019
@google-cloud-label-sync google-cloud-label-sync bot added the api: datastore Issues related to the googleapis/nodejs-datastore API. label Jan 31, 2020
4 participants