Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pushPayload/save can create duplicate records in the store #4262

Closed
stevesims opened this issue Mar 23, 2016 · 25 comments · Fixed by #6272
Closed

pushPayload/save can create duplicate records in the store #4262

stevesims opened this issue Mar 23, 2016 · 25 comments · Fixed by #6272

Comments

@stevesims
Copy link

I'm working on an app that uses websockets to push out changes from a server to multiple clients, keeping them in sync. (Specifically the server uses Rails Action Cable, and the client subscribes to channels it wants to receive updates on.) Data updates are sent down the socket/channel in JSON-API format, thus the clients simply call this.store.pushPayload(data); to update themselves.

On the whole this works great, but occasionally we'll see duplicate records in the store of a client that has created a record. Some investigation shows that this happens when the socket received it's copy of the record before the save() call had completed.

Obviously having two copies of the same record (with the same ID) in the store is less than ideal.

The reason for the duplicate is because the original record on which the save call was made gets updated with the attributes returned back from the server (adapterDidCommit in ember-data/-private/system/model/internal-model), filling in the ID of the record object being saved, whilst the pushPayload call from the socket has already pushed a new copy with the same ID.

It seems to me that in an ideal world, adapterDidCommit could notice that the id has been set/updated, and if it can find a duplicate record in the store it could remove that dupe.

A simple workaround for this problem is to slightly delay the call to pushPayload, but that will only make the problem less likely to occur.

@fivetanley
Copy link
Member

Is there a way to break this down into runnable code to debug? I don't think I've seen same IDs create duplicate records, so it would be great to have some code as a starting point to determine when this is happening.

@fivetanley
Copy link
Member

JSBin is a great place to start; you could mock the websocket using setTimeout/etc. http://emberjs.jsbin.com/nunico/1/edit?html,js,output

@fivetanley
Copy link
Member

Also does the record have an id already when you are calling save?

@stevesims
Copy link
Author

Not really sure how I could simulate this in JSBin - I know that this is happening when the pushPayload from the socket occurred before the record's save promise resolves simply through putting in console.log statements. To mock this out I'd need to simulate both the websocket and also a server receiving the ember-data save.

To answer your question, no, as it's a newly created record the original object does not have an ID.

I'm not familiar with the internals of ember-data, but I presume that the adapterDidCommit implementation in ember-data/-private/system/model/internal-model is what's responsible for filling in the ID on a record returned back from the server's response to a POST to save a record.

Looking at that code, it seems inevitable that as my socket is (occasionally) calling pushPayload and inserting a copy of the record into the store, the adapterDidCommit will just update the existing record to have a matching ID to the one that's already been pushed. Hence the duplicate.

@zion03
Copy link

zion03 commented Apr 9, 2016

+1 I also have the same error.
In my case I wrote additional logic and pushObject makes duplicate keys.

@pangratz
Copy link
Member

This ember-twiddle demonstrates how it is possible to have duplicates records in the store.

The problem is that store.push creates a DS.Model and store.createRecord creates a DS.Model and both have different InternalModels, though they actually reference the same record. A possible solution at first sight could be to re-wire the InternalModels when data comes back for createRecord and there is already an InternalModel for the returned id in the store. But this raises the issue that there would be two DS.Model instances, which have the same InternalModel but comparing them via === evaluates to false.

@boris-petrov
Copy link

+1 I hit the same issue. It would be nice to have a fix for this as soon as possible. Until then, does anyone have an idea for a workaround? I tried unloading the record returned from save, however it doesn't seem to fix the problem.

@mmun
Copy link
Member

mmun commented Apr 13, 2016

(Just commenting so I get updates on this and they show up under Participating because I get too many notifications otherwise)

@luisjuniorj
Copy link

(Just commenting so I get updates on this)

@courajs
Copy link
Contributor

courajs commented May 21, 2016

My workaround has been just to delay your push until the server response comes back. This way it will function as an update to the record rather than creating a new one

@christophersansone
Copy link
Contributor

I just ran into this as well. To me, any server solution is questionable because, while you may be able to guarantee the order in which the server completes the response and pushes data, you cannot guarantee the order in which the client will receive and process it. You can delay the data push, either on the server sending it or the client receiving it, but that can still be error prone.

For now, my solution is this. When the client receives a data push, it looks at the payload, determines if there are any models in flight that may be the same model in the data push, and waits for their completion. This way, we have some guarantees that (a) the data will not be processed until it can, and (b) it will process as soon as it knows it can.

Hopefully this helps others facing this problem. It assumes a JSON-API payload, but obviously it can be adapted as needed. It assumes that store and serializer are in context.

  // returns a promise that resolves when the data payload can be processed by the store
  const waitForReady = function(data) {
    const typeKey = data && data.data && data.data.type;
    if (!typeKey) { return Ember.RSVP.resolve(); }

    const type = serializer.modelNameFromPayloadKey(typeKey);
    const id = data && data.data && data.data.id;
    // if the model is already in the store, return immediately
    if (store.peekRecord(type, id)) { return Ember.RSVP.resolve(); }

    // watch for models currently being created
    const creatingModels = store.peekAll(type).filter((m) => (m.get('isSaving')) && (m.get('dirtyType') == 'created'));
    // if no models of this type are being created, return immediately
    if (Ember.isEmpty(creatingModels)) { return Ember.RSVP.resolve(); }

    var resolvedCount = 0;

    return new Ember.RSVP.Promise((resolve, reject) => {
      // resolve once all create requests have resolved
      const didResolve = () => {
        resolvedCount += 1;
        if (creatingModels.length === resolvedCount) {
          finish();
        }
      };

      // resolve if the model now exists in the store
      const didCreate = () => {
        if (store.peekRecord(type, id)) { finish(); }
        didResolve();
      };

      const start = () => {
        creatingModels.forEach((model) => {
          model.on('didCreate', didCreate);
          model.on('becameInvalid', didResolve);
          model.on('becameError', didResolve);
        });
      }

      const finish = () => {
        creatingModels.forEach((model) => {
          model.off('didCreate', didCreate);
          model.off('becameInvalid', didResolve);
          model.off('becameError', didResolve);
        });
        resolve();
      };

      start();
    });
  };

@rmachielse
Copy link

We're having exactly the same issue. The action cable message comes in before the request is finished. We temporarily solved it by pushing to the store after all xhr requests have been finished:

$(document).one('ajaxStop', () => {
  this.get('store').pushPayload(payload);
});

@pete-the-pete
Copy link
Contributor

I also came across this because because I have different endpoints for writes vs reads.
An EventStream is used to push() data into the store on arrival. Records created by
the user go through a createRecord() flow.

As mentioned above, the createRecord() and push() have different internal models, so if
the push() happens before the createRecord resolves, I can get duplicate records. The
createRecord response updates its internalModel to have the same data (including id) as
the one frmo push().

Here's an EmberTwidde demonstrating the problem: Duplicate Models

From what I can tell, this is never a problem when the createRecord completes before the push() because the push()
flow checks for existing records: _load
and updates them. The createRecord flow culminates at didSaveRecord, which updates the existing internalModel, but doesn't
have a check about others.

My work around was to check for an existing model of the same id and unloading it.

if (existingRecord) {
  store.unloadRecord(existingRecord);
}

Unfortunately this has some other issues of not actually removing that record from hasMany relationships, but I have a filter to not show 'deleted' records.

@runspired Is this something that should be looked into now? or addressed with future improvements?

@runspired
Copy link
Contributor

@pete-the-pete I have the beginnings of a plan here, but it'll require more refactoring before I'm ready to address this. The major problem here is deciding which state is "canonical".

  • Is it the state returned by the call to save ?
  • Is it the state already in the store via the push ?
  • Is it both ?

I suspect there is no universal answer to this question, but that the 90% answer is that it's "both".

As @pangratz mentioned, the easy path is to rewire the InternalModel's. I find this acceptable, even with the trade off of there maybe being two record instances, provided we agree on what canonical state should mean. (I'll get into why it's only a maybe below).

We should also not do this silently. For the solution outlined below:

  • CRi is the instance of internalModel created by createRecord
  • CRm is the instance of DS.Model created by createRecord
  • Pi is the instance of internalModel created by push
  • Pm is the potential instance of DS.Model created by push

I believe the solution is along the lines of:

  • Warn the user in dev mode that a push occurred before save resolved.
  • Rewire the CRm to use the Pi
  • If Pm does not exist, update Pi to use CRm as it's record instance
  • If Pm does exist, warn the user in dev mode that CRm is now stranded (it's not actually ;) )
  • If Pm does exist, make the setters on CRm also update Pm (ewww)

Why is Pm only maybe instantiated after a push?

Via public APIs, it will currently always be instantiated, however, #4664 allows us to push without materializing the DS.Model instance and emberjs/rfcs#161 will soon lead to better streaming update support.

@runspired
Copy link
Contributor

@pete-the-pete I would also like to note that the other solution here is to use an api actions like approach in which you never create a client side model at all if you know you will be receiving the update via websocket, I suspect we can do this elegantly, ping me privately with model details and I'll walk you through it.

@eodb
Copy link

eodb commented Mar 30, 2017

We have the same problem in our application which also uses websockets for server communication. We worked around the issue by postponing the push of any payload until the promises returned by the save() call of the new record(s) has been resolved.
A work around we would like to get rid of if anyhow possible.

@chrism
Copy link
Contributor

chrism commented Aug 13, 2017

Is there any update on addressing this issue?

I see that #4664 is now merged, but emberjs/rfcs#161 still seems open.

I'm struggling to find a decent solution or workaround.

@Techn1x
Copy link

Techn1x commented Nov 23, 2017

+1

The workarounds listed above are pretty dodgy, would be great to have a solution to this as more and more applications use action cables

@Techn1x
Copy link

Techn1x commented Apr 4, 2018

This has arisen as a problem for me again due to the following scenario (even with @christophersansone 's workaround code above):

  1. asset belongsTo assetType
  2. new Asset, POST request started
  3. Server receives request, changes made to AssetType, Asset created, committed to db
  4. AssetType is pushed down the cable, it is not 'in flight' so it updates the AssetType in the store. The AssetType model contains relationships to Assets, so each related Asset is also created/saved in the store in the same payload
  5. Asset is pushed down the cable, waits due to 'in flight' POST request thanks to above workaround code
  6. POST request returns, goes to create Asset in the store, fails due to Asset ID already existing

I'm not entirely sure how to fix the workaround code to cater for this scenario, and I really need both Assets and AssetTypes pushed down the cable.

The approach that does cater for this scenario @rmachielse 's heavy-handed approach of blocking all cable updates while their are any in-flight requests... but trying to avoid that...

@Techn1x
Copy link

Techn1x commented Apr 4, 2018

The $(document).one('ajaxStop', ()=>{}) approach is also not ideal solution since the order in which cable messages are pushed to the store cannot be guaranteed, so old data could be pushed after newer data...

@christophersansone
Copy link
Contributor

@Techn1x I don't think I have had this specific use case, but I have learned to work around these kinds of problems with Ember Data by simply building a custom method on the model's adapter (which have come a long way since my original workaround). This way, you can handle the response yourself, rather than having the model do it (incorrectly in your case). If you can avoid creating a model in the store up front (and just use a Javascript object), that makes things a bit easier too. Something like this to pass in the model attributes, and return a promise that resolves to the model instance:

const adapter = this.get('store').adapterFor('asset');
adapter.createModel(modelAttributes).then((model) => ...);

...

const AssetAdapter = JSONAPIAdapter.extend({
  createModel(attrs) {
    return this.ajax(url, 'POST', { data: attrs }).then((response) => {
      const store = this.get('store');
      store.pushPayload(response);
      return store.peekRecord('asset', response.data.id);
    });
  }
});

This way, you avoid the whole process of having a model without an ID (that then has certain expectations of how and when its data should arrive). The model is created and updated from the server payloads, and the order in which it is received (POST response or websocket) does not matter.

If you still need to have an actual ED model prior to saving, you can still use this approach, but the response will result in a separate model instance because the original model will not be updated. Just update any references to point to the new model, and then unload the original model. Not a big deal, but I just find it easier to use a POJO in this case unless I really need some computed properties on the model class.

It seems like this RFC will provide a more public API to internalModel, which should offer some ability to resolve this.

@Techn1x
Copy link

Techn1x commented Apr 4, 2018

Thanks @christophersansone ! I hope so, because my solution is very janky at the moment. Your new solution seems like a lot of changes to my current app implementation, I'll wait and see what this RFC provides

As for an alternative, what do you think of something like this? Is this a terrible abomination? Basically it overrides the updateId() method in the store service (where the duplicate ID error occurs), and ignores it, since we're using GUID's for our record ID's anyway, so if any two ID's are the same then they are definitely the same record.

updateId(internalModel, data) {
let oldId = internalModel.id;
let modelName = internalModel.modelName;
let id = coerceId(data.id);
// ID absolutely can't be missing if the oldID is empty (missing Id in response for a new record)
assert(`'${modelName}' was saved to the server, but the response does not have an id and your record does not either.`, !(id === null && oldId === null));
// ID absolutely can't be different than oldID if oldID is not null
assert(`'${modelName}:${oldId}' was saved to the server, but the response returned the new id '${id}'. The store cannot assign a new id to a record that already has an id.`, !(oldId !== null && id !== oldId));
// ID can be null if oldID is not null (altered ID in response for a record)
// however, this is more than likely a developer error.
if (oldId !== null && id === null) {
warn(`Your ${modelName} record was saved to the server, but the response does not have an id.`, !(oldId !== null && id === null));
return;
}
let existingInternalModel = this._existingInternalModelForId(modelName, id);
assert(`'${modelName}' was saved to the server, but the response returned the new id '${id}', which has already been used with another record.'`,
isNone(existingInternalModel) || existingInternalModel === internalModel);
this._internalModelsFor(internalModel.modelName).set(id, internalModel);
internalModel.setId(id);
},

// app/services/store.js
import DS from 'ember-data';

import { assert, warn } from '@ember/debug';
import { isNone } from '@ember/utils';

export default DS.Store.extend({
	
  coerceId(id) {
    if (id === null || id === undefined || id === '') { return null; }
    if (typeof id === 'string') { return id; }
    return '' + id;
  },

  updateId(internalModel, data) {
    let oldId = internalModel.id;
    let modelName = internalModel.modelName;
    let id = this.coerceId(data.id);

    // ID absolutely can't be missing if the oldID is empty (missing Id in response for a new record)
    assert(`'${modelName}' was saved to the server, but the response does not have an id and your record does not either.`, !(id === null && oldId === null));

    // ID absolutely can't be different than oldID if oldID is not null
    assert(`'${modelName}:${oldId}' was saved to the server, but the response returned the new id '${id}'. The store cannot assign a new id to a record that already has an id.`, !(oldId !== null && id !== oldId));

    // ID can be null if oldID is not null (altered ID in response for a record)
    // however, this is more than likely a developer error.
    if (oldId !== null && id === null) {
      warn(`Your ${modelName} record was saved to the server, but the response does not have an id.`, !(oldId !== null && id === null));
      return;
    }

    let existingInternalModel = this._existingInternalModelForId(modelName, id);

    // assert(`'${modelName}' was saved to the server, but the response returned the new id '${id}', which has already been used with another record.'`,
    //   isNone(existingInternalModel) || existingInternalModel === internalModel);
    if(!isNone(existingInternalModel) && existingInternalModel !== internalModel) {
      // Unload existing (older) model from the cable, use the newer one from the POST
      existingInternalModel.unloadRecord();

      // Skip the assert() check in .set(), just set the ID in the identity map directly
      // this._internalModelsFor(internalModel.modelName).set(id, internalModel);
      this._internalModelsFor(internalModel.modelName)._idToModel[id] = internalModel;
	
      // Now ensure the internalModel has the correct ID
      internalModel.setId(id);
    } else {
      // Normal procedure
      this._internalModelsFor(internalModel.modelName).set(id, internalModel);
      internalModel.setId(id);
    }
  },
});

@christophersansone
Copy link
Contributor

@Techn1x At this point, whatever works for you. My personal opinion would be to use the adapter methods since it allows the level of customization you need and uses public APIs. The ED team put a lot of hard work into redesigning the adapter API to do exactly this sort of thing. And you do not have to implement the new approach everywhere... just implement it in the specific places where you are running into this problem. You can probably even create a generic createModel method on the application adapter to offer a consistent, universal workaround when you need it.

@runspired
Copy link
Contributor

Closing this as a duplicate of #1829

We've discussed a few things RE this problem lately, and now that json-api has lid support coming in operations we may have a path forward.

  1. Use lid for saving nested newly-created relationships.
  2. End users may choose to make their API preserve and attach lid to other payloads in a session if needed.
  3. Users should always opt for a client-generated universal-id system for more complex situations. The json-api spec may even upgrade this to a SHOULD recommendation.

@Techn1x
Copy link

Techn1x commented Jun 28, 2018

@christophersansone Ended up pushing the records manually for the models that needed it, like you suggested (skipping any kind of ID check), but my approach was slightly different in that it works for models (rather than just raw attributes) - the downside is it uses _internalModel to create a snapshot.

Posting here in case anyone has a need for it (or if anyone can suggest improvements)

Model

//app/models/item.js
export default DS.Model.extend(SaveManualMixin, {
});

Mixin

// app/mixins/save-manual-mixin.js
import Mixin from '@ember/object/mixin';

// This is a workaround for models being received & created by actioncable BEFORE the POST request returns
// The POST request then also tries to create the record, but ID already exists, so it errors out.

// NOTE: When using this mixin, make sure that after you save, you remove any references to the old record, since it gets unloaded
// NOTE: This mixin is only needed for models that have both of the following features;
//		a) are pushed down the cable
//		b) are able to be created by the user with POST

export default Mixin.create({
  save(options) {
    if(this.get('isNew')) {	// i.e. POST
      return this.saveManual(options).then((savedRecord) => {
        // Since the saved record is not actually the same record as the one that was passed, we need to unload the old record
        this.unloadRecord();
        return savedRecord;
      });
    }
    // If the record is not brand new, just use the standard save method
    return this._super(...arguments);
  },

  saveManual(options) {		
    let store = this.get('store');
    let modelClass = store.modelFor(this.constructor.modelName);	// I think this.constructor === modelClass, but let's do it this way anyway
    let internalModel = this._internalModel;	// I know, I know, it's private.....

    let adapter = store.adapterFor(modelClass.modelName);
    return adapter.createRecordManual(store,modelClass,internalModel.createSnapshot(options));
  }	
});

Adapter

//app/adapters/application.js
export default DS.JSONAPIAdapter.extend({
  createRecordManual(store, type, snapshot) {
    let data = {};
    let serializer = store.serializerFor(type.modelName);
    let url = this.buildURL(type.modelName, null, snapshot, 'createRecord');

    serializer.serializeIntoHash(data, type, snapshot, { includeId: true });

    return this.ajax(url, 'POST', { data: data }).then((response) => {
      store.pushPayload(response);
      return store.peekRecord(type.modelName, response.data.id);
    });
  }
});

Now it doesn't matter if the cable creates the record before the POST returns, and I can just do this...

item.save().then((savedRecord) => {
  this.controller.set('model',savedRecord);	// Replace reference to old model item since it was unloaded
  ...
});

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.