Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SecondaryIndex does not support “partial” or “conditional” datasets #104

Closed
shpakovski opened this issue Aug 4, 2014 · 10 comments
Closed

Comments

@shpakovski
Copy link

Hello, I got a use case:

You may want to index only archived objects to restore them later by name. So one would expect something like this to work:

YapDatabaseSecondaryIndexWithObjectBlock block = ^(NSMutableDictionary *dict, NSString *collection, NSString *key, MYCategory *category) {
    if ([category isKindOfClass:[MYCategory class]] && category.archived) {
        dict[MYCategoriesIndexKeyDisplayName] = category.displayName;
    }
};

Later, you query archived objects by MYCategoriesIndexKeyDisplayName, restore them by setting archived to NO, and automatically delete from the index. Unfortunately, this is impossible with current implementation.


Here is why:

The most likely you had set archived property later than the object was stored into the YapDatabase. As result, the SecondaryIndex only updates the row literally generating an UPDATE statement for SQLite without any effect in the table. Here is a workaround:

YapDatabaseSecondaryIndexWithObjectBlock block = ^(NSMutableDictionary *dict, NSString *collection, NSString *key, MYCategory *category) {
    if ([category isKindOfClass:MYCategory.class] && category.archived) {
        dict[MYCategoriesIndexKeyArchived] = @(category.archived);
        dict[MYCategoriesIndexKeyDisplayName] = category.displayName;
    }
};

The problem is that you have to keep an index for all categories potentially increasing search time. I believe it would be more effective to allow “partial” indexes and, as result, to reduce the number of rows in the index database table.


Not sure if this is a bug or a feature, so I updated the Wiki page.

Thanks,
Vadim

P. S. Maybe there is another way to solve this use-case?

@mackross
Copy link
Contributor

mackross commented Aug 4, 2014

I'm doing something similar in one of our apps. I ended up using a different collection for archived objects, i.e. "archived/employees". It might not be possible in your implementation but it worked well for my use case.

@shpakovski
Copy link
Author

@mackross Great idea, thank you for confirmation that we are not alone :) I would definitely go this way, but the app supports Sync where “more types of objects and relationship = more problems”.

@robbiehanson
Copy link
Contributor

The most likely you had set archived property later than the object was stored into the YapDatabase.

WOAH !!!!! Stop right there.

I think we might have a very fundamental misunderstanding of how YapDatabase works.
Correct me if I'm wrong, but this is what I'm hearing. You're doing something like this (more or less):

MyCategory *cat1 = [[MyCategory alloc] initWith...];

[databaseConnection readWriteWithTransaction:^(YapDatabaseReadWriteTransaction *transaction){

    [transaction setObject:cat1 forKey:cat1.uuid inCollection:@"categories"];
}];

// And then at some later point in time you do this:
cat1.archived = NO;
// And you expect the database to automatically know that cat1 was changed somehow?
// And that it magically updates itself somehow ?

This would be a very fundamental misunderstanding of YapDatabase. And would likely cause all kinds of things to break.

In particular, it would be treating YapDatabase as if it was Core Data. It would be pretending that MyCategory is a subclass of the non-existent YDBManagedObject class. And it would assume that YapDatabaseConnection is somehow like NSManagedObjectContext.

For a deeper explanation of what I'm alluding to here, please see my blog article Response: CoreData.next.

And if I misunderstood you entirely, please let me know. That happens from time to time. Especially on Friday afternoons. :)

@shpakovski
Copy link
Author

@robbiehanson Ha-ha, got you! :) No worries, please. By “later” I meant setting the archived flag in the cat1.copy inside of the readWrite… transaction, so that the index is updated automagically but as expected.

@robbiehanson
Copy link
Contributor

By “later” I meant setting the archived flag in the cat1.copy inside of the readWrite… transaction

*breathes a sigh of relief :)

The most likely you had set archived property later than the object was stored into the YapDatabase. As result, the SecondaryIndex only updates the row literally generating an UPDATE statement for SQLite without any effect in the table.

IF this is the case, then it's a bug. Here's what should happen:

When you modify the cat1 object (cat1 = [cat1 copy]; cat1.archived = NO; [transaction setObj...]),
then it should hit this method within YapDatabaseSecondaryIndexTransaction:

- (void)handleUpdateObject:(id)object
          forCollectionKey:(YapCollectionKey *)collectionKey
              withMetadata:(id)metadata
                     rowid:(int64_t)rowid;

This method invokes your block, and then has this bit of code at the bottom:

if ([secondaryIndexConnection->blockDict count] == 0)
{
    // Remove associated values from index (if needed).
    // This was an update operation, so the rowid may have previously had values in the index.

    [self removeRowid:rowid];
}
else
{
    // Add values to index (or update them).
    // This was an update operation, so we need to insert or update.

And thus, it should remove the category from the secondaryIndex at this point. So I believe it should support this partial index that you want.

If this isn't what you're seeing, let me know. Or, even better, take a look at that method in YapDatabaseSecondaryIndexTransaction, and tell me what's going wrong.

Again, please let me know if I misunderstood you. Obviously such a thing isn't without precedent.

@shpakovski
Copy link
Author

I assume the “problem” is literally some lines below:

    // Add values to index (or update them).
    // This was an update operation, so we need to insert or update.

    [self addRowid:rowid isNew:NO];
    [secondaryIndexConnection->blockDict removeAllObjects];

Here you pass NO which affects the following:

    if (isNew)
        statement = [secondaryIndexConnection insertStatement];
    else
        statement = [secondaryIndexConnection updateStatement];

Finally, the updateStatement generates an UPDATE SQL query, but there is no row to update in the table because [blockDict count] == 0 removed it previously.

As result, the secondary index will update only rows created in handleInsertObject:forCollectionKey:withMetadata:rowid: where the flag isNew is always YES.

My initial question was: is this a bug or a feature? Thanks in advance :)

@robbiehanson
Copy link
Contributor

Finally, the updateStatement generates an UPDATE SQL query, but there is no row to update in the table because [blockDict count] == 0 removed it previously.

This is a contradiction. If [blockDict count] == 0, then the code in the 'else' statement would not have run. Only the code in the 'if' statement, which would have removed the row, as expected.

I feel like we're going back and forth here without making any progress. I still don't fully understand what the issue is, or what the question is. So let's start fresh, and lay bare the assumptions and expectations.

From a high level perspective, the YDBSecondaryIndex extension is initialized with a block that is expected to fill out values in a dictionary (if needed). And the YDBSecondaryIndex extension is expected it invoke this block anytime an object is inserted or updated within the database. If the block doesn't add any values to the dictionary, then the YDBSecondaryIndex is expected to NOT index that row. Meaning that if an object was previously indexed, but is updated in such a way that the block now returns an empty dictionary for it, then the YDBSecondaryIndex is expected to remove the row from it's secondary index table.

Now let's run through the scenarios:

(Terminology: "Inserted" == added for the first time, "Updated" == modifying an object that previously existed in the database.)

  1. Category Inserted, archived is NO => Block returns empty dictionary, YDBSecondaryIndex ignores row
  2. Category Inserted, archived is YES => Block returns non-empty dictionary, YDBSecondaryIndex inserts values for row
  3. Category Updated, archived toggled from NO to YES => Block returns non-empty dictionary, YDBSecondaryIndex inserts values for row
  4. Category Updated, archived toggled from YES to NO => Block returns empty dictionary, YDBSecondaryIndex removes all values for row

These are the expectations. From what I understand, you seem to be indicating a problem with scenario 4. And you mentioned that you have code more-or-less like this:

[databaseConnection readWriteWithTransaction:^(YapDatabaseReadWriteTransaction *transaction){

    MYCategory *category = [transaction objectForKey:catID inCollection:@"categories"];
    category = [category copy];
    category.archived = NO;

    [transaction setObject:category forKey:catID inCollection:@"categories"]; // point of update
}];

And you originally used a block like this:

YapDatabaseSecondaryIndexWithObjectBlock block = ^(NSMutableDictionary *dict, NSString *collection, NSString *key, MYCategory *category) {
    if ([category isKindOfClass:[MYCategory class]] && category.archived) {
        dict[MYCategoriesIndexKeyDisplayName] = category.displayName;
    }
};

Thus we expect that the updated category will be removed from the secondary index. And we can step through the code as follows:

  • At the "point of update", the database architecture will invoke [YapDatabaseSecondaryIndexTransaction handleUpdateObject:forCollectionKey:withMetadata:rowid:]
  • Within this method, the YDBSecondaryIndex will invoke your block, and will pass it the updated MYCategory object
  • The block will return an empty dictionary
  • As per the code at the bottom of that method, the 'if' statement block will be executed (not the 'else' statement block).
  • The removeRowid method will properly remove the value(s) from the secondary index.

Is the code not executing these steps properly? If so, could you step through the code in the debugger, and let me know where it is doing the wrong thing?

@shpakovski
Copy link
Author

Thanks for patience Robbie, I much appreciate. We are almost there!

From what I understand, you seem to be indicating a problem with scenario 4.

No. I see the problem with scenario 3:

Category Updated, archived toggled from NO to YES => Block returns non-empty dictionary, YDBSecondaryIndex inserts values for row.

From my experience, YDBSecondaryIndex does not insert values for row.

[databaseConnection readWriteWithTransaction:^(YapDatabaseReadWriteTransaction *transaction){

    MYCategory *category = [transaction objectForKey:catID inCollection:@"categories"];
    category = [category copy];
    category.archived = YES;

    [transaction setObject:category forKey:catID inCollection:@"categories"]; // point of update
}];

This code will not insert the row because handleUpdateObject:forCollectionKey:withMetadata:rowid: always generates UPDATE SQL statement which does nothing (at least in iOS 8) if the row with the given rowId is not yet in the table.

P. S. I am sorry for the initial example. It should have been more refined.

robbiehanson added a commit that referenced this issue Aug 12, 2014
@robbiehanson
Copy link
Contributor

This code will not insert the row because handleUpdateObject:forCollectionKey:withMetadata:rowid: always generates UPDATE SQL statement which does nothing (at least in iOS 8) if the row with the given rowId is not yet in the table.

You nailed it. Unit test added. Bug fixed.

@shpakovski
Copy link
Author

So this was a bug after all. Thanks a ton!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants