Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grove missing 1Password Events Due to Eventual Consistency #64

Open
brian-avery-angi opened this issue Nov 6, 2024 · 7 comments
Open
Assignees
Labels
bug Something isn't working

Comments

@brian-avery-angi
Copy link

First, thanks for implementing Grove, it's an awesome tool.

The 1Password connectors (item usage, audit, signin attempts) all use the item timestamp as the POINTER_PATH in Grove. This catches a lot of events, but in our testing, we're finding that it's missing some item usage events.

1Password operates on an eventual consistency model and suggests that you use a reset cursor including the timestamp for the initial request and then the cursor past the initial request instead to avoid missing data (https://developer.1password.com/docs/events-api/reference/#cursor). Looking at the Grove code, the cursor is used within the same request, but when Grove exits and restarts, it uses the timestamp and only requests events past the date, missing events that took longer to come in.

I've been reviewing the code and am seeing three issues so far. :

  • Grove's code seems to expect pointer to be a timestamp. Would there to be any issues using a cursor (example from 1Password's docs: aGVsbG8hIGlzIGl0IG1lIHlvdSBhcmUgbG9va2luZyBmb3IK) as the pointer instead of a timestamp?
  • The connector save method seems to expect an object to be within the JSON of an item. However, the cursor is at the top level of the JSON. I'm thinking this could be copied to the items so it can be accessed via the JMESPath search, but this feels like a hack.
@hcpadkins
Copy link
Contributor

Hey there,

Thank you very much for the detailed report, and suggestions to resolve this issue! We'll do some investigation into this further, and I'll get back to you around a fix soon.

@brian-avery-angi
Copy link
Author

Thanks! I'm happy to submit a fix (and I've been working on one), just want to know if there's an advised way to address those two concerns.

@hcpadkins
Copy link
Contributor

Ah that's even better, thank you for the help!

In terms of the first approach, that shouldn't be an issue. We intentionally didn't want to assume that pointers were time-stamps, as we thought that we'd eventually encounter an API which uses something other than time as a marker. The only side effect would be a migration path / steps required for any users currently using the connector, as any stored pointers in the cache would no longer be valid.

In terms of the second, depending on the structure, usually we attempt to decouple pagination cursors from pointers - as they usually serve two distinct but related purposes.

Previously, we have handles this by using a Grove internal type called AuditLogEntries to return a page of results from an API operation performed by a client, as well as the cursor for the next page. This allows the list of results to be passed off to save, and the cursor - if set - used to perform subsequent requests page by page. It also helps to keep the specific API related concerns out of the connector event loop, and in the API client.

As an example, this is how we're performing this operation in the Github connector.

Please let me know if you have any questions, or concerns, and thank you again for this issue and for the help with the fix!

@hcpadkins hcpadkins self-assigned this Nov 11, 2024
@hcpadkins hcpadkins added the bug Something isn't working label Nov 11, 2024
@brian-avery-angi
Copy link
Author

Thanks and awesome! I'll see if I can recognize the original value as a timestamp and we can seed on that.

Looking at 1Password (https://developer.1password.com/docs/events-api/reference/#post-apiv1itemusages), I see:

{
	"cursor": "aGVsbG8hIGlzIGl0IG1lIHlvdSBhcmUgbG9va2luZyBmb3IK",
	"has_more": true
	"items": [
		{
			"uuid": "56YE2TYN2VFYRLNSHKPW5NVT5E",
			"timestamp": "2023-03-15T16:33:50-03:00",
			"used_version": 0,
			"vault_uuid": "VZSYVT2LGHTBWBQGUJAIZVRABM",
			"item_uuid": "SDGD3I4AJYO6RMHRK8DYVNFIDZ",
			"user": {
				"uuid": "4HCGRGYCTRQFBMGVEGTABYDU2V",
				"name": "Wendy Appleseed",
				"email": "wendy_appleseed@agilebits.com"
			},
			"client": {
				"app_name": "1Password Browser",
				"app_version": "20240",
				"platform_name": "Chrome",
				"platform_version": "string",
				"os_name": "MacOSX",
				"os_version": "13.2",
				"ip_address": "192.0.2.254"
			},
			"location": {
				"country": "Canada",
				"region": "Ontario",
				"city": "Toronto",
				"latitude": 43.5991,
				"longitude": -79.4988
			},
			"action": "secure-copy"
		}
	]
}

I think the save function gets the pointer to save here (

newest = jmespath.search(self.POINTER_PATH, entries[-1])
). This looks at a JMEPath newest = jmespath.search(self.POINTER_PATH, entries[-1]) and the value for entries is returned from 1Password here --
return AuditLogEntries(cursor=cursor, entries=result.body.get("items", []))

So the question I was trying to ask is what's the best way to get the cursor (outside of the items array) accessible from inside of the items array when the save function tries to access it. My current approach is to copy it into each item.

@hcpadkins
Copy link
Contributor

hcpadkins commented Nov 12, 2024

Ah sorry for the misunderstanding! I've just managed to grok what the API is saying this morning, and I think I see what you mean.

As I understand it, when using a reset cursor a datestamp is only ever provided on the first interaction with the API - in the Reset Cursor request. After this request, all subsequent collections would use the cursor returned on the root of the response from the last page of the last collection, which will not be present in the log records submitted to save as its outside of the element currently passed.

I'm having a look through the code now to see if there's any escape hatches we can use rather than having to copy the cursor into the data returned by the API. I'll get back to you shortly!

@brian-avery-angi
Copy link
Author

Awesome. Thank you! Yeah, that's what I was seeing with this issue. Currently this is dependent on time for all requests whereas it should only be dependent on time for the first interaction.

@hcpadkins
Copy link
Contributor

hcpadkins commented Nov 13, 2024

I've been thinking on this one, and unfortunately, for now copying the cursor to each log record entry is likely the most appropriate way forward to unblock this implementation.

A longer term fix would likely be to introduce an additional and optional DATA_PATH to be used in conjunction with POINTER_PATH. This would allow handling these rare, but rather annoying, edge cases where the pagination cursor is also used as a pointer.

However, implementing is likely to require quite a few changes to existing connectors and clients, as well as a potentially breaking API change.

One outstanding question is how long the cursors are valid for in the 1Password API. It seems unlikely that they would be valid forever, but I'm unable to find any documentation around whether they are only valid for N minutes, hours, or otherwise. It might require some experimentation, but in the mean time, I'll see about sending a message across to our account contact at 1Password to confirm the expected behavior.

I'll think on whether there is a nicer way to achieve this functionality within the core without such a wide change, but for now, copying the pointer into each log record is likely the best way to resolve this issue and get this shipped.

Thank you again for your help, and your patience while we've been digging further into this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants