Support writing to multiple S3 buckets #353

Open
zerebubuth opened this issue Oct 18, 2018 · 1 comment

@zerebubuth
Member

In order to be resilient to region failure, and to get tiles closer to the clients downloading them, it would be helpful to be able to write tiles into multiple buckets.

It would be simple to support a new type of store, perhaps called "multis3" or just selected when name is a list, which wraps a list of S3 stores, writes to them in order from first to last, and reads from the last.
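A minimal sketch of what such a wrapper might look like. The `read_tile` / `write_tile` method names and the `InMemoryStore` stand-in are assumptions made for illustration, not the project's actual store interface; a real version would wrap S3-backed stores instead.

```python
class InMemoryStore:
    """Hypothetical stand-in for a single S3-backed store."""

    def __init__(self):
        self.tiles = {}

    def write_tile(self, key, data):
        self.tiles[key] = data

    def read_tile(self, key):
        # Returns None when the tile is absent.
        return self.tiles.get(key)


class MultiStore:
    """Writes to every wrapped store in order, reads only from the last."""

    def __init__(self, stores):
        if not stores:
            raise ValueError("MultiStore needs at least one store")
        self.stores = list(stores)

    def write_tile(self, key, data):
        # First to last: if a write raises part-way through, the last
        # store is still missing the tile.
        for store in self.stores:
            store.write_tile(key, data)

    def read_tile(self, key):
        # The last store is only populated after all earlier writes
        # succeeded, so its presence implies the tile is everywhere.
        return self.stores[-1].read_tile(key)
```

Because writes go in order, a tile that reached only the first bucket is invisible to `read_tile`, so it would simply be rendered and written again.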

Reading from the last bucket is important so that get-before-put doesn't treat a tile that was written to only some of the buckets as okay. Alternatively, if we want to get more complex, we could do read repair by:

  1. Read from the first bucket; if there is no tile, return None.
  2. Read from the second through last buckets; wherever the tile is missing, copy it over from the first bucket.
  3. Return the tile.
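The read-repair steps above could be sketched roughly as follows. The `DictStore` class and the `read_tile` / `write_tile` names are hypothetical placeholders for whatever the real S3 store exposes.

```python
class DictStore:
    """Hypothetical stand-in for a single S3-backed store."""

    def __init__(self, tiles=None):
        self.tiles = dict(tiles or {})

    def write_tile(self, key, data):
        self.tiles[key] = data

    def read_tile(self, key):
        # Returns None when the tile is absent.
        return self.tiles.get(key)


def read_with_repair(stores, key):
    """Read from the first store and backfill any later store missing the tile."""
    # Step 1: the first store is the source of truth.
    tile = stores[0].read_tile(key)
    if tile is None:
        return None

    # Step 2: copy the tile into every later store that lacks it.
    for store in stores[1:]:
        if store.read_tile(key) is None:
            store.write_tile(key, tile)

    # Step 3: every store now holds the tile.
    return tile
```

Note the cost: every read touches every bucket, which is part of the extra complexity weighed below.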

I'm not sure whether this is worthwhile - it's a lot of extra complexity to save the work of re-rendering the tile. My feeling is that, while for some expensive tiles that would be worthwhile, the majority of tiles are so cheap to re-render that it's not worth the read repair...?

@rmarianski
Member

> I'm not sure whether this is worthwhile

I haven't thought through the details, but my gut reaction is to just do the simplest thing, which sounds like it's just doing the get-before-put check against the last location. IIRC we have some retries built in, i.e. if a write fails we will try n times before giving up. I'd imagine that in practice this should cover nearly all common failures, and if there's an edge case it should be fine to re-render the odd tile. But I don't feel strongly about this, and can see the argument for optimizations too :)
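For illustration, the simple approach might look something like the sketch below: retry each write a few times, and let get-before-put consult only the last location. Everything here (`FlakyStore`, `write_with_retries`, `needs_render`, and the `attempts` default) is a hypothetical name chosen for the example and does not reflect how the existing retry logic is actually implemented.

```python
import time


class FlakyStore:
    """Hypothetical store whose first `failures` writes raise IOError."""

    def __init__(self, failures=0):
        self.failures = failures
        self.tiles = {}

    def write_tile(self, key, data):
        if self.failures > 0:
            self.failures -= 1
            raise IOError("simulated transient write failure")
        self.tiles[key] = data

    def read_tile(self, key):
        return self.tiles.get(key)


def write_with_retries(store, key, data, attempts=3, delay=0.0):
    """Try the write up to `attempts` times, re-raising the final failure."""
    for attempt in range(1, attempts + 1):
        try:
            store.write_tile(key, data)
            return
        except IOError:
            if attempt == attempts:
                raise
            time.sleep(delay)


def needs_render(stores, key):
    """Get-before-put: only the last location is checked."""
    return stores[-1].read_tile(key) is None
```

If a write still fails after all attempts, the last location stays empty, `needs_render` stays true, and the tile is re-rendered on the next pass.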
