Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move tiles of interest set out of Redis and into S3 #193

Merged
merged 18 commits into from
May 1, 2017

Conversation

iandees
Copy link
Member

@iandees iandees commented Apr 25, 2017

Moves the tiles of interest set out of Redis and into S3.

logger.info('Enqueueing to SQS ... done')

if cfg.seed_should_add_to_tiles_of_interest:
logger.info('Adding to Redis ... ')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the PR title, is Redis still needed here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a terminology problem (should have said "Tiles of Interest" not "Redis").

Fixed in 7a52e76

@nvkelso
Copy link
Member

nvkelso commented Apr 26, 2017

Do we want to allow read/write to local disk for the TOI (not just S3)?

@iandees
Copy link
Member Author

iandees commented Apr 26, 2017

This isn't ready for review yet. I still have lots to go :) These (and other things that Rob brought up with me) are all good points.

@nvkelso nvkelso changed the title Move tiles of interest set out of Redis and into S3 WIP: Move tiles of interest set out of Redis and into S3 Apr 26, 2017
@nvkelso
Copy link
Member

nvkelso commented Apr 26, 2017

Woot! I've added WIP: to the PR title to make that clear.

@iandees
Copy link
Member Author

iandees commented Apr 28, 2017

This is just about done, with the exception of a couple things:

  • Tests are failing: It's not clear to me how the configuration tests work with an empty config. if self.yml['toi-store'] fails because of a KeyError, why doesn't self.yml['tiles']['seed'] also fail right above it? Is there a default config that's getting used somewhere?
  • The serializing/deserializing code assumes a tiles of interest set that contains Coordinate objects, whereas it contained coord_ints before. Since we're storing as strings anyway, I thought it'd be nicer to end up with the standard Coordinate object so we didn't have to do conversion back and forth. There might be too much work to double-check the existing assumptions (and the memory hit might be too large) to make this happen, though. (Not going to do this for now as it adds too much complexity to this PR)

@zerebubuth
Copy link
Member

Is there a default config that's getting used somewhere?

Yup! There's a default config function, and the result of that is merged into the config or replaces an empty config.

That threw me, the first time I ran into it. Any way we can make it more obvious what's going on?

@rmarianski
Copy link
Member

The history with the configuration is that we used to, a long time ago in a galaxy far far away, support command line switches for each of the configuration values. The idea was that you could pass in a flag for a command, and have that trump configuration values, which would trump the default settings. The "merge" idea came from combining the input from all 3 of these sources, and providing a single point of access for the rest of the code, which is what the "Configuration" object is.

Since then, we don't have command line switches for anything except where the config file is, which we could manage more directly. We could gut out most of what's there, it's just that there wasn't much motivation to do so imho. If it's confusing to maintain though, let's go ahead and do it.

@iandees
Copy link
Member Author

iandees commented Apr 28, 2017

Thanks for the pointers. It's likely ok as-is but I was going straight to the Configuration constructor and not seeing the call to default_yml_config().

Since this "default config" can't actually run any of the tilequeue commands (so it's more of "null object" pattern than a default), does it make more sense to adjust the Configuration constructor so that it is resilient to having missing fields in the yaml?

@rmarianski
Copy link
Member

does it make more sense to adjust the Configuration constructor so that it is resilient to having missing fields in the yaml?

Sure, although what can we do, if not provide default values? If it's clearer to just inline those defaults rather than have a separate section for it, that sounds fine to me.

@iandees
Copy link
Member Author

iandees commented Apr 28, 2017

If it's clearer to just inline those defaults rather than have a separate section for it, that sounds fine to me

Yea, maybe that's what I was describing. Regardless, I don't think it needs to happen in this PR.

@iandees iandees changed the title WIP: Move tiles of interest set out of Redis and into S3 Move tiles of interest set out of Redis and into S3 Apr 28, 2017
Copy link
Member

@rmarianski rmarianski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 👍

Since there's new config for it, mind updating the sample config file just to see what all the possible values could be there?

s3:
# The bucket and key for the s3 object that will contain the tiles of interest
bucket: test_bucket
key: toi/toi.txt.gz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks!

I like that you namespaced the configuration based on type. I wish I had done it that way for the other configuration sections instead of flattening it out, feels cleaner.

"""
logger = make_logger(cfg, 'load_tiles_of_interest')
logger.info('Loading tiles of interest')
def tilequeue_dump_tiles_of_interest_from_redis(cfg, peripherals):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is here as a 1 time convienance for those migrating from earlier configurations, but we don't expect it to run more than once?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I'll probably put together a PR after this one to remove this bit of code altogether.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment explaining this bit (like you've done for tilequeue_load_tiles_of_interest below).

Copy link
Member

@nvkelso nvkelso May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, please file an issue (and the comment) so we don't loose track of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in d04ca1d.

@iandees iandees merged commit 5f5e214 into master May 1, 2017
@iandees iandees deleted the iandees/tear-the-toi-from-redis branch May 1, 2017 23:28
@iandees iandees removed the in review label May 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants