Enable ImSim to run from Sky rather than/in addition to Instance Catalogs #222

Closed
danielsf opened this issue Jun 26, 2019 · 4 comments

@danielsf
Contributor

Given the resources (compute, storage, and personnel hours) devoted to InstanceCatalog generation during DC2, I would like to explore the possibility of allowing ImSim to run directly on the object catalogs (the equivalent of what cosmoDC2 was for DC2), without having to provide separate, externally generated InstanceCatalogs. My concerns with relying on InstanceCatalogs are as follows:

  • The way InstanceCatalogs work, for each object in each pointing you provide position, brightness, and shape information. This means that you end up producing one row per source per epoch, even for perfectly static sources. This seems inefficient. I would like to entertain a scheme in which ImSim queries the object catalogs for the usual information (position, brightness, shape) for static sources, also accepts variability parameters for variable/moving sources, and is smart enough to convert those variability parameters into the proper phase for a light curve. Presumably, we could use a similar scheme for our supernova model: rather than providing separate SED text files for every supernova at every stage in its evolution, we could feed the SALT2 parameters directly to ImSim and have ImSim convert them into the brightness and color information it needs to produce an image.

  • At several points during DC2 production, we ran up against limits in disk space and inode allocation when generating InstanceCatalogs. It is possible that the coming transition to binary InstanceCatalogs will make the question of storage a non-issue. The inode problem will remain, though.

  • InstanceCatalog generation is just one more step at which we are vulnerable to over-specialization (the fact that only one or two people know how to run each step in our pipeline). The fewer steps there are to image simulation with ImSim, the less likely we will be to become dependent on having the right set of people free when we need to burn CPU hours RIGHT NOW (as we did at the end of 2018).
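
To make the first point concrete, here is a minimal sketch of storing each source once and deriving its brightness at a given epoch from variability parameters. It is not ImSim or CatSim code: the `SkyObject` fields, the function names, and the simple sinusoidal light curve are all assumptions chosen for illustration, and a real variability model would be more involved.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SkyObject:
    """One catalog entry per source (hypothetical schema, not ImSim's)."""
    object_id: int
    ra_deg: float
    dec_deg: float
    mean_mag: float
    # Variability parameters; period_days is None for a static source.
    period_days: Optional[float] = None
    amplitude_mag: float = 0.0
    t0_mjd: float = 0.0


def phase_at(obj: SkyObject, mjd: float) -> float:
    """Return the light-curve phase in [0, 1) at the given epoch."""
    if obj.period_days is None:
        return 0.0
    return ((mjd - obj.t0_mjd) / obj.period_days) % 1.0


def magnitude_at(obj: SkyObject, mjd: float) -> float:
    """Return the magnitude at an epoch, assuming a sinusoidal light curve."""
    if obj.period_days is None:
        # Static sources need no per-epoch row: the stored value is reused.
        return obj.mean_mag
    return obj.mean_mag + obj.amplitude_mag * math.sin(2.0 * math.pi * phase_at(obj, mjd))
```

With this layout a static source occupies one row no matter how many epochs are simulated, and a variable source adds only a few parameters to that row.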

After off-line conversations with @jchiang87 and @cwwalter, I believe we are converging on a system in which ImSim still uses something like an InstanceCatalog, but in which ImSim also provides the tools to convert "object catalogs" (Jim's term for the cosmoDC2+stars+variability catalogs that were inputs to InstanceCatalogs in DC2) into InstanceCatalogs. Users who want to run ImSim on NERSC can just chain the InstanceCatalog generation and image generation steps together into one batch job. Users who want to run remotely can generate their InstanceCatalogs at NERSC (or wherever the object catalogs live) and export them to their remote systems where they will run ImSim. This scheme will make ImSim effectively independent of CatSim. This means we will need to address how to provide the variability models and astrometric models for the Earth's motion that are currently provided by CatSim.

I am opening this issue so that the broader community can comment on this strategy.

@cwwalter changed the title from "Enable ImSim to run without InstanceCatalogs" to "Enable ImSim to run from Sky rather than/in addition to InstanceCatalogs" Jun 26, 2019
@cwwalter
Member

I've changed the title to 'Enable ImSim to run from Sky rather than/in addition to InstanceCatalogs' to be a little more explicit.

Here is also a bit more background information on what we are thinking about (more summaries of the discussion with @jchiang87 and @danielsf).

As part of the imSim redesign we are considering a new data product (which we are currently calling Sky Catalogs) that would be produced by the CS team from CosmoDCx + stars etc. These data file(s) will contain descriptions of all of the objects we would want to simulate on the sky. They will be split up into sky chunks which cover only what we want to simulate. They will have all of the information for the objects that we want to simulate, critically including the proper motion and transient variability parameters.

This way, rather than having the objects on the same piece of sky repeated over and over again in the instance catalogs, they will be in one place and there will only be one entry for each object, including anything we need to know for temporal variability. These sky catalog pieces can be distributed to remote sites. A pared-down OpSim database file can then be used to drive the imSim runs.

imSim will have the option of either producing instance catalogs from the sky catalog (which can then be run) or running directly on the sky catalog itself. Produced instance catalogs can be used to check the inputs used and also for running in lighter-weight environments with resource limitations. Users will also be able to continue to create and run with instance catalogs for non-DC-scale simulation work where they have made the instance catalogs by hand or with some other program.
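
A rough sketch of the two modes, under stated assumptions: the sky catalog is modeled as a list of dicts with hypothetical keys (`id`, `ra`, `dec`, `mag`), a pointing is a dict with a center and a field-of-view radius in degrees, and the row layout written by `to_instance_rows` is a simplified placeholder rather than the real instance catalog format. None of these names come from imSim.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, via the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2.0) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2.0) ** 2)
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def objects_in_pointing(sky_catalog, pointing):
    """Mode 1: iterate directly over sky catalog entries inside the field of view."""
    for obj in sky_catalog:
        sep = angular_separation_deg(obj["ra"], obj["dec"],
                                     pointing["ra"], pointing["dec"])
        if sep <= pointing["radius"]:
            yield obj


def to_instance_rows(sky_catalog, pointing):
    """Mode 2: materialize per-pointing rows (placeholder layout) for export."""
    return [
        f'{obj["id"]} {obj["ra"]:.6f} {obj["dec"]:.6f} {obj["mag"]:.3f}'
        for obj in objects_in_pointing(sky_catalog, pointing)
    ]
```

In this picture the sky catalog is read once per pointing and never duplicated on disk; writing out the rows from `to_instance_rows` is an optional step for users who want a portable per-visit file.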

We imagine it likely that the sky catalogs will be produced and saved for DCx runs but that instance catalogs won't be produced. This should reduce disk usage. Once the sky catalogs are made, the CS group's job will be done and imSim can run independently of it. As Scott mentions above, we need to consider how to properly treat the variability and astrometric models currently in CatSim.

@jchiang87
Collaborator

@danielsf @cwwalter It's probably worth having a detailed document of the redesign we're proposing here so that there is something more coherent than a github issue to use as a reference later on. We can certainly continue to gather input in this issue (and even later, as comments on a google doc), but once things have crystallized, we should write up the final design.

@cwwalter
Member

cwwalter commented Jun 26, 2019

OK, sounds good. I've started setting up some general infrastructure for the redesign planning and was planning to use the wiki here, in addition to issues, to host the information. (Note the new "Redesign" label on this issue, BTW.)

@cwwalter changed the title from "Enable ImSim to run from Sky rather than/in addition to InstanceCatalogs" to "Enable ImSim to run from Sky rather than/in addition to Instance Catalogs" Jun 26, 2019
@cwwalter
Member

cwwalter commented Mar 9, 2020

See #71.

@cwwalter closed this as completed Mar 9, 2020