Stepchain: sort out duplicate outputModules #6787

Closed
amaltaro opened this issue Apr 8, 2016 · 12 comments

amaltaro commented Apr 8, 2016

As StepChain stands now, it does not support KeepOutput:True for different steps that use exactly the same outputModule (no matter which datatier is used). The problem is that merge jobs are created for all files produced under a specific outputModule (including all datatiers).

@amaltaro amaltaro added this to the WMAgent1604 milestone Apr 8, 2016
@amaltaro amaltaro self-assigned this Apr 8, 2016

hufnagel commented Apr 8, 2016

How can different steps have exactly the same output?

amaltaro commented Apr 8, 2016

Not the same output, the same output module (e.g. RAWSIMoutput). This template reproduces the problem:
https://github.com/dmwm/WMCore/blob/master/test/data/ReqMgr/requests/DMWM/StepChain_MC.json

hufnagel commented Apr 8, 2016

Ah, so it's a naming issue in the CMSSW configuration?

hufnagel commented Apr 8, 2016

Well, WMCore reuses the names internally. Can we somehow attach the step name in the identifiers WMCore uses ? Only other solution I see is to tell people that configs in stepchains need unique output identifier across all steps, which I don't think will fly.

amaltaro commented Apr 8, 2016

No, it's a problem on our end. I still have to investigate it further, but here is a quick explanation of what happened in my tests:

  • The wmbs subscriptions are created for Task+outputModule (remember, stepchains have only one processing/production task)
  • Step1 produces output files "for" RAWSIMoutput, with GEN-SIM tier
  • Step2 produces output files "for" RAWSIMoutput too, but with the GEN-SIM-RAW tier
  • The agent then creates merge job(s) for that subscription, and the files available for it belong to both the GEN-SIM and GEN-SIM-RAW tiers (!! crash !!)
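The sequence above can be sketched in a few lines. This is only an illustration of the grouping problem, not WMCore code: the record layout, the file names, and the helper names `group_for_merge` and `mixed_tier_groups` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical records: (step, output module, datatier, file name).
produced = [
    ("Step1", "RAWSIMoutput", "GEN-SIM", "step1_file.root"),
    ("Step2", "RAWSIMoutput", "GEN-SIM-RAW", "step2_file.root"),
]


def group_for_merge(task, records):
    """Group files by (task, output module), ignoring step and datatier."""
    groups = defaultdict(list)
    for _step, module, tier, name in records:
        groups[(task, module)].append((tier, name))
    return dict(groups)


def mixed_tier_groups(groups):
    """Return the group keys whose files span more than one datatier."""
    return [
        key
        for key, files in groups.items()
        if len({tier for tier, _name in files}) > 1
    ]


groups = group_for_merge("GENSIM", produced)
# Both files land in the single (task, output module) group,
# so one merge job would see two different datatiers.
print(mixed_tier_groups(groups))
```

Because the key carries neither the step nor the datatier, nothing keeps the Step1 and Step2 files apart.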

hufnagel commented Apr 8, 2016

Ok, so the agent only used the name of an output module, as I expected. Solutions still stand. Either we change that to also take into account step name or we enforce unique output identifier across all steps.
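A rough sketch of the two options, assuming a StepChain can be described as a mapping from step name to output module names. The mapping, the `Step:module` identifier format, and both helper names are hypothetical and do not come from WMCore.

```python
from collections import Counter

# Hypothetical StepChain description: step name -> output module names.
steps = {
    "Step1": ["RAWSIMoutput"],
    "Step2": ["RAWSIMoutput"],
    "Step3": ["AODSIMoutput", "RECOSIMoutput"],
}


def duplicated_modules(steps):
    """Option 2: find output module names reused across steps, to reject them."""
    counts = Counter(m for modules in steps.values() for m in modules)
    return sorted(m for m, n in counts.items() if n > 1)


def qualified_ids(steps):
    """Option 1: make identifiers unique by prefixing the step name."""
    return [
        f"{step}:{module}"
        for step, modules in steps.items()
        for module in modules
    ]


print(duplicated_modules(steps))  # module names that a uniqueness check would flag
print(qualified_ids(steps))       # step-qualified identifiers, one per output
```

The first helper pushes the burden onto whoever writes the configs, while the second keeps the configs untouched and changes only the internal identifiers.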

amaltaro commented Apr 8, 2016

> Can we somehow attach the step name in the identifiers WMCore uses ?

If I could answer this, I would have a fix, which I don't yet :-)

> Only other solution I see is to tell people that configs in stepchains need unique output identifier across all steps, which I don't think will fly.

Nope, it did not fly! Folks have already tagged StepChain as incomplete...

vlimant commented Nov 5, 2016

Maybe you want to lower this from high priority, as it was agreed that we will not get this working.

amaltaro commented May 2, 2017

@amaltaro amaltaro added this to the WMAgent1705 milestone May 2, 2017

amaltaro commented Jul 6, 2017

And for my information, the reason files from different datatiers ended up in the same merge job is that they were associated with the same fileset, since the fileset follows the taskName + outputModule convention, like

['/TestWorkload/GENSIM/DIGIMergeRAWSIMoutput/merged-Merged',
 '/TestWorkload/GENSIM/DIGIMergeRAWSIMoutput/merged-logArchive',
 '/TestWorkload/GENSIM/GENSIMMergeRAWSIMoutput/merged-Merged',
 '/TestWorkload/GENSIM/GENSIMMergeRAWSIMoutput/merged-logArchive',
 '/TestWorkload/GENSIM/RECOMergeAODSIMoutput/merged-Merged',
 '/TestWorkload/GENSIM/RECOMergeAODSIMoutput/merged-logArchive',
 '/TestWorkload/GENSIM/RECOMergeRECOSIMoutput/merged-Merged',
 '/TestWorkload/GENSIM/RECOMergeRECOSIMoutput/merged-logArchive',
 '/TestWorkload/GENSIM/unmerged-AODSIMoutput',
 '/TestWorkload/GENSIM/unmerged-RAWSIMoutput',
 '/TestWorkload/GENSIM/unmerged-RECOSIMoutput',
 '/TestWorkload/GENSIM/unmerged-logArchive',
 'TestWorkload-GENSIM-1c749bf27f4b19a3b443e91e3cf25363']

where we - of course - cannot create two:
'/TestWorkload/GENSIM/unmerged-RAWSIMoutput',

Thus the only way forward is changing how the fileset name is built. I have an "almost working version" which also adds the datatier to the fileset name.

Another possibility would be adding the cmsRun step (cmsRun1, cmsRun2); adding the step name would be complicated because that's only available for StepChains...
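To make the two naming options concrete, here is a minimal sketch. The `fileset_name` helper and the way the suffix is appended are assumptions for illustration only; the actual format used in WMCore may differ.

```python
def fileset_name(workload, task, module, suffix=""):
    """Build an unmerged fileset name; `suffix` is a hypothetical extra part."""
    return f"/{workload}/{task}/unmerged-{module}{suffix}"


# Hypothetical outputs of a two-step chain: (cmsRun step, module, datatier).
outputs = [
    ("cmsRun1", "RAWSIMoutput", "GEN-SIM"),
    ("cmsRun2", "RAWSIMoutput", "GEN-SIM-RAW"),
]

# Current convention: both outputs collapse into a single fileset name.
current = {fileset_name("TestWorkload", "GENSIM", m) for _, m, _ in outputs}

# Option A: append the datatier to the name.
with_tier = {fileset_name("TestWorkload", "GENSIM", m, t) for _, m, t in outputs}

# Option B: append the cmsRun step to the name.
with_step = {fileset_name("TestWorkload", "GENSIM", m, s) for s, m, _ in outputs}

print(len(current), len(with_tier), len(with_step))
```

Either suffix is enough to separate the two outputs in this example; the datatier option has the advantage of not depending on anything specific to StepChains.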

@ticoann @ericvaandering if you have a strong opinion about either of those, please let me know before it's too late :)

@ericvaandering

Fileset names are just arbitrary placeholders in WMBS, right? From what I've seen on what I'm working on, if you don't name them they just get a UUID.

I don't understand how adding the data tier helps. It's already there in GENSIM, right?

amaltaro commented Jul 6, 2017

Not really, filesets are named after workflow + task name + (merged or unmerged) + output module.
Only the top level fileset is named with a UUID (if there is no input data), otherwise the block name is appended to it (or the dataset?)

Then this fileset is mapped back in JobAccountant for the proper file accounting:
https://github.com/dmwm/WMCore/pull/6835/files#diff-f0ead693808668dc1e56aa1f2d92f676R532

GENSIM in the example above is just my task name (actually, StepName for the first step (Step1)).
