JobSplitting Algorithms
ticoann edited this page Apr 13, 2017
·
7 revisions
Job splitting sizes each job so that its run time makes optimal use of resources (about 8 hours by default). Several parameters are used to estimate the approximate job length, most importantly TimePerEvent.
The main parameter used for job splitting is "events_per_job" (in the splitting algorithm). It is set in the spec (EventsPerJob) / splitting algorithm. If "events_per_job" is specified, "TimePerEvent" is ignored for job splitting and is only used to estimate the job time. If "events_per_job" is not set, it is calculated from "TimePerEvent":
events_per_job = int((8.0 * 3600.0) / timePerEvent)
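As a minimal sketch of the rule above, the following Python function returns an explicit events_per_job when one is given and otherwise derives it from TimePerEvent. The function name `resolve_events_per_job` and its argument names are hypothetical, and TimePerEvent is assumed to be in seconds, as the factor of 3600 in the formula suggests.

```python
# Target job length from the text: about 8 hours by default.
DEFAULT_JOB_LENGTH_SECONDS = 8.0 * 3600.0


def resolve_events_per_job(time_per_event, events_per_job=None):
    """Return the events_per_job value used for splitting.

    Hypothetical helper: an explicit events_per_job wins; otherwise the
    value is derived from time_per_event (assumed to be in seconds).
    """
    if events_per_job is not None:
        # TimePerEvent is ignored for splitting when events_per_job is set.
        return events_per_job
    return int(DEFAULT_JOB_LENGTH_SECONDS / time_per_event)


if __name__ == "__main__":
    print(resolve_events_per_job(2.0))       # derived from TimePerEvent
    print(resolve_events_per_job(2.0, 500))  # explicit value is kept
```

With a TimePerEvent of 2 seconds the derived value is 14400 events per job, while an explicit value such as 500 is returned unchanged.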
EventAwareLumiBased algorithm (code)
- It converts events_per_job to lumis_per_job, then creates jobs by iterating through the files at the same location.
- In case a job cannot span multiple input files ("halt_job_on_file_boundaries == True"):
For each file f that contains events, lumisPerJob is calculated as follows:
f['avgEvtsPerLumi'] = round(float(f['events'])/f['lumiCount'])
lumisPerJob = events_per_job / f['avgEvtsPerLumi']
If the file has 0 events, lumisPerJob = f['lumiCount'].
- In case a job can span multiple input files:
Add more than one file until the number of events in the job reaches events_per_job (events_per_job is also converted to lumisPerJob here, [(code)](https://github.com/dmwm/WMCore/blob/1.1.3.pre2/src/python/WMCore/JobSplitting/EventAwareLumiBased.py#L182)).
When lumisInJob reaches lumisPerJob, create one job. (code)
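The two cases above can be sketched roughly as follows. This is a simplified illustration, not the WMCore implementation: the function names, the `lfn` and `lumis` keys, and the representation of a job as a list of (file, lumi) pairs are assumptions made for the example. Only `events`, `lumiCount`, `avgEvtsPerLumi`, `events_per_job` and `halt_job_on_file_boundaries` come from the text. The guard for an average that rounds to zero is also an assumption.

```python
def lumis_per_job_for_file(f, events_per_job):
    """Convert events_per_job to a lumi count for one file (per the text)."""
    if f["events"] == 0:
        return f["lumiCount"]
    f["avgEvtsPerLumi"] = round(float(f["events"]) / f["lumiCount"])
    if f["avgEvtsPerLumi"] == 0:
        # Assumption (not in the source): avoid dividing by zero when the
        # file has far fewer events than lumis.
        return f["lumiCount"]
    return events_per_job / f["avgEvtsPerLumi"]


def split_by_lumis(files, events_per_job, halt_job_on_file_boundaries=True):
    """Group lumis into jobs; each job is a list of (lfn, lumi) pairs."""
    jobs = []
    current = []  # lumis collected for the job being built
    for f in files:
        lumis_per_job = lumis_per_job_for_file(f, events_per_job)
        for lumi in f["lumis"]:
            current.append((f["lfn"], lumi))
            if len(current) >= lumis_per_job:
                # lumisInJob reached lumisPerJob: create one job.
                jobs.append(current)
                current = []
        if halt_job_on_file_boundaries and current:
            # A job may not continue into the next file.
            jobs.append(current)
            current = []
    if current:
        jobs.append(current)
    return jobs


if __name__ == "__main__":
    files = [
        {"lfn": "a", "events": 1000, "lumiCount": 10, "lumis": list(range(1, 11))},
        {"lfn": "b", "events": 300, "lumiCount": 3, "lumis": [11, 12, 13]},
    ]
    for halt in (True, False):
        jobs = split_by_lumis(files, 400, halt_job_on_file_boundaries=halt)
        print(halt, [len(job) for job in jobs])
```

In this sketch the only difference between the two modes is whether a partially filled job is closed at the end of each file or carried over into the next one.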
- There is one case where a job is created but made to fail right away.
If an input file has only one lumi and avgEvtsPerLumi (events in the file / lumis in the file) is bigger than max_events_per_lumi (default 20K), the job is failed on creation.
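The fail-on-creation condition can be written as a small predicate. This is a sketch under stated assumptions: the function name `should_fail_on_creation` is hypothetical, and the default of 20000 is taken from the "20K" quoted above.

```python
def should_fail_on_creation(f, max_events_per_lumi=20000):
    """Return True if the job for this file should be failed on creation.

    Hypothetical helper: applies only to files with exactly one lumi whose
    average events per lumi exceeds max_events_per_lumi.
    """
    if f["lumiCount"] != 1:
        return False
    avg_evts_per_lumi = round(float(f["events"]) / f["lumiCount"])
    return avg_evts_per_lumi > max_events_per_lumi


if __name__ == "__main__":
    print(should_fail_on_creation({"events": 25000, "lumiCount": 1}))
    print(should_fail_on_creation({"events": 50000, "lumiCount": 2}))
```

A single-lumi file with 25000 events trips the check, while a two-lumi file with 50000 events does not, because the condition as described only covers files with one lumi.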
EventBased algorithm (code)
- In case a file (a fake file for MC) contains more events than events_per_job:
The job is created on part of the file (using a mask).
- In case a file (a fake file for MC) contains fewer events than events_per_job:
Add more files until the number of events in the job reaches events_per_job.
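Both EventBased cases can be illustrated with the sketch below. It is a simplified model and not the WMCore code: the function name, the `lfn` key, and the representation of a mask as a `first_event` plus an `events` count are assumptions made for the example. How a pending group of small files is handled when a large file follows is also an assumption.

```python
def split_event_based(files, events_per_job):
    """Return a list of jobs; each job is a list of file slices (masks)."""
    jobs = []
    current = []        # slices collected from small files
    current_events = 0  # events accumulated in the job being built
    for f in files:
        n_events = f["events"]
        if n_events > events_per_job:
            if current:
                # Assumption: close the pending job before splitting a
                # large file.
                jobs.append(current)
                current, current_events = [], 0
            first = 0
            while first < n_events:
                # Each job covers only part of the file, selected by a mask.
                count = min(events_per_job, n_events - first)
                jobs.append(
                    [{"lfn": f["lfn"], "first_event": first, "events": count}]
                )
                first += count
        else:
            # Small file: keep adding files until events_per_job is reached.
            current.append(
                {"lfn": f["lfn"], "first_event": 0, "events": n_events}
            )
            current_events += n_events
            if current_events >= events_per_job:
                jobs.append(current)
                current, current_events = [], 0
    if current:
        jobs.append(current)
    return jobs


if __name__ == "__main__":
    files = [
        {"lfn": "big", "events": 250},
        {"lfn": "s1", "events": 40},
        {"lfn": "s2", "events": 40},
        {"lfn": "s3", "events": 40},
    ]
    for job in split_event_based(files, 100):
        print(job)
```

Here the 250-event file is cut into three masked jobs, and the three 40-event files are grouped into a single job once their combined count reaches events_per_job.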