Since #725 was implemented, groups do not appear by default in the project status output. This is because the groups selected with the -o option are now gathered with _gather_flow_groups. If you try to include a singleton operation along with a group that contains it, you get an error (shown below).
The reason _gather_flow_groups avoids returning overlapping groups and operations is that it is also used elsewhere, for example in run, where you want to avoid repeating operations. I think that groups were simply overlooked when #725 was implemented. _fetch_status jumps through a lot of hoops to reconcile singleton operations with grouped operations, all of which was discussed in #547 and implemented in #593, and it would be nice to recover that default behaviour.
I see several different options for solving this:
Switch from using _gather_flow_groups to the groups property, making sure to only select the groups that were selected with -o, if present.
Since the overlap is not always a problem, a parameter could be added to _gather_flow_groups that skips the _verify_group_compatibility check (the check would remain active by default).
Maybe the _verify_group_compatibility check does not really belong in _gather_flow_groups. In that case it may be just taken out and used only when necessary, although this should be evaluated by someone with more experience than me with the flow codebase.
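To make the second option concrete, here is a minimal sketch of a gathering function with an opt-out flag for the overlap check. It is not flow's actual code: GROUPS, gather_groups and the verify_compatibility parameter are hypothetical stand-ins, groups are modelled as plain sets of operation names, and names are matched exactly.

```python
from itertools import combinations

# Hypothetical stand-in for a project's groups: name -> operations it contains.
# Each operation is modelled as a singleton group of itself.
GROUPS = {
    "op1": {"op1"},
    "op2": {"op2"},
    "opgroup": {"op1", "op2"},
}


def gather_groups(names, groups=GROUPS, verify_compatibility=True):
    """Return the requested group names, optionally rejecting overlaps.

    verify_compatibility is an assumed parameter name; it plays the role
    of the _verify_group_compatibility check discussed above.
    """
    selected = [name for name in names if name in groups]
    if verify_compatibility:
        # Reject any pair of selected groups that share an operation.
        for first, second in combinations(selected, 2):
            if groups[first] & groups[second]:
                raise ValueError(
                    "Cannot specify groups or operations that will be "
                    "included twice when using the -o/--operation option."
                )
    return selected


# A caller such as run would keep the default check; a status-like caller
# could opt out, because reporting an operation twice is harmless there.
print(gather_groups(["op1", "opgroup"], verify_compatibility=False))
```

With the default verify_compatibility=True, the same call raises the ValueError quoted in the error output, so existing callers would keep their current behaviour.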
Running python project.py status produces the following output, in which opgroup is ignored:
Overview: 2 jobs/aggregates, 2 jobs/aggregates with eligible operations.
label
-------
operation/group      number of eligible jobs  submission status
-----------------  -------------------------  -------------------
op1                                        2  [U]: 2
op2                                        2  [U]: 2
If you try to include a singleton operation along with a group operation that contains it (e.g. python project.py status -o op1 opgroup) you get the following error:
ERROR:flow.project:Error during status update: Cannot specify groups or operations that will be included twice when using the -o/--operation option.
Use '--ignore-errors' to complete the update anyways.
Traceback (most recent call last):
  File "/home/javier/git_repos/signac-flow/mytesting/duplicate_operation_reproduction.py", line 18, in <module>
    FlowProject().main()
  File "/home/javier/git_repos/signac-flow/flow/project.py", line 5165, in main
    args.func(args)
  File "/home/javier/git_repos/signac-flow/flow/project.py", line 4799, in _main_status
    raise error
  File "/home/javier/git_repos/signac-flow/flow/project.py", line 4793, in _main_status
    self.print_status(jobs=aggregates, **args)
  File "/home/javier/git_repos/signac-flow/flow/project.py", line 3011, in print_status
    status_results, job_labels, individual_jobs = self._fetch_status(
  File "/home/javier/git_repos/signac-flow/flow/project.py", line 2724, in _fetch_status
    status_groups = set(self._gather_flow_groups(names))
  File "/home/javier/git_repos/signac-flow/flow/project.py", line 3795, in _gather_flow_groups
    raise ValueError(
ValueError: Cannot specify groups or operations that will be included twice when using the -o/--operation option.
A related bug
Once this problem is solved (e.g., by switching from _gather_flow_groups to groups in _fetch_status), a second bug appears when reporting two or more singleton operations along with a group that contains them: only one of the operations is reported, and its eligible/queued/etc. job counts are duplicated (once per "sibling" operation). This happens because the operation_status dictionary is reused for each of the operations in the group; when a display name is later assigned to it, each assignment overwrites the previous one in the shared dictionary. This is easily fixed by using a copy for each operation.
I decided to include this in the same issue because the two problems are closely coupled (this bug only arises once the other functionality is restored).
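The aliasing problem described above can be reproduced in isolation. The following is a simplified sketch, not the code from flow/project.py: collect_statuses, count_eligible and the "eligible"/"display_name" keys are hypothetical names chosen for illustration.

```python
from collections import Counter


def collect_statuses(operations, copy_per_operation):
    """Build one status entry per operation in a group.

    With copy_per_operation=False the same dictionary object is appended
    for every operation, mimicking the reuse described above.
    """
    operation_status = {"eligible": True}  # computed once for the whole group
    statuses = []
    for name in operations:
        entry = dict(operation_status) if copy_per_operation else operation_status
        entry["display_name"] = name  # overwrites the shared dict when not copied
        statuses.append(entry)
    return statuses


def count_eligible(statuses):
    """Count eligible entries per display name, as a status table would."""
    return Counter(s["display_name"] for s in statuses if s["eligible"])


shared = count_eligible(collect_statuses(["op1", "op2"], copy_per_operation=False))
copied = count_eligible(collect_statuses(["op1", "op2"], copy_per_operation=True))
print(shared)  # Counter({'op2': 2}): only the last operation, counted twice
print(copied)  # one entry each for op1 and op2
```

Copying the dictionary before assigning the display name gives each operation its own entry, which is the fix suggested above.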
System configuration
Operating System: tested on Debian 11 and Arch Linux 2023.09.01
Version of Python: 3.8
Version of signac: 2.1.0
Version of signac-flow: 0.26.1
To reproduce
Initialize a signac project with signac init.
Add a couple of jobs.
Register a couple of operations and a group.
Run python project.py status.
The resulting output is the status table shown above, in which opgroup is ignored.