-
Notifications
You must be signed in to change notification settings - Fork 340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More unassigned when using 1 vehicles instead of 2, while creating always 1 route only #1102
Comments
TL;DR we "simply" provide a suboptimal solution on the first instance with a single vehicle. Longer version:
Yes, the reason is that we do not have a guarantee of optimality. Because a solution for scenario 2 that uses a single vehicle is also a solution for scenario 1, this should indeed not happen provided you always get optimal solutions from the solving engine. I agree this is counterintuitive to get a better solution by just adding a vehicle that seems unused. What actually happens is:
|
Hi, Thanks for the answer, very clear. From an algorithmic perspective, I see why you say that solution 1 is "much cheaper and natural" than solution 2. However, for the use case I'm modelling the solution 2 is "much better" in reality: in other words, each unassigned job has a "hidden" cost that is part of the "real-word objective function". Do you think there is a way to drive the heuristic/search algorithms to evaluate that solution 2 is better than solution 1? Thanks! |
It's not about evaluation, if we came across solution 2 while searching we would pick it as better because it has more jobs assigned. It is more about how to navigate the solution space to even reach that solution. One option used in many approaches would be to allow for cost degradation in the search so you could "deteriorate" solution 1 until it becomes possible timewise to add the last unassigned job. The problem of this approach is that:
The simplest idea I can think of without bending too much the current approach would be to play with route seeding rules: we have some rules for trying to insert jobs that are hard to fit in last, or once the routes are already packed, quite like in this example. This can include: seed an empty route with the furthest job first, or the job that has the tightest TW, or the highest amount etc. |
Hum, actually not sure about that really, I just tried bypassing the heuristic and forcing the unassigned job (id 6) in a route. Then the local search quickly swaps that (expensive) job for a cheaper one and we're back to the initial problem with an even worse solution: 2 unassigned! |
My last comment actually gave me another idea: if you add a high priority to that unassigned job in the first place, then we manage to insert it and we reach the solution with all jobs assigned. This is because we have local search operators aiming at improving priority score regardless of the cost increase. Not sure how this can help in a general workflow. |
Hi, thanks a lot for your help. Yes, the problem with assigning a high priority to the unassigned job is that before solving the problem I don't that that specific job will be unassigned. What if I assign a high priority to all jobs? In other words, is assigning priority=100 to all jobs generating an "algorithmic difference" compared to not assigning priorities? Would heuristics and local search follow a different path in those two scenario? |
No, the priority difference across jobs would have an impact but you'd loose that if all jobs have the same priority. What you could try is to re-optimize after adding a high priority to unassigned jobs only to see if by any chance more tasks are assigned. But this really sounds a bit random. |
ok, thanks for the suggestion, will try that. Do you think you'll furtherly investigate this suboptimality to make it part of future developments? |
Except for that single ticket, we have no other report of this kind of problem among our own API users, so frankly that would be very low priority at the moment. |
@cv-or could you run your problem instance one time with v1.8 (if possible feature-wise)? In my personal experience, v1.8 delivers more consistent (good, but not perfectly consistent) results (e.g. when changing only 1 job) considering unassigned jobs than v1.13 & v1.14, particularly when it comes to using complicated skills combinations. v1.13 & v1.14 are generally faster and seem to have better (less km) results (especially for easy problems with few restrictions) but at the cost of oftentimes not assigning all jobs where v1.8 would assign more jobs. The "balance" between the optimization objectives seems a little obfuscated/unclear across the versions here, like you cannot maneuver them easily and/or due to the heuristics used it's decided automatically/unintentionally. Just for reference, VRP adds a really nice logic about optimization objectives. E.g. in the real world - and that goes I think for most problems out there - you'd always want to assign all jobs first. In the end, that's probably what you're paid for, regardless of the number of kilometers. Only second goes saving km and only third saving entire vehicles. For the latter it's not even so clear as using a new vehicle might have a fixed cost. Long story short: I think this issue is part of a bigger problem with unassigned jobs. Linking: |
The initial post here was about getting different solutions with different input setups, not across versions which is a bit different.
This is definitely not obvious from the docs, but we try by order of importance to:
This objective hierarchy has always been consistent across versions.
Hum, we have to be careful not to fall for survivorship bias here. Let's say for the sake of simplicity that regarding assigned jobs, solutions are either good (think optimal number of task assigned) or bad (suboptimal number of tasks assigned), then on any given instance:
If we only look at tickets and "regression" reports, we only see situations where v1.8 was good and v1.14 is now bad (can definitely happen because as we change the internals, we change the solving paths and potential solutions), but we miss the big picture. My point here is we should analyze things carefully based on a meaningful set of instances. The closest I have at hand in the usual benchmarks are the CVRP instances where not all jobs are always assigned. Here is a quick summary of the comparison across versions and exploration levels for those instances.
Except for @do-me do you have a representative set of instances we could investigate to showcase the difference between v1.8 and v1.14? |
On a more dev-related note, I've just started working on #874, which I hope will be a way to reach better solutions by diversifying the search. This may definitely have an impact on the "assigned tasks" objective. |
Results: v1.8 vs v1.14 unassigned jobs@jcoupey I finally managed to test quite some real-world instances with roughly 40 - 60 jobs and identify a dozen comparison cases. Result: in 237 cases, v1.8 is 12x assigning more jobs than v1.14. v1.14 manages only one time to assign more than v1.8. In the other cases the versions behave equally. This seems to confirm my initial thesis, so it's not survivorship bias. Unless we did something terribly wrong between the versions, I'd guess that v1.8 is simply better at assigning more jobs. Please see the zip file attached with 13 problems, each with two jsons for v1.8 and v1.14 as the format changed. The instances in Unfortunately I wasn't able to measure the computation time. Considering km I only used the cases where both versions allocated equally. Result: it's a draw. Most times equal, 2 wins for v1.8 and 2 wins for v1.14. |
The changes were made in a backward-compatible way. That is if you do not badly need the @do-me can you confirm this before I dig any further? I'd like to reproduce on the exact same files to rule out possible "external" errors. Also I see you stripped out vehicles and jobs locations, probably for obvious privacy reasons. This makes it harder to look at solutions with e.g. https://github.com/VROOM-Project/vroom-scripts/blob/master/src/plot.py. Another option would be to obfuscate actual coordinates by adding the same random offset to all locations in an instance. Or share instances privately if that is an option? |
Yes, the distances can be safely removed and have no importance in these instances. I can share some coordinates but prefer to send these via email. |
While both problems raised in this ticket share the same symptoms (too many unassigned jobs), they are very different in nature:
In order to split the concerns, I've ticketed point 2 in #1193. Otherwise I'm inclined to close for point 1: except if we find evidence of this problem happening wildly at scale, I think we have to leave with it as an inevitable consequence of providing sub-optimal solutions. |
Hi,
I'm facing a pretty unexpected behaviour regarding unassigned jobs. I'm running two scenarios:
Scenario 1: I'm passing to Vroom 10 jobs and 1 vehicle only -> 9 jobs are assigned, 1 route is created and I get 1 unassigned. So far, so good. This scenario is included in the self-contained instance "my_problem_1_vehicle.json" attached.
Scenario 2: I'm passing to Vroom the exact same 10 jobs BUT 2 identical vehicles -> all 10 jobs are assigned and 1 route is created. This is pretty surprising, not because all 10 jobs are assigned, but because all 10 jobs are assigned to 1 route only (see self-contained instance "my_problem_2_vehicles.json" attached).
Moreover, if I run Vroom on the steps identified by scenario 2 in plan mode, I get no violation.
Data for the problem: instances.zip. I'm running v1.13.
What can be the issue? Is there a reason why passing 2 vehicles instead of 1 leads to a better solution?
The text was updated successfully, but these errors were encountered: