
More unassigned jobs when using 1 vehicle instead of 2, while always creating only 1 route #1102

Closed
cv-or opened this issue Apr 22, 2024 · 16 comments

Comments

@cv-or

cv-or commented Apr 22, 2024

Hi,

I'm facing a pretty unexpected behaviour regarding unassigned jobs. I'm running two scenarios:

Scenario 1: I pass Vroom 10 jobs and only 1 vehicle -> 9 jobs are assigned, 1 route is created and I get 1 unassigned job. So far, so good. This scenario is included in the attached self-contained instance "my_problem_1_vehicle.json".

Scenario 2: I pass Vroom the exact same 10 jobs BUT 2 identical vehicles -> all 10 jobs are assigned and 1 route is created. This is pretty surprising, not because all 10 jobs are assigned, but because all of them end up in a single route (see the attached self-contained instance "my_problem_2_vehicles.json").

Moreover, if I run Vroom in plan mode on the steps from scenario 2, I get no violations.

Data for the problem: instances.zip. I'm running v1.13.

What could be the issue? Is there a reason why passing 2 vehicles instead of 1 leads to a better solution?
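
For reference, reproducing the two scenarios could look like this (a minimal Python sketch driving the `vroom` CLI with its `-i` flag; the duplicated-vehicle construction and the generated file name are illustrative, not part of the attached instances):

```python
import copy
import json
import subprocess

def solve(path):
    # Run the vroom CLI on an instance file and parse the JSON solution.
    out = subprocess.run(["vroom", "-i", path],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

# Scenario 1: the attached single-vehicle instance.
sol_1 = solve("my_problem_1_vehicle.json")

# Scenario 2: same jobs, but with a duplicated (otherwise identical) vehicle.
with open("my_problem_1_vehicle.json") as f:
    instance = json.load(f)
clone = copy.deepcopy(instance["vehicles"][0])
clone["id"] = max(v["id"] for v in instance["vehicles"]) + 1
instance["vehicles"].append(clone)
with open("generated_2_vehicles.json", "w") as f:
    json.dump(instance, f)
sol_2 = solve("generated_2_vehicles.json")

for name, sol in (("1 vehicle", sol_1), ("2 vehicles", sol_2)):
    print(name, "->", len(sol["routes"]), "route(s),",
          len(sol["unassigned"]), "unassigned")
```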

@jcoupey
Collaborator

jcoupey commented Apr 26, 2024

TL;DR: we "simply" provide a suboptimal solution on the first instance with a single vehicle.

Longer version:

Is there a reason why passing 2 vehicles instead of 1 leads to a better solution?

Yes, the reason is that we do not have a guarantee of optimality. Because a solution for scenario 2 that uses a single vehicle is also a solution for scenario 1, this should indeed not happen provided you always get optimal solutions from the solving engine.

I agree it is counterintuitive to get a better solution by just adding a vehicle that seems unused. What actually happens is:

  1. For scenario 1, the solution we get is much cheaper and more "natural" if you look at the geometry, and the leftover job is not included in that plan due to TW constraints. So no matter what, our search process never manages to reach the "full solution", which is much more expensive and much more convoluted; in fact, solution 1 is actually still a heuristic solution, meaning it has not been improved by the local search. This is because most of our approach targets reducing cost, while in this situation we'd have to move to a much more expensive solution to be able to include the missing job.
  2. For scenario 2, the added vehicle is not actually unused: the heuristic solutions all use two vehicles in the first place. In that case, the setup is different and the local search is able to reach the 1-vehicle solution by reducing overall cost.

@cv-or
Author

cv-or commented Apr 26, 2024

Hi,

Thanks for the answer, very clear. From an algorithmic perspective, I see why you say that solution 1 is "much cheaper and natural" compared to solution 2. However, for the use case I'm modelling, solution 2 is "much better" in reality: in other words, each unassigned job has a "hidden" cost that is part of the real-world objective function.

Do you think there is a way to drive the heuristic/search algorithms to evaluate that solution 2 is better than solution 1?

Thanks!

@jcoupey
Collaborator

jcoupey commented Apr 29, 2024

Do you think there is a way to drive the heuristic/search algorithms to evaluate that solution 2 is better than solution 1?

It's not about evaluation: if we came across solution 2 while searching, we would pick it as the better one because it has more jobs assigned. It is more about how to navigate the solution space to even reach that solution.

One option used in many approaches would be to allow for cost degradation during the search, so you could "deteriorate" solution 1 until it becomes possible time-wise to add the last unassigned job (see the sketch after this list). The problems with this approach are that:

  • it is the opposite of our current approach, which is designed so that the local search converges very fast to the closest local minimum;
  • it requires a lot of tuning: how much degradation you allow, how long you let the search wander in poor areas of the solution space before stopping, etc.;
  • there is still no guarantee solution 2 would be found!
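
For illustration only, the classic way to allow such degradation is a simulated-annealing-style acceptance rule; this is a generic sketch of the idea and its tuning knobs, not anything from VROOM's code base:

```python
import math
import random

def accept(current_cost, candidate_cost, temperature):
    """Simulated-annealing-style acceptance: always take improvements,
    sometimes take degradations depending on the temperature."""
    if candidate_cost <= current_cost:
        return True
    # Worse moves are accepted with a probability that shrinks as the
    # move gets worse and as the temperature cools down.
    return random.random() < math.exp((current_cost - candidate_cost) / temperature)

# The tuning burden mentioned in the list above: the initial temperature,
# the cooling schedule (e.g. temperature *= 0.99 per iteration) and the
# stopping rule all have to be picked per problem class, and even then
# nothing guarantees the search wanders into solution 2.
```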

The simplest idea I can think of without bending the current approach too much would be to play with route seeding rules: we have some rules for trying to insert first the jobs that would be hard to fit in last, once the routes are already packed, quite like in this example. This can include: seeding an empty route with the furthest job first, or with the job that has the tightest TW, or the highest amount, etc.
In your example none of those criteria work, but the unassigned job is kind of an outlier in terms of geographic position. Maybe a seeding rule like "start with the most isolated job" would work in that case.
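
A sketch of that hypothetical criterion (the function and its `dist` lookup are mine, purely to pin down the idea):

```python
def most_isolated_job(jobs, dist):
    """Pick the job whose nearest other job is furthest away."""
    def isolation(job):
        return min(dist(job, other) for other in jobs if other is not job)
    return max(jobs, key=isolation)

# jobs: list of job locations; dist(a, b): travel cost lookup, e.g. from
# the duration/distance matrix. Seed an empty route with this job first,
# then let the regular insertion logic fill the route around it.
```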

@jcoupey
Collaborator

jcoupey commented Apr 29, 2024

Maybe a seeding rule like "start with the most isolated job" would work in that case.

Hum, actually I'm not so sure about that: I just tried bypassing the heuristic and forcing the unassigned job (id 6) into a route. The local search then quickly swaps that (expensive) job for a cheaper one and we're back to the initial problem, with an even worse solution: 2 unassigned!

@jcoupey
Collaborator

jcoupey commented Apr 29, 2024

My last comment actually gave me another idea: if you add a high priority to that unassigned job in the first place, then we manage to insert it and we reach the solution with all jobs assigned. This is because we have local search operators aiming at improving the priority score regardless of the cost increase. Not sure how this can help in a general workflow though.

@cv-or
Author

cv-or commented May 2, 2024

Hi, thanks a lot for your help.

Yes, the problem with assigning a high priority to the unassigned job is that before solving the problem I don't know that this specific job will end up unassigned.

What if I assign a high priority to all jobs? In other words, does assigning priority=100 to all jobs generate an "algorithmic difference" compared to not assigning priorities at all? Would the heuristics and local search follow a different path in those two scenarios?

@jcoupey
Collaborator

jcoupey commented May 2, 2024

does assigning priority=100 to all jobs generate an "algorithmic difference" compared to not assigning priorities at all?

No, a priority difference across jobs would have an impact, but you'd lose that if all jobs have the same priority. What you could try is to re-optimize after adding a high priority to the unassigned jobs only, to see if by any chance more tasks get assigned. But this really sounds a bit random.
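
A sketch of that two-pass workflow (file names are placeholders; the per-job `priority` field and its [0, 100] range are from the API docs):

```python
import json
import subprocess

def solve(path):
    # Run the vroom CLI on an instance file and parse the JSON solution.
    out = subprocess.run(["vroom", "-i", path],
                         capture_output=True, text=True, check=True)
    return json.loads(out.stdout)

# First pass: solve as-is and collect unassigned task ids.
solution = solve("instance.json")
unassigned_ids = {task["id"] for task in solution["unassigned"]}

if unassigned_ids:
    # Second pass: bump the priority (documented range is [0, 100]) on
    # the previously unassigned jobs only, then re-optimize.
    with open("instance.json") as f:
        instance = json.load(f)
    for job in instance.get("jobs", []):
        if job["id"] in unassigned_ids:
            job["priority"] = 100
    with open("instance_retry.json", "w") as f:
        json.dump(instance, f)
    solution = solve("instance_retry.json")

print(len(solution["unassigned"]), "unassigned after the second pass")
```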

@cv-or
Author

cv-or commented May 16, 2024

OK, thanks for the suggestion, I will try that.

Do you think you'll investigate this suboptimality further and make it part of future developments?

@jcoupey
Collaborator

jcoupey commented May 21, 2024

Do you think you'll investigate this suboptimality further and make it part of future developments?

Except for this single ticket, we have no other reports of this kind of problem among our own API users, so frankly this would be very low priority at the moment.

@do-me

do-me commented Jun 28, 2024

@cv-or could you run your problem instance once with v1.8 (if possible feature-wise)?

In my personal experience, v1.8 delivers more consistent results regarding unassigned jobs (good, though not perfectly consistent, e.g. when changing only 1 job) than v1.13 & v1.14, particularly when it comes to complicated skills combinations.

v1.13 & v1.14 are generally faster and seem to produce better (fewer km) results (especially for easy problems with few restrictions), but at the cost of oftentimes not assigning all jobs where v1.8 would assign more jobs.

The "balance" between the optimization objectives seems a little obfuscated/unclear across the versions here: you cannot steer them easily, and/or the heuristics used decide it automatically/unintentionally.

Just for reference, VRP has a really nice logic for optimization objectives. E.g. in the real world (and I think that goes for most problems out there) you'd always want to assign all jobs first. In the end, that's probably what you're paid for, regardless of the number of kilometers. Saving km comes only second, and saving entire vehicles only third. For the latter it's not even so clear, as using a new vehicle might carry a fixed cost.

Long story short: I think this issue is part of a bigger problem with unassigned jobs.

@jcoupey
Collaborator

jcoupey commented Jul 2, 2024

The initial post here was about getting different solutions with different input setups, not across versions, which is a bit different.

The "balance" between the optimization objectives seems a little obfuscated/unclear across the versions here

This is definitely not obvious from the docs, but we try, in order of importance, to:

  1. maximize overall priority score (sum of priorities across assigned tasks).
  2. maximize number of assigned tasks.
  3. minimize total solution cost.

This objective hierarchy has always been consistent across versions.
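
In other words, solutions compare lexicographically, as in this sketch (field names mimic the solution `summary` object; treat them as illustrative):

```python
def objective_key(solution):
    summary = solution["summary"]
    return (
        -summary["priority"],   # 1. maximize overall priority score
        summary["unassigned"],  # 2. minimize unassigned == maximize assigned
        summary["cost"],        # 3. minimize total cost
    )

# Toy illustration of the hierarchy: b wins because it assigns one more
# task, even though it is more expensive.
a = {"summary": {"priority": 10, "unassigned": 1, "cost": 1000}}
b = {"summary": {"priority": 10, "unassigned": 0, "cost": 1500}}
assert min([a, b], key=objective_key) is b
```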

v1.13 & v1.14 are generally faster [...] at the cost of oftentimes not assigning all jobs where v1.8 would assign more jobs

Hum, we have to be careful not to fall for survivorship bias here. Let's say for the sake of simplicity that, regarding assigned jobs, solutions are either good (think optimal number of tasks assigned) or bad (suboptimal number of tasks assigned). Then on any given instance:

  • if v1.8 was good and v1.14 is still good, no-one will complain;
  • if v1.8 was bad and v1.14 is still bad, no-one will complain either (except in the rare situation where there is a way to spot suboptimality as for this ticket);
  • if v1.8 was bad and v1.14 is good, that's great but in practice no-one ever opens a ticket or calls you up in this situation.

If we only look at tickets and "regression" reports, we only see situations where v1.8 was good and v1.14 is now bad (can definitely happen because as we change the internals, we change the solving paths and potential solutions), but we miss the big picture.

My point here is that we should analyze things carefully, based on a meaningful set of instances. The closest thing I have at hand in the usual benchmarks are the CVRP instances where not all jobs are always assigned. Here is a quick summary of the comparison across versions and exploration levels for those instances.

|                    | v1.8 x=0 | v1.14 x=0 | v1.8 x=1 | v1.14 x=1 | v1.8 x=2 | v1.14 x=2 | v1.8 x=3 | v1.14 x=3 | v1.8 x=4 | v1.14 x=4 | v1.8 x=5 | v1.14 x=5 |
|--------------------|---------:|----------:|---------:|----------:|---------:|----------:|---------:|----------:|---------:|----------:|---------:|----------:|
| Assigned jobs      | 44820 | 44813 | 44941 | 44941 | 44947 | 44948 | 44948 | 44951 | 44950 | 44952 | 44952 | 44954 |
| All-jobs solutions | 158 | 157 | 168 | 168 | 169 | 170 | 169 | 170 | 169 | 171 | 170 | 173 |
| Optimal solutions  | 13 | 12 | 22 | 23 | 27 | 27 | 34 | 38 | 38 | 43 | 42 | 45 |

Except for x=0, v1.14 consistently provides better results in terms of assigned tasks (and in terms of instances solved with all tasks assigned). I'm not even reporting the average gap to optimal solutions (wrt the "usual" cost), which has significantly decreased, or the average computing times, which have been drastically reduced.

@do-me do you have a representative set of instances we could investigate to showcase the difference between v1.8 and v1.14?

@jcoupey
Collaborator

jcoupey commented Jul 2, 2024

On a more dev-related note, I've just started working on #874, which I hope will be a way to reach better solutions by diversifying the search. This may well have an impact on the "assigned tasks" objective.

@do-me

do-me commented Aug 23, 2024

Results: v1.8 vs v1.14 unassigned jobs

@jcoupey I finally managed to test quite a few real-world instances with roughly 40-60 jobs and to identify a dozen comparison cases.

Result: out of 237 cases, v1.8 assigns more jobs than v1.14 in 12 of them. Only once does v1.14 manage to assign more than v1.8. In the remaining cases the versions behave identically.

This seems to confirm my initial thesis, so it's not survivorship bias. Unless we did something terribly wrong between the versions, I'd guess that v1.8 is simply better at assigning more jobs.

Please see the attached zip file with 13 problems, each with two JSON files, one for v1.8 and one for v1.14, as the format changed. The instances in the 0 dir are the ones where v1.14 wins; in all others v1.8 wins.

Unfortunately I wasn't able to measure the computation time.

Considering km, I only used the cases where both versions assigned equally. Result: it's a draw. Most times equal, with 2 wins for v1.8 and 2 wins for v1.14.

vroom_exports.zip
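
For reference, the tally was produced along these lines (a sketch: it assumes each case directory already holds one solution file per version, produced by the corresponding binary; the layout and file names are hypothetical):

```python
import json
from pathlib import Path

# Tally which version leaves fewer tasks unassigned, per case.
wins = {"v1.8": 0, "v1.14": 0, "tie": 0}
for case in sorted(Path("vroom_exports").iterdir()):
    old = json.loads((case / "solution_v1.8.json").read_text())
    new = json.loads((case / "solution_v1.14.json").read_text())
    diff = len(old["unassigned"]) - len(new["unassigned"])
    wins["v1.14" if diff > 0 else "v1.8" if diff < 0 else "tie"] += 1
print(wins)
```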

@jcoupey
Collaborator

jcoupey commented Aug 26, 2024

each with two JSON files, one for v1.8 and one for v1.14, as the format changed

The changes were made in a backward-compatible way. That is, if you do not badly need the distances matrix for solving, you can run v1.14.0 on the same instances (with the matrix top-level key). My guess is that this is the case in the instances you shared, since all vehicle.max_distance values are $2^{31} - 1$, so they are never actually binding constraints and can thus be safely removed.

@do-me can you confirm this before I dig any further? I'd like to reproduce on the exact same files to rule out possible "external" errors.

Also I see you stripped out vehicle and job locations, probably for obvious privacy reasons. This makes it harder to look at solutions with e.g. https://github.com/VROOM-Project/vroom-scripts/blob/master/src/plot.py. Another option would be to obfuscate the actual coordinates by adding the same random offset to all locations in an instance. Or share instances privately, if that is an option?
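
A sketch of that offset trick (covering jobs and vehicle start/end only; `location` as a [lon, lat] pair follows the API format):

```python
import json
import random

def obfuscate(path_in, path_out):
    # One shared offset per instance keeps the relative geometry intact.
    with open(path_in) as f:
        instance = json.load(f)
    dlon, dlat = random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0)

    def shift(loc):  # loc is a [lon, lat] pair
        return [loc[0] + dlon, loc[1] + dlat]

    for job in instance.get("jobs", []):
        if "location" in job:
            job["location"] = shift(job["location"])
    for vehicle in instance.get("vehicles", []):
        for key in ("start", "end"):
            if key in vehicle:
                vehicle[key] = shift(vehicle[key])
    # Shipments, if any, would need the same treatment on their
    # pickup/delivery locations.

    with open(path_out, "w") as f:
        json.dump(instance, f)
```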

@do-me

do-me commented Aug 28, 2024

Yes, the distances can be safely removed and have no importance in these instances. I can share some coordinates, but I'd prefer to send these via email.

@jcoupey
Collaborator

jcoupey commented Nov 22, 2024

While both problems raised in this ticket share the same symptom (too many unassigned jobs), they are very different in nature:

  1. the initial problem (first comments) is about reaching a better solution because of a different fleet setup that "helps" the whole search process;
  2. the concerns in the follow-up discussion with @do-me are more related to a regression in the heuristic behavior on instances with a complex skills setup.

In order to split the concerns, I've ticketed point 2 in #1193. Otherwise I'm inclined to close on point 1: unless we find evidence of this problem happening widely at scale, I think we have to live with it as an inevitable consequence of providing sub-optimal solutions.

@jcoupey jcoupey closed this as completed Nov 22, 2024
3 participants