In the last post, I introduced a mathematical formulation of the Vehicle Routing Problem inspired by a recent paper on taxi routing. The paper caught my eye initially because the authors mentioned being able “to dispatch in real time thousands of taxis serving more than 25,000 customers per hour.” Let’s go a bit deeper now and look at the authors’ solution algorithms.
This post does not have mathematical expressions in it but will definitely be easier to follow for readers with a background in Operations Research. I’ll be a bit snarky, mainly to point out areas where I think further work is warranted.
Initial Algorithms, Results
The paper we are discussing first compares four solution algorithms for the formulation from the previous post.
The baseline is set by a greedy algorithm that goes through riders sequentially, in order of their earliest possible pickup time, and assigns them to the driver that can pick them up first. The authors also use an algorithm they label MIOoptimal, which calls the Gurobi commercial solver. The authors next apply the well-known 2-OPT heuristic algorithm. I won’t describe this heuristic in detail here but it’s fairly general and often the first non-greedy heuristic people (especially Computer Scientist types?) reach for when faced with a Vehicle Routing type problem. Finally, the authors use their own algorithm called maxflow. maxflow selects several sets of random but feasible times for rider pickups. It then assigns riders to drivers by solving a relatively simple network flow problem, for each set of pickup times.
This chart displaying the running times of the MIOoptimal algorithm demonstrates both that it would be impractical to solve realistic-sized problems to optimality and that the length of rider pickup time windows are a key factor in determining computational complexity. Longer time windows mean more possibilities.
MIOoptimal is able to increase profits 15–20% above the greedy baseline in the scenarios where demand is high. You can see why there is so much interest in optimizing vehicle routing.
I really like the setup here. The use of a heuristic algorithm on any problem should be motivated by an examination of how well the algorithm performs relative to optimal on small problem instances and a demonstration of why solving to (guaranteed) optimality is impractical on larger instances.
Snark
The authors highlight the relatively good performance of the maxflow algorithm when demand was high and time windows were 1 minute long. However, the algorithm performs worse than the greedy algorithm in the low- and medium-demand cases when rider time windows are 6 minutes long, the longest tested. That’s surprisingly poor, to be quite honest. maxflow does worse than the 2-OPT algorithm whenever time windows are longer than a minute.
Taking a step back from the math, in what context would people make requests for rides well in advance of when they want to be picked up but also demand to be picked up within a 1 minute time window? It doesn’t make sense in New York City but it makes even less sense when you think of taxi riders in rural areas or developing countries. Moving away from literal taxi routing, it doesn’t make sense for applications involving moving freight.
Of course the maxflow algorithm does well when time windows are tiny. The selection of rider pickup times, which maxflow accomplishes in a rather… haphazard fashion is relatively unimportant in such problems. The assignment of riders to drivers is more important and much simpler here, and maxflow solves this sub-problem to optimality. It sure is handy that the problem is simpler in these instances.
I’m reminded of the Aircraft Sequencing Problem, selecting the order in which aircraft utilize a common resource such as a runway. The seminal paper on this problem introduced a Dynamic Programming algorithm with “Constrained Position Shifting,” requiring that no aircraft be shifted more than a few spots from where it sits in some given nominal ordering. Such a constraint is the absolute best way to mitigate the curse of dimensionality and make Dynamic Programming an attractive framework. I’ve heard people talk about Constrained Position Shifting as a way to ensure fairness. Sure, I guess, although it seems pretty clear to me that the math came first.
All of this is just to say that the maxflow algorithm may not make sense for your particular Vehicle Routing Problem. In its defense, the maxflow algorithm is most intriguing in the context of the algorithms the authors build on top of it for dealing with larger problems.
Fancier Algorithms
The authors have established that it’s impractical to solve larger problem instances to optimality. Their solution is to “sparsify the flow graph,” essentially to rule out some possibilities and to remove some of the binary variables from larger optimization problem.
They start by proposing a k-neighborhood algorithm. Driver time is our currency in the problem we are solving. We trade it in for profit. Recall that if we choose to ask driver i to pick up rider j first, then the rider’s pickup time must be larger than both the quickest i can get to j and the earliest j can be picked up. So the cost of this assignment, in driver time, is the larger of these two quantities.
If we choose to have some driver serve rider j2 immediately after j1, then the time between the two pickups must be both larger than the minimum time to drive j1 to their destination and then get to j2’s location and larger than the minimum time between the two relevant time windows. This is the cost of such an assignment.
The k-neighborhood algorithm looks at each rider and identifies both the k least costly ways to serve the rider and the k least costly next riders to serve after the rider. All of the identified interesting assignments are collected. Any potential assignments not identified are discarded with decision variables fixed at 0.
Are you with me so far? Great, let’s dive deeper into the madness.
The authors’ backbone algorithm involves applying the k-neighoboorhood algorithm, then running the maxflow algorithm on the resulting, reduced-size problem. All of the assignments that appear in any of the solutions that the maxflow algorithm produces are then stored. (Recall that the maxflow algorithm involves trying out lots of different sets of rider pickup times, each of which generates its own set of driver assignments.) Any assignments not identified as optimal in any of the solutions produced by the maxflow algorithm are discarded, with decision variables fixed at 0. Finally, the backbone algorithm solves the resulting mini-problem to optimality using the MIOoptimal algorithm.
We’ve gone through the looking glass now. But wait, the authors go slightly deeper.
The authors’ local-backbone algorithm takes as input a feasible solution for the master problem. It runs a modified version of the backbone algorithm which consistently includes the decision variables corresponding to driver assignments made in the provided feasible solution. The result of this is a new solution for the master problem. We can plug that new solution right back in and iterate.
The results above are from problem instances which involved 2,700 drivers and 6,000–6,500 riders. The K parameter is the k used in the k-neighborhood algorithm. The k-neighborhood algorithm isn’t used by the 2-OPT heuristic. The MIOoptimal results shown here are the result of applying the MIOoptimal algorithm to reduced-size problems defined by the k-neighborhood algorithm. Here all algorithms had a 5 minute limit on computation time. MIOoptimal was unable to solve even the reduced-size problems to optimality when K was larger than 2. Rider time windows were 5 minutes long.
5 minute time windows still seems short to me but it’s definitely better than 1 minute time windows.
It’s again interesting to see much fancier algorithms performing worse than the greedy algorithm, here when K = 2. This points to the difficulty of the problem and the importance of knowing the holistic state of the world. The greedy algorithm must be choosing assignments for drivers which are still reasonable but which don’t appear in the driver’s top 2 options at that time and location in a counterfactual scenario where there are no other drivers.
There might be a way to improve the k-neighborhood algorithm based on what’s going on around each potential assignment. This could involve looking at the distribution of the costs considered by the algorithm. Maybe only consider k assignments plus the next n assignments which have a cost of less than x driver minutes. I don’t know. But it seems like this is where the algorithms could be improved (?).
Wrap Up
The authors go on to test their algorithms on online problems in a simulation, resolving updated problem instances every 30 seconds. This is where the authors can claim to serve 25,000+ customers per hour.
I found the chart above interesting as it shows the benefits of optimizing vehicle paths and of getting information about ride requests ahead of time.
When solving Vehicle Routing Problems in real life, we don’t always get information on ride requests ahead of time. A lot depends on context. I don’t think the general public realizes the efficiency benefits unlocked by “prior request time.”
I saw that Lyft recently began offering people a “Wait and Save” option. Might this be an attempt to split the “extra profit” generated by prior request time?
In a section on Extensions, the authors talk about being able to forecast demand and “to route idle taxis to areas of popular demand.” This would be the next best thing to prior request time.
Leave a Reply