Bin Packing and Machine Scheduling
5. The list processing algorithm for machine scheduling
We want to schedule these six tasks on some fixed number of identical machines in such a manner that the tasks are completed by the earliest possible time. This is sometimes referred to as finding the makespan for the tasks that make up the job. We also want to be specific about which tasks should be scheduled on which machines during a given time period. We will assume that the scheduling is carried out so that no machine will remain voluntarily idle, and that once a machine begins working on a task it will continue to work on it without interruption until the task is completed.
Continuing our analysis of how Figure 2 is generated, what happens when time 19 arrives, and Machine 2 tries to find an unassigned task in the priority list? Since both task T4 and T6 are not ready at time 19 (because they can only begin when T3 is done), Machine 2 will stay idle until time 22. At time 22, both machines are free, and in accordance with our tie-breaking rule, T4 gets assigned to Machine 1 while T6 gets assigned to Machine 2. The completion time for this list is 34.
Is this the best we can do? One way to check, when task times are integers, is that we know that a lower bound for completing the tasks is the ceiling function applied to the following quotient: the sum of all task times divided by the number of machines. In this case, we get a lower bound of 30, which means that there is perhaps some hope of finding a better schedule, one that finishes by time 30.
In addition to 30 there is potentially another independent lower bound that we can take into account. Suppose we find the length of all the paths in the task analysis digraph. In this example, two typical such paths are T2, T5 and T3, T4. The lengths of these two paths, computed by summing the numbers inside the vertices making up the paths, are respectively, 19 and 26. How are these numbers related to the completion time for all of the tasks? Clearly, since we are working with directed paths, the early tasks in the paths must be completed before the later ones. Thus, the completion time for all the tasks is at least as long as the time necessary to complete all of the tasks in any of the paths in the task analysis digraph. In particular, the earliest completion time is at least as long as the length of the longest path in this digraph. In this example, the length of this longest path is 26 and the path involved is T3, T4. The path in the task analysis digraph which has the longest length is known as a critical path. A given task analysis digraph has one or more critical paths. (This is true even if the digraph has no directed edges. In this case the length of the critical path is the amount of time it takes to complete the longest task.) Speeding up tasks that are not on the critical path does not affect this lower bound, regardless of the number of machines available. The earliest completion time must still be at least as long as the critical path, despite having a lot of processors to do the tasks not on the critical path(s).
Is there another list we could try that might achieve a better completion time? We have mentioned that tasks on the critical path are "bottlenecks" in the sense that when they are delayed, the time to complete all the tasks grows. This suggests the idea of a critical path list. Begin by putting the first task on a longest path (breaking ties with the task of lowest number) at the start of the list. Now remove this task and edges that come out of it from the task analysis digraph and repeat the process by finding a new task to add to the list that is at the start of a longest path. Doing this gives rise to the list T3, T2, T1, T4, T6, T5. This list, though different from the decreasing time list actually, gives rise to exactly the same schedule we had before that finished at time 32. There are 6! = 720 different lists that can arise with six tasks, but the schedules that these lists give rise to need not be different, as we see in this case.
The analysis of this small example mirrors the fact that for large versions of the machine scheduling problem, there is no known polynomial time procedure that will locate what the optimal schedule might be. This point brings us full circle to why the list processing heuristic was explored as a way of trying to find good approximate schedules. More is known for the case where the tasks making up the job can be done in any order, so-called independent tasks. This is the case where the task analysis digraph has no edges. Ronald Graham has shown that for independent tasks the list algorithm finds a completion time using the decreasing time list which is never more than
where T is the optimal time to schedule the tasks and m (at least 2) is the number of machines the tasks are scheduled on. This result is a classic example of the interaction of theoretical and applied mathematics.
Welcome to the
These web essays are designed for those who have already discovered the joys of mathematics as well as for those who may be uncomfortable with mathematics.
Search Feature Column
Feature Column at a glance