## Bin Packing and Machine Scheduling

5. The list processing algorithm for machine scheduling Figure 1

We want to schedule these six tasks on some fixed number of identical machines in such a manner that the tasks are completed by the earliest possible time. This is sometimes referred to as finding the makespan for the tasks that make up the job. We also want to be specific about which tasks should be scheduled on which machines during a given time period. We will assume that the scheduling is carried out so that no machine will remain voluntarily idle, and that once a machine begins working on a task it will continue to work on it without interruption until the task is completed.

There is an appealing heuristic with which to approach this problem. The heuristic has the advantage of being relatively simple to program on a computer, and when carried out by hand by different people, leads to the same schedule for the tasks (because the ways ties can be broken is specified). Typically, the heuristic gives a good approximation to an optimal schedule. This algorithm (heuristic) is known as the list processing algorithm. The algorithm works by coordinating the scheduling of the tasks on the machines and taking account of a "priority list" and then coordinating this list with the demands imposed by the task analysis digraph.

You can think of the given list as a kind of priority list which is independent of the scheduling requirements imposed by the task analysis digraph. The tasks are given in the list so that when read from left to right, tasks of higher priority are listed first. For example, the list may reflect an ordering of the tasks based on the size of cash payments that will be made when the tasks are completed. Alternatively, the list may have been chosen with the specific goal of trying to minimize the completion time for the tasks.

The list processing heuristic assigns at a time t a ready task (reading from left to right) that has not already been assigned to the lowest-numbered processor which is not currently working on a task. Figure 2

Continuing our analysis of how Figure 2 is generated, what happens when time 19 arrives, and Machine 2 tries to find an unassigned task in the priority list? Since both task T4 and T6 are not ready at time 19 (because they can only begin when T3 is done), Machine 2 will stay idle until time 22. At time 22, both machines are free, and in accordance with our tie-breaking rule, T4 gets assigned to Machine 1 while T6 gets assigned to Machine 2. The completion time for this list is 34.

Is this the best we can do? One way to check, when task times are integers, is that we know that a lower bound for completing the tasks is the ceiling function applied to the following quotient: the sum of all task times divided by the number of machines. In this case, we get a lower bound of 30, which means that there is perhaps some hope of finding a better schedule, one that finishes by time 30.

Is it possible to improve the schedule that is displayed in Figure 2? One idea is to use a list to prevent the difficulties when lengthy tasks appear late in a list or when tasks on the critical path(s) are given "high priority." Thus, one can construct the list which orders the tasks by decreasing time (ties broken in favor of tasks with a lower number). The decreasing time list in this case would be: T3, T2, T4, T1, T6, T5. This list leads to a schedule with completion time 32. You can practice trying to use the list processing algorithm on this list with 2 processors. The result is a schedule that finishes at time 32 with only 4 time units of idle time on the second processor. Here is the way the tasks are scheduled: Machine 1: T3, T4, T5; Machine 2: T2, T1, T6, idle from 28-32. Although this schedule is better than the one in Figure 2, it may not be the optimal one, because there is still hope that a schedule which uses 30 time units on each machine with no idle time is possible.

Is there another list we could try that might achieve a better completion time? We have mentioned that tasks on the critical path are "bottlenecks" in the sense that when they are delayed, the time to complete all the tasks grows. This suggests the idea of a critical path list. Begin by putting the first task on a longest path (breaking ties with the task of lowest number) at the start of the list. Now remove this task and edges that come out of it from the task analysis digraph and repeat the process by finding a new task to add to the list that is at the start of a longest path. Doing this gives rise to the list T3, T2, T1, T4, T6, T5. This list, though different from the decreasing time list actually, gives rise to exactly the same schedule we had before that finished at time 32. There are 6! = 720 different lists that can arise with six tasks, but the schedules that these lists give rise to need not be different, as we see in this case.

We have tried three lists and they each finish later than the theoretical, but a priori possible, optimum time of 30. There is also the possibility that there is a schedule that completes at time 31, with 2 units of idle time. You can check that no sum of two sets of task times yields a value of 30. You can also check that although there are two sets of task times (e.g. 13, 12, and 6; 14, 8 7) that sum to 31 and 29, no schedule based on the assignment of the associated tasks in order to schedule two machines obeys the restrictions imposed by the task analysis digraph. Thus, with a bit of effort one sees that the optimal schedule in this case completes at time 32.

The analysis of this small example mirrors the fact that for large versions of the machine scheduling problem, there is no known polynomial time procedure that will locate what the optimal schedule might be. This point brings us full circle to why the list processing heuristic was explored as a way of trying to find good approximate schedules. More is known for the case where the tasks making up the job can be done in any order, so-called independent tasks. This is the case where the task analysis digraph has no edges. Ronald Graham has shown that for independent tasks the list algorithm finds a completion time using the decreasing time list which is never more than where T is the optimal time to schedule the tasks and m (at least 2) is the number of machines the tasks are scheduled on. This result is a classic example of the interaction of theoretical and applied mathematics.

Bin packing, an applied problem, led to many application insights as well as tools for solving a variety of theoretical problems. Bin packing relates to some machine scheduling problems, which in turn have rich connections with both pure and applied problems. Next month, I will explore some of these connections.