Mortdecai 624a14fd9a Proof: unweighted avg completion time is a biased metric

Mathematical proof that unweighted average task completion time
is gameable by scheduling policy (SPT), while work-weighted
completion time is schedule-invariant. Demonstrates that SPT's
apparent advantage is an artifact of the metric, not genuine
throughput improvement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-28 16:53:13 -04:00

Unweighted Average Completion Time Is Not a Fair Metric for Task Scheduling

A mathematical proof that unweighted average task completion time is a biased statistic that incentivizes cherry-picking easy work, and that any scheduling advantage it appears to reveal is an artifact of the metric — not a reflection of genuine throughput or service quality.

1. Definitions

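Let there be n tasks with positive processing times p_1, p_2, \ldots, p_n, all available at time 0.

A schedule \sigma is a permutation of \{1, 2, \ldots, n\} giving the order in which tasks run on a single executor: \sigma(k) is the task run in position k. Tasks run back to back with no idle time and no preemption.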

The completion time of task \sigma(k) under schedule \sigma is:

C_{\sigma(k)} = \sum_{j=1}^{k} p_{\sigma(j)}

The unweighted mean completion time is:

\bar{C}(\sigma) = \frac{1}{n} \sum_{k=1}^{n} C_{\sigma(k)}

The work-weighted mean completion time is:

\bar{C}_w(\sigma) = \frac{\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)}}{\sum_{k=1}^{n} p_{\sigma(k)}}
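The three definitions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, a schedule is represented simply as the list of processing times in execution order, and integer processing times are assumed so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction
from itertools import accumulate


def completion_times(order):
    """Completion time of each task, listed in execution order."""
    return list(accumulate(order))


def unweighted_mean(order):
    """Mean completion time with every task counted equally."""
    c = completion_times(order)
    return Fraction(sum(c), len(c))


def work_weighted_mean(order):
    """Mean completion time with each task weighted by its processing time."""
    c = completion_times(order)
    return Fraction(sum(p * ck for p, ck in zip(order, c)), sum(order))


schedule = [3, 1, 2]                  # processing times, in execution order
print(completion_times(schedule))     # [3, 4, 6]
print(unweighted_mean(schedule))      # 13/3
print(work_weighted_mean(schedule))   # 25/6
```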

2. SPT Is Optimal for the Unweighted Statistic

Theorem 1. The schedule that minimizes \bar{C}(\sigma) is Shortest Processing Time first (SPT): sort tasks so that p_{\sigma(1)} \le p_{\sigma(2)} \le \cdots \le p_{\sigma(n)}.

Proof (exchange argument).

Consider any schedule \sigma in which two adjacent tasks i, j satisfy p_i > p_j with task i scheduled immediately before task j. Let t be the start time of task i.

Order	Task `i` finishes	Task `j` finishes	Sum
Before swap (`i` then `j`)	`t + p_i`	`t + p_i + p_j`	`2t + 2p_i + p_j`
After swap (`j` then `i`)	`t + p_j + p_i`	`t + p_j`	`2t + p_i + 2p_j`

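Tasks scheduled before or after the pair are unaffected, because the pair occupies the same interval [t, t + p_i + p_j] either way. The swap therefore reduces the sum of completion times by: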

(2p_i + p_j) - (p_i + 2p_j) = p_i - p_j > 0

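Every swap of a longer-before-shorter adjacent pair strictly reduces the total. Any non-SPT schedule contains such a pair, so it cannot be optimal; since there are finitely many schedules, a minimizer exists and must be an SPT order. Therefore SPT minimizes \bar{C}(\sigma), uniquely up to the order of tasks with equal processing times. \blacksquare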
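Theorem 1 can be sanity-checked by brute force on a small instance. The sketch below uses a made-up set of five processing times, enumerates all 120 orders, and also checks that a single adjacent swap reduces the total by exactly p_i - p_j, as the exchange argument predicts. It is a numerical check on one example, not a substitute for the proof.

```python
from itertools import accumulate, permutations


def total_completion_time(order):
    """Sum of completion times for processing times listed in execution order."""
    return sum(accumulate(order))


tasks = [4, 1, 3, 2, 5]              # hypothetical processing times
spt = tuple(sorted(tasks))

# Exhaustive search over every schedule of this instance.
best = min(permutations(tasks), key=total_completion_time)
print(best == spt)                   # True (the minimizer is unique: times are distinct)
print(total_completion_time(spt))    # 35
print(total_completion_time(tasks))  # 42

# One exchange step: swap the adjacent pair (4, 1) at the front.
swapped = [1, 4, 3, 2, 5]
print(total_completion_time(tasks) - total_completion_time(swapped))  # 3 == 4 - 1
```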

3. The Work-Weighted Statistic Is Schedule-Invariant

Theorem 2. The work-weighted mean completion time \bar{C}_w(\sigma) is the same for every schedule \sigma.

Proof.

Expand the numerator:

\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)} = \sum_{k=1}^{n} p_{\sigma(k)} \sum_{j=1}^{k} p_{\sigma(j)}

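Reindex by letting a = \sigma(k) and b = \sigma(j), and write b \preceq_\sigma a when b is scheduled no later than a (with b \prec_\sigma a when it is scheduled strictly earlier). The double sum counts every ordered pair (a, b) with b \preceq_\sigma a: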

= \sum_{\substack{a, b \\ b \preceq_\sigma a}} p_a \, p_b

For any pair (a, b) with a \ne b, exactly one of \{b \preceq_\sigma a\} or \{a \prec_\sigma b\} holds. The diagonal terms (a = b) contribute p_a^2 regardless of order. Therefore:

\sum_{\substack{a, b \\ b \preceq_\sigma a}} p_a \, p_b = \sum_{a} p_a^2 + \sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b

Now consider the complementary sum:

\sum_{\substack{a \ne b \\ a \prec_\sigma b}} p_a \, p_b

Together the two off-diagonal sums cover every ordered pair (a, b) with a \ne b exactly once:

\sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b + \sum_{\substack{a \ne b \\ a \prec_\sigma b}} p_a \, p_b = \sum_{a \ne b} p_a \, p_b

The right-hand side is schedule-independent. Exchanging the names a and b turns one off-diagonal sum into the other, and p_a p_b = p_b p_a, so the two sums are equal and each is half of the total:

\sum_{\substack{a \ne b \\ b \prec_\sigma a}} p_a \, p_b = \frac{1}{2} \sum_{a \ne b} p_a \, p_b

Therefore:

\sum_{k=1}^{n} p_{\sigma(k)} \cdot C_{\sigma(k)} = \sum_a p_a^2 + \frac{1}{2} \sum_{a \ne b} p_a \, p_b = \frac{1}{2}\left(\sum_a p_a\right)^2 + \frac{1}{2}\sum_a p_a^2

This expression contains no reference to \sigma. Since the denominator \sum p_a is also schedule-independent:

\bar{C}_w(\sigma) = \frac{\frac{1}{2}\left(\sum p_a\right)^2 + \frac{1}{2}\sum p_a^2}{\sum p_a}

is constant across all schedules. \blacksquare
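Theorem 2 and its closed form can be checked the same way. The sketch below, again on a made-up five-task instance with integer processing times, collects the work-weighted mean of every permutation into a set and compares it with the formula derived above.

```python
from fractions import Fraction
from itertools import accumulate, permutations


def work_weighted_mean(order):
    """Work-weighted mean completion time for times listed in execution order."""
    c = accumulate(order)
    return Fraction(sum(p * ck for p, ck in zip(order, c)), sum(order))


tasks = [4, 1, 3, 2, 5]   # hypothetical processing times

# Distinct values taken by the metric across all 120 schedules.
values = {work_weighted_mean(perm) for perm in permutations(tasks)}

# Closed form: ((sum p)^2 + sum p^2) / (2 * sum p)
total = sum(tasks)
closed_form = Fraction(total ** 2 + sum(p * p for p in tasks), 2 * total)

print(values == {closed_form})   # True
print(closed_form)               # 28/3
```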

4. Concrete Example

Two tasks: A with p_A = 1 hour, B with p_B = 10 hours.

SPT order (A first)

Task	Completion time
A	1
B	11

Unweighted mean: (1 + 11) / 2 = 6.0
Work-weighted mean: (1 \times 1 + 10 \times 11) / 11 = 111/11 \approx 10.09

Reverse order (B first)

Task	Completion time
B	10
A	11

Unweighted mean: (10 + 11) / 2 = 10.5
Work-weighted mean: (10 \times 10 + 1 \times 11) / 11 = 111/11 \approx 10.09

SPT appears 4.5 hours better on the unweighted metric but provides zero improvement on the work-weighted metric. The apparent advantage exists only because the unweighted statistic lets a 1-hour task "vote" equally with a 10-hour task.
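The figures in this example can be reproduced directly. The helper `means` below is a hypothetical name; it takes the processing times in execution order and returns both statistics as floats.

```python
from itertools import accumulate


def means(order):
    """Return (unweighted, work-weighted) mean completion time for an order."""
    c = list(accumulate(order))
    unweighted = sum(c) / len(c)
    weighted = sum(p * ck for p, ck in zip(order, c)) / sum(order)
    return unweighted, weighted


spt = means([1, 10])       # A first, then B
reverse = means([10, 1])   # B first, then A

print(spt[0], round(spt[1], 2))          # 6.0 10.09
print(reverse[0], round(reverse[1], 2))  # 10.5 10.09
print(reverse[0] - spt[0])               # 4.5
```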

5. Connection to Little's Law

Little's Law states L = \lambda W, where L is the average number of tasks in the system, \lambda is the arrival rate, and W is the average time a task spends in the system.

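For a stable system, the arrival rate \lambda is fixed by the workload, but L is not: Little's Law holds under every scheduling policy, so a policy that lowers W lowers L in the same proportion. SPT does exactly this, clearing short tasks quickly so that fewer tasks are in the system at any moment. What a work-conserving, non-preemptive-overhead-free executor cannot change by reordering is the amount of unfinished work in the system, which depends only on arrivals and service capacity.

SPT's gain is therefore consistent with Little's Law, but it is a gain in task count, not in work: the unweighted statistic counts completions rather than work, systematically underweighting large tasks.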
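The difference between counting tasks and counting work can be made concrete for a finite batch. The sketch below assumes the batch setting of Section 1 (all tasks present at time 0, executor never idle), which is not the steady-state setting Little's Law describes; the function name and return convention are hypothetical. It integrates two curves over time: the number of unfinished tasks, whose area equals the sum of completion times, and the amount of unfinished work.

```python
from fractions import Fraction


def areas(order):
    """Return (task_area, work_area) for processing times in execution order.

    task_area: time-integral of the number of unfinished tasks.
    work_area: time-integral of the unfinished processing time.
    """
    remaining_tasks = len(order)
    remaining_work = sum(order)
    task_area = 0
    work_area = Fraction(0)
    for p in order:
        # The task count stays flat while a task of length p runs.
        task_area += p * remaining_tasks
        # Unfinished work falls linearly from remaining_work to remaining_work - p.
        work_area += Fraction(p * (2 * remaining_work - p), 2)
        remaining_tasks -= 1
        remaining_work -= p
    return task_area, work_area


spt_tasks, spt_work = areas([1, 10])
rev_tasks, rev_work = areas([10, 1])
print(spt_tasks, rev_tasks)   # 12 21
print(spt_work == rev_work)   # True
print(spt_work)               # 121/2
```

On the two-task example, reordering changes the task-count area from 12 to 21 but leaves the unfinished-work area at 121/2 in both cases.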

6. Consequences

Theorem 3 (Metric Bias). Any schedule that minimizes unweighted mean completion time gives a largest task the maximum completion time that any task can have under any schedule.

Proof. By Theorem 1 the minimizing schedules are SPT orders, and each places a largest task last. Its completion time equals the total processing time \sum p_i, which is the maximum possible completion time for any individual task. Any schedule that does not run that task last, such as FIFO when the task arrived early, finishes it strictly earlier. \blacksquare

This creates a starvation incentive: agents optimizing the unweighted statistic always prefer a small task to a large one, so when small tasks keep arriving, large tasks can be deferred indefinitely.
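The starvation effect can be illustrated with a toy simulation. Everything about the workload here is an assumption made for the example: one 10-hour task is queued at time 0, a 1-hour task arrives at every integer hour, and the executor is non-preemptive. The function name, the policy strings, and the `horizon` parameter are hypothetical.

```python
def first_start_of_large_task(policy, horizon=50):
    """Return the time at which the 10-hour task starts, or None if it has
    not started by `horizon`.

    policy: "spt" picks the shortest queued task; anything else is FIFO.
    """
    queue = [(0, 10)]      # (arrival_time, processing_time), in arrival order
    now = 0
    next_arrival = 0
    while now < horizon:
        # Admit every 1-hour task that has arrived by the current time.
        while next_arrival <= now:
            queue.append((next_arrival, 1))
            next_arrival += 1
        if policy == "spt":
            job = min(queue, key=lambda t: (t[1], t[0]))
        else:
            job = queue[0]  # FIFO: the queue is kept in arrival order
        queue.remove(job)
        if job[1] == 10:
            return now
        now += job[1]
    return None


print(first_start_of_large_task("spt"))    # None
print(first_start_of_large_task("fifo"))   # 0
```

Under this arrival pattern a fresh 1-hour task is always available when the executor becomes free, so the SPT rule never selects the large task regardless of how far the horizon is extended.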

Real-world manifestations

Domain	Gameable metric	Perverse outcome
Support desks	Tickets closed / day	Complex issues ignored
Sprint planning	Story count velocity	Work split into trivial pieces
Emergency rooms	Average wait time	Critical patients deprioritized
Academic publishing	Papers per year	Incremental work favored over deep research

7. Conclusion

The unweighted average completion time is a biased statistic that:

Can be gamed by scheduling policy (Theorem 1), unlike work-weighted completion time which is schedule-invariant (Theorem 2).
Incentivizes starvation of large tasks (Theorem 3).
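Tracks the number of tasks in the system rather than the amount of outstanding work, so it agrees with work-based measures only when tasks are uniformly sized.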

A metric that can be improved by reordering work — without doing any additional work — is measuring the scheduling policy, not the system's capacity or effectiveness.

Unweighted average completion time is not a fair or accurate measurement of task execution performance.

This proof was developed conversationally and formalized on 2026-03-28.
