← run

algo-greedy-01

1.000
16/16 tests· algorithms
Challenge · difficulty 4/5
# Freelance gig scheduling for maximum payout

You are a freelancer planning your calendar. There is a queue of one-day gigs
on offer. Each gig `i` has:

- a **deadline** `d_i` (a positive integer) — the gig is only worth anything if
  you complete it on or before day `d_i`, and
- a **payout** `p_i` (a non-negative integer) — what you earn if you complete it
  in time.

Days are numbered `1, 2, 3, ...`. Every gig takes **exactly one full day**, and
you can work on **at most one gig per day**. You may accept any subset of the
gigs and choose which day to do each accepted gig, as long as no two accepted
gigs share a day and each accepted gig is done on some day `<= its deadline`.
Gigs you don't accept earn nothing.

Implement a file **`solution.py`** containing:

```python
def max_payout(gigs: list[tuple[int, int]]) -> int:
    """Return the maximum total payout achievable.

    `gigs` is a list of (deadline, payout) pairs. Return an int.
    """
```

## What to return

Return the **maximum total payout** you can earn with a valid schedule. You do
**not** need to return the schedule itself — only the best achievable total.

## Why this is subtle

A tempting greedy — "sort gigs by payout, and place each accepted gig on the
earliest free day within its deadline" — is **wrong**. Consider two gigs:

- gig A: deadline 2, payout 100
- gig B: deadline 1, payout 50

The correct answer is **150**: do A on day 2 and B on day 1. But if you place
the higher-payout gig A on the *earliest* free day (day 1), you occupy the only
day B could ever use, and you're stuck with just 100. The fix is an
exchange-argument insight: when you accept a gig, reserve it as **late** as its
deadline allows, keeping earlier days open for gigs with tighter deadlines.
Likewise, sorting purely by deadline (or accepting gigs first-come) can force
you to keep a cheap gig over a strictly better one competing for the same day.

## Constraints

- `0 <= len(gigs)`; large instances (up to `2 * 10^5` gigs) are tested, so an
  `O(n^2)` day-by-day scan will be too slow — aim for roughly `O(n log n)`.
- `1 <= d_i` and `0 <= p_i`, each up to about `10^9`. Deadlines may be far
  larger than the number of gigs.
- The empty gig list returns `0`.

## Examples

```python
assert max_payout([]) == 0
assert max_payout([(1, 42)]) == 42
assert max_payout([(2, 100), (1, 50)]) == 150          # place the rich gig late
assert max_payout([(1, 1), (1, 100)]) == 100           # one day-1 slot, keep 100
assert max_payout([(2, 100), (1, 19), (2, 27), (1, 25), (3, 15)]) == 142
assert max_payout([(1000, 5), (1000, 6), (1000, 7)]) == 18  # all fit
```
tests/test_gig_scheduler.py
import heapq
import itertools
import random

from solution import max_payout


# ----------------------------------------------------------------------------
# Independent oracles used to check the solution on random / large inputs.
# ----------------------------------------------------------------------------

def heap_oracle(gigs):
    """Classic min-heap greedy for job sequencing with deadlines.

    Process gigs in increasing deadline order, keep a min-heap of accepted
    payouts, and whenever more gigs are accepted than the current deadline
    allows, drop the smallest-payout accepted gig. The heap's sum is optimal.
    This is a different implementation from the reference (which uses a
    disjoint-set over day-slots), so agreement is a strong correctness signal.
    """
    heap = []
    for d, p in sorted(gigs, key=lambda x: x[0]):
        heapq.heappush(heap, p)
        if len(heap) > d:
            heapq.heappop(heap)
    return sum(heap)


def brute_force(gigs):
    """Exhaustive optimum for tiny inputs.

    Try every subset of gigs; a subset is schedulable iff we can match each
    gig to a distinct day <= its deadline (checked by the standard earliest
    -deadline-first feasibility greedy). Only feasible for very small n.
    """
    n = len(gigs)
    best = 0
    for r in range(n + 1):
        for subset in itertools.combinations(range(n), r):
            chosen = sorted(subset, key=lambda i: gigs[i][0])
            day = 0
            ok = True
            for i in chosen:
                day += 1
                if day > gigs[i][0]:
                    ok = False
                    break
            if ok:
                best = max(best, sum(gigs[i][1] for i in subset))
    return best


# ----------------------------------------------------------------------------
# Hand-verified small cases.
# ----------------------------------------------------------------------------

def test_empty():
    assert max_payout([]) == 0


def test_single_gig_within_deadline():
    assert max_payout([(1, 42)]) == 42
    assert max_payout([(5, 7)]) == 7


def test_two_gigs_no_conflict():
    assert max_payout([(1, 10), (2, 20)]) == 30


def test_latest_slot_subtlety():
    # The exchange-argument crux: the high-payout gig has the LATER deadline.
    # Placing it at its latest feasible day (day 2) leaves day 1 free for the
    # tight-deadline gig, yielding 150. A greedy that places the high-payout
    # gig at the EARLIEST free day (day 1) blocks the other gig -> only 100.
    assert max_payout([(2, 100), (1, 50)]) == 150


def test_must_prefer_higher_payout_on_tie_deadline():
    assert max_payout([(1, 1), (1, 100)]) == 100
    assert max_payout([(1, 100), (1, 1)]) == 100


def test_classic_five_gig_instance():
    gigs = [(2, 100), (1, 19), (2, 27), (1, 25), (3, 15)]
    assert max_payout(gigs) == 142


def test_four_gig_instance():
    assert max_payout([(1, 10), (2, 10), (2, 15), (1, 20)]) == 35


def test_all_same_deadline_one():
    assert max_payout([(1, 3), (1, 9), (1, 4), (1, 2)]) == 9


def test_deadlines_far_beyond_count():
    assert max_payout([(1000, 5), (1000, 6), (1000, 7)]) == 18


def test_greedy_by_deadline_alone_is_wrong():
    gigs = [(1, 5), (2, 6), (2, 7), (3, 4), (1, 20), (3, 3)]
    assert max_payout(gigs) == brute_force(gigs)


def test_zero_payout_gigs_are_never_harmful():
    assert max_payout([(1, 0), (1, 0)]) == 0
    assert max_payout([(2, 0), (1, 5)]) == 5


# ----------------------------------------------------------------------------
# Randomized cross-checks against brute force (small) and heap oracle.
# ----------------------------------------------------------------------------

def test_random_small_vs_bruteforce():
    rng = random.Random(1234)
    for _ in range(400):
        n = rng.randint(0, 7)
        gigs = [(rng.randint(1, 6), rng.randint(0, 50)) for _ in range(n)]
        assert max_payout(gigs) == brute_force(gigs)


def test_random_medium_vs_heap_oracle():
    rng = random.Random(99)
    for _ in range(60):
        n = rng.randint(1, 500)
        gigs = [(rng.randint(1, n + 5), rng.randint(1, 10**6)) for _ in range(n)]
        assert max_payout(gigs) == heap_oracle(gigs)


def test_random_tight_deadlines_vs_heap_oracle():
    rng = random.Random(7)
    for _ in range(60):
        n = rng.randint(1, 400)
        maxd = rng.randint(1, 5)
        gigs = [(rng.randint(1, maxd), rng.randint(1, 1000)) for _ in range(n)]
        assert max_payout(gigs) == heap_oracle(gigs)


def test_large_input_performance_and_correctness():
    # Big instance: an O(n^2) slot scan would blow the time budget, and a
    # naive greedy would give a wrong total. Must match the heap oracle.
    rng = random.Random(2024)
    n = 200_000
    gigs = [(rng.randint(1, 10**9), rng.randint(1, 10**9)) for _ in range(n)]
    assert max_payout(gigs) == heap_oracle(gigs)


def test_large_tight_deadlines():
    rng = random.Random(555)
    n = 200_000
    gigs = [(rng.randint(1, 50), rng.randint(1, 10**9)) for _ in range(n)]
    assert max_payout(gigs) == heap_oracle(gigs)
System prompt
You are an expert programmer. Solve the task exactly as specified. Output your solution as fenced code blocks using the required file name(s) and the exact function/type signatures requested. Prefer correctness; do not include prose outside code unless asked.
Proposed solution
```python
from typing import List, Tuple

def max_payout(gigs: List[Tuple[int, int]]) -> int:
    """Return the maximum total payout achievable.

    `gigs` is a list of (deadline, payout) pairs. Return an int.
    """
    if not gigs:
        return 0

    # Sort gigs by deadline ascending.
    # This ensures that when we process a gig, all gigs with earlier deadlines
    # have already been considered and placed.
    gigs.sort(key=lambda x: x[0])

    # We maintain a max-heap of payouts for gigs that have been accepted
    # but not yet assigned to a specific day.
    # The heap size at any point represents the number of days we have
    # "reserved" for these gigs.
    import heapq
    max_heap = []

    # We iterate through the sorted list of gigs.
    # For each gig (deadline, payout):
    # 1. We tentatively accept it by pushing its payout onto the heap.
    # 2. If the heap size exceeds the current deadline, it means we have
    #    more accepted gigs than available days up to this deadline.
    #    To maximize total payout, we should drop the gig with the smallest
    #    payout from the heap (since we can't schedule it).
    # 3. The size of the heap at the end of processing a gig represents the
    #    number of days we have successfully scheduled up to that deadline.
    #    The sum of the heap values is the total payout of the schedule.
    for deadline, payout in gigs:
        heapq.heappush(max_heap, payout)
        if len(max_heap) > deadline:
            # We have too many gigs for the days we have available.
            # Remove the smallest payout to keep the total maximum.
            heapq.heappop(max_heap)

    return sum(max_heap)
```

Test output (stdout)

................                                                         [100%]
16 passed in 0.30s