algo-greedy-01
1.000
Challenge · difficulty 4/5
# Freelance gig scheduling for maximum payout
You are a freelancer planning your calendar. There is a queue of one-day gigs
on offer. Each gig `i` has:
- a **deadline** `d_i` (a positive integer) — the gig is only worth anything if
you complete it on or before day `d_i`, and
- a **payout** `p_i` (a non-negative integer) — what you earn if you complete it
in time.
Days are numbered `1, 2, 3, ...`. Every gig takes **exactly one full day**, and
you can work on **at most one gig per day**. You may accept any subset of the
gigs and choose which day to do each accepted gig, as long as no two accepted
gigs share a day and each accepted gig is done on some day `<= its deadline`.
Gigs you don't accept earn nothing.
Implement a file **`solution.py`** containing:
```python
def max_payout(gigs: list[tuple[int, int]]) -> int:
"""Return the maximum total payout achievable.
`gigs` is a list of (deadline, payout) pairs. Return an int.
"""
```
## What to return
Return the **maximum total payout** you can earn with a valid schedule. You do
**not** need to return the schedule itself — only the best achievable total.
## Why this is subtle
A tempting greedy — "sort gigs by payout, and place each accepted gig on the
earliest free day within its deadline" — is **wrong**. Consider two gigs:
- gig A: deadline 2, payout 100
- gig B: deadline 1, payout 50
The correct answer is **150**: do A on day 2 and B on day 1. But if you place
the higher-payout gig A on the *earliest* free day (day 1), you occupy the only
day B could ever use, and you're stuck with just 100. The fix is an
exchange-argument insight: when you accept a gig, reserve it as **late** as its
deadline allows, keeping earlier days open for gigs with tighter deadlines.
Likewise, sorting purely by deadline (or accepting gigs first-come) can force
you to keep a cheap gig over a strictly better one competing for the same day.
## Constraints
- `0 <= len(gigs)`; large instances (up to `2 * 10^5` gigs) are tested, so an
`O(n^2)` day-by-day scan will be too slow — aim for roughly `O(n log n)`.
- `1 <= d_i` and `0 <= p_i`, each up to about `10^9`. Deadlines may be far
larger than the number of gigs.
- The empty gig list returns `0`.
## Examples
```python
assert max_payout([]) == 0
assert max_payout([(1, 42)]) == 42
assert max_payout([(2, 100), (1, 50)]) == 150 # place the rich gig late
assert max_payout([(1, 1), (1, 100)]) == 100 # one day-1 slot, keep 100
assert max_payout([(2, 100), (1, 19), (2, 27), (1, 25), (3, 15)]) == 142
assert max_payout([(1000, 5), (1000, 6), (1000, 7)]) == 18 # all fit
```
tests/test_gig_scheduler.py
import heapq
import itertools
import random
from solution import max_payout
# ----------------------------------------------------------------------------
# Independent oracles used to check the solution on random / large inputs.
# ----------------------------------------------------------------------------
def heap_oracle(gigs):
"""Classic min-heap greedy for job sequencing with deadlines.
Process gigs in increasing deadline order, keep a min-heap of accepted
payouts, and whenever more gigs are accepted than the current deadline
allows, drop the smallest-payout accepted gig. The heap's sum is optimal.
This is a different implementation from the reference (which uses a
disjoint-set over day-slots), so agreement is a strong correctness signal.
"""
heap = []
for d, p in sorted(gigs, key=lambda x: x[0]):
heapq.heappush(heap, p)
if len(heap) > d:
heapq.heappop(heap)
return sum(heap)
def brute_force(gigs):
"""Exhaustive optimum for tiny inputs.
Try every subset of gigs; a subset is schedulable iff we can match each
gig to a distinct day <= its deadline (checked by the standard earliest
-deadline-first feasibility greedy). Only feasible for very small n.
"""
n = len(gigs)
best = 0
for r in range(n + 1):
for subset in itertools.combinations(range(n), r):
chosen = sorted(subset, key=lambda i: gigs[i][0])
day = 0
ok = True
for i in chosen:
day += 1
if day > gigs[i][0]:
ok = False
break
if ok:
best = max(best, sum(gigs[i][1] for i in subset))
return best
# ----------------------------------------------------------------------------
# Hand-verified small cases.
# ----------------------------------------------------------------------------
def test_empty():
assert max_payout([]) == 0
def test_single_gig_within_deadline():
assert max_payout([(1, 42)]) == 42
assert max_payout([(5, 7)]) == 7
def test_two_gigs_no_conflict():
assert max_payout([(1, 10), (2, 20)]) == 30
def test_latest_slot_subtlety():
# The exchange-argument crux: the high-payout gig has the LATER deadline.
# Placing it at its latest feasible day (day 2) leaves day 1 free for the
# tight-deadline gig, yielding 150. A greedy that places the high-payout
# gig at the EARLIEST free day (day 1) blocks the other gig -> only 100.
assert max_payout([(2, 100), (1, 50)]) == 150
def test_must_prefer_higher_payout_on_tie_deadline():
assert max_payout([(1, 1), (1, 100)]) == 100
assert max_payout([(1, 100), (1, 1)]) == 100
def test_classic_five_gig_instance():
gigs = [(2, 100), (1, 19), (2, 27), (1, 25), (3, 15)]
assert max_payout(gigs) == 142
def test_four_gig_instance():
assert max_payout([(1, 10), (2, 10), (2, 15), (1, 20)]) == 35
def test_all_same_deadline_one():
assert max_payout([(1, 3), (1, 9), (1, 4), (1, 2)]) == 9
def test_deadlines_far_beyond_count():
assert max_payout([(1000, 5), (1000, 6), (1000, 7)]) == 18
def test_greedy_by_deadline_alone_is_wrong():
gigs = [(1, 5), (2, 6), (2, 7), (3, 4), (1, 20), (3, 3)]
assert max_payout(gigs) == brute_force(gigs)
def test_zero_payout_gigs_are_never_harmful():
assert max_payout([(1, 0), (1, 0)]) == 0
assert max_payout([(2, 0), (1, 5)]) == 5
# ----------------------------------------------------------------------------
# Randomized cross-checks against brute force (small) and heap oracle.
# ----------------------------------------------------------------------------
def test_random_small_vs_bruteforce():
rng = random.Random(1234)
for _ in range(400):
n = rng.randint(0, 7)
gigs = [(rng.randint(1, 6), rng.randint(0, 50)) for _ in range(n)]
assert max_payout(gigs) == brute_force(gigs)
def test_random_medium_vs_heap_oracle():
rng = random.Random(99)
for _ in range(60):
n = rng.randint(1, 500)
gigs = [(rng.randint(1, n + 5), rng.randint(1, 10**6)) for _ in range(n)]
assert max_payout(gigs) == heap_oracle(gigs)
def test_random_tight_deadlines_vs_heap_oracle():
rng = random.Random(7)
for _ in range(60):
n = rng.randint(1, 400)
maxd = rng.randint(1, 5)
gigs = [(rng.randint(1, maxd), rng.randint(1, 1000)) for _ in range(n)]
assert max_payout(gigs) == heap_oracle(gigs)
def test_large_input_performance_and_correctness():
# Big instance: an O(n^2) slot scan would blow the time budget, and a
# naive greedy would give a wrong total. Must match the heap oracle.
rng = random.Random(2024)
n = 200_000
gigs = [(rng.randint(1, 10**9), rng.randint(1, 10**9)) for _ in range(n)]
assert max_payout(gigs) == heap_oracle(gigs)
def test_large_tight_deadlines():
rng = random.Random(555)
n = 200_000
gigs = [(rng.randint(1, 50), rng.randint(1, 10**9)) for _ in range(n)]
assert max_payout(gigs) == heap_oracle(gigs)
System prompt
You are an expert programmer. Solve the task exactly as specified. Output your solution as fenced code blocks using the required file name(s) and the exact function/type signatures requested. Prefer correctness; do not include prose outside code unless asked.
Proposed solution
Looking at this problem, I need to find the optimal way to schedule gigs to maximize payout, where each gig must be completed by its deadline and only one gig can be done per day.
The key insight from the problem description is that we should try to schedule each accepted gig as late as possible within its deadline. This keeps earlier days available for gigs with tighter deadlines.
This is a classic greedy algorithm problem that can be solved using a "reverse scheduling" approach or by using a union-find data structure for efficient day assignment.
Here's my approach:
1. Sort gigs by payout in descending order (highest payout first)
2. For each gig, try to assign it to the latest possible day within its deadline that is still free
3. Use a data structure to efficiently find and mark the latest available day
For efficiency with large inputs (up to 2×10^5 gigs), I'll use a union-find (disjoint set union) approach to track available days. This gives us nearly O(1) amortized time for finding the latest available day.
```python
def max_payout(gigs: list[tuple[int, int]]) -> int:
if not gigs:
return 0
# Sort gigs by payout in descending order
gigs.sort(key=lambda x: x[1], reverse=True)
# Union-Find data structure to track available days
# parent[d] represents the latest available day <= d
# We'll use a dictionary for dynamic sizing since deadlines can be very large
parent = {}
def find(day):
"""Find the latest available day <= given day"""
if day not in parent:
parent[day] = day
return day
if parent[day] != day:
parent[day] = find(parent[day])
return parent[day]
def union(day1, day2):
"""Union two days, making the smaller day point to the larger"""
parent[day1] = day2
total_payout = 0
for deadline, payout in gigs:
# Find the latest available day within this gig's deadline
available_day = find(deadline)
# If there's an available day (>= 1), schedule this gig
if available_day >= 1:
total_payout += payout
# Mark this day as used by pointing it to the previous day
union(available_day, available_day - 1)
return total_payout
```Test output (stdout)
................ [100%] 16 passed in 0.39s