py-06-numpy-distances
1.000
Challenge · difficulty 3/5
# Pairwise distances (numpy)
Implement **`solution.py`** with:
```python
import numpy as np
def pairwise_distances(points: np.ndarray) -> np.ndarray:
...
```
`points` is a 2-D array of shape `(n, d)` (n points in d dimensions). Return an `(n, n)`
array where entry `[i, j]` is the Euclidean distance between point `i` and point `j`.
- The result must be a NumPy array (`np.ndarray`), symmetric, with a zero diagonal.
- Use NumPy vectorization (no Python `for` loops over point pairs).
- Example: for `[[0, 0], [3, 4]]` the result is `[[0, 5], [5, 0]]`.
tests/test_distances.py
import numpy as np
from solution import pairwise_distances
def test_simple():
pts = np.array([[0.0, 0.0], [3.0, 4.0]])
d = pairwise_distances(pts)
assert isinstance(d, np.ndarray)
assert d.shape == (2, 2)
np.testing.assert_allclose(d, [[0.0, 5.0], [5.0, 0.0]], atol=1e-9)
def test_symmetric_zero_diagonal():
rng = np.random.default_rng(0)
pts = rng.standard_normal((6, 3))
d = pairwise_distances(pts)
assert d.shape == (6, 6)
np.testing.assert_allclose(d, d.T, atol=1e-9)
np.testing.assert_allclose(np.diag(d), np.zeros(6), atol=1e-9)
def test_matches_bruteforce():
rng = np.random.default_rng(42)
pts = rng.standard_normal((5, 4))
d = pairwise_distances(pts)
for i in range(5):
for j in range(5):
expected = np.sqrt(((pts[i] - pts[j]) ** 2).sum())
assert abs(d[i, j] - expected) < 1e-9
System prompt
You are an expert programmer. Solve the task exactly as specified. Output your solution as fenced code blocks using the required file name(s) and the exact function/type signatures requested. Prefer correctness; do not include prose outside code unless asked.
Proposed solution
Looking at this problem, I need to compute pairwise Euclidean distances between all points in a dataset using NumPy vectorization.
The Euclidean distance between two points `i` and `j` is:
`sqrt(sum((points[i] - points[j])^2))`
I can use NumPy broadcasting to efficiently compute all pairwise differences, then apply the Euclidean distance formula.
Here's my approach:
1. Use broadcasting to compute all pairwise differences: `points[:, np.newaxis, :] - points[np.newaxis, :, :]`
2. Square the differences
3. Sum along the last axis (dimension axis)
4. Take the square root
```python
import numpy as np
def pairwise_distances(points: np.ndarray) -> np.ndarray:
# Compute pairwise differences using broadcasting
# points has shape (n, d)
# points[:, np.newaxis, :] has shape (n, 1, d)
# points[np.newaxis, :, :] has shape (1, n, d)
# The difference will have shape (n, n, d)
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
# Compute squared differences and sum along the last axis (d dimension)
# This gives us shape (n, n) with squared distances
squared_distances = np.sum(diff**2, axis=2)
# Take square root to get Euclidean distances
distances = np.sqrt(squared_distances)
return distances
```Test output (stdout)
... [100%] 3 passed in 0.06s