tool-02-calculator

1.000
3/3 tests · tool-calling
Challenge · difficulty 3/5
# Arithmetic via calculator tools

Tests whether the model composes multiple tool calls instead of computing itself.

Given `add(a, b)` and `multiply(a, b)` tools, the model must compute `(3 + 4) * 5` **using
the tools** and report `35`. Scoring checks three things: the model called `add` with 3 and 4,
called `multiply` with the intermediate result (7) and 5, and gave the final answer `35`.
Defined in `task.py`.
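
The contents of `task.py` are not shown here, so the following is only a minimal sketch of how the three checks described above might be scored. The names `score` and `args_of`, the shape of the tool-call records, and the decision to ignore argument order are all assumptions made for illustration, not the actual implementation.

```python
import re
from typing import Any


def add(a: float, b: float) -> float:
    """Tool exposed to the model: returns a + b."""
    return a + b


def multiply(a: float, b: float) -> float:
    """Tool exposed to the model: returns a * b."""
    return a * b


def score(tool_calls: list[dict[str, Any]], final: str) -> float:
    """Hypothetical scorer: fraction of the three checks that pass."""

    def args_of(name: str) -> list[list[Any]]:
        # Collect the sorted argument values of every call to `name`,
        # so that add(3, 4) and add(4, 3) are treated the same (an assumption).
        return [
            sorted(call["arguments"].values())
            for call in tool_calls
            if call["name"] == name
        ]

    # Intermediate result the model is expected to pass on to `multiply`.
    intermediate = add(3, 4)

    checks = [
        [3, 4] in args_of("add"),                        # called add with 3 and 4
        sorted([intermediate, 5]) in args_of("multiply"),  # called multiply with 7 and 5
        re.search(r"\b35\b", final) is not None,         # reported 35 as a whole number
    ]
    return sum(checks) / len(checks)
```

Under these assumptions, a transcript that answers `35` without calling any tool would pass only the last check and score one third, which is the behavior the challenge is meant to penalize.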
Proposed solution
TOOL CALLS:
[
 {
  "name": "add",
  "arguments": {
   "a": 3,
   "b": 4
  }
 },
 {
  "name": "multiply",
  "arguments": {
   "a": 7,
   "b": 5
  }
 }
]

FINAL:
The final number is 35.